Business requirement: scrape the data shown in Alibaba Brand Databank under Custom Modules > Crowd Insight > View Report.

Finding: the data is delivered over a WebSocket interface. For background on this kind of interface, see https://segmentfault.com/a/1190000013149749

Target page: [screenshot omitted]

Captured WebSocket frames: [screenshot omitted]

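In the capture, the green entries are the requests we need to simulate, and the red downward-pointing entries are the data returned for them. A request and its data are matched through the rid parameter. The rid looks a lot like a timestamp, and that is what it is: a 13-digit millisecond timestamp combined with a short number (the author calls it random; the code uses a zero-padded counter):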

  random_id = str(int(time.time() * 1000)) + str(self.count).zfill(2)
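As a standalone illustration of the rid format, here is a minimal sketch. The helper name `make_rid` and the use of `itertools.count` are assumptions made for this example; the two-digit suffix follows the 15-digit rid values seen in the captured requests.

```python
import itertools
import time


def make_rid(counter, width=2):
    """Return a 13-digit millisecond timestamp followed by a zero-padded counter."""
    millis = int(time.time() * 1000)
    return str(millis) + str(next(counter)).zfill(width)


# The article's code advances its counter by 2 after each request.
counter = itertools.count(0, 2)
first = make_rid(counter)
second = make_rid(counter)
print(first, second)
```

Keeping the counter outside the function lets every request on one connection receive a distinct suffix even when two requests fall in the same millisecond.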

Now look at the request that is sent:

  message = {
      "method": "/iSheetCrowdService/report",
      "headers": {"rid": random_id, "type": "PULL"},
      "body": {
          "args": {
              "id": "286",
              "condition": "{\"compute\":\"INTERSECT\",\"ruleList\":[{\"filterList\":[{\"key\":\"brand\",\"type\":\"SINGLE_VALUE\",\"value\":\"372718624\"},{\"key\":\"ds\",\"type\":\"SINGLE_VALUE\",\"value\":\"20200224\"},{\"key\":\"stage\",\"type\":\"MULTI_VALUE\",\"value\":[\"1010\",\"1020\",\"1030\",\"1040\"]}],\"name\":\"stage\"},{\"type\":\"crowd\",\"value\":[66466812]}]}",
              "tags": ["mobile_brand_name_prefer"],
              "bizParam": {
                  "databankCrowdId": "50104182",
                  "bizType": "CUSTOM_ANALYSIS",
                  "tag_identifier": "all",
                  "captcha": '%7B%22a%22%3A%22TSF2%22%2C%22c%22%3A%221582700342424%3A0.23973179491554664%22%2C%22d%22%3A%22nvc_register%22%2C%22h%22%3A%7B%22key1%22%3A%22code0%22%2C%22nvcCode%22%3A400%2C%22umidToken%22%3A%22T08C67EE3AD81E11A23D01F0EE4CA1134D6022447F84D66B6623678D5FE%22%7D%2C%22j%22%3A%7B%22test%22%3A1%7D%2C%22b%22%3A%22122%23i5kRCD9eEE%2BqAJpZy4pjEJponDJE7SNEEP7ZpJRBuDPpJFQLpCGwoHZDpJEL7SwBEyGZpJLlu4Ep%2BFQLpoGUEELWn4yE7SNEEP7ZpERBuDPE%2BBQPpC76EJponDJLKMQEspPA04nTtBOmKBvALOESIAOsJhoR8HJNZIbVrA4tlm9BE5R8XAJh3Ue7sqj97bm98oL6JXWpO%2F%2BDqMfEl8WxXWplul5EELXZ8CL6JNbERF3mqM3okBTlpM%2B1ul9rDLVZ8CRfJ4bEyF3mqW32E5pangLlul5EELXZ8oL6JNEEyB3DqMfbDEpxnSp1tP0EDLf%2F8om6J4EqRoBHqevfAIAKkS7gz4MzTChsGRkP%2BeqxYlg3ypGmtolPBVsN6ovcGQYuNVCPWaTtjBbU5X0QWvFMSFzE%2FaVEWBKYt4%2FavytijGp1%2FQCeg%2F2SfxjZw8fIlfio42QXEpWhdEspy4fBSEZgHrMRhwhnBIFLuxd8hTc7Y99x%2BfF2UuU5oOXLG0an0CHvFQqCYMqZXb8myhlZHuHOXI0GZ%2BPvFxGzVjTf4AmL63HeSjbSP8L6CSCuq47zX7tPDBWS%2FhdLqwO8dtOkEXes248plPRkVf5gVlcqsHSdYxDkjT%2FpfSszn8vXlIUg3GRKm9eMWi09PxekLM2tUu0nivXzonmbHdGHLsqhNe%2FbdjoKHYG6ygnnx3aZ9DU9ugUWmZgB9Ztbv1BYo%2FhPlYLykr4j14BCvVlwMUtgZbK%2BpgNu7vKERSRkRosaHoNotkt%2B%2BToClNeIRM%2Fk7vQm1x0YbZT3hzfU9k5kadIafvosIReZwiQhd4%2B0sYXOjxvChWtv2%2FSMKb9fIeSKsALCP%2FNshczJBF5y1TMo4YIPh7%2BaMuMfcUqmMC%2BWrr1Xm%2FUAjtyHnttlwQlkGDRPsSS4DMlM0OdWYLD9vL0ekEm7iz566ESLHP2aykbJ%2F3id3DwjcdgDENA6%2F8oojlvm6WP0JBDUviVTeDPK9V5RFekM3drtuDFit2UwRc%2B09xUvKcueMc%2FPrKKRhC%2FuRLXdx0WzgP4%2F2RJJeZhuKEQhYDTsFiDZnArDQMQyiMN0hRuwdfalhZFe27jXUG4Y%2BwPnHvpj1OXxEJ1VOgEtBzOO1AgwUr5SYa6UKhrbynm2X1J1HRyEn%2FVUqkvgc9Rx8ZYG2GqRr4L7eQ1N4sa7S6oqKEND6fX3eQhxFaqnimVCIg%2B6TwTp9Ant4P15WlGdVueq6HWATk8zB1CglVmsVH08lUwArNKjBpArD7v%2BIxx5VP1hJLChZOgfHqa6MVh7fP0lg6HXLeyMLSOXCz3oAK7iSOFkfu6RAx%2BGrBAj50ha%22%2C%22e%22%3A%22HELtCcggjijySBif5QK5Flm60fyrLzjVSvY2GZ7kF9k2ufvery5t6e1OxIGoUHpc7a0IbkE_0FA-F5WgiEpV7aeWlyQyxr1LL83v6PCoc3YbWdFNpRGiSow97HJFmhSolqL2iP8Yg3b6GvpNCl1IVN3_kiy7mdt7qA7PsE2Fu9J1ZID-lo1BWsvQpV6riLNbYizM9JlKkpiqJYbEB2zQGA%22%7D'
              },
              "insightType": 0,
              "interaction": 'false',
              "rateParam": {},
              "appId": "208"}}}
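The condition value in the request is itself a JSON document serialized into a string. The sketch below rebuilds the captured string from a plain dictionary. The function name `build_condition` and the choice of which fields become parameters (brand ID, ds date, crowd value) are assumptions for illustration; the captured requests only show that ds and the crowd value change between calls.

```python
import json


def build_condition(brand_id, ds, crowd_value):
    """Serialize the condition structure seen in the captured request."""
    condition = {
        "compute": "INTERSECT",
        "ruleList": [
            {
                "filterList": [
                    {"key": "brand", "type": "SINGLE_VALUE", "value": brand_id},
                    {"key": "ds", "type": "SINGLE_VALUE", "value": ds},
                    {"key": "stage", "type": "MULTI_VALUE",
                     "value": ["1010", "1020", "1030", "1040"]},
                ],
                "name": "stage",
            },
            {"type": "crowd", "value": [crowd_value]},
        ],
    }
    # Compact separators reproduce the whitespace-free form used on the wire.
    return json.dumps(condition, separators=(",", ":"))


condition_text = build_condition("372718624", "20200224", 66466812)
print(condition_text)
```

Building the string with `json.dumps` avoids hand-escaping every quote, which is where hard-coded versions of this parameter tend to break.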

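A few parameters change between requests and deserve attention. databankCrowdId is the crowd ID. The meaning of condition is unknown; it may be related to tags, and it can be obtained from the data returned by https://databank.tmall.com/api/ecapi. captcha must first be URL-decoded, which yields data in dictionary format: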

  {"a":"TSF2",
    "c":"1582700342424:0.23973179491554664",
    "d":"nvc_register",
    "h":{"key1":"code0","nvcCode":400,"umidToken":"T08C67EE3AD81E11A23D01F0EE4CA1134D6022447F84D66B6623678D5FE"},
    "j":{"test":1},
    "b":"122#i5kRCD9eEE+qAJpZy4pjEJponDJE7SNEEP7ZpJRBuDPpJFQLpCGwoHZDpJEL7SwBEyGZpJLlu4Ep+FQLpoGUEELWn4yE7SNEEP7ZpERBuDPE+BQPpC76EJponDJLKMQEspPA04nTtBOmKBvALOESIAOsJhoR8HJNZIbVrA4tlm9BE5R8XAJh3Ue7sqj97bm98oL6JXWpO/+DqMfEl8WxXWplul5EELXZ8CL6JNbERF3mqM3okBTlpM+1ul9rDLVZ8CRfJ4bEyF3mqW32E5pangLlul5EELXZ8oL6JNEEyB3DqMfbDEpxnSp1tP0EDLf/8om6J4EqRoBHqevfAIAKkS7gz4MzTChsGRkP+eqxYlg3ypGmtolPBVsN6ovcGQYuNVCPWaTtjBbU5X0QWvFMSFzE/aVEWBKYt4/avytijGp1/QCeg/2SfxjZw8fIlfio42QXEpWhdEspy4fBSEZgHrMRhwhnBIFLuxd8hTc7Y99x+fF2UuU5oOXLG0an0CHvFQqCYMqZXb8myhlZHuHOXI0GZ+PvFxGzVjTf4AmL63HeSjbSP8L6CSCuq47zX7tPDBWS/hdLqwO8dtOkEXes248plPRkVf5gVlcqsHSdYxDkjT/pfSszn8vXlIUg3GRKm9eMWi09PxekLM2tUu0nivXzonmbHdGHLsqhNe/bdjoKHYG6ygnnx3aZ9DU9ugUWmZgB9Ztbv1BYo/hPlYLykr4j14BCvVlwMUtgZbK+pgNu7vKERSRkRosaHoNotkt++ToClNeIRM/k7vQm1x0YbZT3hzfU9k5kadIafvosIReZwiQhd4+0sYXOjxvChWtv2/SMKb9fIeSKsALCP/NshczJBF5y1TMo4YIPh7+aMuMfcUqmMC+Wrr1Xm/UAjtyHnttlwQlkGDRPsSS4DMlM0OdWYLD9vL0ekEm7iz566ESLHP2aykbJ/3id3DwjcdgDENA6/8oojlvm6WP0JBDUviVTeDPK9V5RFekM3drtuDFit2UwRc+09xUvKcueMc/PrKKRhC/uRLXdx0WzgP4/2RJJeZhuKEQhYDTsFiDZnArDQMQyiMN0hRuwdfalhZFe27jXUG4Y+wPnHvpj1OXxEJ1VOgEtBzOO1AgwUr5SYa6UKhrbynm2X1J1HRyEn/VUqkvgc9Rx8ZYG2GqRr4L7eQ1N4sa7S6oqKEND6fX3eQhxFaqnimVCIg+6TwTp9Ant4P15WlGdVueq6HWATk8zB1CglVmsVH08lUwArNKjBpArD7v+Ixx5VP1hJLChZOgfHqa6MVh7fP0lg6HXLeyMLSOXCz3oAK7iSOFkfu6RAx+GrBAj50ha",
    "e":"HELtCcggjijySBif5QK5Flm60fyrLzjVSvY2GZ7kF9k2ufvery5t6e1OxIGoUHpc7a0IbkE_0FA-F5WgiEpV7aeWlyQyxr1LL83v6PCoc3YbWdFNpRGiSow97HJFmhSolqL2iP8Yg3b6GvpNCl1IVN3_kiy7mdt7qA7PsE2Fu9J1ZID-lo1BWsvQpV6riLNbYizM9JlKkpiqJYbEB2zQGA"}

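The c parameter is a timestamp + ":" + a random number. h.umidToken is also time-dependent and can be taken from the tn value returned by the um.json page. The b parameter requires a complex computation, but testing shows that it can be an empty string.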
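The decoding and rebuilding steps can be sketched as follows. The function names and the placeholder token values are hypothetical, and the fixed fields (a, d, key1, nvcCode, j) are simply copied from the captured sample; whether the server accepts a value built this way is not verified here. How to fetch umidToken and e is outside this sketch, so both are plain parameters.

```python
import json
import random
import time
from urllib.parse import quote, unquote


def decode_captcha(encoded):
    """URL-decode the captcha value and parse the JSON text it contains."""
    return json.loads(unquote(encoded))


def build_captcha(umid_token, e_value):
    """Assemble a captcha with a fresh "c" and an empty "b", then URL-encode it."""
    payload = {
        "a": "TSF2",
        # 13-digit millisecond timestamp, a colon, then a random float
        "c": "%d:%s" % (int(time.time() * 1000), random.random()),
        "d": "nvc_register",
        "h": {"key1": "code0", "nvcCode": 400, "umidToken": umid_token},
        "j": {"test": 1},
        "b": "",  # reported to be accepted when empty
        "e": e_value,
    }
    compact = json.dumps(payload, separators=(",", ":"))
    # safe="" also escapes "/" so the result matches the fully encoded sample
    return quote(compact, safe="")


# Placeholder inputs, not real tokens.
sample = build_captcha("T-PLACEHOLDER-TOKEN", "PLACEHOLDER-E")
decoded = decode_captcha(sample)
print(decoded["c"], repr(decoded["b"]))
```

Round-tripping through `decode_captcha` is a cheap way to confirm that the encoded string still parses to the expected keys before it goes into bizParam.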

Code:

  # -*- coding: utf-8 -*-

  import base64
  import json
  import os
  import threading
  import time

  import websocket  # third-party package: websocket-client


  class Test(object):
      def __init__(self, cookie_info):
          super(Test, self).__init__()
          self.cookie = cookie_info  # sent as the Cookie header in start()
          self.count = 0
          self.ws = None

      # Recent websocket-client releases pass the WebSocketApp instance as the
      # first callback argument; older releases behave differently (see the
      # note after the code).
      def on_message(self, ws, message):
          print("####### on_message #######")
          print(message)

      def on_error(self, ws, error):
          print("####### on_error #######")
          print(error)

      def on_close(self, ws, *args):
          # Newer releases also pass a close status code and a close message.
          print("####### on_close #######")
          print(ws)
          print("####### closed #######")

      def on_open(self, ws):
          def run(*args):
              random_id = str(int(time.time() * 1000)) + str(self.count).zfill(2)
              print('random_id-----------------------------', random_id)
              # Per testing, the "b" field inside captcha may also be an empty string.
              message = {"method": "/iSheetCrowdService/report",
                         "headers": {"rid": random_id, "type": "PULL"},
                         "body": {
                             "args": {
                                 "id": "286",
                                 "condition": "{\"compute\":\"INTERSECT\",\"ruleList\":[{\"filterList\":[{\"key\":\"brand\",\"type\":\"SINGLE_VALUE\",\"value\":\"372718624\"},{\"key\":\"ds\",\"type\":\"SINGLE_VALUE\",\"value\":\"20200224\"},{\"key\":\"stage\",\"type\":\"MULTI_VALUE\",\"value\":[\"1010\",\"1020\",\"1030\",\"1040\"]}],\"name\":\"stage\"},{\"type\":\"crowd\",\"value\":[66466812]}]}",
                                 "tags": ["mobile_brand_name_prefer"],
                                 "bizParam": {
                                     "databankCrowdId": "50104182",
                                     "bizType": "CUSTOM_ANALYSIS",
                                     "tag_identifier": "all",
                                     "captcha": '%7B%22a%22%3A%22TSF2%22%2C%22c%22%3A%221582700342424%3A0.23973179491554664%22%2C%22d%22%3A%22nvc_register%22%2C%22h%22%3A%7B%22key1%22%3A%22code0%22%2C%22nvcCode%22%3A400%2C%22umidToken%22%3A%22T08C67EE3AD81E11A23D01F0EE4CA1134D6022447F84D66B6623678D5FE%22%7D%2C%22j%22%3A%7B%22test%22%3A1%7D%2C%22b%22%3A%22122%23i5kRCD9eEE%2BqAJpZy4pjEJponDJE7SNEEP7ZpJRBuDPpJFQLpCGwoHZDpJEL7SwBEyGZpJLlu4Ep%2BFQLpoGUEELWn4yE7SNEEP7ZpERBuDPE%2BBQPpC76EJponDJLKMQEspPA04nTtBOmKBvALOESIAOsJhoR8HJNZIbVrA4tlm9BE5R8XAJh3Ue7sqj97bm98oL6JXWpO%2F%2BDqMfEl8WxXWplul5EELXZ8CL6JNbERF3mqM3okBTlpM%2B1ul9rDLVZ8CRfJ4bEyF3mqW32E5pangLlul5EELXZ8oL6JNEEyB3DqMfbDEpxnSp1tP0EDLf%2F8om6J4EqRoBHqevfAIAKkS7gz4MzTChsGRkP%2BeqxYlg3ypGmtolPBVsN6ovcGQYuNVCPWaTtjBbU5X0QWvFMSFzE%2FaVEWBKYt4%2FavytijGp1%2FQCeg%2F2SfxjZw8fIlfio42QXEpWhdEspy4fBSEZgHrMRhwhnBIFLuxd8hTc7Y99x%2BfF2UuU5oOXLG0an0CHvFQqCYMqZXb8myhlZHuHOXI0GZ%2BPvFxGzVjTf4AmL63HeSjbSP8L6CSCuq47zX7tPDBWS%2FhdLqwO8dtOkEXes248plPRkVf5gVlcqsHSdYxDkjT%2FpfSszn8vXlIUg3GRKm9eMWi09PxekLM2tUu0nivXzonmbHdGHLsqhNe%2FbdjoKHYG6ygnnx3aZ9DU9ugUWmZgB9Ztbv1BYo%2FhPlYLykr4j14BCvVlwMUtgZbK%2BpgNu7vKERSRkRosaHoNotkt%2B%2BToClNeIRM%2Fk7vQm1x0YbZT3hzfU9k5kadIafvosIReZwiQhd4%2B0sYXOjxvChWtv2%2FSMKb9fIeSKsALCP%2FNshczJBF5y1TMo4YIPh7%2BaMuMfcUqmMC%2BWrr1Xm%2FUAjtyHnttlwQlkGDRPsSS4DMlM0OdWYLD9vL0ekEm7iz566ESLHP2aykbJ%2F3id3DwjcdgDENA6%2F8oojlvm6WP0JBDUviVTeDPK9V5RFekM3drtuDFit2UwRc%2B09xUvKcueMc%2FPrKKRhC%2FuRLXdx0WzgP4%2F2RJJeZhuKEQhYDTsFiDZnArDQMQyiMN0hRuwdfalhZFe27jXUG4Y%2BwPnHvpj1OXxEJ1VOgEtBzOO1AgwUr5SYa6UKhrbynm2X1J1HRyEn%2FVUqkvgc9Rx8ZYG2GqRr4L7eQ1N4sa7S6oqKEND6fX3eQhxFaqnimVCIg%2B6TwTp9Ant4P15WlGdVueq6HWATk8zB1CglVmsVH08lUwArNKjBpArD7v%2BIxx5VP1hJLChZOgfHqa6MVh7fP0lg6HXLeyMLSOXCz3oAK7iSOFkfu6RAx%2BGrBAj50ha%22%2C%22e%22%3A%22HELtCcggjijySBif5QK5Flm60fyrLzjVSvY2GZ7kF9k2ufvery5t6e1OxIGoUHpc7a0IbkE_0FA-F5WgiEpV7aeWlyQyxr1LL83v6PCoc3YbWdFNpRGiSow97HJFmhSolqL2iP8Yg3b6GvpNCl1IVN3_kiy7mdt7qA7PsE2Fu9J1ZID-lo1BWsvQpV6riLNbYizM9JlKkpiqJYbEB2zQGA%22%7D'
                                 },
                                 "insightType": 0,
                                 "interaction": 'false',
                                 "rateParam": {},
                                 "appId": "208"}}}
              self.ws.send(json.dumps(message))
              self.count += 2

          # Send from a separate thread so that on_open returns immediately.
          threading.Thread(target=run).start()

      def start(self):
          randomness = os.urandom(16)
          sec_websocket_key = base64.b64encode(randomness).decode('utf-8').strip()
          print(sec_websocket_key)
          header = {'Cookie': self.cookie,
                    'Host': 'ws-insight-engine.tmall.com',
                    'Origin': 'https://insight-engine.tmall.com',
                    'Pragma': 'no-cache',
                    'Sec-WebSocket-Extensions': 'permessage-deflate; client_max_window_bits',
                    # This value must be regenerated for every connection.
                    'Sec-WebSocket-Key': sec_websocket_key,  # e.g. 'IiVEH4VXg4doluxZ+jjhwA=='
                    'Sec-WebSocket-Version': '13',
                    'Upgrade': 'websocket',
                    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'
                    }
          url = 'wss://ws-insight-engine.tmall.com/'
          websocket.enableTrace(True)
          self.ws = websocket.WebSocketApp(url=url,
                                           on_open=self.on_open,
                                           on_message=self.on_message,
                                           on_error=self.on_error,
                                           on_close=self.on_close,
                                           header=header
                                           )
          self.ws.run_forever(ping_interval=5)
          self.ws.close()


  if __name__ == '__main__':
      cookie = 'cna=WVOlFr5u0GICAbSnr+SPipyd; lid=%E6%9B%B2%E7%BB%88%E4%BA%BA%E6%95%A3%E7%9A%84%E5%AF%82%E5%AF%9E%E6%B2%A1%E4%BA%BA%E6%87%82; enc=9cXAFThmyAgHiO%2FENYqm6TMx3M2iB6cB2%2BAszb%2F%2BiaM50PadsWGSN3OKooNrUXVyWs8ipZ6twUdCdaV%2BjkuBrA%3D%3D; t=d1c8dd737265e71d81c287ffa0e31436; _tb_token_=346ebfbaf3b65; cookie2=1d7315e9db402b38ebd67b74e172fdce; unb=2206632888386; sn=anessa%E5%AE%89%E7%83%AD%E6%B2%99%E5%AE%98%E6%96%B9%E6%97%97%E8%88%B0%E5%BA%97%3Ait; __YSF_SESSION__={"baseId":"16a92e2747e29249","brandId":"bb52e1138cab1ddf","departmentId":"dc2f99d94018bd16","smartId":"1b66d71c287d3867","databankProjectId":"80457668a4affd3e"}; welcomeShownTime=1582781993068; sgcookie=Dp%2FVOMDrzVopvKfT1OIMP; uc1=lng=zh_CN&cookie14=UoTUOLRxVjlLtw%3D%3D; csg=4e11fe74; _mw_us_time_=1582789699772; isg=BOjoR_A8Itylow4vXKYaZ44qudb6EUwbFoabiaIZNGNW_YhnSiEcq34_8ZUNVgTz; l=dBSesswPQExBY5b2BOCanurza77OSIRYYuPzaNbMi_5LR6T_3-bOo8mLsF96VjWfG1TB47_ypV99-etkq3DmndUU0ThjGxDc'
      Test(cookie).start()
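Since requests and their data are paired by rid, the message callback could look up each incoming frame in a table of outstanding requests. The sketch below assumes that a response is JSON text carrying the rid under headers.rid, mirroring the request layout; the article does not show a response body, so that layout, the class name `PendingRequests`, and the sample frame are all hypothetical.

```python
import json


class PendingRequests:
    """Track outstanding requests by rid and pair them with incoming frames."""

    def __init__(self):
        self._pending = {}

    def register(self, rid, description):
        # Remember what was asked so the response can be labeled later.
        self._pending[rid] = description

    def resolve(self, raw_message):
        """Return (description, payload) for a known rid, otherwise None."""
        payload = json.loads(raw_message)
        rid = payload.get("headers", {}).get("rid")  # assumed response layout
        if rid not in self._pending:
            return None
        return self._pending.pop(rid), payload


pending = PendingRequests()
pending.register("158270035289500", "crowd 50104182 report")

# Hypothetical response frame, shaped like the request for illustration only.
frame = json.dumps({"headers": {"rid": "158270035289500"}, "body": {"data": []}})
matched = pending.resolve(frame)
unmatched = pending.resolve(frame)  # the rid was already consumed
print(matched[0], unmatched)
```

Calling `register` just before `ws.send` and `resolve` inside `on_message` would keep several report requests on one connection distinguishable.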

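Note that the request headers must be constructed correctly and the cookie must be kept up to date, otherwise the connection fails. The old and new versions of the websocket library are structured differently, so an outdated version will make the code raise errors; see the blog post https://blog.csdn.net/tz_zs/article/details/95492963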
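One way to soften the version problem is to write callbacks that do not depend on the exact argument list. The sketch below assumes the difference between library versions is whether the WebSocketApp object is passed as the first callback argument; the class name `TolerantCallbacks` is hypothetical, and the calls at the bottom only simulate the two calling styles without opening a connection.

```python
class TolerantCallbacks:
    """Callbacks that accept the arguments with or without a leading app object."""

    def __init__(self):
        self.messages = []
        self.opened = False

    def on_open(self, *args):
        # Nothing is needed from the arguments, so any signature works.
        self.opened = True

    def on_message(self, *args):
        # The message text is the last positional argument in both styles.
        self.messages.append(args[-1])

    def on_close(self, *args):
        # Some versions add a status code and a close message; ignore them all.
        self.opened = False


handler = TolerantCallbacks()
handler.on_open()                        # style without the app object
handler.on_message("payload")            # style without the app object
handler.on_message(object(), "payload")  # style with the app object first
print(handler.messages)
```

This keeps the script running across versions, but pinning the library version in the project's requirements is still the more predictable fix.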

References:

https://segmentfault.com/a/1190000013149749

https://blog.csdn.net/weixin_41505223/article/details/84401044

https://blog.csdn.net/tz_zs/article/details/95492963
