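Python crawler: scraping the Alibaba Brand Databank WebSocket interface
Business requirement: scrape the data shown in Alibaba's Brand Databank under custom module ==> crowd insight (人群透视) ==> view report ==> data.
Finding: the data is delivered over a WebSocket interface. For a detailed explanation of this kind of interface, see: https://segmentfault.com/a/1190000013149749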
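Target page: [screenshot of the report page]
Captured traffic: [screenshot of the WebSocket frames]
In the capture, the green entries are the requests we need to simulate, and the red downward arrows are the data returned for them. A request and its data are paired through the rid parameter. The rid looks a lot like a timestamp, and that is what it is: a 13-digit millisecond timestamp combined with a short numeric suffix. The suffix is described here as a random number; this script fills it with a zero-padded counter: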
- random_id = str(int(time.time() * 1000)) + str(self.count).zfill(2)
Now look at the request that is sent:
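The rid scheme described above can be sketched as follows. This is a minimal sketch, not the site's own logic: `make_rid` and `match_response` are hypothetical helper names, and the assumption that a response echoes its rid under `headers.rid` in a JSON text frame is a guess based on the request layout, not something the capture confirms.

```python
import json
import time


def make_rid(count, now=None):
    """Return a 13-digit millisecond timestamp followed by a 2-digit counter."""
    millis = int((time.time() if now is None else now) * 1000)
    return str(millis) + str(count).zfill(2)


def match_response(pending, raw_message):
    """Pop and return the pending request that a response belongs to.

    ASSUMPTION: the server echoes the rid under headers.rid in a JSON
    text frame. Returns None when the frame does not fit that shape.
    """
    try:
        rid = json.loads(raw_message)["headers"]["rid"]
        return pending.pop(rid, None)
    except (ValueError, KeyError, TypeError):
        return None


# Remember each request under its rid before sending it.
pending = {}
rid = make_rid(5, now=1582700352.0)
pending[rid] = {"method": "/iSheetCrowdService/report"}
```

Keeping the outstanding requests in a dictionary keyed by rid means several reports can be in flight on one connection without mixing up their results.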
- message = {
- "method": "/iSheetCrowdService/report",
- "headers": {"rid": random_id, "type": "PULL"},
- "body": {
- "args": {
- "id": "286",
- "condition":"{\"compute\":\"INTERSECT\",\"ruleList\":[{\"filterList\":[{\"key\":\"brand\",\"type\":\"SINGLE_VALUE\",\"value\":\"372718624\"},{\"key\":\"ds\",\"type\":\"SINGLE_VALUE\",\"value\":\"20200224\"},{\"key\":\"stage\",\"type\":\"MULTI_VALUE\",\"value\":[\"1010\",\"1020\",\"1030\",\"1040\"]}],\"name\":\"stage\"},{\"type\":\"crowd\",\"value\":[66466812]}]}",
- "tags": ["mobile_brand_name_prefer"],
- "bizParam": {
- "databankCrowdId": "50104182",
- "bizType": "CUSTOM_ANALYSIS",
- "tag_identifier": "all",
- "captcha":'%7B%22a%22%3A%22TSF2%22%2C%22c%22%3A%221582700342424%3A0.23973179491554664%22%2C%22d%22%3A%22nvc_register%22%2C%22h%22%3A%7B%22key1%22%3A%22code0%22%2C%22nvcCode%22%3A400%2C%22umidToken%22%3A%22T08C67EE3AD81E11A23D01F0EE4CA1134D6022447F84D66B6623678D5FE%22%7D%2C%22j%22%3A%7B%22test%22%3A1%7D%2C%22b%22%3A%22122%23i5kRCD9eEE%2BqAJpZy4pjEJponDJE7SNEEP7ZpJRBuDPpJFQLpCGwoHZDpJEL7SwBEyGZpJLlu4Ep%2BFQLpoGUEELWn4yE7SNEEP7ZpERBuDPE%2BBQPpC76EJponDJLKMQEspPA04nTtBOmKBvALOESIAOsJhoR8HJNZIbVrA4tlm9BE5R8XAJh3Ue7sqj97bm98oL6JXWpO%2F%2BDqMfEl8WxXWplul5EELXZ8CL6JNbERF3mqM3okBTlpM%2B1ul9rDLVZ8CRfJ4bEyF3mqW32E5pangLlul5EELXZ8oL6JNEEyB3DqMfbDEpxnSp1tP0EDLf%2F8om6J4EqRoBHqevfAIAKkS7gz4MzTChsGRkP%2BeqxYlg3ypGmtolPBVsN6ovcGQYuNVCPWaTtjBbU5X0QWvFMSFzE%2FaVEWBKYt4%2FavytijGp1%2FQCeg%2F2SfxjZw8fIlfio42QXEpWhdEspy4fBSEZgHrMRhwhnBIFLuxd8hTc7Y99x%2BfF2UuU5oOXLG0an0CHvFQqCYMqZXb8myhlZHuHOXI0GZ%2BPvFxGzVjTf4AmL63HeSjbSP8L6CSCuq47zX7tPDBWS%2FhdLqwO8dtOkEXes248plPRkVf5gVlcqsHSdYxDkjT%2FpfSszn8vXlIUg3GRKm9eMWi09PxekLM2tUu0nivXzonmbHdGHLsqhNe%2FbdjoKHYG6ygnnx3aZ9DU9ugUWmZgB9Ztbv1BYo%2FhPlYLykr4j14BCvVlwMUtgZbK%2BpgNu7vKERSRkRosaHoNotkt%2B%2BToClNeIRM%2Fk7vQm1x0YbZT3hzfU9k5kadIafvosIReZwiQhd4%2B0sYXOjxvChWtv2%2FSMKb9fIeSKsALCP%2FNshczJBF5y1TMo4YIPh7%2BaMuMfcUqmMC%2BWrr1Xm%2FUAjtyHnttlwQlkGDRPsSS4DMlM0OdWYLD9vL0ekEm7iz566ESLHP2aykbJ%2F3id3DwjcdgDENA6%2F8oojlvm6WP0JBDUviVTeDPK9V5RFekM3drtuDFit2UwRc%2B09xUvKcueMc%2FPrKKRhC%2FuRLXdx0WzgP4%2F2RJJeZhuKEQhYDTsFiDZnArDQMQyiMN0hRuwdfalhZFe27jXUG4Y%2BwPnHvpj1OXxEJ1VOgEtBzOO1AgwUr5SYa6UKhrbynm2X1J1HRyEn%2FVUqkvgc9Rx8ZYG2GqRr4L7eQ1N4sa7S6oqKEND6fX3eQhxFaqnimVCIg%2B6TwTp9Ant4P15WlGdVueq6HWATk8zB1CglVmsVH08lUwArNKjBpArD7v%2BIxx5VP1hJLChZOgfHqa6MVh7fP0lg6HXLeyMLSOXCz3oAK7iSOFkfu6RAx%2BGrBAj50ha%22%2C%22e%22%3A%22HELtCcggjijySBif5QK5Flm60fyrLzjVSvY2GZ7kF9k2ufvery5t6e1OxIGoUHpc7a0IbkE_0FA-F5WgiEpV7aeWlyQyxr1LL83v6PCoc3YbWdFNpRGiSow97HJFmhSolqL2iP8Yg3b6GvpNCl1IVN3_kiy7mdt7qA7PsE2Fu9J1ZID-lo1BWsvQpV6riLNbYizM9JlKkpiqJYbEB2zQGA%22%7D'
- },
- "insightType": 0,
- "interaction": 'false',
- "rateParam": {},
- "appId": "208"}}}
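The condition field in the request above is itself a JSON document serialized into a string. A small sketch of building it programmatically instead of hand-escaping quotes is shown here. The helper name `build_condition` and its parameter names are hypothetical, chosen from the key names in the captured value; what the fields actually mean to the server is not known.

```python
import json


def build_condition(brand_id, ds, stages, crowd_value):
    """Serialize a condition object into the compact JSON string format
    seen in the captured request (no spaces after ',' or ':')."""
    rule = {
        "compute": "INTERSECT",
        "ruleList": [
            {
                "filterList": [
                    {"key": "brand", "type": "SINGLE_VALUE", "value": str(brand_id)},
                    {"key": "ds", "type": "SINGLE_VALUE", "value": str(ds)},
                    {"key": "stage", "type": "MULTI_VALUE", "value": list(stages)},
                ],
                "name": "stage",
            },
            {"type": "crowd", "value": [crowd_value]},
        ],
    }
    return json.dumps(rule, separators=(",", ":"))


# Values taken from the captured request above.
condition = build_condition(
    "372718624", "20200224", ["1010", "1020", "1030", "1040"], 66466812
)
```

The resulting string can be placed directly under `"condition"`; `json.dumps(message)` then takes care of the escaping when the whole request is serialized.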
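A few parameters change between requests. databankCrowdId is the crowd ID. The meaning of condition is unknown; it may be related to tags, and it can be obtained from the data returned by https://databank.tmall.com/api/ecapi. captcha must first be URL-decoded, which yields dictionary-style data: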
- {"a":"TSF2",
- "c":"1582700342424:0.23973179491554664",
- "d":"nvc_register",
- "h":{"key1":"code0","nvcCode":400,"umidToken":"T08C67EE3AD81E11A23D01F0EE4CA1134D6022447F84D66B6623678D5FE"},
- "j":{"test":1},
"b":"122#i5kRCD9eEE+qAJpZy4pjEJponDJE7SNEEP7ZpJRBuDPpJFQLpCGwoHZDpJEL7SwBEyGZpJLlu4Ep+FQLpoGUEELWn4yE7SNEEP7ZpERBuDPE+BQPpC76EJponDJLKMQEspPA04nTtBOmKBvALOESIAOsJhoR8HJNZIbVrA4tlm9BE5R8XAJh3Ue7sqj97bm98oL6JXWpO/+DqMfEl8WxXWplul5EELXZ8CL6JNbERF3mqM3okBTlpM+1ul9rDLVZ8CRfJ4bEyF3mqW32E5pangLlul5EELXZ8oL6JNEEyB3DqMfbDEpxnSp1tP0EDLf/8om6J4EqRoBHqevfAIAKkS7gz4MzTChsGRkP+eqxYlg3ypGmtolPBVsN6ovcGQYuNVCPWaTtjBbU5X0QWvFMSFzE/aVEWBKYt4/avytijGp1/QCeg/2SfxjZw8fIlfio42QXEpWhdEspy4fBSEZgHrMRhwhnBIFLuxd8hTc7Y99x+fF2UuU5oOXLG0an0CHvFQqCYMqZXb8myhlZHuHOXI0GZ+PvFxGzVjTf4AmL63HeSjbSP8L6CSCuq47zX7tPDBWS/hdLqwO8dtOkEXes248plPRkVf5gVlcqsHSdYxDkjT/pfSszn8vXlIUg3GRKm9eMWi09PxekLM2tUu0nivXzonmbHdGHLsqhNe/bdjoKHYG6ygnnx3aZ9DU9ugUWmZgB9Ztbv1BYo/hPlYLykr4j14BCvVlwMUtgZbK+pgNu7vKERSRkRosaHoNotkt++ToClNeIRM/k7vQm1x0YbZT3hzfU9k5kadIafvosIReZwiQhd4+0sYXOjxvChWtv2/SMKb9fIeSKsALCP/NshczJBF5y1TMo4YIPh7+aMuMfcUqmMC+Wrr1Xm/UAjtyHnttlwQlkGDRPsSS4DMlM0OdWYLD9vL0ekEm7iz566ESLHP2aykbJ/3id3DwjcdgDENA6/8oojlvm6WP0JBDUviVTeDPK9V5RFekM3drtuDFit2UwRc+09xUvKcueMc/PrKKRhC/uRLXdx0WzgP4/2RJJeZhuKEQhYDTsFiDZnArDQMQyiMN0hRuwdfalhZFe27jXUG4Y+wPnHvpj1OXxEJ1VOgEtBzOO1AgwUr5SYa6UKhrbynm2X1J1HRyEn/VUqkvgc9Rx8ZYG2GqRr4L7eQ1N4sa7S6oqKEND6fX3eQhxFaqnimVCIg+6TwTp9Ant4P15WlGdVueq6HWATk8zB1CglVmsVH08lUwArNKjBpArD7v+Ixx5VP1hJLChZOgfHqa6MVh7fP0lg6HXLeyMLSOXCz3oAK7iSOFkfu6RAx+GrBAj50ha",- "e":"HELtCcggjijySBif5QK5Flm60fyrLzjVSvY2GZ7kF9k2ufvery5t6e1OxIGoUHpc7a0IbkE_0FA-F5WgiEpV7aeWlyQyxr1LL83v6PCoc3YbWdFNpRGiSow97HJFmhSolqL2iP8Yg3b6GvpNCl1IVN3_kiy7mdt7qA7PsE2Fu9J1ZID-lo1BWsvQpV6riLNbYizM9JlKkpiqJYbEB2zQGA"}umidToken
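The c parameter is a timestamp + ":" + a random number. h.umidToken is also time-dependent and can be taken from the tn parameter value on the um.json page. The b parameter requires a complex calculation, but testing showed that it can be sent as an empty string.
Code: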
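The captcha handling described above can be sketched as a decode step and a rebuild step. This is a sketch under assumptions: `decode_captcha` and `build_captcha` are hypothetical helper names, the umidToken and e values are passed in by the caller (fetching them is not shown), and an empty b being accepted is only what the test above observed, so it may stop working.

```python
import json
import random
import time
from urllib.parse import quote, unquote


def decode_captcha(encoded):
    """URL-decode a captured captcha value and parse it as JSON."""
    return json.loads(unquote(encoded))


def build_captcha(umid_token, e_value, now=None, rand=None):
    """Rebuild a captcha value with a fresh c field and an empty b field.

    ASSUMPTION: the fixed fields (a, d, h.key1, h.nvcCode, j) are copied
    from the captured sample and are presumed not to change.
    """
    millis = int((time.time() if now is None else now) * 1000)
    rand = random.random() if rand is None else rand
    payload = {
        "a": "TSF2",
        "c": "{}:{}".format(millis, rand),  # timestamp + ":" + random number
        "d": "nvc_register",
        "h": {"key1": "code0", "nvcCode": 400, "umidToken": umid_token},
        "j": {"test": 1},
        "b": "",  # left empty instead of computing the real value
        "e": e_value,
    }
    compact = json.dumps(payload, separators=(",", ":"))
    # safe="" percent-encodes ':', ',', '/', '+' and quotes as in the capture.
    return quote(compact, safe="")


example = build_captcha("UMID_TOKEN_PLACEHOLDER", "E_VALUE_PLACEHOLDER")
fields = decode_captcha(example)
```

Decoding a freshly captured value with `decode_captcha` first is a convenient way to check whether the fixed fields still match before reusing them.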
# -*- coding:utf-8 -*-
import base64
import json
import os
import ssl
import time
import zlib

import requests
import websocket
from requests.adapters import HTTPAdapter

try:
    import thread  # Python 2
except ImportError:
    import _thread as thread  # Python 3


class Test(object):
    def __init__(self, cookie_info):
        super(Test, self).__init__()
        self.cookie = cookie_info  # sent as the Cookie header in start()
        self.count = 0  # counter appended to each rid
        self.ws = None

    # Every callback takes *args so that it works whether or not the installed
    # websocket-client passes the WebSocketApp instance as the first argument.
    def on_message(self, *args):
        print("####### on_message #######")
        print(args[-1])  # the message is the last argument

    def on_error(self, *args):
        print("####### on_error #######")
        print(args[-1])  # the error is the last argument

    def on_close(self, *args):
        print("####### on_close #######")
        print(self)
        print("####### closed #######")

    def on_open(self, *args):
        def run(*args):
            # rid: 13-digit millisecond timestamp + zero-padded counter
            random_id = str(int(time.time() * 1000)) + str(self.count).zfill(2)
            print('random_id-----------------------------', random_id)
            message = {
                "method": "/iSheetCrowdService/report",
                "headers": {"rid": random_id, "type": "PULL"},
                "body": {
                    "args": {
                        "id": "286",
                        "condition": "{\"compute\":\"INTERSECT\",\"ruleList\":[{\"filterList\":[{\"key\":\"brand\",\"type\":\"SINGLE_VALUE\",\"value\":\"372718624\"},{\"key\":\"ds\",\"type\":\"SINGLE_VALUE\",\"value\":\"20200224\"},{\"key\":\"stage\",\"type\":\"MULTI_VALUE\",\"value\":[\"1010\",\"1020\",\"1030\",\"1040\"]}],\"name\":\"stage\"},{\"type\":\"crowd\",\"value\":[66466812]}]}",
                        "tags": ["mobile_brand_name_prefer"],
                        "bizParam": {
                            "databankCrowdId": "50104182",
                            "bizType": "CUSTOM_ANALYSIS",
                            "tag_identifier": "all",
- "captcha":'%7B%22a%22%3A%22TSF2%22%2C%22c%22%3A%221582700342424%3A0.23973179491554664%22%2C%22d%22%3A%22nvc_register%22%2C%22h%22%3A%7B%22key1%22%3A%22code0%22%2C%22nvcCode%22%3A400%2C%22umidToken%22%3A%22T08C67EE3AD81E11A23D01F0EE4CA1134D6022447F84D66B6623678D5FE%22%7D%2C%22j%22%3A%7B%22test%22%3A1%7D%2C%22b%22%3A%22122%23i5kRCD9eEE%2BqAJpZy4pjEJponDJE7SNEEP7ZpJRBuDPpJFQLpCGwoHZDpJEL7SwBEyGZpJLlu4Ep%2BFQLpoGUEELWn4yE7SNEEP7ZpERBuDPE%2BBQPpC76EJponDJLKMQEspPA04nTtBOmKBvALOESIAOsJhoR8HJNZIbVrA4tlm9BE5R8XAJh3Ue7sqj97bm98oL6JXWpO%2F%2BDqMfEl8WxXWplul5EELXZ8CL6JNbERF3mqM3okBTlpM%2B1ul9rDLVZ8CRfJ4bEyF3mqW32E5pangLlul5EELXZ8oL6JNEEyB3DqMfbDEpxnSp1tP0EDLf%2F8om6J4EqRoBHqevfAIAKkS7gz4MzTChsGRkP%2BeqxYlg3ypGmtolPBVsN6ovcGQYuNVCPWaTtjBbU5X0QWvFMSFzE%2FaVEWBKYt4%2FavytijGp1%2FQCeg%2F2SfxjZw8fIlfio42QXEpWhdEspy4fBSEZgHrMRhwhnBIFLuxd8hTc7Y99x%2BfF2UuU5oOXLG0an0CHvFQqCYMqZXb8myhlZHuHOXI0GZ%2BPvFxGzVjTf4AmL63HeSjbSP8L6CSCuq47zX7tPDBWS%2FhdLqwO8dtOkEXes248plPRkVf5gVlcqsHSdYxDkjT%2FpfSszn8vXlIUg3GRKm9eMWi09PxekLM2tUu0nivXzonmbHdGHLsqhNe%2FbdjoKHYG6ygnnx3aZ9DU9ugUWmZgB9Ztbv1BYo%2FhPlYLykr4j14BCvVlwMUtgZbK%2BpgNu7vKERSRkRosaHoNotkt%2B%2BToClNeIRM%2Fk7vQm1x0YbZT3hzfU9k5kadIafvosIReZwiQhd4%2B0sYXOjxvChWtv2%2FSMKb9fIeSKsALCP%2FNshczJBF5y1TMo4YIPh7%2BaMuMfcUqmMC%2BWrr1Xm%2FUAjtyHnttlwQlkGDRPsSS4DMlM0OdWYLD9vL0ekEm7iz566ESLHP2aykbJ%2F3id3DwjcdgDENA6%2F8oojlvm6WP0JBDUviVTeDPK9V5RFekM3drtuDFit2UwRc%2B09xUvKcueMc%2FPrKKRhC%2FuRLXdx0WzgP4%2F2RJJeZhuKEQhYDTsFiDZnArDQMQyiMN0hRuwdfalhZFe27jXUG4Y%2BwPnHvpj1OXxEJ1VOgEtBzOO1AgwUr5SYa6UKhrbynm2X1J1HRyEn%2FVUqkvgc9Rx8ZYG2GqRr4L7eQ1N4sa7S6oqKEND6fX3eQhxFaqnimVCIg%2B6TwTp9Ant4P15WlGdVueq6HWATk8zB1CglVmsVH08lUwArNKjBpArD7v%2BIxx5VP1hJLChZOgfHqa6MVh7fP0lg6HXLeyMLSOXCz3oAK7iSOFkfu6RAx%2BGrBAj50ha%22%2C%22e%22%3A%22HELtCcggjijySBif5QK5Flm60fyrLzjVSvY2GZ7kF9k2ufvery5t6e1OxIGoUHpc7a0IbkE_0FA-F5WgiEpV7aeWlyQyxr1LL83v6PCoc3YbWdFNpRGiSow97HJFmhSolqL2iP8Yg3b6GvpNCl1IVN3_kiy7mdt7qA7PsE2Fu9J1ZID-lo1BWsvQpV6riLNbYizM9JlKkpiqJYbEB2zQGA%22%7D'
                        },
                        "insightType": 0,
                        "interaction": 'false',
                        "rateParam": {},
                        "appId": "208",
                    },
                },
            }
            # Three other captured requests had the same structure and differed
            # only in these fields:
            #   rid 158270035289505, ds 20200225, crowd 66466812, databankCrowdId 50104182
            #   rid 158270351707803, ds 20200225, crowd 66021683, databankCrowdId 50083012
            #   rid 158279331489213, ds 20200224, crowd 66477382, databankCrowdId 50104297
            # Each carried its own captcha value; in the last one the captcha's
            # "b" field was an empty string.
            self.ws.send(json.dumps(message))
            self.count += 2

        # Send from a separate thread so that the on_open callback returns at once.
        thread.start_new_thread(run, ())

    def start(self):
        # A random 16-byte value, base64-encoded, for the Sec-WebSocket-Key header.
        randomness = os.urandom(16)
        sec_websocket_key = base64.b64encode(randomness).decode('utf-8').strip()
        print(sec_websocket_key)
        header = {
            'Cookie': self.cookie,
            # 'x_csrf_token': self.x_csrf_token,
            'Host': 'ws-insight-engine.tmall.com',
            'Origin': 'https://insight-engine.tmall.com',
            'Pragma': 'no-cache',
            'Sec-WebSocket-Extensions': 'permessage-deflate; client_max_window_bits',
            # This value has to be regenerated for every connection.
            'Sec-WebSocket-Key': sec_websocket_key,  # e.g. 'IiVEH4VXg4doluxZ+jjhwA=='
            'Sec-WebSocket-Version': '13',
            'Upgrade': 'websocket',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36',
        }
        url = 'wss://ws-insight-engine.tmall.com/'
        websocket.enableTrace(True)  # log the handshake and frames for debugging
        self.ws = websocket.WebSocketApp(
            url=url,
            on_message=self.on_message,
            on_error=self.on_error,
            on_close=self.on_close,
            header=header,
        )
        self.ws.on_open = self.on_open
        # Blocks until the connection closes; sends a ping every 5 seconds.
        self.ws.run_forever(ping_interval=5)
        self.ws.close()


if __name__ == '__main__':
cookie = 'cna=WVOlFr5u0GICAbSnr+SPipyd; lid=%E6%9B%B2%E7%BB%88%E4%BA%BA%E6%95%A3%E7%9A%84%E5%AF%82%E5%AF%9E%E6%B2%A1%E4%BA%BA%E6%87%82; enc=9cXAFThmyAgHiO%2FENYqm6TMx3M2iB6cB2%2BAszb%2F%2BiaM50PadsWGSN3OKooNrUXVyWs8ipZ6twUdCdaV%2BjkuBrA%3D%3D; t=d1c8dd737265e71d81c287ffa0e31436; _tb_token_=346ebfbaf3b65; cookie2=1d7315e9db402b38ebd67b74e172fdce; unb=2206632888386; sn=anessa%E5%AE%89%E7%83%AD%E6%B2%99%E5%AE%98%E6%96%B9%E6%97%97%E8%88%B0%E5%BA%97%3Ait; __YSF_SESSION__={"baseId":"16a92e2747e29249","brandId":"bb52e1138cab1ddf","departmentId":"dc2f99d94018bd16","smartId":"1b66d71c287d3867","databankProjectId":"80457668a4affd3e"}; welcomeShownTime=1582781993068; sgcookie=Dp%2FVOMDrzVopvKfT1OIMP; uc1=lng=zh_CN&cookie14=UoTUOLRxVjlLtw%3D%3D; csg=4e11fe74; _mw_us_time_=1582789699772; isg=BOjoR_A8Itylow4vXKYaZ44qudb6EUwbFoabiaIZNGNW_YhnSiEcq34_8ZUNVgTz; l=dBSesswPQExBY5b2BOCanurza77OSIRYYuPzaNbMi_5LR6T_3-bOo8mLsF96VjWfG1TB47_ypV99-etkq3DmndUU0ThjGxDc'
Test(cookie).start()
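Note that the request headers must be built correctly and the cookie must be refreshed, otherwise the connection fails. Old and new versions of the websocket library are structured differently, so a version that is too old makes this code raise errors; see the blog post https://blog.csdn.net/tz_zs/article/details/95492963
References: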
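One way to soften the version difference mentioned above is to write callbacks that do not care how many leading arguments they receive. The sketch below assumes that, depending on the installed websocket-client release, a bound-method callback is invoked either as `on_message(ws, message)` or as `on_message(message)`; the `Handler` class is hypothetical and the calls at the bottom only simulate the two conventions without opening a connection.

```python
class Handler(object):
    """Collects messages regardless of the callback calling convention."""

    def __init__(self):
        self.messages = []

    def on_message(self, *args):
        # The payload is the last positional argument in both conventions:
        # (ws, message) and (message,).
        self.messages.append(args[-1])


handler = Handler()
fake_app = object()  # stand-in for a WebSocketApp instance

handler.on_message(fake_app, "called with the app first")
handler.on_message("called with the message only")
```

The same trick does not carry over unchanged to on_close, where newer releases pass extra status arguments, so that callback is best left ignoring its arguments.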
https://segmentfault.com/a/1190000013149749
https://blog.csdn.net/weixin_41505223/article/details/84401044
https://blog.csdn.net/tz_zs/article/details/95492963