Python-S9-Day124-爬虫&微信
01 今日内容概要
02 内容回顾:flask上下文
03 内容回顾:多app应用
04 内容回顾:面向对象和数据库
05 内容回顾:爬虫
06 Web微信:获取二维码(一)
07 Web微信:获取二维码(二)
08 Web微信:获取用户头像
09 Web微信:确认登录
10 Web微信:用户凭证
11 Web微信:用户信息初始化
12 今日作业
01 今日内容概要
1.1 基于Flask开发Web微信;
- 爬虫;
- Flask知识;
02 内容回顾:flask上下文
2.1 Flask的上下文管理回顾;
2.1.1 threading.local;
- 应用:Flask上下文管理中的Local类更加高级
- DBUtils线程池的模式:为每个线程创建一个连接
2.1.2 SQLAlchemy;
2.1.3 什么是栈?后进先出,push和pop方法;
2.1.4 __getattr__方法;
2.2 Flask上下文管理机制;
2.2.1 Flask程序有几个Local和LocalStack?
- 请求LocalStack,帮助开发对stack对应列表操作
- 应用LocalStack,帮助开发对stack对应列表操作
2.2.2 为什么要创建两个Local或者LocalStack?
- 编写离线脚本时,需要配置文件,而配置文件存放在app中;
- 编写离线脚本时,不需要请求相关数据
03 内容回顾:多app应用
multi_app.py;
from flask import Flask
from werkzeug.wsgi import DispatcherMiddleware
from werkzeug.serving import run_simple app1 = Flask("app01")
app1.config['DB'] = 123
app2 = Flask("app02")
app2.config['DB'] = 456 @app1.route('/web')
def web():
print('web')
return 'cuixiaozhao' @app1.route('/news')
def news():
print('news')
return 'cuixiaozhao' @app2.route('/admin')
def admin():
print('admin')
return 'cuixiaozhao' @app2.route('/article')
def article():
print('article')
return 'cuixiaozhao' '''
/web
/news
/admin
'''
app = DispatcherMiddleware(app1, {
'/app2': app2,
})
if __name__ == '__main__':
run_simple(hostname='127.0.0.1', port=5000, application=app)
离线脚本.py;
from multi_app import app1
from multi_app import app2 with app1.app_context():
pass # 为app1创建数据库;
with app2.app_context():
pass # 为app2创建数据库
04 内容回顾:面向对象和数据库
4.1 面向对象的认识;
4.2 特殊方法;
4.2.1 __enter__
4.2.2 __exit__
4.3 约束
4.3.1 接口(java)
4.3.2 抽象方法、抽象类;
4.3.3 Python本无接口,但有抽象方法抽象类
4.3.4 Python中继承+抛出异常做
4.4 metaclass的作用:指定类由哪个type创建(type泛指继承type的所有类)
# class Base(object):
# def send(self):
# raise NotImplemented("子类中必须要实现send方法")
#
# class Foo(Base):
# def send(self):
# print('13811221893')
#
#
# obj = Foo()
# obj.send()
import abc class Base(metaclass=abc.ABCMeta): @abc.abstractmethod
def send(self):
pass def func(self):
print(123) class Foo(Base):
def send(self):
print("发送信息给gf")
obj = Foo()
obj.send()
obj.func()
4.5 数据库
4.5.1 Django的数据库锁;
4.5.2 数据库引擎
- InnoDB——表|行、无事务、慢;
- MyIsam——表、无事务、快;
4.5.3 数据库如何加锁?
- 终端1:begin; select * from tb for update;commit;
- 终端2:begin;select * from tb for update;commit;
4.5.4 什么时候需要加锁呢?
- 记数
- 应用场景——商品数量更新;
4.5.5 了解
- 视图,即虚拟表
- 触发器,增加、修改、删除前后做操作,但是性能比较低,一般不用;
- 函数,返回一个值;
- 存储过程,返回一个结果集;
4.5.6 索引
- 主键索引
- 联合索引
- 唯一索引
- 联合唯一索引
- 索引的注意事项:命中索引;无法命中:LIke、函数、or、不等于
4.5.7 慢日志?
- 将查询时间久
- 未命中索引
- 修改数据库配置文件——slow_query_log = ON;long_query_time = 2;slow_query_log_file = /usr/slow.log;log_query_not_using_indexes = ON
4.5.8 SQL性能测试-通过执行计划;
- explain select * from tb;
4.5.9 数据库优化方案;
- 不适用select *;
- select id,1 from tb;
- 短索引
- 使用连表操作不使用子查询;
- 列的固定长度在前面
- 分库分表(水平、垂直)
- 使用组合索引》索引合并
- 内存代替连表
- limit 1
- 读写分离与主从复制一起使用(主从数据一致,但是实现了读写分离)
- 分页,越往后越慢,记录当前页面的最大和最小id;
- 使用Redis做缓存;仅限于热点数据;但是注意缓存数据量很大,一旦Redis宕机,MySQL会出现雪崩的风险;做Redis的高可用;
05 内容回顾:爬虫
5.1 requests模块的参数;
- url
- headers
- cookies
- data
- json
- params
- proxy
- 文件上传
- SSL证书
5.2 bs4
- find
- find_all
- text
- attrs
5.3 爬虫套路
- 汽车之家,没防爬虫策略
- 抽屉
- 抽屉登录
- github csrf_token
- 拉勾网 自生成token,放在请求头,构造特殊请求头;
5.4 PIL生成点线杂乱验证码,反解效率更低,使用无意义;谷歌开源的TensorFlow进行机器学习;
- 成立团队自行开发进行做验证码破解;
- 购买爬虫相关验证码破解服务;
- 手工登录,然后获取cookies,再进行爬取;
06 Web微信:获取二维码(一)
6.1 基于Flask开发Web微信;
6.2 什么是长轮询?为了希望实时得到一个数据;前端写递归,后端夯住;
07 Web微信:获取二维码(二)
7.1 创建Flask项目;
7.2 动态获取微信二维码;
from flask import Flask, render_template
import requests
import time
import re app = Flask(__name__) app.secret_key = "cuixiaozhao" @app.route('/login')
def login():
ctime = int(time.time() * 1000)
qcode_url = "https://login.wx.qq.com/jslogin?appid=wx782c26e4c19acffb&redirect_uri=https%3A%2F%2Fwx.qq.com%2Fcgi-bin%2Fmmwebwx-bin%2Fwebwxnewloginpage&fun=new&lang=zh_CN&_={0}".format(
ctime)
rep = requests.get(
url=qcode_url, )
qcode = re.findall('uuid = "(.*)";', rep.text)[0]
return render_template('login.html', qcode=qcode) if __name__ == '__main__':
app.run()
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Title</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<div style="width: 200px; margin: 0 auto;">
<h1 style="text-align: center">扫码登录</h1>
<img style="width:200px;height: 200px" src="https://login.weixin.qq.com/qrcode/{{qcode}}" alt="">
</div> </body>
</html>
08 Web微信:获取用户头像
8.1 app.py;
from flask import Flask, render_template, session, jsonify
import requests
import time
import re app = Flask(__name__) app.secret_key = "1231sdfasdf" @app.route('/login')
def login():
ctime = int(time.time() * 1000)
qcode_url = "https://login.wx.qq.com/jslogin?appid=wx782c26e4c19acffb&redirect_uri=https%3A%2F%2Fwx.qq.com%2Fcgi-bin%2Fmmwebwx-bin%2Fwebwxnewloginpage&fun=new&lang=zh_CN&_={0}".format(
ctime)
rep = requests.get(
url=qcode_url, )
qcode = re.findall('uuid = "(.*)";', rep.text)[0]
# qcode = re.findall('uuid = "(.*)";', rep.text)[0]
session['qcode'] = qcode
return render_template('login1.html', qcode=qcode) @app.route('/check/login')
def check_login():
qcode = session['qcode']
ctime = int(time.time() * 1000)
check_login_url = 'https://login.wx.qq.com/cgi-bin/mmwebwx-bin/login?loginicon=true&uuid={0}&tip=0&r=-976036168&_={1}'.format(
qcode, ctime)
rep = requests.get(
url=check_login_url
)
result = {'code': 408}
if 'window.code=408 ' in rep.text:
# 用户未扫码;
#result['code'] = 201
return jsonify(result)
elif 'window.code=201' in rep.text:
# 用户扫码,获取头像;
result['code'] = 201
result['avatar'] = re.findall("window.userAvatar = '(.*)';", rep.text)[0]
return jsonify(result)
# print(rep.text)
# return "检查是否已经扫码" # https://login.wx.qq.com/cgi-bin/mmwebwx-bin/login?loginicon=true&uuid=4ZiNBuh1yg==&tip=0&r=924518826&_=1536673672364 if __name__ == '__main__':
app.run() '''
window.code=201;window.userAvatar = 'data:img/jpg;base64,/9j/4AAQSkZJRgABAQAASABIAAD/2wBDAAcFBQYFBAcGBgYIBwcICxILCwoKCxYPEA0SGhYbGhkWGRgcICgiHB4mHhgZIzAkJiorLS4tGyIyNTEsNSgsLSz/2wBDAQcICAsJCxULCxUsHRkdLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCz/wAARCACEAIQDASIAAhEBAxEB/8QAGwAAAQUBAQAAAAAAAAAAAAAAAAECAwQGBQf/xAA5EAABAwIEBAEKBQMFAAAAAAABAAIDBBEFEiFRBhMxQWEUFSJEUnGSobHRI0KBkcEHMkMzNGJy8f/EABoBAAIDAQEAAAAAAAAAAAAAAAACAQMEBQb/xAAqEQACAgECBQMDBQAAAAAAAAAAAQIRAwQSBRMhMVEUQXEiI2EyM6HB8f/aAAwDAQACEQMRAD8A80DUoCcAnWXqjxdDQEXTrIspIsbdFvBOsiyAsZZF0+yRAUMskIUlrpCFBNkRCSykskIUDWQkJCFIRqkISjpkJKaSpbJpCVliZEW6oT7IQNZ0bJcqWyWyuMFiWRZOsjKgi6obZFk6yLICxmTwRlT7JLIJsZY7JLqSyaSglMZdNIUhCSygZMiITSFKQm2SsdMiCaQpSEwhKOmR2QnEaoUDWdGyUBSZUob4K6zFTI7IspciMqLDayKyS6myppZqiwpkd0llLkGyMtkBTIrpLKYt0TS1Fk0yIhNIUuRIWqLG6kJCQhSlqaWqBupE4phGimLUwhKx1ZCeqE8t1QlGOxkShqscpAjU2TsIMqMisGPwSthc9wa1pLibAAalFhsKuVJkXY8wYiXlgpiXNOUjMLg7dU2bAsRgjL5aORrR1JHhf+FG9eSNhyciQsVgxkI5ZTWG0rZU0sVrlnZNMevRFk7CrlSFqtthc94Yxpc5xsABckq07AsUsD5tq7HoeS7X5JXNIaOGT7M5BYmlq6c+FV1M0unoqiIDqXxkAfuFTLPBG6weNruiqWpharRYmFiiydpWshTGPVCLDaaLlJREuj5Md1YosKkrZ8jAABq5x6NCp3mjacynoKirkLKaCSd4Fy2Nhcbb2C1vAXDcr+JmT4hTSxMpWmVjZYy3O/t1Ha9/0XWw2CHDx5NTNyty5nvP9zzpqf36K3z5IpWzROLZGG7SqZzck0iMc4xkm+tCcY4I6GUYtR2Ycw5rb2Dj9/8AzZTYTh5NIJqjLI6VvbUWP2VOvnq8RmkqqyYPDATFE3Rkem3c+JVHDqrEqPmRUkjeVIDo8X5ZP5m/boq9s3j231GU8Uc/Mrp/fkxuL4NLQYtU0zI3vZG8hpAJ06jX3Fc8wkGxaQR2XqNMPJYQ0PJd1Liep7lZjFMP8vdNUx6ztkfmHtC5+a0Rm+zKm4uXwZMxadErKWSY2jjc8js0XXWosKnr6tlPAzM937AbnxXpWB4MzBYeRT2Ly28knQvOny2CTLnWNfk6ej4fPVPwl7mS/pzw4ajHZK6ric1tC0PY17bZnnoddrE++y2WKUZpKptXDI1kbzZ7HGwvuFbkEzSHteQ9urSufJSzVFVJW1rxK8A8tg0bGNgP5XLy5HkluZ6vQaNaTona9/yWKyCKroH08lpI5mEEjUEEdl4nXYfJRVs1NIDmieWXt1t3XrTJ5aFr4oB+E7oD/j8W/ZXY8UbFTtiiYYw0b6p8GZ4rKtdw16pp3VHhpiTDGvQuI+GhVw+cqJn4rm5pox+bdw8d1jDTnZdOGVTVo8vqdHPTT2T/ANOaY0Locg7ITbzLsNt5EfZUkEMlNMJI9CPmtN5tHso82j2Vg5xv5BQinGYy2tdmW2xuEeU3Cs1dKKeDMR1K5xsei0Qe5WcnNDlTcSZ8wdDI0d2n6JkczYmBrRqoXH0Xe4qO5HdOU2W3TktK5cExY6XKMzuY42/VW2uJXRpcNY6mY9rAM4zGw6kpJz2IvwYudPr7HJwyrnw6vfOGAsl0kaBa4v2W1pKmKotLG67XMP1K4hwwbKzQU7qV7yCQC3p+oWLI1Lqeq4fllifKfZ/wdV79dSoJXgRP17FV3yHcqF7yWO17LOehUQc8bAlNyRnUtCjz2Ts4IQWFasrY6CgY4elKW+gz+T4LGTUzp5nyvAzPJcbC2q1E1EZpHPNzc6X2UJw3wWrHJQR5TXZpaidPsjMeQ+CFpfNp2QreYYOUegchuyTkN3Vu6SwXPs30Z7iGINo4+13/AMLPEWWm4oNqSG3tn6LMZl0tP+hHA1v7zFLfRPuUdgpC8Bh07KPO3ZXmIXoFrcNgBwymO8Y+iyIsStrh1hhlML/42/VZdT2R0+HL638DjTBQVEIjhJCulw3VWteDBpusNnodMvuwOU9Md/pu9yVztUxzhkd7kHoyMWIQSB0TLFH5UDUdGCnDqdjtylNKFYpP9nF/1CedUWeVyL638FE0rb9EK5ZCLEo7ZBHZRvJHZXMqQxNd1ASWNRlOJH3p4R/yP0WcuVvsQwCLE2tBmfFlJPogG65p4GhPr0vwBb8OaEYJI4+q0uTJkcoroZGR34Z1UbBmK2J4EhOnl0vwBOHA0LRYVsnwBXeox+TL6LN4MfmDQSu9SV8go4W66MA+Svu4FiPrz/gH2XQp+H4KaFkeZ0mUAXOl1nz5YzSSN+i0+TFJubOKa2TYprap02Zp7C60fmuD2FE/A4JQcrjGT3CyWjtYGo5FJmbc5RPdoVozwxEfWn/CEw8Kxn1p/wAIRaO0tXi8mbLrJhkAWkdwmw+tv+EJh4Ri71cnwhFon1mHycumrjy2sHbRXGzOcFchwCCnFs7nncqyKCNo0CizhZOsm0c0Od4oXS8mYPyoRZXR10qEJQHtTkITIgEIQpIBRuCEJSWJZAGqEIHQ6yQoQoHGprkIQBCQmHqhCBRpGqEIUkH/2Q==';
'''
8.2 login.html;
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Title</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<div style="width: 200px; margin: 0 auto;">
<h1 style="text-align: center">扫码登录</h1>
<img id="userAvatar" style="width:200px;height: 200px" src="https://login.weixin.qq.com/qrcode/{{qcode}}" alt="">
</div>
<script src="https://cdn.bootcss.com/jquery/3.3.1/jquery.min.js"></script>
<script>
$(function () {
checkLogin();
}); function checkLogin() {
$.ajax({
url: '/check/login',
method: 'GET',
dataType:'json',
success: function (arg) {
console.log(arg);
checkLogin();
if(arg.code === 408){
checkLogin();
}else if (arg.code === 201){
$('#userAvatar').attr('src',arg.avatar);
}
}
}) }
</script>
</body>
</html>
注意:window.code=408 不能出现空格;
09 Web微信:确认登录
9.1 app.py;
from flask import Flask, render_template, session, jsonify
import requests
import time
import re app = Flask(__name__) app.secret_key = "1231sdfasdf" @app.route('/login')
def login():
ctime = int(time.time() * 1000)
qcode_url = "https://login.wx.qq.com/jslogin?appid=wx782c26e4c19acffb&redirect_uri=https%3A%2F%2Fwx.qq.com%2Fcgi-bin%2Fmmwebwx-bin%2Fwebwxnewloginpage&fun=new&lang=zh_CN&_={0}".format(
ctime)
rep = requests.get(
url=qcode_url, )
qcode = re.findall('uuid = "(.*)";', rep.text)[0]
# qcode = re.findall('uuid = "(.*)";', rep.text)[0]
session['qcode'] = qcode
return render_template('login.html', qcode=qcode) @app.route('/check/login')
def check_login():
qcode = session['qcode']
ctime = int(time.time() * 1000)
check_login_url = 'https://login.wx.qq.com/cgi-bin/mmwebwx-bin/login?loginicon=true&uuid={0}&tip=0&r=-976036168&_={1}'.format(
qcode, ctime)
rep = requests.get(
url=check_login_url
)
result = {'code': 408}
if 'window.code=408 ' in rep.text:
# 用户未扫码;
result['code'] = 408
elif 'window.code=201' in rep.text:
# 用户扫码,获取头像;
result['code'] = 201
result['avatar'] = re.findall("window.userAvatar = '(.*)';", rep.text)[0]
elif 'window.code=200' in rep.text:
redirect_uri = re.findall('window.redirect_uri="(.*)";', rep.text)[0]
print(redirect_uri)
result['code'] = 200
return jsonify(result)
@app.route('/index')
def index():
return render_template('index.html') if __name__ == '__main__':
app.run() '''
window.code=201;window.userAvatar = 'data:img/jpg;base64,/9j/4AAQSkZJRgABAQAASABIAAD/2wBDAAcFBQYFBAcGBgYIBwcICxILCwoKCxYPEA0SGhYbGhkWGRgcICgiHB4mHhgZIzAkJiorLS4tGyIyNTEsNSgsLSz/2wBDAQcICAsJCxULCxUsHRkdLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCwsLCz/wAARCACEAIQDASIAAhEBAxEB/8QAGwAAAQUBAQAAAAAAAAAAAAAAAAECAwQGBQf/xAA5EAABAwIEBAEKBQMFAAAAAAABAAIDBBEFEiFRBhMxQWEUFSJEUnGSobHRI0KBkcEHMkMzNGJy8f/EABoBAAIDAQEAAAAAAAAAAAAAAAACAQMEBQb/xAAqEQACAgECBQMDBQAAAAAAAAAAAQIRAwQSBRMhMVEUQXEiI2EyM6HB8f/aAAwDAQACEQMRAD8A80DUoCcAnWXqjxdDQEXTrIspIsbdFvBOsiyAsZZF0+yRAUMskIUlrpCFBNkRCSykskIUDWQkJCFIRqkISjpkJKaSpbJpCVliZEW6oT7IQNZ0bJcqWyWyuMFiWRZOsjKgi6obZFk6yLICxmTwRlT7JLIJsZY7JLqSyaSglMZdNIUhCSygZMiITSFKQm2SsdMiCaQpSEwhKOmR2QnEaoUDWdGyUBSZUob4K6zFTI7IspciMqLDayKyS6myppZqiwpkd0llLkGyMtkBTIrpLKYt0TS1Fk0yIhNIUuRIWqLG6kJCQhSlqaWqBupE4phGimLUwhKx1ZCeqE8t1QlGOxkShqscpAjU2TsIMqMisGPwSthc9wa1pLibAAalFhsKuVJkXY8wYiXlgpiXNOUjMLg7dU2bAsRgjL5aORrR1JHhf+FG9eSNhyciQsVgxkI5ZTWG0rZU0sVrlnZNMevRFk7CrlSFqtthc94Yxpc5xsABckq07AsUsD5tq7HoeS7X5JXNIaOGT7M5BYmlq6c+FV1M0unoqiIDqXxkAfuFTLPBG6weNruiqWpharRYmFiiydpWshTGPVCLDaaLlJREuj5Md1YosKkrZ8jAABq5x6NCp3mjacynoKirkLKaCSd4Fy2Nhcbb2C1vAXDcr+JmT4hTSxMpWmVjZYy3O/t1Ha9/0XWw2CHDx5NTNyty5nvP9zzpqf36K3z5IpWzROLZGG7SqZzck0iMc4xkm+tCcY4I6GUYtR2Ycw5rb2Dj9/8AzZTYTh5NIJqjLI6VvbUWP2VOvnq8RmkqqyYPDATFE3Rkem3c+JVHDqrEqPmRUkjeVIDo8X5ZP5m/boq9s3j231GU8Uc/Mrp/fkxuL4NLQYtU0zI3vZG8hpAJ06jX3Fc8wkGxaQR2XqNMPJYQ0PJd1Liep7lZjFMP8vdNUx6ztkfmHtC5+a0Rm+zKm4uXwZMxadErKWSY2jjc8js0XXWosKnr6tlPAzM937AbnxXpWB4MzBYeRT2Ly28knQvOny2CTLnWNfk6ej4fPVPwl7mS/pzw4ajHZK6ric1tC0PY17bZnnoddrE++y2WKUZpKptXDI1kbzZ7HGwvuFbkEzSHteQ9urSufJSzVFVJW1rxK8A8tg0bGNgP5XLy5HkluZ6vQaNaTona9/yWKyCKroH08lpI5mEEjUEEdl4nXYfJRVs1NIDmieWXt1t3XrTJ5aFr4oB+E7oD/j8W/ZXY8UbFTtiiYYw0b6p8GZ4rKtdw16pp3VHhpiTDGvQuI+GhVw+cqJn4rm5pox+bdw8d1jDTnZdOGVTVo8vqdHPTT2T/ANOaY0Locg7ITbzLsNt5EfZUkEMlNMJI9CPmtN5tHso82j2Vg5xv5BQinGYy2tdmW2xuEeU3Cs1dKKeDMR1K5xsei0Qe5WcnNDlTcSZ8wdDI0d2n6JkczYmBrRqoXH0Xe4qO5HdOU2W3TktK5cExY6XKMzuY42/VW2uJXRpcNY6mY9rAM4zGw6kpJz2IvwYudPr7HJwyrnw6vfOGAsl0kaBa4v2W1pKmKotLG67XMP1K4hwwbKzQU7qV7yCQC3p+oWLI1Lqeq4fllifKfZ/wdV79dSoJXgRP17FV3yHcqF7yWO17LOehUQc8bAlNyRnUtCjz2Ts4IQWFasrY6CgY4elKW+gz+T4LGTUzp5nyvAzPJcbC2q1E1EZpHPNzc6X2UJw3wWrHJQR5TXZpaidPsjMeQ+CFpfNp2QreYYOUegchuyTkN3Vu6SwXPs30Z7iGINo4+13/AMLPEWWm4oNqSG3tn6LMZl0tP+hHA1v7zFLfRPuUdgpC8Bh07KPO3ZXmIXoFrcNgBwymO8Y+iyIsStrh1hhlML/42/VZdT2R0+HL638DjTBQVEIjhJCulw3VWteDBpusNnodMvuwOU9Md/pu9yVztUxzhkd7kHoyMWIQSB0TLFH5UDUdGCnDqdjtylNKFYpP9nF/1CedUWeVyL638FE0rb9EK5ZCLEo7ZBHZRvJHZXMqQxNd1ASWNRlOJH3p4R/yP0WcuVvsQwCLE2tBmfFlJPogG65p4GhPr0vwBb8OaEYJI4+q0uTJkcoroZGR34Z1UbBmK2J4EhOnl0vwBOHA0LRYVsnwBXeox+TL6LN4MfmDQSu9SV8go4W66MA+Svu4FiPrz/gH2XQp+H4KaFkeZ0mUAXOl1nz5YzSSN+i0+TFJubOKa2TYprap02Zp7C60fmuD2FE/A4JQcrjGT3CyWjtYGo5FJmbc5RPdoVozwxEfWn/CEw8Kxn1p/wAIRaO0tXi8mbLrJhkAWkdwmw+tv+EJh4Ri71cnwhFon1mHycumrjy2sHbRXGzOcFchwCCnFs7nncqyKCNo0CizhZOsm0c0Od4oXS8mYPyoRZXR10qEJQHtTkITIgEIQpIBRuCEJSWJZAGqEIHQ6yQoQoHGprkIQBCQmHqhCBRpGqEIUkH/2Q==';
'''
9.2 login.html
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<title>Title</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<div style="width: 200px;margin: 0 auto;">
<h1 style="text-align: center;">扫码登录</h1>
<img id="userAvatar" style="width: 200px;height: 200px;" src="https://login.weixin.qq.com/qrcode/{{qcode}}" alt="">
</div> <script src="https://cdn.bootcss.com/jquery/3.3.0/jquery.min.js"></script>
<script>
$(function () {
checkLogin();
}); function checkLogin() {
$.ajax({
url: '/check/login',
method: 'GET',
dataType: 'json',
success: function (arg) {
console.log(arg);
checkLogin();
if (arg.code === 408) {
checkLogin();
} else if (arg.code === 201) {
$('#userAvatar').attr('src', arg.avatar);
checkLogin();
}else if(arg.code === 200){
location.href = "/index"
}
}
})
}
</script>
</body>
</html>
9.3 index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Title</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body>
<h1>欢迎使用Web微信</h1>
</body>
</html>
10 Web微信:用户凭证
10.1 xml解析器函数的引入,将xml格式的文档转化为字典类型;
import re data = 'window.QRLogin.code = 200; window.QRLogin.uuid = "IaFEIY2ncA==";' ret = re.findall('uuid = "(.*)";', data)[0]
#print(ret) #text = '<error><ret>0</ret><message></message><skey>@crypt_7f94ab6d_09c6f5c641735b9d508b2973de7e1c73</skey><wxsid>dxtAt/oUed7drgJw</wxsid><wxuin>2150889942</wxuin><pass_ticket>DgAlY28j5CsCBMamz1h0FU42CAK6Bn%2BLk888LtaZnsOlckkPWFpgMu0KBFY91IUi</pass_ticket><isgrayscale>1</isgrayscale></error>'
from bs4 import BeautifulSoup def xml_parse(text):
result = {}
soup = BeautifulSoup(text,'html.parser')
tag_list = soup.find(name='error').find_all()
for tag in tag_list:
result[tag.name] = tag.text
return result
v = '<error><ret>0</ret><message></message><skey>@crypt_7f94ab6d_09c6f5c641735b9d508b2973de7e1c73</skey><wxsid>dxtAt/oUed7drgJw</wxsid><wxuin>2150889942</wxuin><pass_ticket>DgAlY28j5CsCBMamz1h0FU42CAK6Bn%2BLk888LtaZnsOlckkPWFpgMu0KBFY91IUi</pass_ticket><isgrayscale>1</isgrayscale></error>'
result = xml_parse(v)
print(result)
11 Web微信:用户信息初始化
12 当日总结
12.1 获取二维码;
12.2 检查登录;
12.3 检查登录:确认登录
12.4 获取凭证,redirect_uri
12.5 信息初始化——联系人、公众号信息;
12.6 获取联系人所有的信息;
Python-S9-Day124-爬虫&微信的更多相关文章
- 教你如何入手用python实现简单爬虫微信公众号并下载视频
主要功能 如何简单爬虫微信公众号 获取信息:标题.摘要.封面.文章地址 自动批量下载公众号内的视频 一.获取公众号信息:标题.摘要.封面.文章URL 操作步骤: 1.先自己申请一个公众号 2.登录自己 ...
- c#代码 天气接口 一分钟搞懂你的博客为什么没人看 看完python这段爬虫代码,java流泪了c#沉默了 图片二进制转换与存入数据库相关 C#7.0--引用返回值和引用局部变量 JS直接调用C#后台方法(ajax调用) Linq To Json SqlServer 递归查询
天气预报的程序.程序并不难. 看到这个需求第一个想法就是只要找到合适天气预报接口一切都是小意思,说干就干,立马跟学生沟通价格. 不过谈报价的过程中,差点没让我一口老血喷键盘上,话说我们程序猿的人 ...
- Python学习网络爬虫--转
原文地址:https://github.com/lining0806/PythonSpiderNotes Python学习网络爬虫主要分3个大的版块:抓取,分析,存储 另外,比较常用的爬虫框架Scra ...
- 初探爬虫 ——《python 3 网络爬虫开发实践》读书笔记
零.背景 之前在 node.js 下写过一些爬虫,去做自己的私人网站和工具,但一直没有稍微深入的了解,借着此次公司的新项目,体系的学习下. 本文内容主要侧重介绍爬虫的概念.玩法.策略.不同工具的列举和 ...
- Python 开发轻量级爬虫08
Python 开发轻量级爬虫 (imooc总结08--爬虫实例--分析目标) 怎么开发一个爬虫?开发一个爬虫包含哪些步骤呢? 1.确定要抓取得目标,即抓取哪些网站的哪些网页的哪部分数据. 本实例确定抓 ...
- Python 开发轻量级爬虫07
Python 开发轻量级爬虫 (imooc总结07--网页解析器BeautifulSoup) BeautifulSoup下载和安装 使用pip install 安装:在命令行cmd之后输入,pip i ...
- Python 开发轻量级爬虫06
Python 开发轻量级爬虫 (imooc总结06--网页解析器) 介绍网页解析器 将互联网的网页获取到本地以后,我们需要对它们进行解析才能够提取出我们需要的内容. 也就是说网页解析器是从网页中提取有 ...
- Python 开发轻量级爬虫05
Python 开发轻量级爬虫 (imooc总结05--网页下载器) 介绍网页下载器 网页下载器是将互联网上url对应的网页下载到本地的工具.因为将网页下载到本地才能进行后续的分析处理,可以说网页下载器 ...
- Python 开发轻量级爬虫04
Python 开发轻量级爬虫 (imooc总结04--url管理器) 介绍抓取URL管理器 url管理器用来管理待抓取url集合和已抓取url集合. 这里有一个问题,遇到一个url,我们就抓取它的内容 ...
- Python 开发轻量级爬虫03
Python 开发轻量级爬虫 (imooc总结03--简单的爬虫架构) 现在来看一下一个简单的爬虫架构. 要实现一个简单的爬虫,有哪些方面需要考虑呢? 首先需要一个爬虫调度端,来启动爬虫.停止爬虫.监 ...
随机推荐
- JS案例练习:图片切换+切换模式
先附图: CSS样式部分: <style> *{;} body{font-family:'Microsoft YaHei';} .menu{margin:20px auto 0; widt ...
- Mysql update后insert造成死锁原因分析及解决
系统中出现死锁的日志如下: ) TRANSACTION: , ACTIVE sec inserting mysql tables , locked LOCK WAIT lock struct(s), ...
- 第1章 .Net应用程序体系结构
1. CLR:公共语言运行库,是每种.Net编程语言都使用的运行库 Windows 8为Windows Store应用程序引入了一个新的编程接口:Windows运行库. C# 6 具有许多小而实用的语 ...
- UVA 1213 - Sum of Different Primes(递推)
类似一个背包问题的计数问题.(虽然我也不记得这叫什么背包了 一开始我想的状态定义是:f[n = 和为n][k 个素数]. 递推式呼之欲出: f[n][k] = sigma f[n-pi][k-1]. ...
- UVA 12118 Inspector's Dilemma(连通性,欧拉路径,构造)
只和连通分量以及度数有关.不同连通分量只要连一条边就够了,连通分量为0的时候要特判.一个连通分量只需看度数为奇的点的数量,两个端点(度数为奇)是必要的. 如果多了,奇点数也一定是2的倍数(一条边增加两 ...
- js 封装父页面子页面交互接口
定义标准接口 Interface= {}; Interface.ParentWin = {}; Interface.ChildWin = {}; /** * 父页面提供的标准接口函数名称 */ Int ...
- Airflow 调度基础
1. Airflow Airflow是一个调度.监控工作流的平台.用于将一个工作流制定为一组任务的有向无环图(DAG),并指派到一组计算节点上,根据相互之间的依赖关系,有序执行. 2. 安装 pip安 ...
- window下安装ubuntu(ubuntu可删除)
进入ububtu13.04的安装界面,这里我们选择了“中文(简体)”,然后单击安装: 下图是现场拍的: 出现如下图时,请根据需要选择,然后单击“继续” , 接下来会出现问你是否要连接网络,我们选择 ...
- jquery iCheck 插件
1 官网:http://www.bootcss.com/p/icheck/#download 2 博客:https://www.cnblogs.com/xcsn/p/6307610.html http ...
- js实现指定日期增加指定月份
首先,大致思路为: 1. 先将字符串格式的时间类型转化为Date类型 2. 再将Date类型的时间增加指定月份 3. 最后将Date类型的时间在转化为字符串类型 1. 先将字符串格式的时间类型转化为 ...