Selenium support for PhantomJS has been deprecated, please use headless

Today, while scraping a dynamically rendered web page with Selenium + PhantomJS, the following warning appeared:

C:\Python36\lib\site-packages\selenium-3.11.0-py3.6.egg\selenium\webdriver\phantomjs\webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead

  warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '

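In other words, Selenium has dropped support for PhantomJS and recommends a headless version of Firefox or Chrome instead. Note that this is a warning rather than an error: the script keeps running.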
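Because the message is a UserWarning, it is easy to miss in long logs. The sketch below uses only the standard `warnings` module to show the difference between the default behaviour (the call still returns) and escalating this particular warning to an exception. `fake_phantomjs_init` is a hypothetical stand-in for `webdriver.PhantomJS()`, so the example runs without Selenium installed.

```python
import warnings

PHANTOMJS_MSG = "Selenium support for PhantomJS has been deprecated"


def fake_phantomjs_init():
    """Hypothetical stand-in for webdriver.PhantomJS(): emits the same kind of warning."""
    warnings.warn(PHANTOMJS_MSG + ", please use headless "
                  "versions of Chrome or Firefox instead")
    return "driver"


# Default behaviour: the warning is only recorded, the call still returns normally.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = fake_phantomjs_init()

# Escalating the warning to an exception makes leftover PhantomJS usage fail fast.
with warnings.catch_warnings():
    warnings.filterwarnings("error", message=PHANTOMJS_MSG, category=UserWarning)
    try:
        fake_phantomjs_init()
        escalated = False
    except UserWarning:
        escalated = True
```

In a real project the same `filterwarnings("error", ...)` call could be placed at the top of a test run to find every remaining PhantomJS call site.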

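Download chromedriver.

Make sure Google Chrome is installed on the machine.

Put chromedriver.exe in the Scripts folder of your Python installation (C:\Python27\Scripts in this setup).

Chrome headless is a mode Google added in Chrome 59. It lets you use Chrome without opening a UI window, so pages load and render the same way they do in regular Chrome.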

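Firefox driver (geckodriver): https://github.com/mozilla/geckodriver/releases

https://github.com/mozilla/geckodriver/releases/download/v0.19.1/geckodriver-v0.19.1-linux64.tar.gz

Geckodriver-to-Firefox version compatibility:

https://blog.csdn.net/u013250071/article/details/78803230

After downloading the driver, either put it in the python27/Scripts directory, or put it in any other directory and add that directory to the PATH environment variable.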
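Before launching a browser it can help to confirm that the driver is actually reachable. The following is a minimal sketch using the standard library's `shutil.which`; the helper name `find_driver` and its `extra_dirs` parameter are illustrative, not part of Selenium.

```python
import shutil


def find_driver(name, extra_dirs=()):
    """Return the full path of a driver executable, or None if it cannot be found.

    PATH is searched first, then any extra directories supplied by the caller.
    """
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None


for driver_name in ("chromedriver", "geckodriver"):
    location = find_driver(driver_name)
    print(driver_name, "->", location or "not found on PATH")
```

On Windows, `shutil.which` also tries the extensions listed in PATHEXT, so searching for `chromedriver` should locate `chromedriver.exe`.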

Implementation:

        chrome_options = Options()
        # Chrome headless mode (added in Chrome 59) runs the browser without a UI window.
        chrome_options.add_argument('--headless')
        chrome_options.add_argument('--disable-gpu')
        # chrome_options= matches the Selenium 3.11 release used here;
        # newer Selenium releases expect options= instead.
        self.driver = webdriver.Chrome(chrome_options=chrome_options)
        self.driver.set_page_load_timeout(10)
        self.driver.maximize_window()

Everything else works the same as with PhantomJS.
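As a self-contained illustration of the fragment above, here is a hedged sketch that loads one page in headless Chrome and always closes the browser afterwards. The function names `headless_flags` and `fetch_html` are hypothetical, and the sketch assumes a Selenium release that accepts `options=` (the 3.11 release quoted in the warning was called with `chrome_options=`) and a chromedriver on PATH.

```python
def headless_flags(disable_gpu=True):
    """Command-line switches used in this article for headless Chrome."""
    flags = ["--headless"]
    if disable_gpu:
        flags.append("--disable-gpu")
    return flags


def fetch_html(url, timeout=10):
    """Load url in headless Chrome and return the rendered HTML."""
    # Imported lazily so headless_flags() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for flag in headless_flags():
        options.add_argument(flag)

    # Assumption: this Selenium release accepts options=;
    # older 3.x releases were called with chrome_options= instead.
    driver = webdriver.Chrome(options=options)
    try:
        driver.set_page_load_timeout(timeout)
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process
```

A call such as `html = fetch_html("https://example.com")` (placeholder URL) would then return the page source, just as `driver.page_source` did under PhantomJS.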

Complete Python code:

# coding=utf-8
import os
import re
import datetime

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from pyquery import PyQuery as pq

class consumer:
    def __init__(self):
        # Earlier attempts, kept for reference.
        # IE (the IEDriverServer.exe path could come from a config file):
        # IEDriverServer = r'C:\Program Files\Internet Explorer\IEDriverServer.exe'
        # self.driver = webdriver.Ie(IEDriverServer)
        # self.driver.maximize_window()
        # PhantomJS (deprecated):
        # self.driver = webdriver.PhantomJS(service_args=['--load-images=false'])
        # Chrome with a visible window:
        # self.driver = webdriver.Chrome()
        # Headless Chrome:
        # chrome_options = Options()
        # chrome_options.add_argument('--headless')
        # chrome_options.add_argument('--disable-gpu')
        # self.driver = webdriver.Chrome(chrome_options=chrome_options)

        # Headless Firefox is the variant actually used here.
        options = webdriver.FirefoxOptions()
        options.set_headless()
        # options.add_argument('-headless')
        options.add_argument('--disable-gpu')
        self.driver = webdriver.Firefox(firefox_options=options)
        self.driver.set_page_load_timeout(10)
        self.driver.maximize_window()

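    def WriteLog(self, message, date):
        # Append to consumer/<date>.txt under the current working directory.
        folder = os.path.join(os.getcwd(), 'consumer')
        if not os.path.isdir(folder):
            os.makedirs(folder)  # the original failed if this folder was missing
        fileName = os.path.join(folder, date + '.txt')
        # utf-8 so the Chinese titles are written the same way on every platform
        with open(fileName, 'a', encoding='utf-8') as f:
            f.write(message)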

    # http://search.cctv.com/search.php?qtext=消费主张&type=video
    def CatchData(self, url='http://search.cctv.com/search.php?qtext=%E6%B6%88%E8%B4%B9%E4%B8%BB%E5%BC%A0&type=video'):
        try:
            self.driver.get(url)
            # The log file is named after today's date; write the header row (title, time) first.
            filename = datetime.datetime.now().strftime('%Y-%m-%d')
            message = '{0},{1}'.format('标题', '时间')
            self.WriteLog(message, filename)
            # Titles end with a timestamp such as 2018-06-05 00:12:21
            pattern = re.compile(r"\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}")
            # Walk result pages 1 to 5 by calling the page's own get_data() function.
            for index in range(1, 6):
                script = "get_data('{0}', '消费主张', 'relevance', 'video', '-1', '1', '', '20', '1')".format(index)
                self.driver.execute_script(script)
                selenium_html = self.driver.execute_script("return document.documentElement.outerHTML")
                doc = pq(selenium_html)
                print(index)
                try:
                    Elements = doc("div[class='jvedio']").find("a")
                    for sub in Elements.items():
                        title = sub.attr('title')
                        if not title:
                            continue  # skip links that have no title attribute
                        print(title)
                        ts = pattern.findall(title)
                        strtime = ''
                        if ts and len(ts) == 1:
                            strtime = ts[0]
                        if strtime:
                            # Keep only the text before the timestamp and drop the bullet.
                            index1 = title.index(strtime)
                            title = str(title[0:index1]).replace("•", "")
                        title = '\n{0},{1}'.format(title, strtime)
                        self.WriteLog(title, filename)
                except Exception as e:
                    print("Error on page {0}: {1}".format(index, e))
        except Exception as e1:
            # Report the failure instead of silently discarding it.
            print("CatchData failed: {0}".format(e1))

obj = consumer()
try:
    obj.CatchData()
finally:
    obj.driver.quit()  # close the headless browser when done
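Two details of the script above can be checked without launching a browser: how the search keyword ends up percent-encoded in the URL, and how a title is split from its trailing timestamp. The sketch below isolates both using only the standard library; the helper names `build_search_url` and `split_title` are illustrative, and the sample title in the usage note is made up.

```python
import re
from urllib.parse import quote

# Same timestamp shape the script looks for, e.g. 2018-06-05 00:12:21
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}")


def build_search_url(keyword):
    """Percent-encode the keyword (as UTF-8) into the CCTV video search URL."""
    return "http://search.cctv.com/search.php?qtext={0}&type=video".format(quote(keyword))


def split_title(raw_title):
    """Split 'title ... timestamp' into (title, timestamp).

    The timestamp is empty unless the text contains exactly one match,
    mirroring the check in CatchData.
    """
    raw_title = raw_title or ""
    matches = TIMESTAMP.findall(raw_title)
    if len(matches) != 1:
        return raw_title, ""
    stamp = matches[0]
    title = raw_title[:raw_title.index(stamp)].replace("•", "")
    return title, stamp
```

For example, `build_search_url("消费主张")` reproduces the default URL of `CatchData`, and `split_title("Sample show 2018-06-05 00:12:21")` yields the title text and the timestamp as separate values.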
