http://www.hashbangcode.com/blog/netscape-http-cooke-file-parser-php

I recently needed to create a function that would read and extract cookies from a Netscape HTTP cookie file. This file is generated by PHP when it runs CURL (with the appropriate options enabled) and can be used in subsequent CURL calls. This file can be read to see what cookies where created after CURL has finished running. As an example, this is the sort of file that might be created during a typical CURL call.

# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

www.example.com        FALSE        /        FALSE                cookiename        value

The first few lines are comments and can therefore be ignored. The cookie data consists of the following items (in the order they appear in the file.

  • domain - The domain that created and that can read the variable.
  • flag - A TRUE/FALSE value indicating if all machines within a given domain can access the variable. This value is set automatically by the browser, depending on the value you set for domain.
  • path - The path within the domain that the variable is valid for.
  • secure - A TRUE/FALSE value indicating if a secure connection with the domain is needed to access the variable.
  • expiration - The UNIX time that the variable will expire on.
  • name - The name of the variable.
  • value - The value of the variable.

So the function used to extract this information would look like this. It works in a pretty straightforward way and essentially returns an array of cookies found, if any. I originally tried to use a hash character to determine the start of a commented line and then try to extract anything else that had content. It turns out, however, that some sites will add cookies with a hash character at the start (yes, even for the URL parameter). So it is safer to detect for a cookie line by seeing if there are 6 tab characters in it. This is then exploded by the tab character and converted into an array of data items.

/**
 * Extract any cookies found from the cookie file. This function expects to get
 * a string containing the contents of the cookie file which it will then
 * attempt to extract and return any cookies found within.
 *
 * @param string $string The contents of the cookie file.
 *
 * @return array The array of cookies as extracted from the string.
 *
 */
function extractCookies($string) {
    $cookies = array();

    $lines = explode("\n", $string);

    // iterate over lines
    foreach ($lines as $line) {

        // we only care for valid cookie def lines
        if (isset($line[0]) && substr_count($line, "\t") == 6) {

            // get tokens in an array
            $tokens = explode("\t", $line);

            // trim the tokens
            $tokens = array_map('trim', $tokens);

            $cookie = array();

            // Extract the data
            $cookie['domain'] = $tokens[0];
            $cookie['flag'] = $tokens[1];
            $cookie['path'] = $tokens[2];
            $cookie['secure'] = $tokens[3];

            // Convert date to a readable format
            $cookie['expiration'] = date('Y-m-d h:i:s', $tokens[4]);

            $cookie['name'] = $tokens[5];
            $cookie['value'] = $tokens[6];

            // Record the cookie.
            $cookies[] = $cookie;
        }
    }

    return $cookies;
}

To test this function I used the following code. This takes a URL (google.com in this case) and sets up the options for CURL so that when the page is downloaded it also creates a cookie file. This file is then analyzed using the above function to see what cookies are present therein.

// Url to extract cookies from
$url = 'http://www.google.com/';

// Create a cookiefar file
$cookiefile = tempnam("/tmp", "CURLCOOKIE");

// create a new cURL resource
$curl = curl_init();

curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

// Set user agent
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.3) Gecko/20090910 Ubuntu/9.04 (jaunty) Shiretoko/3.5.3");

// set URL and other appropriate options
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_HEADER, true);

curl_setopt($curl, CURLOPT_COOKIEJAR, $cookiefile);

$data = curl_exec($curl);

// close cURL resource, and free up system resources
curl_close($curl);

// Extract and store any cookies found
print_r(extractCookies(file_get_contents($cookiefile)));

When run, this function produces the following output.

Array
(
    [0] => Array
        (
            [domain] => .google.com
            [flag] => TRUE
            [path] => /
            [secure] => FALSE
            [expiration] => 2013-06-29 10:00:01
            [name] => PREF
            [value] => ID=051f529ee8937fc5:FF=0:TM=1309424401:LM=1309424401:S=4rhYyPL_bW9KxVHI
        )

)

Netscape HTTP Cooke File Parser In PHP的更多相关文章

  1. minIni: A minimal INI file parser

    https://www.compuphase.com/minini.htm https://github.com/compuphase/minIni

  2. VBA json parser[z]

    http://www.ediy.co.nz/vbjson-json-parser-library-in-vb6-xidc55680.html VB-JSON: A Visual Basic 6 (VB ...

  3. Lexer and parser generators (ocamllex, ocamlyacc)

    Chapter 12 Lexer and parser generators (ocamllex, ocamlyacc) This chapter describes two program gene ...

  4. qLibc 对于C C++都是一个很好的框架,提供Tree Hash Stack String I/O File Time等功能

    qLibc Copyright qLibc is published under 2-clause BSD license known as Simplified BSD License. Pleas ...

  5. Python中的option Parser

    一般来说,Python中有两个内建的模块用于处理命令行参数: 一个是 getopt,<Deep in python>一书中也有提到,只能简单处理 命令行参数: 另一个是 optparse, ...

  6. java面试题总汇

    coreJava部分 7 1.面向对象的特征有哪些方面? 7 2.作用域public,private,protected,以及不写时的区别? 7 3.String 是最基本的数据类型吗? 7 4.fl ...

  7. SSH三大框架笔面试总结

    Java工程师(程序员)面题 Struts,Spring,Hibernate三大框架 1.Hibernate工作原理及为什么要用? 原理: 1.读取并解析配置文件 2.读取并解析映射信息,创建Sess ...

  8. Java面试葵花宝典

    面向对象的特征有哪些方面  1. 抽象:抽象就是忽略一个主题中与当前目标2. 无关的那些方面,3. 以便更充分地注意与当前目标4. 有关的方面.抽象并不5. 打算了解全部问题,而6. 只是选择其中的一 ...

  9. java面试题小全

    面向对象的特征有哪些方面   1. 抽象:抽象就是忽略一个主题中与当前目标2. 无关的那些方面,3. 以便更充分地注意与当前目标4. 有关的方面.抽象并不5. 打算了解全部问题,而6. 只是选择其中的 ...

随机推荐

  1. 关于跨域GET、POST请求的小结//////////////////////zzzzzzz

    JQuery:$.ajax/$.getJSON支持jsonp格式的跨域,但是只支持GET方式,暂不支持POST: so,跨域POST是个值得研究的问题啊!万能的JQuery无法跨域POST:鉴于基本国 ...

  2. PeopleSoft Rich Text Boxes上定制Tool Bars

      在使用PT8.50或在8.51时,你可能遇到过Rich-text编辑框.该插件使你能够格式化文本,添加颜色.链接.图片等等.下面是效果图: 如果页面中只有这么一个字段,该文本框就会有足够的空间来容 ...

  3. PHP设计模式之:单例模式

        前 些日子开始着真正的去了解下设计模式,开始么,简单地从单例模式开始,当然网上看了一些资料,单例模式比较好理解,看看介绍,然后看看代码基本也就能够理 解了,设计模式这些的花点心思基本的是能够理 ...

  4. Mountains(CVTE面试题)解题报告

    题目大意: 用一个数组代表群山的高度.高度大的地方代表山峰,小的地方代表山谷.山谷可以容水.假设有一天下了大雨,求群山中总共可以容纳多少水? 如图所示情况,a代表该数组,总共可以容纳5个水. 解题思路 ...

  5. 多Web服务器之间共享Session的解决方案

    一.提出问题: 为了满足足够大的应用,满足更多的客户,于是我们架设了N台Web服务器(N>=2),在多台Web服务器的情况下,我们会涉及到一个问题:用户登陆一台服务器以后,如果在跨越到另一台服务 ...

  6. redmine computed custom field formula tips

    项目中要用到Computed custom field插件,公式不知道怎么写,查了些资料,记录在这里. 1.http://apidock.com/ruby/Time/strftime 查看ruby的字 ...

  7. Samsung S4卡屏卡在开机画面的不拆机恢复照片一例

    大家好!欢迎再次来到我Dr.wonder的世界, 今天我给你们带来Samsung S4 I9508 卡屏开在开机画面的恢复!非常de经典. 首先看图 他开机一直卡在这里, 然后 ,我们使用专业仪器,在 ...

  8. Duilib 开发中的小经验

    # duilib开发中收集的小代码 # ## 1 窗体创建 ## - 窗体多继承于 public WindowImplBase ,简单的定义几个函数就可以实现:拖曳caption移动(设置xml窗体的 ...

  9. WebLogic集群体系架构

    WebLogic Server集群概述  WebLogic Server 群集由多个 WebLogic Server 服务器实例组成,这些服务器实例同时运行并一起工作以提高可缩放性和可靠性.对于客户端 ...

  10. Mvc 模块化开发

    在Mvc中,标准的模块化开发方式是使用Areas,每一个Area都可以注册自己的路由,使用自己的控件器与视图.但是在具体使用上它有如下两个限制 1.必须把视图文件放到主项目的Areas文件夹下才能生效 ...