URL Parsing
【URL Parsing】
urllib.parse.
urlparse
(urlstring, scheme='', allow_fragments=True)
Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment
. Each tuple item is a string, possibly empty. The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the path component, which is retained if present. For example:
Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.
The scheme argument gives the default addressing scheme.
If the allow_fragments argument is false, fragment identifiers are not recognized. Instead, they are parsed as part of the path, parameters or query component, and fragment
is set to the empty string in the return value.
urllib.parse.
parse_qs
(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')
Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a dictionary. The dictionary keys are the unique query variable names and the values are lists of values for each name.
urllib.parse.
parse_qsl
(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')
Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a list of name, value pairs.
urllib.parse.
urlunparse
(parts)
Construct a URL from a tuple as returned by urlparse()
. The parts argument can be any six-item iterable.
urllib.parse.
urlsplit
(urlstring, scheme='', allow_fragments=True)
This is similar to urlparse()
, but does not split the params from the URL.
urllib.parse.
urlunsplit
(parts)
Combine the elements of a tuple as returned by urlsplit()
into a complete URL as a string. The parts argument can be any five-item iterable.
urllib.parse.
urldefrag
(url)
If url contains a fragment identifier, return a modified version of url with no fragment identifier, and the fragment identifier as a separate string. If there is no fragment identifier in url, return url unmodified and an empty string.
URL Parsing的更多相关文章
- w3 parse a url
最新链接:https://www.w3.org/TR/html53/ 2.6 URLs — HTML5 li, dd li { margin: 1em 0; } dt, dfn { font-wei ...
- GO语言的开源库
Indexes and search engines These sites provide indexes and search engines for Go packages: godoc.org ...
- Python Web Crawler
Python版本:3.5.2 pycharm URL Parsing¶ https://docs.python.org/3.5/library/urllib.parse.html?highlight= ...
- Curl的编译
下载 curl的官网:https://curl.haxx.se/ libcurl就是一个库,curl就是使用libcurl实现的. curl是一个exe,也可以说是整个项目的名字,而libcurl就是 ...
- Go语言(golang)开源项目大全
转http://www.open-open.com/lib/view/open1396063913278.html内容目录Astronomy构建工具缓存云计算命令行选项解析器命令行工具压缩配置文件解析 ...
- Go by Example
Go by Example Go is an open source programming language designed for building simple, fast, and reli ...
- Node.js API
Node.js v4.4.7 Documentation(官方文档) Buffer Prior to the introduction of TypedArray in ECMAScript 2015 ...
- [转]Go语言(golang)开源项目大全
内容目录 Astronomy 构建工具 缓存 云计算 命令行选项解析器 命令行工具 压缩 配置文件解析器 控制台用户界面 加密 数据处理 数据结构 数据库和存储 开发工具 分布式/网格计算 文档 编辑 ...
- python爬虫从入门到放弃(三)之 Urllib库的基本使用
官方文档地址:https://docs.python.org/3/library/urllib.html 什么是Urllib Urllib是python内置的HTTP请求库包括以下模块urllib.r ...
随机推荐
- 【转载】AngularJs 指令directive之controller,link,compile
关于自定义指令的命名,你可以随便怎么起名字都行,官方是推荐用[命名空间-指令名称]这样的方式,像ng-controller.不过你可千万不要用 ng-前缀了,防止与系统自带的指令重名.另外一个需知道的 ...
- Android——数据存储(课堂代码整理:SharedPreferences存储和手机内部文件存储)
layout文件: <?xml version="1.0" encoding="utf-8"?> <LinearLayout xmlns:an ...
- github 如何合并不同分支
From: http://stackoverflow.com/questions/1123344/merging-between-forks-in-github 1. 添加remote origina ...
- 排序算法总结(四)快速排序【QUICK SORT】
感觉自己这几篇都是主要参考的Wikipedia上的,快排就更加是了....wiki上的快排挺清晰并且容易理解的,需要注意的地方我也添加上了注释,大家可以直接看代码.需要注意的是,wikipedia上快 ...
- Mysql 数据库单机多实例部署手记
最近的研发机器需要部署多个环境,包括数据库.为了管理方便考虑将mysql数据库进行隔离,即采用单机多实例部署的方式.找了会资料发现用的人也不是太多,一般的生产环境为了充分发挥机器性能都是单机单 ...
- 【其它】 MathJax - 网页中显示数学公式的终极武器
最近在学习一些数学课程.但时间一长,发现很多东西又都忘了.而且过程中的很多心得没有留下记录,觉得挺可惜的.所以决定开个博客来记录一些东西,也希望能同数学爱好者们一起学习. 但写数学博客首先得解决显示数 ...
- Redis入门(一)系统安装
硬件环境:Thinkpad T450,Intel i5-5200U CPU @ 2.20GHz × 4 ,8GB RAM 软件环境: ubuntu 14.04.4 (trusty) 一.软件安装 #w ...
- 过滤HTML代码
public static string FilterHtml(string string_include_html) { string[] HtmlRegexArr ={ #region Html ...
- HTML5能取代IOS原生应用吗
介绍 移动应用程序(App)和HTML5都是目前最火的技术,二者之间也有不少重叠之处.在移动设备浏览器里运行的html5的web页面,也可以重新打包成不同平台上运行的app.目前很多浏览器都有很好的跨 ...
- 严格模式use strict
严格模式主要有以下限制: 变量必须声明后再使用函数的参数不能有同名属性,否则报错不能使用with语句不能对只读属性赋值,否则报错不能使用前缀0表示八进制数,否则报错不能删除不可删除的属性,否则报错不能 ...