Python 正则表达式【二】
关于前向,后向,匹配,非匹配
Matches if ... matches next, but doesn’t consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it’s followed by 'Asimov'.(?!...)Matches if ... doesn’t match next. This is a negative lookahead assertion. For example, Isaac (?!Asimov) will match 'Isaac ' only if it’s not followed by 'Asimov'.(?<=...)Matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion. (?<=abc)def will find a match in abcdef, since the lookbehind will back up 3 characters and check if the contained pattern matches. The contained pattern must only match strings of some fixed length, meaning that abc or a|b are allowed, but a* and a{3,4} are not. Note that patterns which start with positive lookbehind assertions will never match at the beginning of the string being searched; you will most likely want to use the search() function rather than the match() function:
>>> import re
>>> m = re.search('(?<=abc)def', 'abcdef')
>>> m.group(0)
'def'This example looks for a word following a hyphen:
>>> m = re.search('(?<=-)\w+', 'spam-egg')
>>> m.group(0)
'egg'
(?<!...)Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.
前向,后向,匹配,非匹配示例代码:
- import re
- def testPrevPostMatch():
- # post match: (?=xxx)
- # post non-match: (?!xxx)
- # prev match: (?<=xxx)
- # prev non-match: (?<!xxx)
- #note that input string is:
- #src=\"http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA\"
- qqPicUrlStr = 'src=\\"http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA\\"'
- qqPicUrlInvalidPrevStr = '1234567http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA\\"'
- qqPicUrlInvalidPostStr = 'src=\\"http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA123'
- canFindPrevPostP = r'(?<=src=\\")(?P<qqPicUrl>http://.+?\.qq\.com.+?)(?=\\")'
- qqPicUrl = ""
- foundPrevPost = re.search(canFindPrevPostP, qqPicUrlStr)
- print "foundPrevPost=",foundPrevPost
- if(foundPrevPost):
- qqPicUrl = foundPrevPost.group("qqPicUrl")
- print "qqPicUrl=",qqPicUrl; # qqPicUrl= http://b101.photo.store.qq.com/psb?/V10ppwxs00XiXU/5dbOIlYaLYVPWOz*1nHYeSFq09Z5rys72RIJszCsWV8!/b/YYUOOzy3HQAAYqsTPjz7HQAA
- print "can found qqPicUrl here"
- foundInvalidPrev = re.search(canFindPrevPostP, qqPicUrlInvalidPrevStr)
- print "foundInvalidPrev=",foundInvalidPrev; # foundInvalidPrev= None
- if(not foundInvalidPrev):
- print "can NOT found qqPicUrl here"
- foundInvalidPost = re.search(canFindPrevPostP, qqPicUrlInvalidPostStr)
- print "foundInvalidPost=",foundInvalidPost; # foundInvalidPost= None
- if(not foundInvalidPost):
- print "can NOT found qqPicUrl here"
- return
Python中正则表达式关于引用named group的用法示例
- import re
- def testBackReference():
- # back reference (?P=name) test
- backrefValidStr = '"group":0,"iconType":"NonEmptyDocumentFolder","id":"9A8B8BF501A38A36!601","itemType":32,"name":"released","ownerCid":"9A8B8BF501A38A36"'
- backrefInvalidStr = '"group":0,"iconType":"NonEmptyDocumentFolder","id":"9A8B8BF501A38A36!601","itemType":32,"name":"released","ownerCid":"987654321ABCDEFG"'
- backrefP = r'"group":\d+,"iconType":"\w+","id":"(?P<userId>\w+)!\d+","itemType":\d+,"name":".+?","ownerCid":"(?P=userId)"'
- userId = ""
- foundBackref = re.search(backrefP, backrefValidStr)
- print "foundBackref=",foundBackref; # foundBackref= <_sre.SRE_Match object at 0x02B96660>
- if(foundBackref):
- userId = foundBackref.group("userId")
- print "userId=",userId; # userId= 9A8B8BF501A38A36
- print "can found userId here"
- foundBackref = re.search(backrefP, backrefInvalidStr)
- print "foundBackref=",foundBackref; # foundBackref= None
- if(not foundBackref):
- print "can NOT found userId here"
- return
Python 正则表达式【二】的更多相关文章
- python正则表达式二[转]
原文:http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html 1. 正则表达式基础 1.1. 简单介绍 正则表达式并不是Python的一 ...
- python3.4学习笔记(十二) python正则表达式的使用,使用pyspider匹配输出带.html结尾的URL
python3.4学习笔记(十二) python正则表达式的使用,使用pyspider匹配输出带.html结尾的URL实战例子:使用pyspider匹配输出带.html结尾的URL:@config(a ...
- Python正则表达式初识(二)
前几天给大家分享了Python正则表达式初识(一),介绍了正则表达式中的三个特殊字符“^”.“.”和“*”,感兴趣的伙伴可以戳进去看看,今天小编继续给大家分享Python正则表达式相关特殊字符知识点. ...
- Python 正则表达式入门(中级篇)
Python 正则表达式入门(中级篇) 初级篇链接:http://www.cnblogs.com/chuxiuhong/p/5885073.html 上一篇我们说在这一篇里,我们会介绍子表达式,向前向 ...
- python正则表达式 小例几则
会用到的语法 正则字符 释义 举例 + 前面元素至少出现一次 ab+:ab.abbbb 等 * 前面元素出现0次或多次 ab*:a.ab.abb 等 ? 匹配前面的一次或0次 Ab?: A.Ab 等 ...
- Python 正则表达式-OK
Python正则表达式入门 一. 正则表达式基础 1.1. 简单介绍 正则表达式并不是Python的一部分. 正则表达式是用于处理字符串的强大工具, 拥有自己独特的语法以及一个独立的处理引擎, 效率上 ...
- Python正则表达式一
推荐 http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html#!comments 这篇博客超好,建议收藏. 不过对于正则表达式小白,他没 ...
- python 正则表达式汇总
一. 正则表达式基础 1.1.概念介绍 正则表达式是用于处理字符串的强大工具,它并不是Python的一部分. 其他编程语言中也有正则表达式的概念,区别只在于不同的编程语言实现支持的语法数量不同. 它拥 ...
- 一篇搞定Python正则表达式
1. 正则表达式语法 1.1 字符与字符类 1 特殊字符:\.^$?+*{}[]()| 以上特殊字符要想使用字面值,必须使用\进行转义 2 字符类 1. 包含在[]中的一个或者多个字符被称为字符 ...
- Python正则表达式很难?一篇文章搞定他,不是我吹!
1. 正则表达式语法 1.1 字符与字符类 1 特殊字符:.^$?+*{}| 以上特殊字符要想使用字面值,必须使用进行转义 2 字符类 1. 包含在[]中的一个或者多个字符被称为字符类,字符类在匹配时 ...
随机推荐
- CNN for NLP
卷积神经网络在自然语言处理任务中的应用.参考链接:Understanding Convolutional Neural Networks for NLP(2015.11) Instead of ima ...
- D - A or...or B Problem
题意:给定A,B,问[A,B]里取任意个数按位或,结果有多少种. 思路:这题需要找出一个分界点,即找到最高位的B是1,A是0的位置x(最低位从0开始),那么对于所有OR的结果,x处要么是1要么是0,x ...
- 精选30道Java多线程面试题
1.线程和进程的区别 进程是应用程序的执行实例.比如说,当你双击的Microsoft Word的图标,你就开始运行的Word的进程.线程是执行进程中的路径.另外,一个过程可以包含多个线程.启动Word ...
- tcpip入门的网络教程汇总
网络编程懒人入门(一):快速理解网络通信协议(上篇) http://www.52im.net/thread-1095-1-1.html TCP/IP详解学习笔记 https://www.cnblogs ...
- Java-CharTools工具类
package com.gootrip.util; import java.io.UnsupportedEncodingException; /** * <p>Title:字符编码工具类 ...
- android的ant编译打包
Android本身是支持ant打包项目的,并且SDK中自带一个build.xml文件. 通过该文件,可以对文件进行编译.打包.安装等.并且支持多种方式打包,如debug或者release. 一般的,可 ...
- js 获取两个数组的交集,并集,补集,差集
https://blog.csdn.net/piaojiancong/article/details/98199541 ES5 const arr1 = [1,2,3,4,5], arr2 = [5, ...
- jquery keyup()方法 语法
jquery keyup()方法 语法 作用:完整的 key press 过程分为两个部分,按键被按下,然后按键被松开并复位.当按钮被松开时,发生 keyup 事件.它发生在当前获得焦点的元素上.ke ...
- CodeForces 1197 D Yet Another Subarray Problem
题面 不得不说CF还是很擅长出这种让人第一眼看摸不着头脑然后再想想就发现是个SB题的题的hhh(请自行断句). 设sum[]为前缀和数组,那么区间 [l,r]的价值为 sum[r] - sum[l-1 ...
- Centos 7 安装 Xilinx SDSoC Development Environment
1.CentOS版本信息 $ cat /etc/redhat-releaseCentOS Linux release 7.6.1810 (Core) 2.SDSoC下载地址: https://www. ...