javascript reg 不加入分组
from :https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-what-does-a-question-mark-followed-by-a-colon
fter reading some tutorials I still don't get it.
Could someone explain how ?:
is used and what it's good for?
Let me try to explain this with an example.
Consider the following text:
https://stackoverflow.com/
https://stackoverflow.com/questions/tagged/regex
Now, if I apply the regex below over it...
(http|ftp)://([^/\r\n]+)(/[^\r\n]*)?
... I would get the following result:
Match "https://stackoverflow.com/"
Group 1: "http"
Group 2: "stackoverflow.com"
Group 3: "/"
Match "https://stackoverflow.com/questions/tagged/regex"
Group 1: "http"
Group 2: "stackoverflow.com"
Group 3: "/questions/tagged/regex"
But I don't care about the protocol -- I just want the host and path of the URL. So, I change the regex to include the non-capturing group (?:)
.
(?:http|ftp)://([^/\r\n]+)(/[^\r\n]*)?
Now, my result looks like this:
Match "https://stackoverflow.com/"
Group 1: "stackoverflow.com"
Group 2: "/"
Match "https://stackoverflow.com/questions/tagged/regex"
Group 1: "stackoverflow.com"
Group 2: "/questions/tagged/regex"
See? The first group has not been captured. The parser uses it to match the text, but ignores it later, in the final result.
EDIT:
As requested, let me try to explain groups too.
Well, groups serve many purposes. They can help you to extract exact information from a bigger match (which can also be named), they let you rematch a previous matched group, and can be used for substitutions. Let's try some examples, shall we?
Ok, imagine you have some kind of XML or HTML (be aware that regex may not be the best tool for the job, but it is nice as an example). You want to parse the tags, so you could do something like this (I have added spaces to make it easier to understand):
\<(?<TAG>.+?)\> [^<]*? \</\k<TAG>\>
or
\<(.+?)\> [^<]*? \</\1\>
The first regex has a named group (TAG), while the second one uses a common group. Both regexes do the same thing: they use the value from the first group (the name of the tag) to match the closing tag. The difference is that the first one uses the name to match the value, and the second one uses the group index (which starts at 1).
Let's try some substitutions now. Consider the following text:
Lorem ipsum dolor sit amet consectetuer feugiat fames malesuada pretium egestas.
Now, let's use the this dumb regex over it:
\b(\S)(\S)(\S)(\S*)\b
This regex matches words with at least 3 characters, and uses groups to separate the first three letters. The result is this:
Match "Lorem"
Group 1: "L"
Group 2: "o"
Group 3: "r"
Group 4: "em"
Match "ipsum"
Group 1: "i"
Group 2: "p"
Group 3: "s"
Group 4: "um"
...
Match "consectetuer"
Group 1: "c"
Group 2: "o"
Group 3: "n"
Group 4: "sectetuer"
...
So, if we apply the substitution string...
$1_$3$2_$4
... over it, we are trying to use the first group, add an underscore, use the third group, then the second group, add another underscore, and then the fourth group. The resulting string would be like the one below.
L_ro_em i_sp_um d_lo_or s_ti_ a_em_t c_no_sectetuer f_ue_giat f_ma_es m_la_esuada p_er_tium e_eg_stas.
You can use named groups for substitutions too, using ${name}
.
To play around with regexes, I recommend http://regex101.com/, which offers a good amount of details on how the regex works; it also offers a few regex engines to choose from.
javascript reg 不加入分组的更多相关文章
- JavaScript正则表达式模式匹配(2)——分组模式匹配
var pattern=/google{4,8}$/; // {4,8}$表示匹配结尾4-8次 var str='googleeeeeeeee'; // 表示e的4-8次 alert(pattern. ...
- javascript正则表达式(一)
元字符 ( [ { \ ^ $ | ) ? * + . 预定义的特殊字符 字符 正则 描述 \t /\t/ 制表符 \n /\n/ 制表符 \r /\r/ 回车符 \f /\f/ 换页符 \a /\a ...
- 正则表达式(javascript)学习总结
正则表达式在jquery.linux等随处可见,已经无孔不入.因此有必要对这个工具认真的学习一番.本着认真.严谨的态度,这次总结我花了近一个月的时间.但本文无任何创新之处,属一般性学习总结. 一.思考 ...
- JS正则表达式---分组
JS正则表达式---分组 之前写了一篇关于正则新手入门的文章,本以为对正则表达式相对比较了解 但是今天我又遇到了一个坑,可能是自己不够细心的原因吧,今天就着重和大家分享一下javascript正则表达 ...
- javascript的正则表达式总结
网上正则表达式的教程够多了,但由于javascript的历史比较悠久,也比较古老,因此有许多特性是不支持的.我们先从最简单地说起,文章所演示的正则基本都是perl方式. 元字符 ( [ { \ ^ $ ...
- javascript:正则大全
:replace函数,为写自己的js模板做准备 待完善 function 1,声明&用法 //数组: var arr=[];//字面量 var arr=new Array();//构造函数 / ...
- JavaScript探秘系列
此文章所在专题列表如下: 我们应该如何去了解JavaScript引擎的工作原理 JavaScript探秘:编写可维护的代码的重要性 JavaScript探秘:谨慎使用全局变量 JavaScript探秘 ...
- 系列文章--JavaScript教程文章
JavaScript教程文章专题列表如下: 我们应该如何去了解JavaScript引擎的工作原理 JavaScript探秘:编写可维护的代码的重要性 JavaScript探秘:谨慎使用全局变量 Jav ...
- 温故知新 javascript 正则表达式
很长时间没看 正则表达式了,碰巧今天用到,温故知新了一把 看书学习吧 50% 的举一反三练习中的原创. 一 javascript正则表达式的基本知识 1 javascript 正则对象创建 ...
随机推荐
- Python:笔记(6)——正则表达式
Python:笔记(6)——正则表达式 re模块 re模块用于在字符串中执行基于正则表达式模式的匹配和替换. 使用原始字符串 正则表达式使用 \ 对特殊字符进行转义,比如,为了匹配字符串 ‘pytho ...
- iframe 跨域问题解决方案 利用window.name+iframe跨域获取数据详解
详解 前文提到用jsonp的方式来跨域获取数据,本文为大家介绍下如何利用window.name+iframe跨域获取数据. 首先我们要简单了解下window.name和iframe的相关知识.ifra ...
- SQL学习笔记八之ORM框架SQLAlchemy
阅读目录 一 介绍 二 创建表 三 增删改查 四 其他查询相关 五 正查.反查 一 介绍 SQLAlchemy是Python编程语言下的一款ORM框架,该框架建立在数据库API之上,使用关系对象映射进 ...
- bzoj1628 [Usaco2007 Demo]City skyline(单调栈)
Description Input 第一行给出N,W 第二行到第N+1行:每行给出二个整数x,y,输入的x严格递增,并且第一个x总是1 Output 输出一个整数,表示城市中最少包含的建筑物数量 Sa ...
- Qt无法调试Qvector
现象: 解决: 打开文件 $(VSDIR)\Common7\Packages\Debugger\autoexp.dat (VSDIR是本机Visual Studio的安装目录)把定义QVector和Q ...
- 根据Bootstrap的Modal开发的提示框
代码: (function ($) { $(function () { var Modal = function () { var htmlContent = "<div id=\&q ...
- 基于Oracle Sequence的流水号生成规则
流水号在各种系统中随处可见,一般都是使用自增.年月日时分秒+自增.UUID等,要么纯数字,要么纯字母,这种流水号缺乏一定的辨识度. 下面为大家介绍一种具有辨识度的流水号的生成方式:领域或者应用的标识 ...
- CF_863_F(Netflow)
codeforces_863_F 题目大意:给出一个数组的大小(n<=50),以及每个位置填数的范围限制(若无限制,即为1-n),最后求填出数组的最小花费,定义总花费为数组中每个数出现次数的平方 ...
- UVa 1662 Brackets Removal
https://vjudge.net/problem/UVA-1662 题意: 给出一个序列,判断序列中哪些括号是可以去掉的,只可以改变符号.输出括号最少的序列. 思路: 感觉这道题目就是写起来繁琐了 ...
- Pandas 练习题
1. 使用 pandas 中的函数,下载上证综指过去一段时间的数据,进行数据探索. 上证综指,全称是上海证券综合指数,是以上证所挂牌上市的全部股票为计算范围,以发行量为权数的加权综合股价指数.这一指数 ...