JavaScript regex: non-capturing groups

From: https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-what-does-a-question-mark-followed-by-a-colon

After reading some tutorials I still don't get it.

Could someone explain how ?: is used and what it's good for?

Let me try to explain this with an example.

Consider the following text:

https://stackoverflow.com/

https://stackoverflow.com/questions/tagged/regex

Now, if I apply the regex below over it...

(https?|ftp)://([^/\r\n]+)(/[^\r\n]*)?

... I would get the following result:

Match "https://stackoverflow.com/"

     Group 1: "https"

     Group 2: "stackoverflow.com"

     Group 3: "/"

Match "https://stackoverflow.com/questions/tagged/regex"

     Group 1: "https"

     Group 2: "stackoverflow.com"

     Group 3: "/questions/tagged/regex"

But I don't care about the protocol -- I just want the host and path of the URL. So, I change the regex to include the non-capturing group (?:).

(?:https?|ftp)://([^/\r\n]+)(/[^\r\n]*)?

Now, my result looks like this:

Match "https://stackoverflow.com/"

     Group 1: "stackoverflow.com"

     Group 2: "/"

Match "https://stackoverflow.com/questions/tagged/regex"

     Group 1: "stackoverflow.com"

     Group 2: "/questions/tagged/regex"

See? The first group has not been captured. The parser uses it to match the text, but ignores it later, in the final result.
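The comparison above can be reproduced in JavaScript. This is only a minimal sketch: the helper name `groupsOf` is made up for illustration, and the protocol alternative is assumed to be written as `https?` so that it matches the `https` URLs used in the sample text.

```javascript
// Sample URLs from the text above.
const urls = [
  "https://stackoverflow.com/",
  "https://stackoverflow.com/questions/tagged/regex",
];

// Assumption: "https?" is used so that the pattern matches https URLs.
// The slashes are escaped only because this is a JavaScript regex literal.
const capturing = /(https?|ftp):\/\/([^\/\r\n]+)(\/[^\r\n]*)?/;
const nonCapturing = /(?:https?|ftp):\/\/([^\/\r\n]+)(\/[^\r\n]*)?/;

// Hypothetical helper: return only the captured groups.
// Index 0 of a match array is the whole match, so it is skipped.
function groupsOf(regex, text) {
  const match = text.match(regex);
  return match === null ? null : match.slice(1);
}

for (const url of urls) {
  console.log(groupsOf(capturing, url));    // protocol, host, path
  console.log(groupsOf(nonCapturing, url)); // host, path
}
```

With the non-capturing version the host moves from index 2 to index 1 of the match array, which is worth remembering when changing an existing pattern.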

EDIT:

As requested, let me try to explain groups too.

Well, groups serve many purposes. They let you extract specific pieces of information from a bigger match (and those pieces can also be named), they let you rematch a previously matched group, and they can be used in substitutions. Let's try some examples, shall we?

Ok, imagine you have some kind of XML or HTML (be aware that regex may not be the best tool for the job, but it is nice as an example). You want to parse the tags, so you could do something like this (I have added spaces to make it easier to understand):

   \<(?<TAG>.+?)\> [^<]*? \</\k<TAG>\>

or

   \<(.+?)\> [^<]*? \</\1\>

The first regex has a named group (TAG), while the second one uses an ordinary numbered group. Both regexes do the same thing: they use the value from the first group (the name of the tag) to match the closing tag. The difference is that the first one refers to the group by name, and the second one refers to it by its index (which starts at 1).
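A small JavaScript sketch of the two backreference styles follows. It assumes an engine with named-group support (ES2018 or later), drops the illustrative spaces and the optional backslashes before `<` and `>`, and uses two made-up sample strings.

```javascript
// Named backreference: \k<TAG> must repeat whatever the TAG group matched.
const named = /<(?<TAG>.+?)>[^<]*?<\/\k<TAG>>/;
// Numbered backreference: \1 must repeat whatever group 1 matched.
const numbered = /<(.+?)>[^<]*?<\/\1>/;

// Hypothetical sample inputs.
const balanced = "<b>bold</b>";
const mismatched = "<b>bold</i>";

console.log(named.exec(balanced).groups.TAG); // tag name via the group name
console.log(numbered.exec(balanced)[1]);      // tag name via the group index
console.log(named.test(mismatched), numbered.test(mismatched));
```

Both patterns accept the balanced string and reject the mismatched one, because the closing tag has to repeat the text captured by the opening tag.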

Let's try some substitutions now. Consider the following text:

Lorem ipsum dolor sit amet consectetuer feugiat fames malesuada pretium egestas.

Now, let's use this dumb regex over it:

\b(\S)(\S)(\S)(\S*)\b

This regex matches words with at least 3 characters, and uses groups to separate the first three letters. The result is this:

Match "Lorem"

     Group 1: "L"

     Group 2: "o"

     Group 3: "r"

     Group 4: "em"

Match "ipsum"

     Group 1: "i"

     Group 2: "p"

     Group 3: "s"

     Group 4: "um"

...

Match "consectetuer"

     Group 1: "c"

     Group 2: "o"

     Group 3: "n"

     Group 4: "sectetuer"

...

So, if we apply the substitution string...

$1_$3$2_$4

... over it, we are trying to use the first group, add an underscore, use the third group, then the second group, add another underscore, and then the fourth group. The resulting string would be like the one below.

L_ro_em i_sp_um d_lo_or s_ti_ a_em_t c_no_sectetuer f_ue_giat f_ma_es m_la_esuada p_er_tium e_eg_stas.
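In JavaScript, this substitution can be sketched with `String.prototype.replace`. The `g` flag is an assumption needed to replace every word rather than only the first one; the variable names are illustrative.

```javascript
const text =
  "Lorem ipsum dolor sit amet consectetuer feugiat fames malesuada pretium egestas.";

// $1..$4 in the replacement string refer to the four capturing groups.
const swapped = text.replace(/\b(\S)(\S)(\S)(\S*)\b/g, "$1_$3$2_$4");

console.log(swapped);
// L_ro_em i_sp_um d_lo_or s_ti_ a_em_t c_no_sectetuer f_ue_giat f_ma_es m_la_esuada p_er_tium e_eg_stas.
```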

You can use named groups for substitutions too, using ${name}. The exact syntax depends on the regex engine; JavaScript, for instance, writes it as $<name>.
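In JavaScript specifically, a named group is referenced in the replacement string as `$<name>` rather than `${name}`. The sketch below redoes the same substitution with named groups; the group names `first`, `second`, `third` and `rest` are made up for illustration, and an ES2018 or later engine is assumed.

```javascript
const sentence =
  "Lorem ipsum dolor sit amet consectetuer feugiat fames malesuada pretium egestas.";

// Same pattern as before, but each group carries a (hypothetical) name.
const namedPattern =
  /\b(?<first>\S)(?<second>\S)(?<third>\S)(?<rest>\S*)\b/g;

// $<name> inserts the text matched by the group with that name.
const byName = sentence.replace(namedPattern, "$<first>_$<third>$<second>_$<rest>");

// Numbered references still work on the same pattern.
const byIndex = sentence.replace(namedPattern, "$1_$3$2_$4");

console.log(byName === byIndex); // both styles give the same string
```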

To play around with regexes, I recommend http://regex101.com/, which offers a good amount of details on how the regex works; it also offers a few regex engines to choose from.
