Difference between [0-9], [[:digit:]] and \d
Yes, it is [[:digit:]]
~ [0-9]
~ \d
(where ~ means aproximate).
In most programming languages (where it is supported) \d
≡ [[:digit:]]
(identical).
The \d
is less common than [[:digit:]]
(not in POSIX but it is in GNU grep -P
).
There are many digits in UNICODE, for example:
123456789 # Hindu-Arabic
Arabic numerals٠١٢٣٤٥٦٧٨٩ # ARABIC-INDIC
۰۱۲۳۴۵۶۷۸۹ # EXTENDED ARABIC-INDIC/PERSIAN
߀߁߂߃߄߅߆߇߈߉ # NKO DIGIT
०१२३४५६७८९ # DEVANAGARI
All of which may be included in [[:digit:]]
or \d
.
Instead, [0-9]
is generally only the ASCII digits 0123456789
.
There are many languages: Perl, Java, Python, C. In which [[:digit:]]
(and \d
) calls for an extended meaning. For example, this perl code will match all the digits from above:
$ a='0123456789 ٠١٢٣٤٥٦٧٨٩ ۰۱۲۳۴۵۶۷۸۹ ߀߁߂߃߄߅߆߇߈߉ ०१२३४५६७८९'
$ echo "$a" | perl -C -pe 's/[^\d]//g;' ; echo
0123456789٠١٢٣٤٥٦٧٨٩۰۱۲۳۴۵۶۷۸۹߀߁߂߃߄߅߆߇߈߉०१२३४५६७८९
Which is equivalent to select all characters that have the Unicode properties of Numeric
and digits
:
$ echo "$a" | perl -C -pe 's/[^\p{Nd}]//g;' ; echo
0123456789٠١٢٣٤٥٦٧٨٩۰۱۲۳۴۵۶۷۸۹߀߁߂߃߄߅߆߇߈߉०१२३४५६७८९
Which grep could reproduce (the specific version of pcre may have a diferent internal list of numeric code points than Perl):
$ echo "$a" | grep -oP '\p{Nd}+'
0123456789
٠١٢٣٤٥٦٧٨٩
۰۱۲۳۴۵۶۷۸۹
߀߁߂߃߄߅߆߇߈߉
०१२३४५६७८९
Change it to [0-9] to see:
$ echo "$a" | grep -o '[0-9]\+'
0123456789
POSIX
For the specific POSIX BRE or ERE:
The \d
is not supported (not in POSIX but is in GNU grep -P
). [[:digit:]]
is required by POSIX to correspond to the digit character class, which in turn is required by ISO C to be the characters 0 through 9 and nothing else. So only in C locale all [0-9]
, [0123456789]
, \d
and [[:digit:]]
mean exactly the same. The [0123456789]
has no possible misinterpretations, [[:digit:]]
is available in more utilities and it is common to mean only [0123456789]
. The \d
is supported by few utilities.
As for [0-9]
, the meaning of range expressions is only defined by POSIX in the C locale; in other locales it might be different (might be codepoint order or collation order or something else).
Difference between [0-9], [[:digit:]] and \d的更多相关文章
- 好用到没朋友的大数模板(c++) 2014-10-01 15:06 116人阅读 评论(0) 收藏
#include <iostream> #include <cstring> using namespace std; #define DIGIT 4 //四位隔开,即万进制 ...
- [转]Wing IDE 6.0 安装及算号器注册机代码
下载安装wing 选择第三个,运行算号器,输入license id 输入request id. Python 2 算号器注册机代码 import string import random import ...
- javascript:jQuery tablesorter 2.0
https://mottie.github.io/tablesorter/docs/index.html 1.GridView <%@ Page Language="C#" ...
- WingIDE6.0神秘代码
python2: import string import random import sha BASE16 = '0123456789ABCDEF' BASE30 = '123456789ABCDE ...
- Wing IDE 6.0 算号器注册机代码
我开发Python时喜欢用Wing IDE, 然后最近发现Wing IDE升级到6.0版本了, 但是之前能在5.1上用的算号器代码不能用在6.0上了, 所以就上网搜搜是否有相关算号器, 果然, 找到了 ...
- WingIDE 5.0注冊机
在wingIDE下开发python很方便,但IDE不是免费的,网上有破解的方法.请支持正版. 把下列文件CalcActivationCode.py载入到wingIDE中.LicenseID能够随便给一 ...
- 0.0.0.0 IPAddress.Any 【】127.0.0.1 IPAddress.Loopback 【】localhost
0.0.0.0 IPAddress.Any https://msdn.microsoft.com/en-us/library/system.net.ipaddress.any(v=vs.110).a ...
- 经典面试编程题--atoi()函数的实现(就是模拟手算,核心代码就一句total = 10 * total + (c - '0'); 但是要注意正负号、溢出等问题)
一.功能简介 把一个字符串转换成整数 二.linux c库函数实现 /*** *long atol(char *nptr) - Convert string to long * *Purpose: * ...
- 51 Nod1042 数字0到9的数量
1042 数字0-9的数量 基准时间限制:1 秒 空间限制:131072 KB 分值: 10 难度:2级算法题 收藏 关注 给出一段区间a-b,统计这个区间内0-9出现的次数. 比如 10-19 ...
- c语言NULL和0区别及NULL详解
先看下面一段代码输出什么: #include<stdo.h> int main() { int *p=NULL; printf("%s",p); } 输出<n ...
随机推荐
- 2014end
人.事.物. 人一年了,从十六班到十六班,从j101到j101再到A207. 她:结婚,然后走了.就是这样,干脆得我都来不及留恋.是的,再也听不到她那很温柔语气,看不到她偶尔激动时就踮起脚尖.还记得晚 ...
- hdu4418 Time travel 【期望dp + 高斯消元】
题目链接 BZOJ4418 题解 题意:从一个序列上某一点开始沿一个方向走,走到头返回,每次走的步长各有概率,问走到一点的期望步数,或者无解 我们先将序列倍长形成循环序列,\(n = (N - 1) ...
- CF762E Radio Stations
题目戳这里. 我还以为是KDtree呢,但是KDtree应该也可以做吧. 这是一道数据结构好题.考虑到由于\(K \le 10\),所以我们用两个大vector--\(Left,Right\),\(L ...
- async的用法
package com.example.administrator.myapplication; import android.os.AsyncTask; import android.util.Lo ...
- Python之日志操作(logging)
import logging 1.自定义日志级别,日志格式,输出位置 logging.basicConfig( level=logging.DEBUG, format='%(asctime)s | ...
- 移动端list布局,左边固定,右边自适应
<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8&quo ...
- 利用java.lang.reflect.Constructor动态实例化对象
} } } } } Student t = co ...
- HDU1099---数学 | 思维
hdu 1099 Lottery题意:1~n编号的彩票,要买全,等概率条件下平均要买几张.已经买了m张时,买中剩下的概率为1-m/n,则要买的张数为1/(1-m/n)n=2,s=1+1/(1-1/2) ...
- finally return 执行顺序问题
网上有很多人探讨Java中异常捕获机制try...catch...finally块中的finally语句是不是一定会被执行?很多人都说不是,当然他们的回答是正确的,经过我试验,至少有两种情况下fina ...
- [bzoj3697]采药人的路径——点分治
Brief Description 采药人的药田是一个树状结构,每条路径上都种植着同种药材. 采药人以自己对药材独到的见解,对每种药材进行了分类.大致分为两类,一种是阴性的,一种是阳性的. 采药人每天 ...