Regular Expressions in Grep Command with 10 Examples
Regular expressions are used to search and manipulate text based on patterns. Most Linux commands and programming languages support regular expressions.
The grep command is used to search for a specific string in a file. Please refer to our earlier article for 15 practical grep command examples.
You can also use regular expressions with the grep command when you want to search for text that matches a particular pattern. Grep tests the pattern against each line of the file, which simplifies many search operations.
This article is part of a two-article series.
This part 1 article covers grep examples for simple regular expressions. The future part 2 article will cover advanced regular expression examples in grep.
Let us use the /var/log/messages log files in our examples.
Example 1. Beginning of line ( ^ )
In the grep command, the caret symbol ^ anchors the expression to the start of a line. The following example displays all the lines which start with Nov 10, i.e. all the messages logged on November 10.
$ grep "^Nov 10" messages.1
Nov 10 01:12:55 gs123 ntpd[2241]: time reset +0.177479 s
Nov 10 01:17:17 gs123 ntpd[2241]: synchronized to LOCAL(0), stratum 10
Nov 10 01:18:49 gs123 ntpd[2241]: synchronized to, stratum 3
Nov 10 13:21:26 gs123 ntpd[2241]: time reset +0.146664 s
Nov 10 13:25:46 gs123 ntpd[2241]: synchronized to LOCAL(0), stratum 10
Nov 10 13:26:27 gs123 ntpd[2241]: synchronized to, stratum 3
The ^ anchors the expression to the beginning of a line only if it is the first character in the regular expression. For example, ^N matches lines beginning with N.
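The role of the caret's position can be checked with a small sketch. The sample lines below are made up for illustration (they are not taken from a real /var/log/messages), and the behavior of a caret in the middle of a pattern is shown for grep's default basic regular expression mode.

```shell
# Build a small sample log (hypothetical content for illustration only).
log=$(mktemp)
printf '%s\n' \
  'Nov 10 01:12:55 host ntpd[1]: time reset' \
  'Nov 11 02:00:00 host ntpd[1]: time reset' \
  'see Nov 10 entry; 2^Nov is literal' > "$log"

# ^ as the first pattern character anchors the match to the start of the line.
anchored=$(grep -c '^Nov 10' "$log")

# Without the anchor, "Nov 10" may appear anywhere on the line.
unanchored=$(grep -c 'Nov 10' "$log")

# A ^ that is not the first character is an ordinary character in a basic regex.
literal=$(grep -c '2^Nov' "$log")

echo "anchored=$anchored unanchored=$unanchored literal=$literal"
rm -f "$log"
```

Here only the first line is counted by the anchored pattern, while the unanchored pattern also counts the third line.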
Example 2. End of the line ( $ )
The $ character anchors the expression to the end of a line. The following command will help you to get all the lines which end with the word “terminating” followed by one more character (here the closing period, which the unescaped dot matches).
$ grep "terminating.$" messages
Jul 12 17:01:09 cloneme kernel: Kernel log daemon terminating.
Oct 28 06:29:54 cloneme kernel: Kernel log daemon terminating.
From the above output you can see when the kernel log daemon was terminated. Just as ^ matches the beginning of the line only if it is the first character, $ matches the end of the line only if it is the last character in a regular expression.
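Because the dot before $ matches any character, the pattern above is slightly looser than it looks. A minimal sketch with invented sample lines shows the difference between the unescaped and the escaped dot at the end of the line.

```shell
# Hypothetical sample lines; only the first ends with a literal period.
log=$(mktemp)
printf '%s\n' \
  'kernel: Kernel log daemon terminating.' \
  'kernel: Kernel log daemon terminating!' \
  'daemon terminating. restart scheduled' > "$log"

# Unescaped dot: any single character may close the line.
any_char=$(grep -c 'terminating.$' "$log")

# Escaped dot: the line must end with a literal period.
literal_dot=$(grep -c 'terminating\.$' "$log")

echo "any_char=$any_char literal_dot=$literal_dot"
rm -f "$log"
```

The third line is never counted, because “terminating.” is not at the end of that line.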
Example 3. Count of empty lines ( ^$ )
Using the ^ and $ characters together, you can find the empty lines in a file. “^$” matches an empty line.
$ grep -c "^$" messages anaconda.log
The above command displays the count of empty lines in the messages and anaconda.log files, with one count per file.
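Note that “^$” only matches lines with no characters at all. The following sketch, using a throwaway file created for the purpose, contrasts it with a pattern that also counts lines containing only whitespace. The [[:space:]] class is a POSIX character class and is an addition to what the article covers.

```shell
# Five lines: "first", an empty line, a line of three spaces, "last", an empty line.
f=$(mktemp)
printf 'first\n\n   \nlast\n\n' > "$f"

# Truly empty lines only.
empty=$(grep -c '^$' "$f")

# Empty lines plus lines made up of whitespace only.
blank=$(grep -c '^[[:space:]]*$' "$f")

echo "empty=$empty blank=$blank"
rm -f "$f"
```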
Example 4. Single Character (.)
The special meta-character “.” (dot) matches any single character except the end-of-line character. Let us take an input file with the following content.
$ cat input
1. first line
2. hi hello
3. hi zello how are you
4. cello
5. aello
6. eello
7. last line
Now let us search for a word which has any single character followed by ello, i.e. hello, cello, etc.
$ grep ".ello" input
2. hi hello
3. hi zello how are you
4. cello
5. aello
6. eello
If you want to search for a word which has exactly 4 characters, you can use grep -w "...." where each dot represents any single character.
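Both uses of the dot can be tried on small files. The file contents below are invented for the sketch. The -w option restricts matches to whole words, so a four-character stretch inside a longer word does not count.

```shell
# Sample words for the ".ello" pattern (hypothetical content).
f=$(mktemp)
printf '%s\n' 'hello' 'ello' 'cello there' > "$f"

# "ello" alone is not printed: the dot needs one character in front of it.
dot=$(grep '.ello' "$f")

# Sample words for the whole-word, four-character search.
g=$(mktemp)
printf '%s\n' 'tree' 'trees' 'a tree' 'elm' > "$g"

# Prints lines that contain a four-character match bounded like a whole word.
four=$(grep -w '....' "$g")

printf '%s\n---\n%s\n' "$dot" "$four"
rm -f "$f" "$g"
```

Keep in mind that the dot also matches spaces and punctuation, so on real text "...." with -w can match more than four-letter words.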
Example 5. Zero or more occurrence (*)
The special character “*” matches zero or more occurrences of the previous character. For example, the pattern '1*' matches zero or more '1' characters.
The following example searches for the pattern “kernel: *.”, i.e. kernel: followed by zero or more space characters and then any single character.
$ grep "kernel: *." *
messages.4:Jul 12 17:01:02 cloneme kernel: ACPI: PCI interrupt for device 0000:00:11.0 disabled
messages.4:Oct 28 06:29:49 cloneme kernel: ACPI: PM-Timer IO Port: 0x1008
messages.4:Oct 28 06:31:06 btovm871 kernel: sda: sda1 sda2 sda3
messages.4:Oct 28 06:31:06 btovm871 kernel: sd 0:0:0:0: Attached scsi disk sda
In the above example, the pattern matches kernel and the colon symbol, followed by any number of spaces (including none), and the final “.” matches any single character.
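The “zero or more” part is easy to overlook. In the sketch below (sample strings are made up), the pattern ab*c matches even when there is no b at all, and a pattern consisting only of a starred character matches every line because zero occurrences are always available.

```shell
f=$(mktemp)
printf '%s\n' 'ac' 'abc' 'abbbc' 'adc' > "$f"

# b* accepts zero, one or many b characters between a and c.
star=$(grep 'ab*c' "$f")

# A lone b* can match the empty string, so every line is selected.
all=$(grep -c 'b*' "$f")

printf '%s\nall=%s\n' "$star" "$all"
rm -f "$f"
```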
Example 6. One or more occurrence (\+)
The special character “\+” matches one or more occurrences of the previous character. “ \+” matches one or more space characters.
If there is no space then it will not match. The “+” operator belongs to extended regular expressions, so in grep's default (basic) mode you have to escape it as “\+”. With grep -E it is written without the backslash.
$ cat input
hi hello
hi    hello how are you
hihello
$ grep "hi \+hello" input
hi hello
hi    hello how are you
In the above example, the grep pattern matches ‘hi’, followed by one or more space characters, followed by “hello”.
If there is no space between hi and hello, it won't match. However, the * character matches zero or more occurrences.
“hihello” will be matched by * as shown below.
$ grep "hi *hello" input
hi hello
hi    hello how are you
hihello
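The escaped and unescaped forms of the plus sign can be compared side by side. This is a sketch assuming GNU grep, where “\+” is available in basic mode and -E switches to extended regular expressions. The input file is recreated in a temporary location so the snippet is self-contained.

```shell
f=$(mktemp)
printf '%s\n' 'hi hello' 'hi    hello how are you' 'hihello' > "$f"

# Basic mode: the plus sign must be escaped to act as "one or more".
bre=$(grep 'hi \+hello' "$f")

# Extended mode (-E): the plus sign is an operator without the backslash.
ere=$(grep -E 'hi +hello' "$f")

# Basic mode with an unescaped plus: it is a literal "+", so nothing matches.
# grep exits non-zero when no line matches, hence the "|| true".
lit=$(grep -c 'hi +hello' "$f" || true)

printf '%s\n---\n%s\nliteral matches=%s\n' "$bre" "$ere" "$lit"
rm -f "$f"
```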
Example 7. Zero or one occurrence (\?)
The special character “\?” matches zero or one occurrence of the previous character. “0\?” matches a single zero or nothing.
$ grep "hi \?hello" input
hi hello
hihello
“hi \?hello” matches hi and hello with a single space (hi hello) and with no space (hihello).
The line which has more than one space between hi and hello is not matched by the above command.
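The same operator is handy for optional letters. A small sketch with made-up words, again assuming GNU grep for the “\?” form in basic mode:

```shell
f=$(mktemp)
printf '%s\n' 'color' 'colour' 'colouur' > "$f"

# Basic mode: u\? allows zero or one "u".
bre=$(grep 'colou\?r' "$f")

# Extended mode (-E): the same pattern without the backslash.
ere=$(grep -E 'colou?r' "$f")

printf '%s\n---\n%s\n' "$bre" "$ere"
rm -f "$f"
```

The word with two u characters is not printed, because at most one occurrence is allowed.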
Example 8. Escaping the special character (\)
If you want to search for special characters (for example: *, dot) in the content, you have to escape them in the regular expression.
$ grep "127\.0\.0\.1" /var/log/messages.4
Oct 28 06:31:10 btovm871 ntpd[2241]: Listening on interface lo, Enabled
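To see why the escaping matters, the sketch below runs the pattern with and without backslashes on two invented lines. It also shows the -F option, which treats the whole pattern as a fixed string and is an alternative when no regular expression features are needed.

```shell
f=$(mktemp)
printf '%s\n' 'listening on 127.0.0.1' 'build 127a0b0c1 ok' > "$f"

# Unescaped dots match any character, so the second line is counted too.
loose=$(grep -c '127.0.0.1' "$f")

# Escaped dots match only literal periods.
strict=$(grep -c '127\.0\.0\.1' "$f")

# -F disables regular expressions, so no escaping is required.
fixed=$(grep -cF '127.0.0.1' "$f")

echo "loose=$loose strict=$strict fixed=$fixed"
rm -f "$f"
```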
Example 9. Character Class ([0-9])
A character class is a list of characters written within square brackets. It matches exactly one character out of that list.
$ grep -B 1 "[0123456789]\+ times" /var/log/messages.4
Oct 28 06:38:35 btovm871 init: open(/dev/pts/0): No such file or directory
Oct 28 06:38:35 btovm871 last message repeated 2 times
Oct 28 06:38:38 btovm871 pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Oct 28 06:38:38 btovm871 last message repeated 3 times
Repeated messages are logged in the messages logfile as “last message repeated n times”. The above example searches for lines which have a number (digits 0 to 9) followed by the word “times”. For each match, it displays the line before the matched line as well as the matched line itself.
Within the square brackets, you can use a hyphen to specify a range of characters. For example, [0123456789] can be written as [0-9]. Alphabetic ranges can be specified the same way, such as [a-z], [A-Z], etc. So the above command can also be written as
$ grep -B 1 "[0-9]\+ times" /var/log/messages.4
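The range form can be tried on a small, invented log. The sketch also uses the POSIX class [[:digit:]], which is not covered in the article but selects the same lines here.

```shell
f=$(mktemp)
printf '%s\n' \
  'init: open failed' \
  'last message repeated 2 times' \
  'pcscd: reader not found' \
  'last message repeated many times' > "$f"

# -B 1 prints one line of leading context before each matching line.
out=$(grep -B 1 '[0-9]\+ times' "$f")

# The same search written with a named character class.
digits=$(grep -c '[[:digit:]]\+ times' "$f")

printf '%s\ndigits=%s\n' "$out" "$digits"
rm -f "$f"
```

The last line is skipped because “many” contains no digit.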
Example 10. Exception in the character class
If you want to match any character except those in the square brackets, use the ^ (caret) symbol as the first character after the opening square bracket. The following example searches the Linux dictionary word file for lines which do not start with a vowel.
$ grep -i "^[^aeiou]" /usr/share/dict/linux.words
The first caret symbol in the regular expression represents the beginning of the line. However, the caret symbol inside the square brackets means “except”, i.e. match any character except those listed in the square brackets.
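A negated class still has to match one character, which makes it different from inverting the whole search with -v. The word list below is a stand-in for the dictionary file, invented for this sketch.

```shell
f=$(mktemp)
printf '%s\n' 'apple' 'Banana' 'Orange' 'cherry' '' > "$f"

# Lines whose first character exists and is not a vowel (-i ignores case).
out=$(grep -i '^[^aeiou]' "$f")

# Inverted search: counts every line that does not start with a vowel,
# including the empty line, which has no first character at all.
inv=$(grep -vic '^[aeiou]' "$f")

printf '%s\ninverted count=%s\n' "$out" "$inv"
rm -f "$f"
```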
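Character | Description |
\ | Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, “n” matches the character “n”. “\n” matches a newline character. The sequence “\\” matches “\” and “\(” matches “(”. |
^ | Matches the position at the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after “\n” or “\r”. |
$ | Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before “\n” or “\r”. |
* | Matches the preceding character or subexpression zero or more times. For example, zo* matches “z” and “zoo”. * is equivalent to {0,}. |
+ | Matches the preceding character or subexpression one or more times. For example, “zo+” matches “zo” and “zoo”, but not “z”. + is equivalent to {1,}. |
? | Matches the preceding character or subexpression zero or one time. For example, “do(es)?” matches “do” or “does”. ? is equivalent to {0,1}. |
{n} | n is a nonnegative integer. Matches exactly n times. For example, “o{2}” does not match the “o” in “Bob”, but matches the two “o” characters in “food”. |
{n,} | n is a nonnegative integer. Matches at least n times. For example, “o{2,}” does not match the “o” in “Bob”, but matches all the o characters in “foooood”. “o{1,}” is equivalent to “o+”. “o{0,}” is equivalent to “o*”. |
{n,m} | m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. For example, “o{1,3}” matches the first three o characters in “fooooood”. 'o{0,1}' is equivalent to 'o?'. Note: you cannot put a space between the comma and the numbers. |
? | When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is “non-greedy”. A non-greedy pattern matches as little of the searched string as possible, whereas the default “greedy” pattern matches as much of it as possible. For example, in the string “oooo”, “o+?” matches a single “o”, while “o+” matches all the “o” characters. |
. | Matches any single character except “\n”. To match any character including “\n”, use a pattern such as “[\s\S]”. |
(pattern) | Matches pattern and captures the matched subexpression. The captured matches can be retrieved from the resulting Matches collection using the $0…$9 properties. To match the parenthesis characters ( ), use “\(” or “\)”. |
(?:pattern) | Matches pattern but does not capture the matched subexpression, i.e. it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the “or” character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'. |
(?=pattern) | A subexpression that performs a positive lookahead search, which matches the string at any point where a string matching pattern begins. It is a non-capturing match, i.e. the match is not captured for later use. For example, 'Windows (?=95|98|NT|2000)' matches the “Windows” in “Windows 2000”, but not the “Windows” in “Windows 3.1”. Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead. |
(?!pattern) | A subexpression that performs a negative lookahead search, which matches the search string at any point where a string not matching pattern begins. It is a non-capturing match, i.e. the match is not captured for later use. For example, 'Windows (?!95|98|NT|2000)' matches the “Windows” in “Windows 3.1”, but not the “Windows” in “Windows 2000”. Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead. |
x|y | Matches x or y. For example, 'z|food' matches “z” or “food”. '(z|f)ood' matches “zood” or “food”. |
[xyz] | A character set. Matches any one of the enclosed characters. For example, “[abc]” matches the “a” in “plain”. |
[^xyz] | A negated character set. Matches any character that is not enclosed. For example, “[^abc]” matches the “p” in “plain”. |
[a-z] | A character range. Matches any character in the specified range. For example, “[a-z]” matches any lowercase letter in the range “a” to “z”. |
[^a-z] | A negated character range. Matches any character that is not in the specified range. For example, “[^a-z]” matches any character that is not in the range “a” to “z”. |
\b | Matches a word boundary, i.e. the position between a word and a space. For example, “er\b” matches the “er” in “never”, but not the “er” in “verb”. |
\B | Matches a non-word boundary. “er\B” matches the “er” in “verb”, but not the “er” in “never”. |
\cx | Matches the control character indicated by x. For example, \cM matches Control-M or a carriage return. The value of x must be in the range A-Z or a-z. If it is not, c is assumed to be the literal “c” character. |
\d | Matches a digit character. Equivalent to [0-9]. |
\D | Matches a non-digit character. Equivalent to [^0-9]. |
\f | Matches a form-feed character. Equivalent to \x0c and \cL. |
\n | Matches a newline character. Equivalent to \x0a and \cJ. |
\r | Matches a carriage return. Equivalent to \x0d and \cM. |
\s | Matches any whitespace character, including space, tab, form feed, etc. Equivalent to [ \f\n\r\t\v]. |
\S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
\t | Matches a tab character. Equivalent to \x09 and \cI. |
\v | Matches a vertical tab character. Equivalent to \x0b and \cK. |
\w | Matches any word character, including the underscore. Equivalent to “[A-Za-z0-9_]”. |
\W | Matches any non-word character. Equivalent to “[^A-Za-z0-9_]”. |
\xn | Matches n, where n is a hexadecimal escape code. Hexadecimal escape codes must be exactly two digits long. For example, “\x41” matches “A”. “\x041” is equivalent to “\x04” followed by “1”. This allows ASCII codes to be used in regular expressions. |
\num | Matches num, where num is a positive integer. It is a backreference to a captured match. For example, “(.)\1” matches two consecutive identical characters. |
\n | Identifies an octal escape code or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape code. |
\nm | Identifies an octal escape code or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the character m. If neither of these cases applies, \nm matches the octal value nm, where n and m are octal digits (0-7). |
\nml | When n is an octal digit (0-3) and m and l are octal digits (0-7), matches the octal escape code nml. |
\un | Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©). |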
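Some of the constructs in this table, such as {n}, (pattern) and x|y, are also available in grep through extended regular expressions (grep -E). The sketch below tries two of the table's examples on a few sample words. Others, such as lookaheads, \d and non-greedy quantifiers, are not part of basic or extended syntax. GNU grep offers the -P option for Perl-compatible patterns, but whether it is available depends on how grep was built, so it is not used here.

```shell
f=$(mktemp)
printf '%s\n' 'Bob' 'food' 'zood' 'good' > "$f"

# o{2}: exactly two consecutive "o" characters must be present.
twice=$(grep -E 'o{2}' "$f")

# (z|f)ood anchored at both ends: only "zood" and "food" qualify.
alt=$(grep -E '^(z|f)ood$' "$f")

printf '%s\n---\n%s\n' "$twice" "$alt"
rm -f "$f"
```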