00011 - The secret behind find's -print0 and xargs's -0
By default, find follows every file name it prints with a newline ('\n'), which is why its output always appears one name per line:
[bash-4.1.5] ; ls -l
total 0
-rw-r--r-- 1 root root 0 2010-08-02 18:09 file1.log
-rw-r--r-- 1 root root 0 2010-08-02 18:09 file2.log
[bash-4.1.5] ; find -name '*.log'
./file1.log
./file2.log
Say I want to delete all the .log files. I can combine find with xargs like this:
[bash-4.1.5] ; find -name '*.log'
./file1.log
./file2.log
[bash-4.1.5] ; find -name '*.log' | xargs rm
[bash-4.1.5] ; find -name '*.log'
Nice, find + xargs really is powerful. However:
[bash-4.1.5] ; ls -l
total 0
-rw-r--r-- 1 root root 0 2010-08-02 18:12 file 1.log
-rw-r--r-- 1 root root 0 2010-08-02 18:12 file 2.log
[bash-4.1.5] ; find -name '*.log'
./file 1.log
./file 2.log
[bash-4.1.5] ; find -name '*.log' | xargs rm
rm: cannot remove `./file': No such file or directory
rm: cannot remove `1.log': No such file or directory
rm: cannot remove `./file': No such file or directory
rm: cannot remove `2.log': No such file or directory
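The reason is simple: by default xargs splits its input into records on whitespace (space, TAB, newline), so the file name ./file 1.log is read as two records, ./file and 1.log. Unfortunately rm cannot find either of them.
To solve this kind of problem, someone clever came up with a scheme: have find emit a NUL character ('\0') instead of a newline after each file name, and tell xargs to use NUL as the record separator as well. That is presumably the origin of find's -print0 and xargs's -0.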
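The splitting can be observed directly without deleting anything. The following is a minimal sketch: the `sh -c 'echo $#' _` helper is only a demonstration device (not part of the original session) that prints how many arguments xargs handed over.

```shell
#!/bin/sh
# One input line that contains a space, exactly like "./file 1.log" above.
# Count how many arguments the command started by xargs receives.
count=$(printf '%s\n' './file 1.log' | xargs sh -c 'echo $#' _)

# Show the individual records, one per line (-n 1 = one argument per command).
pieces=$(printf '%s\n' './file 1.log' | xargs -n 1 echo)

printf 'arguments received: %s\n' "$count"
printf '%s\n' "$pieces"
```

A single file name arrives as two separate arguments, which is why rm complains about ./file and 1.log.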
[bash-4.1.5] ; ls -l
total 0
-rw-r--r-- 1 root root 0 2010-08-02 18:12 file 1.log
-rw-r--r-- 1 root root 0 2010-08-02 18:12 file 2.log
[bash-4.1.5] ; find -name '*.log' -print0 | hd
0 1 2 3 4 5 6 7 8 9 A B C D E F |0123456789ABCDEF|
00000000: 2e 2f 66 69 6c 65 20 31 2e 6c 6f 67 00 2e 2f 66 |./file 1.log../f|
00000010: 69 6c 65 20 32 2e 6c 6f 67 00 |ile 2.log. |
[bash-4.1.5] ; find -name '*.log' -print0 | xargs -0 rm
[bash-4.1.5] ; find -name '*.log'
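You may ask why '\0' was chosen as the separator rather than some other character. This is easy to understand too: C and the Unix system call interface treat '\0' as the end of a string, so a path name can never contain a '\0' character, whereas it may contain any other byte, including spaces and newlines.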
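To make the point concrete, here is a small sketch with a file whose name contains a newline. The temporary directory and the file name are assumptions made up for the demonstration.

```shell
#!/bin/sh
dir=$(mktemp -d)

# A single file whose name contains a newline (legal on Unix file systems).
touch "$dir/file
1.log"

# Newline-terminated output: the one name is spread over two lines,
# so a line-oriented reader would see two bogus records.
lines=$(find "$dir" -name '*.log' -print | wc -l | tr -d ' ')

# NUL-terminated output: exactly one terminator per file name.
nuls=$(find "$dir" -name '*.log' -print0 | tr -cd '\000' | wc -c | tr -d ' ')

printf 'lines: %s, NUL terminators: %s\n' "$lines" "$nuls"
rm -rf "$dir"
```

Only the NUL count matches the real number of files, because NUL is the one byte that cannot occur inside the name itself.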
find /usr/local/backups -name "*.html" -mtime +10 -print0 |xargs -0 rm -rfv
find /usr/local/backups -mtime +10 -name "*.html" -exec rm -rf {} \;
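The two commands above differ mainly in how often the command is started. The sketch below counts the invocations for three made-up files in a temporary directory. It also shows the standard `-exec ... {} +` form, which is not used in the original commands but batches file names in a similar way to xargs.

```shell
#!/bin/sh
dir=$(mktemp -d)
touch "$dir/a 1.html" "$dir/b 2.html" "$dir/c 3.html"

# Each started command prints one line, so counting lines counts invocations.

# -exec ... \; starts the command once per file.
per_file=$(find "$dir" -name '*.html' -exec sh -c 'echo run' _ {} \; | wc -l | tr -d ' ')

# -print0 | xargs -0 packs the names into as few command lines as possible.
batched=$(find "$dir" -name '*.html' -print0 | xargs -0 sh -c 'echo run' _ | wc -l | tr -d ' ')

# -exec ... {} + batches as well, without a pipe.
plus=$(find "$dir" -name '*.html' -exec sh -c 'echo run' _ {} + | wc -l | tr -d ' ')

printf 'per file: %s, xargs -0: %s, -exec +: %s\n' "$per_file" "$batched" "$plus"
rm -rf "$dir"
```

All three variants handle the spaces in the names correctly. With many files the batched forms start far fewer processes.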
The difference between find -print and -print0:
-print appends a newline ('\n') after each file name, while -print0 appends a NUL character ('\0') instead.
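A quick way to check the terminator is to look at the last byte each action writes. This is a sketch that assumes a throwaway directory with one sample file. `od -t o1` prints the byte as a three-digit octal number.

```shell
#!/bin/sh
dir=$(mktemp -d)
touch "$dir/file 1.log"

# Last byte of the -print output, as octal (012 is newline).
print_end=$(find "$dir" -name '*.log' -print | tail -c 1 | od -An -t o1 | tr -d ' \n')

# Last byte of the -print0 output, as octal (000 is NUL).
print0_end=$(find "$dir" -name '*.log' -print0 | tail -c 1 | od -An -t o1 | tr -d ' \n')

printf '%s\n' "-print ends with $print_end, -print0 ends with $print0_end"
rm -rf "$dir"
```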
find . -maxdepth 1 ! -name "." -print0 | xargs -0 du -b | sort -nr | head -10 | nl
nl: numbers the lines of its input, similar to cat -n, except that by default empty lines are not numbered
for file in *; do du -b "$file"; done|sort -nr|head -10|nl
find . -name "*.txt" -print0 | xargs -0 sed -i 's/aaa/bbb/g'
find . -name '*.txt' -type f -print0 | xargs -0 grep -n 'aaa'  # -n prints line numbers
-print  True; print the full file name on the standard output, followed by a newline. If you are piping the output of find into another program and there is the faintest possibility that the files which you are searching for might contain a newline, then you should seriously consider using the -print0 option instead of -print. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.
-print0  True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the -0 option of xargs.
-0  Input items are terminated by a null character instead of by whitespace, and the quotes and backslash are not special (every character is taken literally). Disables the end of file string, which is treated like any other argument. Useful when input items might contain white space, quote marks, or backslashes. The GNU find -print0 option produces input suitable for this mode.
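The remark about quotes and backslashes can be tried out as well. The sketch below uses two made-up names, `it's.log` and `a\b.log`. The temporary directory is an assumption of the example, not something from the session above.

```shell
#!/bin/sh
dir=$(mktemp -d)
touch "$dir/it's.log"

# Without -0 the apostrophe is read as an opening quote, so xargs gives up.
if find "$dir" -name '*.log' | xargs ls >/dev/null 2>&1; then
    plain=ok
else
    plain=failed
fi

# With -print0 / -0 every byte is taken literally.
if find "$dir" -name '*.log' -print0 | xargs -0 ls >/dev/null 2>&1; then
    nul=ok
else
    nul=failed
fi

# Backslashes: default mode consumes them, -0 mode keeps them.
eaten=$(printf '%s\n' 'a\b.log' | xargs printf '%s\n')
literal=$(printf '%s\0' 'a\b.log' | xargs -0 printf '%s\n')

printf 'plain: %s, with -0: %s\n' "$plain" "$nul"
printf 'default: %s, with -0: %s\n' "$eaten" "$literal"
rm -rf "$dir"
```

So spaces are not the only danger: any name containing a quote or a backslash is also mangled or rejected unless both ends of the pipe agree on NUL as the terminator.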
