cut 截取自定列

可以按照某个字符进行分割,然后取出其中的指定列:

[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt
140.205.201.30 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /rs-status HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "POST /phpmyadmin/ HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /ganglia/index.php HTTP/1.1" -
164.132.91.1 - - [/Dec/::: +] "GET / HTTP/1.1" -
114.215.45.101 - - [/Dec/::: +] "GET / HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /index.php HTTP/1.1" -
140.205.201.30 - - [/Dec/::: +] "GET /jobs/ HTTP/1.1" -
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt |cut -d ' ' -f
"GET
"GET
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"POST
"GET
"GET
"GET
"GET
"GET

可以指定更多的列:

[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt |cut -d ' ' -f ,,
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
- - [/Dec/:::
[root@iz8vbbqbnh4ug2q9so5jflz logs]# cat  localhost_access_log.--.txt |cut -d ' ' -f ,,-
- - "GET / HTTP/1.1" -
- - "GET /rs-status HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /phpmyadmin/ HTTP/1.1" -
- - "POST /phpmyadmin/ HTTP/1.1" -
- - "GET /ganglia/index.php HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET / HTTP/1.1" -
- - "GET /index.php HTTP/1.1" -
- - "GET /jobs/ HTTP/1.1" -

sort 对列进行排序

例如,对tomcat访问日志,对请求响应返回大小进行排序:

cat localhost_access_log.--.txt |sort -t ' ' -k 

-t : 指定分隔符

-k : 指定排序的列

114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/plugin/jquery-file-upload/js/vendor/tmpl.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /img/logo-pale.png HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /interview/detail.do?manageKey=15ba76c6fbeeccd2f8df875379ac88e9&targetPanel=dialog HTTP/1.1"

排序是由方向的,默认是升序排序,如果要降序排列,可以在列号后面增加一个r:

cat localhost_access_log.--.txt |sort -t ' ' -k 10r

最后要注意的是,这里的排序默认是按字符串的字典顺序排列的,如果要按其数值拍,则需要增加一个n:

 cat localhost_access_log.--.txt |sort -t ' ' -k 10n
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /css/smartadmin-production.css HTTP/1.1"
112.65.193.14 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
114.241.108.197 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
223.72.82.98 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "GET /js/jqueryui/1.10.3/jquery-ui.min.js HTTP/1.1"

由此可见,此网站最大的静态资源是这个jquery-ui.min.js文件。

uniq去重

 cat localhost_access_log.--.txt |cut -d ' ' -f , |sort -t ' ' -k 2n,|uniq
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106
114.241.108.197
223.72.82.98
59.108.217.106
112.65.193.14
114.241.108.197
223.72.82.98
59.108.217.106

wc统计

[root@iZ25klm6k7uZ logs]# wc -l localhost_access_log.--.txt  统计行数
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -w localhost_access_log.--.txt 统计词数
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]# wc -m localhost_access_log.--.txt 共计字符数
localhost_access_log.--.txt
[root@iZ25klm6k7uZ logs]#

sed正则查找

用sed来查找500的日志信息:

[root@iZ25klm6k7uZ logs]# sed -n '/\b500\b/p' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"

注意:-n和-p配合,表示只打印匹配的行。

awk正则匹配

用awk来查找500日志信息:

awk '($9 ~ /500/)' localhost_access_log.--.txt 

输出和上面的sed一样。

zwk有默认的分隔符,比如\t,空格等。如果要指定分隔符可以用-F。

zwk的强大之处在于它支持编程,格式如下:

awk pattern { action } 例如上面的查找500日志可以完整表达如下:

[root@iZ25klm6k7uZ logs]# awk -F ' ' '($9 ~ /500/){print }' localhost_access_log.--.txt
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
119.127.17.97 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"
59.108.217.106 - - [/Dec/::: +] "POST /interview/add.do HTTP/1.1"

同时查找500和404的日志:

awk -F ' ' '($9 ~ /500/ || $9 ~ /404/){print $1,$6,$7,$9}' localhost_access_log.--.txt

或者

awk -F ' ' '($9 ~ /500|404|400/){print $1,"-",$4,"-",$6,"-",$9}' localhost_access_log.--.txt

Shell编程之文本处理的更多相关文章

  1. linux —— shell 编程(文本处理)

    导读 本文为博文linux —— shell 编程(整体框架与基础笔记)的第4小点的拓展.(本文所有语句的测试均在 Ubuntu 16.04 LTS 上进行) 目录 基本文本处理 流编辑器sed aw ...

  2. shell编程之文本与日志过滤

    1:grep命令: grep -v  "char"  file_name 匹配不包括"char"的文本 grep -n -w "char" ...

  3. shell编程系列24--shell操作数据库实战之利用shell脚本将文本数据导入到mysql中

    shell编程系列24--shell操作数据库实战之利用shell脚本将文本数据导入到mysql中 利用shell脚本将文本数据导入到mysql中 需求1:处理文本中的数据,将文本中的数据插入到mys ...

  4. shell编程系列11--文本处理三剑客之sed利用sed删除文本中的内容

    shell编程系列11--文本处理三剑客之sed利用sed删除文本中的内容 删除命令对照表 命令 含义 1d 删除第一行内容 ,10d 删除1行到10行的内容 ,+5d 删除10行到16行的内容 /p ...

  5. Linux学习笔记(17) Shell编程之基础

    1. 正则表达式 (1) 正则表达式用来在文件中匹配符合条件的字符串,正则是包含匹配.grep.awk.sed等命令可以支持正则表达式:通配符用来匹配符合条件的文件名,通配符是完全匹配.ls.find ...

  6. Linux Shell编程入门

    从程序员的角度来看, Shell本身是一种用C语言编写的程序,从用户的角度来看,Shell是用户与Linux操作系统沟通的桥梁.用户既可以输入命令执行,又可以利用 Shell脚本编程,完成更加复杂的操 ...

  7. Shell编程菜鸟基础入门笔记

    Shell编程基础入门     1.shell格式:例 shell脚本开发习惯 1.指定解释器 #!/bin/bash 2.脚本开头加版权等信息如:#DATE:时间,#author(作者)#mail: ...

  8. ****CodeIgniter使用cli模式运行,把php作为shell编程

    shell简介 在计算机科学中,Shell俗称壳(用来区别于核).而我们常说的shell简单理解就是一个命令行界面,它使得用户能与操作系统的内核进行交互操作. 常见的shell环境有:MS-DOS.B ...

  9. Linux Shell编程基础

    在学习Linux BASH Shell编程的过程中,发现由于不经常用,所以很多东西很容易忘记,所以写篇文章来记录一下 ls   显示当前路径下的文件,常用的有 -l 显示长格式  -a 显示所有包括隐 ...

随机推荐

  1. Mybatis框架的搭建和基本使用方法

    1.1MyBatis的下载 最新yBatis可以在github官网上下载: https://github.com/mybatis/mybatis-3 1.2 Mybatis Jar包 1.3MyBat ...

  2. DateTime格式

    SELECT * FROM TABLE (TO_DATE('2007/9/1','yyyy/mm/dd') BETWEEN CGGC_STRATDATE AND CGGC_ENDDATE OR CGG ...

  3. C#中泛型之Dictionary

    1.命名空间:System.Collections.Generic(程序集:mscorlib)2.描述: 1).从一组键(Key)到一组值(Value)的映射,每一个添加项都是由一个值及其相关连的键组 ...

  4. C#导出EXCEL没有网格线的解决方法

    今天在做项目时,通过流导出数据到Excel却不显示网格线,真是郁闷.上网查了好久才得一良方(注意<XML>标签中的代码): DataTable thisTable = DBHelper.G ...

  5. Leetcode刷题

    Leetcode题库       本博客用于记录在LeetCode网站上一些题的解答方法.具体实现方法纯属个人的一些解答,这些解答可能不是好的解答方法,记录在此,督促自己多学习多练习.     The ...

  6. python调用c代码2

    1.生成动态链接库 [root@typhoeus79 c]# more head.c #include <stdio.h> #include <stdlib.h> typede ...

  7. 去除HTML选择——兼容IE、FireFox(document.onselectstart,样式)

    引之:http://taoistwar.iteye.com/blog/278963 今天做一个拖动效果,在网上找了个模板,作发后发现一拖动就会选择其它页面部分,需要去除这个效果, 找了个模板看了下发现 ...

  8. ERP服务器简单维护

    前言: 此页内容对于网管高手来说是小儿科,但是以我们对大多数企业的了解,依然有好多企业将服务器的日常维护给忽视了. 所以在此,给大家做一个宣传.让大家提高服务器维护的意识 以提高服务器运行的稳定性.安 ...

  9. 安装memcached

    简介 memcached是免费和开放源代码的高性能分布式内存对象缓存系统,旨在通过减轻数据库负载来加速动态Web应用程序.其有以下特点: 基于简单的文本行协议 全部数据按照k/v形式存放在内存中,无持 ...

  10. NGUI_01

    序言:这是张三疯第一次开始NGUI插件的学习,刚开始学习,肯定有很多漏洞,后期会及时的补上的.希望大家可以见谅,希望大佬多多指教. 扩充:为提供和我一样的小白找不到免费的NGUI插件,这里分享百度网盘 ...