Log Analysis with ELK (Part 4)

Collecting TCP logs with Logstash

#Input plugins: the TCP plugin

Required configuration options

tcp {
    port => ...
}

[root@linux-node1 ~]# cat tcp.conf

input {
    tcp {
        host => "192.168.230.128"
        port => "6666"
    }
}
output {
    stdout {
        codec => "rubydebug"
    }
}

[root@linux-node1 ~]# /opt/logstash/bin/logstash -f tcp.conf

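Open another terminal window and test it:

[root@linux-node1 ~]# echo "hehe" | nc 192.168.230.128 6666

[root@linux-node1 ~]# echo "oldboy" > /dev/tcp/192.168.230.128/6666 # bash's /dev/tcp pseudo-device

[root@linux-node1 ~]# nc 192.168.230.128 6666 < /etc/resolv.conf # a whole file can be sent as well

Check the output in the first window.

#What is the TCP input good for? At work it is handy when some entries that belong in an index were missed: write them to a file by whatever means and push them straight in with nc. You could also set up a file input and collect them again, but that is more work.

#If the file is large and the transfer takes a long time, run it inside screen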
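The nc commands above can also be reproduced from a script. The following is a minimal sketch, assuming the tcp input uses its default line-oriented codec so that each newline-terminated line becomes one event; the function name send_lines is made up for illustration.

```python
import socket


def send_lines(host, port, lines, timeout=5.0):
    """Send each string in `lines` as one newline-terminated UTF-8 line over TCP.

    Returns the number of bytes written.
    """
    payload = "".join(line.rstrip("\n") + "\n" for line in lines).encode("utf-8")
    # create_connection opens the socket; the with-block closes it afterwards,
    # which tells the receiver that the stream has ended.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)


# Example call (needs Logstash running with tcp.conf from above):
# send_lines("192.168.230.128", 6666, ["hehe", "oldboy"])
```

This does the same job as piping echo into nc, which can be convenient when the missed entries are already being assembled by a script.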

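Filter grok

We have covered Input and Output; now for Filter.

The grok filter plugin

There are many filter plugins; here we only look at grok, which uses regular expressions to match the fields in a log line and split them apart. In real production setups, Apache logs are not written as JSON by default, so grok is the usual way to parse them; the MySQL slow query log cannot be split any other way either and has to be matched with grok regular expressions.

The GitHub link below has many ready-made grok patterns that can be referenced directly: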

https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

Official documentation:

https://www.elastic.co/guide/en/logstash/2.3/plugins-filters-grok.html

Logstash ships with about 120 patterns by default. You can find them here: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns. You can add your own trivially. (See the patterns_dir setting)

Examples: With that idea of a syntax and semantic, we can pull out useful fields from a sample log like this fictional http request log:

55.3.244.1 GET /index.html 15824 0.043

The pattern for this could be the following, built from predefined regular expressions that are referenced by name:

%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

A more realistic example, let’s read these logs from a file:

input {
    file {
        path => "/var/log/http.log"
    }
}
filter {
    grok {
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
}

After the grok filter, the event will have a few extra fields in it:

    client: 55.3.244.1
    method: GET
    request: /index.html
    bytes: 15824
    duration: 0.043
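To make the %{SYNTAX:SEMANTIC} idea concrete, here is a rough sketch that expands the same expression into a Python regular expression with named groups. The four regexes in PATTERNS are simplified stand-ins chosen for this example, not the real definitions from grok-patterns, and grok_to_regex and parse are hypothetical helper names.

```python
import re

# Simplified stand-ins for the grok patterns used above. The real
# definitions in grok-patterns are stricter (IP also covers IPv6, for example).
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "URIPATHPARAM": r"/\S*",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
}

GROK = ("%{IP:client} %{WORD:method} %{URIPATHPARAM:request} "
        "%{NUMBER:bytes} %{NUMBER:duration}")


def grok_to_regex(expr):
    """Replace every %{NAME:field} with a named group (?P<field>...)."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PATTERNS[name])
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expr))


def parse(line):
    """Return the captured fields as a dict of strings, or None if no match."""
    found = grok_to_regex(GROK).match(line)
    return found.groupdict() if found else None


print(parse("55.3.244.1 GET /index.html 15824 0.043"))
```

As with grok itself, every captured value comes back as a string unless it is converted explicitly.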

Let's test it:

[root@linux-node1 ~]# cat grok.conf

input {
    stdin {}
}
filter {
    grok {
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
    }
}
output {
    stdout {
        codec => "rubydebug"
    }
}

[root@linux-node1 ~]# /opt/logstash/bin/logstash -f grok.conf

Settings: Default pipeline workers:2

Pipeline main started

55.3.244.1 GET /index.html 15824 0.043 # type this line
{
    "message" => "55.3.244.1 GET /index.html 15824 0.043",
    "@version" => "1",
    "@timestamp" => "2017-01-05T15:21:49.510Z",
    "host" => "linux-node1.example.com",
    "client" => "55.3.244.1", # the client field was extracted automatically
    "method" => "GET",
    "request" => "/index.html",
    "bytes" => "15824",
    "duration" => "0.043"
}

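How did these patterns get picked up automatically? They are bundled with the software when it is installed.

[root@linux-node1 patterns]# pwd # the grok-patterns file lives in this directory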

/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.5/patterns
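A pattern file is plain text with one definition per line: a name, a space, then the regular expression, which may itself reference other patterns. The sketch below loads that format into a dictionary. The three sample lines are included only for illustration, and load_patterns is a hypothetical helper, not part of Logstash.

```python
def load_patterns(text):
    """Parse 'NAME regex' lines, skipping blank lines and # comments."""
    patterns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Only the first space separates the name from the expression;
        # the expression itself may contain spaces.
        name, _, regex = line.partition(" ")
        patterns[name] = regex.strip()
    return patterns


# Sample lines in the file's format (raw string keeps the backslashes).
SAMPLE = r"""# sample entries
USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
WORD \b\w+\b
"""

for name, regex in load_patterns(SAMPLE).items():
    print(name, "=>", regex)
```

Custom definitions written in the same format are what the patterns_dir setting mentioned earlier points grok at.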

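VII. Decoupling Logstash with a Message Queue

The data source writes into a Logstash input plugin, and the output plugin pushes each event into a message queue. A second Logstash instance (the indexing instance) uses an input plugin to read from the queue, runs the events through filter plugins, and then writes them to elasticsearch with an output plugin.
　　If grok regular expressions are not used in production, a Python script can read from the message queue and write to elasticsearch instead.

Redis is used here for decoupling.

Advantages of this architecture:

Decoupling: the components are only loosely coupled

It works around cases where a host cannot connect to elasticsearch directly for network reasons

The architecture is easier to evolve and extend

The message queue can be rabbitmq, zeromq and so on; redis or kafka (which does not delete messages once consumed, but is relatively heavyweight) also work.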
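The shipper/queue/indexer split described above can be sketched without any server. In this toy example a collections.deque stands in for the queue and a plain list stands in for elasticsearch; both are assumptions made so the example is self-contained, and ship and index_pending are hypothetical names.

```python
import json
from collections import deque

queue = deque()   # stands in for the message queue (a Redis list, for example)
indexed = []      # stands in for elasticsearch


def ship(message, host="linux-node1.example.com"):
    """Shipper side: serialize the event and append it to the tail of the queue."""
    queue.append(json.dumps({"message": message, "host": host}))


def index_pending(batch_size):
    """Indexer side: take up to batch_size events from the head and store them."""
    taken = 0
    while queue and taken < batch_size:
        indexed.append(json.loads(queue.popleft()))
        taken += 1
    return taken


for word in ["chuck", "sisi", "a"]:
    ship(word)

print(index_pending(2), "indexed,", len(queue), "still queued")
```

The shipper never talks to the indexer directly: if the indexer is slow or down, events simply wait in the queue, which is the decoupling benefit listed above.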

Bringing redis into the architecture

#Install redis

yum -y install redis

#Edit the config file

[root@linux-node1 conf.d]# grep '^[a-z]' /etc/redis.conf

daemonize yes  # change this line to yes so that redis runs in the background

pidfile /var/run/redis/redis.pid

port 6379

tcp-backlog 511

bind 192.168.230.128 # IP address to listen on

[root@linux-node1 conf.d]# systemctl start redis

[root@linux-node1 conf.d]# netstat -ntpl | grep 6379

tcp        0      0 192.168.230.128:6379    0.0.0.0:*               LISTEN      2998/redis-server 1

#Let's test it

[root@linux-node1 conf.d]# cat redis-out.conf

input {
    stdin {}
}
output {
    redis {
        host => "192.168.230.128"
        port => "6379"
        db => "6"
        data_type => "list" # store events in a redis list
        key => "demo"
    }
}

#Start Logstash with this config and type some input

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f redis-out.conf

Settings: Default pipeline workers:4

Pipeline main started

chuck # typed input

sisi 

#Open another window, connect to redis and run info

[root@linux-node1 conf.d]# redis-cli -h 192.168.230.128

192.168.230.128:6379> info

# Server

redis_version:2.8.19

redis_git_sha1:00000000

redis_git_dirty:0

redis_build_id:c0359e7aa3798aa2

redis_mode:standalone

os:Linux 3.10.0-123.el7.x86_64 x86_64

arch_bits:64

multiplexing_api:epoll

gcc_version:4.8.3

process_id:6518

run_id:3ab08fa2b91c79194b9f5c15b7c54680461f6e07

tcp_port:6379

uptime_in_seconds:165

uptime_in_days:0

hz:10

lru_clock:10407823

config_file:/etc/redis.conf

# Clients

connected_clients:2

client_longest_output_list:0

client_biggest_input_buf:0

blocked_clients:0

# Memory

used_memory:2211840

used_memory_human:2.11M

used_memory_rss:2895872

used_memory_peak:2211840

used_memory_peak_human:2.11M

used_memory_lua:35840

mem_fragmentation_ratio:1.31

mem_allocator:jemalloc-3.6.0

# Persistence

loading:0

rdb_changes_since_last_save:2

rdb_bgsave_in_progress:0

rdb_last_save_time:1486802666

rdb_last_bgsave_status:ok

rdb_last_bgsave_time_sec:-1

rdb_current_bgsave_time_sec:-1

aof_enabled:0

aof_rewrite_in_progress:0

aof_rewrite_scheduled:0

aof_last_rewrite_time_sec:-1

aof_current_rewrite_time_sec:-1

aof_last_bgrewrite_status:ok

aof_last_write_status:ok

# Stats

total_connections_received:2

total_commands_processed:3

instantaneous_ops_per_sec:0

total_net_input_bytes:316

total_net_output_bytes:13

instantaneous_input_kbps:0.00

instantaneous_output_kbps:0.00

rejected_connections:0

sync_full:0

sync_partial_ok:0

sync_partial_err:0

expired_keys:0

evicted_keys:0

keyspace_hits:0

keyspace_misses:0

pubsub_channels:0

pubsub_patterns:0

latest_fork_usec:0

# Replication

role:master

connected_slaves:0

master_repl_offset:0

repl_backlog_active:0

repl_backlog_size:1048576

repl_backlog_first_byte_offset:0

repl_backlog_histlen:0

# CPU

used_cpu_sys:0.25

used_cpu_user:0.02

used_cpu_sys_children:0.00

used_cpu_user_children:0.00

# Keyspace

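db6:keys=3,expires=0,avg_ttl=0 # db 6 was created by the output above and holds the demo key

192.168.230.128:6379> select 6 # switch to db 6

OK

192.168.230.128:6379[6]> keys * # the demo key is in here

1) "demo"

demo is a list. To look at the messages in it:

192.168.230.128:6379[6]> LINDEX demo -1 # -1 is the last element; the content (message, host, timestamp and so on) shows that the event was written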

"{\"message\":\"sisi\",\"@version\":\"1\",\"@timestamp\":\"2017-01-26T13:14:37.766Z\",\"host\":\"linux-node1.example.com\"}"

192.168.230.128:6379[6]> LINDEX demo -2

"{\"message\":\"chuck\",\"@version\":\"1\",\"@timestamp\":\"2017-02-11T08:46:47.597Z\",\"host\":\"linux-node1.example.com\"}"

To prepare for the next step, where an input plugin reads these messages and sends them to elasticsearch, write some more data into redis:

[root@linux-node1 ~]# /opt/logstash/bin/logstash -f redis-out.conf

Settings: Default filter workers:1

Logstash startup completed

chuck

sisi

a

b

c

d

e

f

g

h

i

j

k

l

m

n

o

p

q

r

s

t

u

v

w

x

y

z

k

l

m

n

g

s

#Check the length of the demo key in redis

192.168.230.128:6379[6]> LLEN demo

(integer) 31

#Use redis to send the messages on to elasticsearch

Write redis-in.conf

[root@linux-node1 conf.d]# cp redis-out.conf redis-in.conf

[root@linux-node1 conf.d]# cat redis-in.conf

input {
    redis {
        host => "192.168.230.128"
        port => "6379"
        db => "6"
        data_type => "list"
        key => "demo"
    }
}
output {
    elasticsearch {
        hosts => ["192.168.230.128:9200"]
        index => "redis-demo-%{+YYYY.MM.dd}"
    }
}

#Start Logstash with this config

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f redis-in.conf

Settings: Default pipeline workers:4

Pipeline main started

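#Keep refreshing the length of the demo key (the list drains quickly, so be fast)

192.168.230.128:6379[6]> LLEN demo

(integer) 25

192.168.230.128:6379[6]> LLEN demo

(integer) 7 # the messages in redis are being written to elasticsearch

192.168.230.128:6379[6]> LLEN demo

(integer) 0

In elasticsearch a redis-demo index has now appeared. Because the events were written on different dates, there are two indices.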
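Why two indices? The date suffix in the index setting is filled in from each event's @timestamp, so events written on different days land in different indices. The sketch below mimics that expansion for the two timestamps seen in the LINDEX output earlier; index_name is a hypothetical helper, not a Logstash function.

```python
from datetime import datetime


def index_name(prefix, timestamp):
    """Build '<prefix>-YYYY.MM.dd' from an ISO-8601 UTC @timestamp string."""
    day = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return "%s-%s" % (prefix, day.strftime("%Y.%m.%d"))


# The two @timestamp values that appeared in the demo list.
for stamp in ["2017-01-26T13:14:37.766Z", "2017-02-11T08:46:47.597Z"]:
    print(index_name("redis-demo", stamp))
```

One index per day is a common layout because old data can then be dropped by deleting whole indices.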

Route the contents of all.conf through redis

Write shipper.conf, the Logstash config that collects logs into redis

[root@linux-node1 conf.d]# cat shipper.conf

input {
    syslog {
        type => "system-syslog"
        host => "192.168.230.128"
        port => ""
    }
    file {
        path => "/var/log/nginx/access_json.log"
        codec => json
        start_position => "beginning"
        type => "nginx-log"
    }
    file {
        path => "/var/log/messages"
        type => "system"
        start_position => "beginning"
    }
    file {
        path => "/var/log/elasticsearch/check-cluster.log"
        type => "es-error"
        start_position => "beginning"
        codec => multiline {
            pattern => "^\["
            negate => true
            what => "previous"
        }
    }
}
output {
    if [type] == "system" {
        redis {
            host => "192.168.230.128"
            port => "6379"
            db => "6"
            data_type => "list"
            key => "system"
        }
    }
    if [type] == "es-error" {
        redis {
            host => "192.168.230.128"
            port => "6379"
            db => "6"
            data_type => "list"
            key => "es-error"
        }
    }
    if [type] == "system-syslog" {
        redis {
            host => "192.168.230.128"
            port => "6379"
            db => "6"
            data_type => "list"
            key => "system-syslog"
        }
    }
    if [type] == "nginx-log" {
        redis {
            host => "192.168.230.128"
            port => "6379"
            db => "6"
            data_type => "list"
            key => "nginx-log"
        }
    }
}

#Check the keys in redis

192.168.230.128:6379[6]> select 6

192.168.230.128:6379[6]> keys *

1) "system"

2) "system-syslog"

3) "es-error"

Write indexer.conf, the Logstash config that reads from redis and sends to elasticsearch

[root@linux-node2 /]# cat indexer.conf

input {
    redis {
        type => "system"
        host => "192.168.230.128"
        port => "6379"
        db => "6"
        data_type => "list"
        key => "system"
    }
    redis {
        type => "es-error"
        host => "192.168.230.128"
        port => "6379"
        db => "6"
        data_type => "list"
        key => "es-error"
    }
    redis {
        type => "system-syslog"
        host => "192.168.230.128"
        port => "6379"
        db => "6"
        data_type => "list"
        key => "system-syslog"
    }
    redis {
        type => "nginx-log"
        host => "192.168.230.128"
        port => "6379"
        db => "6"
        data_type => "list"
        key => "nginx-log"
    }
}
output {
    if [type] == "system" {
        elasticsearch {
            hosts => ["192.168.230.128:9200"]
            index => "system-%{+YYYY.MM.dd}"
        }
    }
    if [type] == "es-error" {
        elasticsearch {
            hosts => ["192.168.230.128:9200"]
            index => "es-error-%{+YYYY.MM.dd}"
        }
    }
    if [type] == "system-syslog" {
        elasticsearch {
            hosts => ["192.168.230.128:9200","192.168.230.129:9200"]
            index => "system-syslog-%{+YYYY.MM.dd}"
        }
    }
    if [type] == "nginx-log" {
        elasticsearch {
            hosts => ["192.168.230.128:9200","192.168.230.129:9200"]
            index => "nginx-log-%{+YYYY.MM.dd}"
        }
    }
}
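The four conditionals in the output section amount to a lookup from the event's type to a set of elasticsearch hosts and an index prefix. As an illustration only, the same routing can be written as a table; ROUTES and route are hypothetical names, and the host lists are copied from indexer.conf.

```python
# type -> elasticsearch hosts, copied from the output section of indexer.conf
ROUTES = {
    "system": ["192.168.230.128:9200"],
    "es-error": ["192.168.230.128:9200"],
    "system-syslog": ["192.168.230.128:9200", "192.168.230.129:9200"],
    "nginx-log": ["192.168.230.128:9200", "192.168.230.129:9200"],
}


def route(event, day):
    """Return (hosts, index) for an event, or None when no conditional matches.

    `day` is the already formatted date suffix, for example "2017.02.11".
    """
    hosts = ROUTES.get(event.get("type"))
    if hosts is None:
        # The config has no else branch, so such an event is sent nowhere.
        return None
    return hosts, "%s-%s" % (event["type"], day)


print(route({"type": "nginx-log", "message": "zsq"}, "2017.02.11"))
```

Seen this way, adding a new log type means adding one input block on each side plus one more row in this mapping.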

#Start shipper.conf

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f shipper.conf

Settings: Default pipeline workers:4

Pipeline main started

Because the log volume is small, everything is quickly sent on to elasticsearch and the keys disappear, so append more data to the logs:

[root@linux-node1 conf.d]# for n in `seq 10000`; do echo $n >> /var/log/nginx/access_json.log; done

[root@linux-node1 conf.d]# for n in `seq 10000`; do echo $n >> /var/log/messages; done

[root@linux-node1 conf.d]# for n in `seq 10000`; do echo $n >> /var/log/elasticsearch/check-cluster.log; done

Check the length of the key; it is growing:

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 2450

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 2680

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 2920

#Start indexer.conf

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f indexer.conf

Settings: Default pipeline workers:4

Pipeline main started

#Check the length of the key; it is shrinking

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 20000

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 19875

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 19875

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 19750

192.168.230.128:6379[6]> LLEN nginx-log

(integer) 19750

View the nginx-log index in Kibana

Real-time write test: start shipper.conf on node 1

[root@linux-node1 conf.d]# /opt/logstash/bin/logstash -f shipper.conf

Settings: Default pipeline workers:4

Pipeline main started

#Start indexer.conf on node 2

[root@linux-node2 /]# /opt/logstash/bin/logstash -f indexer.conf

OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N

Settings: Default pipeline workers:1

Pipeline main started

#Append something to the nginx log

[root@linux-node1 conf.d]# for n in `echo zsq`; do echo $n >> /var/log/nginx/access_json.log; done

Search for the keyword in Kibana
