Processing JSON with Logstash filters
Build separate indices based on a field of the input JSON. A loop generates register and login log entries and saves them to a testlog file; the result looks like this:
{"method":"register","user_id":2933,"user_name":"name_91","level":27,"login_time":1470179550}
{"method":"login","user_id":1247,"user_name":"name_979","level":1,"register_time":1470179550}
{"method":"register","user_id":2896,"user_name":"name_1972","level":17,"login_time":1470179550}
{"method":"login","user_id":2411,"user_name":"name_2719","level":1,"register_time":1470179550}
{"method":"register","user_id":1588,"user_name":"name_1484","level":4,"login_time":1470179550}
{"method":"login","user_id":2507,"user_name":"name_1190","level":1,"register_time":1470179550}
{"method":"register","user_id":2382,"user_name":"name_234","level":21,"login_time":1470179550}
{"method":"login","user_id":1208,"user_name":"name_443","level":1,"register_time":1470179550}
{"method":"register","user_id":1331,"user_name":"name_1297","level":3,"login_time":1470179550}
{"method":"login","user_id":2809,"user_name":"name_743","level":1,"register_time":1470179550}
Create a config file in the logstash directory:
vim config/json.conf

input {
    file {
        path => "/home/bona/logstash-2.3.4/testlog"
        start_position => "beginning"
        codec => "json"
    }
}
output {
    elasticsearch {
        hosts => ["192.168.68.135:9200"]
        index => "data_%{method}"
    }
}

The key point is the index setting: %{method} is substituted with the method field of each log entry.
The log above therefore ends up in two separate indices, data_login and data_register. Note that index names must be all lowercase.
Here is another example.
Raw data:
{"countnum":2,"checktime":"2017-05-23 16:59:32"}
{"countnum":2,"checktime":"2017-05-23 16:59:32"}
1. No field type conversion involved: the following logstash filter configuration is enough
if [type] == "onlinecount" {
    json {
        source => "message"
    }
}
2. With field type conversion, the logstash filter looks like this:
if [type] == "onlinecount" {
    mutate {
        split => ["message", ","]
        add_field => { "coutnum" => "%{[message][0]}" }
        add_field => { "checktime" => "%{[message][1]}" }
        remove_field => ["message"]
    }
    # A json block cannot take two source/target settings -- the later one
    # simply overrides the earlier. The type conversion is better done with
    # mutate convert:
    mutate {
        convert => { "coutnum" => "integer" }
    }
}
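As a rough Python analog of what that mutate block is doing, assuming the raw message really is comma-separated like `2,2017-05-23 16:59:32` (the helper name is hypothetical, for illustration only):

```python
def split_message(message):
    """Rough analog of mutate split + add_field + convert:
    split on ',' and coerce the count to an integer."""
    parts = message.split(",")
    return {"countnum": int(parts[0]), "checktime": parts[1]}
```

For example, `split_message("2,2017-05-23 16:59:32")` yields an integer countnum and a string checktime.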
Kafka data (multiple JSON lines merged into one message):
{"cluster":"qy_api_v2_pool","body_bytes_sent":"8579","http_versioncode":"Android_32"}\n
{"cluster":"qy_api_v2_pool","body_bytes_sent":"8579","http_versioncode":"Android_33"}\n
{"cluster":"qy_api_v2_pool","body_bytes_sent":"8579","http_versioncode":"Android_34"}\n
....
For performance reasons, the Kafka team merges multiple raw log lines into a single message (entries separated by newlines), so what I read from Kafka must be split back into individual events before writing to ES, otherwise the data is wrong. How should this be handled?
Solved. I first went down a wrong path: the approach below leaves everything in a single event, because mutate split only turns the field into an array inside one event; it does not emit new events.
filter {
    mutate {
        split => ["message", "
"]
    }
}
The correct solution is the split filter, which clones the event once per element:
filter {
    split {
        field => "message"
    }
}
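What the split filter does can be sketched in Python: one combined message becomes one event per line (a simplified analog, not Logstash internals):

```python
def split_events(message, terminator="\n"):
    """Analog of the split filter: split the combined message on the
    terminator and emit one event per non-empty piece."""
    return [line for line in message.split(terminator) if line]
```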
One remaining question: split's terminator defaults to \n, but why does the following not split, while omitting terminator works?
filter {
    split {
        field => "message"
        terminator => "\\n"
    }
}
The likely reason: Logstash config strings do not interpret escape sequences by default, so "\\n" means a literal backslash followed by the letter n, not a newline. Leaving terminator unset keeps the real-newline default.
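A quick way to see the difference between a real newline and the two-character sequence `\\n` (this mirrors how an uninterpreted config string behaves; a sketch, not Logstash itself):

```python
# combined contains a real newline between the two JSON documents
combined = '{"a":1}\n{"a":2}'

# splitting on a real newline separates the documents
real_newline_parts = combined.split("\n")

# "\\n" is backslash + n, which never occurs in the text, so nothing splits
literal_parts = combined.split("\\n")
```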
Given this JSON:
{
"name":"zhangsan",
"friends":
{
"friend1":"lisi",
"friend2":"wangwu",
"msg":["haha","yaya"]
}
}
Parse it into:
{
"name":"zhangsan",
"friend1":"lisi",
"friend2":"wangwu",
"msg":["haha","yaya"]
}
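The flattening itself is a one-level merge of the nested object's keys into the parent, which is easy to express in Python (the helper name is made up for illustration):

```python
def flatten_one_level(doc, key):
    """Lift the keys of a nested object up into the parent and drop the
    original nested key."""
    out = {k: v for k, v in doc.items() if k != key}
    out.update(doc[key])
    return out

doc = {"name": "zhangsan",
       "friends": {"friend1": "lisi", "friend2": "wangwu",
                   "msg": ["haha", "yaya"]}}
flat = flatten_one_level(doc, "friends")
```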
logstash.conf
input
{
    stdin
    {
        codec => json
    }
}
filter
{
    mutate
    {
        add_field => { "@friends" => "%{friends}" } # First create a new field and copy friends into it
    }
    json
    {
        source => "@friends" # Then parse the copy
        remove_field => [ "@friends", "friends" ] # Drop the now-redundant fields (optional)
    }
}
output
{
    stdout { }
}
---------------------
Author: 姚贤贤
Source: CSDN
Original: https://blog.csdn.net/u011311291/article/details/86743642
Copyright notice: this is an original post by the author; please include a link to it when republishing.
Our event-tracking logs are nested JSON, so to expand every field for analysis, the nested JSON must be flattened.
- The log format is as follows:
2019-01-22 19:25:58 172.17.12.177 /statistics/EventAgent appkey=yiche&enc=0&ltype=view&yc_log={"uuid":"73B333EB-EC87-4F9F-867B-A9BF38CBEBB2","mac":"02:00:00:00:00:00","uid":-1,"idfa":"2BFD67CF-ED60-4CF6-BA6E-FC0B18FDDDF8","osv":"iOS11.4.1","fac":"apple","mdl":"iPhone SE","req_id":"360C8C43-73AC-4429-9E43-2C08F4C1C425","itime":1548156351820,"os":"2","sn_id":"6B937D83-BFB2-4C22-85A8-5B3E82D9D0F1","dvid":"3676b52dc155e1eec3ca514f38736fd6","aptkn":"4fb9b2bffb808515aa0e9a5f5b17d826769e432f63d5cf87f7fb5ce4d67ef9f1","cha":"App Store","idfv":"B1EAD56F-E456-4FF2-A3C2-9A8FA0693C22","nt":4,"lg_vl":{"pfrom":"shouye","ptitle":"shouye"},"av":"10.3.3"} 218.15.255.124 200
- The initial Logstash config looked like this:
input {
    file {
        path => ["/data/test_logstash.log"]
        type => ["nginx_log"]
        start_position => "beginning"
    }
}
filter {
    if [type] =~ "nginx_log" {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:create_time} %{IP:server_ip} %{URIPATH:uri} %{GREEDYDATA:args} %{IP:client_ip} %{NUMBER:status}" }
        }
        urldecode {
            field => "args"
        }
        kv {
            source => "args"
            field_split => "&"
            remove_field => [ "args", "@timestamp", "message", "path", "@version", "host" ]
        }
        json {
            source => "yc_log"
            remove_field => [ "yc_log" ]
        }
    }
}
output {
    stdout { codec => rubydebug }
}
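As a sanity check, what the urldecode + kv pair does to the args portion can be approximated with Python's standard library (an analog for illustration, not how Logstash runs internally):

```python
from urllib.parse import unquote, parse_qsl

def parse_args(args):
    """Rough analog of the urldecode + kv filters: URL-decode the query
    string, then split key=value pairs on '&'."""
    return dict(parse_qsl(unquote(args)))

fields = parse_args("appkey=yiche&enc=0&ltype=view")
```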
Running Logstash with this configuration gives the following result:
{
"server_ip" => "172.17.12.177",
"cha" => "App Store",
"mdl" => "iPhone SE",
"type" => "nginx_log",
"mac" => "02:00:00:00:00:00",
"ptitle" => "shouye",
"appkey" => "yiche",
"idfv" => "B1EAD56F-E456-4FF2-A3C2-9A8FA0693C22",
"sn_id" => "6B937D83-BFB2-4C22-85A8-5B3E82D9D0F1",
"aptkn" => "4fb9b2bffb808515aa0e9a5f5b17d826769e432f63d5cf87f7fb5ce4d67ef9f1",
"av" => "10.3.3",
"os" => "2",
"idfa" => "2BFD67CF-ED60-4CF6-BA6E-FC0B18FDDDF8",
"uid" => -1,
"uuid" => "73B333EB-EC87-4F9F-867B-A9BF38CBEBB2",
"req_id" => "360C8C43-73AC-4429-9E43-2C08F4C1C425",
"status" => "200",
"uri" => "/statistics/EventAgent",
"enc" => "0",
"ltype" => "view",
"lg_vl" => {
"ptitle" => "shouye",
"pfrom" => "shouye"
},
"nt" => 4,
"pfrom" => "shouye",
"itime" => 1548156351820,
"client_ip" => "218.15.255.124",
"create_time" => "2019-01-22 19:25:58",
"dvid" => "3676b52dc155e1eec3ca514f38736fd6",
"fac" => "apple",
"lg_value" => "{\"pfrom\":\"shouye\",\"ptitle\":\"shouye\"}",
"osv" => "iOS11.4.1"
}
As you can see, the lg_vl field is still in JSON form; it was not expanded. Simply adding
json { source => "lg_vl" }
to the config throws a JsonParseException, most likely because lg_vl has already been parsed into a nested object at that point rather than a JSON string.
- The working approach:
input {
    file {
        path => ["/data/test_logstash.log"]
        type => ["nginx_log"]
        start_position => "beginning"
    }
}
filter {
    if [type] =~ "nginx_log" {
        grok {
            match => { "message" => "%{TIMESTAMP_ISO8601:create_time} %{IP:server_ip} %{URIPATH:uri} %{GREEDYDATA:args} %{IP:client_ip} %{NUMBER:status}" }
        }
        urldecode {
            field => "args"
        }
        kv {
            source => "args"
            field_split => "&"
            remove_field => [ "args", "@timestamp", "message", "path", "@version", "host" ]
        }
        json {
            source => "yc_log"
            remove_field => [ "yc_log" ]
        }
        mutate {
            add_field => { "lg_value" => "%{lg_vl}" }
        }
        json {
            source => "lg_value"
            remove_field => [ "lg_vl", "lg_value" ]
        }
    }
}
output {
    stdout { codec => rubydebug }
}
After the outer JSON has been parsed, add a new field lg_value and copy the contents of lg_vl into it; the %{lg_vl} reference serializes the nested object back into a JSON string, which the second json filter can then parse on its own. The parsed result:
{
"type" => "nginx_log",
"nt" => 4,
"dvid" => "3676b52dc155e1eec3ca514f38736fd6",
"os" => "2",
"fac" => "apple",
"ltype" => "view",
"client_ip" => "218.15.255.124",
"itime" => 1548156351820,
"mac" => "02:00:00:00:00:00",
"idfa" => "2BFD67CF-ED60-4CF6-BA6E-FC0B18FDDDF8",
"uri" => "/statistics/EventAgent",
"aptkn" => "4fb9b2bffb808515aa0e9a5f5b17d826769e432f63d5cf87f7fb5ce4d67ef9f1",
"sn_id" => "6B937D83-BFB2-4C22-85A8-5B3E82D9D0F1",
"create_time" => "2019-01-22 19:25:58",
"osv" => "iOS11.4.1",
"req_id" => "360C8C43-73AC-4429-9E43-2C08F4C1C425",
"ptitle" => "shouye",
"av" => "10.3.3",
"server_ip" => "172.17.12.177",
"pfrom" => "shouye",
"enc" => "0",
"mdl" => "iPhone SE",
"cha" => "App Store",
"idfv" => "B1EAD56F-E456-4FF2-A3C2-9A8FA0693C22",
"uid" => -1,
"uuid" => "73B333EB-EC87-4F9F-867B-A9BF38CBEBB2",
"appkey" => "yiche",
"status" => "200"
}
Perfect!
Author: 神秘的寇先森
Link: https://www.jianshu.com/p/de06284e1484
Source: 简书 (Jianshu)
Copyright belongs to the author; for any form of republication, please contact the author for permission and cite the source.
Logstash: replacing strings, parsing JSON data, converting field types, and extracting the log time
In some cases a log file is almost JSON, but it uses single quotes, as in the format below. We need to extract the correct fields and field types from this data:
{'usdCnyRate': '6.728', 'futureIndex': '463.36', 'timestamp': '1532933162361'}
{'usdCnyRate': '6.728', 'futureIndex': '463.378', 'timestamp': '1532933222335'}
{'usdCnyRate': '6.728', 'futureIndex': '463.38', 'timestamp': '1532933348347'}
{'usdCnyRate': '6.728', 'futureIndex': '463.252', 'timestamp': '1532933366866'}
{'usdCnyRate': '6.728', 'futureIndex': '463.31', 'timestamp': '1532933372350'}
{'usdCnyRate': '6.728', 'futureIndex': '463.046', 'timestamp': '1532933426899'}
{'usdCnyRate': '6.728', 'futureIndex': '462.806', 'timestamp': '1532933432346'}
{'usdCnyRate': '6.728', 'futureIndex': '462.956', 'timestamp': '1532933438353'}
{'usdCnyRate': '6.728', 'futureIndex': '462.954', 'timestamp': '1532933456796'}
{'usdCnyRate': '6.728', 'futureIndex': '462.856', 'timestamp': '1532933492411'}
{'usdCnyRate': '6.728', 'futureIndex': '462.776', 'timestamp': '1532933564378'}
{'usdCnyRate': '6.728', 'futureIndex': '462.628', 'timestamp': '1532933576849'}
{'usdCnyRate': '6.728', 'futureIndex': '462.612', 'timestamp': '1532933588338'}
{'usdCnyRate': '6.728', 'futureIndex': '462.718', 'timestamp': '1532933636808'}
If we try to parse these lines directly with the logstash json filter plugin, it fails like this:
[WARN ] 2018-07-31 10:20:12.708 [Ruby-0-Thread-5@[main]>worker1: :1] json - Error parsing json {:source=>"message", :raw=>"{'usdCnyRate': '6.728', 'futureIndex': '462.134', 'timestamp': '1532933714371'}", :exception=>#<LogStash::Json::ParserError: Unexpected character (''' (code 39)): was expecting double-quote to start field name at [Source: (byte[])"{'usdCnyRate': '6.728', 'futureIndex': '462.134', 'timestamp': '1532933714371'}"; line: 1, column: 3]>}
The simplest fix here is to replace the single quotes with double quotes, using logstash mutate gsub. Look closely at the gsub block below, which performs the replacement, and the json block that then parses the message. We also need to convert usdCnyRate and futureIndex to float (the mutate convert block), and turn timestamp into a proper time stored in a new logdate field (the date block), using the
logstash date filter plugin
input {
    file {
        path => "/usr/share/logstash/wb.cond/test.log"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}
filter {
    mutate {
        gsub => [
            "message", "'", '"'
        ]
    }
    json {
        source => "message"
    }
    mutate {
        convert => {
            "usdCnyRate" => "float"
            "futureIndex" => "float"
        }
    }
    date {
        match => [ "timestamp", "UNIX_MS" ]
        target => "logdate"
    }
}
output {
    stdout {
        codec => rubydebug
    }
}
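The whole chain can be mimicked in Python to check the expected values (an analog of the gsub + json + convert + date steps, under the assumption that no field value itself contains a single quote, which would break the quote replacement):

```python
import json
from datetime import datetime, timezone

def parse_line(line):
    """Swap single quotes for double quotes, parse the JSON, coerce the
    numeric fields to float, and turn the epoch-millisecond timestamp
    into a UTC datetime stored separately (like the logdate target)."""
    doc = json.loads(line.replace("'", '"'))
    doc["usdCnyRate"] = float(doc["usdCnyRate"])
    doc["futureIndex"] = float(doc["futureIndex"])
    doc["logdate"] = datetime.fromtimestamp(int(doc["timestamp"]) / 1000.0,
                                            tz=timezone.utc)
    return doc

rec = parse_line("{'usdCnyRate': '6.728', 'futureIndex': '463.36', 'timestamp': '1532933162361'}")
```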
With this configuration, the fields and their types are parsed out of the log file correctly:
{
"message" => "{\"usdCnyRate\": \"6.728\", \"futureIndex\": \"463.378\", \"timestamp\": \"1532933222335\"}",
"@timestamp" => 2018-07-31T10:48:48.600Z,
"host" => "logstashvm0",
"path" => "/usr/share/logstash/wb.cond/test.log",
"@version" => "1",
"logdate" => 2018-07-30T06:47:02.335Z,
"usdCnyRate" => 6.728,
"timestamp" => "1532933222335",
"futureIndex" => 463.378
}
{
"message" => "{\"usdCnyRate\": \"6.728\", \"futureIndex\": \"463.252\", \"timestamp\": \"1532933366866\"}",
"@timestamp" => 2018-07-31T10:48:48.602Z,
"host" => "logstashvm0",
"path" => "/usr/share/logstash/wb.cond/test.log",
"@version" => "1",
"logdate" => 2018-07-30T06:49:26.866Z,
"usdCnyRate" => 6.728,
"timestamp" => "1532933366866",
"futureIndex" => 463.252
}
{
"message" => "{\"usdCnyRate\": \"6.728\", \"futureIndex\": \"463.31\", \"timestamp\": \"1532933372350\"}",
"@timestamp" => 2018-07-31T10:48:48.602Z,
"host" => "logstashvm0",
"path" => "/usr/share/logstash/wb.cond/test.log",
"@version" => "1",
"logdate" => 2018-07-30T06:49:32.350Z,
"usdCnyRate" => 6.728,
"timestamp" => "1532933372350",
"futureIndex" => 463.31
}