Logstash grok filter for custom logs
I have two related questions. The first is how best to grok logs that have "messy" spacing and so on. The second, which I'll ask separately, is how to deal with logs that have arbitrary attribute-value pairs. (See: Logstash grok filter for logs with arbitrary attribute-value pairs.)
So for the first question, I have a log line that looks like this:
14:46:16.603 [http-nio-8080-exec-4] info metering - msg=93e6dd5e-c009-46b3-b9eb-f753ee3b889a create_job job=a820018e-7ad7-481a-97b0-bd705c3280ad data=71b1652e-16c8-4b33-9a57-f5fcb3d5de92
Using http://grokdebug.herokuapp.com/ I was able to come up with the following grok pattern that works for this line:
%{TIME:timestamp} %{NOTSPACE:http} %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{NOTSPACE:msg}%{SPACE}%{WORD:action}%{SPACE}job=%{NOTSPACE:job}%{SPACE}data=%{NOTSPACE:data}
With the following config file:
input {
  file {
    path => "/home/robyn/testlogs/trimmed_logs.txt"
    start_position => "beginning"
    sincedb_path => "/dev/null" # testing; allows reparsing
  }
}
filter {
  grok {
    match => { "message" => "%{TIME:timestamp} %{NOTSPACE:http} %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{NOTSPACE:msg}%{SPACE}%{WORD:action}%{SPACE}job=%{NOTSPACE:job}%{SPACE}data=%{NOTSPACE:data}" }
  }
}
output {
  file {
    path => "/home/robyn/filteredlogs/trimmed_logs.out.txt"
  }
}
I get the following output:
{"message":"14:46:16.603 [http-nio-8080-exec-4] info metering - msg=93e6dd5e-c009-46b3-b9eb-f753ee3b889a create_job job=a820018e-7ad7-481a-97b0-bd705c3280ad data=71b1652e-16c8-4b33-9a57-f5fcb3d5de92","@version":"1","@timestamp":"2015-08-07T17:55:16.529Z","host":"hlt-dev","path":"/home/robyn/testlogs/trimmed_logs.txt","timestamp":"14:46:16.603","http":"[http-nio-8080-exec-4]","loglevel":"info","logtype":"metering","msg":"93e6dd5e-c009-46b3-b9eb-f753ee3b889a","action":"create_job","job":"a820018e-7ad7-481a-97b0-bd705c3280ad","data":"71b1652e-16c8-4b33-9a57-f5fcb3d5de92"}
That's pretty much what I want, but I feel like it's a kludgy pattern, particularly with the need to use %{SPACE} and %{NOTSPACE} so much. This suggests to me that I'm not doing this in the best possible way. Should I be creating a more specific pattern for the hex IDs? I think I need the %{SPACE} between loglevel and logtype because of the space between info and metering in the log, but that also feels kludgy.
Also, how do I get the log's timestamp to replace @timestamp, which seems to be the time Logstash ingested the log, which we don't want or need?
Obviously I'm just getting started with ELK and grok, so pointers to useful resources are also appreciated.
There is an existing pattern you can use instead of NOTSPACE, and it's UUID. When there's only a single space, there's no need to use the SPACE pattern; you can leave it out. I'm also using the USERNAME pattern (maybe wrongly named) just for the sake of capturing the http field.
So here you go: you only need a single SPACE pattern, to capture multiple spaces.
Sample log line:
14:46:16.603 [http-nio-8080-exec-4] info metering - msg=93e6dd5e-c009-46b3-b9eb-f753ee3b889a create_job job=a820018e-7ad7-481a-97b0-bd705c3280ad data=71b1652e-16c8-4b33-9a57-f5fcb3d5de92
Grok pattern:
%{TIME:timestamp} \[%{USERNAME:http}\] %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{UUID:msg} %{WORD:action} job=%{UUID:job} data=%{UUID:data}
Grok spits out:
{ "timestamp": [ [ "14:46:16.603" ] ], "HOUR": [ [ "14" ] ], "MINUTE": [ [ "46" ] ], "SECOND": [ [ "16.603" ] ], "http": [ [ "http-nio-8080-exec-4" ] ], "loglevel": [ [ "info" ] ], "SPACE": [ [ " " ] ], "logtype": [ [ "metering" ] ], "msg": [ [ "93e6dd5e-c009-46b3-b9eb-f753ee3b889a" ] ], "action": [ [ "create_job" ] ], "job": [ [ "a820018e-7ad7-481a-97b0-bd705c3280ad" ] ], "data": [ [ "71b1652e-16c8-4b33-9a57-f5fcb3d5de92" ] ] }
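Dropped into the filter section of the config from the question, the UUID/USERNAME pattern might look like the following. This is a minimal sketch, not a tested config. The date block is an assumption on my part and is not something the pattern itself requires: it uses Logstash's date filter, which writes to @timestamp by default, with the Joda-style format HH:mm:ss.SSS chosen to fit the sample line. Since the log line carries a time but no date, the date portion of @timestamp cannot come from this field, so check what the filter fills in before relying on it.

```
filter {
  grok {
    # SPACE is used only where the log may contain more than one space
    match => { "message" => "%{TIME:timestamp} \[%{USERNAME:http}\] %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{UUID:msg} %{WORD:action} job=%{UUID:job} data=%{UUID:data}" }
  }
  # Assumption: parse the captured time into @timestamp (the date filter's default target)
  date {
    match => [ "timestamp", "HH:mm:ss.SSS" ]
  }
}
```

If a line does not match, grok tags the event with _grokparsefailure, which is a quick way to spot log lines the pattern does not cover.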