Log2csv

A fast tool to parse unstructured log files into structured csv, using a grok-like pattern.

log2csv [-p custom_pattern.grok|custom_pattern_dir]  [-o output.csv] -e '%{NUMBER:size} (?P<custom_name>regexpression) (?:content to ignore but match) %{IP: client} %{UserAgent: agent} %{URL: request_url}' nginx.log

Expression

%{PATTERN_NAME1: csv_field_name}
%{PATTERN_NAME2}
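
To make the expression syntax concrete, here is a minimal sketch of how a grok-like expression could be expanded into an ordinary regular expression with named groups. It is not log2csv's actual implementation: the `PATTERNS` table, the `TOKEN` regex, and the `expand` helper are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical pattern table for illustration; log2csv ships its own definitions.
PATTERNS = {
    "NUMBER": r"\d+(?:\.\d+)?",
    "WORD": r"\w+",
}

# Matches %{NAME} or %{NAME: field}, allowing optional spaces after the colon.
TOKEN = re.compile(r"%\{(\w+)(?::\s*(\w+))?\}")


def expand(expression, patterns=PATTERNS):
    """Turn a grok-like expression into a plain regular expression."""
    def substitute(match):
        name, field = match.group(1), match.group(2)
        body = expand(patterns[name], patterns)  # definitions may nest
        if field:
            # A named token becomes a named group, i.e. a csv column.
            return "(?P<%s>%s)" % (field, body)
        # An unnamed token is matched but not captured.
        return "(?:%s)" % body
    return TOKEN.sub(substitute, expression)


regex = expand("%{WORD:method} %{NUMBER: size} %{WORD}")
fields = re.match(regex, "GET 612 ok").groupdict()
print(fields)
```

In this sketch only tokens that carry a field name produce a capture, which mirrors the difference between `%{PATTERN_NAME1: csv_field_name}` and `%{PATTERN_NAME2}` above.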

Grok File

PATTERN_NAME1 regexpression
PATTERN_NAME2 %{SUB_PATTERN: field_name}
# Comment
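
The grok file layout above (one `NAME definition` pair per line, `#` for comments) can be read with a few lines of Python. This is a sketch under that assumption only; `load_patterns` and the sample pattern names are hypothetical and not part of log2csv.

```python
def load_patterns(text):
    """Parse grok-file text into a {name: definition} dictionary."""
    patterns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            name, definition = parts
            patterns[name] = definition
    return patterns


# Hypothetical grok file content, used only to exercise the parser.
sample = r"""
# Comment
OCTET \d{1,3}
PORT \d+
ENDPOINT %{OCTET}\.%{OCTET}:%{PORT: port}
"""

patterns = load_patterns(sample)
for name, definition in patterns.items():
    print(name, "=>", definition)
```

Splitting only on the first whitespace run keeps definitions such as `%{PORT: port}` intact even though they contain a space.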

Example

sample.log

77.179.66.156 - - [25/Oct/2016:14:49:33 +0200] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36"
77.179.66.156 - - [25/Oct/2016:14:49:34 +0200] "GET /favicon.ico HTTP/1.1" 404 571 "http://localhost:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36"
77.179.66.156 - - [25/Oct/2016:14:50:44 +0200] "GET /adsasd HTTP/1.1" 404 571 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36"
77.179.66.156 - - [07/Dec/2016:10:34:43 +0100] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156 - - [07/Dec/2016:10:34:43 +0100] "GET /favicon.ico HTTP/1.1" 404 571 "http://localhost:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156 - - [07/Dec/2016:10:43:18 +0100] "GET /test HTTP/1.1" 404 571 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156 - - [07/Dec/2016:10:43:21 +0100] "GET /test HTTP/1.1" 404 571 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156 - - [07/Dec/2016:10:43:23 +0100] "GET /test1 HTTP/1.1" 404 571 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
127.0.0.1 - - [07/Dec/2016:11:04:37 +0100] "GET /test1 HTTP/1.1" 404 571 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
127.0.0.1 - - [07/Dec/2016:11:04:58 +0100] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0"
127.0.0.1 - - [07/Dec/2016:11:04:59 +0100] "GET / HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0"
127.0.0.1 - - [07/Dec/2016:11:05:07 +0100] "GET /taga HTTP/1.1" 404 169 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0"

log2csv command

log2csv -e '%{IP:ip} - - \[%{HTTPDATE:date}\] "%{WORD:http_method} %{URIPATH:path} HTTP/1.1" %{NUMBER:http_status} %{NUMBER:payload_bytes} "-" "%{DATA:user_agent}"' sample.log

output csv format

ip,date,http_method,path,http_status,payload_bytes,user_agent
77.179.66.156,25/Oct/2016:14:49:33 +0200,GET,/,200,612,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36"
77.179.66.156,25/Oct/2016:14:50:44 +0200,GET,/adsasd,404,571,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36"
77.179.66.156,07/Dec/2016:10:34:43 +0100,GET,/,200,612,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156,07/Dec/2016:10:43:18 +0100,GET,/test,404,571,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156,07/Dec/2016:10:43:21 +0100,GET,/test,404,571,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
77.179.66.156,07/Dec/2016:10:43:23 +0100,GET,/test1,404,571,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
127.0.0.1,07/Dec/2016:11:04:37 +0100,GET,/test1,404,571,"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
127.0.0.1,07/Dec/2016:11:04:58 +0100,GET,/,304,0,Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0
127.0.0.1,07/Dec/2016:11:04:59 +0100,GET,/,304,0,Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0
127.0.0.1,07/Dec/2016:11:05:07 +0100,GET,/taga,404,169,Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) Gecko/20100101 Firefox/49.0
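
Note that the two `/favicon.ico` requests are absent from the output: their referer is `"http://localhost:8080/"`, so they do not match the literal `"-"` in the expression. The sketch below reproduces this behaviour with Python's standard `re` and `csv` modules on three lines from sample.log. The regular expression uses hand-written stand-ins for the grok patterns (an assumption, not the definitions log2csv ships with), and skipping non-matching lines is inferred from the example output above.

```python
import csv
import io
import re

FIELDS = ["ip", "date", "http_method", "path",
          "http_status", "payload_bytes", "user_agent"]

# Hand-written approximations of IP, HTTPDATE, WORD, URIPATH, NUMBER and DATA.
LINE = re.compile(
    r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) - - '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<http_method>\w+) (?P<path>/\S*) HTTP/1\.1" '
    r'(?P<http_status>\d+) (?P<payload_bytes>\d+) '
    r'"-" "(?P<user_agent>.*?)"'
)

chrome = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/54.0.2840.59 Safari/537.36")
firefox = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:49.0) "
           "Gecko/20100101 Firefox/49.0")

log_lines = [
    '77.179.66.156 - - [25/Oct/2016:14:49:33 +0200] "GET / HTTP/1.1" '
    '200 612 "-" "' + chrome + '"',
    '77.179.66.156 - - [25/Oct/2016:14:49:34 +0200] "GET /favicon.ico HTTP/1.1" '
    '404 571 "http://localhost:8080/" "' + chrome + '"',
    '127.0.0.1 - - [07/Dec/2016:11:04:58 +0100] "GET / HTTP/1.1" '
    '304 0 "-" "' + firefox + '"',
]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
writer.writeheader()
for line in log_lines:
    match = LINE.match(line)
    if match:  # lines that do not match the expression are skipped
        writer.writerow(match.groupdict())

csv_text = out.getvalue()
print(csv_text)
```

The `csv` module quotes a field only when it contains a comma or a quote character, which is why the Chrome user agent (`KHTML, like Gecko`) is quoted in the output and the Firefox one is not.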

Benchmark

Tested on a MacBook Pro (2 cores, 4 threads, 8 GB memory): parsing a 1.8 GB log file took 176 s.
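
For a rough sense of throughput, the benchmark figures can be converted to megabytes per second. This sketch assumes "1.8G" means 1.8 decimal gigabytes; with binary gibibytes the figure would be slightly higher.

```python
size_bytes = 1.8 * 1000**3   # assumption: 1.8 GB in decimal units
seconds = 176                # wall-clock time reported in the benchmark

throughput_mb_s = size_bytes / seconds / 1000**2
print("%.1f MB/s" % throughput_mb_s)
```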
