handle *.log.fmt specifiers and regex conversion

Project description

logfmt1 is meant for universal log parsing, without extensive manual configuration and without being restricted to basic log variants. It reads *.log.fmt files and transforms the LogFormat / placeholder strings in them into regular expressions (with named capture groups).

{
   "class": "apache combined",
   "record": "%h %l %u %t \"%r\" %>s %b"
}

For instance, this would resolve to:

(?<remote_host>[\\w\\-.:]+) (?<remote_logname>[\\w\\-.:]+) (?<remote_user>[\\-\\w@.]+)
\\[?(?<request_time>\\d[\\d:\\w\\s:./\\-+,;]+)\\]? "(?<request_line>(?<request_method>\\w+)
(?<request_path>\\S+) (?<request_protocol>[\\w/\\d.]+))" (?<status>-|\\d\\d\\d)
(?<bytes_sent>\\d+|-)
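
The pattern above uses `(?<name>…)` groups, which Python's `re` module does not accept as-is; it expects `(?P<name>…)`. As a rough illustration of what such a conversion involves, here is a minimal sketch. It assumes the wrapped lines above are joined by single spaces and that the doubled backslashes are JSON escaping. `to_python_regex` and the sample log line are hypothetical and not part of logfmt1; how the package converts patterns internally is not shown here.

```python
import re

# The pattern from above with JSON escaping undone (single backslashes),
# joined into one line with single spaces between the fields (an assumption).
PCRE_STYLE = (
    r'(?<remote_host>[\w\-.:]+) (?<remote_logname>[\w\-.:]+) '
    r'(?<remote_user>[\-\w@.]+) '
    r'\[?(?<request_time>\d[\d:\w\s:./\-+,;]+)\]? '
    r'"(?<request_line>(?<request_method>\w+) (?<request_path>\S+) '
    r'(?<request_protocol>[\w/\d.]+))" '
    r'(?<status>-|\d\d\d) (?<bytes_sent>\d+|-)'
)


def to_python_regex(rx):
    """Hypothetical helper: rewrite (?<name>...) as (?P<name>...).

    The lookahead keeps lookbehinds such as (?<= and (?<! untouched.
    """
    return re.sub(r"\(\?<(?=[A-Za-z_])", "(?P<", rx)


# Made-up sample line matching the "%h %l %u %t \"%r\" %>s %b" record.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'

match = re.match(to_python_regex(PCRE_STYLE), line)
row = match.groupdict() if match else {}
print(row.get("request_method"), row.get("status"))
```

Each named group becomes a key in `row`, so fields such as `row["request_path"]` or `row["bytes_sent"]` are available as strings.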

This Python package currently just comes with:

  • .fmt definitions for apache + strftime + grok placeholders.

  • logex - a basic log extractor

  • update-logfmt - to create/rewrite *.log.fmt files globally

It originated in modseccfg. Ideally, however, you should install the system package:

apt install python3-logfmt1

This will yield the proper /usr/share/logfmt/ structure and the run-parts wrapper update-logfmt.

logfmt1

To manually craft a regex:

import json
import logfmt1

with open("/.../access.log.fmt", "r") as f:
    fmt = json.load(f)   # the .fmt descriptor is plain JSON
rx = logfmt1.regex(fmt)
rx = logfmt1.rx2re(rx)   # turn into Python regex

Or with plain old guesswork / presuming a standard log format:

rx = logfmt1.regex({"class": "apache combined"})

Though that’s of course not the intended use case, and hinges on predefined formats in /usr/share/logfmt/.

logfmt1.logopen()

logopen(fn=…) is basically a file-like iterator that yields dictionaries rather than text strings.

for row in logfmt1.logopen(".../access.log"):
    print(row["request_time"])
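
To illustrate the general idea of iterating over a log and getting dictionaries instead of raw lines, here is a minimal standalone sketch. It does not use logfmt1; `iter_rows`, the tiny pattern, and the in-memory log are all made up for illustration. Whether logopen skips or reports non-matching lines is not stated here; this sketch simply skips them.

```python
import io
import re


def iter_rows(lines, pattern):
    """Yield one dict of named groups per matching line; skip the rest."""
    rx = re.compile(pattern)
    for line in lines:
        m = rx.match(line)
        if m:
            yield m.groupdict()


# Deliberately tiny pattern and in-memory "file", both hypothetical.
PATTERN = r'(?P<remote_host>\S+) \[(?P<request_time>[^\]]+)\] (?P<status>\d{3})'
log = io.StringIO(
    "10.0.0.1 [01/Jan/2021:00:00:01 +0000] 200\n"
    "not a log line\n"
    "10.0.0.2 [01/Jan/2021:00:00:02 +0000] 404\n"
)

rows = list(iter_rows(log, PATTERN))
for row in rows:
    print(row["request_time"], row["status"])
```

Because any iterable of lines works, the same generator could be fed an open file handle instead of the `io.StringIO` stand-in.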

And it provides a basic regex/formatstring debugging feature (via debug=True parameter or with logex -D):

(Screenshot: failed regex section)

logex

A very rudimentary extractor for log files:

logex .../access.log --tab @host @date +id

Which also handles the .fmt implicitly. (Kinda the whole point of this project.)

update-logfmt

The Python package bundles a run-parts wrapper, but only the Apache collector and a local Python copy of the format database. It should nonetheless discover all (Apache) *.log files and pair them with .fmt declarations.
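
The pairing of logs with their .fmt declarations can be pictured roughly as below. This is only a sketch of the file-naming convention (`access.log` next to `access.log.fmt`), not the actual update-logfmt logic; `pair_logs_with_fmt` and the temporary directory layout are hypothetical, and the sketch only reports missing declarations rather than creating them.

```python
import tempfile
from pathlib import Path


def pair_logs_with_fmt(root):
    """Map each *.log file name under root to its sibling .fmt name, or None."""
    pairs = {}
    for log in sorted(Path(root).rglob("*.log")):
        fmt = log.with_name(log.name + ".fmt")   # access.log -> access.log.fmt
        # Keyed by bare file name for brevity; same-named logs in different
        # directories would collide in this simplified sketch.
        pairs[log.name] = fmt.name if fmt.exists() else None
    return pairs


# Throwaway directory with one paired and one unpaired log file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "access.log").write_text("")
    (Path(tmp) / "access.log.fmt").write_text('{"class": "apache combined"}')
    (Path(tmp) / "error.log").write_text("")
    pairs = pair_logs_with_fmt(tmp)

print(pairs)
```

Logs mapped to `None` are the ones that would still need a .fmt declaration.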

And that’s sort of the main aspect of this project: establishing .log.fmt files until application vendors come around to making logs parseable. The rules database structure is subject to change, and is only one possible implementation. There might also be simpler approaches (grok mapping) to generate regexps for format strings.


