Structured log triage. Stream a log, match against configurable patterns, get a summary.

These details have not been verified by PyPI

Project links

Project description

logtally

Stream a log, match it against configurable regex patterns, get a summary report. Memory-efficient (handles files larger than RAM), zero-config for common patterns, JSONL output for pipelines.

Why

grep is great when you know exactly what you're looking for. cProfile-style heavyweight log analyzers are overkill for "what broke in the last hour?" logtally is the in-between: point it at a log, get an honest breakdown of what's in there — by pattern, by severity, with a time range — in one command.

It's particularly useful when:

You inherited a service and want to know what kinds of errors are normal vs. new
A deployment went sideways and you want to find the noisy patterns fast
You need to feed structured match data into another tool via JSONL

What it looks like

$ logtally examples/sample-custom.log

[summary]
  Total lines:               55
  Matched lines:             39
  Total matches:             65
  Match rate:            70.91%
  Time range:      2026-04-29T09:00:01 to 2026-04-29T12:10:01 (3:10:00)

[by severity]
  info                7
  warn               24
  error              33
  critical            1

[top 10 patterns]
   1.       18  [error   ] generic_error  (Generic error-level log line)
   2.       12  [warn    ] generic_warning  (Generic warning-level log line)
   3.        5  [info    ] startup  (Service startup signal)
   4.        5  [warn    ] retry  (Operation was retried)
   5.        4  [error   ] connection_refused  (TCP connection refused by remote)
   6.        3  [warn    ] http_4xx  (HTTP 4xx response status)
   7.        3  [error   ] auth_failure  (Failed authentication or authorization)
   8.        3  [error   ] http_5xx  (HTTP 5xx response status)
   9.        2  [error   ] db_error  (Database error)
  10.        2  [warn    ] rate_limit  (Request was rate limited)

Note: a single line can match multiple patterns (e.g. an HTTP 500 line might also be a generic_error), which is why Total matches (65) is higher than Matched lines (39).

Install

From source (only path for now; PyPI publishing is on the roadmap):

git clone https://github.com/amadhusudan/logtally.git
cd logtally
pip install -e .

The only runtime dependency is pyyaml.

Usage

# Default: scan a file with bundled patterns, print summary
logtally app.log

# Show only top 20 patterns
logtally app.log --top 20

# Filter to error and critical only
logtally app.log --min-severity error

# Pipe from stdin
cat app.log | logtally -
journalctl -u myservice | logtally -

# Output every match as JSONL for downstream processing
logtally app.log --json > matches.jsonl

# Use your own pattern set
logtally app.log --config my-patterns.yaml

CLI reference

positional arguments:
  source                       Path to log file. Omit or use '-' for stdin.

options:
  --config, -c PATH            Path to YAML pattern config. Defaults to bundled patterns.
  --top, -n N                  Number of top patterns to show (default: 10).
  --min-severity, -s LEVEL     Only count matches at this severity or higher.
                               Choices: info, warn, error, critical.
  --json                       Output matches as JSONL to stdout instead of summary.
  --version, -v                Show version and exit.

Try it out

The repo ships with several sample logs in examples/ covering common formats. They double as smoke tests for the bundled patterns:

File	Format	What's in it
`sample-custom.log`	Generic app log — `2026-04-29 09:00:01 LEVEL logger msg`	deadlocks, retries, 4xx/5xx, auth failures, OOM
`sample-syslog.log`	RFC3164 syslog (yearless) — `Apr 29 09:00:01 host …`	service lifecycle, OOM, retries
`sample-systemd.log`	`journalctl` default output	unit transitions, sshd auth failures, kernel OOM kill
`sample-nginx-access.log`	nginx/Apache Combined Log Format	HTTP 4xx/5xx breakdown, bot scanners
`sample-json.log`	JSONL (zap/zerolog/pino style)	structured logs with `level`/`msg`/contextual fields

# HTTP status breakdown from nginx access logs
logtally examples/sample-nginx-access.log

# Only error/critical from structured JSON logs
logtally examples/sample-json.log --min-severity error

# systemd journal scan piped through, then count critical patterns
logtally examples/sample-systemd.log --json | jq -r '.pattern' | sort | uniq -c

Patterns

logtally ships with a default pattern set covering common signals: HTTP errors, timeouts, auth failures, deadlocks, OOMs, deprecation warnings, retries, and more. See logtally/_data/default.yaml for the full list.

You can write your own pattern file in the same format:

patterns:
  - name: my_custom_signal
    regex: "checkout flow stuck for user_id=\\d+"
    severity: error
    description: "Checkout hang"

  - name: gateway_5xx
    regex: " 5\\d{2} .* upstream "
    severity: critical
    description: "5xx response from upstream gateway"

Then run:

logtally app.log --config my-patterns.yaml

A pattern can match anywhere in a line (regexes use Python re.search with case-insensitive matching). One line can fire multiple patterns; each fired pattern counts independently in the top-N report.

Design notes

A few choices worth knowing about:

Streaming, not loading. logtally reads one line at a time. A 10 GB log file uses ~10 MB of memory, not 10 GB.
Lenient encoding. Real-world logs contain random non-UTF-8 bytes. logtally uses errors='replace' rather than crashing mid-scan.
No SQL, no database, no daemon. It's a single Python process that reads stdin or a file and prints to stdout. Compose it with shell pipelines.
JSONL output is the integration story. If you want a dashboard, alerting, or trend analysis, pipe --json output into whatever tool already does that well.

Performance

logtally is regex-bound, which means a few simple rules apply:

More patterns = slower scan (each pattern is tried against each line)
Anchored or specific patterns are faster than greedy ones
The bundled default set processes roughly 200K–400K lines/second on a modern laptop CPU

If you're scanning gigabyte-scale logs and want hard numbers, profile your specific config with sentinel-trace.

Testing

pip install -e ".[dev]"
pytest -v

Tests cover pattern compilation, severity filtering, line streaming (including encoding edge cases), aggregation logic, timestamp parsing (ISO, yearless syslog, and Apache/nginx formats), and output formatting. CI runs the suite against Python 3.9-3.12 on every push and pull request.

Roadmap

--bucket 1h — time-bucketed match counts for spotting bursts
--follow / -f — tail mode (like tail -f)
Anomaly detection — flag patterns whose rate jumps significantly within a time window
Multi-file aggregation — logtally logs/*.log with per-file breakdown
Pre-built pattern packs for nginx, syslog, Kubernetes events
PyPI publish

License

Apache 2.0. See LICENSE.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

May 12, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

logtally-0.1.0.tar.gz (21.4 kB view details)

Uploaded May 12, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

logtally-0.1.0-py3-none-any.whl (17.7 kB view details)

Uploaded May 12, 2026 Python 3

File details

Details for the file logtally-0.1.0.tar.gz.

File metadata

Download URL: logtally-0.1.0.tar.gz
Upload date: May 12, 2026
Size: 21.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for logtally-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f08bcf20224c679d4d4d1e1204ecc19a0f9dc540ff6138b801d5e5f589633527`
MD5	`eeb301d2e86ed2b2eddb7f595b47a0ea`
BLAKE2b-256	`14c9f3688cd9e7f2c2e0fef246f422c90c5eab8f364a1e760b0f54da33c273f1`

See more details on using hashes here.

File details

Details for the file logtally-0.1.0-py3-none-any.whl.

File metadata

Download URL: logtally-0.1.0-py3-none-any.whl
Upload date: May 12, 2026
Size: 17.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for logtally-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f7837e106ec33cb699c484e7244d9195431b6e9bdb66c34835c240e99fbe5f86`
MD5	`4e3c4d3a739071329783e08ee262b693`
BLAKE2b-256	`bd5b942a6554a437f396e340cab09a8380f51b9760d63061494fa5cea0ad19f8`

See more details on using hashes here.

logtally 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

logtally

Why

What it looks like

Install

Usage

CLI reference

Try it out

Patterns

Design notes

Performance

Testing

Roadmap

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes