Deterministic log templating on top of Drain3, packaged as an artifact for AI agents.

Project description

AgenticAILogAnalyser

Python port of codag-drain that uses the upstream Python Drain3 package as its grouping engine. It keeps the same CLI surface, the same output shape, and the same evidence-rich artifact, and it can be packaged as a single binary, built once per target OS and architecture.

The intended consumer is an AI agent that needs to read a large log window under a fixed token budget. Instead of feeding the agent 1,400 raw lines, you feed it 8 templates with slot statistics and a few raw examples per group.

What it does

Takes a stream of log lines on stdin, groups near-duplicates with Drain3, and emits one templated line per group with:

the count of collapsed lines,
a derived <*> template,
per-slot stats (min / max / median for numeric slots, distinct values for enums, an auto-detected unit like ms or MB),
a few raw sample lines.
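
The templating step can be pictured with a minimal sketch. This is not the package's implementation, which delegates grouping to Drain3. `derive_template` is a hypothetical helper that assumes the lines have already been grouped and that every line in the group has the same number of whitespace-separated tokens.

```python
from typing import Sequence

WILDCARD = "<*>"


def derive_template(lines: Sequence[str]) -> str:
    """Replace every token position that varies across the group with <*>."""
    if not lines:
        raise ValueError("need at least one line")
    rows = [line.split() for line in lines]
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("sketch assumes equal token counts within a group")
    # zip(*rows) walks the group column by column (one token position at a time).
    tokens = [
        column[0] if len(set(column)) == 1 else WILDCARD
        for column in zip(*rows)
    ]
    return " ".join(tokens)


group = [
    "worker latency 20ms",
    "worker latency 20ms",
    "worker latency 8400ms",
]
print(f"[x{len(group)}] {derive_template(group)}")
```

Positions where all lines agree stay literal, and everything else collapses to a slot, which is why the count prefix and the slot statistics are needed to keep the collapsed detail visible.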

Real-world example

A 1,438-line Kiro IDE log compresses to 8 templates at ~180x compression:

[x1] [WebviewProcessMonitor] Service starting
[x4] update#setState <*> [idle,downloading,downloaded,ready]
[x14] [WebviewProcessMonitor] Tracking webview renderer: pid=<*>, origin=<*>, windowId=<*> [13773..87619 p50=87288.5]
[x1] update#setState checking for updates
[x14] Extension host with pid <*> exited with code: 0, signal: unknown. [13697..89755 p50=73921]
[x1395] No ptyHost heartbeat after 6 seconds
[x8] [WebviewProcessMonitor] Webview renderer process gone: pid=<*>
[x1] Extracting content from 1 URIs
[codag-drain-py] 1438 lines -> 8 templates (179.8x)

The dominant signal, a single repeating warning that accounts for 97% of the file, is the first thing the model sees instead of being buried. Numeric ranges and enum values are preserved, so outliers and state distributions stay visible.
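
The figures in the listing above can be checked with a few lines of arithmetic. The sketch assumes the reported ratio is simply input lines divided by templates, which reproduces the 179.8x shown in the summary line.

```python
# The [xN] prefixes from the listing above, in order.
counts = [1, 4, 14, 1, 14, 1395, 8, 1]

total = sum(counts)                    # number of raw input lines
ratio = total / len(counts)            # lines per template
dominant_share = max(counts) / total   # share of the largest group

print(f"{total} lines -> {len(counts)} templates ({ratio:.1f}x)")
print(f"dominant template: {dominant_share:.1%} of lines")
```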

Install

From source:

pip install -e .

From source with the build extra (PyInstaller):

pip install -e ".[build]"

Usage

echo 'worker latency 20ms
worker latency 20ms
worker latency 20ms
worker latency 8400ms' | codag-drain-py --stats

[x4] worker latency <*> [20..8400ms p50=20ms]
[codag-drain-py] 4 lines -> 1 templates (4.0x)
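
The bracketed suffix in that output is the slot profile. A minimal sketch of how such a summary could be computed is below. `profile_slot` and `VALUE_RE` are hypothetical names, and the rule "a number followed by an optional alphabetic unit" is an assumption, not the package's actual detection logic.

```python
import re
from statistics import median
from typing import Optional, Sequence

# Assumption: a numeric slot value is a number with an optional trailing unit.
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)([A-Za-z%]*)$")


def _fmt(number: float) -> str:
    """Render 20.0 as '20' and 87288.5 as '87288.5'."""
    return f"{number:g}"


def profile_slot(values: Sequence[str]) -> Optional[str]:
    """Summarise slot values as '[min..max<unit> p50=<median><unit>]'."""
    numbers, units = [], set()
    for value in values:
        match = VALUE_RE.match(value)
        if match is None:
            return None  # not a purely numeric slot
        numbers.append(float(match.group(1)))
        units.add(match.group(2))
    if not numbers or len(units) != 1:
        return None  # empty slot or mixed units: no numeric summary
    unit = units.pop()
    low, high, mid = min(numbers), max(numbers), median(numbers)
    return f"[{_fmt(low)}..{_fmt(high)}{unit} p50={_fmt(mid)}{unit}]"


print(profile_slot(["20ms", "20ms", "20ms", "8400ms"]))
```

The median stays at 20ms while the maximum exposes the 8400ms outlier, which is the property the artifact relies on to keep anomalies visible after collapsing.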

JSON output:

echo 'worker ready shard=1' | codag-drain-py --format json

Choose a grouper:

cat logs.txt | codag-drain-py --grouper drain-stock

NDJSON input:

cat events.ndjson | codag-drain-py --json
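
For readers unfamiliar with NDJSON, the sketch below shows one way to pull a message string out of each record before templating. The key names in `MESSAGE_KEYS` and the fallback behaviour are assumptions made for illustration. The package's own parser may look at different fields.

```python
import json
from typing import Iterable, Iterator

# Assumption: illustrative key names, not the package's documented schema.
MESSAGE_KEYS = ("message", "msg", "log")


def messages_from_ndjson(lines: Iterable[str]) -> Iterator[str]:
    """Yield one message per non-empty line, falling back to the raw text."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            yield raw  # not JSON: treat as a plain log line
            continue
        if isinstance(record, dict):
            for key in MESSAGE_KEYS:
                if isinstance(record.get(key), str):
                    yield record[key]
                    break
            else:
                yield raw  # JSON object without a recognised message key
        else:
            yield raw  # JSON scalar or array


sample = [
    '{"message": "worker ready shard=1"}',
    "plain text line",
    "",
    '{"level": "info"}',
]
print(list(messages_from_ndjson(sample)))
```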

Available groupers:

name	description
`drain` (default)	Drain3 with codag's compact-line tokenizer fallback
`drain-stock`	Drain3 with vanilla whitespace tokenization
`drain-delimited`	Drain with extra punctuation delimiters folded into whitespace
`drain-fullsearch`	Drain similarity over all same-length clusters (no prefix-tree)
`statistical`	Non-Drain control: IDF-weighted anchor co-occurrence
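
To make the `drain-fullsearch` row concrete, here is a minimal sketch of a Drain-style similarity search over all clusters of the same token length, skipping the prefix tree. It follows the token-position similarity described in the Drain paper. The function names and the 0.4 threshold are illustrative assumptions, not the package's code.

```python
from typing import List, Optional, Sequence

WILDCARD = "<*>"


def similarity(template: Sequence[str], tokens: Sequence[str]) -> float:
    """Fraction of positions where the template token equals the line token.

    Wildcard positions are not counted as matches in this sketch.
    """
    if len(template) != len(tokens) or not tokens:
        return 0.0
    same = sum(1 for t, tok in zip(template, tokens) if t != WILDCARD and t == tok)
    return same / len(tokens)


def full_search(
    clusters: List[List[str]], line: str, threshold: float = 0.4
) -> Optional[int]:
    """Return the index of the most similar same-length cluster, if any."""
    tokens = line.split()
    best_score, best_index = 0.0, None
    for index, template in enumerate(clusters):
        if len(template) != len(tokens):
            continue  # only same-length clusters are candidates
        score = similarity(template, tokens)
        if score >= threshold and score > best_score:
            best_score, best_index = score, index
    return best_index


clusters = [
    ["worker", "latency", WILDCARD],
    ["worker", "ready", WILDCARD],
    ["No", "ptyHost", "heartbeat", "after", "6", "seconds"],
]
print(full_search(clusters, "worker ready shard=1"))
```

A line that matches no cluster above the threshold returns `None`, at which point a real grouper would open a new cluster.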

Build a single-file binary

./scripts/build_binary.sh
./dist/codag-drain-py --help

PyInstaller bundles the Python interpreter and drain3 into one file under dist/. Build on each OS / architecture you intend to ship.

Programmatic API

from codag_drain_py import LogLine, TemplaterConfig, template_logs

result = template_logs(
    [LogLine(message="latency 20ms"), LogLine(message="latency 8400ms")]
)
print(result.render())
print(result.to_json(indent=2))

TemplateIndex exposes the streaming variant:

from codag_drain_py import LogLine, TemplateIndex

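idx = TemplateIndex()
# Any iterable of message strings works; a short list stands in for a live stream.
for msg in ["latency 20ms", "latency 8400ms"]:
    idx.push(LogLine(message=msg))
print(idx.templates().render())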

Tests

pip install -e ".[dev]"
pytest

Credits

Drain3 — the underlying log template miner from logpai. We use the published PyPI package directly.
codag-drain — the Rust project this Python port is modeled on. The compact-line tokenizer fallback, multi-member template derivation, slot profiling, and CLI surface all follow that design.
Drain paper — He et al., "Drain: An Online Log Parsing Approach with Fixed Depth Tree", ICWS 2017.

License

MIT. See LICENSE.

Layout

src/codag_drain_py/
    __init__.py     public exports
    __main__.py     `python -m codag_drain_py`
    cli.py          argparse + stdin pipeline
    compress.py     templater entry point + rendering
    grouper.py      Drain / DrainStock / DrainDelimited / FullSearch / Statistical
    input.py        heuristic line + NDJSON parsers
    lex.py          character-class tokenizer + lex template derivation
    profile.py      slot capture, numeric stats, distinct-value summaries
    stream.py       TemplateIndex streaming wrapper
    template.py     whitespace template derivation + capture regex
tests/
    test_compress.py
    test_input.py
scripts/
    build_binary.sh PyInstaller --onefile build

MCP server (use as a tool from Kiro / Claude / any MCP client)

The analyser ships with a built-in Model Context Protocol server. Once registered with Kiro or Claude Desktop, your assistant can call it as a tool to compress logs on demand without you piping anything through a shell.

What it exposes

Five tools, all served over stdio:

tool	description
`analyse_logs`	Compress an inline log body. Returns templated artifact + summary.
`analyse_log_file`	Same but reads the body from a local file path.
`stream_push`	Append lines to a named streaming session.
`stream_project`	Render templates over the accumulated session.
`stream_reset`	Clear a session.

Each tool accepts the full set of analyser options: grouper, sample_cap, template_clip, body_format, output_format.
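
On the wire, an MCP client invokes a tool with a JSON-RPC `tools/call` request. The sketch below builds such a request for `analyse_log_file`. The argument name `path`, the file location, and the `"json"` value for `output_format` are assumptions for illustration. The server's tool schema is the authority on the real parameter names.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(
    "analyse_log_file",
    {
        "path": "/var/log/app.log",  # hypothetical argument name and path
        "grouper": "drain",          # option name taken from the list above
        "output_format": "json",     # assumed value
    },
)
print(json.dumps(request, indent=2))
```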

Build the MCP binary

./scripts/build_mcp_binary.sh

This produces a single self-contained binary at dist/agentic-log-analyser-mcp (~22 MB). It bundles the Python interpreter, the analyser, drain3, and the MCP SDK — no Python install required on the machine that runs it.

Register with Kiro

Open Kiro's MCP config: run "Open MCP Config" from the Command Palette, or edit .kiro/settings/mcp.json in your workspace (or ~/.kiro/settings/mcp.json for a user-wide setup). Add:

{
  "mcpServers": {
    "agentic-log-analyser": {
      "command": "/absolute/path/to/dist/agentic-log-analyser-mcp",
      "args": [],
      "disabled": false,
      "autoApprove": ["analyse_logs", "analyse_log_file", "stream_project"]
    }
  }
}

There's a ready-to-paste example at examples/mcp_config_kiro.json. Reload the MCP config from the MCP Server view in the Kiro feature panel.
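
If you already have other servers registered, merging by hand is error-prone. The sketch below adds the entry to an existing config file without touching the others. `register_server` is a hypothetical helper, it omits `autoApprove`, and the demo writes to a temporary directory rather than a real Kiro config.

```python
import json
import tempfile
from pathlib import Path


def register_server(config_path: Path, name: str, command: str) -> dict:
    """Add or overwrite one entry under "mcpServers", keeping other entries."""
    text = ""
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
    config = json.loads(text) if text else {}
    config.setdefault("mcpServers", {})[name] = {
        "command": command,
        "args": [],
        "disabled": False,
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway directory, not your real settings.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".kiro" / "settings" / "mcp.json"
    merged = register_server(
        path,
        "agentic-log-analyser",
        "/absolute/path/to/dist/agentic-log-analyser-mcp",
    )
    on_disk = json.loads(path.read_text(encoding="utf-8"))

print(json.dumps(merged, indent=2))
```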

Register with Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) and merge in:

{
  "mcpServers": {
    "agentic-log-analyser": {
      "command": "/absolute/path/to/dist/agentic-log-analyser-mcp",
      "args": []
    }
  }
}

Restart Claude Desktop. The tools will appear in the tools menu.

Use it from a chat

In Kiro or Claude, just ask:

"Compress this log file and tell me what stands out: /Users/me/Desktop/logs/cloudtrail_event.txt"

The assistant will pick up analyse_log_file, call it with the path, and diagnose against the templated artifact instead of the raw bytes.

Debug from the CLI

To run the server manually and tail its output:

./dist/agentic-log-analyser-mcp

It speaks JSON-RPC over stdio. The repo's scripts/smoke_mcp_binary.py shows a real client roundtrip you can use as a reference.
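
For a quick manual check without the smoke script, the sketch below frames an MCP `initialize` request as one newline-terminated JSON line, which is how the MCP stdio transport delimits messages. The `handshake` helper, the client name, and the protocol version string are assumptions. The server may negotiate a different version, and the helper is defined here but not called.

```python
import json
import subprocess


def frame(message: dict) -> bytes:
    """Encode one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


INITIALIZE = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # assumed; the server may answer with another
        "capabilities": {},
        "clientInfo": {"name": "manual-smoke", "version": "0.0.0"},
    },
}


def handshake(binary: str) -> dict:
    """Start the server, send initialize, and return its first reply."""
    proc = subprocess.Popen(
        [binary], stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    try:
        proc.stdin.write(frame(INITIALIZE))
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.kill()
        proc.wait()


print(frame(INITIALIZE).decode("utf-8"), end="")
```

Calling `handshake("./dist/agentic-log-analyser-mcp")` after a build should return the server's `initialize` result. The sketch stops there and does not send the follow-up `notifications/initialized` message a full client would.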

Project details

agentic-log-analyser 0.1.0, released Jun 22, 2026.
