Skip to main content

Provider-agnostic log analysis package with LLM support and LangGraph ReAct agent

Project description

pylogtracer

Provider-agnostic log analysis package with LLM support and a dynamic ReAct agent.

Python License PyPI


What is pylogtracer?

pylogtracer is a Python package that helps you analyze log files using LLMs. It works in two modes:

  • Library mode — direct function calls, no agent needed, works without LLM
  • Agent mode — ask free-form questions, a LangGraph ReAct agent decides which tools to call

It supports any LLM provider — Ollama (local), OpenAI, Anthropic, or any custom API.


Features

  • Provider agnostic — OpenAI, Anthropic, Ollama, or any OpenAI-compatible API
  • Keyword learning — LLM learns error patterns once, reuses them for free next time
  • ReAct agent — multi-step reasoning, calls multiple tools per question
  • Time resolver — understands "10am", "yesterday", "2 hours ago"
  • Cluster analysis — groups related errors into incidents automatically
  • Search — find any log entry by keyword, incident ID, or error snippet
  • No .env required — pass config directly to LogTracer

Installation

pip install pylogtracer

Quick Start

from pylogtracer import LogTracer

# Ollama (local, no API key needed)
tracer = LogTracer(
    file_path  = "app.log",
    llm_config = {
        "provider": "ollama",
        "model":    "qwen2.5:7b",
        "base_url": "http://localhost:11434"
    }
)

# Library mode — no LLM needed
print(tracer.summary())
print(tracer.error_frequency())
print(tracer.health_check())

# Agent mode — LLM required
print(tracer.ask("what caused the crash at 10am?"))
print(tracer.ask("show INC1000004 related logs and how long it lasted"))

Supported Providers

Provider Example Model API Key
Ollama qwen2.5:7b, llama3, mistral No
OpenAI gpt-4o-mini, gpt-4o Yes
Anthropic claude-3-5-haiku-20241022 Yes
Custom Any OpenAI-compatible API Optional
# OpenAI
tracer = LogTracer("app.log", llm_config={
    "provider": "openai",
    "model":    "gpt-4o-mini",
    "api_key":  "sk-..."
})

# Anthropic
tracer = LogTracer("app.log", llm_config={
    "provider": "anthropic",
    "model":    "claude-3-5-haiku-20241022",
    "api_key":  "sk-ant-..."
})

# Custom / vLLM / LM Studio
tracer = LogTracer("app.log", llm_config={
    "provider": "custom",
    "model":    "my-model",
    "base_url": "http://my-server:8000/v1",
    "api_key":  "optional"
})

Library Mode — All Methods

tracer = LogTracer("app.log")   # no LLM needed for library mode

# Overview
tracer.summary()
# {'total_entries': 100, 'total_errors': 30, 'total_clusters': 11,
#  'error_types': [...], 'first_error': '...', 'last_error': '...'}

# Error counts
tracer.error_frequency()
tracer.error_frequency(date="2024-03-01")
tracer.error_frequency(from_dt="2024-03-01 09:00:00", to_dt="2024-03-01 11:00:00")

# Filter errors
tracer.errors_by_date("2024-03-01")
tracer.errors_in_range("2024-03-01 09:00:00", "2024-03-01 11:00:00")

# Last incident
tracer.last_incident()

# System health
tracer.health_check()
# {'healthy': False, 'status': 'CRITICAL', 'total_errors': 30, ...}

# Incident duration
tracer.incident_duration()
# {'start': '...', 'end': '...', 'duration_human': '6 minutes 12 seconds', ...}

# Search
tracer.search("INC1000001")              # by incident ID
tracer.search("connection refused")      # by keyword
tracer.get_related_logs("INC1000004")   # all logs in same cluster
tracer.get_entry_details("INC1000004")  # full entry with traceback

# Root cause (LLM required)
tracer.root_cause_analysis()

Agent Mode — Ask Anything

tracer = LogTracer("app.log", llm_config={...})

# Simple questions
tracer.ask("what is the last error?")
tracer.ask("is the system healthy?")
tracer.ask("how many DB errors happened?")

# Time-based (auto-resolved — no need to specify exact timestamps)
tracer.ask("what errors happened at 10am?")
tracer.ask("show errors from yesterday")
tracer.ask("what happened 2 hours ago?")
tracer.ask("errors between 9am and 11am")

# Identifier search
tracer.ask("show me INC1000004 related logs")
tracer.ask("what happened with REQ-456?")

# Multi-step (agent calls multiple tools automatically)
tracer.ask("what caused the crash and how long did it last?")
tracer.ask("compare errors today vs yesterday")
tracer.ask("show INC1000004 related logs and diagnose the root cause")

How the Agent Works

The agent uses a LangGraph ReAct loop — it thinks, calls a tool, sees the result, and decides whether to call another tool or answer:

User: "what caused the crash and how long did it last?"
        ↓
  [think] → I need last_incident first
        ↓
  [tool]  → last_incident() → sees cluster
        ↓
  [think] → now I need root_cause and duration
        ↓
  [tool]  → root_cause() → LLM analysis
        ↓
  [tool]  → incident_duration() → 6 minutes 12 seconds
        ↓
  [think] → I have everything now
        ↓
  FINAL_ANSWER: "The crash was caused by..."

Time Resolution

The agent automatically understands relative time — no need for exact timestamps:

You say Resolved to
"10am" today 10:00:00 → 10:59:59
"yesterday 2pm" yesterday 14:00:00 → 14:59:59
"this morning" today 06:00:00 → 12:00:00
"2 hours ago" now - 2h → now
"last 30 minutes" now - 30m → now
"last night" yesterday 20:00:00 → today 06:00:00
"March 1" 2024-03-01

Architecture

from pylogtracer import LogTracer      ← single entry point

LogTracer
    ├── preprocessing/
    │   ├── smart_reader.py            log reading, filtering, search
    │   ├── error_extractor.py         clustering, deduplication
    │   └── error_type_classifier.py   regex + keyword learning + LLM
    │
    ├── agents/
    │   ├── qa_agent.py                LangGraph ReAct agent (ask())
    │   └── root_cause_analyzer.py     LLM root cause analysis
    │
    ├── multiagent/
    │   └── context_bridge.py          agent-to-agent context loop
    │
    ├── llm/
    │   └── llm_factory.py             provider-agnostic LLM factory
    │
    └── utils/
        └── time_resolver.py           relative time resolution

How Keyword Learning Works

The classifier uses a 3-pass system to minimize LLM calls:

Pass 1 — Named exception regex (free):
  "ConnectionError: timed out"  → ConnectionError ✓

Pass 2 — Keyword store (free, learned this session):
  "database connection refused" → DatabaseConnectionError ✓
  (learned from a previous LLM call this session)

Pass 3 — LLM batch (only truly unknown errors):
  LLM classifies + returns keywords for future use
  Keywords stored → next similar error is FREE

Configuration Options

LogTracer(
    file_path   = "app.log",    # path to log file
    llm_config  = {             # LLM provider config
        "provider":    "ollama",
        "model":       "qwen2.5:7b",
        "base_url":    "http://localhost:11434",
        "api_key":     "optional",
        "temperature": 0.0,
        "max_tokens":  1024,
    },
    gap_seconds = 60,           # seconds between entries to split incidents
    max_retries = 2,            # max times LLM can request more context
)

Requirements

langchain>=0.2.0
langchain-core>=0.2.0
langchain-openai>=0.1.0
langchain-anthropic>=0.1.0
langchain-ollama>=0.1.0
langgraph>=0.1.0
pydantic>=2.0.0
python-dotenv>=1.0.0

Running Tests

# Install dev dependencies
pip install pytest pytest-cov

# Run all tests
pytest tests/ -v

# Run specific test
pytest tests/test_smart_reader.py -v

CI/CD

Every push to main runs tests on Python 3.10, 3.11, and 3.12. Every GitHub Release automatically publishes to PyPI.


License

MIT


Contributing

Pull requests welcome! Please run tests before submitting.

pip install -e .
pytest tests/ -v

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pylogtracer-0.0.2.tar.gz (44.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pylogtracer-0.0.2-py3-none-any.whl (45.7 kB view details)

Uploaded Python 3

File details

Details for the file pylogtracer-0.0.2.tar.gz.

File metadata

  • Download URL: pylogtracer-0.0.2.tar.gz
  • Upload date:
  • Size: 44.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pylogtracer-0.0.2.tar.gz
Algorithm Hash digest
SHA256 39b5a61429cdd8428e7cacd92e3af0fb613eef27524837b8a157852cb398b09e
MD5 903753be7c25da735b8b29e8af54710f
BLAKE2b-256 581380aefe506b60b56a04eddec7f5b5ccf0646a51e48eaebb00af4f97c24327

See more details on using hashes here.

File details

Details for the file pylogtracer-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: pylogtracer-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 45.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pylogtracer-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 1012a4130f4d5f72d2e221791d10f3c9a4693195f6b3adad5811ef415fd726fe
MD5 41af4c1c2a8e2a3cb811cdad9760bc95
BLAKE2b-256 bd75d12983cff617ebfe5c5ac1ec3ba56f9ff27cb37e355bee8ca575349b1cb6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page