Skip to main content

Provider-agnostic log analysis package with LLM support and LangGraph ReAct agent

Project description

pylogtracer

Provider-agnostic log analysis package with LLM support and a dynamic ReAct agent.

Python License PyPI


What is pylogtracer?

pylogtracer is a Python package that helps you analyze log files using LLMs. It works in two modes:

  • Library mode — direct function calls, no agent needed, works without LLM
  • Agent mode — ask free-form questions, a LangGraph ReAct agent decides which tools to call

It supports any LLM provider — Ollama (local), OpenAI, Anthropic, or any custom API.


Features

  • Provider agnostic — OpenAI, Anthropic, Ollama, or any OpenAI-compatible API
  • Keyword learning — LLM learns error patterns once, reuses them for free next time
  • ReAct agent — multi-step reasoning, calls multiple tools per question
  • Time resolver — understands "10am", "yesterday", "2 hours ago"
  • Cluster analysis — groups related errors into incidents automatically
  • Search — find any log entry by keyword, incident ID, or error snippet
  • No .env required — pass config directly to LogTracer

Installation

pip install pylogtracer

Quick Start

from pylogtracer import LogTracer

# Ollama (local, no API key needed)
tracer = LogTracer(
    file_path  = "app.log",
    llm_config = {
        "provider": "ollama",
        "model":    "qwen2.5:7b",
        "base_url": "http://localhost:11434"
    }
)

# Library mode — no LLM needed
print(tracer.summary())
print(tracer.error_frequency())
print(tracer.health_check())

# Agent mode — LLM required
print(tracer.ask("what caused the crash at 10am?"))
print(tracer.ask("show INC1000004 related logs and how long it lasted"))

Supported Providers

Provider Example Model API Key
Ollama qwen2.5:7b, llama3, mistral No
OpenAI gpt-4o-mini, gpt-4o Yes
Anthropic claude-3-5-haiku-20241022 Yes
Custom Any OpenAI-compatible API Optional
# OpenAI
tracer = LogTracer("app.log", llm_config={
    "provider": "openai",
    "model":    "gpt-4o-mini",
    "api_key":  "sk-..."
})

# Anthropic
tracer = LogTracer("app.log", llm_config={
    "provider": "anthropic",
    "model":    "claude-3-5-haiku-20241022",
    "api_key":  "sk-ant-..."
})

# Custom / vLLM / LM Studio
tracer = LogTracer("app.log", llm_config={
    "provider": "custom",
    "model":    "my-model",
    "base_url": "http://my-server:8000/v1",
    "api_key":  "optional"
})

Library Mode — All Methods

tracer = LogTracer("app.log")   # no LLM needed for library mode

# Overview
tracer.summary()
# {'total_entries': 100, 'total_errors': 30, 'total_clusters': 11,
#  'error_types': [...], 'first_error': '...', 'last_error': '...'}

# Error counts
tracer.error_frequency()
tracer.error_frequency(date="2024-03-01")
tracer.error_frequency(from_dt="2024-03-01 09:00:00", to_dt="2024-03-01 11:00:00")

# Filter errors
tracer.errors_by_date("2024-03-01")
tracer.errors_in_range("2024-03-01 09:00:00", "2024-03-01 11:00:00")

# Last incident
tracer.last_incident()

# System health
tracer.health_check()
# {'healthy': False, 'status': 'CRITICAL', 'total_errors': 30, ...}

# Incident duration
tracer.incident_duration()
# {'start': '...', 'end': '...', 'duration_human': '6 minutes 12 seconds', ...}

# Search
tracer.search("INC1000001")              # by incident ID
tracer.search("connection refused")      # by keyword
tracer.get_related_logs("INC1000004")   # all logs in same cluster
tracer.get_entry_details("INC1000004")  # full entry with traceback

# Root cause (LLM required)
tracer.root_cause_analysis()

Agent Mode — Ask Anything

tracer = LogTracer("app.log", llm_config={...})

# Simple questions
tracer.ask("what is the last error?")
tracer.ask("is the system healthy?")
tracer.ask("how many DB errors happened?")

# Time-based (auto-resolved — no need to specify exact timestamps)
tracer.ask("what errors happened at 10am?")
tracer.ask("show errors from yesterday")
tracer.ask("what happened 2 hours ago?")
tracer.ask("errors between 9am and 11am")

# Identifier search
tracer.ask("show me INC1000004 related logs")
tracer.ask("what happened with REQ-456?")

# Multi-step (agent calls multiple tools automatically)
tracer.ask("what caused the crash and how long did it last?")
tracer.ask("compare errors today vs yesterday")
tracer.ask("show INC1000004 related logs and diagnose the root cause")

How the Agent Works

The agent uses a LangGraph ReAct loop — it thinks, calls a tool, sees the result, and decides whether to call another tool or answer:

User: "what caused the crash and how long did it last?"
        ↓
  [think] → I need last_incident first
        ↓
  [tool]  → last_incident() → sees cluster
        ↓
  [think] → now I need root_cause and duration
        ↓
  [tool]  → root_cause() → LLM analysis
        ↓
  [tool]  → incident_duration() → 6 minutes 12 seconds
        ↓
  [think] → I have everything now
        ↓
  FINAL_ANSWER: "The crash was caused by..."

Time Resolution

The agent automatically understands relative time — no need for exact timestamps:

You say Resolved to
"10am" today 10:00:00 → 10:59:59
"yesterday 2pm" yesterday 14:00:00 → 14:59:59
"this morning" today 06:00:00 → 12:00:00
"2 hours ago" now - 2h → now
"last 30 minutes" now - 30m → now
"last night" yesterday 20:00:00 → today 06:00:00
"March 1" 2024-03-01

Architecture

from pylogtracer import LogTracer      ← single entry point

LogTracer
    ├── preprocessing/
    │   ├── smart_reader.py            log reading, filtering, search
    │   ├── error_extractor.py         clustering, deduplication
    │   └── error_type_classifier.py   regex + keyword learning + LLM
    │
    ├── agents/
    │   ├── qa_agent.py                LangGraph ReAct agent (ask())
    │   └── root_cause_analyzer.py     LLM root cause analysis
    │
    ├── multiagent/
    │   └── context_bridge.py          agent-to-agent context loop
    │
    ├── llm/
    │   └── llm_factory.py             provider-agnostic LLM factory
    │
    └── utils/
        └── time_resolver.py           relative time resolution

How Keyword Learning Works

The classifier uses a 3-pass system to minimize LLM calls:

Pass 1 — Named exception regex (free):
  "ConnectionError: timed out"  → ConnectionError ✓

Pass 2 — Keyword store (free, learned this session):
  "database connection refused" → DatabaseConnectionError ✓
  (learned from a previous LLM call this session)

Pass 3 — LLM batch (only truly unknown errors):
  LLM classifies + returns keywords for future use
  Keywords stored → next similar error is FREE

Configuration Options

LogTracer(
    file_path   = "app.log",    # path to log file
    llm_config  = {             # LLM provider config
        "provider":    "ollama",
        "model":       "qwen2.5:7b",
        "base_url":    "http://localhost:11434",
        "api_key":     "optional",
        "temperature": 0.0,
        "max_tokens":  1024,
    },
    gap_seconds = 60,           # seconds between entries to split incidents
    max_retries = 2,            # max times LLM can request more context
)

Requirements

langchain>=0.2.0
langchain-core>=0.2.0
langchain-openai>=0.1.0
langchain-anthropic>=0.1.0
langchain-ollama>=0.1.0
langgraph>=0.1.0
pydantic>=2.0.0
python-dotenv>=1.0.0

Running Tests

# Install dev dependencies
pip install pytest pytest-cov

# Run all tests
pytest tests/ -v

# Run specific test
pytest tests/test_smart_reader.py -v

CI/CD

Every push to main runs tests on Python 3.10, 3.11, and 3.12. Every GitHub Release automatically publishes to PyPI.


License

MIT


Contributing

Pull requests welcome! Please run tests before submitting.

pip install -e .
pytest tests/ -v

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pylogtracer-0.1.3.tar.gz (50.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pylogtracer-0.1.3-py3-none-any.whl (52.0 kB view details)

Uploaded Python 3

File details

Details for the file pylogtracer-0.1.3.tar.gz.

File metadata

  • Download URL: pylogtracer-0.1.3.tar.gz
  • Upload date:
  • Size: 50.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pylogtracer-0.1.3.tar.gz
Algorithm Hash digest
SHA256 5474d1508735883e61fc60b5869485c72019a165415f957d18738218add5be74
MD5 2bbca87e22a8a653477662a025173a71
BLAKE2b-256 678ace11b9fcc13637db6fd98084ab3ed5413a430d1db8937d17c2a595b39899

See more details on using hashes here.

File details

Details for the file pylogtracer-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: pylogtracer-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 52.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for pylogtracer-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 9f2065fdd0b6c354982c53fe8fea3f28d0de7a2fc39ed38a54972d4f8d745859
MD5 3b30b6737c50d5ee6692e185b93bce2b
BLAKE2b-256 84eb735b15a166387f870195d5bcae6a919480ad7b69df42a38b35053f1771b3

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page