Skip to main content

Provider-agnostic log analysis package with LLM support and LangGraph ReAct agent

Project description

pylogtracer

Provider-agnostic log analysis package with LLM support and a dynamic ReAct agent.

Python License PyPI


What is pylogtracer?

pylogtracer is a Python package that helps you analyze log files using LLMs. It works in two modes:

  • Library mode — direct function calls, no agent needed, works without LLM
  • Agent mode — ask free-form questions, a LangGraph ReAct agent decides which tools to call

It supports any LLM provider — Ollama (local), OpenAI, Anthropic, or any custom API.


Features

  • Provider agnostic — OpenAI, Anthropic, Ollama, or any OpenAI-compatible API
  • Keyword learning — LLM learns error patterns once, reuses them for free next time
  • ReAct agent — multi-step reasoning, calls multiple tools per question
  • Time resolver — understands "10am", "yesterday", "2 hours ago"
  • Cluster analysis — groups related errors into incidents automatically
  • Search — find any log entry by keyword, incident ID, or error snippet
  • No .env required — pass config directly to LogTracer

Installation

pip install pylogtracer

Quick Start

from pylogtracer import LogTracer

# Ollama (local, no API key needed)
tracer = LogTracer(
    file_path  = "app.log",
    llm_config = {
        "provider": "ollama",
        "model":    "qwen2.5:7b",
        "base_url": "http://localhost:11434"
    }
)

# Library mode — no LLM needed
print(tracer.summary())
print(tracer.error_frequency())
print(tracer.health_check())

# Agent mode — LLM required
print(tracer.ask("what caused the crash at 10am?"))
print(tracer.ask("show INC1000004 related logs and how long it lasted"))

Supported Providers

Provider Example Model API Key
Ollama qwen2.5:7b, llama3, mistral No
OpenAI gpt-4o-mini, gpt-4o Yes
Anthropic claude-3-5-haiku-20241022 Yes
Custom Any OpenAI-compatible API Optional
# OpenAI
tracer = LogTracer("app.log", llm_config={
    "provider": "openai",
    "model":    "gpt-4o-mini",
    "api_key":  "sk-..."
})

# Anthropic
tracer = LogTracer("app.log", llm_config={
    "provider": "anthropic",
    "model":    "claude-3-5-haiku-20241022",
    "api_key":  "sk-ant-..."
})

# Custom / vLLM / LM Studio
tracer = LogTracer("app.log", llm_config={
    "provider": "custom",
    "model":    "my-model",
    "base_url": "http://my-server:8000/v1",
    "api_key":  "optional"
})

Library Mode — All Methods

tracer = LogTracer("app.log")   # no LLM needed for library mode

# Overview
tracer.summary()
# {'total_entries': 100, 'total_errors': 30, 'total_clusters': 11,
#  'error_types': [...], 'first_error': '...', 'last_error': '...'}

# Error counts
tracer.error_frequency()
tracer.error_frequency(date="2024-03-01")
tracer.error_frequency(from_dt="2024-03-01 09:00:00", to_dt="2024-03-01 11:00:00")

# Filter errors
tracer.errors_by_date("2024-03-01")
tracer.errors_in_range("2024-03-01 09:00:00", "2024-03-01 11:00:00")

# Last incident
tracer.last_incident()

# System health
tracer.health_check()
# {'healthy': False, 'status': 'CRITICAL', 'total_errors': 30, ...}

# Incident duration
tracer.incident_duration()
# {'start': '...', 'end': '...', 'duration_human': '6 minutes 12 seconds', ...}

# Search
tracer.search("INC1000001")              # by incident ID
tracer.search("connection refused")      # by keyword
tracer.get_related_logs("INC1000004")   # all logs in same cluster
tracer.get_entry_details("INC1000004")  # full entry with traceback

# Root cause (LLM required)
tracer.root_cause_analysis()

Agent Mode — Ask Anything

tracer = LogTracer("app.log", llm_config={...})

# Simple questions
tracer.ask("what is the last error?")
tracer.ask("is the system healthy?")
tracer.ask("how many DB errors happened?")

# Time-based (auto-resolved — no need to specify exact timestamps)
tracer.ask("what errors happened at 10am?")
tracer.ask("show errors from yesterday")
tracer.ask("what happened 2 hours ago?")
tracer.ask("errors between 9am and 11am")

# Identifier search
tracer.ask("show me INC1000004 related logs")
tracer.ask("what happened with REQ-456?")

# Multi-step (agent calls multiple tools automatically)
tracer.ask("what caused the crash and how long did it last?")
tracer.ask("compare errors today vs yesterday")
tracer.ask("show INC1000004 related logs and diagnose the root cause")

How the Agent Works

The agent uses a LangGraph ReAct loop — it thinks, calls a tool, sees the result, and decides whether to call another tool or answer:

User: "what caused the crash and how long did it last?"
        ↓
  [think] → I need last_incident first
        ↓
  [tool]  → last_incident() → sees cluster
        ↓
  [think] → now I need root_cause and duration
        ↓
  [tool]  → root_cause() → LLM analysis
        ↓
  [tool]  → incident_duration() → 6 minutes 12 seconds
        ↓
  [think] → I have everything now
        ↓
  FINAL_ANSWER: "The crash was caused by..."

Time Resolution

The agent automatically understands relative time — no need for exact timestamps:

You say Resolved to
"10am" today 10:00:00 → 10:59:59
"yesterday 2pm" yesterday 14:00:00 → 14:59:59
"this morning" today 06:00:00 → 12:00:00
"2 hours ago" now - 2h → now
"last 30 minutes" now - 30m → now
"last night" yesterday 20:00:00 → today 06:00:00
"March 1" 2024-03-01

Architecture

from pylogtracer import LogTracer      ← single entry point

LogTracer
    ├── preprocessing/
    │   ├── smart_reader.py            log reading, filtering, search
    │   ├── error_extractor.py         clustering, deduplication
    │   └── error_type_classifier.py   regex + keyword learning + LLM
    │
    ├── agents/
    │   ├── qa_agent.py                LangGraph ReAct agent (ask())
    │   └── root_cause_analyzer.py     LLM root cause analysis
    │
    ├── multiagent/
    │   └── context_bridge.py          agent-to-agent context loop
    │
    ├── llm/
    │   └── llm_factory.py             provider-agnostic LLM factory
    │
    └── utils/
        └── time_resolver.py           relative time resolution

How Keyword Learning Works

The classifier uses a 3-pass system to minimize LLM calls:

Pass 1 — Named exception regex (free):
  "ConnectionError: timed out"  → ConnectionError ✓

Pass 2 — Keyword store (free, learned this session):
  "database connection refused" → DatabaseConnectionError ✓
  (learned from a previous LLM call this session)

Pass 3 — LLM batch (only truly unknown errors):
  LLM classifies + returns keywords for future use
  Keywords stored → next similar error is FREE

Configuration Options

LogTracer(
    file_path   = "app.log",    # path to log file
    llm_config  = {             # LLM provider config
        "provider":    "ollama",
        "model":       "qwen2.5:7b",
        "base_url":    "http://localhost:11434",
        "api_key":     "optional",
        "temperature": 0.0,
        "max_tokens":  1024,
    },
    gap_seconds = 60,           # seconds between entries to split incidents
    max_retries = 2,            # max times LLM can request more context
)

Requirements

langchain>=0.2.0
langchain-core>=0.2.0
langchain-openai>=0.1.0
langchain-anthropic>=0.1.0
langchain-ollama>=0.1.0
langgraph>=0.1.0
pydantic>=2.0.0
python-dotenv>=1.0.0

Running Tests

# Install dev dependencies
pip install pytest pytest-cov

# Run all tests
pytest tests/ -v

# Run specific test
pytest tests/test_smart_reader.py -v

CI/CD

Every push to main runs tests on Python 3.10, 3.11, and 3.12. Every GitHub Release automatically publishes to PyPI.


License

MIT


Contributing

Pull requests welcome! Please run tests before submitting.

pip install -e .
pytest tests/ -v

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pylogtracer-0.1.2.tar.gz (44.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pylogtracer-0.1.2-py3-none-any.whl (45.7 kB view details)

Uploaded Python 3

File details

Details for the file pylogtracer-0.1.2.tar.gz.

File metadata

  • Download URL: pylogtracer-0.1.2.tar.gz
  • Upload date:
  • Size: 44.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pylogtracer-0.1.2.tar.gz
Algorithm Hash digest
SHA256 133d152e7eefc29d612cbdcd4fe20c78880febe7131ceda23b85c6db49b93692
MD5 f60d4db5f37410aa761596b97b32973b
BLAKE2b-256 e9beeb68a6e5d76fd17bfc9d7755072cf82479caa87b45809b095b9a1f75eefb

See more details on using hashes here.

File details

Details for the file pylogtracer-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pylogtracer-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 45.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pylogtracer-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 34d31e3000548a4d12b0cdc81390d451db80aebf080512feff1051ea9f7d146c
MD5 da6368f2df5733242465a9a504382392
BLAKE2b-256 52afe7bf465d3f029a8c0af08267355d50304b9beb2dd7c641e3b295ac5ea692

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page