Python SDK for Claude Code with Rust core

Claude SDK for Python

A high-performance Python library for parsing and analyzing Claude Code session data. Built with Rust for speed, designed with Python developers in mind.

Installation
Quick Start
Core Concepts
API Reference
Examples
Performance
Troubleshooting
Development

Installation

Prerequisites

Python 3.8 or higher
pip or uv package manager

Install from PyPI

pip install claude-code-analytics

Install from source

# Clone the repository
git clone https://github.com/darinkishore/claude-code-analytics.git
cd claude-code-analytics

# Install the Python package with uv (recommended)
uv pip install ./python

Development installation

cd python
uv build

Quick Start

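from pathlib import Path

import claude_code_analytics

# Load a session from a JSONL file (expand "~" explicitly before passing the path)
session_file = Path("~/.claude/projects/myproject/session_20240101_120000.jsonl").expanduser()
session = claude_code_analytics.load(session_file)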

# Basic session info
print(f"Session ID: {session.session_id}")
print(f"Total cost: ${session.total_cost:.4f}")
print(f"Message count: {len(session.messages)}")
print(f"Tools used: {', '.join(session.tools_used)}")

# Iterate through messages
for message in session:
    print(f"{message.role}: {message.text[:100]}...")
    
# Find all your sessions
sessions = claude_code_analytics.find_sessions()
for session_path in sessions:
    print(f"Found session: {session_path}")

Core Concepts

Sessions

A Session represents a complete conversation with Claude, loaded from a JSONL file. Each session contains:

Messages exchanged between user and assistant
Tool executions and their results
Token usage and cost information
Conversation structure (including branches and sidechains)
Metadata and statistics

Messages

Messages are the individual exchanges in a conversation. Each message has:

role: Either "user" or "assistant"
text: The complete text content
tools: List of tools used (if any)
cost: Cost in USD for this specific message
timestamp: When the message was created
Threading information (uuid, parent_uuid)

Projects

A Project is a collection of related sessions, typically stored in the same directory. Projects provide aggregate statistics across all sessions.

Conversation Trees

The SDK automatically reconstructs the conversation structure, handling:

Linear conversations
Branching (when you retry or edit messages)
Sidechains (alternate conversation paths)
Orphaned messages (missing parents)
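
To make the threading idea concrete, here is a minimal sketch of how a tree could be rebuilt from uuid and parent_uuid links. It is not the SDK's implementation: build_tree and the plain-dict record shape are hypothetical, chosen only to mirror the Message properties described above.

```python
from collections import defaultdict


def build_tree(records):
    """Group records by parent and separate roots from orphans.

    `records` is a list of dicts with "uuid" and "parent_uuid" keys
    (a hypothetical shape used only for this illustration).
    """
    known = {r["uuid"] for r in records}
    children = defaultdict(list)
    roots, orphans = [], []
    for r in records:
        parent = r["parent_uuid"]
        if parent is None:
            roots.append(r["uuid"])            # starts a conversation
        elif parent in known:
            children[parent].append(r["uuid"])
        else:
            orphans.append(r["uuid"])          # parent is missing from the data
    return roots, dict(children), orphans


records = [
    {"uuid": "a", "parent_uuid": None},
    {"uuid": "b", "parent_uuid": "a"},
    {"uuid": "c", "parent_uuid": "a"},    # second child of "a": a branch
    {"uuid": "d", "parent_uuid": "zzz"},  # parent never seen: an orphan
]
roots, children, orphans = build_tree(records)
print(roots, children, orphans)
```

A message with more than one child is a branch point, and a message whose parent never appears is reported as orphaned rather than silently dropped.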

API Reference

Functions

`load(file_path: str | Path) -> Session`

Load a Claude Code session from a JSONL file.

session = claude_code_analytics.load("path/to/session.jsonl")

Parameters:

file_path: Path to the JSONL session file

Returns: Session object

Raises:

FileNotFoundError: If the file doesn't exist
ParseError: If the JSONL is malformed
ValidationError: If the session data is invalid

`find_sessions(base_path: Optional[str] = None, project: Optional[str] = None) -> List[Path]`

Discover Claude Code session files.

# Find all sessions
all_sessions = claude_code_analytics.find_sessions()

# Find sessions in a specific project
project_sessions = claude_code_analytics.find_sessions(project="myproject")

# Search in a custom location
custom_sessions = claude_code_analytics.find_sessions(base_path="/custom/path")

Parameters:

base_path: Root directory to search (default: ~/.claude/projects/)
project: Filter by specific project name

Returns: List of Path objects to session files
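
For intuition, discovery of this kind can be approximated with pathlib alone. The sketch below assumes a layout of one directory per project with .jsonl files directly inside it, as in the Quick Start path; discover_sessions is a hypothetical helper, not part of the SDK, and the real find_sessions may apply different rules.

```python
from pathlib import Path
from typing import List, Optional


def discover_sessions(base_path: Path, project: Optional[str] = None) -> List[Path]:
    """Return *.jsonl files one level below base_path, optionally for one project."""
    if project is not None:
        # Restrict the search to a single project directory
        return sorted((base_path / project).glob("*.jsonl"))
    # Otherwise look inside every immediate subdirectory
    return sorted(base_path.glob("*/*.jsonl"))
```

Calling discover_sessions(Path.home() / ".claude" / "projects") would list candidate files under the default location; a missing directory simply yields an empty list.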

`find_projects(base_path: Optional[str] = None) -> List[Path]`

Find all Claude Code projects.

projects = claude_code_analytics.find_projects()
for project_path in projects:
    print(f"Project: {project_path.name}")

Parameters:

base_path: Root directory to search (default: ~/.claude/projects/)

Returns: List of Path objects to project directories

`load_project(project_identifier: str | Path, base_path: Optional[str] = None) -> Project`

Load an entire project with all its sessions.

# Load by project name
project = claude_code_analytics.load_project("myproject")

# Load by path
project = claude_code_analytics.load_project("/path/to/project")

print(f"Total sessions: {len(project.sessions)}")
print(f"Total cost: ${project.total_cost:.2f}")

Parameters:

project_identifier: Project name or path
base_path: Base path for project lookup (if using name)

Returns: Project object

Raises:

FileNotFoundError: If project doesn't exist
SessionError: If no valid sessions found

Classes

Session

Primary container for Claude Code session data.

Properties:

Property	Type	Description
`session_id`	`str`	Unique session identifier
`messages`	`List[Message]`	All messages in conversation order
`total_cost`	`float`	Total cost in USD
`tools_used`	`List[str]`	Unique tool names used
`duration`	`Optional[float]`	Session duration in seconds
`conversation_tree`	`ConversationTree`	Message threading structure
`metadata`	`SessionMetadata`	Detailed statistics
`tool_executions`	`List[ToolExecution]`	All tool runs
`tool_costs`	`Dict[str, float]`	Cost breakdown by tool
`cost_by_turn`	`List[float]`	Cost per message
`project_path`	`Optional[Path]`	Project directory
`project_name`	`Optional[str]`	Project name

Methods:

# Get main conversation (excluding sidechains)
main_messages = session.get_main_chain()

# Filter by role
user_messages = session.get_messages_by_role("user")
assistant_messages = session.get_messages_by_role("assistant")

# Find messages using specific tools
bash_messages = session.get_messages_by_tool("bash")

# Get a specific message
message = session.get_message_by_uuid("msg-uuid-123")

# Custom filtering
long_messages = session.filter_messages(lambda m: len(m.text) > 1000)

# Get conversation thread
thread = session.get_thread("msg-uuid-789")  # Returns path from root

# Iteration and length
for msg in session:
    print(msg.text)
    
print(f"Total messages: {len(session)}")

Message

Represents a single message in the conversation.

Properties:

Property	Type	Description
`role`	`str`	"user" or "assistant"
`text`	`str`	Complete text content
`model`	`Optional[str]`	Model used (e.g., "claude-3-sonnet-20240229")
`cost`	`Optional[float]`	Cost in USD
`tools`	`List[str]`	Tool names used
`stop_reason`	`Optional[str]`	Why generation stopped
`usage`	`Optional[TokenUsage]`	Token usage details
`timestamp`	`str`	RFC3339 timestamp
`uuid`	`str`	Unique identifier
`parent_uuid`	`Optional[str]`	Parent message UUID
`is_sidechain`	`bool`	Whether part of a sidechain
`cwd`	`Optional[Path]`	Working directory
`total_tokens`	`int`	Total token count
`input_tokens`	`int`	Input token count
`output_tokens`	`int`	Output token count
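
Because timestamp is a string, it has to be parsed before any time arithmetic. The following sketch uses only the standard library and assumes timestamps of the form "2024-01-01T12:00:00Z"; parse_rfc3339 and elapsed_seconds are hypothetical helpers, not SDK functions.

```python
from datetime import datetime


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC 3339 timestamp, accepting a trailing 'Z' for UTC."""
    # datetime.fromisoformat only accepts 'Z' from Python 3.11 onward,
    # so normalise it to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def elapsed_seconds(timestamps):
    """Seconds between the earliest and latest timestamp in the list."""
    parsed = [parse_rfc3339(t) for t in timestamps]
    return (max(parsed) - min(parsed)).total_seconds()


print(elapsed_seconds(["2024-01-01T12:00:00Z", "2024-01-01T12:05:30Z"]))  # 330.0
```

With a loaded session, the input list could be built as [m.timestamp for m in session.messages].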

Methods:

# Check for tool usage
if message.has_tool_use():
    tools = message.get_tool_blocks()
    for tool in tools:
        print(f"Tool: {tool.name}, Input: {tool.input}")

# Get text content blocks
text_blocks = message.get_text_blocks()

# Get all content blocks with proper typing
for block in message.get_content_blocks():
    if isinstance(block, claude_code_analytics.TextBlock):
        print(f"Text: {block.text}")
    elif isinstance(block, claude_code_analytics.ToolUseBlock):
        print(f"Tool: {block.name}")

Project

Container for multiple sessions in a project.

Properties:

Property	Type	Description
`name`	`str`	Project name
`sessions`	`List[Session]`	All sessions in project
`total_cost`	`float`	Aggregate cost
`total_messages`	`int`	Total message count
`tool_usage_count`	`Dict[str, int]`	Tool usage frequency
`total_duration`	`Optional[float]`	Total time in seconds

project = claude_code_analytics.load_project("myproject")

# Analyze tool usage patterns
for tool, count in project.tool_usage_count.items():
    avg_per_session = count / len(project.sessions)
    print(f"{tool}: {count} uses ({avg_per_session:.1f} per session)")

# Find expensive sessions
expensive = [s for s in project.sessions if s.total_cost > 1.0]

ToolExecution

Complete record of a tool invocation.

Properties:

Property	Type	Description
`tool_name`	`str`	Name of the tool
`input`	`Dict[str, Any]`	Input parameters
`output`	`ToolResult`	Execution result
`duration_ms`	`Optional[int]`	Execution time
`timestamp`	`str`	When executed

Methods:

# Check success
if execution.is_success():
    print(f"{execution.tool_name} completed in {execution.duration_ms}ms")
else:
    print(f"Failed: {execution.output.stderr}")

ConversationTree

Tree structure representing conversation flow.

Properties:

Property	Type	Description
`root_messages`	`List[ConversationNode]`	Root nodes
`orphaned_messages`	`List[str]`	Messages with missing parents
`circular_references`	`List[str]`	Circular reference UUIDs
`stats`	`ConversationStats`	Tree statistics

Methods:

tree = session.conversation_tree

# Get tree metrics
print(f"Max depth: {tree.max_depth()}")
print(f"Branch points: {tree.count_branches()}")

# Traverse tree
def walk_tree(node, depth=0):
    print("  " * depth + node.message.text[:50])
    for child in node.children:
        walk_tree(child, depth + 1)

for root in tree.root_messages:
    walk_tree(root)
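
As an illustration of what such metrics measure, the sketch below computes depth and branch points on a stand-in Node class. The definitions used here (depth counts nodes along the longest path, and a branch point is any node with more than one child) are assumptions; the SDK's max_depth() and count_branches() may define them differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Stand-in for a conversation node: some text plus child nodes."""
    text: str
    children: List["Node"] = field(default_factory=list)


def max_depth(node: Node) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    return 1 + max((max_depth(c) for c in node.children), default=0)


def count_branches(node: Node) -> int:
    """Number of nodes in the subtree that have more than one child."""
    here = 1 if len(node.children) > 1 else 0
    return here + sum(count_branches(c) for c in node.children)


# One question answered twice (a retry), with a follow-up on the first answer
root = Node("q1", [Node("a1", [Node("q2")]), Node("a1-retry")])
print(max_depth(root), count_branches(root))  # 3 1
```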

Exceptions

# Exception hierarchy:
#   claude_code_analytics.ClaudeSDKError          (base exception)
#   ├── claude_code_analytics.ParseError          (JSONL parsing failed)
#   ├── claude_code_analytics.ValidationError     (invalid data)
#   └── claude_code_analytics.SessionError        (session-specific issues)

# Example handling
try:
    session = claude_code_analytics.load("session.jsonl")
except claude_code_analytics.ParseError as e:
    print(f"Failed to parse: {e}")
except claude_code_analytics.ClaudeSDKError as e:
    print(f"SDK error: {e}")
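
The order of the except clauses matters when exceptions share a base class: Python runs the first clause that matches, so the specific subclass has to come before the base. The sketch below shows this with stand-in classes (SDKError and ParseError here are local placeholders, not the SDK's own types).

```python
class SDKError(Exception):
    """Stand-in for a library's base exception."""


class ParseError(SDKError):
    """Stand-in for a more specific error derived from the base."""


def classify(exc: Exception) -> str:
    """Raise exc and report which handler caught it."""
    try:
        raise exc
    except ParseError:
        return "parse"   # most specific handler first
    except SDKError:
        return "sdk"     # base class catches every other SDK error


print(classify(ParseError("bad line")), classify(SDKError("other")))  # parse sdk
```

If the two clauses were swapped, the base-class handler would also catch ParseError and the specific branch would never run.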

Examples

Basic Session Analysis

import claude_code_analytics

# Load session
session = claude_code_analytics.load("session.jsonl")

# Print summary
print(f"Session: {session.session_id}")
if session.duration is not None:
    print(f"Duration: {session.duration / 60:.1f} minutes")
else:
    print("Duration unknown")
print(f"Messages: {len(session)} ({len(session.get_messages_by_role('user'))} from user)")
print(f"Cost: ${session.total_cost:.4f}")
print(f"Tools: {', '.join(session.tools_used) or 'None'}")

# Analyze token usage
total_tokens = sum(msg.total_tokens for msg in session.messages)
print(f"Total tokens: {total_tokens:,}")

Tool Usage Patterns

import claude_code_analytics
from collections import defaultdict

session = claude_code_analytics.load("session.jsonl")

# Count tool usage by message
tool_messages = defaultdict(list)
for msg in session.messages:
    if msg.has_tool_use():
        for tool in msg.tools:
            tool_messages[tool].append(msg)

# Print tool usage summary
for tool, messages in sorted(tool_messages.items()):
    print(f"\n{tool}: {len(messages)} uses")
    
    # Show first few uses
    for msg in messages[:3]:
        preview = msg.text[:100].replace('\n', ' ')
        print(f"  - {preview}...")

Cost Analysis Across Projects

import claude_code_analytics

# Find all projects
projects = claude_code_analytics.find_projects()

# Analyze costs
project_costs = []
for project_path in projects:
    try:
        project = claude_code_analytics.load_project(project_path)
        project_costs.append((project.name, project.total_cost, len(project.sessions)))
    except Exception as e:
        print(f"Failed to load {project_path}: {e}")

# Sort by cost
project_costs.sort(key=lambda x: x[1], reverse=True)

# Print report
print("Project Cost Analysis")
print("-" * 50)
for name, cost, session_count in project_costs:
    avg_cost = cost / session_count if session_count > 0 else 0
    print(f"{name:20} ${cost:8.2f} ({session_count:3} sessions, avg ${avg_cost:.2f})")

Conversation Flow Analysis

import claude_code_analytics

session = claude_code_analytics.load("session.jsonl")
tree = session.conversation_tree

# Find branching points
def find_branches(node, path=()):
    # Build a new tuple on each call so sibling branches never share state
    current_path = path + (node.message.uuid,)

    if len(node.children) > 1:
        print(f"\nBranch point at message {len(current_path)}:")
        print(f"  {node.message.text[:100]}...")
        print(f"  Branches into {len(node.children)} paths")

    for child in node.children:
        find_branches(child, current_path)

for root in tree.root_messages:
    find_branches(root)

# Analyze sidechains
sidechain_messages = [m for m in session.messages if m.is_sidechain]
if sidechain_messages:
    print(f"\nFound {len(sidechain_messages)} sidechain messages")

Exporting Session Data

import claude_code_analytics
import json
import csv

session = claude_code_analytics.load("session.jsonl")

# Export to JSON
export_data = {
    "session_id": session.session_id,
    "total_cost": session.total_cost,
    "messages": [
        {
            "role": msg.role,
            "text": msg.text,
            "cost": msg.cost,
            "timestamp": msg.timestamp,
            "tools": msg.tools
        }
        for msg in session.messages
    ]
}

with open("session_export.json", "w") as f:
    json.dump(export_data, f, indent=2)

# Export tool usage to CSV
with open("tool_usage.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Timestamp", "Tool", "Duration (ms)", "Success"])
    
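    # "execution" avoids shadowing the built-in exec()
    for execution in session.tool_executions:
        writer.writerow([
            execution.timestamp,
            execution.tool_name,
            # Keep a genuine 0 ms duration instead of turning it into "N/A"
            execution.duration_ms if execution.duration_ms is not None else "N/A",
            execution.is_success()
        ])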

Performance

The Claude SDK is built with Rust for exceptional performance:

Parsing speed: 1000+ messages per second
Memory efficient: Streaming parser for large files
Zero-copy strings: Minimal memory allocation
Thread safe: Can be used in multi-threaded applications

Benchmarks

File Size	Messages	Parse Time	Memory Usage
100 KB	50	<10ms	2 MB
1 MB	500	<50ms	8 MB
10 MB	5000	<300ms	35 MB
100 MB	50000	<3s	350 MB
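
Figures like these depend on hardware and on the shape of the session data, so it is worth measuring on your own files. The helper below is a small, generic timing sketch; time_call is a hypothetical name, not an SDK function, and it only measures wall-clock time, not memory.

```python
import time


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Demonstration with a cheap built-in so the sketch runs anywhere
best = time_call(sum, range(100_000))
print(f"best of 5: {best * 1000:.3f} ms")
```

With the package installed, time_call(claude_code_analytics.load, "session.jsonl") would time a single load of that file.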

Troubleshooting

Common Issues

ImportError: No module named 'claude_code_analytics'

Solution: Ensure you've installed the package:

pip install claude-code-analytics
# or for development
uv build

FileNotFoundError when loading sessions

Solution: Check the file path and ensure you have read permissions:

import os
import claude_code_analytics

path = os.path.expanduser("~/.claude/projects/myproject/session.jsonl")
if os.path.exists(path):
    session = claude_code_analytics.load(path)
else:
    print(f"No session file at {path}")

ParseError: Invalid JSONL format

Solution: Ensure the file is a valid Claude Code session file:

# Check first few lines
head -n 5 session.jsonl

# Validate that every line is a standalone JSON document (Python 3.8+)
python -m json.tool --json-lines session.jsonl > /dev/null
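
When a file fails to parse, it helps to know which line is at fault. The sketch below checks a JSONL file line by line with the standard json module; first_bad_line is a hypothetical helper, and it only verifies that each line is valid JSON, not that it is a valid Claude Code session record.

```python
import json


def first_bad_line(path):
    """Return (line_number, error_message) for the first invalid line, or None."""
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # blank lines are skipped in this sketch
            try:
                json.loads(line)
            except json.JSONDecodeError as err:
                return number, err.msg
    return None
```

Calling first_bad_line("session.jsonl") returns None for a clean file, or the 1-based line number and the decoder's message for the first line that does not parse.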

High memory usage with large files

Solution: Load and process sessions one at a time instead of keeping them all in memory:

import claude_code_analytics

# Only one session is held in memory at any point
for path in claude_code_analytics.find_sessions(project="large_project"):
    session = claude_code_analytics.load(path)
    # Process session
    del session  # Drop the reference so the memory can be reclaimed
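
The same idea extends to aggregate statistics: keep running totals rather than a list of loaded sessions. In the sketch below, summarize is a hypothetical helper that takes the loader as a parameter, and a stand-in loader is used so the example runs without real session files.

```python
from types import SimpleNamespace


def summarize(paths, load):
    """Load each path in turn and keep only running totals, not the sessions."""
    count, total_cost = 0, 0.0
    for path in paths:
        session = load(path)            # one session in memory at a time
        total_cost += session.total_cost
        count += 1
    return count, total_cost


def fake_load(path):
    """Stand-in loader returning an object with a total_cost attribute."""
    return SimpleNamespace(total_cost=0.25)


print(summarize(["a.jsonl", "b.jsonl"], fake_load))  # (2, 0.5)
```

In real use, the documented claude_code_analytics.load would be passed in place of fake_load, with the paths coming from find_sessions.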

Debug Mode

Enable detailed logging for troubleshooting:

import logging

import claude_code_analytics

logging.basicConfig(level=logging.DEBUG)

# Now SDK operations will print debug info
session = claude_code_analytics.load("session.jsonl")

Development

Building from source

# Clone repository
git clone https://github.com/darinkishore/claude-code-analytics.git
cd claude-code-analytics

# Build Rust library
cargo build --release

# Build Python package
uv build

Running tests

# Rust tests
cargo test

# Python tests
uv build
uv run -m pytest tests/

The Python test suite includes fixtures for malformed JSONL and a multi-megabyte session to ensure ParseError is raised correctly and large files load successfully.

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with:

PyO3 - Rust bindings for Python
Maturin - Build and publish Rust Python extensions
Serde - Serialization framework for Rust

Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: Full API Docs

Release history

0.1.1 (this version): Jun 28, 2025
0.1.0: Jun 28, 2025

Download files

Source Distribution

claude_code_analytics-0.1.1.tar.gz (938.5 kB)

Uploaded Jun 28, 2025 Source

Built Distribution

claude_code_analytics-0.1.1-cp311-abi3-macosx_11_0_arm64.whl (469.4 kB)

Uploaded Jun 28, 2025 CPython 3.11+macOS 11.0+ ARM64

File details

Details for the file claude_code_analytics-0.1.1.tar.gz.

File metadata

Download URL: claude_code_analytics-0.1.1.tar.gz
Upload date: Jun 28, 2025
Size: 938.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.9

File hashes

Hashes for claude_code_analytics-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`d67837156e955e6d67193465c0e824168d0b35b09ff685263c9407bcabf6e5c2`
MD5	`8027ffeec65a2e971185191585e5b71a`
BLAKE2b-256	`c81a9f75f27017c43b2c644187f479525748aeab67489843fc8b82041b771ac0`

File details

Details for the file claude_code_analytics-0.1.1-cp311-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: claude_code_analytics-0.1.1-cp311-abi3-macosx_11_0_arm64.whl
Upload date: Jun 28, 2025
Size: 469.4 kB
Tags: CPython 3.11+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.9

File hashes

Hashes for claude_code_analytics-0.1.1-cp311-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`dc28ceedbc4df872468ded97b6afcce052c01209ba29b89a17644192a8dc75fa`
MD5	`a01e233a47f73cbef0a94a91964e8f17`
BLAKE2b-256	`8e49294890d887d30c4e1018a67bc46180b3ef30803d2ff0ccdfe33e1aacd771`
