
Local-first AI agent framework. Built for models that aren't perfect.


FreeAgent SDK


A clean local agent SDK for Ollama, vLLM, and OpenAI-compatible servers.

Streaming. Multi-turn out of the box. Markdown skills and memory. Built-in telemetry. Single dependency.

pip install freeagent-sdk

Links: Documentation · Tutorial · Changelog · Contributing · Examples · Evaluation data

Why FreeAgent

  • Local-first: works with Ollama and vLLM — your data never leaves your machine
  • Streaming everywhere: token-level streaming with semantic events
  • Multi-turn that just works: conversation state managed automatically with pluggable strategies
  • Markdown is first-class: skills and memory are human-readable .md files with frontmatter
  • Zero-config: auto-detects model size and tunes defaults — works on 2B and 70B alike
  • Inspectable: agent.trace() shows exactly what happened
  • Fast: ~2% faster than the raw Ollama API in our benchmarks (HTTP connection reuse)
  • Honest: real benchmark data in this README, not marketing

Quick Start

CLI

# One-shot query with streaming
freeagent ask qwen3:8b "What's the capital of France?"

# Interactive chat
freeagent chat qwen3:8b

# List available models
freeagent models

Python

from freeagent import Agent

agent = Agent(model="qwen3:8b")
print(agent.run("What is Python?"))

Streaming

Real token-by-token streaming, even for tool-using agents:

from freeagent import Agent
from freeagent.events import TokenEvent, ToolCallEvent, ToolResultEvent

agent = Agent(model="qwen3:8b", tools=[weather])

for event in agent.run_stream("What's the weather in Tokyo?"):
    if isinstance(event, TokenEvent):
        print(event.text, end="", flush=True)
    elif isinstance(event, ToolCallEvent):
        print(f"\n[Calling {event.name}...]")
    elif isinstance(event, ToolResultEvent):
        print(f"[{event.name} -> {'ok' if event.success else 'fail'} ({event.duration_ms:.0f}ms)]")

Async version: async for event in agent.arun_stream("query"):

Event types: RunStartEvent, TokenEvent, ToolCallEvent, ToolResultEvent, ValidationErrorEvent, RetryEvent, IterationEvent, RunCompleteEvent.
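Consuming the async stream follows the same pattern as the sync loop. A self-contained sketch, using a stand-in async generator in place of agent.arun_stream (the fake_stream generator below is illustrative, not part of the SDK):

```python
import asyncio

# Stand-in for agent.arun_stream(): yields plain strings instead of SDK events.
async def fake_stream(tokens):
    for t in tokens:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield t

async def main():
    chunks = []
    async for token in fake_stream(["Hello", ", ", "world"]):
        chunks.append(token)
    return "".join(chunks)

print(asyncio.run(main()))  # Hello, world
```

With the real SDK you would replace fake_stream(...) with agent.arun_stream("query") and branch on the event types listed above.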

Custom Tools

from freeagent import Agent, tool

@tool
def weather(city: str) -> dict:
    """Get current weather for a city."""
    return {"city": city, "temp": 72, "condition": "sunny"}

agent = Agent(model="qwen3:8b", tools=[weather])
print(agent.run("What's the weather in Portland?"))
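Under the hood, a @tool-style decorator typically derives a tool spec from the function's signature and docstring. A hypothetical sketch of that idea using only the standard library (not FreeAgent's actual implementation):

```python
import inspect

def describe_tool(fn):
    """Build a minimal tool spec from a function's signature and docstring."""
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": params,
    }

def weather(city: str) -> dict:
    """Get current weather for a city."""
    return {"city": city, "temp": 72, "condition": "sunny"}

spec = describe_tool(weather)
print(spec["name"], spec["parameters"])  # weather {'city': 'str'}
```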

Multi-Turn Conversations

agent = Agent(model="qwen3:8b", tools=[weather])
agent.run("What's the weather in Tokyo?")
agent.run("Convert that to Celsius")  # remembers Tokyo was 85°F

Strategies

from freeagent import Agent, SlidingWindow, TokenWindow

# Default: SlidingWindow(max_turns=20)
agent = Agent(model="qwen3:8b")

# Token-based budget (better for small context models)
agent = Agent(model="qwen3:4b", conversation=TokenWindow(max_tokens=3000))

# Stateless mode (each run independent)
agent = Agent(model="qwen3:8b", conversation=None)
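As an illustration of what these strategies do, here is a minimal, hypothetical trim for each (the real SlidingWindow and TokenWindow may differ in detail):

```python
def sliding_window(turns, max_turns=20):
    """Keep only the most recent max_turns turns of the conversation."""
    return turns[-max_turns:]

def token_window(turns, max_tokens, count_tokens=lambda s: len(s.split())):
    """Keep the newest turns whose combined token estimate fits the budget."""
    kept, total = [], 0
    for turn in reversed(turns):          # walk newest-first
        total += count_tokens(turn)
        if total > max_tokens:
            break
        kept.append(turn)
    return kept[::-1]                     # restore chronological order

history = [f"turn {i}" for i in range(25)]
print(len(sliding_window(history)))            # 20
print(token_window(["a b", "c d e", "f"], 4))  # ['c d e', 'f']
```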

Session Persistence

agent = Agent(model="qwen3:8b", session="my-chat")
agent.run("Hello!")
# Later, in a new process:
agent = Agent(model="qwen3:8b", session="my-chat")  # restores conversation
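Session persistence amounts to serializing conversation turns to disk keyed by the session name. A minimal sketch of the idea (hypothetical file layout, not FreeAgent's actual storage format):

```python
import json
import pathlib
import tempfile

def save_session(path, turns):
    """Write conversation turns to a JSON file."""
    path.write_text(json.dumps(turns))

def load_session(path):
    """Restore turns from disk, or start fresh if the session is new."""
    return json.loads(path.read_text()) if path.exists() else []

session_file = pathlib.Path(tempfile.mkdtemp()) / "my-chat.json"
save_session(session_file, [{"role": "user", "content": "Hello!"}])
restored = load_session(session_file)
print(restored[0]["content"])  # Hello!
```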

Inspecting Runs

Every run is fully traced. See exactly what happened:

agent.run("What's 347 * 29?")

# One-line summary
print(agent.last_run.summary())
# Run 1: qwen3:8b (native) 2300ms, 2 iters, 1 tools

# Full timeline
print(agent.trace())
# +     0ms  model_call_start     iter=0
# +   800ms  tool_call            calc(expression='347*29')
# +   802ms  tool_result          calc -> ok (2ms)
# +   803ms  model_call_start     iter=1

# Markdown report
print(agent.last_run.to_markdown())

Model-Aware Defaults

FreeAgent auto-detects model capabilities from Ollama and tunes itself:

# Auto-tuned: detects 2B model, strips skills and memory tool
agent = Agent(model="gemma4:e2b")

# Auto-tuned: detects 8B model, keeps full defaults
agent = Agent(model="qwen3:8b")

# Override auto-tuning
agent = Agent(model="gemma4:e2b", bundled_skills=True, memory_tool=True)

# Disable auto-tuning entirely
agent = Agent(model="qwen3:8b", auto_tune=False)

Access detected info: agent.model_info.parameter_count, agent.model_info.context_length, agent.model_info.capabilities.
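The SDK reports that it queries Ollama for model metadata. As a rough illustration of size-based tuning, a hypothetical fallback heuristic that parses the parameter count out of the model tag might look like this (not the SDK's actual detection logic):

```python
import re

def parameter_count_from_tag(tag):
    """Rough fallback: parse a parameter count like '8b' out of a model tag."""
    m = re.search(r"(\d+(?:\.\d+)?)b", tag.lower())
    return float(m.group(1)) * 1e9 if m else None

def tune_defaults(params):
    """Strip heavyweight extras (skills, memory tool) for small (<4B) models."""
    small = params is not None and params < 4e9
    return {"bundled_skills": not small, "memory_tool": not small}

print(tune_defaults(parameter_count_from_tag("qwen3:8b")))
# {'bundled_skills': True, 'memory_tool': True}
print(tune_defaults(parameter_count_from_tag("gemma4:e2b")))
# {'bundled_skills': False, 'memory_tool': False}
```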

Skills (Markdown Prompt Extensions)

---
name: nba-analyst
description: Basketball statistics expert
tools: [search, calculator]
---

You are an NBA analyst. Always cite your sources.
When comparing players, use per-game averages.

Point the agent at the directory containing your skill files:

agent = Agent(model="qwen3:8b", tools=[search, calculator], skills=["./my-skills"])

Bundled skills load automatically. User skills extend them — duplicate names override.
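A skill file is just frontmatter plus a prompt body, so parsing and name-based override can be sketched in a few lines (illustrative only, not the SDK's loader):

```python
def parse_skill(text):
    """Split a skill file into a frontmatter dict and a prompt body."""
    _, fm, body = text.split("---", 2)
    meta = {k.strip(): v.strip() for k, v in
            (line.split(":", 1) for line in fm.strip().splitlines())}
    return meta, body.strip()

def merge_skills(bundled, user):
    """User skills extend bundled ones; duplicate names override."""
    merged = {s["name"]: s for s in bundled}
    merged.update({s["name"]: s for s in user})
    return list(merged.values())

meta, body = parse_skill(
    "---\nname: nba-analyst\ndescription: Basketball statistics expert\n---\n"
    "You are an NBA analyst."
)
print(meta["name"], "->", body)  # nba-analyst -> You are an NBA analyst.
```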

Memory (Markdown-Backed)

Every agent has built-in memory stored as human-readable .md files:

.freeagent/memory/
├── MEMORY.md          # Index
├── user.md            # auto_load: true → in system prompt
├── facts.md           # Accumulated facts
└── 2026-04-05.md      # Daily log

The agent gets a memory tool with actions: read, write, append, search, list. Only the index and auto_load files go into the system prompt — everything else is on demand.
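Selecting which memory files enter the system prompt could work by checking each file's frontmatter for auto_load, plus always including the index. A sketch of that idea (hypothetical, not the SDK's code):

```python
import pathlib
import tempfile

def auto_load_files(memory_dir):
    """Return the index plus memory files whose frontmatter sets auto_load: true."""
    selected = []
    for path in sorted(memory_dir.glob("*.md")):
        if path.name == "MEMORY.md" or "auto_load: true" in path.read_text():
            selected.append(path.name)
    return selected

memory = pathlib.Path(tempfile.mkdtemp())
(memory / "MEMORY.md").write_text("# Index")
(memory / "user.md").write_text("---\nauto_load: true\n---\nName: Ada")
(memory / "facts.md").write_text("Water boils at 100C")
print(auto_load_files(memory))  # ['MEMORY.md', 'user.md']
```

Everything not selected here would stay on disk, reachable only through the memory tool's read/search actions.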

Multi-Provider Support

from freeagent import Agent, VLLMProvider, OpenAICompatProvider

# vLLM
provider = VLLMProvider(model="qwen3-8b")
agent = Agent(model="qwen3-8b", provider=provider, tools=[my_tool])

# Any OpenAI-compatible server
provider = OpenAICompatProvider(model="llama3.1:8b", base_url="http://localhost:1234")
agent = Agent(model="llama3.1:8b", provider=provider, tools=[my_tool])

Telemetry

Built-in, always on:

agent.run("What's the weather?")
print(agent.metrics)               # quick summary
print(agent.metrics.tool_stats())  # per-tool breakdown
agent.metrics.to_json("m.json")   # export
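A per-tool breakdown like metrics.tool_stats() can be derived from raw call records. An illustrative sketch of the aggregation (not the SDK's internals):

```python
from collections import defaultdict

def tool_stats(calls):
    """Aggregate raw (tool, duration_ms, success) records per tool."""
    stats = defaultdict(lambda: {"calls": 0, "failures": 0, "total_ms": 0.0})
    for name, duration_ms, success in calls:
        s = stats[name]
        s["calls"] += 1
        s["total_ms"] += duration_ms
        s["failures"] += 0 if success else 1
    return {name: {**s, "avg_ms": s["total_ms"] / s["calls"]}
            for name, s in stats.items()}

calls = [("weather", 120.0, True), ("weather", 80.0, True), ("calc", 2.0, False)]
print(tool_stats(calls)["weather"]["avg_ms"])  # 100.0
```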

Optional OpenTelemetry: pip install freeagent-sdk[otel]

MCP Support

from freeagent.mcp import connect

async with connect("npx -y @modelcontextprotocol/server-filesystem /tmp") as tools:
    agent = Agent(model="qwen3:8b", tools=tools)
    result = await agent.arun("List files in /tmp")

Install with: pip install freeagent-sdk[mcp]

Real Performance

Tested against the raw Ollama API with the same eval suite (100+ cases, 4 models). Full data in evaluation/.

Multi-Turn Conversations (6 conversations, 15 turns)

| Model | Raw Ollama | FreeAgent |
|---|---|---|
| qwen3:8b | 93% | 87% |
| qwen3:4b | 93% | 87% |
| llama3.1:8b | 87% | 80% |
| gemma4:e2b (2B) | N/A | 80% |

Tool Calling Accuracy (8 cases)

| Model | Raw Ollama | FreeAgent |
|---|---|---|
| qwen3:8b | 75% | 75% |
| qwen3:4b | 100% | 88% |
| llama3.1:8b | 62% | 75% (+13%) |

Streaming Latency (median of 3 runs)

| Model | Chat TTFT | Chat Total | Tool TTFT | Tool Total |
|---|---|---|---|---|
| qwen3:8b | 12.8s | 13.9s | 5.2s | 10.0s |
| qwen3:4b | 14.7s | 14.5s | 28.2s | 31.6s |
| llama3.1:8b | 1.5s | 1.4s | 1.8s | 2.1s |
| gemma4:e2b | 4.7s | 5.1s | 8.2s | 12.1s |

TTFT ≈ total for chat (generation is fast once started). Tool TTFT includes tool execution round-trip.

Auto-Tune (v0.3.1)

| Model | auto_tune=True | All On | Manual Strip | Delta vs All On |
|---|---|---|---|---|
| qwen3:8b | 91% | 91% | — | +0% |
| qwen3:4b | 91% | 91% | — | +0% |
| llama3.1:8b | 100% | 100% | — | +0% |
| gemma4:e2b | 91% | 55% | 73% | +36% |

Auto-tune detects gemma4:e2b as a small model and strips bundled skills + memory tool. This improves accuracy from 55% → 91%.

Honest Caveats

  • Guardrails rarely fire: 0/40 real rescues in adversarial testing. Modern models handle fuzzy names and type coercion natively.
  • Multi-turn gap to raw Ollama is noise: 87% vs 93% — re-running failures produces passes. Non-deterministic.
  • Skills help qwen3:4b but hurt gemma4:e2b — fixed by auto-tune, which strips them for small models.
  • Streaming TTFT ≈ total time on small models: generation is fast, model thinking dominates latency.

Full analysis: evaluation/THESIS_ANALYSIS.md

Tested Models

| Model | Size | Mode | Reliability |
|---|---|---|---|
| Qwen3 8B | 8.2B | Native | Very Good |
| Qwen3 4B | 4.0B | Native | Good (best with skills) |
| Llama 3.1 8B | 8.0B | Native | Good |
| Gemma4 E2B | 5.1B | Native | Good (auto-tuned) |

Requirements

  • Python 3.11+
  • Ollama running locally (ollama serve)
  • A model pulled (ollama pull qwen3:8b)

Documentation

  • Tutorial — 5-minute walkthrough from install to working agent
  • Website — landing page and feature overview
  • Examples — runnable scripts covering tools, memory, hooks, MCP
  • Evaluation data — benchmark results and thesis analysis
  • Changelog — release history
  • Contributing — how to run tests, add skills, submit PRs

License

MIT
