Marlo

Open-source agent learning platform. Capture agent behavior, evaluate outcomes, and turn failures into learnings that make your agents better over time.

The Problem

AI agents fail silently in production. The same mistakes repeat across runs because there is no feedback loop -- failures are logged and forgotten. Marlo closes this gap.

How It Works

1. Capture    Record every LLM call, tool call, and outcome
2. Evaluate   Score task results automatically using Gemini
3. Learn      Generate structured learnings from failures
4. Apply      Inject learnings into future agent runs

Your agent gets smarter with every failure instead of repeating the same mistakes.
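
The four steps above can be pictured as a small loop. The following is a minimal offline sketch, not Marlo's implementation: `LearningStore`, `evaluate`, and `run_once` are hypothetical names, and a keyword check stands in for the Gemini-based scoring that Marlo performs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class LearningStore:
    """Toy in-memory stand-in for per-agent learnings (hypothetical)."""
    by_agent: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, agent: str, learning: str) -> None:
        bucket = self.by_agent.setdefault(agent, [])
        if learning not in bucket:  # do not store the same lesson twice
            bucket.append(learning)

    def text_for(self, agent: str) -> str:
        return "\n".join(f"- {item}" for item in self.by_agent.get(agent, []))


def evaluate(output: str, expected_keyword: str) -> float:
    """Hypothetical scorer: 1.0 if the keyword appears, else 0.0.

    Marlo scores with Gemini; a keyword check keeps this sketch offline.
    """
    return 1.0 if expected_keyword.lower() in output.lower() else 0.0


def run_once(
    store: LearningStore,
    agent: str,
    base_prompt: str,
    task_input: str,
    respond: Callable[[str, str], str],
    expected_keyword: str,
) -> dict:
    # Apply: prepend any stored learnings to the system prompt.
    learnings = store.text_for(agent)
    prompt = base_prompt
    if learnings:
        prompt = f"{base_prompt}\n\nLearnings:\n{learnings}"

    # Capture: record what the agent produced for this input.
    output = respond(prompt, task_input)

    # Evaluate: score the outcome.
    score = evaluate(output, expected_keyword)

    # Learn: turn a failure into a stored lesson for the next run.
    if score < 1.0:
        store.add(agent, f"When asked '{task_input}', mention '{expected_keyword}'.")

    return {"prompt": prompt, "output": output, "score": score}
```

Calling `run_once` twice with the same store shows the intended effect: the first failure leaves a learning behind, and the second run starts with that learning already in its prompt.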

Quick Start

Three commands to get running:

pip install marlo-sdk
export GEMINI_API_KEY=your-key    # https://aistudio.google.com/apikey
marlo run                          # starts server + dashboard + postgres

The dashboard is served at localhost:5173 and the API at localhost:8000. Docker is required.

To stop the stack, run marlo stop.

Python SDK

pip install marlo-sdk

import marlo

marlo.init()

# Register your agent
marlo.agent(
    name="support-bot",
    system_prompt="You are a customer support agent.",
    tools=[{"name": "lookup_order", "description": "Find order by ID"}],
)

# Track a task
with marlo.task(thread_id="thread-123", agent="support-bot") as t:
    t.input("Where is my order?")

    # Retrieve learnings from past failures
    learnings = t.get_learnings()
    # Inject into your prompt: system_prompt += learnings["learnings_text"]

    t.tool("lookup_order", {"id": "ORD-456"}, {"status": "shipped"})
    t.output("Your order has been shipped and arrives tomorrow.")

marlo.shutdown()
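
The comment in the example above suggests appending `learnings["learnings_text"]` to the system prompt. One way to do that defensively is sketched below. `build_system_prompt` is a hypothetical helper, not part of the SDK, and it assumes `get_learnings()` may return `None` or a dict whose `learnings_text` is missing or empty when no learnings exist yet.

```python
from typing import Optional


def build_system_prompt(base_prompt: str, learnings: Optional[dict]) -> str:
    """Append learnings text to a base prompt, if any is available.

    Assumption: `learnings` is the value returned by t.get_learnings() and
    carries the text under the "learnings_text" key, as in the README comment.
    """
    text = ((learnings or {}).get("learnings_text") or "").strip()
    if not text:
        # Nothing learned yet: leave the prompt untouched.
        return base_prompt
    return f"{base_prompt}\n\nLessons from earlier runs:\n{text}"
```

Inside the `with marlo.task(...)` block this would be called as `build_system_prompt("You are a customer support agent.", t.get_learnings())` before the LLM call is made.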

Multi-Agent

with marlo.task(thread_id="thread-1", agent="orchestrator") as parent:
    parent.input("Research AI trends and write a report")

    with parent.child(agent="researcher") as child:
        child.input("Search for recent AI developments")
        child.output("Found 3 relevant sources...")

    parent.output("Report complete.")
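
To make the parent/child relationship concrete, here is a toy model of the nesting that the example above produces. It is only an illustration under stated assumptions: `TaskNode` and `outline` are hypothetical and do not reflect how Marlo stores trajectories.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class TaskNode:
    """Hypothetical in-memory record of one task in a multi-agent run."""
    agent: str
    input_text: Optional[str] = None
    output_text: Optional[str] = None
    children: List["TaskNode"] = field(default_factory=list)

    def input(self, text: str) -> None:
        self.input_text = text

    def output(self, text: str) -> None:
        self.output_text = text

    @contextmanager
    def child(self, agent: str) -> Iterator["TaskNode"]:
        # A child task is attached to its parent as soon as it starts.
        node = TaskNode(agent=agent)
        self.children.append(node)
        yield node

    def outline(self, depth: int = 0) -> List[str]:
        # Render the tree as indented "agent: output" lines.
        lines = [f"{'  ' * depth}{self.agent}: {self.output_text}"]
        for node in self.children:
            lines.extend(node.outline(depth + 1))
        return lines


parent = TaskNode(agent="orchestrator")
parent.input("Research AI trends and write a report")
with parent.child(agent="researcher") as child:
    child.input("Search for recent AI developments")
    child.output("Found 3 relevant sources...")
parent.output("Report complete.")
```

Each child's events stay grouped under the orchestrator task that spawned it, which is the shape a trace viewer needs in order to show who delegated what.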

TypeScript SDK

npm install @marshmallo/marlo

import { init, agent, task, shutdown } from "@marshmallo/marlo";

await init();

agent("support-bot", "You are a customer support agent.", [
  { name: "lookup_order", description: "Find order by ID" },
]);

const t = task("thread-123", "support-bot").start();
t.input("Where is my order?");

const learnings = await t.getLearnings();
// Inject learnings into your prompt if available

t.tool("lookup_order", { id: "ORD-456" }, { status: "shipped" });
t.output("Your order has been shipped and arrives tomorrow.");
t.end();

await shutdown();

Dashboard

The dashboard gives you visibility into your agents:

  • Traces -- browse sessions, tasks, LLM calls, and tool calls
  • Learnings -- review, approve, or reject generated learnings
  • Agents -- see registered agents, their tools, and model configs
  • Rewards -- track evaluation scores over time
  • Copilot -- ask questions about your agent data in natural language

No login is required. The dashboard runs locally at localhost:5173.

Architecture

SDK (Python/TS)  --->  FastAPI Server  --->  PostgreSQL
                            |
                        Dashboard

The SDK captures events and sends them to the server over HTTP. The server stores trajectories in PostgreSQL, runs evaluation with Gemini, and generates learnings. The dashboard reads from the same server.

marlo run starts all three services via Docker Compose.

Configuration

Variable          Required   Default                  Description
GEMINI_API_KEY    Yes        --                       Google Gemini key for evaluation
MARLO_ENDPOINT    No         http://localhost:8000    Server URL
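
A preflight check for these two variables might look like the sketch below. `resolve_settings` is a hypothetical helper rather than an SDK function; it only mirrors the table above (one required key, one optional URL with a default) and makes no claim about how Marlo itself reads its environment.

```python
from typing import Mapping

DEFAULT_ENDPOINT = "http://localhost:8000"  # default from the table above


def resolve_settings(env: Mapping[str, str]) -> dict:
    """Validate the documented variables and fill in the default endpoint."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required for evaluation")

    # An unset or blank MARLO_ENDPOINT falls back to the documented default.
    endpoint = env.get("MARLO_ENDPOINT", "").strip() or DEFAULT_ENDPOINT
    return {"gemini_api_key": api_key, "endpoint": endpoint.rstrip("/")}
```

In practice it would be called with `os.environ`; taking a plain mapping keeps the function easy to test.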

Repository Structure

marlo/              Python SDK + FastAPI server
sdks/typescript/    TypeScript SDK (@marshmallo/marlo on npm)
dashboard/          React dashboard (Vite + TanStack Router)
docker/             Docker Compose config for local stack
tests/              Python test suite

Releases

Releases are tag-driven. Bump the version, merge to main, then tag:

# Python SDK
git tag v0.2.0 && git push origin v0.2.0

# TypeScript SDK
git tag ts-v0.2.0 && git push origin ts-v0.2.0

Tags trigger CI to publish to PyPI/npm and create a GitHub Release automatically.

Package              Registry   Tag format
marlo-sdk            PyPI       v*.*.*
@marshmallo/marlo    npm        ts-v*.*.*
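
The tag formats above decide which registry a release goes to. A small sketch of that mapping follows. `registry_for_tag` is a hypothetical helper, and it assumes the three `*` fields are numeric version components; the actual CI glob may accept more than that.

```python
import re
from typing import Optional

# Assumption: each "*" in the tag format is a run of digits.
PYTHON_TAG = re.compile(r"^v\d+\.\d+\.\d+$")
TYPESCRIPT_TAG = re.compile(r"^ts-v\d+\.\d+\.\d+$")


def registry_for_tag(tag: str) -> Optional[str]:
    """Return the registry a tag would publish to, or None if it matches neither format."""
    if PYTHON_TAG.match(tag):
        return "PyPI"
    if TYPESCRIPT_TAG.match(tag):
        return "npm"
    return None
```

Running a check like this before pushing a tag can catch a missing `ts-` prefix, which would otherwise target the wrong package.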

Contributing

See CONTRIBUTING.md.

License

MIT
