AI Agent SDK — observability, memory, and analytics for LLM applications. Provider-agnostic. Tracks token usage, tool calls, conversations, and enables shared team memory.

Pentatonic

AI Agent SDK

Observability, memory, and analytics for LLM applications.
Run locally or use hosted TES. JavaScript & Python.


Overview

Two ways to use the SDK:

Local Memory -- Run a fully private memory system on your own machine. PostgreSQL + pgvector + Ollama in Docker. No API keys, no cloud. Your agent gets persistent, searchable memory backed by multi-signal retrieval and HyDE query expansion.

Hosted TES -- Connect to Pentatonic's Thing Event System for production-grade observability, higher-dimensional embeddings, conversation analytics, and team-wide shared memory.

Both paths work with Claude Code and OpenClaw. The plugins auto-search on every prompt and auto-store every conversation turn.

Local Memory (self-hosted)

Run the full memory stack locally. Requires Docker and ~4GB disk for models.

1. Set up

npx @pentatonic-ai/ai-agent-sdk memory

This starts PostgreSQL + pgvector, Ollama, and the memory server. It pulls embedding and chat models, and writes the local config.

2. Install the Claude Code plugin

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

That's it. The plugin hooks automatically search memories on every prompt and store every conversation turn. Fully local, fully private.

What you get

  • Automatic memory -- every conversation turn is stored with embeddings and HyDE query expansion
  • Semantic search -- multi-signal retrieval combining vector similarity, BM25 full-text, recency decay, and access frequency
  • Memory layers -- episodic (recent), semantic (consolidated), procedural (how-to), working (temporary)
  • Distilled memory -- a background LLM pass extracts atomic facts from each raw turn and stores each as its own node in the semantic layer, linked back to the source. A query like "what does Phil drink?" matches "Phil drinks cortado" more reliably than a mixed paragraph covering food, drinks, and hobbies. Default-on; the raw turn is still preserved.
  • Decay and consolidation -- memories fade over time; frequently accessed ones get promoted
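
To make the multi-signal retrieval described above more concrete, here is a minimal Python sketch of how the four signals could be blended into a single ranking score. The weights, the 30-day half-life, and all function names are illustrative assumptions, not the SDK's actual values or API.

```python
import math

# Hypothetical weights; the SDK's real weighting is not documented here.
WEIGHTS = {"vector": 0.5, "bm25": 0.3, "recency": 0.15, "frequency": 0.05}


def recency_score(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


def frequency_score(access_count: int) -> float:
    """Squash access counts into [0, 1) so heavy hitters cannot dominate."""
    return 1.0 - 1.0 / (1.0 + math.log1p(access_count))


def combined_score(vector_sim: float, bm25_norm: float,
                   age_days: float, access_count: int) -> float:
    """Weighted sum of signals, each assumed to be normalised to [0, 1]."""
    return (
        WEIGHTS["vector"] * vector_sim
        + WEIGHTS["bm25"] * bm25_norm
        + WEIGHTS["recency"] * recency_score(age_days)
        + WEIGHTS["frequency"] * frequency_score(access_count)
    )
```

With these assumed weights, two memories with identical vector and BM25 scores are ranked by age first and access count second, which is one plausible reading of "recency decay" and "access frequency".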

Change models

EMBEDDING_MODEL=mxbai-embed-large LLM_MODEL=qwen2.5:7b npx @pentatonic-ai/ai-agent-sdk memory

Raspberry Pi

Pi 5 with 8GB RAM runs the full stack. nomic-embed-text (~300MB) + llama3.2:3b (~2GB) leaves plenty of headroom.

Use as a library

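import pg from 'pg';
import { createMemorySystem } from '@pentatonic-ai/ai-agent-sdk/memory';

// Assumption: db takes a node-postgres pool; DATABASE_URL points at your pgvector-enabled database.
const pgPool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

const memory = createMemorySystem({
  db: pgPool,
  embedding: { url: 'http://localhost:11434/v1', model: 'nomic-embed-text' },
  llm: { url: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
});

// Run migrations and create the memory layers for this client, then store and search.
await memory.migrate();
await memory.ensureLayers('my-app');
await memory.ingest('User prefers dark mode', { clientId: 'my-app' });
const results = await memory.search('preferences', { clientId: 'my-app' });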

Hosted TES

Connect to Pentatonic's hosted infrastructure for production use.

1. Create an account

npx @pentatonic-ai/ai-agent-sdk init

This walks you through account creation, email verification, and API key generation. You'll get:

TES_ENDPOINT=https://your-company.api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx
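
As a small illustration of consuming these three values from Python, the sketch below reads them from the environment and fails early if any is missing. The helper name load_tes_config is hypothetical and not part of the SDK; the returned keys are chosen to match the keyword arguments shown in the Python TESClient example.

```python
import os

REQUIRED_VARS = ("TES_ENDPOINT", "TES_CLIENT_ID", "TES_API_KEY")


def load_tes_config(env=None) -> dict:
    """Return the three TES settings, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing TES settings: " + ", ".join(missing))
    return {
        "endpoint": env["TES_ENDPOINT"].rstrip("/"),  # drop a trailing slash
        "client_id": env["TES_CLIENT_ID"],
        "api_key": env["TES_API_KEY"],
    }
```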

2. Install

npm install @pentatonic-ai/ai-agent-sdk
pip install pentatonic-ai-agent-sdk

What you get (in addition to local features)

  • Higher-dimensional embeddings -- NV-Embed-v2 (4096d) for better retrieval accuracy
  • Conversation analytics -- session metrics, search attribution, dead-end detection
  • Team-wide shared memory -- semantic search across your team's AI interactions
  • Admin dashboard -- visualize conversations, token usage, and memory explorer
  • Multi-tenancy -- isolated databases per client

Claude Code Plugin

Works with both local and hosted setups. Install once, switch modes via config.

Install via marketplace

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

Set up

For hosted TES:

/tes-memory:tes-setup

For local memory:

npx @pentatonic-ai/ai-agent-sdk memory

What it tracks

  • Every conversation turn -- user messages, assistant responses, tool calls, duration
  • Automatic memory search -- relevant memories injected as context on every prompt
  • Automatic memory storage -- every turn stored with embeddings and HyDE queries
  • Token usage -- input, output, cache read, cache creation tokens per turn

OpenClaw Plugin

Works with both local and hosted setups. Just tell OpenClaw to set it up.

Install

openclaw plugins install @pentatonic-ai/openclaw-memory-plugin

Set up

Tell OpenClaw:

Set up pentatonic memory

The agent will ask whether you want local (private, Docker-based) or hosted (Pentatonic TES cloud), then walk you through the rest. For hosted mode, it handles account creation, email verification, and API key generation conversationally.

Or use the CLI directly:

openclaw pentatonic-memory local

What it does

OpenClaw's context engine hooks fire on every lifecycle event:

  • Ingest -- every user and assistant message is stored with embeddings and HyDE query expansion, then distilled into atomic facts in the background (see Distilled memory)
  • Assemble -- relevant memories are injected as system prompt context before every model run
  • Compact -- decay cycle runs when the context window fills
  • After turn -- high-access memories get consolidated to the semantic layer

Plus agent-callable tools: memory_search, memory_store, memory_layers.
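
The Compact and After turn hooks can be pictured with a toy model. The Python sketch below is only an illustration of decay and promotion under assumed thresholds; the Memory class, the decay factor, the floor, and the access threshold are all hypothetical and do not reflect the plugin's internals.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    layer: str = "episodic"
    strength: float = 1.0
    access_count: int = 0


def decay(memories, factor=0.9, floor=0.05):
    """Compact step: weaken every memory and drop those below the floor."""
    kept = []
    for m in memories:
        m.strength *= factor
        if m.strength >= floor:
            kept.append(m)
    return kept


def consolidate(memories, min_accesses=3):
    """After-turn step: promote frequently accessed episodic memories."""
    promoted = []
    for m in memories:
        if m.layer == "episodic" and m.access_count >= min_accesses:
            m.layer = "semantic"
            promoted.append(m)
    return promoted
```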

Configuration

After setup, config lives in ~/.openclaw/pentatonic-memory.json. To switch modes, run setup again or edit the file directly.

You can also configure via openclaw.json:

{
  "plugins": {
    "slots": { "contextEngine": "pentatonic-memory" },
    "entries": {
      "pentatonic-memory": {
        "enabled": true,
        "config": {
          "database_url": "postgres://memory:memory@localhost:5433/memory",
          "embedding_url": "http://localhost:11435/v1",
          "embedding_model": "nomic-embed-text",
          "llm_url": "http://localhost:11435/v1",
          "llm_model": "llama3.2:3b"
        }
      }
    }
  }
}

For hosted mode, replace the config block with:

{
  "tes_endpoint": "https://your-company.api.pentatonic.com",
  "tes_client_id": "your-company",
  "tes_api_key": "tes_your-company_xxxxx"
}
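
To show where the hosted block is meant to sit, the Python sketch below builds the full plugin section with the hosted settings in place of the local ones. It assumes "the config block" refers to the inner "config" object of the pentatonic-memory entry; the function name is hypothetical.

```python
import json


def hosted_openclaw_config(endpoint: str, client_id: str, api_key: str) -> dict:
    """Build the openclaw.json plugin section with the hosted config block."""
    return {
        "plugins": {
            "slots": {"contextEngine": "pentatonic-memory"},
            "entries": {
                "pentatonic-memory": {
                    "enabled": True,
                    # Assumption: hosted keys replace the local database/model keys.
                    "config": {
                        "tes_endpoint": endpoint,
                        "tes_client_id": client_id,
                        "tes_api_key": api_key,
                    },
                }
            },
        }
    }


print(json.dumps(hosted_openclaw_config(
    "https://your-company.api.pentatonic.com",
    "your-company",
    "tes_your-company_xxxxx",
), indent=2))
```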

SDK: Wrap Your LLM Client

JavaScript

import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

// Wrap the provider client; calls on the wrapped client are tracked under this session.
const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

# Wrap the provider client; calls on the wrapped client are tracked under this session.
ai = tes.wrap(OpenAI(), session_id="conv-123")
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Supported Providers

Provider     Detection                        Intercepted Method
OpenAI       client.chat.completions.create   chat.completions.create()
Anthropic    client.messages.create           messages.create()
Workers AI   client.run (JS only)             run()

All other methods pass through unchanged.
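
The detection column suggests duck typing: the wrapper looks for a known method path on the client instead of checking its class. The Python sketch below illustrates that idea under this assumption; detect_provider and _has_path are hypothetical names, and the stand-in clients exist only so the example runs without any provider SDK installed.

```python
from types import SimpleNamespace
from typing import Optional


def _has_path(obj, dotted: str) -> bool:
    """True if obj exposes the dotted attribute path and the leaf is callable."""
    for part in dotted.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return callable(obj)


def detect_provider(client) -> Optional[str]:
    """Mirror the detection column of the table; None means pass-through."""
    if _has_path(client, "chat.completions.create"):
        return "openai"
    if _has_path(client, "messages.create"):
        return "anthropic"
    return None  # Workers AI (client.run) is listed as JS only


# Stand-in clients, not real provider objects.
fake_openai = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: None))
)
fake_anthropic = SimpleNamespace(messages=SimpleNamespace(create=lambda **kw: None))
```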

API Reference

TESClient(config)

Param              Type      Default    Description
clientId           string    required   Your tenant identifier
apiKey             string    required   TES API key
endpoint           string    required   TES instance URL
userId             string    null       User identifier for attribution
captureContent     boolean   true       Include message content in events
maxContentLength   number    4096       Truncate content beyond this length
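
The interaction between captureContent and maxContentLength can be sketched as follows. This Python example assumes the length is counted in characters, which the table does not specify; prepare_content is a hypothetical helper, not an SDK function.

```python
from typing import Optional


def prepare_content(text: str, capture_content: bool = True,
                    max_content_length: int = 4096) -> Optional[str]:
    """Return the content to attach to an event, or None when capture is off."""
    if not capture_content:
        return None
    # Assumption: truncation is by character count, not bytes or tokens.
    return text[:max_content_length]
```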

tes.wrap(client, opts?)

Returns an instrumented proxy. Every intercepted call emits a CHAT_TURN event.

Option      Type     Default               Description
sessionId   string   auto-generated UUID   Links events from the same conversation
metadata    object   {}                    Custom fields on every event
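
For readers curious how an instrumented proxy of this kind can work, here is a minimal Python sketch. It is not the SDK's implementation: RecordingProxy, the event field names, and the emit callback are assumptions made for illustration. It intercepts one dotted method path, emits an event after each call, and lets every other attribute pass through.

```python
import time
import uuid
from types import SimpleNamespace


class RecordingProxy:
    """Wrap an object; time calls on one dotted path and pass the rest through."""

    def __init__(self, target, intercept, emit, session_id=None, _path=""):
        self._target = target
        self._intercept = intercept      # e.g. "chat.completions.create"
        self._emit = emit                # callback receiving an event dict
        self._session_id = session_id or str(uuid.uuid4())
        self._path = _path

    def __getattr__(self, name):
        value = getattr(self._target, name)
        path = f"{self._path}.{name}" if self._path else name
        if path == self._intercept and callable(value):
            def instrumented(*args, **kwargs):
                started = time.perf_counter()
                result = value(*args, **kwargs)
                self._emit({
                    "type": "CHAT_TURN",
                    "session_id": self._session_id,
                    "duration_s": time.perf_counter() - started,
                })
                return result
            return instrumented
        if self._intercept.startswith(path + "."):
            # Still on the way to the intercepted method: keep proxying.
            return RecordingProxy(value, self._intercept, self._emit,
                                  self._session_id, path)
        return value  # everything else passes through unchanged


# Stand-in client so the sketch runs without a provider SDK.
events = []
client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: "ok")),
    models="model-list",
)
proxy = RecordingProxy(client, "chat.completions.create", events.append,
                       session_id="conv-123")
reply = proxy.chat.completions.create(model="gpt-4o")
```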

tes.session(opts?)

Returns a Session for manual event emission.

session.emitChatTurn({ userMessage, assistantResponse, turnNumber? })

Emits a CHAT_TURN event with the data accumulated so far, then resets the accumulated data.
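
The accumulate-then-reset behaviour can be illustrated with a toy class. In this Python sketch, SketchSession, its record method, and the event fields are hypothetical; the source does not document how data is accumulated, so treat this as one possible shape, not the SDK's Session.

```python
class SketchSession:
    """Toy session: collects per-turn data, emits one event, then resets."""

    def __init__(self, session_id, emit):
        self.session_id = session_id
        self._emit = emit
        self._reset()

    def _reset(self):
        self._tool_calls = []
        self._usage = {"input_tokens": 0, "output_tokens": 0}

    def record(self, usage=None, tool_calls=()):
        """Accumulate token usage and tool calls from one model round-trip."""
        for key, value in (usage or {}).items():
            self._usage[key] = self._usage.get(key, 0) + value
        self._tool_calls.extend(tool_calls)

    def emit_chat_turn(self, user_message, assistant_response, turn_number=None):
        """Emit everything gathered since the last emit, then start fresh."""
        self._emit({
            "type": "CHAT_TURN",
            "session_id": self.session_id,
            "user_message": user_message,
            "assistant_response": assistant_response,
            "turn_number": turn_number,
            "usage": dict(self._usage),
            "tool_calls": list(self._tool_calls),
        })
        self._reset()
```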

normalizeResponse(raw)

Standalone utility to normalize any LLM response:

import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";

const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
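
To illustrate what normalizing "any LLM response" might involve, the Python sketch below maps OpenAI-shaped and Anthropic-shaped response dictionaries onto the four fields shown above. It is an assumption-laden illustration, not the SDK's normalizeResponse: the function name, the snake_case output keys, and the choice to work on plain dicts are all hypothetical.

```python
def normalize_response(raw: dict) -> dict:
    """Map an OpenAI- or Anthropic-shaped response dict to one common shape."""
    usage = raw.get("usage") or {}
    if "choices" in raw:  # OpenAI chat completion shape
        message = raw["choices"][0]["message"]
        content = message.get("content") or ""
        tool_calls = message.get("tool_calls") or []
        tokens = (usage.get("prompt_tokens", 0), usage.get("completion_tokens", 0))
    else:  # Anthropic messages shape: content is a list of typed blocks
        blocks = raw.get("content") or []
        content = "".join(b["text"] for b in blocks if b.get("type") == "text")
        tool_calls = [b for b in blocks if b.get("type") == "tool_use"]
        tokens = (usage.get("input_tokens", 0), usage.get("output_tokens", 0))
    return {
        "content": content,
        "model": raw.get("model"),
        "usage": {"input_tokens": tokens[0], "output_tokens": tokens[1]},
        "tool_calls": tool_calls,
    }
```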

Health Checks (doctor)

Run a full health check of your SDK install at any time:

npx @pentatonic-ai/ai-agent-sdk doctor

doctor auto-detects which install path you're on (Local Memory, Hosted TES, or self-hosted Pentatonic platform) and runs only the checks that apply. Exit code is 0 for all-clear, 1 for warnings, 2 for critical.

Common flags:

npx @pentatonic-ai/ai-agent-sdk doctor --json     # machine-readable
npx @pentatonic-ai/ai-agent-sdk doctor --alert    # silent unless issues
npx @pentatonic-ai/ai-agent-sdk doctor --no-plugins
npx @pentatonic-ai/ai-agent-sdk doctor --path local
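
Because doctor reports its result through the exit code, it is straightforward to call from a script or scheduled job. The Python sketch below wraps the CLI with subprocess and maps the documented exit codes to labels. The helper names are hypothetical, and run_doctor needs Node and npx on the PATH.

```python
import subprocess

# Exit codes as documented for doctor: 0 all-clear, 1 warnings, 2 critical.
STATUS_BY_EXIT_CODE = {0: "ok", 1: "warning", 2: "critical"}


def classify(exit_code: int) -> str:
    """Translate a doctor exit code into a status label."""
    return STATUS_BY_EXIT_CODE.get(exit_code, "unknown")


def run_doctor(extra_args=()) -> str:
    """Run the doctor CLI and return its status label (requires Node and npx)."""
    cmd = ["npx", "@pentatonic-ai/ai-agent-sdk", "doctor", *extra_args]
    completed = subprocess.run(cmd, check=False)
    return classify(completed.returncode)
```

For example, run_doctor(["--alert"]) would stay quiet on a healthy install and return "ok", which suits a cron job that only notifies on "warning" or "critical".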

What gets checked:

  • Universal -- Node version, disk space, SDK config-file permissions
  • Local Memory -- Postgres + pgvector + migrations, embedding/LLM endpoints, memory server port
  • Hosted TES -- endpoint reachable, API key authenticates
  • Self-hosted platform -- HybridRAG, Qdrant, Neo4j, vLLM (each optional, skipped when its env var is unset)

Plugins

Drop a .mjs file into ~/.config/pentatonic-ai/doctor-plugins/ to add your own checks. This is useful for app-specific checks (internal APIs, ingest freshness, custom infrastructure) without forking the SDK.

// ~/.config/pentatonic-ai/doctor-plugins/my-app.mjs
export default {
  name: "my-app",
  checks: [
    {
      name: "internal API",
      severity: "warning",
      run: async () => {
        const res = await fetch("https://internal/health");
        return res.ok
          ? { ok: true, msg: "200 OK" }
          : { ok: false, msg: `HTTP ${res.status}` };
      },
    },
  ],
};

See packages/doctor/README.md for the full plugin contract and programmatic API.

Architecture

        +-------------------+     +--------------------+
        | Claude Code Plugin|     |  OpenClaw Plugin   |
        | (hooks: auto-     |     | (context engine:   |
        |  search + store)  |     |  ingest, assemble, |
        +--------+----------+     |  compact, tools)   |
                 |                +--------+-----------+
                 |                         |
                 +------------+------------+
                              |
                  +-----------+-----------+
                  |                       |
            Local Memory            Hosted TES
            (Docker)                (Cloud)
                  |                       |
       +----+----+----+          +---+----+---+
       |    |    |    |          |   |    |   |
      PG  Ollama MCP HTTP      PG  R2  Queue Workers
      pgvector        API     pgvector       Modules

License

MIT
