AI Agent SDK: observability, memory, and analytics for LLM applications. Provider-agnostic. Tracks token usage, tool calls, and conversations, and enables shared team memory.

Pentatonic

AI Agent SDK

Observability, memory, and analytics for LLM applications.
Run locally or use hosted TES. JavaScript & Python.


Overview

Two ways to use the SDK:

Local Memory -- Run a fully private memory system on your own machine. PostgreSQL + pgvector + Ollama in Docker. No API keys, no cloud. Your agent gets persistent, searchable memory backed by multi-signal retrieval and HyDE query expansion.

Hosted TES -- Connect to Pentatonic's Thing Event System for production-grade observability, higher-dimensional embeddings, conversation analytics, and team-wide shared memory.

Both paths work with Claude Code and OpenClaw. The plugins auto-search on every prompt and auto-store every conversation turn.

Local Memory (self-hosted)

Run the full memory stack locally. Requires Docker and ~4GB disk for models.

1. Set up

npx @pentatonic-ai/ai-agent-sdk memory

This starts PostgreSQL + pgvector, Ollama, and the memory server. It pulls embedding and chat models, and writes the local config.

2. Install the Claude Code plugin

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

That's it. The plugin hooks automatically search memories on every prompt and store every conversation turn. Fully local, fully private.

What you get

  • Automatic memory -- every conversation turn is stored with embeddings and HyDE query expansion
  • Semantic search -- multi-signal retrieval combining vector similarity, BM25 full-text, recency decay, and access frequency
  • Memory layers -- episodic (recent), semantic (consolidated), procedural (how-to), working (temporary)
  • Decay and consolidation -- memories fade over time; frequently accessed ones get promoted
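The multi-signal retrieval listed above can be pictured as a weighted blend of four per-memory scores. The sketch below is illustrative only: the weights, the 30-day half-life, the log-scaled frequency term, and every name in it are assumptions, not the SDK's actual ranking formula.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str
    vector_sim: float   # cosine similarity, assumed to lie in [0, 1]
    bm25: float         # raw BM25 full-text score, >= 0
    age_days: float     # time since the memory was stored
    access_count: int   # how often the memory has been retrieved


# Hypothetical weights; a real system would tune these.
WEIGHTS = {"vector": 0.5, "bm25": 0.3, "recency": 0.15, "frequency": 0.05}
HALF_LIFE_DAYS = 30.0


def recency_score(age_days: float, half_life: float = HALF_LIFE_DAYS) -> float:
    """Exponential decay: the score halves every `half_life` days."""
    return 0.5 ** (age_days / half_life)


def frequency_score(access_count: int, cap: int = 100) -> float:
    """Log-scaled access frequency, clamped to [0, 1]."""
    return min(math.log1p(access_count) / math.log1p(cap), 1.0)


def combined_score(c: Candidate, max_bm25: float) -> float:
    """Weighted sum of the four signals; BM25 is normalized by the batch maximum."""
    bm25_norm = c.bm25 / max_bm25 if max_bm25 > 0 else 0.0
    return (
        WEIGHTS["vector"] * c.vector_sim
        + WEIGHTS["bm25"] * bm25_norm
        + WEIGHTS["recency"] * recency_score(c.age_days)
        + WEIGHTS["frequency"] * frequency_score(c.access_count)
    )


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from most to least relevant."""
    max_bm25 = max((c.bm25 for c in candidates), default=0.0)
    return sorted(candidates, key=lambda c: combined_score(c, max_bm25), reverse=True)
```

Normalizing BM25 against the best score in the batch keeps it on the same 0 to 1 scale as the other signals, so no single signal dominates purely because of its units.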

Change models

EMBEDDING_MODEL=mxbai-embed-large LLM_MODEL=qwen2.5:7b npx @pentatonic-ai/ai-agent-sdk memory

Raspberry Pi

A Raspberry Pi 5 with 8GB RAM runs the full stack. nomic-embed-text (~300MB) plus llama3.2:3b (~2GB) leave plenty of headroom.

Use as a library

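import pg from 'pg';
import { createMemorySystem } from '@pentatonic-ai/ai-agent-sdk/memory';

// Connection pool for the PostgreSQL + pgvector database.
// DATABASE_URL is an assumed variable name; point it at your own instance.
const pgPool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

const memory = createMemorySystem({
  db: pgPool,
  embedding: { url: 'http://localhost:11434/v1', model: 'nomic-embed-text' },
  llm: { url: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
});

// Run migrations, then make sure the memory layers exist for this client.
await memory.migrate();
await memory.ensureLayers('my-app');

// Store a memory, then search for it.
await memory.ingest('User prefers dark mode', { clientId: 'my-app' });
const results = await memory.search('preferences', { clientId: 'my-app' });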

Hosted TES

Connect to Pentatonic's hosted infrastructure for production use.

1. Create an account

npx @pentatonic-ai/ai-agent-sdk init

This walks you through account creation, email verification, and API key generation. You'll get:

TES_ENDPOINT=https://your-company.api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx
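
Before constructing a client it can help to fail fast when one of these three variables is missing. The helper below is a minimal sketch, not part of the SDK: `load_tes_config` and the https check are assumptions made for illustration.

```python
import os
from typing import Dict, Mapping, Optional
from urllib.parse import urlparse

REQUIRED_VARS = ("TES_ENDPOINT", "TES_CLIENT_ID", "TES_API_KEY")


def load_tes_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the TES settings from the environment and validate them.

    Hypothetical helper: raises RuntimeError naming every missing variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing TES settings: " + ", ".join(missing))

    endpoint = env["TES_ENDPOINT"].rstrip("/")
    # Assumption: hosted endpoints are always served over HTTPS.
    if urlparse(endpoint).scheme != "https":
        raise RuntimeError("TES_ENDPOINT should be an https:// URL")

    return {
        "endpoint": endpoint,
        "client_id": env["TES_CLIENT_ID"],
        "api_key": env["TES_API_KEY"],
    }
```

The returned keys match the keyword arguments of the Python `TESClient` constructor, so the result can be passed as `TESClient(**load_tes_config())`.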

2. Install

npm install @pentatonic-ai/ai-agent-sdk
pip install pentatonic-ai-agent-sdk

What you get (in addition to local features)

  • Higher-dimensional embeddings -- NV-Embed-v2 (4096d) for better retrieval accuracy
  • Conversation analytics -- session metrics, search attribution, dead-end detection
  • Team-wide shared memory -- semantic search across your team's AI interactions
  • Admin dashboard -- visualize conversations, token usage, and memory explorer
  • Multi-tenancy -- isolated databases per client

Claude Code Plugin

Works with both local and hosted setups. Install once, switch modes via config.

Install via marketplace

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

Set up

For hosted TES:

/tes-memory:tes-setup

For local memory:

npx @pentatonic-ai/ai-agent-sdk memory

What it tracks

  • Every conversation turn -- user messages, assistant responses, tool calls, duration
  • Automatic memory search -- relevant memories injected as context on every prompt
  • Automatic memory storage -- every turn stored with embeddings and HyDE queries
  • Token usage -- input, output, cache read, cache creation tokens per turn
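
One plausible reading of "stored with embeddings and HyDE queries" is that, at storage time, an LLM proposes a few questions the memory would answer, and those questions are embedded alongside the content so that later searches phrased as questions land closer to it. The sketch below illustrates that idea under that assumption; it is not the SDK's implementation, and `generate`, `embed`, `MemoryRecord`, and the prompt are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Vector = List[float]

PROMPT = (
    "List {n} short questions that the following note would answer, "
    "one per line:\n\n{content}"
)


@dataclass
class MemoryRecord:
    content: str
    embedding: Vector
    hypothetical_queries: List[str] = field(default_factory=list)
    query_embeddings: List[Vector] = field(default_factory=list)


def build_record(
    content: str,
    generate: Callable[[str], str],   # prompt -> LLM text (e.g. a chat model)
    embed: Callable[[str], Vector],   # text -> embedding vector
    n_queries: int = 3,
) -> MemoryRecord:
    """Embed the content plus a few LLM-generated questions it could answer."""
    raw = generate(PROMPT.format(n=n_queries, content=content))

    queries: List[str] = []
    for line in raw.splitlines():
        question = line.strip().lstrip("-* ").strip()  # drop bullet markers
        if question:
            queries.append(question)
    queries = queries[:n_queries]

    return MemoryRecord(
        content=content,
        embedding=embed(content),
        hypothetical_queries=queries,
        query_embeddings=[embed(q) for q in queries],
    )
```

Passing `generate` and `embed` as plain callables keeps the sketch independent of any particular model server; with the local stack they would call the embedding and chat models served by Ollama.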

OpenClaw Plugin

Works with both local and hosted setups. Just tell OpenClaw to set it up.

Install

openclaw plugins install @pentatonic-ai/ai-agent-sdk

Set up

Tell OpenClaw:

Set up pentatonic memory

The agent will ask whether you want local (private, Docker-based) or hosted (Pentatonic TES cloud), then walk you through the rest. For hosted mode, it handles account creation, email verification, and API key generation conversationally.

Or use the CLI directly:

openclaw pentatonic-memory local

What it does

OpenClaw's context engine hooks fire on every lifecycle event:

  • Ingest -- every user and assistant message is stored with embeddings and HyDE query expansion
  • Assemble -- relevant memories are injected as system prompt context before every model run
  • Compact -- decay cycle runs when the context window fills
  • After turn -- high-access memories get consolidated to the semantic layer
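
The Compact and After turn steps above can be sketched as two small passes over the stored memories. This is a toy model under stated assumptions: the decay factor, the forget threshold, the promotion threshold, and the `Memory` shape are invented for illustration and do not come from the plugin.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Memory:
    text: str
    layer: str = "episodic"   # episodic | semantic | procedural | working
    strength: float = 1.0     # fades on each decay cycle
    access_count: int = 0     # bumped whenever a search returns this memory


# Hypothetical tuning constants.
DECAY_FACTOR = 0.9
FORGET_BELOW = 0.2
PROMOTE_AT = 3


def decay_cycle(memories: List[Memory]) -> List[Memory]:
    """Fade episodic memories and drop any that fall below the forget threshold."""
    kept = []
    for m in memories:
        if m.layer == "episodic":
            m.strength *= DECAY_FACTOR
        if m.strength >= FORGET_BELOW:
            kept.append(m)
    return kept


def consolidate(memories: List[Memory]) -> int:
    """Promote frequently accessed episodic memories to the semantic layer.

    Returns the number of memories promoted.
    """
    promoted = 0
    for m in memories:
        if m.layer == "episodic" and m.access_count >= PROMOTE_AT:
            m.layer = "semantic"
            promoted += 1
    return promoted
```

In this model only episodic memories decay, so anything promoted to the semantic layer stops fading, which matches the idea that frequently used memories persist.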

Plus agent-callable tools: memory_search, memory_store, memory_layers.

Configuration

After setup, config lives in ~/.openclaw/pentatonic-memory.json. To switch modes, run setup again or edit the file directly.

You can also configure via openclaw.json:

{
  "plugins": {
    "slots": { "contextEngine": "pentatonic-memory" },
    "entries": {
      "pentatonic-memory": {
        "enabled": true,
        "config": {
          "database_url": "postgres://memory:memory@localhost:5433/memory",
          "embedding_url": "http://localhost:11435/v1",
          "embedding_model": "nomic-embed-text",
          "llm_url": "http://localhost:11435/v1",
          "llm_model": "llama3.2:3b"
        }
      }
    }
  }
}

For hosted mode, replace the contents of the "config" object with:

{
  "tes_endpoint": "https://your-company.api.pentatonic.com",
  "tes_client_id": "your-company",
  "tes_api_key": "tes_your-company_xxxxx"
}
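
If you generate openclaw.json from a script, the two modes differ only in the inner "config" object. The sketch below builds the structure shown above for either mode; the function names are hypothetical, and the keys and values are copied from the examples in this section.

```python
import json
from typing import Any, Dict

# Local-mode settings, copied from the example above.
LOCAL_CONFIG: Dict[str, str] = {
    "database_url": "postgres://memory:memory@localhost:5433/memory",
    "embedding_url": "http://localhost:11435/v1",
    "embedding_model": "nomic-embed-text",
    "llm_url": "http://localhost:11435/v1",
    "llm_model": "llama3.2:3b",
}


def hosted_config(endpoint: str, client_id: str, api_key: str) -> Dict[str, str]:
    """Hosted-mode settings, using the three TES keys shown above."""
    return {
        "tes_endpoint": endpoint,
        "tes_client_id": client_id,
        "tes_api_key": api_key,
    }


def plugin_settings(config: Dict[str, str]) -> Dict[str, Any]:
    """Wrap a mode-specific config in the openclaw.json plugin structure."""
    return {
        "plugins": {
            "slots": {"contextEngine": "pentatonic-memory"},
            "entries": {
                "pentatonic-memory": {"enabled": True, "config": config},
            },
        }
    }


def render(config: Dict[str, str]) -> str:
    """Serialize the settings as indented JSON, ready to write to disk."""
    return json.dumps(plugin_settings(config), indent=2)
```

For example, `render(hosted_config("https://your-company.api.pentatonic.com", "your-company", "tes_your-company_xxxxx"))` produces the hosted variant of the file.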

SDK: Wrap Your LLM Client

JavaScript

import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

// Wrap the provider client; calls made through `ai` are tracked under this session.
const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

# Wrap the provider client; calls made through `ai` are tracked under this session.
ai = tes.wrap(OpenAI(), session_id="conv-123")
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Supported Providers

| Provider | Detection | Intercepted Method |
| --- | --- | --- |
| OpenAI | client.chat.completions.create | chat.completions.create() |
| Anthropic | client.messages.create | messages.create() |
| Workers AI | client.run (JS only) | run() |

All other methods pass through unchanged.
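
The Detection column suggests duck typing: the provider is inferred from which method path exists on the client. A minimal sketch of that idea follows, assuming detection is a simple attribute probe. `detect_provider` and `resolve` are hypothetical names, and Workers AI is left out because the table lists it as JS only.

```python
from types import SimpleNamespace
from typing import Any, Optional

# (provider name, dotted method path), checked in order.
DETECTION_ORDER = [
    ("openai", "chat.completions.create"),
    ("anthropic", "messages.create"),
]


def resolve(obj: Any, dotted_path: str) -> Any:
    """Follow a dotted attribute path; return None if any segment is missing."""
    for part in dotted_path.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return None
    return obj


def detect_provider(client: Any) -> Optional[str]:
    """Return the first provider whose method path is present and callable."""
    for name, path in DETECTION_ORDER:
        if callable(resolve(client, path)):
            return name
    return None


# Stand-in clients for demonstration; real code would pass OpenAI() or Anthropic().
openai_like = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: None))
)
anthropic_like = SimpleNamespace(messages=SimpleNamespace(create=lambda **kw: None))
```

Probing for methods rather than checking class names means any client exposing the same method path is treated the same way, including OpenAI-compatible clients.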

API Reference

TESClient(config)

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| clientId | string | required | Your tenant identifier |
| apiKey | string | required | TES API key |
| endpoint | string | required | TES instance URL |
| userId | string | null | User identifier for attribution |
| captureContent | boolean | true | Include message content in events |
| maxContentLength | number | 4096 | Truncate content beyond this length |
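
The last two options interact: content is either omitted entirely or included up to a length limit. The sketch below shows one way that could work. It assumes the limit counts characters, which the table does not specify, and `prepare_content` is a hypothetical helper rather than an SDK function.

```python
from typing import Optional


def prepare_content(
    text: str,
    capture_content: bool = True,
    max_content_length: int = 4096,
) -> Optional[str]:
    """Return the content to attach to an event, or None when capture is off.

    Assumption: the limit is measured in characters.
    """
    if not capture_content:
        return None
    return text[:max_content_length]
```

Setting `captureContent` to false is the option to reach for when prompts may contain sensitive data, since events then carry no message content.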

tes.wrap(client, opts?)

Returns an instrumented proxy. Every intercepted call emits a CHAT_TURN event.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| sessionId | string | auto-generated UUID | Links events from the same conversation |
| metadata | object | {} | Custom fields on every event |
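
To make the "instrumented proxy" idea concrete, here is a minimal sketch of a proxy that wraps one method path and forwards everything else untouched. It is not the SDK's implementation: the event fields, the `emit` callback, and the class and function names are assumptions for illustration.

```python
import time
import uuid
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional, Tuple


class InstrumentedProxy:
    """Forwards attribute access and wraps one method path with event emission."""

    def __init__(self, target: Any, intercepted: Tuple[str, ...],
                 emit: Callable[[Dict[str, Any]], None], session_id: str,
                 metadata: Dict[str, Any], path: Tuple[str, ...] = ()):
        self._target = target
        self._intercepted = intercepted
        self._emit = emit
        self._session_id = session_id
        self._metadata = metadata
        self._path = path

    def __getattr__(self, name: str) -> Any:
        # Only reached for names not set in __init__, i.e. the client's own attributes.
        value = getattr(self._target, name)
        path = self._path + (name,)
        if path == self._intercepted:
            return self._instrument(value)
        if self._intercepted[:len(path)] == path:
            # Still on the way to the intercepted method: keep proxying.
            return InstrumentedProxy(value, self._intercepted, self._emit,
                                     self._session_id, self._metadata, path)
        return value  # everything else passes through unchanged

    def _instrument(self, method: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            started = time.perf_counter()
            response = method(*args, **kwargs)
            self._emit({
                "type": "CHAT_TURN",
                "session_id": self._session_id,
                "metadata": self._metadata,
                "duration_ms": (time.perf_counter() - started) * 1000.0,
                "request": kwargs,
                "response": response,
            })
            return response
        return wrapper


def wrap(client: Any, emit: Callable[[Dict[str, Any]], None],
         session_id: Optional[str] = None,
         metadata: Optional[Dict[str, Any]] = None,
         intercepted: Tuple[str, ...] = ("chat", "completions", "create")) -> InstrumentedProxy:
    """Hypothetical stand-in for tes.wrap(): session_id defaults to a fresh UUID."""
    return InstrumentedProxy(client, intercepted, emit,
                             session_id or str(uuid.uuid4()), dict(metadata or {}))


# Stand-in client for demonstration; real code would pass OpenAI().
fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: {"ok": True})),
    models=SimpleNamespace(list=lambda: ["gpt-4o"]),
)
```

With this sketch, `wrap(fake_client, print, session_id="conv-123").chat.completions.create(model="gpt-4o", messages=[])` prints one event, while `.models.list()` prints nothing because it is not on the intercepted path.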

tes.session(opts?)

Returns a Session for manual event emission.

session.emitChatTurn({ userMessage, assistantResponse, turnNumber? })

Emits a CHAT_TURN event with accumulated data, then resets.
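
The accumulate, emit, reset cycle can be sketched as follows. What the real Session accumulates is not spelled out here, so this example assumes token counts and tool-call names. The `record` method, the event fields, and the auto-incrementing turn number are hypothetical.

```python
import uuid
from typing import Any, Callable, Dict, Iterable, List, Optional


class Session:
    """Toy session: collects per-call data, then flushes it as one CHAT_TURN event."""

    def __init__(self, emit: Callable[[Dict[str, Any]], None],
                 session_id: Optional[str] = None):
        self._emit = emit
        self.session_id = session_id or str(uuid.uuid4())
        self._turn = 0
        self._reset()

    def _reset(self) -> None:
        self._usage = {"input_tokens": 0, "output_tokens": 0}
        self._tool_calls: List[str] = []

    def record(self, input_tokens: int = 0, output_tokens: int = 0,
               tool_calls: Iterable[str] = ()) -> None:
        """Accumulate data from one LLM call within the current turn."""
        self._usage["input_tokens"] += input_tokens
        self._usage["output_tokens"] += output_tokens
        self._tool_calls.extend(tool_calls)

    def emit_chat_turn(self, user_message: str, assistant_response: str,
                       turn_number: Optional[int] = None) -> Dict[str, Any]:
        """Emit one CHAT_TURN event with everything accumulated, then reset."""
        self._turn = turn_number if turn_number is not None else self._turn + 1
        event = {
            "type": "CHAT_TURN",
            "session_id": self.session_id,
            "turn_number": self._turn,
            "user_message": user_message,
            "assistant_response": assistant_response,
            "usage": self._usage,
            "tool_calls": self._tool_calls,
        }
        self._emit(event)
        self._reset()  # fresh accumulators, so the emitted event is not mutated later
        return event
```

A manual session like this is useful when one user-visible turn spans several model calls, for example an agent loop that calls tools before answering.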

normalizeResponse(raw)

Standalone utility to normalize any LLM response:

import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";

const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
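
To show what normalization involves, the sketch below maps the OpenAI chat completion shape and the Anthropic message shape onto one common dictionary. It is an illustration in Python, not the SDK's code: the output field names are assumptions, and it works on plain dicts (for example the result of calling `model_dump()` on a response object).

```python
from typing import Any, Dict


def normalize_response(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map an OpenAI- or Anthropic-shaped response dict onto a common shape."""
    usage = raw.get("usage") or {}

    # OpenAI chat completions: text and tool calls live on choices[0].message.
    if "choices" in raw:
        message = raw["choices"][0]["message"]
        return {
            "content": message.get("content") or "",
            "model": raw.get("model"),
            "usage": {
                "input_tokens": usage.get("prompt_tokens", 0),
                "output_tokens": usage.get("completion_tokens", 0),
            },
            "tool_calls": [
                {"name": c["function"]["name"], "arguments": c["function"]["arguments"]}
                for c in message.get("tool_calls") or []
            ],
        }

    # Anthropic messages: content is a list of typed blocks (text, tool_use, ...).
    if isinstance(raw.get("content"), list):
        blocks = raw["content"]
        return {
            "content": "".join(b.get("text", "") for b in blocks if b.get("type") == "text"),
            "model": raw.get("model"),
            "usage": {
                "input_tokens": usage.get("input_tokens", 0),
                "output_tokens": usage.get("output_tokens", 0),
            },
            "tool_calls": [
                {"name": b["name"], "arguments": b["input"]}
                for b in blocks if b.get("type") == "tool_use"
            ],
        }

    raise ValueError("Unrecognized response shape")
```

Note that the two providers differ in how tool arguments arrive: OpenAI supplies a JSON string, while Anthropic supplies an already-parsed object. This sketch passes both through as-is.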

Architecture

        +-------------------+     +--------------------+
        | Claude Code Plugin|     |  OpenClaw Plugin   |
        | (hooks: auto-     |     | (context engine:   |
        |  search + store)  |     |  ingest, assemble, |
        +--------+----------+     |  compact, tools)   |
                 |                +--------+-----------+
                 |                         |
                 +------------+------------+
                              |
                  +-----------+-----------+
                  |                       |
            Local Memory            Hosted TES
            (Docker)                (Cloud)
                  |                       |
       +----+----+----+          +---+----+---+
       |    |    |    |          |   |    |   |
      PG  Ollama MCP HTTP      PG  R2  Queue Workers
      pgvector        API     pgvector       Modules

License

MIT
