
AI Agent SDK — observability, memory, and analytics for LLM applications. Provider-agnostic. Tracks token usage, tool calls, conversations, and enables shared team memory.


Pentatonic

AI Agent SDK

Observability, memory, and analytics for LLM applications.
Run locally or use hosted TES. JavaScript & Python.


Overview

Two ways to use the SDK:

Local Memory -- Run a fully private memory system on your own machine. PostgreSQL + pgvector + Ollama in Docker. No API keys, no cloud. Your agent gets persistent, searchable memory backed by multi-signal retrieval and HyDE query expansion.

Hosted TES -- Connect to Pentatonic's Thing Event System for production-grade observability, higher-dimensional embeddings, conversation analytics, and team-wide shared memory.

Both paths work with Claude Code and OpenClaw. The plugins auto-search on every prompt and auto-store every conversation turn.

Local Memory (self-hosted)

Run the full memory stack locally. Requires Docker and ~4GB disk for models.

1. Set up

npx @pentatonic-ai/ai-agent-sdk memory

This starts PostgreSQL + pgvector, Ollama, and the memory server. It pulls embedding and chat models, and writes the local config.

2. Install the Claude Code plugin

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

That's it. The plugin hooks automatically search memories on every prompt and store every conversation turn. Fully local, fully private.

What you get

  • Automatic memory -- every conversation turn is stored with embeddings and HyDE query expansion
  • Semantic search -- multi-signal retrieval combining vector similarity, BM25 full-text, recency decay, and access frequency
  • Memory layers -- episodic (recent), semantic (consolidated), procedural (how-to), working (temporary)
  • Decay and consolidation -- memories fade over time; frequently accessed ones get promoted
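As a rough illustration of the multi-signal retrieval described above, the sketch below blends the four signals into one score. The `Candidate` shape, the weights, and the 30-day half-life are assumptions made for this example, not the SDK's schema or tuning.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical per-memory signals; not the SDK's schema."""
    text: str
    vector_sim: float    # cosine similarity in [0, 1]
    bm25: float          # raw BM25 score, >= 0
    age_days: float      # time since the memory was stored
    access_count: int    # how often it has been retrieved


def combined_score(c: Candidate, half_life_days: float = 30.0) -> float:
    # Squash unbounded BM25 into [0, 1) so it is comparable to cosine similarity.
    bm25_norm = c.bm25 / (c.bm25 + 1.0)
    # Exponential recency decay: the signal halves every `half_life_days`.
    recency = 0.5 ** (c.age_days / half_life_days)
    # Log-dampened access frequency, capped at 1.0.
    frequency = min(1.0, math.log1p(c.access_count) / math.log1p(100))
    # Assumed weights; they sum to 1 so the result stays in [0, 1].
    return 0.5 * c.vector_sim + 0.2 * bm25_norm + 0.2 * recency + 0.1 * frequency


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from most to least relevant."""
    return sorted(candidates, key=combined_score, reverse=True)
```

With equal similarity and BM25 scores, a memory stored today outranks one stored a year ago, which is the effect recency decay is meant to have.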

Change models

EMBEDDING_MODEL=mxbai-embed-large LLM_MODEL=qwen2.5:7b npx @pentatonic-ai/ai-agent-sdk memory

Raspberry Pi

A Raspberry Pi 5 with 8GB of RAM runs the full stack. nomic-embed-text (~300MB) and llama3.2:3b (~2GB) together need about 2.3GB, which leaves plenty of headroom.

Use as a library

import pg from 'pg';
import { createMemorySystem } from '@pentatonic-ai/ai-agent-sdk/memory';

// Assumes DATABASE_URL points at a PostgreSQL instance with pgvector enabled.
const pgPool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

const memory = createMemorySystem({
  db: pgPool,
  embedding: { url: 'http://localhost:11434/v1', model: 'nomic-embed-text' },
  llm: { url: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
});

await memory.migrate();
await memory.ensureLayers('my-app');

// Store a memory, then search for it.
await memory.ingest('User prefers dark mode', { clientId: 'my-app' });
const results = await memory.search('preferences', { clientId: 'my-app' });

Hosted TES

Connect to Pentatonic's hosted infrastructure for production use.

1. Create an account

npx @pentatonic-ai/ai-agent-sdk init

This walks you through account creation, email verification, and API key generation. You'll get:

TES_ENDPOINT=https://your-company.api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx
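A small sketch of reading these three variables in Python and failing fast when one is missing. The helper name `load_tes_config` and the returned dictionary keys are made up for this example; only the variable names come from the setup output above.

```python
import os

REQUIRED = ("TES_ENDPOINT", "TES_CLIENT_ID", "TES_API_KEY")


def load_tes_config(env=None) -> dict:
    """Return the three TES settings, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        # Strip a trailing slash so paths can be appended consistently.
        "endpoint": env["TES_ENDPOINT"].rstrip("/"),
        "client_id": env["TES_CLIENT_ID"],
        "api_key": env["TES_API_KEY"],
    }
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test.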

2. Install

npm install @pentatonic-ai/ai-agent-sdk
pip install pentatonic-ai-agent-sdk

What you get (in addition to local features)

  • Higher-dimensional embeddings -- NV-Embed-v2 (4096d) for better retrieval accuracy
  • Conversation analytics -- session metrics, search attribution, dead-end detection
  • Team-wide shared memory -- semantic search across your team's AI interactions
  • Admin dashboard -- visualize conversations, token usage, and memory explorer
  • Multi-tenancy -- isolated databases per client

Claude Code Plugin

Works with both local and hosted setups. Install once, switch modes via config.

Install via marketplace

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

Set up

For hosted TES:

/tes-memory:tes-setup

For local memory:

npx @pentatonic-ai/ai-agent-sdk memory

What it tracks

  • Every conversation turn -- user messages, assistant responses, tool calls, duration
  • Automatic memory search -- relevant memories injected as context on every prompt
  • Automatic memory storage -- every turn stored with embeddings and HyDE queries
  • Token usage -- input, output, cache read, cache creation tokens per turn
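To show what per-turn token tracking makes possible, here is a minimal sketch that sums the four token categories across a session. The record layout and field names are assumptions for illustration; the plugin's actual event schema may differ.

```python
from collections import Counter

# Hypothetical per-turn records; the field names are assumptions.
turns = [
    {"input": 1200, "output": 300, "cache_read": 800, "cache_creation": 0},
    {"input": 1500, "output": 450, "cache_read": 1100, "cache_creation": 200},
]


def total_usage(turns: list[dict]) -> Counter:
    """Sum each token category across all turns of a session."""
    totals = Counter()
    for turn in turns:
        totals.update(turn)  # Counter.update adds values key by key
    return totals


totals = total_usage(turns)
print(dict(totals))
```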

OpenClaw Plugin

Works with both local and hosted setups. Just tell OpenClaw to set it up.

Install

openclaw plugins install -l ./packages/memory/src/openclaw

Set up

Tell OpenClaw:

Set up pentatonic memory

The agent will ask whether you want local (private, Docker-based) or hosted (Pentatonic TES cloud), then walk you through the rest. For hosted mode, it handles account creation, email verification, and API key generation conversationally.

Or use the CLI directly:

openclaw pentatonic-memory local

What it does

OpenClaw's context engine hooks fire on every lifecycle event:

  • Ingest -- every user and assistant message is stored with embeddings and HyDE query expansion
  • Assemble -- relevant memories are injected as system prompt context before every model run
  • Compact -- decay cycle runs when the context window fills
  • After turn -- high-access memories get consolidated to the semantic layer

Plus agent-callable tools: memory_search, memory_store, memory_layers.
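The compact and after-turn hooks can be pictured with the sketch below: one pass weakens every memory and drops the faintest, another promotes frequently accessed episodic memories to the semantic layer. The `Memory` fields, the decay rate, the floor, and the access threshold are all assumptions for this example, not the plugin's real parameters.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    """Hypothetical memory record; not the plugin's schema."""
    text: str
    layer: str = "episodic"
    strength: float = 1.0
    access_count: int = 0


def decay(memories, rate=0.1, floor=0.05):
    """Weaken every memory by `rate` and drop those that fall below `floor`."""
    kept = []
    for m in memories:
        m.strength *= (1.0 - rate)
        if m.strength >= floor:
            kept.append(m)
    return kept


def consolidate(memories, min_accesses=3):
    """Promote frequently accessed episodic memories to the semantic layer."""
    for m in memories:
        if m.layer == "episodic" and m.access_count >= min_accesses:
            m.layer = "semantic"
    return memories
```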

Configuration

After setup, the config lives in ~/.openclaw/pentatonic-memory.json. To switch modes, run setup again or edit the file directly.

You can also configure via openclaw.json:

{
  "plugins": {
    "slots": { "contextEngine": "pentatonic-memory" },
    "entries": {
      "pentatonic-memory": {
        "enabled": true,
        "config": {
          "database_url": "postgres://memory:memory@localhost:5433/memory",
          "embedding_url": "http://localhost:11435/v1",
          "embedding_model": "nomic-embed-text",
          "llm_url": "http://localhost:11435/v1",
          "llm_model": "llama3.2:3b"
        }
      }
    }
  }
}

For hosted mode, replace the config block with:

{
  "tes_endpoint": "https://your-company.api.pentatonic.com",
  "tes_client_id": "your-company",
  "tes_api_key": "tes_your-company_xxxxx"
}
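If you generate openclaw.json from a script, the two modes differ only in the inner config block. The sketch below assembles either variant; `build_openclaw_config` is a hypothetical helper, while the keys are the ones shown above.

```python
import json

# Keys each mode expects, as listed in the two config examples.
REQUIRED_KEYS = {
    "local": ("database_url", "embedding_url", "embedding_model", "llm_url", "llm_model"),
    "hosted": ("tes_endpoint", "tes_client_id", "tes_api_key"),
}


def build_openclaw_config(mode: str, settings: dict) -> dict:
    """Assemble the openclaw.json plugin section for local or hosted mode."""
    if mode not in REQUIRED_KEYS:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [k for k in REQUIRED_KEYS[mode] if k not in settings]
    if missing:
        raise ValueError(f"missing keys for {mode} mode: {missing}")
    return {
        "plugins": {
            "slots": {"contextEngine": "pentatonic-memory"},
            "entries": {
                "pentatonic-memory": {
                    "enabled": True,
                    "config": {k: settings[k] for k in REQUIRED_KEYS[mode]},
                }
            },
        }
    }


hosted = build_openclaw_config("hosted", {
    "tes_endpoint": "https://your-company.api.pentatonic.com",
    "tes_client_id": "your-company",
    "tes_api_key": "tes_your-company_xxxxx",
})
print(json.dumps(hosted, indent=2))
```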

SDK: Wrap Your LLM Client

JavaScript

import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

// Wrap the provider client; calls made through `ai` are tracked under this session.
const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

# Wrap the provider client; calls made through `ai` are tracked under this session.
ai = tes.wrap(OpenAI(), session_id="conv-123")
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Supported Providers

Provider     Detection                        Intercepted Method
OpenAI       client.chat.completions.create   chat.completions.create()
Anthropic    client.messages.create           messages.create()
Workers AI   client.run (JS only)             run()

All other methods pass through unchanged.
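The Detection column suggests the provider is identified by which methods the client exposes. A minimal duck-typing sketch of that idea follows; `detect_provider` and its return values are assumptions for illustration, not the SDK's internals.

```python
from types import SimpleNamespace


def _has_path(obj, path: str) -> bool:
    """True if `obj` exposes a callable at the dotted attribute path."""
    for name in path.split("."):
        obj = getattr(obj, name, None)
        if obj is None:
            return False
    return callable(obj)


def detect_provider(client):
    """Guess the provider from the methods the client exposes."""
    if _has_path(client, "chat.completions.create"):
        return "openai"
    if _has_path(client, "messages.create"):
        return "anthropic"
    if _has_path(client, "run"):  # listed as JS only in the table above
        return "workers-ai"
    return None


# Stand-in clients that only mimic the attribute layout of the real ones.
openai_like = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: None))
)
anthropic_like = SimpleNamespace(messages=SimpleNamespace(create=lambda **kw: None))

print(detect_provider(openai_like), detect_provider(anthropic_like))
```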

API Reference

TESClient(config)

Param              Type      Default    Description
clientId           string    required   Your tenant identifier
apiKey             string    required   TES API key
endpoint           string    required   TES instance URL
userId             string    null       User identifier for attribution
captureContent     boolean   true       Include message content in events
maxContentLength   number    4096       Truncate content beyond this length

tes.wrap(client, opts?)

Returns an instrumented proxy. Every intercepted call emits a CHAT_TURN event.

Option      Type     Default               Description
sessionId   string   auto-generated UUID   Links events from the same conversation
metadata    object   {}                    Custom fields on every event
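For intuition about what an instrumented proxy does, here is a simplified sketch: it intercepts one dotted method path, emits an event after each call, and hands every other attribute back untouched. `InstrumentedProxy`, the event fields, and the `emit` callback are assumptions for this example; the SDK's own wrapper may work differently.

```python
import time
import uuid
from types import SimpleNamespace


class InstrumentedProxy:
    """Wraps `target`; calls to the `intercept` path are timed and reported."""

    def __init__(self, target, intercept, emit, session_id=None, path=""):
        self._target = target
        self._intercept = intercept      # e.g. "chat.completions.create"
        self._emit = emit                # callback that receives each event dict
        self._session_id = session_id or str(uuid.uuid4())
        self._path = path

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        path = f"{self._path}.{name}" if self._path else name
        if path == self._intercept and callable(attr):
            def wrapped(*args, **kwargs):
                start = time.perf_counter()
                result = attr(*args, **kwargs)
                self._emit({
                    "type": "CHAT_TURN",
                    "session_id": self._session_id,
                    "duration_s": time.perf_counter() - start,
                })
                return result
            return wrapped
        if self._intercept.startswith(path + "."):
            # Still on the way to the intercepted method: keep proxying.
            return InstrumentedProxy(attr, self._intercept, self._emit,
                                     self._session_id, path)
        return attr  # everything else passes through unchanged


# Stand-in client that mimics the attribute layout of an OpenAI client.
events = []
client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: "ok")),
    models=SimpleNamespace(list=lambda: ["m"]),
)
ai = InstrumentedProxy(client, "chat.completions.create", events.append,
                       session_id="conv-123")
reply = ai.chat.completions.create(model="gpt-4o")
```

Only the intercepted call produces an event; `ai.models.list()` reaches the underlying client directly.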

tes.session(opts?)

Returns a Session for manual event emission.

session.emitChatTurn({ userMessage, assistantResponse, turnNumber? })

Emits a CHAT_TURN event containing the data accumulated since the last emit, then resets that accumulated state.
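The accumulate-then-reset behaviour can be sketched as follows. This `Session` class is a hypothetical stand-in: the `record` method, the usage keys, and the event fields are invented for the example and are not the SDK's API.

```python
class Session:
    """Hypothetical stand-in illustrating accumulate-then-reset."""

    def __init__(self, session_id, emit):
        self.session_id = session_id
        self._emit = emit       # callback that receives each event dict
        self._turn = 0
        self._reset()

    def _reset(self):
        self._tool_calls = []
        self._usage = {"input": 0, "output": 0}

    def record(self, usage=None, tool_calls=()):
        """Accumulate data from one or more model calls within a turn."""
        for key, value in (usage or {}).items():
            self._usage[key] = self._usage.get(key, 0) + value
        self._tool_calls.extend(tool_calls)

    def emit_chat_turn(self, user_message, assistant_response, turn_number=None):
        """Emit one CHAT_TURN event, then clear the accumulated data."""
        self._turn = turn_number if turn_number is not None else self._turn + 1
        self._emit({
            "type": "CHAT_TURN",
            "session_id": self.session_id,
            "turn_number": self._turn,
            "user_message": user_message,
            "assistant_response": assistant_response,
            "usage": dict(self._usage),
            "tool_calls": list(self._tool_calls),
        })
        self._reset()
```

Resetting after each emit keeps one turn's token counts and tool calls from leaking into the next.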

normalizeResponse(raw)

Standalone utility that normalizes a raw LLM response into a common shape (content, model, usage, toolCalls):

import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";

const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
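To illustrate what normalizing involves, the sketch below maps OpenAI-style and Anthropic-style response dictionaries onto one shape. It works on plain dicts and picks its own output keys, so treat it as an approximation of the idea rather than the SDK's implementation.

```python
def normalize_response(raw: dict) -> dict:
    """Map an OpenAI- or Anthropic-shaped response dict to one common shape."""
    if "choices" in raw:  # OpenAI chat completion
        message = raw["choices"][0]["message"]
        usage = raw.get("usage", {})
        return {
            "content": message.get("content") or "",
            "model": raw.get("model"),
            "usage": {
                "input_tokens": usage.get("prompt_tokens", 0),
                "output_tokens": usage.get("completion_tokens", 0),
            },
            "tool_calls": message.get("tool_calls") or [],
        }
    if isinstance(raw.get("content"), list):  # Anthropic message
        blocks = raw["content"]
        usage = raw.get("usage", {})
        return {
            # Concatenate text blocks; tool_use blocks are collected separately.
            "content": "".join(b.get("text", "") for b in blocks if b.get("type") == "text"),
            "model": raw.get("model"),
            "usage": {
                "input_tokens": usage.get("input_tokens", 0),
                "output_tokens": usage.get("output_tokens", 0),
            },
            "tool_calls": [b for b in blocks if b.get("type") == "tool_use"],
        }
    raise ValueError("unrecognized response shape")
```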

Architecture

        +-------------------+     +--------------------+
        | Claude Code Plugin|     |  OpenClaw Plugin   |
        | (hooks: auto-     |     | (context engine:   |
        |  search + store)  |     |  ingest, assemble, |
        +--------+----------+     |  compact, tools)   |
                 |                +--------+-----------+
                 |                         |
                 +------------+------------+
                              |
                  +-----------+-----------+
                  |                       |
            Local Memory            Hosted TES
            (Docker)                (Cloud)
                  |                       |
       +----+----+----+          +---+----+---+
       |    |    |    |          |   |    |   |
      PG  Ollama MCP HTTP      PG  R2  Queue Workers
      pgvector        API     pgvector       Modules

License

MIT
