pentatonic-ai-agent-sdk

AI Agent SDK — observability, memory, and analytics for LLM applications. Provider-agnostic. Tracks token usage, tool calls, conversations, and enables shared team memory.

Pentatonic

AI Agent SDK

Observability, memory, and analytics for LLM applications.
Run locally or use hosted TES. JavaScript & Python.

Overview
Local Memory (self-hosted)
Hosted TES
Claude Code Plugin
OpenClaw Plugin
SDK: Wrap Your LLM Client
Supported Providers
API Reference
Health Checks (doctor)
Architecture

Overview

Two ways to use the SDK:

Local Memory -- Run a fully private memory system on your own machine. PostgreSQL + pgvector + Ollama in Docker. No API keys, no cloud. Your agent gets persistent, searchable memory backed by multi-signal retrieval and HyDE query expansion.

Hosted TES -- Connect to Pentatonic's Thing Event System for production-grade observability, higher-dimensional embeddings, conversation analytics, and team-wide shared memory.

Both paths work with Claude Code and OpenClaw. The plugins auto-search on every prompt and auto-store every conversation turn.

Local Memory (self-hosted)

Run the full memory stack locally. Requires Docker and ~4GB disk for models.

1. Set up

npx @pentatonic-ai/ai-agent-sdk memory

This starts PostgreSQL + pgvector, Ollama, and the memory server. It pulls embedding and chat models, and writes the local config.

2. Install the Claude Code plugin

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

That's it. The plugin hooks automatically search memories on every prompt and store every conversation turn. Fully local, fully private.

What you get

Automatic memory -- every conversation turn is stored with embeddings and HyDE query expansion
Semantic search -- multi-signal retrieval combining vector similarity, BM25 full-text, recency decay, and access frequency
Memory layers -- episodic (recent), semantic (consolidated), procedural (how-to), working (temporary)
Distilled memory -- a background LLM pass extracts atomic facts from each raw turn and stores each as its own node in the semantic layer, linked back to the source. A query like "what does Phil drink?" matches "Phil drinks cortado" more reliably than a mixed paragraph covering food, drinks, and hobbies. Default-on; the raw turn is still preserved.
Decay and consolidation -- memories fade over time; frequently accessed ones get promoted
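
The retrieval signals listed above can be blended into one ranking score in many ways. The following is a minimal sketch of one possible weighting, not the SDK's actual formula: the weights, the 30-day half-life, the log-scaled access bonus, and the names `Candidate`, `score`, and `rank` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    vector_sim: float   # cosine similarity, assumed normalized to [0, 1]
    bm25: float         # BM25 score, assumed normalized to [0, 1]
    age_days: float     # time since the memory was stored
    access_count: int   # how often the memory has been retrieved


def score(c: Candidate, half_life_days: float = 30.0) -> float:
    """Blend four signals into one score (hypothetical weights)."""
    recency = 0.5 ** (c.age_days / half_life_days)  # exponential decay
    # Saturating bonus: 100 accesses or more count as the maximum of 1.0.
    frequency = min(math.log1p(c.access_count) / math.log1p(100), 1.0)
    return 0.5 * c.vector_sim + 0.3 * c.bm25 + 0.1 * recency + 0.1 * frequency


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from best to worst match."""
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("Phil likes hiking, pasta, and coffee", 0.6, 0.3, age_days=2, access_count=5),
    Candidate("Phil drinks cortado", 0.9, 0.8, age_days=2, access_count=5),
]
print(rank(candidates)[0].text)
```

With equal age and access counts, the atomic fact outranks the mixed paragraph purely on its higher similarity and full-text scores, which is the effect the distilled-memory layer is described as aiming for.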

Change models

EMBEDDING_MODEL=mxbai-embed-large LLM_MODEL=qwen2.5:7b npx @pentatonic-ai/ai-agent-sdk memory

Raspberry Pi

A Raspberry Pi 5 with 8GB RAM runs the full stack: nomic-embed-text (~300MB) plus llama3.2:3b (~2GB) leaves plenty of headroom.

Use as a library

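import pg from 'pg';
import { createMemorySystem } from '@pentatonic-ai/ai-agent-sdk/memory';

// Assumes `db` accepts a node-postgres Pool; DATABASE_URL is a placeholder for your connection string.
const pgPool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

const memory = createMemorySystem({
  db: pgPool,
  embedding: { url: 'http://localhost:11434/v1', model: 'nomic-embed-text' },
  llm: { url: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
});

// Create the schema and memory layers, then store and search a memory.
await memory.migrate();
await memory.ensureLayers('my-app');
await memory.ingest('User prefers dark mode', { clientId: 'my-app' });
const results = await memory.search('preferences', { clientId: 'my-app' });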

Hosted TES

Connect to Pentatonic's hosted infrastructure for production use.

1. Create an account

npx @pentatonic-ai/ai-agent-sdk init

This walks you through account creation, email verification, and API key generation. You'll get:

TES_ENDPOINT=https://your-company.api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx
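
Before constructing a client, it can help to fail early when one of these variables is missing. The following is a small sketch of that check; `TESConfig` and `load_tes_config` are hypothetical helper names, not part of the SDK.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TESConfig:
    endpoint: str
    client_id: str
    api_key: str


def load_tes_config(env=None) -> TESConfig:
    """Read the three TES_* variables and raise if any is missing or empty."""
    env = os.environ if env is None else env
    names = ("TES_ENDPOINT", "TES_CLIENT_ID", "TES_API_KEY")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return TESConfig(env["TES_ENDPOINT"], env["TES_CLIENT_ID"], env["TES_API_KEY"])


# Placeholder values in the format shown above; pass nothing to read os.environ.
config = load_tes_config({
    "TES_ENDPOINT": "https://your-company.api.pentatonic.com",
    "TES_CLIENT_ID": "your-company",
    "TES_API_KEY": "tes_your-company_xxxxx",
})
print(config.client_id)
```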

2. Install

npm install @pentatonic-ai/ai-agent-sdk

pip install pentatonic-ai-agent-sdk

What you get (in addition to local features)

Higher-dimensional embeddings -- NV-Embed-v2 (4096d) for better retrieval accuracy
Conversation analytics -- session metrics, search attribution, dead-end detection
Team-wide shared memory -- semantic search across your team's AI interactions
Admin dashboard -- visualize conversations, token usage, and memory explorer
Multi-tenancy -- isolated databases per client

Claude Code Plugin

Works with both local and hosted setups. Install once, switch modes via config.

Install via marketplace

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

Set up

For hosted TES:

/tes-memory:tes-setup

For local memory:

npx @pentatonic-ai/ai-agent-sdk memory

What it tracks

Every conversation turn -- user messages, assistant responses, tool calls, duration
Automatic memory search -- relevant memories injected as context on every prompt
Automatic memory storage -- every turn stored with embeddings and HyDE queries
Token usage -- input, output, cache read, cache creation tokens per turn
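
To make the four per-turn token counters concrete, here is a sketch of how they might be summed across a session. The `TokenUsage` class and its field names are illustrative assumptions; the plugin's actual event schema is not shown in this document.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    input: int = 0
    output: int = 0
    cache_read: int = 0
    cache_creation: int = 0

    def __add__(self, other: "TokenUsage") -> "TokenUsage":
        # Field-wise addition so usage objects can be summed with sum().
        return TokenUsage(
            self.input + other.input,
            self.output + other.output,
            self.cache_read + other.cache_read,
            self.cache_creation + other.cache_creation,
        )


turns = [
    TokenUsage(input=120, output=45, cache_read=800),
    TokenUsage(input=60, output=200, cache_creation=300),
]
session_total = sum(turns, TokenUsage())
print(session_total)
```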

OpenClaw Plugin

Works with both local and hosted setups. Just tell OpenClaw to set it up.

Install

openclaw plugins install @pentatonic-ai/openclaw-memory-plugin

Set up

Tell OpenClaw:

Set up pentatonic memory

The agent will ask whether you want local (private, Docker-based) or hosted (Pentatonic TES cloud), then walk you through the rest. For hosted mode, it handles account creation, email verification, and API key generation conversationally.

Or use the CLI directly:

openclaw pentatonic-memory local

What it does

OpenClaw's context engine hooks fire on every lifecycle event:

Ingest -- every user and assistant message is stored with embeddings and HyDE query expansion, then distilled into atomic facts in the background (see Distilled memory)
Assemble -- relevant memories are injected as system prompt context before every model run
Compact -- decay cycle runs when the context window fills
After turn -- high-access memories get consolidated to the semantic layer

Plus agent-callable tools: memory_search, memory_store, memory_layers.
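
The Compact and After turn steps above can be pictured as two small passes over the stored memories. The sketch below is illustrative only: the `Memory` fields, the decay factor, the strength floor, and the access threshold are assumptions, not the plugin's real parameters.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    layer: str = "episodic"
    strength: float = 1.0
    access_count: int = 0


def decay(memories: list[Memory], factor: float = 0.9, floor: float = 0.2) -> list[Memory]:
    """Weaken every memory and drop the ones that fall below the floor."""
    kept = []
    for m in memories:
        m.strength *= factor
        if m.strength >= floor:
            kept.append(m)
    return kept


def consolidate(memories: list[Memory], min_access: int = 3) -> None:
    """Promote frequently accessed episodic memories to the semantic layer."""
    for m in memories:
        if m.layer == "episodic" and m.access_count >= min_access:
            m.layer = "semantic"


store = [
    Memory("User prefers dark mode", access_count=5),
    Memory("Asked about the weather once", strength=0.21),
]
store = decay(store)   # the weak memory drops below the floor and is removed
consolidate(store)     # the frequently accessed memory is promoted
print([(m.text, m.layer) for m in store])
```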

Configuration

After setup, the config lives in ~/.openclaw/pentatonic-memory.json. To switch modes, run setup again or edit that file directly.

You can also configure via openclaw.json:

{
  "plugins": {
    "slots": { "contextEngine": "pentatonic-memory" },
    "entries": {
      "pentatonic-memory": {
        "enabled": true,
        "config": {
          "database_url": "postgres://memory:memory@localhost:5433/memory",
          "embedding_url": "http://localhost:11435/v1",
          "embedding_model": "nomic-embed-text",
          "llm_url": "http://localhost:11435/v1",
          "llm_model": "llama3.2:3b"
        }
      }
    }
  }
}

For hosted mode, replace the contents of the "config" object with:

{
  "tes_endpoint": "https://your-company.api.pentatonic.com",
  "tes_client_id": "your-company",
  "tes_api_key": "tes_your-company_xxxxx"
}
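
If you generate openclaw.json from a script, the two variants can be built from one helper. This is a sketch under assumptions: `plugin_config` is a hypothetical function, and treating every listed key as required is a guess rather than documented plugin behavior. The key names themselves come from the examples above.

```python
import json

REQUIRED_KEYS = {
    "local": ["database_url", "embedding_url", "embedding_model", "llm_url", "llm_model"],
    "hosted": ["tes_endpoint", "tes_client_id", "tes_api_key"],
}


def plugin_config(mode: str, **values: str) -> dict:
    """Build the openclaw.json fragment for local or hosted mode."""
    if mode not in REQUIRED_KEYS:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [k for k in REQUIRED_KEYS[mode] if k not in values]
    if missing:
        raise ValueError(f"missing keys for {mode} mode: {missing}")
    config = {k: values[k] for k in REQUIRED_KEYS[mode]}
    return {
        "plugins": {
            "slots": {"contextEngine": "pentatonic-memory"},
            "entries": {"pentatonic-memory": {"enabled": True, "config": config}},
        }
    }


hosted = plugin_config(
    "hosted",
    tes_endpoint="https://your-company.api.pentatonic.com",
    tes_client_id="your-company",
    tes_api_key="tes_your-company_xxxxx",
)
print(json.dumps(hosted, indent=2))
```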

SDK: Wrap Your LLM Client

JavaScript

import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

// Wrap the provider client; calls to chat.completions.create() are tracked under this session.
const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

# Wrap the provider client; calls to chat.completions.create() are tracked under this session.
ai = tes.wrap(OpenAI(), session_id="conv-123")
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Supported Providers

Provider	Detection	Intercepted Method
OpenAI	`client.chat.completions.create`	`chat.completions.create()`
Anthropic	`client.messages.create`	`messages.create()`
Workers AI	`client.run` (JS only)	`run()`

All other methods pass through unchanged.
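
The Detection column suggests duck typing: the provider is inferred from which method path exists on the client. The sketch below shows that idea in Python; it is not the SDK's implementation. `detect_provider` and `_has_path` are hypothetical names, and the Workers AI row is omitted because the table marks it as JS only.

```python
from types import SimpleNamespace


def _has_path(obj, path: str) -> bool:
    """Return True if obj.a.b.c resolves to a callable for path 'a.b.c'."""
    for part in path.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return False
    return callable(obj)


# Checked in order; the first matching method path wins.
DETECTORS = [
    ("openai", "chat.completions.create"),
    ("anthropic", "messages.create"),
]


def detect_provider(client):
    """Return a provider name, or None when no known method path is present."""
    for name, path in DETECTORS:
        if _has_path(client, path):
            return name
    return None


# Stand-in objects that expose the same method paths as the real clients.
fake_openai = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: None))
)
fake_anthropic = SimpleNamespace(messages=SimpleNamespace(create=lambda **kw: None))

print(detect_provider(fake_openai), detect_provider(fake_anthropic))
```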

API Reference

`TESClient(config)`

Param	Type	Default	Description
`clientId`	`string`	required	Your tenant identifier
`apiKey`	`string`	required	TES API key
`endpoint`	`string`	required	TES instance URL
`userId`	`string`	`null`	User identifier for attribution
`captureContent`	`boolean`	`true`	Include message content in events
`maxContentLength`	`number`	`4096`	Truncate content beyond this length
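
As an illustration of how `captureContent` and `maxContentLength` might interact, here is a minimal sketch. `prepare_content` is a hypothetical helper, and measuring length in characters (rather than bytes or tokens) is an assumption.

```python
from typing import Optional


def prepare_content(text: str, capture_content: bool = True,
                    max_content_length: int = 4096) -> Optional[str]:
    """Return the content to attach to an event, or None when capture is off."""
    if not capture_content:
        return None
    return text[:max_content_length]  # shorter strings are returned unchanged


print(len(prepare_content("x" * 5000)))  # 4096
```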

`tes.wrap(client, opts?)`

Returns an instrumented proxy. Every intercepted call emits a CHAT_TURN event.

Option	Type	Default	Description
`sessionId`	`string`	auto-generated UUID	Links events from the same conversation
`metadata`	`object`	`{}`	Custom fields on every event
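
To show what an instrumented proxy could look like, the sketch below forwards every attribute to the wrapped client and intercepts a single method path. It is an assumption-laden illustration, not the SDK's code: `InstrumentedProxy`, the `emit` callback, and the `durationMs` field are hypothetical.

```python
import time
import uuid
from types import SimpleNamespace


class InstrumentedProxy:
    """Forward attribute access to the target; intercept one dotted method path."""

    def __init__(self, target, path, emit, session_id=None, metadata=None, prefix=""):
        self._target = target
        self._path = path
        self._emit = emit
        self._session_id = session_id or str(uuid.uuid4())
        self._metadata = metadata or {}
        self._prefix = prefix

    def __getattr__(self, name):
        value = getattr(self._target, name)
        full = f"{self._prefix}.{name}" if self._prefix else name
        if full == self._path and callable(value):
            return self._intercept(value)
        if self._path.startswith(full + "."):
            # Still on the way to the intercepted method: keep proxying.
            return InstrumentedProxy(value, self._path, self._emit,
                                     self._session_id, self._metadata, full)
        return value  # everything else passes through unchanged

    def _intercept(self, method):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = method(*args, **kwargs)
            self._emit({
                "type": "CHAT_TURN",
                "sessionId": self._session_id,
                "metadata": self._metadata,
                "durationMs": (time.perf_counter() - start) * 1000,
            })
            return result
        return wrapper


events = []
fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: {"ok": True})),
    models=SimpleNamespace(list=lambda: ["gpt-4o"]),
)
ai = InstrumentedProxy(fake_client, "chat.completions.create", events.append,
                       session_id="conv-123")
ai.chat.completions.create(model="gpt-4o", messages=[])
ai.models.list()  # not intercepted, so no event is emitted
print(len(events), events[0]["sessionId"])
```

Passing the same `session_id` to nested proxies is what links events from one conversation, matching the role the `sessionId` option plays in the table above.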

`tes.session(opts?)`

Returns a Session for manual event emission.

`session.emitChatTurn({ userMessage, assistantResponse, turnNumber? })`

Emits a CHAT_TURN event containing the data accumulated on the session, then resets the session for the next turn.

`normalizeResponse(raw)`

Standalone utility to normalize any LLM response:

import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";

const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
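
For intuition, the following sketch shows what normalizing two provider response shapes into the four fields above could involve. It works on plain dictionaries in the publicly documented OpenAI chat-completion and Anthropic messages shapes. `normalize_response` here is an illustrative stand-in, and the `input`/`output` usage keys are assumptions rather than the SDK's actual field names.

```python
def normalize_response(raw: dict) -> dict:
    """Map an OpenAI- or Anthropic-shaped response dict onto one common shape."""
    usage = raw.get("usage") or {}
    if "choices" in raw:  # OpenAI chat completion shape
        message = raw["choices"][0]["message"]
        return {
            "content": message.get("content") or "",
            "model": raw.get("model"),
            "usage": {
                "input": usage.get("prompt_tokens", 0),
                "output": usage.get("completion_tokens", 0),
            },
            "toolCalls": message.get("tool_calls") or [],
        }
    # Anthropic messages shape: content is a list of typed blocks.
    blocks = raw.get("content") or []
    return {
        "content": "".join(b["text"] for b in blocks if b.get("type") == "text"),
        "model": raw.get("model"),
        "usage": {
            "input": usage.get("input_tokens", 0),
            "output": usage.get("output_tokens", 0),
        },
        "toolCalls": [b for b in blocks if b.get("type") == "tool_use"],
    }


openai_like = {
    "model": "gpt-4o",
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
    "usage": {"prompt_tokens": 9, "completion_tokens": 2},
}
anthropic_like = {
    "model": "claude-example",  # placeholder model name
    "content": [
        {"type": "text", "text": "Hi!"},
        {"type": "tool_use", "id": "t1", "name": "lookup", "input": {}},
    ],
    "usage": {"input_tokens": 9, "output_tokens": 2},
}
print(normalize_response(openai_like)["content"],
      len(normalize_response(anthropic_like)["toolCalls"]))
```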

Health Checks (`doctor`)

Run a full health check of your SDK install at any time:

npx @pentatonic-ai/ai-agent-sdk doctor

doctor auto-detects which install path you're on (Local Memory, Hosted TES, or self-hosted Pentatonic platform) and runs only the checks that apply. Exit code is 0 for all-clear, 1 for warnings, 2 for critical.
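
The exit codes make doctor easy to call from scripts or scheduled jobs. The sketch below maps the documented codes to labels; `classify` and `run_doctor` are hypothetical helper names, and `run_doctor` assumes `npx` is on the PATH.

```python
import subprocess

# Exit codes as documented above.
SEVERITY = {0: "ok", 1: "warning", 2: "critical"}


def classify(exit_code: int) -> str:
    """Translate a doctor exit code into a severity label."""
    return SEVERITY.get(exit_code, "unknown")


def run_doctor() -> str:
    """Run the doctor command and return its severity label."""
    proc = subprocess.run(
        ["npx", "@pentatonic-ai/ai-agent-sdk", "doctor"],
        check=False,  # non-zero exit codes are expected and handled here
    )
    return classify(proc.returncode)


print(classify(2))  # critical
```

Call `run_doctor()` from a scheduled job or CI step and alert only when it returns something other than "ok".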

Common flags:

npx @pentatonic-ai/ai-agent-sdk doctor --json     # machine-readable
npx @pentatonic-ai/ai-agent-sdk doctor --alert    # silent unless issues
npx @pentatonic-ai/ai-agent-sdk doctor --no-plugins
npx @pentatonic-ai/ai-agent-sdk doctor --path local

What gets checked:

Universal — Node version, disk space, SDK config-file permissions
Local Memory — Postgres + pgvector + migrations, embedding/LLM endpoints, memory server port
Hosted TES — endpoint reachable, API key authenticates
Self-hosted platform — HybridRAG, Qdrant, Neo4j, vLLM (each optional, skipped when its env var is unset)

Plugins

Drop a .mjs file into ~/.config/pentatonic-ai/doctor-plugins/ to add your own checks. Useful for app-specific things — internal APIs, ingest freshness, custom infrastructure — without forking the SDK.

// ~/.config/pentatonic-ai/doctor-plugins/my-app.mjs
export default {
  name: "my-app",
  checks: [
    {
      name: "internal API",
      severity: "warning",
      run: async () => {
        const res = await fetch("https://internal/health");
        return res.ok
          ? { ok: true, msg: "200 OK" }
          : { ok: false, msg: `HTTP ${res.status}` };
      },
    },
  ],
};

See packages/doctor/README.md for the full plugin contract and programmatic API.

Architecture

        +-------------------+     +--------------------+
        | Claude Code Plugin|     |  OpenClaw Plugin   |
        | (hooks: auto-     |     | (context engine:   |
        |  search + store)  |     |  ingest, assemble, |
        +--------+----------+     |  compact, tools)   |
                 |                +--------+-----------+
                 |                         |
                 +------------+------------+
                              |
                  +-----------+-----------+
                  |                       |
            Local Memory            Hosted TES
            (Docker)                (Cloud)
                  |                       |
       +----+----+----+          +---+----+---+
       |    |    |    |          |   |    |   |
      PG  Ollama MCP HTTP      PG  R2  Queue Workers
      pgvector        API     pgvector       Modules

License

MIT

Release history

0.5.6 -- Apr 23, 2026
0.5.5 -- Apr 23, 2026
0.5.4 -- Apr 23, 2026
0.5.3 -- Apr 21, 2026
0.5.2 -- Apr 21, 2026
0.5.1 -- Apr 21, 2026 (this version)
0.5.0 -- Apr 21, 2026
0.4.3 -- Apr 15, 2026
0.4.2 -- Apr 15, 2026
0.4.1 -- Apr 15, 2026
0.4.0b1 (pre-release) -- Apr 14, 2026
0.3.0 -- Mar 30, 2026
0.3.0b3 (pre-release) -- Mar 30, 2026
