pentatonic-ai-agent-sdk

AI Agent SDK — observability, memory, and analytics for LLM applications. Provider-agnostic. Tracks token usage, tool calls, conversations, and enables shared team memory.

Pentatonic

AI Agent SDK

Observability, memory, and analytics for LLM applications.
Run locally or use hosted TES. JavaScript & Python.

Overview
Local Memory (self-hosted)
Hosted TES
Claude Code Plugin
SDK: Wrap Your LLM Client
Supported Providers
API Reference
Architecture

Overview

Two ways to use the SDK:

Local Memory -- Run a fully private memory system on your own machine. PostgreSQL + pgvector + Ollama in Docker. No API keys, no cloud. Your agent gets persistent, searchable memory backed by multi-signal retrieval and HyDE query expansion.

Hosted TES -- Connect to Pentatonic's Thing Event System for production-grade observability, higher-dimensional embeddings, conversation analytics, and team-wide shared memory.

Both paths use the same Claude Code plugin. The hooks auto-search on every prompt and auto-store every conversation turn.

Local Memory (self-hosted)

Run the full memory stack locally. Requires Docker and ~4GB disk for models.

1. Set up

npx @pentatonic-ai/ai-agent-sdk memory

This starts PostgreSQL + pgvector, Ollama, and the memory server. It pulls embedding and chat models, and writes the local config.

2. Install the Claude Code plugin

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

That's it. The plugin hooks automatically search memories on every prompt and store every conversation turn. Fully local, fully private.

What you get

Automatic memory -- every conversation turn is stored with embeddings and HyDE query expansion
Semantic search -- multi-signal retrieval combining vector similarity, BM25 full-text, recency decay, and access frequency
Memory layers -- episodic (recent), semantic (consolidated), procedural (how-to), working (temporary)
Decay and consolidation -- memories fade over time; frequently accessed ones get promoted
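To make the multi-signal retrieval and decay ideas above concrete, here is a minimal sketch of how four signals might be blended into one ranking score. The weights, the 30-day half-life, the log-scaled frequency term, and the names `Candidate`, `recency`, `combined_score`, and `rank` are all assumptions for illustration; the SDK's actual scoring formula is not documented here.

```python
import math
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    text: str
    vector_similarity: float  # cosine similarity, assumed to be in [0, 1]
    bm25: float               # raw BM25 full-text score (unbounded)
    last_accessed: float      # Unix timestamp of the most recent access
    access_count: int         # how many times the memory has been retrieved


def recency(last_accessed: float, now: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 right now, 0.5 after one half-life."""
    age_days = max(0.0, now - last_accessed) / 86400.0
    return 0.5 ** (age_days / half_life_days)


def combined_score(c: Candidate, now: float, max_bm25: float) -> float:
    """Weighted sum of the four signals. The weights are illustrative only."""
    bm25_norm = c.bm25 / max_bm25 if max_bm25 > 0 else 0.0
    # Log scaling so heavily used memories do not dominate; capped at 1.0.
    frequency = min(math.log1p(c.access_count) / math.log1p(100), 1.0)
    return (
        0.5 * c.vector_similarity
        + 0.3 * bm25_norm
        + 0.1 * recency(c.last_accessed, now)
        + 0.1 * frequency
    )


def rank(candidates: list, now: Optional[float] = None) -> list:
    """Sort candidates from most to least relevant."""
    now = time.time() if now is None else now
    max_bm25 = max((c.bm25 for c in candidates), default=0.0)
    return sorted(candidates, key=lambda c: combined_score(c, now, max_bm25), reverse=True)
```

With equal similarity and BM25 scores, a memory touched today outranks one last touched two months ago, which is the "fade over time" behaviour described in the list.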

Change models

EMBEDDING_MODEL=mxbai-embed-large LLM_MODEL=qwen2.5:7b npx @pentatonic-ai/ai-agent-sdk memory
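For readers scripting around the CLI, the same two environment variables can be resolved in Python as sketched below. The fallback model names are taken from the library example in this README; whether the CLI itself uses those defaults is an assumption, and `ModelConfig` and `resolve_models` are hypothetical helper names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ModelConfig:
    embedding_model: str
    llm_model: str


def resolve_models(env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Read EMBEDDING_MODEL and LLM_MODEL, falling back to assumed defaults."""
    return ModelConfig(
        # Assumed defaults: the model names used elsewhere in this README.
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
        llm_model=env.get("LLM_MODEL", "llama3.2:3b"),
    )
```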

Raspberry Pi

A Pi 5 with 8GB RAM runs the full stack. nomic-embed-text (~300MB) plus llama3.2:3b (~2GB) come to roughly 2.3GB of model weights, which fits comfortably within 8GB.

Use as a library

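import pg from 'pg';
import { createMemorySystem } from '@pentatonic-ai/ai-agent-sdk/memory';

// Assumption: `db` takes a node-postgres Pool. DATABASE_URL is a placeholder
// for your own connection string.
const pgPool = new pg.Pool({ connectionString: process.env.DATABASE_URL });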

const memory = createMemorySystem({
  db: pgPool,
  embedding: { url: 'http://localhost:11434/v1', model: 'nomic-embed-text' },
  llm: { url: 'http://localhost:11434/v1', model: 'llama3.2:3b' },
});

await memory.migrate();
await memory.ensureLayers('my-app');
await memory.ingest('User prefers dark mode', { clientId: 'my-app' });
const results = await memory.search('preferences', { clientId: 'my-app' });
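The `url` values in the example above point at Ollama's OpenAI-compatible API on port 11434. As a rough illustration of what an embedding call against that endpoint looks like from Python, the sketch below builds a request to the `/embeddings` route. That the SDK issues exactly this request is an assumption, and `build_embedding_request` and `embed` are hypothetical helper names.

```python
import json
import urllib.request


def build_embedding_request(
    text: str,
    base_url: str = "http://localhost:11434/v1",
    model: str = "nomic-embed-text",
) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-style /embeddings route."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str) -> list:
    """Send the request. Requires the local stack to be running."""
    with urllib.request.urlopen(build_embedding_request(text)) as resp:
        payload = json.load(resp)
    # OpenAI-style responses carry vectors under data[i].embedding.
    return payload["data"][0]["embedding"]
```

Calling `embed("User prefers dark mode")` only works once the Docker stack from the setup step is up; `build_embedding_request` can be inspected offline.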

Hosted TES

Connect to Pentatonic's hosted infrastructure for production use.

1. Create an account

npx @pentatonic-ai/ai-agent-sdk init

This walks you through account creation, email verification, and API key generation. You'll get:

TES_ENDPOINT=https://your-company.api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx
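A small sketch of loading these three variables in Python and failing early when one is missing. `TESSettings` and `load_tes_settings` are hypothetical helpers, not part of the SDK, and the `https://` check is an assumption based on the example endpoint above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("TES_ENDPOINT", "TES_CLIENT_ID", "TES_API_KEY")


@dataclass(frozen=True)
class TESSettings:
    endpoint: str
    client_id: str
    api_key: str


def load_tes_settings(env: Mapping[str, str] = os.environ) -> TESSettings:
    """Read the TES_* variables and raise a clear error if any is absent."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    if not env["TES_ENDPOINT"].startswith("https://"):
        # Assumption: hosted endpoints are always served over HTTPS.
        raise RuntimeError("TES_ENDPOINT should be an https:// URL")
    return TESSettings(
        endpoint=env["TES_ENDPOINT"],
        client_id=env["TES_CLIENT_ID"],
        api_key=env["TES_API_KEY"],
    )
```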

2. Install

npm install @pentatonic-ai/ai-agent-sdk

pip install pentatonic-ai-agent-sdk

What you get (in addition to local features)

Higher-dimensional embeddings -- NV-Embed-v2 (4096d) for better retrieval accuracy
Conversation analytics -- session metrics, search attribution, dead-end detection
Team-wide shared memory -- semantic search across your team's AI interactions
Admin dashboard -- visualize conversations, token usage, and memory explorer
Multi-tenancy -- isolated databases per client

Claude Code Plugin

Works with both local and hosted setups. Install once, switch modes via config.

Install via marketplace

/plugin marketplace add Pentatonic-Ltd/ai-agent-sdk
/plugin install tes-memory@pentatonic-ai

Set up

For hosted TES:

/tes-memory:tes-setup

For local memory:

npx @pentatonic-ai/ai-agent-sdk memory

What it tracks

Every conversation turn -- user messages, assistant responses, tool calls, duration
Automatic memory search -- relevant memories injected as context on every prompt
Automatic memory storage -- every turn stored with embeddings and HyDE queries
Token usage -- input, output, cache read, cache creation tokens per turn
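One plausible reading of "stored with embeddings and HyDE queries" is that each memory is indexed alongside hypothetical queries it could answer, so a later question can match even when it shares few words with the memory itself. The sketch below illustrates that reading only. `toy_embed` is a bag-of-words stand-in for a real embedding model, `generate_queries` stands in for an LLM call, and `HydeStore` is a hypothetical class, not the SDK's implementation.

```python
import math
import re
from collections import Counter
from typing import Callable


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class HydeStore:
    def __init__(self, generate_queries: Callable[[str], list]):
        self.generate_queries = generate_queries  # in practice, an LLM call
        self.rows = []  # list of (memory text, list of vectors)

    def ingest(self, memory: str) -> None:
        # Index the memory itself plus each hypothetical query it could answer.
        texts = [memory, *self.generate_queries(memory)]
        self.rows.append((memory, [toy_embed(t) for t in texts]))

    def search(self, query: str, limit: int = 3) -> list:
        q = toy_embed(query)
        # A memory's score is its best match across all of its vectors.
        scored = [(max(cosine(q, v) for v in vectors), memory) for memory, vectors in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [memory for score, memory in scored[:limit] if score > 0]


def fake_llm(memory: str) -> list:
    """Canned hypothetical queries, used here instead of a chat model."""
    if "dark mode" in memory:
        return ["what theme does the user like"]
    return ["which editor does the user use"]


store = HydeStore(fake_llm)
store.ingest("User prefers dark mode")
store.ingest("User writes code in vim")
top_hit = store.search("what theme does the user like")[0]
```

The query shares only the word "user" with the stored memory, yet the memory still ranks first because the hypothetical query indexed with it matches closely.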

SDK: Wrap Your LLM Client

JavaScript

import OpenAI from "openai";
import { TESClient } from "@pentatonic-ai/ai-agent-sdk";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123" });
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Python

import os

from openai import OpenAI
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

ai = tes.wrap(OpenAI(), session_id="conv-123")
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Supported Providers

Provider	Detection	Intercepted Method
OpenAI	`client.chat.completions.create`	`chat.completions.create()`
Anthropic	`client.messages.create`	`messages.create()`
Workers AI	`client.run` (JS only)	`run()`

All other methods pass through unchanged.
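The detect, intercept, and pass-through behaviour in the table can be pictured with a small proxy. This is a sketch of the general technique, not the SDK's code: `detect_provider`, `wrap`, `_Proxy`, and the event fields are hypothetical, and a fake client stands in for `OpenAI()` so the example runs offline.

```python
import time
from types import SimpleNamespace
from typing import Any, Callable, Optional

# Attribute paths from the table above (Workers AI omitted: JS only).
PATHS = {"openai": ("chat", "completions", "create"), "anthropic": ("messages", "create")}


def detect_provider(client: Any) -> Optional[str]:
    """Duck-type the client by checking which method path exists."""
    chat = getattr(client, "chat", None)
    if callable(getattr(getattr(chat, "completions", None), "create", None)):
        return "openai"
    if callable(getattr(getattr(client, "messages", None), "create", None)):
        return "anthropic"
    return None


class _Proxy:
    """Forwards attribute access, swapping in `replacement` at the end of `path`."""

    def __init__(self, target: Any, path: tuple, replacement: Callable):
        self._target, self._path, self._replacement = target, path, replacement

    def __getattr__(self, name: str) -> Any:
        value = getattr(self._target, name)
        if self._path and name == self._path[0]:
            if len(self._path) == 1:
                return self._replacement
            return _Proxy(value, self._path[1:], self._replacement)
        return value  # everything else passes through unchanged


def wrap(client: Any, emit: Callable[[dict], None], session_id: str) -> Any:
    provider = detect_provider(client)
    if provider is None:
        return client  # unknown client: nothing to intercept
    original = client
    for part in PATHS[provider]:
        original = getattr(original, part)

    def instrumented(*args: Any, **kwargs: Any) -> Any:
        started = time.perf_counter()
        response = original(*args, **kwargs)
        emit({
            "type": "CHAT_TURN",
            "provider": provider,
            "sessionId": session_id,
            "durationMs": (time.perf_counter() - started) * 1000,
        })
        return response

    return _Proxy(client, PATHS[provider], instrumented)


# Usage with a fake client standing in for OpenAI():
fake = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=lambda **kw: {"ok": True})),
    models=SimpleNamespace(list=lambda: ["m"]),
)
events = []
ai = wrap(fake, events.append, session_id="conv-123")
reply = ai.chat.completions.create(model="gpt-4o", messages=[])
models = ai.models.list()  # not intercepted, so no event is emitted
```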

API Reference

`TESClient(config)`

Param	Type	Default	Description
`clientId`	`string`	required	Your tenant identifier
`apiKey`	`string`	required	TES API key
`endpoint`	`string`	required	TES instance URL
`userId`	`string`	`null`	User identifier for attribution
`captureContent`	`boolean`	`true`	Include message content in events
`maxContentLength`	`number`	`4096`	Truncate content beyond this length
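The interaction between `captureContent` and `maxContentLength` can be sketched as follows. Whether the SDK counts characters or bytes, and whether it appends a truncation marker, is not stated in this README; the sketch assumes plain character truncation, and `prepare_content` is a hypothetical name.

```python
from typing import Optional


def prepare_content(
    text: str,
    capture_content: bool = True,
    max_content_length: int = 4096,
) -> Optional[str]:
    """Return the content to attach to an event, or None when capture is off."""
    if not capture_content:
        return None  # message bodies are left out of events entirely
    if len(text) <= max_content_length:
        return text
    # Assumption: truncation is by character count, with no marker appended.
    return text[:max_content_length]
```

Setting `capture_content=False` in this sketch mirrors the case where only metadata such as token counts should leave the application.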

`tes.wrap(client, opts?)`

Returns an instrumented proxy. Every intercepted call emits a CHAT_TURN event.

Option	Type	Default	Description
`sessionId`	`string`	auto-generated UUID	Links events from the same conversation
`metadata`	`object`	`{}`	Custom fields on every event

`tes.session(opts?)`

Returns a Session for manual event emission.

`session.emitChatTurn({ userMessage, assistantResponse, turnNumber? })`

Emits a CHAT_TURN event containing the data accumulated in the session so far, then resets that accumulated state.
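An accumulate, emit, and reset cycle of this kind might look like the sketch below. The `record_usage` and `record_tool_call` methods and the event field names are hypothetical; only the `CHAT_TURN` event type and the `emitChatTurn` parameters come from this README.

```python
import uuid
from typing import Callable, Optional


class Session:
    def __init__(self, emit: Callable[[dict], None], session_id: Optional[str] = None):
        self.emit = emit
        self.session_id = session_id or str(uuid.uuid4())
        self.turn = 0
        self._reset()

    def _reset(self) -> None:
        # Fresh objects each time, so already-emitted events are not mutated.
        self.usage = {"input_tokens": 0, "output_tokens": 0}
        self.tool_calls = []

    def record_usage(self, input_tokens: int, output_tokens: int) -> None:
        self.usage["input_tokens"] += input_tokens
        self.usage["output_tokens"] += output_tokens

    def record_tool_call(self, name: str, arguments: dict) -> None:
        self.tool_calls.append({"name": name, "arguments": arguments})

    def emit_chat_turn(
        self,
        user_message: str,
        assistant_response: str,
        turn_number: Optional[int] = None,
    ) -> dict:
        # Use the explicit turn number if given, otherwise count up.
        self.turn = turn_number if turn_number is not None else self.turn + 1
        event = {
            "type": "CHAT_TURN",
            "sessionId": self.session_id,
            "turnNumber": self.turn,
            "userMessage": user_message,
            "assistantResponse": assistant_response,
            "usage": self.usage,
            "toolCalls": self.tool_calls,
        }
        self.emit(event)
        self._reset()
        return event
```

Several model calls and tool calls can be recorded before a single `emit_chat_turn`, which is useful when one user-visible turn involves multiple requests.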

`normalizeResponse(raw)`

Standalone utility to normalize any LLM response:

import { normalizeResponse } from "@pentatonic-ai/ai-agent-sdk";

// `openaiResponse` is a raw provider response, e.g. the result of chat.completions.create(...)
const { content, model, usage, toolCalls } = normalizeResponse(openaiResponse);
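To show the kind of work such a normalizer does, here is a Python sketch that maps OpenAI-style and Anthropic-style response dictionaries onto the same four fields. The input shapes follow those providers' public response formats; the output layout and the `normalize_response` function are assumptions and may differ from what the SDK returns.

```python
from typing import Any


def normalize_response(raw: dict) -> dict:
    """Map a raw provider response onto content, model, usage, and tool_calls."""
    usage = raw.get("usage") or {}
    if "choices" in raw:  # OpenAI chat completion shape
        message = raw["choices"][0]["message"]
        return {
            "content": message.get("content") or "",
            "model": raw.get("model"),
            "usage": {
                "input_tokens": usage.get("prompt_tokens", 0),
                "output_tokens": usage.get("completion_tokens", 0),
            },
            # OpenAI delivers tool arguments as a JSON string.
            "tool_calls": [
                {"name": c["function"]["name"], "arguments": c["function"]["arguments"]}
                for c in message.get("tool_calls") or []
            ],
        }
    # Anthropic messages shape: `content` is a list of typed blocks.
    blocks: list = raw.get("content") or []
    return {
        "content": "".join(b["text"] for b in blocks if b.get("type") == "text"),
        "model": raw.get("model"),
        "usage": {
            "input_tokens": usage.get("input_tokens", 0),
            "output_tokens": usage.get("output_tokens", 0),
        },
        # Anthropic delivers tool input as an already-parsed object.
        "tool_calls": [
            {"name": b["name"], "arguments": b["input"]}
            for b in blocks
            if b.get("type") == "tool_use"
        ],
    }
```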

Architecture

                    +-----------------------+
                    |   Claude Code Plugin  |
                    |   (hooks: auto-search |
                    |    + auto-store)      |
                    +-----------+-----------+
                                |
                    +-----------+-----------+
                    |                       |
              Local Memory            Hosted TES
              (Docker)                (Cloud)
                    |                       |
         +----+----+----+          +---+----+---+
         |    |    |    |          |   |    |   |
        PG  Ollama MCP HTTP      PG  R2  Queue Workers
        pgvector        API     pgvector       Modules

License

MIT

Release history

Version	Date
0.5.6	Apr 23, 2026
0.5.5	Apr 23, 2026
0.5.4	Apr 23, 2026
0.5.3	Apr 21, 2026
0.5.2	Apr 21, 2026
0.5.1	Apr 21, 2026
0.5.0	Apr 21, 2026
0.4.3	Apr 15, 2026
0.4.2	Apr 15, 2026
0.4.1	Apr 15, 2026
0.4.0b1 (pre-release, this version)	Apr 14, 2026
0.3.0	Mar 30, 2026
0.3.0b3 (pre-release)	Mar 30, 2026

Download files

Source Distribution

pentatonic_ai_agent_sdk-0.4.0b1.tar.gz (12.0 kB), uploaded Apr 14, 2026, Source

Built Distribution

pentatonic_ai_agent_sdk-0.4.0b1-py3-none-any.whl (14.7 kB), uploaded Apr 14, 2026, Python 3

File details

Details for the file pentatonic_ai_agent_sdk-0.4.0b1.tar.gz.

File metadata

Download URL: pentatonic_ai_agent_sdk-0.4.0b1.tar.gz
Upload date: Apr 14, 2026
Size: 12.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for pentatonic_ai_agent_sdk-0.4.0b1.tar.gz
Algorithm	Hash digest
SHA256	`e54ddff5f404c1b2c6e39e38b925cb31cecca2d0fe28266b38d8d2e4d32fc098`
MD5	`55919978fc065305802b5996e8749214`
BLAKE2b-256	`4db3e674845f08019543d952caa2e42f3575ff353b1afde4d36e70f4b9eccf02`

File details

Details for the file pentatonic_ai_agent_sdk-0.4.0b1-py3-none-any.whl.

File metadata

Download URL: pentatonic_ai_agent_sdk-0.4.0b1-py3-none-any.whl
Upload date: Apr 14, 2026
Size: 14.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for pentatonic_ai_agent_sdk-0.4.0b1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2696ab6a4a7529016dc3176890c167372f20463a0dcd7eb7fc7f7a33c48ecab7`
MD5	`f75b53a5059b39b047ec8b107b8fafb3`
BLAKE2b-256	`9f79717ce5574a21936253b9ad1bcbc93991f262d21cbb13cea51b80bfe31afa`
