@pentatonic-ai/agent-events

LLM observability SDK — track token usage, tool calls, and conversations via Pentatonic TES.

Provider-agnostic: automatically wraps OpenAI, Anthropic, and Cloudflare Workers AI clients. Available for both JavaScript and Python.

Getting Started

1. Create an account and get your API key

npx @pentatonic-ai/agent-events init

This will walk you through:

  • Creating a Pentatonic account (email, company name, password)
  • Choosing a data region (EU or US)
  • Email verification
  • Generating your API key

At the end you'll see your credentials:

TES_ENDPOINT=https://api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx

Add these to your environment (.env, secrets manager, etc.) and the CLI will install the SDK for you.

2. Or install manually

If you already have an account, install the SDK directly:

npm install @pentatonic-ai/agent-events
pip install pentatonic-agent-events

You can create API keys in the Pentatonic dashboard.

Quick Start

JavaScript

import { TESClient } from "@pentatonic-ai/agent-events";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

Python

from pentatonic_agent_events import TESClient
import os

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

Wrap any LLM client (automatic tracking)

tes.wrap() auto-detects your client and intercepts every call — each one emits a CHAT_TURN event automatically. Pass an optional sessionId to link events from the same conversation, and metadata to attach custom fields.

JavaScript — OpenAI

import OpenAI from "openai";

const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123", metadata: { userId: "u_1" } });

// Every create() call automatically emits a CHAT_TURN event
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

ai.sessionId; // "conv-123" — or auto-generated UUID if not provided

Python — OpenAI

from openai import OpenAI

ai = tes.wrap(OpenAI(), session_id="conv-123", metadata={"user_id": "u_1"})

# Every create() call automatically emits a CHAT_TURN event
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

ai.session_id  # "conv-123" — or auto-generated UUID if not provided

JavaScript — Anthropic

import Anthropic from "@anthropic-ai/sdk";

const claude = tes.wrap(new Anthropic());

const result = await claude.messages.create({
  model: "claude-sonnet-4-6-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

Python — Anthropic

from anthropic import Anthropic

claude = tes.wrap(Anthropic())

result = claude.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

JavaScript — Cloudflare Workers AI

// Cloudflare Workers AI binding
const ai = tes.wrap(env.AI, { sessionId: sid, metadata: { shop: shopDomain } });

// run() is intercepted automatically
const result = await ai.run("@cf/meta/llama-3.1-8b-instruct", {
  messages: [{ role: "user", content: "Hello!" }],
});

Note: Workers AI is a Cloudflare-specific binding and is only available in JavaScript.

Tool-calling loops

For multi-round tool loops, just keep calling the wrapped client. Each create()/run() call emits its own event, and they're linked by sessionId. The dashboard aggregates tokens, tool calls, and turns per session automatically.

JavaScript

const ai = tes.wrap(new OpenAI(), { sessionId: "conv-101" });

// Round 1: AI requests a tool call — emits event with tool_calls
const r1 = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Find me running shoes" }],
  tools: [searchTool],
});

// Execute tool, feed results back...

// Round 2: AI responds with final answer — emits another event
const r2 = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    ...messages,
    { role: "tool", tool_call_id: r1.choices[0].message.tool_calls[0].id, content: toolResult },
  ],
});

// That's it. No manual emit needed. Both events share sessionId "conv-101".

Python

ai = tes.wrap(OpenAI(), session_id="conv-101")

r1 = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Find me running shoes"}],
    tools=[search_tool],
)

# Execute tool, feed results back...

r2 = ai.chat.completions.create(
    model="gpt-4o",
    messages=[
        *messages,
        {"role": "tool", "tool_call_id": r1.choices[0].message.tool_calls[0].id, "content": tool_result},
    ],
)

# No manual emit needed.

Manual session (full control)

If you don't want to use tes.wrap(), create a session directly:

JavaScript

const session = tes.session({
  sessionId: "conv-123",
  metadata: { userId: "u_456" },
});

// Call your LLM however you like
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What is 2+2?" }],
});

// Record the response (accumulates tokens, tool calls, model)
session.record(response);

// Emit when the turn is complete
await session.emitChatTurn({
  userMessage: "What is 2+2?",
  assistantResponse: response.choices[0].message.content,
});

Python

session = tes.session(
    session_id="conv-123",
    metadata={"user_id": "u_456"},
)

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)

session.record(response)

session.emit_chat_turn(
    user_message="What is 2+2?",
    assistant_response=response.choices[0].message.content,
)

API Reference

TESClient

Creates a new client.

JavaScript

new TESClient({ clientId, apiKey, endpoint, headers?, captureContent?, maxContentLength? })

Python

TESClient(client_id, api_key, endpoint, headers=None, capture_content=True, max_content_length=4096)

| Param (JS / Python) | Type | Default | Description |
| --- | --- | --- | --- |
| clientId / client_id | string | required | Your application/tenant identifier |
| apiKey / api_key | string | required | TES service API key (sent as x-service-key header) |
| endpoint / endpoint | string | required | TES instance URL (must be https://, except localhost for dev) |
| headers / headers | object / dict | {} | Additional headers to include in every request |
| captureContent / capture_content | boolean / bool | true / True | Whether to include message content in events |
| maxContentLength / max_content_length | number / int | 4096 | Truncate content beyond this length |

tes.wrap(client, opts?)

Returns a Proxy (JS) or wrapper (Python) around any supported LLM client. Every intercepted call emits a CHAT_TURN event automatically.

JavaScript

const ai = tes.wrap(client, { sessionId, metadata });

Python

ai = tes.wrap(client, session_id=None, metadata=None)

| Option (JS / Python) | Type | Default | Description |
| --- | --- | --- | --- |
| sessionId / session_id | string | crypto.randomUUID() / uuid.uuid4() | Links events from the same conversation |
| metadata / metadata | object / dict | {} | Custom fields included in every emitted event |

Auto-detects the provider:

| Client | Detection | Intercepted method |
| --- | --- | --- |
| OpenAI | client.chat.completions.create | chat.completions.create() |
| Anthropic | client.messages.create | messages.create() |
| Workers AI | client.run (JS only) | run() |

All other methods/properties pass through unchanged. The wrapped client exposes ai.sessionId (JS) or ai.session_id (Python).
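
For example, a minimal Python sketch building on the tes client from Quick Start (the embeddings call is only meant to illustrate a method that is not intercepted):

Python

from openai import OpenAI

ai = tes.wrap(OpenAI(), session_id="conv-42")

# Intercepted: emits a CHAT_TURN event
ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Not intercepted: passes straight through to the underlying OpenAI client
ai.embeddings.create(model="text-embedding-3-small", input="Hello!")

print(ai.session_id)  # "conv-42"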

tes.session(opts?)

Returns a Session instance.

| Option (JS / Python) | Type | Default | Description |
| --- | --- | --- | --- |
| sessionId / session_id | string | crypto.randomUUID() / uuid.uuid4() | Conversation/session identifier |
| metadata / metadata | object / dict | {} | Extra fields included in every emitted event |

session.record(rawResponse)

Normalizes an LLM response and accumulates token usage, tool calls, and model info. Accepts responses from any supported provider. Returns the normalized response.
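
As a rough Python sketch of how record() fits into a multi-round loop (openai, messages, and search_tool are placeholders, as in the earlier examples):

Python

# Round 1: tool call requested; usage is added to the session totals
r1 = openai.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=[search_tool],
)
normalized = session.record(r1)  # returns the normalized response

# Round 2: final answer; usage accumulates on top of round 1
r2 = openai.chat.completions.create(model="gpt-4o", messages=messages)
session.record(r2)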

session.emitChatTurn() / session.emit_chat_turn()

Sends a CHAT_TURN event to TES with accumulated usage data, then resets counters.

| Param (JS / Python) | Type | Description |
| --- | --- | --- |
| userMessage / user_message | string | The user's message |
| assistantResponse / assistant_response | string | The assistant's response |
| turnNumber / turn_number | number / int | Optional turn number |

session.emitToolUse() / session.emit_tool_use()

Sends a TOOL_USE event for individual tool invocations.

| Param (JS / Python) | Type | Description |
| --- | --- | --- |
| tool / tool | string | Tool name |
| args / args | object / dict | Tool arguments |
| resultSummary / result_summary | string | Optional result summary |
| durationMs / duration_ms | number / int | Optional duration in milliseconds |
| turnNumber / turn_number | number / int | Optional turn number |
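
A short Python sketch of recording one tool invocation manually (search_products is a hypothetical application function, not part of this SDK):

Python

import time

start = time.monotonic()
products = search_products(query="running shoes")  # your own tool
session.emit_tool_use(
    tool="search_products",
    args={"query": "running shoes"},
    result_summary=f"{len(products)} products found",
    duration_ms=int((time.monotonic() - start) * 1000),
    turn_number=1,
)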

session.emitSessionStart() / session.emit_session_start()

Sends a SESSION_START event.

session.totalUsage / session.total_usage

Returns current accumulated usage: { prompt_tokens, completion_tokens, total_tokens, ai_rounds }.
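
Putting the manual-session calls together, a minimal sketch (assuming the openai client from the example above; the printed shape is the one documented for total_usage):

Python

session = tes.session(session_id="conv-123")
session.emit_session_start()

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
session.record(response)

print(session.total_usage)  # prompt_tokens, completion_tokens, total_tokens, ai_rounds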

normalizeResponse(raw) / normalize_response(raw)

Standalone utility to normalize any LLM response into a consistent shape:

JavaScript

import { normalizeResponse } from "@pentatonic-ai/agent-events";

const normalized = normalizeResponse(openaiResponse);
// { content, model, usage: { prompt_tokens, completion_tokens }, toolCalls: [{ tool, args }] }

Python

from pentatonic_agent_events import normalize_response

normalized = normalize_response(openai_response)
# { "content", "model", "usage": { "prompt_tokens", "completion_tokens" }, "tool_calls": [{ "tool", "args" }] }

Note: In Python, the normalized response uses tool_calls (snake_case) instead of toolCalls (camelCase).

Events Emitted

All events are sent to the TES GraphQL API (emitEvent mutation) authenticated via x-service-key and x-client-id headers.

| Event Type | Entity Type | When |
| --- | --- | --- |
| CHAT_TURN | conversation | Every create()/run() call via wrap(), or manually via session.emitChatTurn() |
| TOOL_USE | conversation | Via session.emitToolUse() (manual only) |
| SESSION_START | conversation | Via session.emitSessionStart() (manual only) |

Supported Providers

| Provider | Auto-wrap | Manual session | Response normalization |
| --- | --- | --- | --- |
| OpenAI (and compatible: Azure, Groq, Together, Mistral) | JS + Python | JS + Python | JS + Python |
| Anthropic | JS + Python | JS + Python | JS + Python |
| Cloudflare Workers AI | JS only | JS only | JS + Python |
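
For the OpenAI-compatible providers, one approach (a sketch, not taken from this README) is to point the OpenAI client at the provider's endpoint before wrapping it; the Groq base URL, model name, and environment variable below are assumptions for illustration:

Python

import os
from openai import OpenAI

groq = tes.wrap(
    OpenAI(
        base_url="https://api.groq.com/openai/v1",
        api_key=os.environ["GROQ_API_KEY"],
    ),
    session_id="conv-202",
)

result = groq.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Hello!"}],
)

Because detection keys off chat.completions.create, a client wrapped this way is tracked the same as a plain OpenAI client.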

Security

  • HTTPS enforced: The SDK rejects non-HTTPS endpoints (except localhost for development)
  • API key protection: Stored as a non-enumerable property (JS) or private attribute (Python) — won't appear in JSON.stringify, repr(), or error reporters
  • Content controls: Set captureContent: false (JS) or capture_content=False (Python) to omit message content from events, or use maxContentLength / max_content_length to truncate (see the sketch after this list)
  • No runtime dependencies: Both the JavaScript and Python SDKs have zero external runtime dependencies
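
For example, a minimal Python sketch of a client configured with the content controls above (the option names come from the API reference; everything else mirrors the Quick Start):

Python

import os
from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
    capture_content=False,     # events keep usage and tool metadata, but no message text
    max_content_length=1024,   # only applies when content capture is enabled
)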

License

MIT
