
@pentatonic-ai/agent-events

LLM observability SDK — track token usage, tool calls, and conversations via Pentatonic TES.

Provider-agnostic: automatically wraps OpenAI, Anthropic, and Cloudflare Workers AI clients. Available for both JavaScript and Python.

Getting Started

1. Create an account and get your API key

npx @pentatonic-ai/agent-events init

This will walk you through:

  • Creating a Pentatonic account (email, company name, password)
  • Choosing a data region (EU or US)
  • Email verification
  • Generating your API key

At the end you'll see your credentials:

TES_ENDPOINT=https://api.pentatonic.com
TES_CLIENT_ID=your-company
TES_API_KEY=tes_your-company_xxxxx

Add these to your environment (.env, secrets manager, etc.) and the CLI will install the SDK for you.

2. Or install manually

If you already have an account, install the SDK directly:

npm install @pentatonic-ai/agent-events
pip install pentatonic-agent-events

You can create API keys in the Pentatonic dashboard.

Quick Start

JavaScript

import { TESClient } from "@pentatonic-ai/agent-events";

const tes = new TESClient({
  clientId: process.env.TES_CLIENT_ID,
  apiKey: process.env.TES_API_KEY,
  endpoint: process.env.TES_ENDPOINT,
});

Python

from pentatonic_agent_events import TESClient
import os

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
)

Wrap any LLM client (automatic tracking)

tes.wrap() auto-detects your client and intercepts every call — each one emits a CHAT_TURN event automatically. Pass an optional sessionId to link events from the same conversation, and metadata to attach custom fields.

JavaScript — OpenAI

import OpenAI from "openai";

const ai = tes.wrap(new OpenAI(), { sessionId: "conv-123", metadata: { userId: "u_1" } });

// Every create() call automatically emits a CHAT_TURN event
const result = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

ai.sessionId; // "conv-123" — or auto-generated UUID if not provided

Python — OpenAI

from openai import OpenAI

ai = tes.wrap(OpenAI(), session_id="conv-123", metadata={"user_id": "u_1"})

# Every create() call automatically emits a CHAT_TURN event
result = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

ai.session_id  # "conv-123" — or auto-generated UUID if not provided

JavaScript — Anthropic

import Anthropic from "@anthropic-ai/sdk";

const claude = tes.wrap(new Anthropic());

const result = await claude.messages.create({
  model: "claude-sonnet-4-6-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

Python — Anthropic

from anthropic import Anthropic

claude = tes.wrap(Anthropic())

result = claude.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

JavaScript — Cloudflare Workers AI

// Cloudflare Workers AI binding
const ai = tes.wrap(env.AI, { sessionId: sid, metadata: { shop: shopDomain } });

// run() is intercepted automatically
const result = await ai.run("@cf/meta/llama-3.1-8b-instruct", {
  messages: [{ role: "user", content: "Hello!" }],
});

Note: Workers AI is a Cloudflare-specific binding and is only available in JavaScript.

Tool-calling loops

For multi-round tool loops, just keep calling the wrapped client. Each create()/run() call emits its own event, and they're linked by sessionId. The dashboard aggregates tokens, tool calls, and turns per session automatically.

JavaScript

const ai = tes.wrap(new OpenAI(), { sessionId: "conv-101" });

// Round 1: AI requests a tool call — emits event with tool_calls
const r1 = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Find me running shoes" }],
  tools: [searchTool],
});

// Execute the tool, then feed results back. OpenAI expects the assistant
// message carrying tool_calls, followed by a tool message referencing its id.

// Round 2: AI responds with final answer — emits another event
const r2 = await ai.chat.completions.create({
  model: "gpt-4o",
  messages: [
    ...messages,
    r1.choices[0].message, // assistant message with the tool_calls
    { role: "tool", tool_call_id: r1.choices[0].message.tool_calls[0].id, content: toolResult },
  ],
});

// That's it. No manual emit needed. Both events share sessionId "conv-101".

Python

ai = tes.wrap(OpenAI(), session_id="conv-101")

r1 = ai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Find me running shoes"}],
    tools=[search_tool],
)

# Execute the tool, then feed results back: the assistant message carrying
# tool_calls, followed by a tool message referencing its id

r2 = ai.chat.completions.create(
    model="gpt-4o",
    messages=[
        *messages,
        r1.choices[0].message,  # assistant message with the tool_calls
        {"role": "tool", "tool_call_id": r1.choices[0].message.tool_calls[0].id, "content": tool_result},
    ],
)

# No manual emit needed.

Manual session (full control)

If you don't want to use tes.wrap(), create a session directly:

JavaScript

const session = tes.session({
  sessionId: "conv-123",
  metadata: { userId: "u_456" },
});

// Call your LLM however you like
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "What is 2+2?" }],
});

// Record the response (accumulates tokens, tool calls, model)
session.record(response);

// Emit when the turn is complete
await session.emitChatTurn({
  userMessage: "What is 2+2?",
  assistantResponse: response.choices[0].message.content,
});

Python

session = tes.session(
    session_id="conv-123",
    metadata={"user_id": "u_456"},
)

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)

session.record(response)

session.emit_chat_turn(
    user_message="What is 2+2?",
    assistant_response=response.choices[0].message.content,
)

API Reference

TESClient

Creates a new client.

JavaScript

new TESClient({ clientId, apiKey, endpoint, headers?, userId?, captureContent?, maxContentLength? })

Python

TESClient(client_id, api_key, endpoint, headers=None, user_id=None, capture_content=True, max_content_length=4096)

Parameters (JS / Python):

  • clientId / client_id (string, required): Your application/tenant identifier
  • apiKey / api_key (string, required): TES service API key (sent as the x-service-key header)
  • endpoint / endpoint (string, required): TES instance URL (must be https://, except localhost for dev)
  • headers / headers (object / dict, default {}): Additional headers to include in every request
  • userId / user_id (string, default null / None): Optional user identifier, included as data.attributes.userId on every event. Enables user-scoped memory and attribution.
  • captureContent / capture_content (boolean / bool, default true / True): Whether to include message content in events
  • maxContentLength / max_content_length (number / int, default 4096): Truncate content beyond this length
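
For illustration, a minimal Python sketch passing the optional fields; the extra header and the user id are placeholder values:

Python

import os

from pentatonic_agent_events import TESClient

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
    headers={"x-request-source": "docs-example"},  # hypothetical extra header
    user_id="u_456",          # stamped on every event as data.attributes.userId
    max_content_length=2048,  # truncate captured message content at 2048 chars
)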

tes.wrap(client, opts?)

Returns a Proxy (JS) or wrapper (Python) around any supported LLM client. Every intercepted call emits a CHAT_TURN event automatically.

JavaScript

const ai = tes.wrap(client, { sessionId, userId, metadata });

Python

ai = tes.wrap(client, session_id=None, user_id=None, metadata=None)

Options (JS / Python):

  • sessionId / session_id (string, default crypto.randomUUID() / uuid.uuid4()): Links events from the same conversation
  • userId / user_id (string, default: inherits from the client): Override the user identifier for this wrapped instance
  • metadata / metadata (object / dict, default {}): Custom fields included in every emitted event

Auto-detects the provider:

  • OpenAI: detected via client.chat.completions.create; intercepts chat.completions.create()
  • Anthropic: detected via client.messages.create; intercepts messages.create()
  • Workers AI: detected via client.run (JS only); intercepts run()

All other methods/properties pass through unchanged. The wrapped client exposes ai.sessionId (JS) or ai.session_id (Python).
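
As a small sketch, overriding the user identifier for one wrapped instance; the session and user ids are placeholders:

Python

from openai import OpenAI

# user_id here applies only to events emitted through this wrapper
ai = tes.wrap(OpenAI(), session_id="conv-200", user_id="u_42", metadata={"feature": "search"})
print(ai.session_id)  # "conv-200"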

tes.session(opts?)

Returns a Session instance.

Options (JS / Python):

  • sessionId / session_id (string, default crypto.randomUUID() / uuid.uuid4()): Conversation/session identifier
  • metadata / metadata (object / dict, default {}): Extra fields included in every emitted event

session.record(rawResponse)

Normalizes an LLM response and accumulates token usage, tool calls, and model info. Accepts responses from any supported provider. Returns the normalized response.
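
Since record() returns the normalized response, you can use it in place of calling normalize_response() yourself; a sketch, assuming the Python SDK returns the dict shape documented under normalize_response below:

Python

normalized = session.record(response)  # accumulates usage, returns the normalized shape
prompt_tokens = normalized["usage"]["prompt_tokens"]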

session.emitChatTurn() / session.emit_chat_turn()

Sends a CHAT_TURN event to TES with accumulated usage data, then resets counters.

Parameters (JS / Python):

  • userMessage / user_message (string): The user's message
  • assistantResponse / assistant_response (string): The assistant's response
  • turnNumber / turn_number (number / int): Optional turn number
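
For example, tagging a turn explicitly; the values are illustrative:

Python

session.emit_chat_turn(
    user_message="What is 2+2?",
    assistant_response="4",
    turn_number=2,  # optional: marks this as the session's second turn
)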

session.emitToolUse() / session.emit_tool_use()

Sends a TOOL_USE event for individual tool invocations.

Parameters (JS / Python):

  • tool / tool (string): Tool name
  • args / args (object / dict): Tool arguments
  • resultSummary / result_summary (string): Optional result summary
  • durationMs / duration_ms (number / int): Optional duration in milliseconds
  • turnNumber / turn_number (number / int): Optional turn number
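
A sketch of recording one tool invocation manually; search_products and its arguments are hypothetical:

Python

import time

start = time.monotonic()
products = search_products(query="running shoes")  # hypothetical tool function
session.emit_tool_use(
    tool="search_products",
    args={"query": "running shoes"},
    result_summary=f"{len(products)} products found",
    duration_ms=int((time.monotonic() - start) * 1000),
    turn_number=1,
)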

session.emitSessionStart() / session.emit_session_start()

Sends a SESSION_START event.
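
Typically called once when a conversation begins, before any turns are recorded:

Python

session = tes.session(session_id="conv-123")
session.emit_session_start()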

session.totalUsage / session.total_usage

Returns current accumulated usage: { prompt_tokens, completion_tokens, total_tokens, ai_rounds }.
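
For example, assuming the Python SDK exposes this as a dict (the printed values are illustrative):

Python

usage = session.total_usage
print(usage["total_tokens"], usage["ai_rounds"])  # e.g. 1542 2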

normalizeResponse(raw) / normalize_response(raw)

Standalone utility to normalize any LLM response into a consistent shape:

JavaScript

import { normalizeResponse } from "@pentatonic-ai/agent-events";

const normalized = normalizeResponse(openaiResponse);
// { content, model, usage: { prompt_tokens, completion_tokens }, toolCalls: [{ tool, args }] }

Python

from pentatonic_agent_events import normalize_response

normalized = normalize_response(openai_response)
# { "content", "model", "usage": { "prompt_tokens", "completion_tokens" }, "tool_calls": [{ "tool", "args" }] }

Note: In Python, the normalized response uses tool_calls (snake_case) instead of toolCalls (camelCase).

Events Emitted

All events are sent to the TES GraphQL API (emitEvent mutation) authenticated via x-service-key and x-client-id headers.

  • CHAT_TURN (entity: conversation): emitted on every create()/run() call via wrap(), or manually via session.emitChatTurn()
  • TOOL_USE (entity: conversation): via session.emitToolUse() (manual only)
  • SESSION_START (entity: conversation): via session.emitSessionStart() (manual only)

Supported Providers

  • OpenAI (and compatible: Azure, Groq, Together, Mistral): auto-wrap JS + Python; manual session JS + Python; response normalization JS + Python
  • Anthropic: auto-wrap JS + Python; manual session JS + Python; response normalization JS + Python
  • Cloudflare Workers AI: auto-wrap JS only; manual session JS only; response normalization JS + Python
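
Because detection keys on the client.chat.completions.create shape, any OpenAI-compatible client should wrap the same way; a sketch pointing the official openai package at a compatible endpoint (the URL and key are placeholders):

Python

import os

from openai import OpenAI

# Groq exposes an OpenAI-compatible API surface, so wrap() treats it as OpenAI
compat = OpenAI(base_url="https://api.groq.com/openai/v1", api_key=os.environ["GROQ_API_KEY"])
ai = tes.wrap(compat, session_id="conv-300")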

Security

  • HTTPS enforced: The SDK rejects non-HTTPS endpoints (except localhost for development)
  • API key protection: Stored as a non-enumerable property (JS) or private attribute (Python) — won't appear in JSON.stringify, repr(), or error reporters
  • Content controls: Set captureContent: false (JS) or capture_content=False (Python) to omit message content from events, or use maxContentLength / max_content_length to truncate (a sketch follows this list)
  • No runtime dependencies: Both the JavaScript and Python SDKs have zero external runtime dependencies
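
A privacy-focused configuration sketch, using the same constructor shown under TESClient above:

Python

tes = TESClient(
    client_id=os.environ["TES_CLIENT_ID"],
    api_key=os.environ["TES_API_KEY"],
    endpoint=os.environ["TES_ENDPOINT"],
    capture_content=False,  # events keep token usage and tool metadata, omit message text
)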

License

MIT
