letta-ejentum

Letta tools for the Ejentum Reasoning Harness. Four agent-callable functions (harness_reasoning, harness_code, harness_anti_deception, harness_memory) that are uploaded to a Letta server via client.tools.upsert_from_function, plus a register_ejentum_tools(client) one-liner that registers all four in a single call.

Each operation in the Ejentum library (679 of them, organized across four harnesses) is engineered in two layers:

  • a natural-language procedure the model can read, naming the steps to take and the failure pattern to refuse, and
  • an executable reasoning topology: a graph-shaped plan over those steps. The plan names explicit decision points where the model branches, parallel branches that run and rejoin, bounded loops that run until convergence, named meta-cognitive moments where the model is asked to stop, look at its own working, and re-enter at a specific step, plus escape paths for when the prescribed plan stops fitting the task at hand.

The natural-language layer tells the model what to do. The topology layer pins down how those steps connect: where to decide, where to loop, where to stop and look at itself. Together they act as a persistent attention anchor that survives long context windows and multi-turn execution chains, which is precisely where a model's own reasoning template typically decays.

Letta is a particularly natural host for the harness because Letta agents are stateful by design (core memory, archival memory, recall memory). The harness_memory tool is meant for exactly this kind of long-running stateful context: sharpening an observation the agent has already formed about cross-turn drift.

Installation

pip install letta-ejentum

Configuration

Unlike most integration shims, the harness functions execute on the Letta server, not in the caller's process. EJENTUM_API_KEY must therefore be set in the Letta deployment's environment, not in your local shell. See the Letta docs on tool-env configuration for your deployment (self-hosted, Letta Cloud, etc.).

Get an Ejentum API key at https://ejentum.com/pricing (free and paid tiers).

Usage

One-liner (recommended)

import os
from letta_client import Letta
from letta_ejentum import register_ejentum_tools

client = Letta(api_key=os.environ["LETTA_API_KEY"])

tools = register_ejentum_tools(client)
tool_ids = [t.id for t in tools]

agent = client.agents.create(
    model="anthropic/claude-sonnet-4-6",
    embedding="openai/text-embedding-3-small",
    tool_ids=tool_ids,
)

response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[
        {"role": "user", "content":
            "We've spent three months on the GraphQL gateway. "
            "Should we keep going or pivot to REST?"},
    ],
)

Register one tool at a time

from letta_client import Letta
from letta_ejentum import harness_anti_deception

client = Letta(api_key="...")

tool = client.tools.upsert_from_function(func=harness_anti_deception)
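
To register a chosen subset rather than all four, the single-tool call can be looped. The following is a minimal sketch that assumes only the client.tools.upsert_from_function(func=...) call shown above; register_subset is a hypothetical helper, not part of letta-ejentum.

```python
from typing import Any, Callable, Iterable, List


def register_subset(client: Any, functions: Iterable[Callable[..., str]]) -> List[Any]:
    """Upload each function in `functions` and return the resulting tool objects.

    Hypothetical helper: it relies only on
    client.tools.upsert_from_function(func=...), as used above.
    """
    tools = []
    for func in functions:
        # One upload per function; order of the result matches the input order.
        tools.append(client.tools.upsert_from_function(func=func))
    return tools
```

For example, register_subset(client, [harness_code, harness_memory]) would upload only those two and leave the others unregistered.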

Human-in-the-loop staging

# Every harness call will require manual approval before execution
tools = register_ejentum_tools(client, default_requires_approval=True)

The four tools

| Function | Best for | Library size |
| --- | --- | --- |
| harness_reasoning | Analytical, diagnostic, planning, multi-step tasks spanning abstraction, time, causality, simulation, spatial, and metacognition | 311 operations |
| harness_code | Code generation, refactoring, review, and debugging across the software-engineering layer | 128 operations |
| harness_anti_deception | Prompts that pressure the agent to validate, certify, or soften an honest assessment | 139 operations |
| harness_memory | Sharpening an observation already formed about cross-turn drift. Filter-oriented, not write-oriented. Format query as "I noticed X. This might mean Y. Sharpen: Z." | 101 operations |
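
The harness_memory query shape can be assembled with a small helper. This is an illustrative sketch only: format_memory_query is a hypothetical name, not part of the package, and it simply fills the "I noticed X. This might mean Y. Sharpen: Z." template described above.

```python
def format_memory_query(observation: str, hypothesis: str, focus: str) -> str:
    """Build a query in the "I noticed X. This might mean Y. Sharpen: Z." shape."""

    def clean(text: str) -> str:
        # Trim whitespace and trailing periods so the template adds exactly one.
        return text.strip().rstrip(".")

    return (
        f"I noticed {clean(observation)}. "
        f"This might mean {clean(hypothesis)}. "
        f"Sharpen: {clean(focus)}."
    )
```

The resulting string is what an agent (or a test harness) would pass as the query argument to harness_memory.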

What an injection looks like

A real reasoning-mode response to the query "investigate why our nightly ETL job has started failing intermittently over the past two weeks; nothing in the code or schema has changed":

[NEGATIVE GATE]
The server's response time was accepted as average, despite a suspicious
rhythm break in its timing pattern.

[PROCEDURE]
Step 1: Establish baseline timing profiles by extracting historical
durations and intervals for each event type. Step 2: Compare each observed
timing against its baseline and compute deviation magnitude. Step 3:
Classify anomalies as too fast, too slow, too early, or too late, and rank
by severity. ... Step 5: If deviation exceeds two standard deviations,
probe root cause by tracing upstream dependencies. ...

[REASONING TOPOLOGY]
S1:durations -> FIXED_POINT[baselines] -> N{dismiss_timing_deviations_
without_investigation} -> for_each: S2:compare -> S3:deviation ->
G1{>2sigma?} --yes-> S4:classify -> S5:probe_cause -> FLAG -> continue --no->
S6:validate -> continue -> all_checked -> OUT:anomaly_report

[TARGET PATTERN]
Establish timing baselines by extracting historical response intervals.
Compare current server response time to this baseline. ...

[FALSIFICATION TEST]
If no event timing is flagged as suspiciously fast or slow relative to
baseline, temporal anomaly detection was not active.

Amplify: timing baseline comparison; anomaly classification; security
context elevation
Suppress: average timing acceptance; outlier normalization

The agent reads both the natural-language [PROCEDURE] and the graph-logic [REASONING TOPOLOGY] before generating its user-facing answer. The bracketed labels are instructions to the agent, not content to display.
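
For logging or debugging, it can help to split an injection into its labeled parts. The sketch below assumes the layout shown in the sample above (a bracketed, upper-case label alone on a line, followed by that section's text); split_injection is a hypothetical helper, and the harness does not require any client-side parsing.

```python
import re
from typing import Dict, List

# A label line looks like "[PROCEDURE]" or "[REASONING TOPOLOGY]".
_LABEL = re.compile(r"^\[([A-Z][A-Z _-]*)\]\s*$")


def split_injection(text: str) -> Dict[str, str]:
    """Group the lines of an injection under their bracketed labels."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = _LABEL.match(line.strip())
        if match:
            # Start a new section named after the label.
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        # Lines before the first label are ignored.
    return {label: "\n".join(lines).strip() for label, lines in sections.items()}
```

Because the trailing Amplify and Suppress lines carry no bracketed label, this sketch leaves them attached to the last labeled section.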

Why the unusual design

Letta's tool model is fundamentally different from BaseTool subclasses (LangChain, agno, smolagents) or factory toolsets (PydanticAI, CrewAI). Tools are plain Python functions whose source is serialized and executed in Letta's sandboxed runtime. That forces:

  • Imports inside the function body, not at module top. Letta's serializer captures what the function needs at execution time.
  • No constructor, no instance state. Configuration lives in the Letta server's environment (EJENTUM_API_KEY).
  • Google-style docstrings, which Letta parses into the OpenAI tool schema.

This shim respects all three constraints. The four functions are intentionally verbose (some imports and the API URL repeated four times) because each one must stand alone for Letta's serializer.
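
The three constraints can be seen together in one function. What follows is a sketch of the pattern, not the package's source: example_harness_tool, the example.invalid URL, the request payload, and the Bearer header are placeholders chosen for illustration, and it uses the standard library's urllib rather than requests so that it runs without extra dependencies.

```python
def example_harness_tool(query: str) -> str:
    """Fetch a reasoning scaffold for the given task description.

    Args:
        query (str): Plain-language description of the task.

    Returns:
        str: The scaffold text, or a human-readable error message.
    """
    # Imports live inside the body so the function still works after its
    # source is serialized and executed elsewhere.
    import json
    import os
    import urllib.request

    # No constructor or instance state: configuration comes from the environment.
    api_key = os.environ.get("EJENTUM_API_KEY")
    if not api_key:
        return "Error: EJENTUM_API_KEY is not set in the server environment."

    # Placeholder endpoint and payload: NOT the real Ejentum API.
    url = "https://example.invalid/harness"
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.read().decode("utf-8")
    except Exception as exc:
        # Return the failure as text so no exception crosses the tool boundary.
        return f"Error: harness call failed ({exc})."
```

The Google-style docstring (Args and Returns sections with typed entries) is the part Letta would read to build the tool schema; everything else the function needs is resolved inside its own body.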

API reference

from letta_ejentum import (
    harness_reasoning,
    harness_code,
    harness_anti_deception,
    harness_memory,
    HARNESS_FUNCTIONS,           # tuple of all four
    register_ejentum_tools,      # uploads all four to a Letta server
)

register_ejentum_tools(
    client,                                # letta_client.Letta instance
    default_requires_approval: bool = False,
) -> list[letta_client.types.Tool]

Each function returns a string. Errors are returned as human-readable strings (no exceptions cross the function boundary, so an agent step never crashes the run).

MCP alternative. This package uses Letta's tool-upload mechanism. Letta also has an MCP client that can consume the hosted Ejentum MCP endpoint at https://api.ejentum.com/mcp with Bearer auth. The PyPI package skips that wiring and keeps tool-attach down to one line.

Compatibility

  • Python 3.10+
  • letta-client>=0.1.0
  • requests>=2.31.0 (only used as a soft dep for local testing; the actual requests call happens inside the function on the Letta server, which provides its own runtime)

License

MIT

Measured effects

The Ejentum harness is benchmarked publicly under CC BY 4.0 at github.com/ejentum/benchmarks:

  • ELEPHANT sycophancy: 5.8% composite on GPT-4o (40 real Reddit scenarios)
  • LiveCodeBench Hard: 85.7% to 100% on Claude Opus (28 competitive programming tasks)
  • Memory retention: 50% fewer stale facts served (20-turn implicit state changes)
  • Plus per-harness numbers across BBH/CausalBench/MuSR, ARC-AGI-3, SciCode, and perception tasks

Methodology, scenarios, run scripts, and raw outputs are all in-repo.
