Official Python SDK for Latence API

Latence Python SDK

Catch hallucinations, drift, and unused context before your users do.
Groundedness scoring for RAG pipelines and AI coding agents, a one-call path for upgrading data quality (from messy input files to generated markdown and knowledge graphs), and a high-performance open-source retrieval engine.

Charge your RAG pipelines and harnesses based on real data.

Quickstart • Trace • Upgrade Data Quality • Upgrade Retrieval • Trace Reference • Full Tutorial

Quickstart

pip install latence
export LATENCE_API_KEY="lat_..."

from latence import Latence

client = Latence()  # reads LATENCE_API_KEY from the environment

r = client.experimental.trace.rag(
    response_text="Paris is the capital of France.",
    raw_context="France's capital city is Paris.",
)
print(r.score, r.band, r.context_coverage_ratio, r.context_unused_ratio)

That's it. You now know whether the answer was grounded, how much of your retrieved context was actually used, and whether to trust it.

Step 1 — Trace your answers

Three lanes, one mental model. Pick the one that matches what your app is doing right now.

RAG groundedness — did the answer actually come from your context?

from latence import Latence

client = Latence()

r = client.experimental.trace.rag(
    response_text="Paris is the capital of France.",
    raw_context="France's capital city is Paris.",
)

print(r.score)                   # 0.0 - 1.0
print(r.band)                    # "green" | "amber" | "red" | "unknown"
print(r.context_coverage_ratio)  # how much of the answer is grounded in context
print(r.context_unused_ratio)    # how much retrieved context was dead weight
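
One way to act on these fields is a small gate that decides whether an answer is safe to serve. The sketch below is an illustration, not part of the SDK: `should_serve` and its `min_coverage` default of 0.5 are hypothetical, and it relies only on the `band` values and `context_coverage_ratio` shown above.

```python
def should_serve(band: str, context_coverage_ratio: float, min_coverage: float = 0.5) -> bool:
    """Return True when an answer looks grounded enough to show to a user.

    Hypothetical policy: only "green" answers pass, and only when enough of the
    answer is covered by the retrieved context. Tune min_coverage for your app.
    """
    if band != "green":
        return False
    return context_coverage_ratio >= min_coverage


print(should_serve("green", 0.9))  # True
print(should_serve("amber", 0.9))  # False
print(should_serve("green", 0.2))  # False
```

In an application you would pass `r.band` and `r.context_coverage_ratio` from the trace result, and fall back to a refusal or a re-retrieval when the gate returns False.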

Code agents — catch phantom APIs and drift turn-over-turn

Chain turns with the opaque next_session_state handoff. The SDK never forces you to track session internals.

turn1 = client.experimental.trace.code(
    response_text="def add(a, b): return a + b",
    raw_context="# utils.py\ndef sub(a, b): return a - b",
    response_language_hint="python",
)

turn2 = client.experimental.trace.code(
    response_text="def mul(a, b): return a * b",
    raw_context="# utils.py\ndef sub(a, b): return a - b",
    response_language_hint="python",
    session_state=turn1.next_session_state,   # chain turns
)

print(turn2.band)
print(turn2.session_signals.recommendation)   # "continue" | "re_anchor" | "fresh_chat"
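
For sessions longer than two turns, the same handoff can be written as a loop. The helper below is a sketch under stated assumptions: `trace_turns` is a hypothetical name, not an SDK function, and it assumes only what the example above shows, namely that each result exposes `next_session_state` and that the next call accepts it as `session_state`.

```python
from typing import Any, Callable, Iterable, List


def trace_turns(trace_code: Callable[..., Any], turns: Iterable[dict]) -> List[Any]:
    """Call trace_code once per turn, threading the opaque session state through.

    trace_code is any callable with the keyword interface shown above,
    for example client.experimental.trace.code.
    """
    results = []
    state = None  # the first turn has no prior state, so the kwarg is omitted
    for turn in turns:
        kwargs = dict(turn)  # copy so the caller's dict is not mutated
        if state is not None:
            kwargs["session_state"] = state
        result = trace_code(**kwargs)
        state = result.next_session_state  # treat as opaque; never inspect it
        results.append(result)
    return results
```

Each item in `turns` is a dict of keyword arguments such as `response_text`, `raw_context`, and `response_language_hint`; the returned list can then be handed to a rollup call.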

Hosted Trace pricing is $0.008/request by default. For higher-cost quality mode, pass profile="quality" to trace.rag(...) or trace.code(...); quality requests bill at $0.016/request.
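
To budget for a mix of profiles, the two per-request prices can be combined in a few lines. This is a back-of-the-envelope sketch: `estimate_trace_cost` and the `"default"` key are hypothetical names, and the prices are the ones quoted above.

```python
from decimal import Decimal

# Per-request prices quoted above, in USD. The "default" key is only a label
# used in this sketch; the text names the "quality" profile explicitly.
PRICE_PER_REQUEST = {
    "default": Decimal("0.008"),
    "quality": Decimal("0.016"),
}


def estimate_trace_cost(default_requests: int, quality_requests: int = 0) -> Decimal:
    """Estimate hosted Trace spend in USD for a mix of request profiles."""
    if default_requests < 0 or quality_requests < 0:
        raise ValueError("request counts must be non-negative")
    return (
        default_requests * PRICE_PER_REQUEST["default"]
        + quality_requests * PRICE_PER_REQUEST["quality"]
    )


print(estimate_trace_cost(10_000))         # 80.000
print(estimate_trace_cost(10_000, 2_500))  # 120.000
```

`Decimal` is used instead of `float` so that small per-request prices add up without rounding drift.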

Session rollup — one scoreboard for a live session

Stateless, CPU-only, sub-ms on the pod. Safe to call on every keystroke.

rollup = client.experimental.trace.rollup(turns=[turn1, turn2])

print(rollup.noise_pct)              # fraction of turns flagged as noise
print(rollup.retrieval_waste_pct)    # fraction of retrieved context left unused
print(rollup.model_drift_pct)        # fraction of turns with drift
print(rollup.reason_code_histogram)  # why the turns failed, aggregated
print(rollup.risk_band_trail)        # per-turn band, chronological
print(rollup.recommendations)        # actionable session-level advice

What the signals tell you to do next

The numbers above are more than diagnostics; they are routing rules:

Signal	Meaning	Next step
`band` amber/red, low `context_coverage_ratio`	The answer isn't grounded in what you retrieved.	Upgrade data quality — your upstream documents are the bottleneck.
High `context_unused_ratio`, `retrieval_waste_pct > 30%`	You retrieved the wrong chunks.	Upgrade retrieval — your retriever is the bottleneck.
`session_signals.recommendation = "re_anchor"` / `"fresh_chat"` on the code lane	Session drift is compounding.	Reset the agent's context on the next turn.
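
The table above can be read as a small decision function. The sketch below is one possible encoding, not SDK behavior: `next_step`, the 0.5 cut-offs for "low" coverage and "high" unused context, and the order of the checks are all assumptions. It also assumes `retrieval_waste_pct` is on a 0 to 100 scale; rescale the threshold if the API returns a 0 to 1 fraction.

```python
from typing import Optional

# Hypothetical cut-offs: the table says "low" and "high" without giving
# numbers, except for the 30% retrieval-waste threshold.
LOW_COVERAGE = 0.5
HIGH_UNUSED = 0.5
WASTE_THRESHOLD_PCT = 30.0


def next_step(
    band: str,
    context_coverage_ratio: float,
    context_unused_ratio: float,
    retrieval_waste_pct: Optional[float] = None,
    recommendation: Optional[str] = None,
) -> str:
    """Map trace signals to one of the next steps listed in the table."""
    # Session drift on the code lane: reset the agent's context first.
    if recommendation in ("re_anchor", "fresh_chat"):
        return "reset_context"
    # The answer is not grounded in what was retrieved.
    if band in ("amber", "red") and context_coverage_ratio < LOW_COVERAGE:
        return "upgrade_data_quality"
    # The wrong chunks were retrieved (either signal is treated as sufficient).
    wasteful = retrieval_waste_pct is not None and retrieval_waste_pct > WASTE_THRESHOLD_PCT
    if context_unused_ratio > HIGH_UNUSED or wasteful:
        return "upgrade_retrieval"
    return "no_action"


print(next_step("red", 0.2, 0.1))                              # upgrade_data_quality
print(next_step("green", 0.9, 0.8))                            # upgrade_retrieval
print(next_step("green", 0.9, 0.1, recommendation="re_anchor"))  # reset_context
```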

Full reference: Trace docs and SDK tutorial §18.

Async

Every method above has an awaitable twin under AsyncLatence:

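import asyncio

from latence import AsyncLatence

async def main() -> None:
    # `async with` is only valid inside a coroutine, so wrap the call in main().
    async with AsyncLatence() as client:
        r = await client.experimental.trace.rag(
            response_text="Paris is the capital of France.",
            raw_context="France's capital city is Paris.",
        )
        print(r.score, r.band)

asyncio.run(main())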

Step 2 — Upgrade data quality

Trace is showing low coverage or amber/red bands? The model is rarely the problem. It's usually the upstream data: un-OCR'd PDFs, missing entities, unresolved references. The Latence Data Intelligence Pipeline cleans that in one call.

job = client.pipeline.run(files=["contract.pdf"])
pkg = job.wait_for_completion()

print(pkg.document.markdown)                         # clean markdown
print(pkg.entities.summary)                          # {"total": 142, "by_type": {...}}
print(pkg.knowledge_graph.summary.total_relations)   # 87
pkg.download_archive("./results.zip")

Smart defaults: OCR → entity extraction → relation extraction. Configure any step explicitly:

job = client.pipeline.run(
    files=["contract.pdf"],
    steps={
        "ocr": {"mode": "performance"},
        "redaction": {"mode": "balanced", "redact": True},
        "extraction": {"label_mode": "hybrid", "threshold": 0.3},
        "relation_extraction": {"resolve_entities": True},
    },
)

Every run returns a structured DataPackage:

pkg.document — markdown + per-page layout (OCR)
pkg.entities — entity list + summary (extraction)
pkg.knowledge_graph — entities + relations + graph summary (relation extraction)
pkg.redaction — cleaned text + PII list (redaction)
pkg.compression — compressed text + ratio (compression)
pkg.quality — per-stage confidence, latency, cost

Power users: the typed PipelineBuilder accepts YAML and validates client-side. See docs/pipelines.md for the full orchestration reference (DAG execution, resumable jobs, progress callbacks).

Corpus-level: Dataset Intelligence

Feed pipeline outputs into client.experimental.dataset_intelligence_service to build corpus-wide knowledge graphs, ontologies, and enriched feature spaces with incremental ingestion:

Tier	Method	What it does
1	`di.enrich()`	Semantic feature vectors (CPU-only, fast)
2	`di.build_graph()`	Entity resolution, knowledge graph, link prediction
3	`di.build_ontology()`	Concept clustering, hierarchy induction
Full	`di.run()`	All three tiers sequentially

See docs/dataset_intelligence.md.

Step 3 — Upgrade retrieval

If Trace keeps flagging a high context_unused_ratio, or the session rollup shows retrieval_waste_pct > 30%, your model isn't the problem — your retrieval engine is shipping the wrong chunks.

ColSearch: high-performance late-interaction and multimodal search engine

ColSearch is our late-interaction retrieval engine: token-level ColBERT recall, native multimodal search over PDFs and images, and a drop-in replacement for the retrieval step in your RAG stack. Wire it in and context_unused_ratio collapses.

Error handling

from latence import (
    LatenceError, AuthenticationError, InsufficientCreditsError,
    RateLimitError, JobError, JobTimeoutError, TransportError,
)

try:
    r = client.experimental.trace.rag(
        response_text="Paris is the capital of France.",
        raw_context="France's capital city is Paris.",
    )
except AuthenticationError:
    ...  # 401
except InsufficientCreditsError:
    ...  # 402
except RateLimitError as e:
    ...  # 429, retry after e.retry_after
except JobError as e:
    ...  # pipeline job failed; check e.is_resumable
except TransportError:
    ...  # network / DNS

The SDK retries on 429 and 5xx with exponential backoff (default 2 retries, respects Retry-After).
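
To make the retry policy concrete, here is a generic sketch of exponential backoff that honors a Retry-After hint. It illustrates the technique only and is not the SDK's implementation: `should_retry`, `backoff_delay`, and the `base` and `cap` values are hypothetical; only the retried status codes and the default of 2 retries come from the sentence above.

```python
import random
from typing import Optional


def should_retry(status_code: int, attempt: int, max_retries: int = 2) -> bool:
    """Retry on 429 and 5xx until max_retries attempts have been used."""
    retryable = status_code == 429 or 500 <= status_code < 600
    return retryable and attempt < max_retries


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 8.0,
    jitter: bool = False,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        return max(0.0, retry_after)  # a server-provided hint takes precedence
    delay = min(cap, base * (2 ** attempt))  # 0.5, 1.0, 2.0, ... up to cap
    if jitter:
        delay = random.uniform(0.0, delay)  # spread out concurrent clients
    return delay


print(backoff_delay(0))                   # 0.5
print(backoff_delay(1))                   # 1.0
print(backoff_delay(1, retry_after=3.0))  # 3.0
print(should_retry(429, attempt=0))       # True
print(should_retry(503, attempt=2))       # False
```

If you wrap SDK calls in your own retry loop, keep `max_retries` on the client in mind so the two layers do not multiply.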

Configuration

export LATENCE_API_KEY="lat_your_key"

from latence import Latence
import latence

client = Latence(
    api_key="lat_...",       # or LATENCE_API_KEY env var
    base_url="https://...",  # or LATENCE_BASE_URL env var
    timeout=60.0,            # request timeout (default: 60s)
    max_retries=2,           # retry attempts (default: 2)
)

latence.setup_logging("DEBUG")  # logs every HTTP request/response
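
The comments above say each setting can come from a constructor argument or an environment variable. A common way to combine the two is sketched below; `resolve_setting` is a hypothetical helper, and the precedence it shows (explicit argument first, then environment) is an assumption rather than documented SDK behavior.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit value if given, else the environment variable, else None."""
    env = os.environ if env is None else env  # injectable for tests
    if explicit is not None:
        return explicit
    return env.get(env_name)


fake_env = {"LATENCE_API_KEY": "lat_from_env"}
print(resolve_setting("lat_explicit", "LATENCE_API_KEY", env=fake_env))  # lat_explicit
print(resolve_setting(None, "LATENCE_API_KEY", env=fake_env))            # lat_from_env
print(resolve_setting(None, "LATENCE_BASE_URL", env=fake_env))           # None
```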

Resources


Trace reference	docs/trace.md — parameters and full response schema
Full tutorial	SDK_TUTORIAL.md — every service, every parameter
API docs	docs.latence.ai
Portal	app.latence.ai

MIT License • latence.ai

Release history

0.2.0: May 9, 2026
0.1.6: May 7, 2026
0.1.5: May 7, 2026
0.1.4: May 5, 2026
0.1.3: Apr 28, 2026 (this version)
0.1.2: Apr 24, 2026
0.1.1: Mar 26, 2026
0.1.0: Mar 26, 2026

Download files

Source Distribution

latence-0.1.3.tar.gz (211.1 kB), uploaded Apr 28, 2026, Source

Built Distribution

latence-0.1.3-py3-none-any.whl (119.4 kB), uploaded Apr 28, 2026, Python 3

File details

Details for the file latence-0.1.3.tar.gz.

File metadata

Download URL: latence-0.1.3.tar.gz
Upload date: Apr 28, 2026
Size: 211.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for latence-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`d259838cd9c57917c6568ce0c757337c7c17f06ad4114f10e7029e7cc2b1a97b`
MD5	`a3e0a456ec35ac393bd1c766e531173a`
BLAKE2b-256	`852fce6c657ec148de09b385852a500950b237cd57c67bc2d58c4b993118dd99`

Provenance

The following attestation bundles were made for latence-0.1.3.tar.gz:

Publisher: publish.yml on latenceainew/latence-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: latence-0.1.3.tar.gz
- Subject digest: d259838cd9c57917c6568ce0c757337c7c17f06ad4114f10e7029e7cc2b1a97b
- Sigstore transparency entry: 1396370642
- Sigstore integration time: Apr 28, 2026
Source repository:
- Permalink: latenceainew/latence-python@86765d05a05791b748bd796807b03e7517359b99
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/latenceainew
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@86765d05a05791b748bd796807b03e7517359b99
- Trigger Event: release

File details

Details for the file latence-0.1.3-py3-none-any.whl.

File metadata

Download URL: latence-0.1.3-py3-none-any.whl
Upload date: Apr 28, 2026
Size: 119.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for latence-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c72f607b4cd837b94abc7becb3201d047e8bc6e254a1d7c0f14954be6374b1ee`
MD5	`18888f0af61d29fb4af55e240277d262`
BLAKE2b-256	`4c6671449efec30ed9401f0f6b988ecf5ba9188ff2d1bdcc16cf270ea99a39e0`

Provenance

The following attestation bundles were made for latence-0.1.3-py3-none-any.whl:

Publisher: publish.yml on latenceainew/latence-python

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: latence-0.1.3-py3-none-any.whl
- Subject digest: c72f607b4cd837b94abc7becb3201d047e8bc6e254a1d7c0f14954be6374b1ee
- Sigstore transparency entry: 1396370650
- Sigstore integration time: Apr 28, 2026
Source repository:
- Permalink: latenceainew/latence-python@86765d05a05791b748bd796807b03e7517359b99
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/latenceainew
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@86765d05a05791b748bd796807b03e7517359b99
- Trigger Event: release
