Token-lean agent↔data query interface — curated DuckDB primitives, not raw SQL passthrough.

These details have not been verified by PyPI

Project links

Project description

lacon

token-lean agent↔data query interface — query files, don't read them

Before/After • Install • What You Get • Primitives • HITL • Python API • CLI Reference

When an agent needs to understand a CSV or Parquet file, the naive approach is to dump the whole file into context. A 5 MB CSV is ~350k tokens. lacon is the opposite: agents query the file and get back only the answer — shaped, capped, and token-costed. DuckDB does the querying under the hood; lacon's value is the interface and the result shaping.

Sibling to datoon (cheaper representation once data is in the prompt) — lacon's lever is to not put the data there at all.

Before / After

Without lacon — dumps file into context

# agent reads the whole file
content = open("sales.csv").read()
# → 350k tokens for a 5 MB file
prompt = f"Analyze this data:\n{content}"

With lacon — gets only the answer

lacon describe sales.csv --pretty

{
  "op": "describe",
  "row_count": 84000,
  "schema": [{"name": "country", "type": "VARCHAR"}, ...],
  "~tokens": 89
}

89 tokens instead of 350k.

Agent wants revenue by country

# without lacon — dumps rows, agent parses
rows = csv.DictReader(open("sales.csv"))
prompt = f"Sum revenue by country from:\n{list(rows)}"
# → thousands of tokens

With lacon — one aggregate call

lacon aggregate sales.csv \
  --group-by country \
  --metrics revenue:sum --pretty

{"op": "aggregate", "rows": [["UA", 4500.25], ...], "~tokens": 62}

Same answer. Tiny fraction of the token cost.

┌─────────────────────────────────────────────────────┐
│  DESCRIBE A 5MB FILE    ████░░░░░░░░   89 tokens    │
│  SAME FILE DUMPED RAW   ████████████  350k tokens   │
│  WRITES ALLOWED         ░░░░░░░░░░░░   0 (read-only)│
└─────────────────────────────────────────────────────┘

[!IMPORTANT] lacon saves tokens by not sending data to the model — only the answer. Savings depend on query type: describe is always cheap; filter/sample returns rows (but capped). Every result includes ~tokens so agents know what they spent.

Install

# core (CLI + DuckDB + SQL validation)
pip install lacon
uv add lacon

# with token estimates (~tokens in every response)
pip install "lacon[tokens]"

# with MCP server (coming in v0.2)
pip install "lacon[mcp]"

Requires Python 3.12+.

Claude Code skill:

claude skill install https://github.com/andrii-su/lacon/releases/latest/download/lacon.skill

Once installed, Claude automatically uses lacon when you mention a .csv, .parquet, or .json file — no manual invocation needed.

What You Get

	What
`lacon` CLI	9 curated primitives for querying data files from terminal or scripts
Python API	`describe()`, `aggregate()`, `filter()`, `query()` — same primitives programmatically
Claude Code Skill	`/lacon` in-session trigger, progressive disclosure workflow, HITL for `query`
~tokens	Every response includes its estimated token cost (requires `lacon[tokens]`)
MCP Server	Coming in v0.2 — expose primitives as MCP tools for Claude Desktop, Cursor, Windsurf

Primitives

9 curated operations. Each builds safe SQL internally, enforces read-only, and applies auto-LIMIT.

Primitive	Use when the agent asks...	Returns
`describe`	"what's in this file?", "what columns?", "how many rows?"	schema + metadata, no data rows
`sample`	"show me some rows", "what does the data look like?"	first/random N rows
`count`	"how many rows?", "how many where X?"	integer
`profile`	"tell me about column X", "any nulls?", "distribution?"	per-column stats
`distinct`	"what are the unique values of X?"	values list + truncation info
`aggregate`	"total/avg X by Y", "group by Z"	grouped rows
`filter`	"show rows where X", "find records matching Y"	matching rows, projected columns
`find-duplicates`	"are there duplicates?", "find duplicate X"	duplicate groups + counts
`query`	anything the primitives above can't express	rows — HITL required

Supported formats: CSV, Parquet, JSON, JSONL — auto-detected from extension.

Quick Start

# always start here
lacon describe data.csv --pretty

# understand a column
lacon profile data.csv --column revenue --pretty

# answer a count question
lacon count data.csv --where "country = 'UA'"

# group by
lacon aggregate data.csv --group-by country --metrics revenue:sum --pretty

# filter with column projection
lacon filter data.csv --where "year = 2024" --columns name revenue --pretty

# unique values
lacon distinct data.csv --column country --pretty

# find duplicates
lacon find-duplicates data.csv --columns email --pretty

# escape hatch — use HITL (see below)
lacon query data.csv "SELECT country, AVG(revenue) FROM {file} GROUP BY country" --show-sql --pretty
lacon query data.csv "SELECT country, AVG(revenue) FROM {file} GROUP BY country" --pretty

Use {file} as the placeholder for the data source in query. Auto-LIMIT is always enforced (default 50, max 1000).

HITL for `query`

The query primitive runs arbitrary SQL — an escape hatch when curated primitives aren't enough. Because the SQL is not predictable from fixed parameters, always preview before executing.

# Step 1 — preview resolved SQL (no execution)
lacon query sales.csv \
  "SELECT country, name, revenue FROM {file} t \
   WHERE revenue = (SELECT MAX(revenue) FROM {file} t2 WHERE t2.country = t.country) \
   ORDER BY revenue DESC" \
  --show-sql --pretty

{
  "op": "query",
  "sql": "SELECT country, name, revenue FROM read_csv('sales.csv') t WHERE ...",
  "will_execute": false
}

# Step 2 — confirm the SQL, then execute
lacon query sales.csv "..." --pretty

Result always includes the sql field so you can verify what ran:

{
  "op": "query",
  "schema": ["country", "name", "revenue"],
  "rows": [["UA", "Bob", 3400.0], ...],
  "shown": 3,
  "~tokens": 74,
  "sql": "SELECT country, name, revenue FROM read_csv('sales.csv') t WHERE ..."
}

Why: text-to-SQL agents achieve ~80% accuracy on real schemas. A subtly wrong WHERE clause executes silently and looks correct. Showing the SQL before execution catches errors before they produce wrong answers. Curated primitives need no confirmation — their SQL is fully determined by parameters.

Output Envelope

Every response is a shaped JSON object:

{
  "op": "aggregate",
  "schema": ["country", "sum_revenue"],
  "rows": [["UA", 4500.25], ["DE", 2200.0], ["US", 2001.25]],
  "shown": 3,
  "~tokens": 62
}

Field	Description
`op`	Which primitive ran
`schema`	Column names — agent never guesses shape
`rows`	Data as arrays — compact, no key repetition
`shown`	How many rows returned (honest truncation)
`~tokens`	Estimated token cost of this response (`lacon[tokens]`)
`sql`	Resolved SQL that ran (`query` only)

describe returns metadata only (no rows). count returns a single integer.

Python API

from lacon import DuckDBEngine, describe, sample, count, aggregate, filter, query

with DuckDBEngine() as engine:
    # schema + metadata
    r = describe("data.csv", engine)
    print(r["row_count"], r["schema"])

    # column stats
    r = profile("data.csv", column="revenue", engine=engine)
    print(r["min"], r["max"], r["mean"])

    # grouped aggregation
    r = aggregate(
        "data.csv",
        group_by=["country"],
        metrics=[{"col": "revenue", "fn": "sum"}],
        engine=engine,
    )

    # filtered rows with projection
    r = filter("data.csv", where="revenue > 2000", columns=["name", "revenue"], engine=engine)

    # escape hatch — read-only, auto-LIMIT
    r = query("data.csv", "SELECT country, AVG(revenue) FROM {file} GROUP BY country", engine=engine)

Safety

Read-only — no writes, no DDL, no COPY, no INSTALL
SQL validation — sqlglot parses every query call; rejects INSERT, DROP, CREATE, multi-statement
Auto-LIMIT — injected if missing, capped at 1000
Path escaping — single quotes in paths escaped before DuckDB

CLI Reference

Global options

Flag	Description
`--pretty`	Pretty-print JSON output
`--version`	Print version and exit

`describe <path>`

Schema, row count, file size. Always start here.

`sample <path>`

Flag	Default	Description
`--n N`	5	Number of rows
`--random`	false	Random sample instead of first N

`count <path>`

Flag	Description
`--where EXPR`	Optional WHERE filter

`profile <path>`

Flag	Description
`--column COL`	Column to profile (required)
`--top-k N`	Top-k values for categorical columns (default 10)

`aggregate <path>`

Flag	Description
`--group-by COL [COL ...]`	Group-by columns
`--metrics COL:FN [...]`	Metrics — fn ∈ `sum avg min max count`
`--where EXPR`	Optional WHERE filter
`--limit N`	Row cap (default 50)

`filter <path>`

Flag	Description
`--where EXPR`	WHERE filter (required)
`--columns COL [...]`	Column projection
`--limit N`	Row cap (default 50)

`distinct <path>`

Flag	Description
`--column COL`	Column to enumerate (required)
`--limit N`	Value cap (default 50)

`find-duplicates <path>`

Flag	Description
`--columns COL [...]`	Key columns (required)
`--limit N`	Group cap (default 50)

`query <path> SQL`

Flag	Description
`--limit N`	Row cap (default 50, max 1000)
`--show-sql`	Preview resolved SQL without executing (HITL)

Use {file} as the data-source placeholder in SQL.

Development

# setup
uv sync --extra dev
uvx pre-commit install

# tests (46 — primitives, safety, shaping)
uv run --with ".[dev,tokens]" pytest

# build lacon.skill for distribution
python scripts/build_skill.py

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

1.1.1

Jun 22, 2026

1.0.2

Jun 22, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lacon-1.1.1.tar.gz (103.1 kB view details)

Uploaded Jun 22, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

lacon-1.1.1-py3-none-any.whl (15.9 kB view details)

Uploaded Jun 22, 2026 Python 3

File details

Details for the file lacon-1.1.1.tar.gz.

File metadata

Download URL: lacon-1.1.1.tar.gz
Upload date: Jun 22, 2026
Size: 103.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lacon-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`e3cbad79d1a41fcf99a7070134a517056f1c03cc652031dcab93b43102e164cc`
MD5	`d4b8eccb5f42c7955c6dc147f4e10ad3`
BLAKE2b-256	`4d1efb5087e5408210ca5087cf90de8134731799127e732ea06133101586fac6`

See more details on using hashes here.

File details

Details for the file lacon-1.1.1-py3-none-any.whl.

File metadata

Download URL: lacon-1.1.1-py3-none-any.whl
Upload date: Jun 22, 2026
Size: 15.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lacon-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`56dc39ab05ab6f2beeb9cee0b14dcffcc9a76a81e4494e835920794e2bf09d00`
MD5	`979ede50ff071d3b322da85d80f5e44a`
BLAKE2b-256	`005320770e5ccc4747f528efbf6894a8925c70b354e3844e5edb16f383d65536`

See more details on using hashes here.

lacon 1.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

lacon

Before / After

Without lacon — dumps file into context

With lacon — gets only the answer

Agent wants revenue by country

With lacon — one aggregate call

Install

What You Get

Primitives

Quick Start

HITL for query

Output Envelope

Python API

Safety

CLI Reference

Global options

describe <path>

sample <path>

count <path>

profile <path>

aggregate <path>

filter <path>

distinct <path>

find-duplicates <path>

query <path> SQL

Development

Links

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

HITL for `query`

`describe <path>`

`sample <path>`

`count <path>`

`profile <path>`

`aggregate <path>`

`filter <path>`

`distinct <path>`

`find-duplicates <path>`

`query <path> SQL`