
LARS: Declarative agent framework with first class SQL integration


LARS - AI That Speaks SQL

Python 3.10+ | License: O'SASSY | Docs at larsql.com

Your team knows SQL. Why learn Python for AI?

Add AI operators directly to your SQL queries — from your existing SQL client, on your existing databases.

SELECT * FROM support_tickets
WHERE description MEANS 'urgent customer issue';

That's it. No notebooks. No orchestration code. No vector database to provision. Just SQL with semantic understanding (and a declarative workflow engine underneath). Express intent, not patterns — especially when you don't know what you're looking for.

  • Use your existing SQL client: PostgreSQL wire protocol (lars serve sql)
  • Your data never moves: DuckDB federation across Postgres/MySQL/BigQuery/Snowflake/S3/files
  • Cached + cost-attributed: query LLM calls via all_data / sql_query_log
  • Optional Studio UI: inspect runs, takes, costs, and "what the model saw" (not required)

One line. That's all it takes.

Before: Regex, LIKE patterns, and brittle keyword matching.

SELECT * FROM tickets
WHERE description LIKE '%urgent%'
   OR description LIKE '%critical%'
   OR description LIKE '%asap%'
   -- still misses "need this fixed immediately"

After: One line that understands meaning.

SELECT * FROM tickets
WHERE description MEANS 'urgent customer issue'

What Can You Do?

-- Filter by meaning, not keywords
SELECT * FROM products
WHERE description MEANS 'eco-friendly'

-- Score relevance (0.0 to 1.0)
SELECT title, description ABOUT 'sustainability' AS relevance
FROM reports
ORDER BY relevance DESC

-- Semantic deduplication
SELECT SEMANTIC DISTINCT company_name FROM leads

-- Find contradictions (compliance, fact-checking)
SELECT * FROM disclosures
WHERE statement CONTRADICTS 'no material changes'

-- Summarize groups
SELECT category, SUMMARIZE(reviews) AS summary
FROM feedback
GROUP BY category

-- Group by auto-discovered topics
SELECT TOPICS(title, 5) AS topic, COUNT(*) AS count
FROM articles
GROUP BY topic

-- Vector similarity search
SELECT * FROM docs
WHERE title SIMILAR_TO 'quarterly earnings report'
LIMIT 10

-- Ask arbitrary questions
SELECT
  product_name,
  ASK('Is this suitable for children? yes/no', description) AS kid_friendly
FROM products

50+ built-in operators for filtering, logic, transformation, aggregation, data quality, parsing, and more.

Quick Start

# Install
pip install larsql

# Set your LLM API key (OpenRouter, or see docs for others)
export OPENROUTER_API_KEY=sk-or-v1-...

# Start the SQL server (PostgreSQL wire protocol)
lars serve sql --port 15432

# Connect with any SQL client
psql postgresql://localhost:15432/default

That's it. Run semantic queries from DBeaver, DataGrip, psql, Tableau, or any PostgreSQL client. For a full end-to-end setup (ClickHouse + sample data + Studio UI), see the Quickstart Guide.

Screenshot: DataGrip connected over pgwire

DataGrip querying LARS via PostgreSQL wire protocol

How It Works

LARS uses query rewriting - your semantic SQL is transformed into standard SQL with UDF calls that execute LLM operations. Your database stays untouched.

WHERE description MEANS 'urgent'
         ↓
WHERE semantic_matches('urgent', description)
         ↓
UDF runs LLM → returns true/false
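
The rewrite step can be sketched in miniature. This is only an illustration of the idea: the real rewriter is a full SQL transformation, not a regex, and only the `semantic_matches` UDF name comes from the diagram above.

```python
import re

def rewrite_semantic_sql(sql: str) -> str:
    """Toy sketch: turn `col MEANS 'text'` into a UDF call.

    Illustrative only; LARS's actual rewriter handles the full
    semantic-SQL grammar, not just this one operator."""
    pattern = re.compile(r"(\w+)\s+MEANS\s+'([^']*)'", re.IGNORECASE)
    # \1 = column name, \2 = the semantic predicate text
    return pattern.sub(r"semantic_matches('\2', \1)", sql)

query = "SELECT * FROM tickets WHERE description MEANS 'urgent customer issue'"
print(rewrite_semantic_sql(query))
# SELECT * FROM tickets WHERE semantic_matches('urgent customer issue', description)
```

The key property: the database only ever sees standard SQL plus a registered UDF, so nothing about your schema or data changes.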

Results are cached - same query on same data costs zero after the first run.
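
The caching behavior amounts to content-keyed memoization of LLM calls. A minimal sketch of the concept (not LARS's actual cache implementation):

```python
import hashlib

class CachedLLM:
    """Sketch: memoize LLM results keyed on (prompt, row value).

    Illustrative only; shows why re-running the same query on the
    same data costs nothing after the first run."""
    def __init__(self, llm_fn):
        self.llm_fn = llm_fn
        self.cache = {}
        self.calls = 0  # number of real (billable) LLM calls

    def __call__(self, prompt, value):
        key = hashlib.sha256(f"{prompt}\x00{value}".encode()).hexdigest()
        if key not in self.cache:
            self.calls += 1  # only a cache miss costs money
            self.cache[key] = self.llm_fn(prompt, value)
        return self.cache[key]

llm = CachedLLM(lambda p, v: "true")  # stand-in for a real model call
llm("urgent?", "server is down")      # first run: one LLM call
llm("urgent?", "server is down")      # re-run: served from cache
print(llm.calls)  # 1
```

If the underlying row value changes, the key changes, so edits to your data naturally invalidate the relevant cache entries.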

Every semantic UDF call is also logged (model, tokens, cost, duration) into queryable "magic tables":

SELECT session_id, cell_name, model, cost, duration_ms
FROM all_data
WHERE is_sql_udf = true
ORDER BY timestamp DESC
LIMIT 20;
Screenshot: costs + context (optional Studio UI)

Studio showing context inspector, query logs, takes, and cost analytics

Every semantic operator is backed by a cascade file under cascades/semantic_sql/ - edit YAML to change behavior or create your own operator. If you want a visual view of the same execution data, Studio is a UI over these logs (optional).

Wait, it gets weirder.

Semantic SQL is just the beginning. Under the hood, LARS is a declarative agent framework for building sophisticated LLM workflows.

The Problem It Solves

Every LLM project eventually becomes this:

for attempt in range(max_retries):
    try:
        result = llm.call(prompt)
        validation = validate(result)
        if validation.ok:
            return result
        prompt += f"\nError: {validation.error}. Try again."
    except json.JSONDecodeError as e:
        prompt += f"\nFailed to parse: {e}"
# 47 lines later... still doesn't work reliably

The LARS Solution

Run multiple attempts in parallel. Filter errors naturally. Pick the best.

- name: generate_analysis
  instructions: "Analyze the sales data..."
  takes:
    factor: 3  # Run 3 times in parallel
    evaluator_instructions: "Pick the most thorough analysis"

Instead of serial retries hoping one succeeds, run N attempts simultaneously and select the winner. Same total cost as N serial retries, but faster wall-clock time and higher-quality output.
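
The takes pattern boils down to fan out, filter failures, pick the best. A minimal Python sketch of that idea (hypothetical helper names; LARS expresses this declaratively in YAML, with an LLM evaluator instead of a score function):

```python
from concurrent.futures import ThreadPoolExecutor

def run_takes(attempt_fn, factor, evaluator):
    """Sketch of 'takes': N parallel attempts, drop errors, pick a winner.

    Illustrative names only; not the LARS runtime API."""
    with ThreadPoolExecutor(max_workers=factor) as pool:
        futures = [pool.submit(attempt_fn, i) for i in range(factor)]
    results = []
    for f in futures:
        try:
            results.append(f.result())  # failed takes are filtered out
        except Exception:
            pass
    if not results:
        raise RuntimeError("all takes failed")
    return evaluator(results)           # evaluator picks the winner

# Stand-in attempt: take 1 raises, the others return scored drafts.
def attempt(i):
    if i == 1:
        raise ValueError("bad JSON")
    return {"take": i, "score": i * 10}

best = run_takes(attempt, factor=3,
                 evaluator=lambda rs: max(rs, key=lambda r: r["score"]))
print(best["take"])  # 2
```

Note how the failing take simply disappears from the candidate pool; no retry-and-patch-the-prompt loop is needed.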

Declarative Workflows (Cascades)

Define multi-step agent workflows in YAML:

cascade_id: analyze_data
cells:
  - name: query_data
    tool: sql_data
    tool_inputs:
      query: "SELECT * FROM sales WHERE date > '2024-01-01'"

  - name: analyze
    instructions: |
      Analyze this sales data: {{ outputs.query_data }}
      Create visualizations and summarize key trends.
    skills:
      - create_chart
      - smart_sql_run
    takes:
      factor: 3
      evaluator_instructions: "Pick the most insightful analysis"
    handoffs: [review]

  - name: review
    instructions: "Summarize the findings"
    context:
      from: [analyze]

Key Concepts

Cascades: Declarative YAML workflows
Cells: Execution stages (LLM, deterministic, or human-in-the-loop)
Takes: Parallel execution → filter errors → pick best
Reforge: Iterative refinement of the winning output
Wards: Validation barriers (blocking, retry, advisory)
Skills: Tools available to agents (themselves full multi-cell cascades!)

Database Support

LARS connects to your existing databases:

  • DuckDB (default, in-memory or file)
  • PostgreSQL, MySQL, ClickHouse
  • BigQuery, Snowflake
  • S3, Azure, GCS (Parquet, CSV, JSON)

Your data stays where it is. LARS queries it federated-style. Join across DB boundaries.

LLM Providers

Works with any LLM via LiteLLM:

  • OpenRouter (default) - access to 200+ models, fine-grained cost tracking
  • OpenAI, Anthropic, Google
  • Ollama (local & remote models, zero cost)
  • Azure OpenAI, AWS Bedrock, Vertex AI

Installation Options

# Basic
pip install larsql

# With browser automation (Playwright)
pip install larsql[browser]

# With local models (HuggingFace)
pip install larsql[local-models]

# Everything
pip install larsql[all]

Running Cascades

# Run a workflow
lars run cascades/example.yaml --input '{"task": "analyze sales data"}'

# With model override
lars run cascades/example.yaml --model "anthropic/claude-sonnet-4"

Studio Web UI (Optional)

# Launch the visual interface
lars serve studio

# Access at http://localhost:5050
# - SQL IDE with semantic operators
# - Cascade runner (incl. takes + winners)
# - Context inspector ("what the model saw")
# - Cost explorer (by query/cascade/model)
Screenshots: Studio overview

Studio overview screenshots

Documentation

Full documentation at larsql.com

Example: Create Your Own Operator

Any cascade can become a SQL operator. No Python required.

# cascades/semantic_sql/sentiment_score.cascade.yaml
cascade_id: sentiment_score
sql_function:
  name: SENTIMENT_SCORE
  operators:
    - "SENTIMENT_SCORE({{ text }})"
  returns: DOUBLE
  shape: SCALAR
cells:
  - name: score
    model: google/gemini-2.5-flash-lite
    instructions: |
      Rate the sentiment of this text from -1.0 to 1.0.
      TEXT: {{ input.text }}
      Return only the number.

Worried about the output? Me too. Run validations or multiple takes (on multiple models), all within a SQL call.
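
To picture what such a validation has to do: pull a number out of free-text model output and clamp it to the declared range. A hypothetical sketch; in LARS you would express this as a ward or extra takes rather than hand-written code:

```python
import re

def parse_score(reply: str) -> float:
    """Defensively parse a model's numeric reply; clamp to [-1.0, 1.0].

    Illustrative only: models sometimes wrap the number in prose
    or wander outside the requested range."""
    m = re.search(r"-?\d+(?:\.\d+)?", reply)
    if m is None:
        raise ValueError(f"no number in reply: {reply!r}")
    return max(-1.0, min(1.0, float(m.group())))

print(parse_score("Sentiment: -0.7"))  # -0.7
print(parse_score("2"))                # clamped to 1.0
```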

Now use it:

SELECT product_id, AVG(SENTIMENT_SCORE(review)) AS sentiment
FROM reviews
GROUP BY product_id
HAVING sentiment < -0.3

Contributing

Issues welcome at github.com/ryrobes/larsql

License

O'SASSY License (basically MIT)
