The AI Data Layer: Secure, sandboxed environment for AI agents to query and process data.

Project description

Strake

The AI Data Layer


Strake is the AI Data Layer. Not a query tool. Not a RAG pipeline. The sandboxed execution environment where agents meet your data and return answers, not rows.

Built on Apache Arrow DataFusion, Strake enables AI agents to discover, query, and process data across your entire stack (PostgreSQL, Snowflake, S3, and more) without data movement or ETL, giving them structured, safe access to that data.

📚 Full Documentation: Check out the complete documentation for installation, architecture, and API references.


Key Features

  • MCP-Native Discovery: Built for the Model Context Protocol. Your agents immediately discover your entire data catalog and schemas.
  • Run Python, Not Prompts: Every agent execution runs inside strict native OS sandboxes for performance, or ephemeral MicroVMs for hardware-level isolation.
  • Zero-Copy Federation: Query Postgres, S3, local files, REST, gRPC, and more in a single query, with pushdown optimization via Apache Arrow.
  • Read-Only by Default: Strict read-only enforcement, dynamic Row-Level Security (RLS), and PII masking out of the box.
  • Developer First: Built for engineers shipping agents to production. Type-safe configuration, rich CLI tooling, and local development workflows.
  • Python Native: Zero-copy integration with Pandas and Polars via PyO3.
  • GitOps Native: Manage your data mesh configuration as code. Version control your sources, policies, and metrics.
  • Observability: Built-in OpenTelemetry tracing and Prometheus metrics.
  • Enterprise Capabilities: OIDC Authentication, Row-Level Security, and Data Contracts (Enterprise Edition).

Code Mode: Don't Compute in Context

Many agents fail because they pull thousands of raw SQL rows into their context window. Strake's Code Mode lets them process data in Python where it lives, inside a secure sandbox, and send only the results that matter to the LLM.

import asyncio

import strake
from strake.mcp import run_python

# This script is executed inside the sandbox, not in the agent's context
script = """
# 1. Query 10M rows instantly via DataFusion
df = strake.sql("SELECT * FROM user_events")

# 2. Aggregate in Python to prevent context bloat
summary = df.groupby('feature_flag')['latency'].median()

# 3. Print exactly what the LLM needs
print(summary.to_json())
"""

async def main():
    # Runs isolated with OS Sandboxing or Firecracker VMs
    result = await run_python(script)
    print(result)

# run_python is awaited, so it needs an event loop when run as a script
asyncio.run(main())
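
To make the "don't compute in context" idea concrete, here is a minimal standalone sketch that uses only the Python standard library rather than Strake itself. The sample rows, the column names `feature_flag` and `latency`, and the helper `summarize_latency` are hypothetical. The sketch only compares the size of the raw rows with the size of the aggregated summary an agent would receive instead.

```python
import json
import statistics
from collections import defaultdict


def summarize_latency(rows):
    """Return the median latency per feature flag (hypothetical helper)."""
    by_flag = defaultdict(list)
    for row in rows:
        by_flag[row["feature_flag"]].append(row["latency"])
    return {flag: statistics.median(values) for flag, values in sorted(by_flag.items())}


# Hypothetical stand-in for the rows a query over user_events might return.
rows = [
    {"feature_flag": "new_checkout" if i % 2 else "control", "latency": 100 + i % 50}
    for i in range(10_000)
]

# What the LLM would see if raw rows were returned versus a summary.
raw_payload = json.dumps(rows)
summary_payload = json.dumps(summarize_latency(rows))

print(f"raw rows: {len(raw_payload)} characters")
print(f"summary:  {len(summary_payload)} characters")
print(summary_payload)
```

The summary stays the same size no matter how many rows are scanned, which is the property Code Mode relies on.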

Quick Start (5-Minute Setup)

If you're building agents that need to query Postgres, S3, and a REST API in a single operation — without context overflow and without leaking credentials — Strake is the runtime you're missing.

1. Installation

Quick Install (Linux/macOS)

curl -sSfL https://strakedata.com/install.sh | sh

Install via Cargo (Rust)

# Run from the root of a local checkout of the Strake repository
cargo install --path crates/cli
cargo install --path crates/server

Python Client

pip install strake

2. Configuration (GitOps)

Initialize a new config and validate your sources:

# Initialize a new config
strake-cli init

# Validate configuration
strake-cli validate sources.yaml

# Apply to the metadata store (Sync)
strake-cli apply sources.yaml --force

3. Query with Python

First, define your data sources in a sources.yaml file:

sources:
  - name: local_files
    type: csv
    path: "data/*.csv"
    has_header: true
    tables:
      - name: measurements

Then, query using the Strake Python client:

import strake
import polars as pl

# Connect using your source configuration
conn = strake.connect(sources_config="sources.yaml")

# Query across sources using standard SQL
query = "SELECT * FROM measurements LIMIT 5"
data = conn.sql(query)

# Zero-copy integration with Polars/Pandas
df = pl.from_arrow(data)
print(df)
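
The query above assumes that CSV files matching `data/*.csv` already exist. If you need something to query, a sketch like the following creates one with the standard library. The file name `measurements.csv`, the helper `write_sample_csv`, and the columns `sensor_id`, `recorded_at`, and `value` are hypothetical. How Strake maps files matching the glob to the `measurements` table is not covered here, so check the documentation for the exact rules.

```python
import csv
from pathlib import Path


def write_sample_csv(directory="data", filename="measurements.csv"):
    """Write a small CSV with a header row and return its path."""
    path = Path(directory) / filename
    path.parent.mkdir(parents=True, exist_ok=True)

    # Hypothetical sample rows; replace with your own data.
    rows = [
        {"sensor_id": "a1", "recorded_at": "2024-01-01T00:00:00Z", "value": 20.5},
        {"sensor_id": "a1", "recorded_at": "2024-01-01T00:05:00Z", "value": 20.9},
        {"sensor_id": "b2", "recorded_at": "2024-01-01T00:00:00Z", "value": 18.2},
    ]

    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["sensor_id", "recorded_at", "value"])
        writer.writeheader()  # matches has_header: true in sources.yaml
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    print(f"wrote {write_sample_csv()}")
```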

Project Structure

Component           Description
strake-runtime      Orchestration layer (Federation Engine, Sidecar).
strake-connectors   Data source implementations (Postgres, S3, REST, etc).
strake-sql          SQL Dialects, Query Optimization, and Substrait generation.
strake-common       Shared types, configuration, and error handling.
strake-server       Arrow Flight SQL server implementation.
strake-cli          GitOps CLI for managing data mesh configurations.
strake-python       Python bindings for high-performance data access.

Contributing

We welcome contributions! Please see our Contributing Guidelines for details on how to get started.

License

Strake is licensed under the Apache 2.0 license.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

strake-0.2.3rc1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl (62.8 MB)
Tags: PyPy, manylinux: glibc 2.28+ x86-64

strake-0.2.3rc1-cp310-abi3-win_amd64.whl (54.0 MB)
Tags: CPython 3.10+, Windows x86-64

strake-0.2.3rc1-cp310-abi3-manylinux_2_28_x86_64.whl (62.8 MB)
Tags: CPython 3.10+, manylinux: glibc 2.28+ x86-64

strake-0.2.3rc1-cp310-abi3-macosx_11_0_arm64.whl (53.8 MB)
Tags: CPython 3.10+, macOS 11.0+ ARM64

File details

Details for the file strake-0.2.3rc1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for strake-0.2.3rc1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 b8f9e7b87a5864c25636b49800500d5939ca59e64a9b6a62bd19b6e62629456f
MD5 4e0e4d6e63a9a32daaab116ac067b832
BLAKE2b-256 688dcbb491b16c7d250a84397b669fa71ae363841a7ae993727be3dbeac921bd
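
The published digests can be used to check a downloaded wheel before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical and are not part of Strake or pip.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, `matches_published_hash("strake-0.2.3rc1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", "b8f9e7b87a5864c25636b49800500d5939ca59e64a9b6a62bd19b6e62629456f")` should return `True` for an intact download of that file.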

See more details on using hashes here.

Provenance

The following attestation bundles were made for strake-0.2.3rc1-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl:

Publisher: release.yml on strake-data/strake

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file strake-0.2.3rc1-cp310-abi3-win_amd64.whl.

File metadata

  • Download URL: strake-0.2.3rc1-cp310-abi3-win_amd64.whl
  • Upload date:
  • Size: 54.0 MB
  • Tags: CPython 3.10+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for strake-0.2.3rc1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 ba14b90a24455512c73dfb77c74618145aed50249d15ff5fa77fc85af2f9889f
MD5 7fdb34ae02d9e2dc8bffe197afdb6f66
BLAKE2b-256 d877cee502c48bb69583434c59816eb578a2631e83f26a6905b40f02e81321b9

Provenance

The following attestation bundles were made for strake-0.2.3rc1-cp310-abi3-win_amd64.whl:

Publisher: release.yml on strake-data/strake

File details

Details for the file strake-0.2.3rc1-cp310-abi3-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for strake-0.2.3rc1-cp310-abi3-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 0d810e1824d74674b1feb265756c0c1d151e08bc9e9c7f066187f45c1079f172
MD5 35406b5d88959c542fbbc2a60af3554c
BLAKE2b-256 b3a746f189f7426725543f9416ec405f25bc8b2b26ed7023d2654f5f70086516

Provenance

The following attestation bundles were made for strake-0.2.3rc1-cp310-abi3-manylinux_2_28_x86_64.whl:

Publisher: release.yml on strake-data/strake

File details

Details for the file strake-0.2.3rc1-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for strake-0.2.3rc1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0120370b4dc94a3cb9e399ac975e4cfb318cff288d9572302ef7994691da24f4
MD5 1058468937e6b4cff9ed0ed1a5e48a19
BLAKE2b-256 cacc6ba1b19fd0c84f48e49331342de135a06b5f12c1b36b0c741166b9ec9ef9

Provenance

The following attestation bundles were made for strake-0.2.3rc1-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on strake-data/strake
