The AI Data Layer: Secure, sandboxed environment for AI agents to query and process data.

Strake

The AI Data Layer

Strake is the AI Data Layer. Not just a query tool, and not a RAG pipeline. It's the sandboxed execution environment where agents meet your data and return answers, not rows.

Built on Apache Arrow DataFusion, Strake enables AI agents to discover, query, and process data across your entire stack (PostgreSQL, Snowflake, S3, and more) without the need for data movement or ETL.

📚 Full Documentation: Check out the complete documentation for installation, architecture, and API references.


Key Features

  • Developer First: Built for engineers. Type-safe configuration, rich CLI tooling, and local development workflows.
  • Secure Execution Layer: Run untrusted Python code safely using Firecracker MicroVMs or Native OS Sandboxing (Landlock, Seccomp, Namespaces).
  • High Performance: Sub-second latency for federated joins using Apache Arrow.
  • Pluggable Sources: Postgres, S3, Local Files, REST, gRPC, and more.
  • MCP-Native Discovery: Built for the Model Context Protocol. Your agents discover your entire data catalog and schemas instantly.
  • Python Native: Zero-copy integration with Pandas and Polars via PyO3.
  • Enterprise Governance: Row-Level Security (RLS), Column Masking, OIDC Authentication, and Data Contracts (see Enterprise Edition).
  • Observability: Built-in OpenTelemetry tracing and Prometheus metrics.
  • GitOps Native: Manage your data mesh configuration as code. Version control your sources, policies, and metrics.

Code Mode: Don't Compute in Context

Most agents fail when thousands of raw SQL rows are pulled into the LLM's context window. Strake's Code Mode lets them process data in Python inside a secure sandbox and send only the aggregated results to the LLM.

import asyncio

import strake
from strake.mcp import run_python

# Query 10M rows instantly via DataFusion
# Aggregate in Python to prevent context bloat
script = """
df = strake.sql("SELECT * FROM user_events")
summary = df.groupby('feature_flag')['latency'].median()
print(summary.to_json())
"""

async def main():
    # Runs isolated with OS Sandboxing or Firecracker VMs
    result = await run_python(script)
    print(result)

# `await` is only valid inside a coroutine, so drive it with asyncio.run()
asyncio.run(main())
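
To make the context-bloat point concrete, the following sketch compares the size of a raw result set with the size of its per-flag summary. It uses only the Python standard library and synthetic data; it does not call Strake, and the row count, flag names, and helper names (`make_rows`, `median_latency_by_flag`) are assumptions chosen for illustration.

```python
import json
import random
import statistics
from collections import defaultdict


def make_rows(n, seed=0):
    """Generate n synthetic event rows (hypothetical stand-in for user_events)."""
    rng = random.Random(seed)
    flags = ["control", "variant_a", "variant_b"]
    return [
        {"feature_flag": rng.choice(flags), "latency": rng.randint(20, 500)}
        for _ in range(n)
    ]


def median_latency_by_flag(rows):
    """Group rows by feature_flag and return the median latency per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["feature_flag"]].append(row["latency"])
    return {flag: statistics.median(vals) for flag, vals in sorted(groups.items())}


rows = make_rows(10_000)
summary = median_latency_by_flag(rows)

# Character counts are a rough proxy for what an LLM would have to read.
raw_chars = len(json.dumps(rows))
summary_chars = len(json.dumps(summary))
print(f"raw rows: {raw_chars:,} characters")
print(f"summary:  {summary_chars:,} characters")
```

The summary stays the same size no matter how many rows are aggregated, which is the property Code Mode relies on: the heavy data stays in the sandbox and only the small result is returned.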

Quick Start (5-Minute Setup)

1. Installation

Quick Install (Linux/macOS)

curl -sSfL https://strakedata.com/install.sh | sh

Install via Cargo (Rust)

# Run from the root of a local checkout of the Strake repository
cargo install --path crates/cli
cargo install --path crates/server

Python Client

pip install strake

2. Configuration (GitOps)

Initialize a new config and validate your sources:

# Initialize a new config
strake-cli init

# Validate configuration
strake-cli validate sources.yaml

# Apply to the metadata store (Sync)
strake-cli apply sources.yaml --force
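
In automation it can help to make sure `apply` never runs against a config that failed `validate`. The sketch below chains the two commands shown above from Python. It assumes `strake-cli` signals a failed validation with a non-zero exit status, which this README does not state; the helper names are hypothetical. Because the effect of `--force` is not described here, the sketch leaves it off by default.

```python
import subprocess
from typing import Callable, List


def sync_commands(config_path: str, force: bool = False) -> List[List[str]]:
    """Return the validate and apply invocations in the order they should run."""
    apply_cmd = ["strake-cli", "apply", config_path]
    if force:
        apply_cmd.append("--force")
    return [["strake-cli", "validate", config_path], apply_cmd]


def validate_then_apply(
    config_path: str, force: bool = False, runner: Callable = subprocess.run
) -> None:
    """Run validate, then apply, stopping at the first command that fails."""
    for cmd in sync_commands(config_path, force):
        # check=True raises CalledProcessError on a non-zero exit status,
        # so `apply` is never reached when `validate` fails (assumed behavior).
        runner(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in sync_commands("sources.yaml", force=True):
        print(" ".join(cmd))
```

On a machine with `strake-cli` installed, calling `validate_then_apply("sources.yaml")` runs both steps for real. The `runner` parameter exists so the sequence can be exercised in tests without invoking the CLI.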

3. Query with Python

First, define your data sources in a sources.yaml file:

sources:
  - name: local_files
    type: csv
    path: "data/*.csv"
    has_header: true
    tables:
      - name: measurements

Then, query using the Strake Python client:

import strake
import polars as pl

# Connect using your source configuration
conn = strake.connect(sources_config="sources.yaml")

# Query across sources using standard SQL
query = "SELECT * FROM measurements LIMIT 5"
data = conn.sql(query)

# Zero-copy integration with Polars/Pandas
df = pl.from_arrow(data)
print(df)

Project Structure

| Component | Description |
| --- | --- |
| strake-runtime | Orchestration layer (Federation Engine, Sidecar). |
| strake-connectors | Data source implementations (Postgres, S3, REST, etc). |
| strake-sql | SQL Dialects, Query Optimization, and Substrait generation. |
| strake-common | Shared types, configuration, and error handling. |
| strake-server | Arrow Flight SQL server implementation. |
| strake-cli | GitOps CLI for managing data mesh configurations. |
| strake-python | Python bindings for high-performance data access. |

Contributing

We welcome contributions! Please see our Contributing Guidelines for details on how to get started.

License

Strake is licensed under the Apache 2.0 license.
