# Databend local mode Python Binding
Python binding for Databend in local mode - The multi-modal data warehouse built for the AI era.
Databend unifies structured data, JSON documents, and vector embeddings in a single platform with near 100% Snowflake compatibility. Built in Rust with MPP architecture and S3-native storage for cloud-scale analytics.
## Installation

```shell
pip install databend
```

To verify the installation, run:

```shell
python3 -c "import databend; ctx = databend.SessionContext(); ctx.sql('SELECT version() AS version').show()"
```
## API Reference

### Core Operations

| Method | Description |
|---|---|
| `connect(path=":memory:")` | Create a local embedded connection |
| `SessionContext()` | Create a new session context |
| `sql(query)` | Execute a SQL query; returns a DataFrame |
| `execute(query)` | Execute SQL and return the connection |
| `table(name)` | Query a registered table or view |
| `register(name, source)` | Register a local path, pandas/polars frame, or Arrow table |
| `from_df(obj, name=None)` | Materialize an in-process dataframe-like object as a relation |
### File Registration

| Method | Description |
|---|---|
| `register_parquet(name, path, pattern=None, connection=None)` | Register Parquet files as a table |
| `register_csv(name, path, pattern=None, connection=None)` | Register CSV files as a table |
| `register_ndjson(name, path, pattern=None, connection=None)` | Register NDJSON files as a table |
| `register_text(name, path, pattern=None, connection=None)` | Register text files as a table |
### Cloud Storage Connections

| Method | Description |
|---|---|
| `create_s3_connection(name, key, secret, endpoint=None, region=None)` | Create an S3 connection |
| `create_azblob_connection(name, url, account, key)` | Create an Azure Blob connection |
| `create_gcs_connection(name, url, credential)` | Create a Google Cloud Storage connection |
| `list_connections()` | List all connections |
| `describe_connection(name)` | Show connection details |
| `drop_connection(name)` | Remove a connection |
### Stage Management

| Method | Description |
|---|---|
| `create_stage(name, url, connection)` | Create an external stage |
| `show_stages()` | List all stages |
| `list_stages(stage_name)` | List files in a stage |
| `describe_stage(name)` | Show stage details |
| `drop_stage(name)` | Remove a stage |
### DataFrame Operations

| Method | Description |
|---|---|
| `collect()` | Execute and collect results |
| `show(num=20)` | Display results in the console |
| `to_pandas()` | Convert to a pandas DataFrame |
| `to_polars()` | Convert to a polars DataFrame |
| `to_arrow_table()` | Convert to a PyArrow Table |
| `df()` | DuckDB-style alias for `to_pandas()` |
| `pl()` | Alias for `to_polars()` |
| `arrow()` | Alias for `to_arrow_table()` |
| `fetchall()` | Collect result rows as Python tuples |
| `fetchone()` | Return the first row or `None` |
## Examples

### Local Tables

```python
import databend

ctx = databend.connect()

# Create and query in-memory tables
ctx.execute("CREATE TABLE users (id INT, name STRING, age INT)")
ctx.execute("INSERT INTO users VALUES (1, 'Alice', 25), (2, 'Bob', 30)")
df = ctx.sql("SELECT * FROM users WHERE age > 25").df()
```
### Working with Local Files

```python
import databend

ctx = databend.connect("./demo-data")

# Query local Parquet files
ctx.read_parquet("/path/to/orders/", name="orders")
ctx.register("customers", "/path/to/customers.parquet")
df = ctx.sql(
    "SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id"
).df()
```
### Working with In-Process DataFrames

```python
import databend
import pandas as pd

ctx = databend.connect("./memory-store")

docs = pd.DataFrame(
    {
        "id": [1, 2],
        "content": ["hello", "vector memory"],
    }
)
ctx.register("docs", docs)
rows = ctx.sql("SELECT id, content FROM docs ORDER BY id").fetchall()
```
### Cloud Storage - S3 Files

```python
import os

import databend

ctx = databend.SessionContext()

# Connect to S3 and query remote files
ctx.create_s3_connection("s3", os.getenv("AWS_ACCESS_KEY_ID"), os.getenv("AWS_SECRET_ACCESS_KEY"))
ctx.register_parquet("trips", "s3://bucket/trips/", connection="s3")
df = ctx.sql("SELECT COUNT(*) FROM trips").to_pandas()
```
### Cloud Storage - S3 Tables

```python
import os

import databend

ctx = databend.SessionContext()

# Create an S3 connection and a persistent table backed by it
ctx.create_s3_connection("s3", os.getenv("AWS_ACCESS_KEY_ID"), os.getenv("AWS_SECRET_ACCESS_KEY"))
ctx.sql("CREATE TABLE s3_table (id INT, name STRING) 's3://bucket/table/' CONNECTION=(CONNECTION_NAME='s3')").collect()
df = ctx.sql("SELECT * FROM s3_table").to_pandas()
```
## Development

```shell
# Set up the environment
uv sync
source .venv/bin/activate

# Run tests
uvx maturin develop -E test
pytest tests/
```
## Local Publish Without Docker

```shell
cd src/bendpy

# Build only
./scripts/local_publish.sh --python python3.12 --skip-upload

# Build only, cleaning old artifacts first
./scripts/local_publish.sh --python python3.12 --clean --skip-upload

# Build multiple Linux targets on the current host
./scripts/local_publish.sh --python python3.12 --clean --skip-upload \
  --target x86_64-unknown-linux-gnu \
  --target aarch64-unknown-linux-gnu

# Build and upload to PyPI
PYPI_TOKEN=your-token ./scripts/local_publish.sh --python python3.12

# Optional: override the package version for this build
PYPI_TOKEN=your-token ./scripts/local_publish.sh --python python3.12 --version 0.1.1
```

This script builds a wheel on the current host with maturin and uploads it with twine. It does not use Docker and does not produce a manylinux wheel. Cross-target builds also require that the host have a working linker and toolchain for the requested target.
## Download files
### databend-1.2.895-cp312-abi3-manylinux_2_28_x86_64.whl

File metadata:

- Size: 86.2 MB
- Tags: CPython 3.12+, manylinux: glibc 2.28+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `d9de88cd999cacea39e66d83bb7215b7ae08937f95e89624fa708f691fcee469` |
| MD5 | `beac56de2d37d8c4680d0c1c7dd63211` |
| BLAKE2b-256 | `63e7db155c6433cc74105c2c401e2e72286c01dd2ab4d2de8d716ea882ec1350` |
### databend-1.2.895-cp312-abi3-manylinux_2_28_aarch64.whl

File metadata:

- Size: 79.1 MB
- Tags: CPython 3.12+, manylinux: glibc 2.28+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `32951c570afe4878dc4c261d785b7122e25abb49d9159dbe35d272fb5d93f2b9` |
| MD5 | `a92b4b8656f38955ee0bff4bcb58b66b` |
| BLAKE2b-256 | `52aa62616f04ad114a089611c36d6d3f33d1b5806f1e919a208fe69ce6bc53ae` |
### databend-1.2.895-cp312-abi3-macosx_11_0_arm64.whl

File metadata:

- Size: 75.3 MB
- Tags: CPython 3.12+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `d909b6e3b246b2e6c00ef8cf22a9e571605738482f6d8b5671462055df1a8ad5` |
| MD5 | `51c52059c3b5ab60718a93fe677eca5e` |
| BLAKE2b-256 | `8506007f78a7d0fcde2efa9fe576371a096da6465c5316c7325670190d4a6471` |
### databend-1.2.895-cp312-abi3-macosx_10_12_x86_64.whl

File metadata:

- Size: 77.4 MB
- Tags: CPython 3.12+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5979bbb0725ac5396d0b14e963b2e1627e8e132b0ac5ff52565bc737f7a6b198` |
| MD5 | `c9771c862407d61c5d53bbe780df9ce9` |
| BLAKE2b-256 | `1a0c21dc32a9e71bda49c57f1761dd433f5bb4b3357b56ada9818c179dff958d` |