ApexBase

High-performance embedded database with Rust core and Python API

ApexBase is a high-performance embedded database powered by a Rust core, with a clean and ergonomic Python API.

✨ Features

  • 🚀 High performance - Rust core with batch write throughput up to 970K+ ops/s
  • 📦 Single-file storage - custom .apex file format with no external dependencies
  • SQL DDL support - CREATE TABLE, ALTER TABLE, DROP TABLE via standard SQL
  • Full-text search - NanoFTS integration with fuzzy search support
  • 🐍 Python-friendly - clean API with Pandas/Polars/PyArrow integrations
  • 💾 Compact storage - ~45% smaller on disk compared to traditional approaches

📦 Installation

# Install from PyPI
pip install apexbase

# Build from source
maturin develop --release
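
After either install path, a quick standard-library check can confirm that the package is importable in the current environment. This is a minimal sketch; `is_installed` is a hypothetical helper name, not part of ApexBase.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


# Prints True once `pip install apexbase` (or the maturin build) has succeeded.
print("apexbase importable:", is_installed("apexbase"))
```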

🚀 Quick Start

Basic Usage

from apexbase import ApexClient

# Create a client (data is stored in a single .apex file)
client = ApexClient("./data")

# Store a single record
client.store({"name": "Alice", "age": 30, "city": "Beijing"})

# Store multiple records
client.store([
    {"name": "Bob", "age": 25, "city": "Shanghai"},
    {"name": "Charlie", "age": 35, "city": "Beijing"}
])

# SQL query (recommended)
results = client.execute("SELECT * FROM default WHERE age > 28")

# Convert to DataFrame
df = results.to_pandas()

# Close client
client.close()

Table Management

# Create and switch tables
client.create_table("users")
client.use_table("users")

# List all tables
tables = client.list_tables()

# Drop table
client.drop_table("old_table")

Data Operations

# Store from various formats
import pandas as pd
import polars as pl
import pyarrow as pa

# From pandas DataFrame
df = pd.DataFrame({"name": ["A", "B"], "age": [20, 30]})
client.from_pandas(df)

# From polars DataFrame
df_pl = pl.DataFrame({"name": ["C", "D"], "age": [25, 35]})
client.from_polars(df_pl)

# From PyArrow Table
table = pa.table({"name": ["E", "F"], "age": [28, 38]})
client.from_pyarrow(table)

# Columnar storage (fastest for bulk data)
client.store({
    "name": ["G", "H", "I"],
    "age": [22, 32, 42]
})
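
The columnar form above is a dict that maps each column name to a list of values of equal length. When data starts out as a list of record dicts, it can be pivoted into that shape before calling `store`. The sketch below uses plain Python only; `rows_to_columns` is a hypothetical helper (not part of ApexBase), and filling missing keys with `None` is an assumption that may not suit every schema.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot a list of record dicts into a dict of equal-length column lists."""
    columns: dict[str, list[Any]] = {}
    for i, row in enumerate(rows):
        # A key seen for the first time gets back-filled with None for earlier rows.
        for key in row:
            if key not in columns:
                columns[key] = [None] * i
        # Every known column receives exactly one value per row.
        for key, values in columns.items():
            values.append(row.get(key))
    return columns


records = [
    {"name": "G", "age": 22},
    {"name": "H", "age": 32},
    {"name": "I", "age": 42},
]
columnar = rows_to_columns(records)
# columnar can then be passed to client.store(columnar)
```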

Query Operations

# Full SQL support
results = client.execute("SELECT name, age FROM default WHERE age > 25 ORDER BY age DESC LIMIT 10")

# WHERE expression (compatibility mode)
results = client.query("age > 28")
results = client.query("name LIKE 'A%'")
results = client.query(where_clause="city = 'Beijing'", limit=100)

# Aggregation
agg = client.execute("SELECT COUNT(*), AVG(age), MAX(age) FROM default")
count = agg.scalar()  # Get single value

# Retrieve by _id (internal auto-increment ID)
record = client.retrieve(0)
records = client.retrieve_many([0, 1, 2])
all_data = client.retrieve_all()

Column Operations

# Add column
client.add_column("email", "String")

# Rename column
client.rename_column("email", "email_address")

# Drop column
client.drop_column("email_address")

# Get column type
dtype = client.get_column_dtype("age")

# List all fields
fields = client.list_fields()

SQL DDL (Data Definition Language)

ApexBase supports full SQL DDL operations:

# Create table via SQL
client.execute("CREATE TABLE employees")
client.execute("CREATE TABLE IF NOT EXISTS departments")  # No error if the table already exists

# Add columns via SQL
client.execute("ALTER TABLE employees ADD COLUMN name STRING")
client.execute("ALTER TABLE employees ADD COLUMN age INT")

# Insert data via SQL
client.execute("INSERT INTO employees (name, age) VALUES ('Alice', 30)")
client.execute("INSERT INTO employees (name, age) VALUES ('Bob', 25), ('Charlie', 35)")

# Query the data
results = client.execute("SELECT * FROM employees WHERE age > 28")

# Drop table via SQL
client.execute("DROP TABLE employees")
client.execute("DROP TABLE IF EXISTS departments")  # No error if the table does not exist

Multi-Statement SQL

You can execute multiple SQL statements in a single call by separating them with semicolons:

# Execute multiple DDL statements at once
client.execute("""
    CREATE TABLE IF NOT EXISTS products;
    ALTER TABLE products ADD COLUMN name STRING;
    ALTER TABLE products ADD COLUMN price FLOAT;
    INSERT INTO products (name, price) VALUES ('Laptop', 999.99)
""")

# Execute multiple INSERT statements
client.execute("""
    INSERT INTO products (name, price) VALUES ('Mouse', 29.99);
    INSERT INTO products (name, price) VALUES ('Keyboard', 79.99);
    INSERT INTO products (name, price) VALUES ('Monitor', 299.99)
""")

# Query results
results = client.execute("SELECT * FROM products ORDER BY price DESC")
print(results.to_pandas())
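
When the statements are generated in code rather than written by hand, they can be joined into one semicolon-separated script before the single `execute` call. A minimal sketch: `join_statements` is a hypothetical helper, not an ApexBase function, and it follows the examples above in leaving the last statement without a trailing semicolon.

```python
def join_statements(statements):
    """Join SQL statements into one script separated by semicolons."""
    cleaned = []
    for statement in statements:
        # Drop surrounding whitespace and any trailing semicolon.
        text = statement.strip().rstrip(";").strip()
        if text:
            cleaned.append(text)
    return ";\n".join(cleaned)


script = join_statements([
    "CREATE TABLE IF NOT EXISTS products;",
    "ALTER TABLE products ADD COLUMN name STRING",
    "ALTER TABLE products ADD COLUMN price FLOAT",
    "",
])
# script can then be passed to client.execute(script)
```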

Full-Text Search

# Initialize FTS
client.init_fts(index_fields=["name", "city"], lazy_load=True)

# Search
ids = client.search_text("Alice")
records = client.search_and_retrieve("Beijing")
top_records = client.search_and_retrieve_top("keyword", n=10)

# Fuzzy search (tolerates typos)
ids = client.fuzzy_search_text("Alic")

# FTS stats
stats = client.get_fts_stats()

# Disable or drop FTS
client.disable_fts()
client.drop_fts()
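
To illustrate what "tolerates typos" means in general, the sketch below uses Python's standard `difflib` module to match a misspelled query against a small vocabulary. This is only an illustration of approximate matching; it is not NanoFTS's algorithm, and the `fuzzy_lookup` helper and the 0.6 cutoff are assumptions made for the example.

```python
import difflib


def fuzzy_lookup(query, vocabulary, cutoff=0.6):
    """Return vocabulary entries whose similarity to `query` is at least `cutoff`."""
    return difflib.get_close_matches(query, vocabulary, n=3, cutoff=cutoff)


names = ["Alice", "Bob", "Charlie"]
matches = fuzzy_lookup("Alic", names)
print(matches)
```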

ResultView Operations

results = client.execute("SELECT * FROM default")

# Convert to different formats
df = results.to_pandas()          # pandas DataFrame
pl_df = results.to_polars()       # polars DataFrame
arrow_table = results.to_arrow()  # PyArrow Table
dicts = results.to_dict()         # List of dicts

# Result properties
print(results.shape)       # (rows, columns)
print(results.columns)     # column names
print(len(results))        # row count

# Get single values
first_row = results.first()
ids = results.get_ids()    # numpy array
scalar = client.execute("SELECT COUNT(*) FROM default").scalar()

Context Manager Support

# Automatic cleanup with context manager
with ApexClient("./data") as client:
    client.store({"key": "value"})
    results = client.execute("SELECT * FROM default")
    # Client automatically closed on exit
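
The value of the `with` form is that cleanup runs even when the body raises. The sketch below demonstrates this with a stand-in class, so it runs without ApexBase installed; `DemoClient` is hypothetical, and the assumption that `ApexClient.__exit__` simply calls `close()` is inferred from the comment above rather than confirmed.

```python
class DemoClient:
    """Hypothetical stand-in that mimics a client with close() and `with` support."""

    def __init__(self):
        self.closed = False

    def store(self, data):
        if self.closed:
            raise RuntimeError("client is closed")

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions raised in the body


client = DemoClient()
try:
    with client:
        client.store({"key": "value"})
        raise ValueError("simulated failure inside the block")
except ValueError:
    pass

print(client.closed)  # close() ran despite the exception
```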

📊 Performance Comparison

ApexBase vs DuckDB

Comparison with DuckDB (v1.1.3), a popular embedded analytical database.

Test Environment

Component    Specification
Platform     macOS 26.2 (arm64)
CPU          Apple M1 Pro
Memory       32.0 GB
Python       3.11.10
ApexBase     v0.4.2
DuckDB       v1.1.3
PyArrow      19.0.0

Dataset: 1,000,000 rows with columns: name (string), age (int), score (float), category (string)

Query Performance (average of 5 iterations, after 2 warmup runs)

Query                          ApexBase    DuckDB         Ratio
COUNT(*)                       0.08ms      0.37ms         0.22x (4.6x faster)
SELECT * LIMIT 100             0.09ms      0.25ms         0.35x (2.9x faster)
SELECT * LIMIT 10K             0.26ms      3.53ms         0.07x (13.6x faster)
Filter (name = 'user_5000')    7.41ms      6.35ms         1.17x
Insert 1M rows                 327.44ms    197844.50ms    0.002x (604x faster)

Notes:

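  • Ratio is ApexBase time divided by DuckDB time: a ratio below 1 means ApexBase is faster, and a ratio above 1 means DuckDB is faster (as in the filter query)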
  • ApexBase excels at INSERT operations and large LIMIT queries due to Arrow IPC optimization
  • DuckDB has better performance on complex GROUP BY and ORDER BY operations
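
The methodology stated above (2 warmup runs, then the mean of 5 timed iterations) and the Ratio column can be reproduced with a small harness. This is a sketch, not the authors' benchmark script; `time_ms` and `ratio_and_speedup` are hypothetical names, and the callable being timed is a placeholder for a real query.

```python
import time
from statistics import mean


def time_ms(fn, warmup=2, iterations=5):
    """Run `fn` untimed `warmup` times, then return the mean of `iterations` timed runs in ms."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def ratio_and_speedup(apex_ms, duck_ms):
    """Return (ApexBase / DuckDB ratio, DuckDB / ApexBase speedup factor)."""
    return apex_ms / duck_ms, duck_ms / apex_ms


calls = []
avg_ms = time_ms(lambda: calls.append(1))  # placeholder workload

# Insert row from the table: 327.44 ms vs 197844.50 ms
ratio, speedup = ratio_and_speedup(327.44, 197844.50)
print(f"{ratio:.3f}x ({speedup:.0f}x faster)")
```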

🔧 API Reference

ApexClient

Initialization

client = ApexClient(
    dirpath="./data",           # Data directory (default: current dir)
    drop_if_exists=False,       # Delete existing data on open
    batch_size=1000,            # Batch size for operations
    enable_cache=True,          # Enable query cache
    cache_size=10000,           # Cache size
    prefer_arrow_format=True,   # Prefer Arrow format for results
    durability="fast",          # Durability level: "fast" | "safe" | "max"
)

# Create clean instance (drop existing data)
client = ApexClient.create_clean("./data")

# Context manager
with ApexClient("./data") as client:
    ...

Table Management

Method                Description
create_table(name)    Create a new table
drop_table(name)      Drop a table
use_table(name)       Switch to a table
list_tables()         List all tables
current_table         Property: get current table name

Data Storage

Method                 Description
store(data)            Store data (dict, list, DataFrame, Arrow Table)
from_pandas(df)        Import from pandas DataFrame
from_polars(df)        Import from polars DataFrame
from_pyarrow(table)    Import from PyArrow Table

Data Retrieval

Method                        Description
retrieve(id)                  Get record by internal _id
retrieve_many(ids)            Get multiple records by _id
retrieve_all()                Get all records
execute(sql)                  Execute SQL query
query(where_clause, limit)    Query with WHERE expression
count_rows(table)             Count rows in table

Data Modification

Method                         Description
replace(id, data)              Replace a record
batch_replace({id: data})      Batch replace records
delete(id) or delete([ids])    Delete record(s)
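
The modification methods have no usage example elsewhere in this document, so here is a hedged sketch built only from the signatures in the table above. `apply_changes` and `RecordingClient` are hypothetical: the stand-in records calls so the example runs without ApexBase, and the assumption that `replace` expects a complete record (not a partial update) is not confirmed by the source.

```python
class RecordingClient:
    """Hypothetical stand-in that records calls; used only for a dry run."""

    def __init__(self):
        self.calls = []

    def replace(self, record_id, data):
        self.calls.append(("replace", record_id, data))

    def batch_replace(self, updates):
        self.calls.append(("batch_replace", dict(updates)))

    def delete(self, ids):
        self.calls.append(("delete", ids))


def apply_changes(client, updates, obsolete_ids=()):
    """Replace records given as {_id: record}, then delete obsolete _ids.

    `client` is assumed to expose replace, batch_replace and delete as
    listed in the table above (for example an ApexClient).
    """
    if len(updates) == 1:
        record_id, data = next(iter(updates.items()))
        client.replace(record_id, data)
    elif updates:
        client.batch_replace(updates)
    if obsolete_ids:
        client.delete(list(obsolete_ids))


demo = RecordingClient()
apply_changes(demo, {0: {"name": "Alice", "age": 31}}, obsolete_ids=[2])
apply_changes(demo, {0: {"name": "Alice", "age": 32}, 1: {"name": "Bob", "age": 26}})
```

With a real client, the same function would be called as `apply_changes(client, updates, obsolete_ids)`.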

Column Operations

Method                     Description
add_column(name, type)     Add a column
drop_column(name)          Drop a column
rename_column(old, new)    Rename a column
get_column_dtype(name)     Get column data type
list_fields()              List all fields/columns

Full-Text Search

Method                                         Description
init_fts(index_fields, lazy_load, cache_size)  Initialize FTS
search_text(query)                             Search documents
fuzzy_search_text(query)                       Fuzzy search (tolerates typos)
search_and_retrieve(query, limit, offset)      Search and return records
search_and_retrieve_top(query, n)              Return top N results
get_fts_stats()                                Get FTS statistics
disable_fts()                                  Disable FTS
drop_fts()                                     Drop FTS index

Utility

Method                         Description
flush()                        Flush data to disk
set_auto_flush(rows, bytes)    Set auto-flush thresholds
get_auto_flush()               Get auto-flush configuration
estimate_memory_bytes()        Estimate memory usage
close()                        Close the client
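
One way `flush()` might be used is at the end of a chunked bulk load, so that everything written so far reaches disk before the program moves on. The sketch below relies only on `store` and `flush` from the tables above; `load_in_batches` and `RecordingClient` are hypothetical, and the batch size of 1000 is an arbitrary choice for illustration.

```python
class RecordingClient:
    """Hypothetical stand-in that records store() and flush() calls."""

    def __init__(self):
        self.batches = []
        self.flushed = 0

    def store(self, data):
        self.batches.append(list(data))

    def flush(self):
        self.flushed += 1


def load_in_batches(client, rows, batch_size=1000):
    """Store `rows` in chunks of `batch_size`, then flush once at the end."""
    stored = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        client.store(batch)
        stored += len(batch)
    client.flush()
    return stored


demo = RecordingClient()
rows = [{"n": i} for i in range(2500)]
total = load_in_batches(demo, rows)
```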

ResultView

Query results are returned as ResultView objects:

Method/Property               Description
to_pandas(zero_copy=True)     Convert to pandas DataFrame
to_polars()                   Convert to polars DataFrame
to_arrow()                    Convert to PyArrow Table
to_dict()                     Convert to list of dicts
scalar()                      Get single scalar value
first()                       Get first row
get_ids(return_list=False)    Get record IDs
shape                         Property: (rows, columns)
columns                       Property: column names
__len__()                     Row count
__iter__()                    Iterate over rows
__getitem__(idx)              Index access
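
The three dunder methods at the end of the table are what make `len(results)`, `for row in results` and `results[0]` work. The stand-in below shows the protocol in plain Python; `RowsView` is hypothetical and says nothing about how ApexBase's ResultView is implemented or what type its rows have.

```python
class RowsView:
    """Hypothetical list-backed container illustrating the sequence protocol."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __len__(self):
        return len(self._rows)

    def __iter__(self):
        return iter(self._rows)

    def __getitem__(self, idx):
        return self._rows[idx]

    def first(self):
        """Return the first row, or None when there are no rows."""
        return self._rows[0] if self._rows else None

    @property
    def columns(self):
        return list(self._rows[0]) if self._rows else []

    @property
    def shape(self):
        return (len(self._rows), len(self.columns))


view = RowsView([{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}])
ages = [row["age"] for row in view]  # uses __iter__
second = view[1]                     # uses __getitem__
```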

📚 Documentation

Documentation entry point: docs/README.md

📄 License

Apache-2.0
