High-performance HTAP embedded database with Rust core and Python API

ApexBase

ApexBase is an embedded columnar database designed for Hybrid Transactional/Analytical Processing (HTAP) workloads. It combines a high-throughput columnar storage engine written in Rust with an ergonomic Python API. In the benchmarks reported below it is faster than both SQLite and DuckDB on 25 of 27 queries. Each table is stored in its own .apex file, with no server process and zero external dependencies.

Features

HTAP architecture — V4 Row Group columnar storage with DeltaStore for cell-level updates; fast inserts and fast analytical scans in one engine
Multi-database support — multiple isolated databases in one directory; cross-database queries with standard db.table SQL syntax
Single-file storage — custom .apex format per table, no server process, no external dependencies
Comprehensive SQL — DDL, DML, JOINs (INNER/LEFT/RIGHT/FULL/CROSS), subqueries (IN/EXISTS/scalar), CTEs (WITH ... AS), UNION/UNION ALL, window functions, EXPLAIN/ANALYZE, multi-statement execution
70+ built-in functions — math (ABS, SQRT, POWER, LOG, trig), string (UPPER, LOWER, SUBSTR, REPLACE, CONCAT, REGEXP_REPLACE, ...), date (YEAR, MONTH, DAY, DATEDIFF, DATE_ADD, ...), conditional (COALESCE, IFNULL, NULLIF, CASE WHEN, GREATEST, LEAST)
Aggregation and analytics — COUNT, SUM, AVG, MIN, MAX, COUNT(DISTINCT), GROUP BY, HAVING, ORDER BY with NULLS FIRST/LAST
Window functions — ROW_NUMBER, RANK, DENSE_RANK, NTILE, PERCENT_RANK, CUME_DIST, LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE, RUNNING_SUM, and windowed SUM/AVG/COUNT/MIN/MAX with PARTITION BY and ORDER BY
Transactions — BEGIN / COMMIT / ROLLBACK with OCC (Optimistic Concurrency Control), SAVEPOINT / ROLLBACK TO / RELEASE, statement-level auto-rollback
MVCC — multi-version concurrency control with snapshot isolation, version store, and garbage collection
Indexing — B-Tree and Hash indexes with CREATE INDEX / DROP INDEX / REINDEX; automatic multi-index AND intersection for compound predicates
Full-text search — built-in NanoFTS integration with fuzzy matching
JIT compilation — Cranelift-based JIT for predicate evaluation and SIMD-vectorized aggregations
Zero-copy Python bridge — Arrow IPC between Rust and Python; direct conversion to Pandas, Polars, and PyArrow
Durability levels — configurable fast / safe / max with WAL support and crash recovery
Compact storage — dictionary encoding for low-cardinality strings, LZ4 and Zstd compression
Parquet interop — COPY TO / COPY FROM Parquet files
PostgreSQL wire protocol — built-in server for DBeaver, psql, DataGrip, pgAdmin, Navicat, and any PostgreSQL-compatible client; two distribution modes (Python CLI or standalone Rust binary)
Arrow Flight gRPC server — high-performance columnar data transfer over HTTP/2; streams Arrow IPC RecordBatch directly, 4–7× faster than PG wire for large result sets; accessible via pyarrow.flight, Go arrow, Java arrow, and any Arrow Flight client
Cross-platform — Linux, macOS, and Windows; x86_64 and ARM64; Python 3.9 -- 3.13

Installation

pip install apexbase

Build from source (requires Rust toolchain):

maturin develop --release

Quick Start

from apexbase import ApexClient

# Open (or create) a database directory
client = ApexClient("./data")

# Create a table
client.create_table("users")

# Store records
client.store({"name": "Alice", "age": 30, "city": "Beijing"})
client.store([
    {"name": "Bob", "age": 25, "city": "Shanghai"},
    {"name": "Charlie", "age": 35, "city": "Beijing"},
])

# SQL query
results = client.execute("SELECT * FROM users WHERE age > 28 ORDER BY age DESC")

# Convert to DataFrame
df = results.to_pandas()

client.close()

Usage Guide

Database Management

ApexBase supports multiple isolated databases within a single root directory. Each named database lives in its own subdirectory; the default database uses the root directory.

# Switch to a named database (creates it if needed)
client.use_database("analytics")

# Combined: switch database + select/create a table in one call
client.use(database="analytics", table="events")

# List all databases
dbs = client.list_databases()  # ["analytics", "default", "hr"]

# Current database
print(client.current_database)  # "analytics"

# Cross-database SQL — standard db.table syntax
client.execute("SELECT * FROM default.users")
client.execute("SELECT u.name, e.event FROM default.users u JOIN analytics.events e ON u._id = e.user_id")
client.execute("INSERT INTO analytics.events (name) VALUES ('click')")
client.execute("UPDATE default.users SET age = 31 WHERE name = 'Alice'")
client.execute("DELETE FROM default.users WHERE age < 18")

All SQL operations (SELECT, INSERT, UPDATE, DELETE, JOIN, CREATE TABLE, DROP TABLE, ALTER TABLE) support database.table qualified names, allowing cross-database queries in a single statement.
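
As a rough illustration of how a db.table name could map onto the directory layout described above, the sketch below resolves a qualified name to a file path. The helper name `resolve_table_path` and the exact file naming (`<table>.apex` inside a per-database subdirectory) are assumptions made for illustration; this is not ApexBase's actual resolution code.

```python
from pathlib import Path


def resolve_table_path(root: str, qualified_name: str, current_db: str = "default") -> Path:
    """Map 'db.table' (or a bare 'table') to a hypothetical .apex file path."""
    database, sep, table = qualified_name.partition(".")
    if not sep:
        # Bare table name: fall back to the currently selected database.
        database, table = current_db, qualified_name
    base = Path(root)
    # Assumption: the default database lives in the root directory itself,
    # while every named database gets its own subdirectory.
    directory = base if database == "default" else base / database
    return directory / f"{table}.apex"


print(resolve_table_path("./data", "default.users"))
print(resolve_table_path("./data", "analytics.events"))
print(resolve_table_path("./data", "events", current_db="analytics"))
```

The third call shows why an unqualified name depends on the database selected with use_database(): the same table name resolves to a different file once the context changes.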

Table Management

Each table is stored as a separate .apex file. Tables must be created before use.

# Create with optional schema
client.create_table("orders", schema={
    "order_id": "int64",
    "product": "string",
    "price": "float64",
})

# Switch tables
client.use_table("users")

# List / drop
tables = client.list_tables()
client.drop_table("orders")

Data Ingestion

import pandas as pd
import polars as pl
import pyarrow as pa

# Columnar dict (fastest for bulk data)
client.store({
    "name": ["D", "E", "F"],
    "age": [22, 32, 42],
})

# From pandas / polars / PyArrow (auto-creates the table when table_name is given)
client.from_pandas(pd.DataFrame({"name": ["G"], "age": [28]}), table_name="users")
client.from_polars(pl.DataFrame({"name": ["H"], "age": [38]}), table_name="users")
client.from_pyarrow(pa.table({"name": ["I"], "age": [48]}), table_name="users")
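
If the data starts out as a list of row dicts, it can be pivoted into the columnar shape shown above before storing. The helper below is a minimal pure-Python sketch; `rows_to_columns` is a hypothetical name, not part of the ApexBase API, and filling missing keys with None is an assumption about how gaps should be represented.

```python
from typing import Any, Dict, List


def rows_to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot a list of row dicts into a columnar dict; missing keys become None."""
    # First pass: collect column names in first-seen order.
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for key in row:
            columns.setdefault(key, [])
    # Second pass: append one value per column for every row.
    for row in rows:
        for key, values in columns.items():
            values.append(row.get(key))
    return columns


rows = [
    {"name": "D", "age": 22},
    {"name": "E", "age": 32},
    {"name": "F"},  # no age recorded
]
columnar = rows_to_columns(rows)
print(columnar)  # {'name': ['D', 'E', 'F'], 'age': [22, 32, None]}
```

The resulting dict has the same shape as the columnar dict passed to client.store() in the example above, with every column list the same length.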

SQL

ApexBase supports a broad SQL dialect. Examples:

# DDL
client.execute("CREATE TABLE IF NOT EXISTS products")
client.execute("ALTER TABLE products ADD COLUMN name STRING")
client.execute("DROP TABLE IF EXISTS products")

# DML
client.execute("INSERT INTO users (name, age) VALUES ('Zoe', 29)")
client.execute("UPDATE users SET age = 31 WHERE name = 'Alice'")
client.execute("DELETE FROM users WHERE age < 20")

# SELECT with full clause support
client.execute("""
    SELECT city, COUNT(*) AS cnt, AVG(age) AS avg_age
    FROM users
    WHERE age BETWEEN 20 AND 40
    GROUP BY city
    HAVING cnt > 1
    ORDER BY avg_age DESC
    LIMIT 10
""")

# JOINs
client.execute("""
    SELECT u.name, o.product
    FROM users u
    INNER JOIN orders o ON u._id = o.user_id
""")

# Subqueries
client.execute("SELECT * FROM users WHERE age > (SELECT AVG(age) FROM users)")
client.execute("SELECT * FROM users WHERE city IN (SELECT city FROM cities WHERE pop > 1000000)")

# CTEs
client.execute("""
    WITH seniors AS (SELECT * FROM users WHERE age >= 30)
    SELECT city, COUNT(*) FROM seniors GROUP BY city
""")

# Window functions
client.execute("""
    SELECT name, age,
           ROW_NUMBER() OVER (ORDER BY age DESC) AS row_num,
           AVG(age) OVER (PARTITION BY city) AS city_avg
    FROM users
""")

# UNION
client.execute("""
    SELECT name FROM users WHERE city = 'Beijing'
    UNION ALL
    SELECT name FROM users WHERE city = 'Shanghai'
""")

# Multi-statement
client.execute("""
    INSERT INTO users (name, age) VALUES ('New1', 20);
    INSERT INTO users (name, age) VALUES ('New2', 21);
    SELECT COUNT(*) FROM users
""")

# INSERT ... ON CONFLICT (upsert)
client.execute("""
    INSERT INTO users (name, age) VALUES ('Alice', 31)
    ON CONFLICT (name) DO UPDATE SET age = 31
""")

# CREATE TABLE AS
client.execute("CREATE TABLE seniors AS SELECT * FROM users WHERE age >= 30")

# EXPLAIN / EXPLAIN ANALYZE
client.execute("EXPLAIN SELECT * FROM users WHERE age > 25")

# Parquet interop
client.execute("COPY users TO '/tmp/users.parquet'")
client.execute("COPY users FROM '/tmp/users.parquet'")

Transactions

client.execute("BEGIN")
client.execute("INSERT INTO users (name, age) VALUES ('Tx1', 20)")
client.execute("SAVEPOINT sp1")
client.execute("INSERT INTO users (name, age) VALUES ('Tx2', 21)")
client.execute("ROLLBACK TO sp1")   # undo Tx2 only
client.execute("COMMIT")            # Tx1 persisted

Transactions use OCC validation — concurrent writes are detected at commit time.
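
To make commit-time validation concrete, here is a toy model of optimistic concurrency control. It is a sketch under simplifying assumptions (a single in-memory key/value store with per-key version counters); the class names are hypothetical and it does not reflect how ApexBase's transaction manager is implemented.

```python
class ConflictError(Exception):
    """Raised when commit-time validation finds a concurrent write."""


class VersionedStore:
    def __init__(self):
        self.data = {}      # key -> committed value
        self.versions = {}  # key -> number of committed writes to that key

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.seen = {}    # key -> version observed when first touched
        self.writes = {}  # buffered writes, invisible to others until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.seen.setdefault(key, self.store.versions.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        # Remember which version we are overwriting so commit can validate it.
        self.seen.setdefault(key, self.store.versions.get(key, 0))
        self.writes[key] = value

    def commit(self):
        # Validation phase: every key we touched must be unchanged.
        for key, version in self.seen.items():
            if self.store.versions.get(key, 0) != version:
                raise ConflictError(f"concurrent write detected on {key!r}")
        # Write phase: publish buffered writes and bump their versions.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.versions[key] = self.store.versions.get(key, 0) + 1


store = VersionedStore()
t1, t2 = store.begin(), store.begin()
t1.write("alice.age", 31)
t2.write("alice.age", 32)
t1.commit()  # the first committer wins
try:
    t2.commit()
    outcome = "committed"
except ConflictError:
    outcome = "conflict"
print(outcome)  # conflict
```

Neither transaction blocks while it is working; the cost of a conflict is paid only by the loser at commit, which would then typically retry.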

Indexes

client.execute("CREATE INDEX idx_age ON users (age)")
client.execute("CREATE UNIQUE INDEX idx_name ON users (name)")

# Queries automatically use indexes when applicable
client.execute("SELECT * FROM users WHERE age = 30")  # index scan

client.execute("DROP INDEX idx_age ON users")
client.execute("REINDEX users")
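
The feature list mentions automatic multi-index AND intersection for compound predicates. The idea can be sketched with plain Python sets: each index maps a value to the row ids that hold it, and an AND of equality predicates becomes a set intersection. The function names here are hypothetical and the real index structures are certainly more elaborate.

```python
from collections import defaultdict


def build_hash_index(rows, column):
    """Map each distinct value of `column` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[column]].add(row_id)
    return index


def lookup_and(indexes, predicates):
    """Answer `col1 = v1 AND col2 = v2 ...` by intersecting posting sets."""
    postings = [indexes[col].get(value, set()) for col, value in predicates.items()]
    # Start from the smallest set so the intersection does the least work.
    postings.sort(key=len)
    result = set(postings[0])
    for ids in postings[1:]:
        result &= ids
    return sorted(result)


rows = [
    {"name": "Alice", "age": 30, "city": "Beijing"},
    {"name": "Bob", "age": 25, "city": "Shanghai"},
    {"name": "Charlie", "age": 35, "city": "Beijing"},
    {"name": "Dana", "age": 30, "city": "Shanghai"},
]
indexes = {col: build_hash_index(rows, col) for col in ("age", "city")}
matches = lookup_and(indexes, {"age": 30, "city": "Beijing"})
print(matches)  # [0]
```

Only the row ids that survive the intersection need to be fetched from storage, which is the point of combining indexes instead of scanning.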

Full-Text Search

ApexBase ships a native full-text search engine (NanoFTS) integrated directly into the SQL executor. FTS is available through all interfaces — Python API, PostgreSQL Wire, and Arrow Flight — without any Python-side middleware.

SQL interface (recommended)

# 1. Create the FTS index via SQL DDL
client.execute("CREATE FTS INDEX ON articles (title, content)")

# Optional: specify lazy loading and cache size
client.execute("CREATE FTS INDEX ON logs WITH (lazy_load=true, cache_size=50000)")

# 2. Query using MATCH() / FUZZY_MATCH() in WHERE
results = client.execute("SELECT * FROM articles WHERE MATCH('rust programming')")
results = client.execute("SELECT title, content FROM articles WHERE FUZZY_MATCH('pytohn')")

# Combine with other predicates
results = client.execute("""
    SELECT * FROM articles
    WHERE MATCH('machine learning') AND published_at > '2024-01-01'
    ORDER BY _id DESC LIMIT 20
""")

# FTS also works in aggregations
count = client.execute("SELECT COUNT(*) FROM articles WHERE MATCH('deep learning')")

# Manage indexes
client.execute("SHOW FTS INDEXES")           # list all FTS-enabled tables
client.execute("ALTER FTS INDEX ON articles DISABLE")  # disable, keep files
client.execute("DROP FTS INDEX ON articles") # remove index + delete files

Python API (alternative)

# Initialize FTS for current table
client.use_table("articles")
client.init_fts(index_fields=["title", "content"])

# Search
ids    = client.search_text("database")
fuzzy  = client.fuzzy_search_text("databse")   # tolerates typos
recs   = client.search_and_retrieve("python", limit=10)
top5   = client.search_and_retrieve_top("neural network", n=5)

# Lifecycle
client.get_fts_stats()
client.disable_fts()   # suspend without deleting files
client.drop_fts()      # remove index + delete files

Tip: The SQL interface (MATCH() / FUZZY_MATCH()) works over PG Wire and Arrow Flight without any extra setup; the Python API methods are Python-process-only.
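
For readers unfamiliar with how exact and fuzzy term matching differ, the toy index below shows the general idea using only the standard library. It is an analogy, not NanoFTS: the class name is hypothetical, and using difflib similarity to expand misspelled terms is an assumption chosen for brevity.

```python
import difflib
import re
from collections import defaultdict


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyInvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of docs containing it

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Exact match: documents that contain every query term."""
        sets = [self.postings.get(term, set()) for term in tokenize(query)]
        return sorted(set.intersection(*sets)) if sets else []

    def fuzzy_search(self, query, cutoff=0.8):
        """Typo-tolerant match: expand each term to similar indexed terms."""
        sets = []
        for term in tokenize(query):
            similar = difflib.get_close_matches(term, list(self.postings), n=5, cutoff=cutoff)
            sets.append(set().union(*(self.postings[t] for t in similar)))
        return sorted(set.intersection(*sets)) if sets else []


index = TinyInvertedIndex()
index.add(0, "Embedded database engines in Rust")
index.add(1, "Python bindings for a columnar database")
index.add(2, "Notes on gardening")

print(index.search("database"))       # [0, 1]
print(index.search("databse"))        # []
print(index.fuzzy_search("databse"))  # [0, 1]
```

The misspelled query finds nothing with exact matching but recovers both documents once the term is expanded to its nearest indexed neighbor.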

Record-Level Operations

record = client.retrieve(0)               # by internal _id
records = client.retrieve_many([0, 1, 2])
all_data = client.retrieve_all()

client.replace(0, {"name": "Alice2", "age": 31})
client.delete(0)
client.delete([1, 2, 3])

Column Operations

client.add_column("email", "String")
client.rename_column("email", "email_addr")
client.drop_column("email_addr")
client.get_column_dtype("age")    # "Int64"
client.list_fields()              # ["name", "age", "city"]

ResultView

Query results are returned as ResultView objects with multiple output formats:

results = client.execute("SELECT * FROM users")

df = results.to_pandas()       # pandas DataFrame (zero-copy by default)
pl_df = results.to_polars()    # polars DataFrame
arrow = results.to_arrow()     # PyArrow Table
dicts = results.to_dict()      # list of dicts

results.shape                  # (rows, columns)
results.columns                # column names
len(results)                   # row count
results.first()                # first row as dict
results.scalar()               # single value (for aggregates)
results.get_ids()              # numpy array of _id values

Context Manager

with ApexClient("./data") as client:
    client.create_table("tmp")
    client.store({"key": "value"})
    # Automatically closed on exit

Performance

ApexBase vs SQLite vs DuckDB (1M rows)

Three-way comparison on macOS 26.3, Apple Silicon (ARM64, 10 cores), 32 GB RAM. Python 3.11.10, ApexBase v1.6.0, SQLite v3.45.3, DuckDB v1.1.3, PyArrow v19.0.0.

Dataset: 1,000,000 rows × 5 columns (name, age, score, city, category). Average of 5 timed iterations after 2 warmup runs.

Query	ApexBase	SQLite	DuckDB	vs Best Other
Bulk Insert (1M rows)	340ms	915ms	898ms	2.6x faster
COUNT(*)	0.070ms	9.29ms	0.523ms	7.5x faster
SELECT * LIMIT 100 [cold]	0.042ms	0.065ms	0.259ms	1.5x faster
SELECT * LIMIT 100 [warm]	2.54µs	0.064ms	0.273ms	25x faster
SELECT * LIMIT 10K [cold]	0.801ms	6.76ms	4.80ms	6x faster
SELECT * LIMIT 10K [warm]	3.55µs	6.96ms	4.69ms	>1000x faster
Filter (name = 'user_5000')	0.046ms	41.44ms	1.64ms	36x faster
Filter (age BETWEEN 25 AND 35)	0.041ms	169ms	96.75ms	>2000x faster
GROUP BY city (10 groups)	0.032ms	360ms	3.87ms	121x faster
GROUP BY + HAVING	0.032ms	370ms	4.23ms	132x faster
ORDER BY score LIMIT 100	0.032ms	52.72ms	8.52ms	266x faster
Aggregation (5 funcs)	0.040ms	85.26ms	1.65ms	41x faster
Complex (Filter+Group+Order)	0.032ms	166ms	3.18ms	99x faster
Point Lookup (by _id)	0.030ms	0.053ms	3.58ms	1.8x faster
Insert 1K rows	0.640ms	1.33ms	2.86ms	2.1x faster
SELECT * → pandas (full scan)	0.744ms	1210ms	181ms	243x faster
GROUP BY city, category (100 grp)	0.032ms	722ms	6.03ms	188x faster
LIKE filter (name LIKE 'user_1%')	33.70ms	137ms	60.23ms	1.8x faster
Multi-cond (age>30 AND score>50)	0.037ms	356ms	212ms	>5000x faster
ORDER BY city, score DESC LIMIT 100	0.035ms	71.03ms	7.63ms	218x faster
COUNT(DISTINCT city)	0.035ms	92.58ms	4.51ms	129x faster
IN filter (city IN 3 cities)	0.038ms	327ms	161ms	>4000x faster
UPDATE rows (age = 25)	8.51ms	39.48ms	17.17ms	2.0x faster
DELETE 1K rows	137ms	39ms	3.6ms	38x slower†
Window ROW_NUMBER (cached)	0.035ms	502ms	47ms	>1000x faster
FTS Index Build (1M rows)	2.28s	1.47s	1.16s	2.0x slower‡
FTS Search ('Electronics')	0.124ms	21ms	23ms	>170x faster

Summary: wins 25 of 27 benchmarks (25W / 0T / 2L). "Cold" = fresh DB open per iteration; "warm" = cached backend.

† DELETE triggers full-table rewrite for MVCC safety; DuckDB uses in-place deletion.
‡ FTS index build time; search latency is 170x faster once built.

Cold comparison is fair: all three engines measured without gc.collect() interference.

Reproduce: python benchmarks/bench_vs_sqlite_duckdb.py --rows 1000000
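
The methodology (2 warmup runs, then the mean of 5 timed iterations) and the "vs Best Other" column can be reproduced in a few lines. The sketch below is not taken from the benchmark script; the function names are hypothetical, and the latencies are copied from the COUNT(*) and DELETE 1K rows entries above.

```python
import statistics
import time


def time_ms(fn, warmup=2, iterations=5):
    """Mean wall-clock latency of fn() in milliseconds after warmup runs."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)


def vs_best_other(ours_ms, others_ms):
    """Ratio against the fastest competitor: >1 means faster, <1 means slower."""
    return min(others_ms) / ours_ms


count_ratio = vs_best_other(0.070, [9.29, 0.523])
delete_ratio = vs_best_other(137.0, [39.0, 3.6])
print(f"COUNT(*): {count_ratio:.1f}x faster")             # 7.5x faster
print(f"DELETE 1K rows: {1 / delete_ratio:.0f}x slower")  # 38x slower
```

Note that the ratio is always taken against the faster of the two other engines, so a win in this column means beating both SQLite and DuckDB.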

Server Protocols

ApexBase ships two complementary server protocols for external access:

Protocol	Port	Best for	Binary / CLI
PG Wire	5432	DBeaver, psql, DataGrip, BI tools	`apexbase-server`
Arrow Flight	50051	Python (pyarrow), Go, Java, Spark	`apexbase-flight`

Combined Launcher (Both Servers at Once)

# Start PG Wire + Arrow Flight simultaneously
apexbase-serve --dir /path/to/data

# Custom ports
apexbase-serve --dir /path/to/data --pg-port 5432 --flight-port 50051

# Disable one server
apexbase-serve --dir /path/to/data --no-flight   # PG Wire only
apexbase-serve --dir /path/to/data --no-pg       # Arrow Flight only

Flag	Default	Description
`--dir`, `-d`	`.`	Directory containing `.apex` database files
`--host`	`127.0.0.1`	Bind host for both servers
`--pg-port`	`5432`	PostgreSQL Wire port
`--flight-port`	`50051`	Arrow Flight gRPC port
`--no-pg`	—	Disable PG Wire server
`--no-flight`	—	Disable Arrow Flight server

PostgreSQL Wire Protocol Server

ApexBase includes a built-in PostgreSQL wire protocol server, allowing you to connect using DBeaver, psql, DataGrip, pgAdmin, Navicat, and any other tool that supports the PostgreSQL protocol.

Starting the Server

Method 1: Python CLI (after pip install apexbase)

apexbase-server --dir /path/to/data --port 5432

Options:

Flag	Default	Description
`--dir`, `-d`	`.`	Directory containing `.apex` database files
`--host`	`127.0.0.1`	Host to bind to (use `0.0.0.0` for remote access)
`--port`, `-p`	`5432`	Port to listen on

Method 2: Standalone Rust binary (no Python required)

# Build
cargo build --release --bin apexbase-server --no-default-features --features server

# Run
./target/release/apexbase-server --dir /path/to/data --port 5432

Connecting with Database Tools

The server emulates PostgreSQL 15.0, exposes a metadata layer compatible with pg_catalog and information_schema, and supports both the Simple Query and Extended Query protocols (see Supported Protocol Features). No username or password is required (authentication is disabled).

DBeaver

New Database Connection → choose PostgreSQL
Fill in connection details:
- Host: 127.0.0.1 (or the --host you specified)
- Port: 5432 (or the --port you specified)
- Database: apexbase (any value accepted)
- Authentication: select No Authentication or leave username/password empty
Click Test Connection → Finish
DBeaver will discover tables and columns automatically via pg_catalog / information_schema

psql

psql -h 127.0.0.1 -p 5432 -d apexbase

DataGrip / IntelliJ IDEA

Database tool window → + → Data Source → PostgreSQL
Set Host, Port, Database as above; leave User and Password empty
Click Test Connection → OK

pgAdmin

Add New Server → General tab: give it a name
Connection tab: set Host and Port; leave Username as postgres (ignored) and Password empty
Save — tables appear under Databases > apexbase > Schemas > public > Tables

Navicat for PostgreSQL

Connection → PostgreSQL
Set Host, Port; leave User and Password blank
Test Connection → OK

Other Compatible Tools

Any tool or library that speaks the PostgreSQL wire protocol (for example, anything built on libpq) can connect, including:

TablePlus, Beekeeper Studio, HeidiSQL
Python: psycopg2 / asyncpg
Node.js: pg (node-postgres)
Go: pgx / lib/pq
Rust: tokio-postgres / sqlx
Java: JDBC PostgreSQL driver

Example with psycopg2:

import psycopg2

conn = psycopg2.connect(host="127.0.0.1", port=5432, dbname="apexbase")
cur = conn.cursor()
cur.execute("SELECT * FROM users LIMIT 10")
print(cur.fetchall())
conn.close()

Supported SQL over Wire Protocol

The wire protocol server passes SQL directly to the ApexBase query engine. All SQL features listed in the Usage Guide are available, including JOINs, CTEs, window functions, transactions, and DDL.

Metadata Compatibility

The server implements a pg_catalog compatibility layer that responds to common catalog queries:

Catalog / View	Purpose
`pg_catalog.pg_namespace`	Schema listing
`pg_catalog.pg_database`	Database listing
`pg_catalog.pg_class`	Table discovery
`pg_catalog.pg_attribute`	Column metadata
`pg_catalog.pg_type`	Type information
`pg_catalog.pg_settings`	Server settings
`information_schema.tables`	Standard table listing
`information_schema.columns`	Standard column listing
`SET` / `SHOW` statements	Client configuration probes

This enables GUI tools to browse tables, inspect columns, and display data types without modification.

Supported Protocol Features

Feature	Status
Simple Query Protocol	✅ Fully supported
Extended Query Protocol (prepared statements)	✅ Supported — schema cached, binary format for psycopg3
Cross-database SQL (`db.table`)	✅ Supported — `USE dbname` / `\c dbname` to switch context
`pg_catalog` / `information_schema`	✅ Compatible layer for GUI tools
All ApexBase SQL (JOINs, CTEs, window functions, DDL)	✅ Full pass-through to query engine

Limitations

Authentication is not implemented — the server accepts all connections regardless of username/password
SSL/TLS is not supported — use an SSH tunnel (ssh -L 5432:127.0.0.1:5432 user@host) for remote access

Arrow Flight gRPC Server

Arrow Flight sends Arrow IPC RecordBatch directly over gRPC (HTTP/2), bypassing per-row text serialization entirely. It is 4–7× faster than PG wire for large result sets (10K+ rows).

Query	PG Wire	Arrow Flight	Speedup
SELECT 10K rows	5.1ms	0.7ms	7× faster
BETWEEN (~33K rows)	22ms	5.6ms	4× faster
Single row / point lookup	~7.5ms	~7.9ms	equal
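
The difference between per-row text serialization and shipping a contiguous columnar buffer can be felt with the standard library alone. The sketch below is only an analogy: it uses the `array` module rather than Arrow IPC, and it does not reproduce PG wire message framing.

```python
from array import array

values = [float(i) * 0.5 for i in range(10_000)]

# Row-oriented text encoding: every value is formatted, then parsed again.
text_payload = "\n".join(repr(v) for v in values).encode("ascii")
decoded_text = [float(tok) for tok in text_payload.decode("ascii").split("\n")]

# Column-oriented binary encoding: one contiguous buffer of 8-byte doubles.
binary_payload = array("d", values).tobytes()
decoded_binary = array("d")
decoded_binary.frombytes(binary_payload)  # a memory copy, no parsing step

print(len(text_payload), len(binary_payload))
```

Both paths round-trip the same numbers, but the text path does work per value on each side, while the binary path moves a fixed-width buffer whose size is known in advance.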

Starting the Flight Server

Python CLI:

apexbase-flight --dir /path/to/data --port 50051

Standalone Rust binary:

cargo build --release --bin apexbase-flight --no-default-features --features flight
./target/release/apexbase-flight --dir /path/to/data --port 50051

Python Client

import pyarrow.flight as fl
import pandas as pd
import polars as pl  # used by pl.from_arrow() below

client = fl.connect("grpc://127.0.0.1:50051")

# SELECT — returns Arrow Table
table = client.do_get(fl.Ticket(b"SELECT * FROM users LIMIT 10000")).read_all()
df = table.to_pandas()              # zero-copy to pandas
pl_df = pl.from_arrow(table)        # zero-copy to polars

# DML / DDL
client.do_action(fl.Action("sql", b"INSERT INTO users (name, age) VALUES ('Alice', 30)"))
client.do_action(fl.Action("sql", b"CREATE TABLE logs (event STRING, ts INT64)"))

# List available actions
for action in client.list_actions():
    print(action.type, "—", action.description)

When to Use Arrow Flight vs PG Wire

Scenario	Recommendation
DBeaver / Tableau / BI tools	PG Wire (only option)
Python + small queries (<100 rows)	Native API (fastest, in-process)
Python + large queries (10K+ rows, remote)	Arrow Flight (4–7× faster than PG wire)
Go / Java / Spark workers	Arrow Flight (native Arrow support)
Local Python (same machine)	Native API (`ApexClient.execute()`)

PyO3 Python API

Both servers can also be started from Python as blocking functions that release the GIL while they run:

import threading
from apexbase._core import start_pg_server, start_flight_server

t1 = threading.Thread(target=start_pg_server,     args=("/data", "0.0.0.0", 5432),  daemon=True)
t2 = threading.Thread(target=start_flight_server, args=("/data", "0.0.0.0", 50051), daemon=True)
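t1.start()
t2.start()

# Daemon threads stop when the main thread exits, so block here to keep both servers running
t1.join()
t2.join()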

Architecture

Python (ApexClient)
  |
  |-- Arrow IPC / columnar dict --------> ResultView (Pandas / Polars / PyArrow)
  |
Rust Core (PyO3 bindings)
  |
  +-- SQL Parser -----> Query Planner -----> Query Executor
  |                                              |
  |   +-- JIT Compiler (Cranelift)               |
  |   +-- Expression Evaluator (70+ functions)   |
  |   +-- Window Function Engine                 |
  |                                              |
  +-- Storage Engine                             |
  |     +-- V4 Row Group Format (.apex)          |
  |     +-- DeltaStore (cell-level updates)      |
  |     +-- WAL (write-ahead log)                |
  |     +-- Mmap on-demand reads                 |
  |     +-- LZ4 / Zstd compression               |
  |     +-- Dictionary encoding                  |
  |                                              |
  +-- Index Manager (B-Tree, Hash)               |
  +-- TxnManager (OCC + MVCC)                    |
  +-- NanoFTS (full-text search)                 |
  +-- PG Wire Protocol Server (pgwire)           |
  |   +-- DBeaver / psql / DataGrip / pgAdmin    |
  |   +-- pg_catalog & information_schema compat |
  |                                              |
  +-- Arrow Flight gRPC Server (tonic + HTTP/2)  |
      +-- pyarrow.flight / Go / Java / Spark     |
      +-- Arrow IPC: zero serialization overhead |

Storage Format

ApexBase uses a custom V4 Row Group format:

Each table is a single .apex file containing a header, row groups, and a footer
Row groups store columns contiguously with per-column compression (LZ4 or Zstd)
Low-cardinality string columns are dictionary-encoded on disk
Null bitmaps are stored per column per row group
A DeltaStore file (.deltastore) holds cell-level updates that are merged on read and compacted automatically
WAL records provide crash recovery with idempotent replay
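
The DeltaStore idea (updates held separately, merged on read, compacted later) can be modeled in a few lines. This is a conceptual sketch, not the on-disk format: the class name and the `compact_threshold` knob are hypothetical, and a plain list stands in for a compressed row-group column.

```python
class ColumnWithDeltas:
    """Immutable base column plus a sparse overlay of cell-level updates."""

    def __init__(self, base, compact_threshold=3):
        self.base = list(base)  # stands in for a compressed row-group column
        self.deltas = {}        # row index -> updated value
        self.compact_threshold = compact_threshold  # hypothetical tuning knob

    def update(self, row, value):
        # Updates never rewrite the base; they only land in the overlay.
        self.deltas[row] = value
        if len(self.deltas) >= self.compact_threshold:
            self.compact()

    def read(self):
        # Merge on read: overlay values win over base values.
        return [self.deltas.get(i, v) for i, v in enumerate(self.base)]

    def compact(self):
        # Fold the overlay into a fresh base and clear it.
        self.base = self.read()
        self.deltas.clear()


ages = ColumnWithDeltas([30, 25, 35, 41])
ages.update(0, 31)
ages.update(2, 36)
print(ages.read())             # [31, 25, 36, 41]
print(ages.base)               # [30, 25, 35, 41] (base still untouched)
ages.update(3, 42)             # third pending update triggers compaction
print(ages.base, ages.deltas)  # [31, 25, 36, 42] {}
```

The trade-off is that writes stay cheap because they avoid rewriting column data, while reads pay a small merge cost until compaction folds the overlay back in.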

Query Execution

The SQL parser produces an AST that the query planner analyzes for optimization strategy
Fast paths bypass the full executor for common patterns (COUNT(*), SELECT * LIMIT N, point lookups, single-column GROUP BY)
Arrow RecordBatch is the internal data representation; results flow to Python via Arrow IPC with zero-copy when possible
Repeated identical read queries are served from an in-process result cache
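
A result cache for repeated identical reads can be sketched as a dictionary keyed by the SQL text. The version below is a simplification with hypothetical names; in particular, clearing the whole cache on any write is an assumption, and the real invalidation strategy is not described here.

```python
class CachedExecutor:
    """Wraps a query function with a cache keyed by the exact SQL text."""

    READ_PREFIXES = ("SELECT", "WITH")

    def __init__(self, run_query):
        self.run_query = run_query
        self.cache = {}
        self.hits = 0

    def execute(self, sql):
        key = sql.strip()
        if key.upper().startswith(self.READ_PREFIXES):
            if key in self.cache:
                self.hits += 1
                return self.cache[key]
            result = self.cache[key] = self.run_query(key)
            return result
        # Assumption: any write invalidates every cached result.
        self.cache.clear()
        return self.run_query(key)


table = [30, 25, 35]


def fake_engine(sql):
    # Stand-in for a real executor; it understands just two statements.
    if sql.upper().startswith("SELECT COUNT"):
        return len(table)
    table.append(40)
    return None


db = CachedExecutor(fake_engine)
first = db.execute("SELECT COUNT(*) FROM users")
second = db.execute("SELECT COUNT(*) FROM users")  # served from the cache
db.execute("INSERT INTO users (age) VALUES (40)")  # clears the cache
third = db.execute("SELECT COUNT(*) FROM users")
print(first, second, third, db.hits)  # 3 3 4 1
```

Because the key is the literal SQL string, two queries that differ only in whitespace inside the statement would be cached separately in this sketch.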

API Reference

ApexClient

Constructor

ApexClient(
    dirpath="./data",           # data directory
    drop_if_exists=False,       # clear existing data on open
    batch_size=1000,            # batch size for operations
    enable_cache=True,          # enable query cache
    cache_size=10000,           # cache capacity
    prefer_arrow_format=True,   # prefer Arrow format for results
    durability="fast",          # "fast" | "safe" | "max"
)

Database Management

Method	Description
`use_database(database='default')`	Switch to a named database (creates it if needed)
`use(database='default', table=None)`	Switch database and optionally select/create a table
`list_databases()`	List all databases (`'default'` always included)
`current_database`	Property: current database name

Table Management

Method	Description
`create_table(name, schema=None)`	Create a new table, optionally with pre-defined schema
`drop_table(name)`	Drop a table
`use_table(name)`	Switch active table
`list_tables()`	List all tables in the current database
`current_table`	Property: current table name

Data Storage

Method	Description
`store(data)`	Store data (dict, list, DataFrame, Arrow Table)
`from_pandas(df, table_name=None)`	Import from pandas DataFrame
`from_polars(df, table_name=None)`	Import from polars DataFrame
`from_pyarrow(table, table_name=None)`	Import from PyArrow Table

Data Retrieval

Method	Description
`execute(sql)`	Execute SQL statement(s)
`query(where, limit)`	Query with WHERE expression
`retrieve(id)`	Get record by _id
`retrieve_many(ids)`	Get multiple records by _id
`retrieve_all()`	Get all records
`count_rows(table)`	Count rows in table

Data Modification

Method	Description
`replace(id, data)`	Replace a record
`batch_replace({id: data})`	Batch replace records
`delete(id)` or `delete([ids])`	Delete record(s)

Column Operations

Method	Description
`add_column(name, type)`	Add a column
`drop_column(name)`	Drop a column
`rename_column(old, new)`	Rename a column
`get_column_dtype(name)`	Get column data type
`list_fields()`	List all fields

Full-Text Search

Method	Description
`init_fts(fields, lazy_load, cache_size)`	Initialize FTS
`search_text(query)`	Search documents
`fuzzy_search_text(query)`	Fuzzy search
`search_and_retrieve(query, limit, offset)`	Search and return records
`search_and_retrieve_top(query, n)`	Top N results
`get_fts_stats()`	FTS statistics
`disable_fts()` / `drop_fts()`	Disable or drop FTS

Utility

Method	Description
`flush()`	Flush data to disk
`set_auto_flush(rows, bytes)`	Set auto-flush thresholds
`get_auto_flush()`	Get auto-flush config
`estimate_memory_bytes()`	Estimate memory usage
`close()`	Close the client

ResultView

Method / Property	Description
`to_pandas(zero_copy=True)`	Convert to pandas DataFrame
`to_polars()`	Convert to polars DataFrame
`to_arrow()`	Convert to PyArrow Table
`to_dict()`	Convert to list of dicts
`scalar()`	Get single scalar value
`first()`	Get first row as dict
`get_ids(return_list=False)`	Get record IDs
`shape`	(rows, columns)
`columns`	Column names
`__len__()`	Row count
`__iter__()`	Iterate over rows
`__getitem__(idx)`	Index access

Documentation

Additional documentation is available in the docs/ directory.

License

Apache-2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

apexbase-1.6.0.tar.gz (678.8 kB view details)

Uploaded Feb 21, 2026 Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

apexbase-1.6.0-cp313-cp313-win_amd64.whl (6.6 MB view details)

Uploaded Feb 21, 2026 CPython 3.13Windows x86-64

apexbase-1.6.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.2 MB view details)

Uploaded Feb 21, 2026 CPython 3.13manylinux: glibc 2.17+ x86-64

apexbase-1.6.0-cp313-cp313-macosx_11_0_arm64.whl (6.1 MB view details)

Uploaded Feb 21, 2026 CPython 3.13macOS 11.0+ ARM64

apexbase-1.6.0-cp312-cp312-win_amd64.whl (6.6 MB view details)

Uploaded Feb 21, 2026 CPython 3.12Windows x86-64

apexbase-1.6.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.2 MB view details)

Uploaded Feb 21, 2026 CPython 3.12manylinux: glibc 2.17+ x86-64

apexbase-1.6.0-cp312-cp312-macosx_11_0_arm64.whl (6.1 MB view details)

Uploaded Feb 21, 2026 CPython 3.12macOS 11.0+ ARM64

apexbase-1.6.0-cp311-cp311-win_amd64.whl (6.6 MB view details)

Uploaded Feb 21, 2026 CPython 3.11Windows x86-64

apexbase-1.6.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.2 MB view details)

Uploaded Feb 21, 2026 CPython 3.11manylinux: glibc 2.17+ x86-64

apexbase-1.6.0-cp311-cp311-macosx_11_0_arm64.whl (6.1 MB view details)

Uploaded Feb 21, 2026 CPython 3.11macOS 11.0+ ARM64

apexbase-1.6.0-cp310-cp310-win_amd64.whl (6.6 MB view details)

Uploaded Feb 21, 2026 CPython 3.10Windows x86-64

apexbase-1.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.2 MB view details)

Uploaded Feb 21, 2026 CPython 3.10manylinux: glibc 2.17+ x86-64

apexbase-1.6.0-cp310-cp310-macosx_11_0_arm64.whl (6.1 MB view details)

Uploaded Feb 21, 2026 CPython 3.10macOS 11.0+ ARM64

apexbase-1.6.0-cp39-cp39-win_amd64.whl (6.6 MB view details)

Uploaded Feb 21, 2026 CPython 3.9Windows x86-64

apexbase-1.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.2 MB view details)

Uploaded Feb 21, 2026 CPython 3.9manylinux: glibc 2.17+ x86-64

apexbase-1.6.0-cp39-cp39-macosx_11_0_arm64.whl (6.1 MB view details)

Uploaded Feb 21, 2026 CPython 3.9macOS 11.0+ ARM64

File details

Details for the file apexbase-1.6.0.tar.gz.

File metadata

Download URL: apexbase-1.6.0.tar.gz
Upload date: Feb 21, 2026
Size: 678.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0.tar.gz
Algorithm	Hash digest
SHA256	`92e15510d0135bd2d6ff1d692a841be4e1d99303dc33d141ef1ce2aea5d1ae60`
MD5	`40d4354e130635793f063929dcb60a8a`
BLAKE2b-256	`12959f60d9bc76545d1bc579d7428f25cd84ff8561263ad7f0eeace18d5ab757`

See more details on using hashes here.

File details

Details for the file apexbase-1.6.0-cp313-cp313-win_amd64.whl.

File metadata

Download URL: apexbase-1.6.0-cp313-cp313-win_amd64.whl
Upload date: Feb 21, 2026
Size: 6.6 MB
Tags: CPython 3.13, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp313-cp313-win_amd64.whl
Algorithm	Hash digest
SHA256	`648f229133fa60094e635612660d10e61131d1abe1e381974c1def2847e4cd4a`
MD5	`e4892d2c2ab0cda88eea4b240936cddf`
BLAKE2b-256	`2a05acd83bd447e7bd3a0240e726a15e475bb6aa1c3e98d94692dbf9f23c5b1a`

See more details on using hashes here.

File details

Details for the file apexbase-1.6.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: apexbase-1.6.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Feb 21, 2026
Size: 7.2 MB
Tags: CPython 3.13, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`bf524334a1e4ddaf10c428c91f720fcec741461e7ababcc878617ee087d32c87`
MD5	`20d06229d757fc0deab713a253749b20`
BLAKE2b-256	`483a6c19166cc6f1f3d6ef1832a3d357a1e003d2944ca10b072bf3d35a3045ee`

File details

Details for the file apexbase-1.6.0-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

Download URL: apexbase-1.6.0-cp313-cp313-macosx_11_0_arm64.whl
Upload date: Feb 21, 2026
Size: 6.1 MB
Tags: CPython 3.13, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`8ea49d6c8ea562a7b3d31e679636323e562b96aa8b949849291981773698bd2b`
MD5	`b5b007073defd97fc01f6f6b9ee8890a`
BLAKE2b-256	`2c29c571d16dd8eadd87b8cc1a2071c155b0cba0b7396f4f46b901845bc9736a`

File details

Details for the file apexbase-1.6.0-cp312-cp312-win_amd64.whl.

File metadata

Download URL: apexbase-1.6.0-cp312-cp312-win_amd64.whl
Upload date: Feb 21, 2026
Size: 6.6 MB
Tags: CPython 3.12, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp312-cp312-win_amd64.whl
Algorithm	Hash digest
SHA256	`9c5f3e6c10167b7a69fedb9d4f4b4d2eae4a264588c43831411dc0382c3ce3d6`
MD5	`5afda8863bb85134cb9cf8f8cf5caf6e`
BLAKE2b-256	`9b63c3012177d9ef25d9c715ec3a186fea2ede532bb6d8b36d64230ed15e737a`

File details

Details for the file apexbase-1.6.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: apexbase-1.6.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Feb 21, 2026
Size: 7.2 MB
Tags: CPython 3.12, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`0350231872f8351239a76935085858a54ade86173d2cdeffbb68a06b47deb639`
MD5	`1dae42609be234c0cf0f9a39b0e25130`
BLAKE2b-256	`94dd0f1272cb2f96708f67c971411b26b3cf1ec74265fd36db4997459cd503ea`

File details

Details for the file apexbase-1.6.0-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

Download URL: apexbase-1.6.0-cp312-cp312-macosx_11_0_arm64.whl
Upload date: Feb 21, 2026
Size: 6.1 MB
Tags: CPython 3.12, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`37b230efea6992454b5b3270b63e25a7425905cc1630b58aaed06ca218f21bbc`
MD5	`94ffdd12a2e8f8f42f9dd8062549c4d3`
BLAKE2b-256	`ecb7a41a8ad95ab894beb0006633633ef03552e879ccc03ab38145ad8a0659bf`

File details

Details for the file apexbase-1.6.0-cp311-cp311-win_amd64.whl.

File metadata

Download URL: apexbase-1.6.0-cp311-cp311-win_amd64.whl
Upload date: Feb 21, 2026
Size: 6.6 MB
Tags: CPython 3.11, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp311-cp311-win_amd64.whl
Algorithm	Hash digest
SHA256	`dcdbb469bf55d7c8229c98a8ce9d1bc4bd5b9b8df57446fc1ba038ca6afdd0d6`
MD5	`8e959c8ec5ba23b5d487a9701fbd1722`
BLAKE2b-256	`8d578b86c30deff846e82b7795768aab3325ff82d3ae2ea26545fd74f47cacae`

File details

Details for the file apexbase-1.6.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: apexbase-1.6.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Feb 21, 2026
Size: 7.2 MB
Tags: CPython 3.11, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`bd6f981aeb02b7ab1be571e214c33290621fc0ea62725e4581eed694b2fefa92`
MD5	`e908e6ae74f5dcb7a1d6353b332370e6`
BLAKE2b-256	`0f6abc2a8c0ff53fc2161baf238da0e4a05280839767b476e92b2e0c3e94494c`

File details

Details for the file apexbase-1.6.0-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

Download URL: apexbase-1.6.0-cp311-cp311-macosx_11_0_arm64.whl
Upload date: Feb 21, 2026
Size: 6.1 MB
Tags: CPython 3.11, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`304540c7c9c79fed875f2d9df9f3987423222dc6a7b6252c214957d1a3ffcde7`
MD5	`2f511aada1e1b5c9cbd8143132ac7543`
BLAKE2b-256	`909d8729c118ca012508e0db5968ec8d8db6a062677a43ad7798df3b26fc4050`

File details

Details for the file apexbase-1.6.0-cp310-cp310-win_amd64.whl.

File metadata

Download URL: apexbase-1.6.0-cp310-cp310-win_amd64.whl
Upload date: Feb 21, 2026
Size: 6.6 MB
Tags: CPython 3.10, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp310-cp310-win_amd64.whl
Algorithm	Hash digest
SHA256	`67ef7fa6e43ca01b0cc70417d17dc93ace40967ab687efe73ea4f3e278c63662`
MD5	`b885d545b1b54db79b753546cb816269`
BLAKE2b-256	`2026e4c58131c27fb27643925919eae809c9d9650e58a7d23de4c3b06874376e`

File details

Details for the file apexbase-1.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: apexbase-1.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Feb 21, 2026
Size: 7.2 MB
Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`4535ed0aa662cc1e5d5f8a3cc3869335a920922d3fb202e2ea94585d16348abc`
MD5	`5dbda37901936ad8692ce98abad0de1c`
BLAKE2b-256	`f37e9e5ba6f939b4deab1d07e443b46fd7cf47c5fbfe8b05ab57b6223168220b`

File details

Details for the file apexbase-1.6.0-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

Download URL: apexbase-1.6.0-cp310-cp310-macosx_11_0_arm64.whl
Upload date: Feb 21, 2026
Size: 6.1 MB
Tags: CPython 3.10, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`efe1cacf30227d18a3c1fdc7bfd8455035047aa2df89632be1316432de93118c`
MD5	`6d29484cadbab0ecd5fce8cd1cf80529`
BLAKE2b-256	`1c09671d03f6eef2da1337a7ef608a50e54e5141124b654f586b441829828d2a`

File details

Details for the file apexbase-1.6.0-cp39-cp39-win_amd64.whl.

File metadata

Download URL: apexbase-1.6.0-cp39-cp39-win_amd64.whl
Upload date: Feb 21, 2026
Size: 6.6 MB
Tags: CPython 3.9, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp39-cp39-win_amd64.whl
Algorithm	Hash digest
SHA256	`ede4482fa230ea5cca3d79cedc8f7f70ee020a449a7e551ddc4d43f3cb419d19`
MD5	`f3d1aa71b7e87e640a56cae25cda3c4e`
BLAKE2b-256	`fc0dae37225984a19f5721ae49c350b34e2dc0541fa069364a33a9e6336b4873`

File details

Details for the file apexbase-1.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: apexbase-1.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Feb 21, 2026
Size: 7.2 MB
Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`92905ff9e9663feeee5870e09c670fe1125611dac12b72fc0e3d5ae1e545e89e`
MD5	`cfe40ab279422993d471e3834dac3654`
BLAKE2b-256	`ed70ca666bc3820b98edac8398ca05a6d84f0de0293765c633bc89ac9b9d0bed`

File details

Details for the file apexbase-1.6.0-cp39-cp39-macosx_11_0_arm64.whl.

File metadata

Download URL: apexbase-1.6.0-cp39-cp39-macosx_11_0_arm64.whl
Upload date: Feb 21, 2026
Size: 6.1 MB
Tags: CPython 3.9, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for apexbase-1.6.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`923e9de690a931039f9028f92f291d9981dc78de10cb24b88e4cfcbac15d9e97`
MD5	`2355439cfa25231f0e6915d5438475eb`
BLAKE2b-256	`d5d0ab1eff40b3e9cc2a26bb2a1f961484ad83a34b4f07b9e0b5f14f112c2b0e`
