High-performance embedded database with Rust core and Python API
ApexBase
ApexBase is a high-performance embedded database built on a Rust core, with a concise Python API.
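✨ Features
- 🚀 High performance - Rust core, with batch write throughput of up to 970,000+ ops/s
- 📦 Single-file storage - custom `.apex` file format, no external dependencies
- 🔍 Full-text search - integrated NanoFTS, with support for Chinese and fuzzy search
- 🐍 Python-friendly - concise API with Pandas/Polars/PyArrow support
- 💾 Compact storage - about 45% less storage space than traditional approaches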
📦 Installation
# Build from source
cd ApexBase
maturin develop --release
# Install optional dependencies
pip install pandas pyarrow polars  # DataFrame support
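Because the DataFrame libraries are optional, it can help to check which of them are importable before relying on the conversion helpers. The following is a minimal sketch using only the standard library; the helper name `available_dataframe_backends` is hypothetical and not part of ApexBase.

```python
from importlib.util import find_spec

# Optional packages named in the install step above.
OPTIONAL_PACKAGES = ("pandas", "pyarrow", "polars")


def available_dataframe_backends(packages=OPTIONAL_PACKAGES):
    """Map each package name to True if it can be imported in this environment.

    Hypothetical helper for illustration; it is not an ApexBase function.
    """
    return {name: find_spec(name) is not None for name in packages}


for name, found in available_dataframe_backends().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

`find_spec` only locates the package without importing it, so the check stays cheap even when the heavy libraries are present.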
🚀 Quick Start
from apexbase import ApexClient
# Create a client
client = ApexClient("./data")
# Store data
record_id = client.store({"name": "Alice", "age": 30, "city": "Beijing"})
ids = client.store([
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35},
])
# Query data
results = client.query("age > 28")  # SQL-style condition query
record = client.retrieve(record_id)  # Retrieve by ID
all_data = client.retrieve_all()  # Get all records
# Full-text search
doc_ids = client.search_text("Alice")
records = client.search_and_retrieve("Beijing")
# Convert to a DataFrame
df = results.to_pandas()
pl_df = results.to_polars()
# Close the connection
client.close()
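The quick start ends with an explicit `client.close()`. One way to make sure that call also runs when an exception interrupts the work is `contextlib.closing`, which wraps any object that exposes `close()`. The sketch below uses `InMemoryClient`, a hypothetical stand-in that mimics only the `store`, `retrieve`, and `close` calls shown above so that it runs without ApexBase installed; its integer ID scheme is an assumption for illustration.

```python
from contextlib import closing


class InMemoryClient:
    """Hypothetical stand-in for ApexClient (NOT the real implementation).

    It mimics only store/retrieve/close as they appear in the quick start.
    """

    def __init__(self):
        self._rows = {}
        self._next_id = 1  # assumed ID scheme, for illustration only
        self.closed = False

    def store(self, record):
        record_id = self._next_id
        self._rows[record_id] = dict(record)
        self._next_id += 1
        return record_id

    def retrieve(self, record_id):
        return self._rows.get(record_id)

    def close(self):
        self.closed = True


client = InMemoryClient()
# closing() calls client.close() when the block exits, even on an exception.
with closing(client) as db:
    alice_id = db.store({"name": "Alice", "age": 30, "city": "Beijing"})
    alice = db.retrieve(alice_id)
```

With the real library, the same pattern would presumably be `with closing(ApexClient("./data")) as db:`, assuming `close()` behaves as in the quick start.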
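📊 Performance Comparison
| Operation | ApexBase (Rust) | Traditional approach | Improvement |
|---|---|---|---|
| Batch write (10K) | 17 ms | 57 ms | 3.3x |
| Single-record retrieval | 0.01 ms | 0.4 ms | 40x |
| Batch retrieval (100) | 0.08 ms | 1.1 ms | 14x |
| Storage size | 2.1 MB | 3.9 MB | 1.8x smaller |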
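The improvement column is the baseline figure divided by the ApexBase figure. The short sketch below recomputes those ratios from the numbers in the table, along with the fraction of storage saved; the ratios agree with the table up to rounding. The variable and function names are illustrative only.

```python
# Figures copied from the table above: (ApexBase, traditional approach).
measurements = {
    "Batch write (10K), ms": (17, 57),
    "Single-record retrieval, ms": (0.01, 0.4),
    "Batch retrieval (100), ms": (0.08, 1.1),
    "Storage size, MB": (2.1, 3.9),
}


def improvement(apex, baseline):
    """Ratio of baseline to ApexBase; larger means ApexBase does better."""
    return baseline / apex


ratios = {name: improvement(a, b) for name, (a, b) in measurements.items()}

# Fraction of storage saved relative to the baseline size.
apex_mb, baseline_mb = measurements["Storage size, MB"]
storage_saving = 1 - apex_mb / baseline_mb

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"Storage saving: {storage_saving:.0%}")
```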
📁 Project Structure
ApexBase/
├── apexbase/                  # Main package directory
│   ├── src/                   # Rust source code
│   │   ├── storage/           # Storage engine
│   │   ├── table/             # Table management
│   │   ├── query/             # Query executor
│   │   ├── index/             # B-tree index
│   │   ├── cache/             # LRU cache
│   │   ├── data/              # Data types
│   │   └── python/            # PyO3 bindings
│   ├── python/                # Python wrapper layer
│   │   └── apexbase/
│   │       └── __init__.py    # Python API
│   ├── Cargo.toml
│   └── pyproject.toml
├── Cargo.toml                 # Workspace configuration
└── pyproject.toml             # Project configuration
🔧 API Reference
ApexClient
# Initialization
client = ApexClient(
    dirpath="./data",          # Data directory
    drop_if_exists=False,      # Whether to delete existing data
    enable_fts=True,           # Enable full-text search
    enable_search_cache=True,  # Enable the search cache
)
# Table operations
client.create_table("users")
client.use_table("users")
client.drop_table("users")
tables = client.list_tables()
# CRUD operations
record_id = client.store({"key": "value"})
ids = client.store([{...}, {...}])
record = client.retrieve(record_id)
records = client.retrieve_many([1, 2, 3])
client.replace(record_id, {"new": "data"})
client.delete(record_id)
client.delete([1, 2, 3])
# Queries
results = client.query("age > 30")
results = client.query("name LIKE 'A%'")
count = client.count_rows()
# Full-text search
ids = client.search_text("keyword")
ids = client.fuzzy_search_text("keywrd")  # Fuzzy search
records = client.search_and_retrieve("keyword")
# DataFrame integration
client.from_pandas(df)
client.from_polars(df)
results.to_pandas()
results.to_polars()
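Since `store` accepts a list of records, a large or lazily generated dataset can be written in fixed-size chunks rather than one call per record. The sketch below is one possible way to do that; `chunked`, `store_in_batches`, and `RecordingClient` are hypothetical names, and the code assumes `store` returns one ID per record when given a list, as the `ids = client.store([...])` line suggests. `RecordingClient` is a stand-in used only so the example runs without ApexBase.

```python
from itertools import islice


def chunked(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def store_in_batches(client, records, batch_size=10_000):
    """Call client.store(list) once per chunk and collect the returned IDs.

    Assumes store() returns an iterable of IDs for a list input.
    """
    ids = []
    for batch in chunked(records, batch_size):
        ids.extend(client.store(batch))
    return ids


class RecordingClient:
    """Hypothetical stand-in that records batch sizes; not the real ApexClient."""

    def __init__(self):
        self.calls = []
        self._next_id = 1

    def store(self, batch):
        self.calls.append(len(batch))
        new_ids = list(range(self._next_id, self._next_id + len(batch)))
        self._next_id += len(batch)
        return new_ids


client = RecordingClient()
rows = ({"name": f"user{i}", "age": 20 + i % 50} for i in range(25))
stored_ids = store_in_batches(client, rows, batch_size=10)
```

The default batch size of 10,000 simply mirrors the batch size in the performance table; a suitable value for a real workload would need to be measured.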
📄 License
Apache-2.0