Benchmark suite for Flatseek and competitor search engines (Elasticsearch, SQLite, Typesense, ZincSearch, Tantivy, Whoosh, DuckDB).
Flatbench
Search engine benchmark suite — compare Flatseek against Elasticsearch, tantivy, Typesense, Whoosh, ZincSearch, SQLite, and DuckDB.
Benchmarks cover index build speed, full-text search latency, wildcard queries, range queries, and aggregations. Results are saved as JSON and Markdown in ./output/.
Install
pip install flatbench
Requires Python ≥ 3.10 and Docker (for the full engine comparison).
Quick Start
1. Start all search engines (Docker)
flatbench make up
Starts: Flatseek API (port 8000), Elasticsearch (9200), Typesense (8108), ZincSearch (4080).
2. Generate a dataset
flatbench generate -s article -r 500000 -o ./data/article.csv
3. Run benchmark comparison
flatbench compare -e flatseek_cli,elasticsearch,tantivy,typesense,whoosh,zincsearch -s 500000
Results → output/benchmark_YYYYMMDD_HHMMSS.json + .md.
CLI Reference
Commands
| Command | Description |
|---|---|
| flatbench generate | Generate synthetic dataset |
| flatbench compare | Compare multiple engines |
| flatbench run | Benchmark single engine |
| flatbench serve | Serve report viewer locally |
| flatbench make | Run infrastructure Makefile targets |
Generate
flatbench generate --schema <schema> --rows <N> --output <path> [--format csv|jsonl]
Schemas: standard, ecommerce, logs, nested, sparse, article, adsb, campaign, devops, sosmed, blockchain
Compare
flatbench compare --engines <engines> --sizes <sizes> [options]
Options:
| Flag | Description | Default |
|---|---|---|
| --schema | Data schema | standard |
| --workers, -w | Parallel index workers | 1 |
| --format | csv or jsonl | csv |
| --source, -S | Use existing CSV/JSONL instead of generating | - |
| --mode, -m | normal (disk) or tmpfs (RAM) | normal |
| --cache-dir, -c | Cache generated data for reuse | - |
| --skip-build | Skip build (use existing index) | - |
| --serve | After compare completes, build site and serve report | - |
Engines: flatseek, flatseek_cli, elasticsearch, tantivy, typesense, whoosh, zincsearch, sqlite, duckdb
Sizes: multiple sizes supported, e.g. --sizes 1000 10000 500000
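When compare runs are scripted, for example from CI, the command line can be assembled programmatically. The following is a minimal sketch and not part of flatbench: build_compare_command is a hypothetical helper that uses only the flags documented above and assumes the flatbench executable is on PATH.

```python
import shlex
from typing import Optional, Sequence


def build_compare_command(
    engines: Sequence[str],
    sizes: Sequence[int],
    schema: str = "standard",
    mode: str = "normal",
    source: Optional[str] = None,
    serve: bool = False,
) -> list[str]:
    """Assemble an argv list for `flatbench compare` (hypothetical helper)."""
    if not engines:
        raise ValueError("at least one engine is required")
    if not sizes:
        raise ValueError("at least one size is required")
    if mode not in ("normal", "tmpfs"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Engines are passed as one comma-separated value.
    cmd = ["flatbench", "compare", "--engines", ",".join(engines)]
    # Sizes are passed as separate arguments, e.g. --sizes 1000 10000 500000
    cmd += ["--sizes", *[str(size) for size in sizes]]
    cmd += ["--schema", schema, "--mode", mode]
    if source is not None:
        cmd += ["--source", source]
    if serve:
        cmd.append("--serve")
    return cmd


if __name__ == "__main__":
    argv = build_compare_command(["flatseek", "tantivy"], [1000, 10000])
    # Print the command; pass argv to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```

Returning a list rather than a single string keeps the arguments safe to hand to subprocess.run without shell quoting.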
Run
flatbench run --engine <engine> --data <path> --index-dir <path> [-o output] [--iterations N]
Serve
flatbench serve [--dir ./output] [--port 8080]
Opens the report viewer in your browser automatically.
Make (Infrastructure)
flatbench make <targets...> # Run Makefile targets (default: help)
flatbench make up # Start all services (docker-compose up -d)
flatbench make down # Stop services (keep volumes)
flatbench make clean # Stop and remove volumes
flatbench make status # Show service status
flatbench make logs # View logs (follow mode)
flatbench make benchmark NROWS=500000 # Run benchmark via Make
Service management:
| Target | Description |
|---|---|
| up/down/clean/status/logs | Docker compose lifecycle |
| fs-health/fs-stats/fs-create/fs-delete | Flatseek API (port 8000) |
| es-health/es-stats/es-create/es-delete | Elasticsearch (port 9200) |
| ts-health/ts-stats/ts-create/ts-delete | Typesense (port 8108) |
| zs-health/zs-stats/zs-create/zs-delete | ZincSearch (port 4080) |
Examples
# Generate article dataset (500K rows)
flatbench generate -s article -r 500000 -o ./data/article.csv
# Compare at single scale
flatbench compare -e flatseek_cli,elasticsearch -s 500000
# Compare at multiple scales
flatbench compare -e flatseek,tantivy -s 1000 10000 500000
# Use existing CSV (reuse generated data)
flatbench compare -e flatseek,elasticsearch -s 500000 -S ./data/article.csv
# RAM-backed index (tmpfs mode, faster builds)
flatbench compare -e flatseek,tantivy -s 500000 -m tmpfs
# Compare and auto-serve report
flatbench compare -e flatseek,tantivy -s 500000 --serve
# Run benchmark via Make
flatbench make benchmark NROWS=500000 ENGINES="flatseek_cli,elasticsearch,tantivy"
Service URLs:
| Service | URL |
|---|---|
| Flatseek API | http://localhost:8000 |
| Elasticsearch | http://localhost:9200 |
| Typesense | http://localhost:8108 |
| ZincSearch | http://localhost:4080 |
| Kibana | http://localhost:5601 (dev profile) |
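Before starting a long comparison it can help to confirm that the containers are actually listening. The sketch below is an assumption-laden convenience, not a flatbench feature: it only tests whether a TCP connection to each port from the table above succeeds, which says nothing about whether the service behind it is healthy. The names SERVICES, is_port_open, and check_services are made up for this example.

```python
import socket

# Ports as listed in the table above; Kibana is omitted because it only
# runs under the dev profile.
SERVICES = {
    "Flatseek API": 8000,
    "Elasticsearch": 9200,
    "Typesense": 8108,
    "ZincSearch": 4080,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<15} {'up' if up else 'DOWN'}")
```

For an application-level check, the fs-health, es-health, ts-health, and zs-health Make targets listed earlier are the documented route.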
Available Schemas
| Schema | Fields | Description |
|---|---|---|
| article | 8 | Blog articles: id, title, content, tags, views, published_at, author |
| standard | 12 | Generic: id, name, email, phone, city, country, status, balance, created_at, updated_at, is_verified, tags |
| ecommerce | 12 | Order tracking data |
| logs | 11 | Log entries: timestamp, level, service, message, etc. |
| nested | 6 | Complex nested JSON objects |
| sosmed | 9 | Social media posts |
| devops | 11 | Infrastructure/monitoring data |
| adsb | 10 | Flight tracking data |
| campaign | 10 | Marketing campaign data |
| blockchain | 9 | Blockchain transaction data |
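To give a feel for what an article-shaped CSV might look like, here is a small stand-alone sketch. It is not the flatbench generator: it uses only the seven field names spelled out for the article schema above, and the values, the date format, and the "|" tag separator are all assumptions made for illustration.

```python
import csv
import io
import random
from datetime import datetime, timedelta

# Field names as listed for the article schema above.
ARTICLE_FIELDS = ["id", "title", "content", "tags", "views", "published_at", "author"]


def make_article_rows(n: int, seed: int = 0) -> list[dict]:
    """Build n toy article rows; values are illustrative only."""
    rng = random.Random(seed)
    words = ["search", "index", "query", "latency", "shard", "token"]
    start = datetime(2024, 1, 1)
    rows = []
    for i in range(1, n + 1):
        rows.append({
            "id": i,
            "title": " ".join(rng.sample(words, 3)).title(),
            "content": " ".join(rng.choices(words, k=20)),
            "tags": "|".join(rng.sample(words, 2)),  # separator is an assumption
            "views": rng.randint(0, 100_000),
            "published_at": (start + timedelta(days=rng.randint(0, 365))).isoformat(),
            "author": f"author_{rng.randint(1, 50)}",
        })
    return rows


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=ARTICLE_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(make_article_rows(3)))
```

A file produced this way could in principle be passed to flatbench compare through --source, though whether the engines accept these exact column types is untested here.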
Benchmark Operations
| Operation | Description | Metrics |
|---|---|---|
| build_index | Bulk API indexing (1000 rows/batch) | duration_ms, rows/sec, index_size_mb |
| search | Full-text query | p50_ms, p95_ms, p99_ms, ops/sec |
| wildcard_search | Prefix/suffix wildcard queries | p50_ms, p95_ms, ops/sec |
| range_query | Numeric/date range filtering | duration_ms, hits, ops/sec |
| aggregate | Terms/stats aggregations | duration_ms, bucket_count, ops/sec |
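The latency metrics in the table can be read as percentiles over per-query timings. How flatbench computes them internally is not documented here, so the following is only a sketch of one common approach: nearest-rank percentiles, with ops/sec taken as the number of queries divided by their summed duration. The function names are hypothetical.

```python
import math


def percentile(sorted_ms: list[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending list of latencies."""
    if not sorted_ms:
        raise ValueError("no samples")
    # Nearest rank: smallest index covering at least pct percent of samples.
    rank = max(1, math.ceil(pct * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def summarize(latencies_ms: list[float]) -> dict[str, float]:
    """Summarize per-query latencies (milliseconds) into table-style metrics."""
    ordered = sorted(latencies_ms)
    total_s = sum(ordered) / 1000
    return {
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        # Sequential throughput: queries completed per second of query time.
        "ops_per_sec": len(ordered) / total_s if total_s > 0 else float("inf"),
    }


if __name__ == "__main__":
    print(summarize([float(ms) for ms in range(1, 101)]))
```

Other percentile definitions (for example linear interpolation) give slightly different values on small samples, which matters when only a handful of iterations are run.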
Output
Results written to ./output/ with timestamps:
output/
├── benchmark_20260501_142947.json # Full structured results
├── benchmark_20260501_142947.md # Markdown summary
└── index.json # Report manifest (for web viewer)
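Because each report name embeds its timestamp, the most recent run can be located without opening any file. The sketch below assumes only the benchmark_YYYYMMDD_HHMMSS.json naming shown above; the helper names are hypothetical, and the internal layout of the JSON is not described in this README, so the example stops at finding the path.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Matches names such as benchmark_20260501_142947.json
REPORT_RE = re.compile(r"^benchmark_(\d{8}_\d{6})\.json$")


def report_timestamp(name: str) -> Optional[datetime]:
    """Parse the timestamp embedded in a report file name, or None."""
    match = REPORT_RE.match(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        # Digits matched the pattern but do not form a valid date/time.
        return None


def latest_report(output_dir: str = "./output") -> Optional[Path]:
    """Return the newest benchmark_*.json in output_dir, or None."""
    candidates = []
    for path in Path(output_dir).glob("benchmark_*.json"):
        stamp = report_timestamp(path.name)
        if stamp is not None:
            candidates.append((stamp, path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_report())
```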
Report Viewer
Live: bench.flatseek.io — hosted Flatbench report viewer.
Local: run flatbench serve --port 8080, or open report_viewer.html directly in a browser.
Build Static Site
Build the output directory as a static site (for self-hosting or a Vercel deploy):
make build
# or
bash build.sh
Output → public/ directory with index.html, output/*.json, output/*.md.
Deploy to Vercel
make deploy # Deploy to production (flatbench.vercel.app)
make deploy-preview # Deploy preview build
Project Structure
flatbench/
├── Dockerfile # Flatseek API server container
├── docker-compose.yml # All engine containers
├── Makefile # Infrastructure + build commands
├── build.sh # Static site build script
├── report_viewer.html # Web UI for browsing results
├── pyproject.toml # flatbench package definition
├── src/flatbench/
│ ├── cli.py # CLI entry point
│ ├── benchmarks/ # Benchmark orchestration + report generation
│ ├── generators/ # Synthetic data generators (schema-aware)
│ ├── runners/ # Engine runners (HTTP API / CLI)
│ │ ├── flatseek_api.py # Flatseek HTTP API runner
│ │ ├── flatseek_cli.py # Flatseek CLI runner
│ │ ├── elasticsearch.py # Elasticsearch runner
│ │ ├── tantivy.py # tantivy (Rust) runner
│ │ ├── typesense.py # Typesense runner
│ │ ├── whoosh.py # Whoosh runner
│ │ ├── zincsearch.py # ZincSearch runner
│ │ ├── sqlite.py # SQLite FTS5 runner
│ │ └── duckdb.py # DuckDB full-text runner
│ └── output/ # Benchmark results (JSON + Markdown)
Adding a New Engine
from flatbench.runners import BaseRunner, BenchmarkResult, register_engine

@register_engine("myengine")
class MyEngineRunner(BaseRunner):
    name = "myengine"
    supports_aggregate = False
    supports_range_query = True
    supports_wildcard = True

    def build_index(self, data_path: str, **kwargs) -> BenchmarkResult:
        # Bulk API indexing logic
        pass

    def search(self, query: str, iterations: int = 10, **kwargs) -> BenchmarkResult:
        # Search via HTTP API
        pass
Then add it to the --engines list: --engines flatseek,myengine,...
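Inside a runner's search method, the iterations parameter suggests a loop that repeats the query and records each call's latency. The fields of BenchmarkResult are not shown in this README, so the sketch below deliberately stops at collecting raw timings; time_iterations is a hypothetical helper, not part of flatbench.

```python
import time
from typing import Callable


def time_iterations(fn: Callable[[], object], iterations: int = 10) -> list[float]:
    """Call fn `iterations` times and return per-call latency in milliseconds."""
    if iterations < 1:
        raise ValueError("iterations must be >= 1")
    latencies_ms = []
    for _ in range(iterations):
        # perf_counter is a monotonic, high-resolution clock suited to timing.
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000)
    return latencies_ms


if __name__ == "__main__":
    # Stand-in for a real engine query.
    samples = time_iterations(lambda: sum(range(10_000)), iterations=5)
    print([round(ms, 3) for ms in samples])
```

A real runner would wrap its HTTP or library call in the callable and then convert the collected latencies into whatever BenchmarkResult expects.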
Benchmark Results (Latest: 500K rows, article schema)
Latest full results: bench.flatseek.io
Overall Score (60% speed · 40% correctness)
| Engine | Speed | Correctness | Score |
|---|---|---|---|
| Flatseek | 🟢 | 🟢 | 0.878 ◀ |
| typesense | 🟢 | 🟢 | 0.832 |
| zincsearch | 🟢 | 🟢 | 0.823 |
| elasticsearch | 🟢 | 🟢 | 0.820 |
| tantivy | 🟢 | 🔴 | 0.650 |
| whoosh | 🔴 | 🔴 | 0.025 |
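The "60% speed · 40% correctness" heading reads most naturally as a weighted sum of two normalized sub-scores. That interpretation is an assumption: this README does not say how the speed and correctness sub-scores are derived or normalized, so the sketch below only illustrates the weighting and does not reproduce the table values.

```python
SPEED_WEIGHT = 0.6
CORRECTNESS_WEIGHT = 0.4


def overall_score(speed: float, correctness: float) -> float:
    """Weighted sum of two sub-scores, each expected in [0, 1]."""
    for label, value in (("speed", speed), ("correctness", correctness)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be in [0, 1], got {value}")
    return SPEED_WEIGHT * speed + CORRECTNESS_WEIGHT * correctness


if __name__ == "__main__":
    # Perfect speed cannot compensate for zero correctness beyond the speed weight.
    print(overall_score(1.0, 0.0))
```

Under this reading, an engine with a perfect speed sub-score and a zero correctness sub-score tops out at the speed weight.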
Key Takeaways
- Correctness matters: Flatseek is the only engine with zero correctness errors. Tantivy misses 99.4% of range query hits.
- Search: Tantivy fastest (0.7ms p50), but wrong. Flatseek second-fastest correct (7.9ms).
- Build: Tantivy wins (21s for 500K rows). Flatseek is about 10× slower at 217s, which is still reasonable.
- Aggregation: Competitors (ES, tantivy) are 20–300× faster — Flatseek aggregation is a known weakness.
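A figure such as "misses 99.4% of range query hits" implies comparing an engine's results against a known-correct set. How flatbench establishes that reference is not described here; one plausible approach, sketched below with hypothetical function names, is a brute-force scan over the source rows followed by a set difference.

```python
from typing import Iterable


def reference_range_hits(rows: list[dict], field: str, low: float, high: float) -> list[int]:
    """Brute-force ground truth: ids of rows with low <= row[field] <= high."""
    return [row["id"] for row in rows if low <= row[field] <= high]


def miss_rate(expected_ids: Iterable[int], returned_ids: Iterable[int]) -> float:
    """Fraction of expected hits that the engine failed to return."""
    expected = set(expected_ids)
    if not expected:
        return 0.0
    missed = expected - set(returned_ids)
    return len(missed) / len(expected)


if __name__ == "__main__":
    rows = [{"id": i, "views": i * 10} for i in range(1, 11)]
    expected = reference_range_hits(rows, "views", 20, 60)
    # Pretend the engine under test returned only two of the expected ids.
    print(miss_rate(expected, [2, 3]))
```

This measures missed hits only; an engine that returns extra, incorrect rows would need a separate precision check.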