Benchmark suite for Flatseek and competitor search engines (Elasticsearch, SQLite, Typesense, ZincSearch, Tantivy, Whoosh, DuckDB).
Flatbench
Search engine benchmark suite — compare Flatseek against Elasticsearch, tantivy, Typesense, Whoosh, ZincSearch, SQLite, and DuckDB.
Benchmarks cover build speed, search latency, wildcard queries, range queries, and aggregations. Results are saved as JSON and Markdown in ./output/.
Install
pip install flatbench
Requires Python ≥ 3.10 and Docker (for the full engine comparison).
Quick Start
1. Start all search engines (Docker)
make up
Starts: Flatseek API (port 8000), Elasticsearch (9200), Typesense (8108), ZincSearch (4080).
2. Generate a dataset
flatbench generate --schema article --rows 500000 -o ./data/article.csv
3. Run benchmark comparison
flatbench compare --engines flatseek_cli,elasticsearch,tantivy,typesense,whoosh,zincsearch --sizes 500000 --schema article
Results → output/benchmark_YYYYMMDD_HHMMSS.json + .md.
CLI Reference
Commands
| Command | Description |
|---|---|
| flatbench generate | Generate synthetic dataset |
| flatbench compare | Compare multiple engines |
| flatbench run | Benchmark single engine |
| flatbench serve | Serve report viewer locally |
Generate
flatbench generate --schema <schema> --rows <N> --output <path> [--format csv|jsonl]
Compare
flatbench compare --engines <engines> --sizes <sizes> [options]
Options:
| Flag | Description | Default |
|---|---|---|
| --schema | Data schema | standard |
| --workers | Parallel index workers | 1 |
| --format | csv or jsonl | csv |
| --source | Use existing CSV/JSONL instead of generating | none |
| --mode | normal (disk) or tmpfs (RAM) | normal |
| --cache-dir | Cache generated data for reuse | none |
| --skip-build | Skip build (use existing index) | none |
Engines: flatseek, flatseek_cli, elasticsearch, tantivy, typesense, whoosh, zincsearch, sqlite, duckdb
Sizes: multiple sizes supported, e.g. --sizes 1000 10000 500000
Examples
# Generate article dataset (500K rows)
flatbench generate --schema article --rows 500000 -o ./data/article.csv
# Compare at single scale
flatbench compare --engines flatseek_cli,elasticsearch --sizes 500000
# Compare at multiple scales
flatbench compare --engines flatseek,tantivy --sizes 1000 10000 500000
# Use existing CSV (reuse generated data)
flatbench compare --engines flatseek,elasticsearch --sizes 500000 --source ./data/article.csv
# RAM-backed index (tmpfs mode, faster builds)
flatbench compare --engines flatseek,tantivy --sizes 500000 --mode tmpfs
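When several comparisons are driven from a script, the command line can be assembled in Python. The sketch below only builds an argument list from the flags documented above (engines as one comma-separated value, sizes as separate values). `build_compare_argv` is a hypothetical helper name and is not part of Flatbench.

```python
import shlex
from typing import Optional, Sequence


def build_compare_argv(
    engines: Sequence[str],
    sizes: Sequence[int],
    schema: Optional[str] = None,
    source: Optional[str] = None,
    mode: Optional[str] = None,
) -> list[str]:
    """Assemble a `flatbench compare` argument list (hypothetical helper)."""
    if not engines or not sizes:
        raise ValueError("at least one engine and one size are required")
    # Engines go into a single comma-separated value.
    argv = ["flatbench", "compare", "--engines", ",".join(engines)]
    # Sizes are passed as separate, space-separated values.
    argv += ["--sizes", *(str(size) for size in sizes)]
    if schema:
        argv += ["--schema", schema]
    if source:
        argv += ["--source", source]
    if mode:
        if mode not in ("normal", "tmpfs"):
            raise ValueError("mode must be 'normal' or 'tmpfs'")
        argv += ["--mode", mode]
    return argv


if __name__ == "__main__":
    argv = build_compare_argv(["flatseek", "tantivy"], [1000, 10000, 500000], mode="tmpfs")
    # Print the command; pass `argv` to subprocess.run(argv, check=True) to execute it.
    print(shlex.join(argv))
```

Keeping the arguments as a list avoids shell quoting problems when a `--source` path contains spaces.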
Infrastructure (Makefile)
make up # Start all services (docker-compose up -d)
make down # Stop services (keep volumes)
make clean # Stop and remove volumes
make status # Show service status
make logs # View logs (follow mode)
# Flatseek management
make fs-health # Health check
make fs-stats # Index stats
make fs-create # Create index
make fs-delete # Delete index
# Elasticsearch management
make es-health # Cluster health
make es-stats # Cluster stats
make es-create # Create index
make es-delete # Delete index
# Typesense management
make ts-health # Health check
make ts-stats # Collection stats
make ts-create # Create collection
make ts-delete # Delete collection
# ZincSearch management
make zs-health # Health check
make zs-stats # Index stats
make zs-create # Create index
make zs-delete # Delete index
# Run benchmark directly via Make
make benchmark NROWS=500000 ENGINES="flatseek_cli,elasticsearch,tantivy"
Service URLs:
| Service | URL |
|---|---|
| Flatseek API | http://localhost:8000 |
| Elasticsearch | http://localhost:9200 |
| Typesense | http://localhost:8108 |
| ZincSearch | http://localhost:4080 |
| Kibana | http://localhost:5601 (dev profile) |
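Before starting a long run, it can help to confirm that each container answers on its port. The following is a minimal sketch using only the standard library. It probes the base URLs from the table above rather than any engine-specific health endpoint, and the names `SERVICE_URLS`, `http_probe`, and `check_services` are illustrative, not part of Flatbench.

```python
import urllib.error
import urllib.request
from typing import Callable

# Base URLs copied from the service table (local Docker defaults).
SERVICE_URLS = {
    "flatseek": "http://localhost:8000",
    "elasticsearch": "http://localhost:9200",
    "typesense": "http://localhost:8108",
    "zincsearch": "http://localhost:4080",
}


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a server answers HTTP at `url`, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied (for example 401 or 404), so the port is up.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


def check_services(
    urls: dict[str, str],
    probe: Callable[[str], bool] = http_probe,
) -> dict[str, bool]:
    """Map each service name to whether its URL responded."""
    return {name: probe(url) for name, url in urls.items()}


if __name__ == "__main__":
    for name, is_up in check_services(SERVICE_URLS).items():
        print(f"{name:15s} {'up' if is_up else 'DOWN'}")
```

The `probe` parameter is injectable so the function can be exercised without running containers.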
Available Schemas
| Schema | Fields | Description |
|---|---|---|
| article | 8 | Blog articles: id, title, content, tags, views, published_at, author |
| standard | 12 | Generic: id, name, email, phone, city, country, status, balance, created_at, updated_at, is_verified, tags |
| ecommerce | 12 | Order tracking data |
| logs | 11 | Log entries: timestamp, level, service, message, etc. |
| nested | 6 | Complex nested JSON objects |
| sosmed | 9 | Social media posts |
| devops | 11 | Infrastructure/monitoring data |
| adsb | 10 | Flight tracking data |
| campaign | 10 | Marketing campaign data |
| blockchain | 9 | Blockchain transaction data |
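When passing your own file with --source, a quick header check can catch a schema mismatch before a long run. The sketch below compares a CSV header against the twelve fields listed for the standard schema. It assumes the CSV has a header row whose column names match those field names, which this README does not state explicitly, and `missing_fields` is a hypothetical helper.

```python
import csv
import io
from typing import Sequence, TextIO

# Field names as listed for the `standard` schema in the table above.
STANDARD_FIELDS = [
    "id", "name", "email", "phone", "city", "country",
    "status", "balance", "created_at", "updated_at", "is_verified", "tags",
]


def missing_fields(csv_file: TextIO, expected: Sequence[str] = STANDARD_FIELDS) -> list[str]:
    """Return the expected field names that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> empty header
    present = {column.strip() for column in header}
    return [field for field in expected if field not in present]


if __name__ == "__main__":
    sample = io.StringIO("id,name,email\n1,Ada,ada@example.com\n")
    print("missing:", missing_fields(sample))
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object in.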
Benchmark Operations
| Operation | Description | Metrics |
|---|---|---|
| build_index | Bulk API indexing (1000 rows/batch) | duration_ms, rows/sec, index_size_mb |
| search | Full-text query | p50_ms, p95_ms, p99_ms, ops/sec |
| wildcard_search | Prefix/suffix wildcard queries | p50_ms, p95_ms, ops/sec |
| range_query | Numeric/date range filtering | duration_ms, hits, ops/sec |
| aggregate | Terms/stats aggregations | duration_ms, bucket_count, ops/sec |
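To make the latency metrics concrete, here is a small sketch of how p50/p95/p99 and ops/sec can be derived from a list of per-query timings. It uses the nearest-rank percentile definition and assumes queries run sequentially from a single client. Flatbench's exact method is not documented here, so treat both as assumptions, and the function names as illustrative.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples_ms: Sequence[float]) -> dict[str, float]:
    """Summarize per-query latencies (milliseconds) into the search metrics."""
    total_seconds = sum(samples_ms) / 1000.0
    return {
        "p50_ms": percentile(samples_ms, 50),
        "p95_ms": percentile(samples_ms, 95),
        "p99_ms": percentile(samples_ms, 99),
        # Assumes sequential execution: throughput = queries / total elapsed time.
        "ops_per_sec": len(samples_ms) / total_seconds if total_seconds > 0 else float("inf"),
    }


if __name__ == "__main__":
    latencies = [float(ms) for ms in range(1, 101)]  # synthetic 1..100 ms samples
    print(summarize(latencies))
```

With few iterations, high percentiles such as p99 collapse onto the slowest sample, so larger iteration counts give more meaningful tail numbers.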
Output
Results written to ./output/ with timestamps:
output/
├── benchmark_20260501_142947.json # Full structured results
├── benchmark_20260501_142947.md # Markdown summary
└── index.json # Report manifest (for web viewer)
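Because result files embed their timestamp in the name, the most recent run can be located without opening any file. The sketch below relies only on the benchmark_YYYYMMDD_HHMMSS.json naming shown above. It makes no assumption about the JSON contents, and the helper names are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Iterable, Optional

_REPORT_NAME = re.compile(r"^benchmark_(\d{8}_\d{6})\.json$")


def report_timestamp(filename: str) -> Optional[datetime]:
    """Parse the timestamp from a report file name, or return None if it does not match."""
    match = _REPORT_NAME.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        return None  # digits present but not a valid date/time


def pick_latest(filenames: Iterable[str]) -> Optional[str]:
    """Return the report file name with the newest embedded timestamp."""
    dated = [(report_timestamp(name), name) for name in filenames]
    dated = [(stamp, name) for stamp, name in dated if stamp is not None]
    if not dated:
        return None
    return max(dated, key=lambda pair: pair[0])[1]


def latest_report(output_dir: str = "./output") -> Optional[Path]:
    """Locate the newest JSON report in the output directory, if any."""
    directory = Path(output_dir)
    if not directory.is_dir():
        return None
    name = pick_latest(path.name for path in directory.iterdir())
    return directory / name if name else None


if __name__ == "__main__":
    print(latest_report())
```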
Report Viewer
Live: bench.flatseek.io — hosted Flatbench report viewer.
Local: Run flatbench serve --port 8080 or open report_viewer.html directly in browser.
Build Static Site
Build the output directory as a static site (for self-hosting or a Vercel deploy):
make build
# or
bash build.sh
Output → public/ directory with index.html, output/*.json, output/*.md.
Deploy to Vercel
make deploy # Deploy to production (flatbench.vercel.app)
make deploy-preview # Deploy preview build
Project Structure
flatbench/
├── Dockerfile # Flatseek API server container
├── docker-compose.yml # All engine containers
├── Makefile # Infrastructure + build commands
├── build.sh # Static site build script
├── report_viewer.html # Web UI for browsing results
├── pyproject.toml # flatbench package definition
├── src/flatbench/
│ ├── cli.py # CLI entry point
│ ├── benchmarks/ # Benchmark orchestration + report generation
│ ├── generators/ # Synthetic data generators (schema-aware)
│ ├── runners/ # Engine runners (HTTP API / CLI)
│ │ ├── flatseek_api.py # Flatseek HTTP API runner
│ │ ├── flatseek_cli.py # Flatseek CLI runner
│ │ ├── elasticsearch.py # Elasticsearch runner
│ │ ├── tantivy.py # tantivy (Rust) runner
│ │ ├── typesense.py # Typesense runner
│ │ ├── whoosh.py # Whoosh runner
│ │ ├── zincsearch.py # ZincSearch runner
│ │ ├── sqlite.py # SQLite FTS5 runner
│ │ └── duckdb.py # DuckDB full-text runner
│ └── output/ # Benchmark results (JSON + Markdown)
Adding a New Engine
from flatbench.runners import BaseRunner, BenchmarkResult, register_engine


@register_engine("myengine")
class MyEngineRunner(BaseRunner):
    name = "myengine"
    supports_aggregate = False
    supports_range_query = True
    supports_wildcard = True

    def build_index(self, data_path: str, **kwargs) -> BenchmarkResult:
        # Bulk API indexing logic
        pass

    def search(self, query: str, iterations: int = 10, **kwargs) -> BenchmarkResult:
        # Search via HTTP API
        pass
Then add to --engines list: --engines flatseek,myengine,...
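The build_index operation is described above as bulk indexing in batches of 1000 rows. A new runner will likely need a way to read the dataset in chunks of that size. The following is one possible sketch using only the standard library. `iter_batches` is a hypothetical helper, not part of the `BaseRunner` API, and it assumes a CSV input with a header row.

```python
import csv
import io
from itertools import islice
from typing import Iterator, TextIO


def iter_batches(csv_file: TextIO, batch_size: int = 1000) -> Iterator[list[dict[str, str]]]:
    """Yield lists of up to `batch_size` rows, each row as a dict keyed by column name."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    reader = csv.DictReader(csv_file)
    while True:
        # islice pulls at most batch_size rows without loading the whole file.
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    sample = io.StringIO("id,title\n" + "".join(f"{i},t{i}\n" for i in range(5)))
    for batch in iter_batches(sample, batch_size=2):
        # A real runner would send each batch to the engine's bulk endpoint here.
        print(len(batch), "rows")
```

Streaming the file this way keeps memory use bounded by the batch size, which matters at the 500K-row scale used in the examples.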
Benchmark Results (Latest: 500K rows, article schema)
Full results: output/benchmark_20260501_142947.md
Overall Score (60% speed · 40% correctness)
| Engine | Speed | Correctness | Score |
|---|---|---|---|
| Flatseek | 🟢 | 🟢 | 0.878 ◀ |
| typesense | 🟢 | 🟢 | 0.832 |
| zincsearch | 🟢 | 🟢 | 0.823 |
| elasticsearch | 🟢 | 🟢 | 0.820 |
| tantivy | 🟢 | 🔴 | 0.650 |
| whoosh | 🔴 | 🔴 | 0.025 |
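The 60/40 weighting in the heading can be read as a simple weighted sum. The sketch below illustrates that reading only. It assumes both sub-scores are normalized to the range 0 to 1, and since the table shows the sub-scores as colored indicators rather than numbers, it does not attempt to reproduce the published values.

```python
def overall_score(speed: float, correctness: float, speed_weight: float = 0.6) -> float:
    """Combine normalized speed and correctness scores into one weighted score."""
    for label, value in (("speed", speed), ("correctness", correctness), ("speed_weight", speed_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be between 0 and 1")
    # The correctness weight is the remainder: 0.4 when speed_weight is 0.6.
    return speed_weight * speed + (1.0 - speed_weight) * correctness


if __name__ == "__main__":
    # Hypothetical inputs for illustration, not values from the benchmark.
    print(round(overall_score(speed=1.0, correctness=0.5), 3))
```

Under this weighting, an engine with perfect speed but zero correctness is capped at 0.6.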
Key Takeaways
- Correctness matters: Flatseek is the only engine with zero correctness errors. Tantivy misses 99.4% of range query hits.
- Search: Tantivy is fastest (0.7 ms p50) but returns incorrect results. Flatseek is the second-fastest correct engine (7.9 ms).
- Build: Tantivy wins (21 s for 500K rows). Flatseek's build takes 217 s, roughly 10× longer but still reasonable.
- Aggregation: Competitors (ES, tantivy) are 20–300× faster. Flatseek aggregation is a known weakness.