mongoeco

mongoeco is an async-first MongoDB-like persistence library with pluggable storage engines.

It is designed for local development, test environments, embedded persistence and compatibility work where a PyMongo-shaped API is useful without requiring a real MongoDB server for every workflow.

Current Scope

What is already in place:

  • async and sync client APIs
  • memory and SQLite engines
  • transactional local sessions
  • aggregation runtime with pushdown and spill guardrails
  • compatibility modeling across MongoDB dialects and PyMongo profiles
  • local wire/driver runtime
  • search index lifecycle plus local $search / experimental $vectorSearch

What this is not:

  • a drop-in replacement for a production MongoDB cluster
  • a full Atlas Search implementation
  • a full-text/vector engine with server-grade scaling guarantees

Installation

Editable local install:

python -m pip install -e .

Development install:

python -m pip install -e ".[dev]"

Optional fast JSON backend:

python -m pip install -e ".[json-fast]"

mongoeco uses the standard library json module by default, even if orjson is installed. You can choose the backend at process start with the MONGOECO_JSON_BACKEND environment variable:

  • stdlib: always use the standard library JSON backend
  • orjson: require orjson and use it
  • auto: use orjson when available, otherwise fall back to stdlib

Example:

MONGOECO_JSON_BACKEND=orjson python your_app.py
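
The three values can be read as a small decision rule. The sketch below is illustrative only and is not mongoeco's actual implementation: resolve_json_backend is a hypothetical helper, and only the variable name, the three accepted values and the stdlib default are taken from the description above.

```python
import importlib.util
import os
from typing import Mapping, Optional


def resolve_json_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: return "stdlib" or "orjson" for the given environment."""
    env = os.environ if env is None else env
    # An unset variable means stdlib, matching the documented default.
    choice = env.get("MONGOECO_JSON_BACKEND", "stdlib").strip().lower()
    has_orjson = importlib.util.find_spec("orjson") is not None

    if choice == "stdlib":
        return "stdlib"
    if choice == "orjson":
        # "orjson" is a hard requirement, so a missing package is an error.
        if not has_orjson:
            raise RuntimeError("MONGOECO_JSON_BACKEND=orjson but orjson is not installed")
        return "orjson"
    if choice == "auto":
        return "orjson" if has_orjson else "stdlib"
    raise ValueError(f"unknown JSON backend: {choice!r}")


# Prints "orjson" or "stdlib" depending on what is installed locally.
print(resolve_json_backend({"MONGOECO_JSON_BACKEND": "auto"}))
```

Raising on an unknown value, rather than silently falling back, is a design choice of this sketch; the library's real behavior for invalid values is not stated here.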

Quick Start

Async with the in-memory engine:

import asyncio

from mongoeco import AsyncMongoClient
from mongoeco.engines.memory import MemoryEngine


async def main() -> None:
    async with AsyncMongoClient(MemoryEngine()) as client:
        collection = client.demo.users
        await collection.insert_one({"_id": "1", "name": "Ada"})
        document = await collection.find_one({"name": "Ada"})
        print(document)


asyncio.run(main())

Sync with SQLite:

from mongoeco import MongoClient
from mongoeco.engines.sqlite import SQLiteEngine


with MongoClient(SQLiteEngine("mongoeco.db")) as client:
    collection = client.demo.users
    collection.insert_one({"_id": "1", "name": "Ada"})
    print(collection.find_one({"_id": "1"}))

Compatibility

mongoeco models two separate axes:

  • MongoDB server semantics through mongodb_dialect
  • PyMongo surface compatibility through pymongo_profile

Testing

The repository currently uses the standard library test runner:

python -m pip install -e ".[dev]"
python -m unittest discover -s tests -p 'test*.py'

Benchmarks

A benchmark harness lives under benchmarks/ and is documented in benchmarks/README.md. It is intended for reproducible local profiling, regression tracking and community-facing performance analysis.

Quick smoke run:

python -m benchmarks.run \
  --engine all \
  --size 1000 \
  --warmup 0 \
  --repetitions 1

Latest local smoke snapshot from March 30, 2026:

  • command:
python -m benchmarks.run \
  --engine all \
  --size 1000 \
  --warmup 0 \
  --repetitions 1 \
  --format table
  • high-level reading:
    • memory is currently strongest on point lookups and most filter-heavy workloads
    • sqlite is currently strongest on full-sort workloads and remains competitive on filter-heavy workloads
    • mongomock is still a useful baseline, but mongoeco now beats it in many lookup, filter and sort cases on this harness
  • notable smoke numbers:
    • secondary_lookup_indexed_1k
      • memory-sync 0.1647s
      • sqlite-sync 0.3760s
      • memory-async 0.1542s
      • sqlite-async 0.3089s
      • mongomock 0.6840s
    • filter_selectivity_high_100
      • memory-sync 0.3720s
      • sqlite-sync 0.4404s
      • memory-async 0.3201s
      • sqlite-async 0.3534s
      • mongomock 0.2676s
    • sort_limit_indexed_200
      • memory-sync 0.1971s
      • sqlite-sync 0.3427s
      • memory-async 0.1783s
      • sqlite-async 0.4458s
      • mongomock 1.4541s
    • sort_shape_top_level_full_50
      • memory-sync 0.6109s
      • sqlite-sync 0.3267s
      • memory-async 0.5790s
      • sqlite-async 0.3182s
      • mongomock 0.4602s

Treat this as a smoke snapshot, not a publication-quality claim. For anything you plan to cite publicly, use the reproducible report commands below with warmup and repeated runs.

Recommended community matrix:

  • size=100
    • small-dataset overhead and API-path visibility
  • size=1000
    • primary reference point for balanced comparisons
  • size=5000
    • larger-scale behavior without turning the default community workflow into a multi-hour run

The size=50 case is mainly useful for local smoke checks, and size=500 rarely changes the story enough to justify making it part of the default published matrix.

Reproducible local report:

python -m benchmarks.report \
  --engine all \
  --size 10000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/latest.json \
  --output-markdown benchmarks/reports/latest.md
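
For readers unfamiliar with the warmup and repetitions settings, the following is a minimal sketch of the general measurement pattern they usually describe. It is not the harness's own code; measure and the sample workload are hypothetical names used only for illustration.

```python
import statistics
import time
from typing import Callable, List


def measure(workload: Callable[[], object], warmup: int, repetitions: int) -> List[float]:
    """Run the workload warmup + repetitions times; time only the measured runs."""
    for _ in range(warmup):
        workload()  # discarded: lets caches and lazy initialization settle
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in workload; a real benchmark would call into the engine under test.
samples = measure(lambda: sorted(range(10_000), reverse=True), warmup=1, repetitions=5)
print(f"mean={statistics.mean(samples):.6f}s stdev={statistics.stdev(samples):.6f}s")
```

With warmup 0 and a single repetition, as in the smoke run, there is no spread to report, which is one reason smoke numbers should not be cited.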

Suggested community runs:

python -m benchmarks.report \
  --engine all \
  --size 100 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-100.json \
  --output-markdown benchmarks/reports/matrix-100.md

python -m benchmarks.report \
  --engine all \
  --size 1000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-1000.json \
  --output-markdown benchmarks/reports/matrix-1000.md

python -m benchmarks.report \
  --engine all \
  --size 5000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-5000.json \
  --output-markdown benchmarks/reports/matrix-5000.md
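
The three runs differ only in the size value and the output file names, so they can be generated from one template. The sketch below only builds and prints the command lines using the flags shown above; report_command and MATRIX_SIZES are hypothetical names, not part of the harness.

```python
import shlex
from typing import List

# Sizes from the recommended community matrix.
MATRIX_SIZES = (100, 1000, 5000)


def report_command(size: int) -> List[str]:
    """Build the argv for one matrix run; nothing is executed here."""
    return [
        "python", "-m", "benchmarks.report",
        "--engine", "all",
        "--size", str(size),
        "--warmup", "1",
        "--repetitions", "5",
        "--output-json", f"benchmarks/reports/matrix-{size}.json",
        "--output-markdown", f"benchmarks/reports/matrix-{size}.md",
    ]


for size in MATRIX_SIZES:
    print(shlex.join(report_command(size)))
```

To execute the runs instead of printing them, each list could be passed to subprocess.run(cmd, check=True) from the repository root.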

Latest serious reference snapshot used for local analysis:

  • date: March 30, 2026
  • command:
python -m benchmarks.report \
  --engine all \
  --size 1000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/latest-1000-serious.json \
  --output-markdown benchmarks/reports/latest-1000-serious.md
  • environment:
    • Python 3.14.0
    • macOS 15.6
    • arm64
    • JSON backend stdlib
    • git revision 7277a30904192aa8c6cbb7547ec3035840781bb4
    • worktree was dirty, so treat the numbers as a strong local reference, not a final published baseline
  • representative means:
    • secondary_lookup_indexed_1k
      • memory-sync 0.1430s
      • sqlite-sync 0.3142s
      • memory-async 0.1414s
      • sqlite-async 0.3048s
      • mongomock 0.6643s
    • filter_selectivity_high_100
      • memory-sync 0.3275s
      • sqlite-sync 0.3898s
      • memory-async 0.3069s
      • sqlite-async 0.3556s
      • mongomock 0.2529s
    • sort_limit_indexed_200
      • memory-sync 0.1769s
      • sqlite-sync 0.2927s
      • memory-async 0.1705s
      • sqlite-async 0.4060s
      • mongomock 1.3748s
    • sort_shape_top_level_full_50
      • memory-sync 0.5765s
      • sqlite-sync 0.2887s
      • memory-async 0.5709s
      • sqlite-async 0.2946s
      • mongomock 0.4256s

Current high-level reading from that snapshot:

  • memory is strongest on point lookups and most filter-heavy workloads
  • sqlite is strongest on full-sort workloads and remains competitive on many filter-heavy workloads
  • mongomock remains a useful baseline, but it is no longer the fastest option in many lookup, filter and sort scenarios covered by this harness
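
The reading above can be checked directly against the representative means. The short script below copies those means verbatim and reports the fastest engine per workload together with how mongomock compares to it; the script is an illustration and not part of the benchmark harness.

```python
# Representative means (seconds) copied from the reference snapshot above.
MEANS = {
    "secondary_lookup_indexed_1k": {
        "memory-sync": 0.1430, "sqlite-sync": 0.3142, "memory-async": 0.1414,
        "sqlite-async": 0.3048, "mongomock": 0.6643,
    },
    "filter_selectivity_high_100": {
        "memory-sync": 0.3275, "sqlite-sync": 0.3898, "memory-async": 0.3069,
        "sqlite-async": 0.3556, "mongomock": 0.2529,
    },
    "sort_limit_indexed_200": {
        "memory-sync": 0.1769, "sqlite-sync": 0.2927, "memory-async": 0.1705,
        "sqlite-async": 0.4060, "mongomock": 1.3748,
    },
    "sort_shape_top_level_full_50": {
        "memory-sync": 0.5765, "sqlite-sync": 0.2887, "memory-async": 0.5709,
        "sqlite-async": 0.2946, "mongomock": 0.4256,
    },
}


def fastest(results: dict) -> str:
    """Return the engine with the lowest mean time."""
    return min(results, key=results.get)


for workload, results in MEANS.items():
    winner = fastest(results)
    ratio = results["mongomock"] / results[winner]
    print(f"{workload}: fastest={winner}, mongomock/fastest={ratio:.2f}x")
```

On these four workloads a memory engine wins the indexed lookup and the indexed sort with limit, sqlite wins the full sort, and mongomock wins the high-selectivity filter, which is why the summary says "many" rather than "all" scenarios.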

orjson note:

  • mongoeco defaults to stdlib JSON for reproducibility
  • SQLite-specific A/B runs show that orjson can help materially on JSON-heavy filter and materialization workloads at size=1000 and above
  • the same A/B runs do not show a universal improvement for lookup-heavy workloads
  • because of that, benchmark discussions should always state the JSON backend explicitly when SQLite is involved
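
One way to make the backend explicit is to record it next to the other environment fields whenever numbers are shared. The helper below is a hypothetical sketch using only the standard library. It captures the requested MONGOECO_JSON_BACKEND value, not the backend the library actually resolved, so a value of auto still needs a note on whether orjson was installed.

```python
import json
import os
import platform
from typing import Dict


def benchmark_environment() -> Dict[str, str]:
    """Collect the environment fields worth quoting alongside benchmark numbers."""
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
        # Requested setting only; unset means the documented stdlib default.
        "json_backend": os.environ.get("MONGOECO_JSON_BACKEND", "stdlib"),
    }


print(json.dumps(benchmark_environment(), indent=2))
```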

Benchmark outputs are treated as local artifacts and should stay out of git. The harness itself is versioned so anyone can reproduce the same workload mix, dataset seeds and report structure from a given revision.

Project Status

The repository is in active development and the public package surface is still best treated as pre-release.

License

This project is licensed under the Apache License 2.0. See LICENSE.
