Skip to main content

Async-first MongoDB-like persistence library with pluggable storage engines.

Project description

mongoeco

mongoeco is an async-first MongoDB-like persistence library with pluggable storage engines.

It is designed for local development, test environments, embedded persistence and compatibility work where a PyMongo-shaped API is useful without requiring a real MongoDB server for every workflow.

Current Scope

What is already in place:

  • async and sync client APIs
  • memory and SQLite engines
  • transactional local sessions
  • aggregation runtime with pushdown and spill guardrails
  • compatibility modeling across MongoDB dialects and PyMongo profiles
  • local wire/driver runtime
  • search index lifecycle plus local $search / experimental $vectorSearch

What this is not:

  • a drop-in replacement for a production MongoDB cluster
  • a full Atlas Search implementation
  • a full-text/vector engine with server-grade scaling guarantees

Installation

Editable local install:

python -m pip install -e .

Development install:

python -m pip install -e .[dev]

Optional fast JSON backend:

python -m pip install -e .[json-fast]

mongoeco uses the standard library json module by default, even if orjson is installed. You can choose the backend at process start with MONGOECO_JSON_BACKEND:

  • stdlib: always use the standard library JSON backend
  • orjson: require orjson and use it
  • auto: use orjson when available, otherwise fall back to stdlib

Example:

MONGOECO_JSON_BACKEND=orjson python your_app.py

Quick Start

Async with the in-memory engine:

import asyncio

from mongoeco import AsyncMongoClient
from mongoeco.engines.memory import MemoryEngine


async def main() -> None:
    async with AsyncMongoClient(MemoryEngine()) as client:
        collection = client.demo.users
        await collection.insert_one({"_id": "1", "name": "Ada"})
        document = await collection.find_one({"name": "Ada"})
        print(document)


asyncio.run(main())

Sync with SQLite:

from mongoeco import MongoClient
from mongoeco.engines.sqlite import SQLiteEngine


with MongoClient(SQLiteEngine("mongoeco.db")) as client:
    collection = client.demo.users
    collection.insert_one({"_id": "1", "name": "Ada"})
    print(collection.find_one({"_id": "1"}))

Compatibility

mongoeco models two separate axes:

  • MongoDB server semantics through mongodb_dialect
  • PyMongo surface compatibility through pymongo_profile

See:

Testing

The repository currently uses the standard library test runner:

python -m pip install -e .[dev]
python -m unittest discover -s tests -p 'test*.py'

Benchmarks

There is a benchmark harness under benchmarks/README.md intended for reproducible local profiling, regression tracking and community-facing performance analysis.

Quick smoke run:

python -m benchmarks.run \
  --engine all \
  --size 1000 \
  --warmup 0 \
  --repetitions 1

Latest local smoke snapshot from March 30, 2026:

  • command:
python -m benchmarks.run \
  --engine all \
  --size 1000 \
  --warmup 0 \
  --repetitions 1 \
  --format table
  • high-level reading:
    • memory is currently strongest on point lookups and most filter-heavy workloads
    • sqlite is currently strongest on full-sort workloads and remains competitive on filter-heavy workloads
    • mongomock is still a useful baseline, but mongoeco now beats it in many lookup, filter and sort cases on this harness
  • notable smoke numbers:
    • secondary_lookup_indexed_1k
      • memory-sync 0.1647s
      • sqlite-sync 0.3760s
      • memory-async 0.1542s
      • sqlite-async 0.3089s
      • mongomock 0.6840s
    • filter_selectivity_high_100
      • memory-sync 0.3720s
      • sqlite-sync 0.4404s
      • memory-async 0.3201s
      • sqlite-async 0.3534s
      • mongomock 0.2676s
    • sort_limit_indexed_200
      • memory-sync 0.1971s
      • sqlite-sync 0.3427s
      • memory-async 0.1783s
      • sqlite-async 0.4458s
      • mongomock 1.4541s
    • sort_shape_top_level_full_50
      • memory-sync 0.6109s
      • sqlite-sync 0.3267s
      • memory-async 0.5790s
      • sqlite-async 0.3182s
      • mongomock 0.4602s

Treat this as a smoke snapshot, not a publication-quality claim. For anything you plan to cite publicly, use the reproducible report commands below with warmup and repeated runs.

Recommended community matrix:

  • size=100
    • small-dataset overhead and API-path visibility
  • size=1000
    • primary reference point for balanced comparisons
  • size=5000
    • larger-scale behavior without turning the default community workflow into a multi-hour run

The size=50 case is mainly useful for local smoke checks, and size=500 rarely changes the story enough to justify making it part of the default published matrix.

Reproducible local report:

python -m benchmarks.report \
  --engine all \
  --size 10000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/latest.json \
  --output-markdown benchmarks/reports/latest.md

Suggested community runs:

python -m benchmarks.report \
  --engine all \
  --size 100 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-100.json \
  --output-markdown benchmarks/reports/matrix-100.md

python -m benchmarks.report \
  --engine all \
  --size 1000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-1000.json \
  --output-markdown benchmarks/reports/matrix-1000.md

python -m benchmarks.report \
  --engine all \
  --size 5000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-5000.json \
  --output-markdown benchmarks/reports/matrix-5000.md

Latest serious reference snapshot used for local analysis:

  • date: March 30, 2026
  • command:
python -m benchmarks.report \
  --engine all \
  --size 1000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/latest-1000-serious.json \
  --output-markdown benchmarks/reports/latest-1000-serious.md
  • environment:
    • Python 3.14.0
    • macOS 15.6
    • arm64
    • JSON backend stdlib
    • git revision 7277a30904192aa8c6cbb7547ec3035840781bb4
    • worktree was dirty, so treat the numbers as a strong local reference, not a final published baseline
  • representative means:
    • secondary_lookup_indexed_1k
      • memory-sync 0.1430s
      • sqlite-sync 0.3142s
      • memory-async 0.1414s
      • sqlite-async 0.3048s
      • mongomock 0.6643s
    • filter_selectivity_high_100
      • memory-sync 0.3275s
      • sqlite-sync 0.3898s
      • memory-async 0.3069s
      • sqlite-async 0.3556s
      • mongomock 0.2529s
    • sort_limit_indexed_200
      • memory-sync 0.1769s
      • sqlite-sync 0.2927s
      • memory-async 0.1705s
      • sqlite-async 0.4060s
      • mongomock 1.3748s
    • sort_shape_top_level_full_50
      • memory-sync 0.5765s
      • sqlite-sync 0.2887s
      • memory-async 0.5709s
      • sqlite-async 0.2946s
      • mongomock 0.4256s

Current high-level reading from that snapshot:

  • memory is strongest on point lookups and most filter-heavy workloads
  • sqlite is strongest on full-sort workloads and remains competitive on many filter-heavy workloads
  • mongomock remains a useful baseline, but it is no longer the fastest option in many lookup, filter and sort scenarios covered by this harness

orjson note:

  • mongoeco defaults to stdlib JSON for reproducibility
  • SQLite-specific A/B runs show that orjson can help materially on JSON-heavy filter and materialization workloads at size=1000 and above
  • the same A/B runs do not show a universal improvement for lookup-heavy workloads
  • because of that, benchmark discussions should always state the JSON backend explicitly when SQLite is involved

Benchmark outputs are treated as local artifacts and should stay out of git. The harness itself is versioned so anyone can reproduce the same workload mix, dataset seeds and report structure from a given revision.

Project Status

The repository is in active development and the public package surface is still best treated as pre-release.

License

This project is licensed under the Apache License 2.0. See LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mongoeco-2.0.1.tar.gz (259.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mongoeco-2.0.1-py3-none-any.whl (295.5 kB view details)

Uploaded Python 3

File details

Details for the file mongoeco-2.0.1.tar.gz.

File metadata

  • Download URL: mongoeco-2.0.1.tar.gz
  • Upload date:
  • Size: 259.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for mongoeco-2.0.1.tar.gz
Algorithm Hash digest
SHA256 8b24251ae2760eefad4b234a8c45c1ed09090c53d5c2b98f0e4c94f6027dcafd
MD5 854cc6f97b6205147e590e4e1eedd5c6
BLAKE2b-256 4bec8dc70730d4d85252a1ab1bfa517a967ebe1a03da49e8f22294f63276d22b

See more details on using hashes here.

File details

Details for the file mongoeco-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: mongoeco-2.0.1-py3-none-any.whl
  • Upload date:
  • Size: 295.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for mongoeco-2.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 40afc7ae841626ab2693a91d59f879f76bc04f3fc222659394f4c976e085cb14
MD5 9ce2a8dd81d48576d53a1e3f3801fd66
BLAKE2b-256 4915cd3ffc920d01ddc7972ff4266b61445d6ad5022bb955addcf0993cfbc88b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page