Async-first MongoDB-like persistence library with pluggable storage engines.

mongoeco

mongoeco is an async-first MongoDB-like persistence library with pluggable storage engines.

It is designed for local development, test environments, embedded persistence and compatibility work where a PyMongo-shaped API is useful without requiring a real MongoDB server for every workflow.

Current Scope

What is already in place:

  • async and sync client APIs
  • memory and SQLite engines
  • transactional local sessions
  • aggregation runtime with pushdown and spill guardrails
  • compatibility modeling across MongoDB dialects and PyMongo profiles
  • local wire/driver runtime
  • search index lifecycle plus local $search / experimental $vectorSearch

What this is not:

  • a drop-in replacement for a production MongoDB cluster
  • a full Atlas Search implementation
  • a full-text/vector engine with server-grade scaling guarantees

Installation

Editable local install:

python -m pip install -e .

Development install:

python -m pip install -e ".[dev]"

Optional fast JSON backend:

python -m pip install -e ".[json-fast]"

mongoeco uses the standard library json module by default, even if orjson is installed. You can choose the backend at process start with the MONGOECO_JSON_BACKEND environment variable:

  • stdlib: always use the standard library JSON backend
  • orjson: require orjson and use it
  • auto: use orjson when available, otherwise fall back to stdlib

Example:

MONGOECO_JSON_BACKEND=orjson python your_app.py
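
The three values above can be read as a small decision table. The sketch below is one way such a setting could be resolved; it is not mongoeco's actual implementation. The function name resolve_json_backend and the errors raised for a missing orjson or an unknown value are assumptions made for illustration.

```python
import importlib.util
import os


def resolve_json_backend(env=None):
    """Return "stdlib" or "orjson" for a MONGOECO_JSON_BACKEND-style setting.

    Hypothetical helper: it mirrors the documented values, not mongoeco's code.
    """
    env = os.environ if env is None else env
    # An unset variable means the documented default: the standard library.
    choice = env.get("MONGOECO_JSON_BACKEND", "stdlib").strip().lower()
    has_orjson = importlib.util.find_spec("orjson") is not None

    if choice == "stdlib":
        return "stdlib"
    if choice == "orjson":
        # "orjson" is a hard requirement, so a missing package is an error here.
        if not has_orjson:
            raise RuntimeError("orjson was requested but is not installed")
        return "orjson"
    if choice == "auto":
        return "orjson" if has_orjson else "stdlib"
    # Assumption: unknown values are rejected rather than silently ignored.
    raise ValueError(f"unknown JSON backend: {choice!r}")


print(resolve_json_backend({}))
print(resolve_json_backend({"MONGOECO_JSON_BACKEND": "auto"}))
```

Passing a plain dict instead of os.environ keeps the sketch easy to exercise without touching the real process environment.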

Quick Start

Async with the in-memory engine:

import asyncio

from mongoeco import AsyncMongoClient
from mongoeco.engines.memory import MemoryEngine


async def main() -> None:
    async with AsyncMongoClient(MemoryEngine()) as client:
        collection = client.demo.users
        await collection.insert_one({"_id": "1", "name": "Ada"})
        document = await collection.find_one({"name": "Ada"})
        print(document)


asyncio.run(main())

Sync with SQLite:

from mongoeco import MongoClient
from mongoeco.engines.sqlite import SQLiteEngine


with MongoClient(SQLiteEngine("mongoeco.db")) as client:
    collection = client.demo.users
    collection.insert_one({"_id": "1", "name": "Ada"})
    print(collection.find_one({"_id": "1"}))

Compatibility

mongoeco models two separate axes:

  • MongoDB server semantics through mongodb_dialect
  • PyMongo surface compatibility through pymongo_profile

Testing

The repository currently uses the standard library test runner:

python -m pip install -e ".[dev]"
python -m unittest discover -s tests -p 'test*.py'
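
For orientation, a test module that the discovery command above would pick up could look like the following minimal sketch. The file name tests/test_quickstart.py and the class name are hypothetical and do not describe the repository's real test layout. The body reuses only the calls shown in Quick Start, and the test is skipped when mongoeco is not importable.

```python
import importlib.util
import unittest

# Skip cleanly when mongoeco is not importable in the current environment.
HAVE_MONGOECO = importlib.util.find_spec("mongoeco") is not None


@unittest.skipUnless(HAVE_MONGOECO, "mongoeco is not installed")
class QuickStartRoundTripTest(unittest.IsolatedAsyncioTestCase):
    """Hypothetical example test; mirrors the async Quick Start snippet."""

    async def test_insert_then_find(self):
        from mongoeco import AsyncMongoClient
        from mongoeco.engines.memory import MemoryEngine

        async with AsyncMongoClient(MemoryEngine()) as client:
            collection = client.demo.users
            await collection.insert_one({"_id": "1", "name": "Ada"})
            document = await collection.find_one({"name": "Ada"})

        # The inserted document should be found again by its name field.
        self.assertIsNotNone(document)
        self.assertEqual(document["name"], "Ada")
```

IsolatedAsyncioTestCase comes from the standard library, so the async client can be exercised without an extra test framework.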

Benchmarks

A benchmark harness, documented in benchmarks/README.md, is intended for reproducible local profiling, regression tracking and community-facing performance analysis.

Quick smoke run:

python -m benchmarks.run \
  --engine all \
  --size 1000 \
  --warmup 0 \
  --repetitions 1

Latest local smoke snapshot from March 30, 2026:

  • command:
python -m benchmarks.run \
  --engine all \
  --size 1000 \
  --warmup 0 \
  --repetitions 1 \
  --format table
  • high-level reading:
    • memory is currently strongest on point lookups and most filter-heavy workloads
    • sqlite is currently strongest on full-sort workloads and remains competitive on filter-heavy workloads
    • mongomock is still a useful baseline, but mongoeco now beats it in many lookup, filter and sort cases on this harness
  • notable smoke numbers:
    • secondary_lookup_indexed_1k
      • memory-sync 0.1647s
      • sqlite-sync 0.3760s
      • memory-async 0.1542s
      • sqlite-async 0.3089s
      • mongomock 0.6840s
    • filter_selectivity_high_100
      • memory-sync 0.3720s
      • sqlite-sync 0.4404s
      • memory-async 0.3201s
      • sqlite-async 0.3534s
      • mongomock 0.2676s
    • sort_limit_indexed_200
      • memory-sync 0.1971s
      • sqlite-sync 0.3427s
      • memory-async 0.1783s
      • sqlite-async 0.4458s
      • mongomock 1.4541s
    • sort_shape_top_level_full_50
      • memory-sync 0.6109s
      • sqlite-sync 0.3267s
      • memory-async 0.5790s
      • sqlite-async 0.3182s
      • mongomock 0.4602s

Treat this as a smoke snapshot, not a publication-quality claim. For anything you plan to cite publicly, use the reproducible report commands below with warmup and repeated runs.

Recommended community matrix:

  • size=100
    • small-dataset overhead and API-path visibility
  • size=1000
    • primary reference point for balanced comparisons
  • size=5000
    • larger-scale behavior without turning the default community workflow into a multi-hour run

The size=50 case is mainly useful for local smoke checks, and size=500 rarely changes the story enough to justify making it part of the default published matrix.

Reproducible local report:

python -m benchmarks.report \
  --engine all \
  --size 10000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/latest.json \
  --output-markdown benchmarks/reports/latest.md

Suggested community runs:

python -m benchmarks.report \
  --engine all \
  --size 100 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-100.json \
  --output-markdown benchmarks/reports/matrix-100.md

python -m benchmarks.report \
  --engine all \
  --size 1000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-1000.json \
  --output-markdown benchmarks/reports/matrix-1000.md

python -m benchmarks.report \
  --engine all \
  --size 5000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/matrix-5000.json \
  --output-markdown benchmarks/reports/matrix-5000.md
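
The three runs differ only in the size and the output file names, so they can also be generated from a loop. The sketch below only builds and prints the command lines, using the flags shown above; the helper name report_command is hypothetical.

```python
import shlex
import sys

# Sizes from the recommended community matrix.
MATRIX_SIZES = (100, 1000, 5000)


def report_command(size, warmup=1, repetitions=5):
    """Build the argv list for one benchmarks.report run at the given size."""
    stem = f"benchmarks/reports/matrix-{size}"
    return [
        sys.executable, "-m", "benchmarks.report",
        "--engine", "all",
        "--size", str(size),
        "--warmup", str(warmup),
        "--repetitions", str(repetitions),
        "--output-json", f"{stem}.json",
        "--output-markdown", f"{stem}.md",
    ]


for size in MATRIX_SIZES:
    # Print only. To execute, pass the list to subprocess.run(..., check=True)
    # from the repository root.
    print(shlex.join(report_command(size)))
```

Keeping the commands as argv lists avoids shell quoting issues if they are later handed to subprocess.run.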

Latest serious reference snapshot used for local analysis:

  • date: March 30, 2026
  • command:
python -m benchmarks.report \
  --engine all \
  --size 1000 \
  --warmup 1 \
  --repetitions 5 \
  --output-json benchmarks/reports/latest-1000-serious.json \
  --output-markdown benchmarks/reports/latest-1000-serious.md
  • environment:
    • Python 3.14.0
    • macOS 15.6
    • arm64
    • JSON backend stdlib
    • git revision 7277a30904192aa8c6cbb7547ec3035840781bb4
    • worktree was dirty, so treat the numbers as a strong local reference, not a final published baseline
  • representative means:
    • secondary_lookup_indexed_1k
      • memory-sync 0.1430s
      • sqlite-sync 0.3142s
      • memory-async 0.1414s
      • sqlite-async 0.3048s
      • mongomock 0.6643s
    • filter_selectivity_high_100
      • memory-sync 0.3275s
      • sqlite-sync 0.3898s
      • memory-async 0.3069s
      • sqlite-async 0.3556s
      • mongomock 0.2529s
    • sort_limit_indexed_200
      • memory-sync 0.1769s
      • sqlite-sync 0.2927s
      • memory-async 0.1705s
      • sqlite-async 0.4060s
      • mongomock 1.3748s
    • sort_shape_top_level_full_50
      • memory-sync 0.5765s
      • sqlite-sync 0.2887s
      • memory-async 0.5709s
      • sqlite-async 0.2946s
      • mongomock 0.4256s

Current high-level reading from that snapshot:

  • memory is strongest on point lookups and most filter-heavy workloads
  • sqlite is strongest on full-sort workloads and remains competitive on many filter-heavy workloads
  • mongomock remains a useful baseline, but it is no longer the fastest option in many lookup, filter and sort scenarios covered by this harness
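
The reading above can be checked against the representative means directly. The sketch below copies those numbers, picks the fastest engine per workload and reports how mongomock compares with it. It is plain arithmetic over the published means, not part of the harness, and the names MEANS and summarize are illustrative.

```python
# Representative means in seconds, copied from the March 30, 2026 reference snapshot.
MEANS = {
    "secondary_lookup_indexed_1k": {
        "memory-sync": 0.1430, "sqlite-sync": 0.3142, "memory-async": 0.1414,
        "sqlite-async": 0.3048, "mongomock": 0.6643,
    },
    "filter_selectivity_high_100": {
        "memory-sync": 0.3275, "sqlite-sync": 0.3898, "memory-async": 0.3069,
        "sqlite-async": 0.3556, "mongomock": 0.2529,
    },
    "sort_limit_indexed_200": {
        "memory-sync": 0.1769, "sqlite-sync": 0.2927, "memory-async": 0.1705,
        "sqlite-async": 0.4060, "mongomock": 1.3748,
    },
    "sort_shape_top_level_full_50": {
        "memory-sync": 0.5765, "sqlite-sync": 0.2887, "memory-async": 0.5709,
        "sqlite-async": 0.2946, "mongomock": 0.4256,
    },
}


def summarize(means):
    """Return {workload: (fastest_engine, mongomock_time / fastest_time)}."""
    summary = {}
    for workload, timings in means.items():
        fastest = min(timings, key=timings.get)
        summary[workload] = (fastest, timings["mongomock"] / timings[fastest])
    return summary


for workload, (engine, ratio) in summarize(MEANS).items():
    print(f"{workload}: fastest={engine}, mongomock/fastest={ratio:.2f}x")
```

On these four workloads a memory engine is fastest for the indexed lookup and the indexed sort with limit, sqlite-sync is fastest for the full sort, and mongomock is fastest for filter_selectivity_high_100, which is why the reading says "most" rather than "all" filter-heavy workloads.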

orjson note:

  • mongoeco defaults to stdlib JSON for reproducibility
  • SQLite-specific A/B runs show that orjson can help materially on JSON-heavy filter and materialization workloads at size=1000 and above
  • the same A/B runs do not show a universal improvement for lookup-heavy workloads
  • because of that, benchmark discussions should always state the JSON backend explicitly when SQLite is involved
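
To get a rough feel for the JSON cost in isolation, a standalone A/B timing can be run outside the harness. The sketch below is not the SQLite-specific A/B run mentioned above: it times only raw encode and decode passes over synthetic documents, and the document shape and pass count are arbitrary assumptions. orjson is timed only when it is installed.

```python
import json
import timeit

try:
    import orjson  # optional; only timed when installed
except ImportError:
    orjson = None

# Synthetic documents for illustration; this is not the harness dataset.
DOCS = [
    {"_id": str(i), "name": f"user-{i}", "tags": ["a", "b"], "score": i * 0.5}
    for i in range(1000)
]


def time_round_trip(dumps, loads, number=20):
    """Seconds spent on `number` encode/decode passes over DOCS."""
    return timeit.timeit(lambda: [loads(dumps(doc)) for doc in DOCS], number=number)


results = {"stdlib": time_round_trip(json.dumps, json.loads)}
if orjson is not None:
    # orjson.dumps returns bytes, which orjson.loads accepts directly.
    results["orjson"] = time_round_trip(orjson.dumps, orjson.loads)

for backend, seconds in results.items():
    # Label every number with its backend, as the note above recommends.
    print(f"JSON backend {backend}: {seconds:.4f}s")
```

Numbers from a microbenchmark like this say nothing about lookup-heavy workloads, so they complement the harness reports rather than replace them.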

Benchmark outputs are treated as local artifacts and should stay out of git. The harness itself is versioned so anyone can reproduce the same workload mix, dataset seeds and report structure from a given revision.

Project Status

The repository is in active development and the public package surface is still best treated as pre-release.

License

This project is licensed under the Apache License 2.0. See LICENSE.
