
MEMOPT — Universal Memory Fabric for AI Infrastructure

MEMOPT is the open-source universal memory fabric for AI infrastructure, built from first principles by Sophisticates — a deep tech venture company working across AI, Quantum Computing, Robotics, and Physics. MEMOPT is Sophisticates' flagship product, open-sourced under Apache-2.0 so the broader AI infrastructure community can build on, audit, and extend the hardest part of GPU serving: memory.


⚠️ ALPHA — TEST ON YOUR OWN GPU BEFORE PRODUCTION

GPU validation has NOT been performed on this release. The 1016-passing test baseline is Mac / no-CUDA only. The 2 @gpu tests and 18 cuda-named tests are SKIPPED / DESELECTED on the release host.

If you are deploying to real GPUs you MUST:

  1. Run the full regression on your target GPU (A100 / H100 / L40S / ROCm) with the steps in PRODUCTION_READINESS.md.
  2. Soak-test under representative traffic for ≥ 24 hours before declaring the deployment "production-ready."
  3. Set MEMOPT_SIGNING_KEY to a high-entropy secret (NOT the default).

Do not assume "tests pass" means "works on my hardware." See the four blockers in Status — production readiness below.
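
Generating a suitable high-entropy value for MEMOPT_SIGNING_KEY takes one line of standard-library Python (a sketch; the variable name comes from this README, the rest is generic):

```python
import secrets

# 256 bits of randomness, hex-encoded (64 characters).
key = secrets.token_hex(32)
print(f"export MEMOPT_SIGNING_KEY={key}")
```

Put the exported value in your deployment's secret store rather than a shell history or dotfile.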

A GPU memory profiling and serving platform that turns GPU clusters into a unified memory fabric. Python control plane, C++17 data plane, optional CUDA kernels.

1016 tests pass | 7 C++ test suites | 0 failures (Mac / no CUDA)

Status — production readiness

This is an alpha release (v1.3.0a1). The library is OSS-licensed and the test baseline is green on Mac, but the following must be completed before any "production-ready" claim:

  1. GPU rig validation — the 2 @gpu tests and 18 cuda-named tests are SKIPPED / DESELECTED on Mac. They must run green on an A100 / H100 host before tagging a non-alpha release.
  2. CI must pass on its first push: .github/workflows/ci.yml defines a Linux + Mac × Python 3.10/3.11/3.12 matrix; nothing has run there yet.
  3. Live-workload soak — run under representative traffic for ≥ 24 hours before declaring "production-ready."
  4. Phase B (MEMOPT_USE_ORCHESTRATOR=1) — Layer 2 (orchestrator) ships in observation-only mode per DECISION 7 in docs/orchestrator_v1_design.md. Eviction-driving Layer 2 ships in v1.4.0, not here.

Track these in the v1.3.0 entry of CHANGELOG.md.

What It Does

memopt v1.3.0 ships two infrastructure layers plus eight product pillars.

Infrastructure layers

| Layer | Purpose | Reference |
| --- | --- | --- |
| Layer 1 — Substrate | Tenant-isolated, stream-aware allocator with pluggable backends (CUDA VMM, ROCm/HIP, CXL/NUMA, CPU). Public API: memopt.alloc / free / context / stats / observe / peek_handle / MemoryHandle. | docs/substrate_v1_design.md |
| Layer 2 — Orchestrator | Tenant-aware decision pump on top of the substrate. Public API: memopt.orchestrator.start / stop / stats / register_policy. v1.0 ships in observation-only mode (DECISION 7). | docs/orchestrator_v1_design.md |

Pillars

| # | Pillar | What It Solves | How |
| --- | --- | --- | --- |
| 1 | Infinite Context VMM | KV cache OOM for long contexts | Multi-tier paging (HBM → DRAM → NVMe) with predictive prefetch |
| 2 | Agentic KV Memory | Redundant KV recomputation across requests | Content-addressed cache skips inference on exact prompt hit |
| 3 | Self-Synthesizing Kernels | HBM memory stalls | Detects stalls, calls Claude API, synthesizes fused Triton kernels |
| 4 | AI Compliance Ledger | Energy / cost accountability + EU AI Act conformity | Per-batch energy measurement, SQLite ledger, HMAC-signed entries, carbon calculator, savings/compliance reports |
| 5 | Global Unified Memory | Wasted NVMe across nodes | Cross-node block sharing over TCP/RDMA with lease protocol |
| 6 | Silicon Certification | Hardware drift | Correctness + throughput battery, drift detector, auto re-cert daemon |
| 7 | GPU FinOps Intelligence | $/hour waste invisibility | Per-tenant utilization → dollar tracking with auditable signed reports |
| 8 | Hardware Abstraction | Multi-backend portability | Unified HAL over CUDA / ROCm / Gaudi / TPU / CPU stubs |
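
The content-addressed cache behind Pillar 2 can be pictured with a minimal sketch: the cache key is a digest of the exact (model, prompt) pair, so an identical request hits the cache and skips inference. All names below are hypothetical illustrations, not memopt's internal implementation:

```python
import hashlib

def cache_key(model: str, prompt: str) -> str:
    # Content addressing: identical (model, prompt) pairs always map
    # to the same key, so an exact repeat can skip inference entirely.
    payload = f"{model}\x00{prompt}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

kv_cache: dict[str, bytes] = {}

def run_inference(model: str, prompt: str) -> bytes:
    # Stand-in for the expensive path (real inference elided).
    return f"kv-for:{prompt}".encode()

def serve(model: str, prompt: str) -> bytes:
    key = cache_key(model, prompt)
    if key in kv_cache:              # exact prompt hit: no recomputation
        return kv_cache[key]
    result = run_inference(model, prompt)
    kv_cache[key] = result
    return result
```

The trade-off of exact-hash addressing is that any byte difference in the prompt misses the cache; that is also what makes hits safe to reuse.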

Pillars 4, 6, 7 wire to Layer 1/2 through memopt/integrations/ (attach_ledger_to_substrate, FinOpsPoller, assemble_production_receipt).
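
Pillar 4's HMAC-signed ledger entries follow a standard pattern: serialize the entry canonically, sign it with a shared secret, verify in constant time. A generic sketch of that pattern (the entry fields and function names are illustrative, not memopt's actual schema):

```python
import hashlib
import hmac
import json
import os

def sign_entry(entry: dict, key: bytes) -> str:
    # Canonical JSON (sorted keys) so the signature is stable
    # regardless of dict insertion order.
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode(), hashlib.sha256).hexdigest()

def verify_entry(entry: dict, signature: str, key: bytes) -> bool:
    # Constant-time comparison to resist timing attacks.
    return hmac.compare_digest(sign_entry(entry, key), signature)

key = os.environ.get("MEMOPT_SIGNING_KEY", "dev-only-key").encode()
entry = {"batch": 17, "joules": 42.5, "tenant": "acme"}
sig = sign_entry(entry, key)
assert verify_entry(entry, sig, key)
```

This is why MEMOPT_SIGNING_KEY must be a real secret: anyone who knows the key can forge ledger entries.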

Quick Start

# Install from PyPI:
pip install memopt-engine

# After installation, the Python import path is `memopt`
# (distribution-name vs import-name — same convention as
# `pip install scikit-learn` then `import sklearn`):
python -c "import memopt; print(memopt.__version__)"

For development from source:

git clone https://github.com/basnetlachu/memopt.git
cd memopt
pip install -e ".[dev,daemon,api]"

Profile a Model

memopt profile --model gpt2 --batch-size 8

Serve with All Pillars Active

memopt-serve --model meta-llama/Llama-2-7b --port 8001

The serving engine automatically:

  • Deduplicates KV cache across requests (Pillar 2)
  • Synthesizes fused kernels on HBM stalls (Pillar 3)
  • Measures energy per token via NVML (Pillar 4)
  • Monitors hardware drift (Pillar 6)
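
Energy-per-token from sampled power is a straightforward integration. The sketch below shows the arithmetic only (NVML's nvmlDeviceGetPowerUsage reports milliwatts); it is not memopt's measurement code:

```python
def joules_from_samples(power_mw: list[float], interval_s: float) -> float:
    # Riemann sum: each sample is assumed to hold for `interval_s` seconds.
    # NVML reports instantaneous power in milliwatts.
    return sum(p / 1000.0 for p in power_mw) * interval_s

def joules_per_token(power_mw: list[float], interval_s: float, tokens: int) -> float:
    return joules_from_samples(power_mw, interval_s) / tokens

# 300 W held across 4 samples at 0.5 s spacing = 600 J;
# over 100 generated tokens that is 6 J/token.
print(joules_per_token([300_000.0] * 4, 0.5, 100))
```

Sampling error shrinks with the polling interval; the principle is unchanged.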

OpenAI-Compatible API

curl http://localhost:8001/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2", "prompt": "Hello", "max_tokens": 50}'

Check Savings

curl http://localhost:8001/report/found-capacity

Architecture

Python (control plane)          C++ (data plane)
─────────────────────          ──────────────────
vmm/                           csrc/core/
  page_table.py (shim) ──────── _memopt_core.so
  oracle.py (shim)                (64-shard page table,
  tier_manager.py                  striped-lock oracle,
  prefetch_engine.py               16-shard block directory)
  federation.py

serving/                       csrc/hooks/
  kernel_hooks.py (shim) ────── _memopt_hooks.so
  auto_optimizer.py               (FNV-1a keys, atomic
  server.py                        counters, lock-free dispatch)
  paged_attention.py (shim) ── _memopt_paged.so
                                  (block pool, CUDA gather)

cluster/                       csrc/cuda_backend/
  gkd_store.py ────────────────  _memopt_cuda.so
  transport.py (shim) ────────── memopt-transport (sidecar)
  prefix_index.py (shim) ─────  _memopt_simd.so
  block_directory.py (shim)      (AVX-512 prefix match)

Every C++ module has a Python fallback. The system runs correctly without any C++ extensions built.

Repository Structure

memopt/
├── memopt/              Python package (control plane)
│   ├── vmm/             Infinite Context VMM (Pillar 1)
│   ├── cluster/         GKD + GUM + transport (Pillars 2, 5)
│   ├── kernels/         Self-synthesizing kernels (Pillar 3)
│   ├── observability/   Energy ledger + certificates (Pillar 4)
│   ├── serving/         OpenAI-compatible HTTP server
│   ├── profiler/        Hardware counter profiling
│   ├── control_plane/   Cluster management (FastAPI)
│   ├── daemon/          Background GPU monitor
│   └── api/             REST API
├── csrc/                C++17 data plane (38 source files)
│   ├── core/            PageTable + Oracle + BlockDirectory
│   ├── hooks/           Kernel dispatch table
│   ├── paged/           Block pool + CUDA gather kernel
│   ├── cuda_backend/    Stream pool + NVMe I/O + GDS
│   ├── transport/       RDMA sidecar daemon
│   └── simd/            AVX-512 prefix matching
├── tests/cpp/           GoogleTest suites (7 files)
├── scripts/             Audit and tooling
│   └── audit_wiring.py  Runtime wiring verification
└── docs/
    ├── architecture.md  Complete technical reference
    └── rdma_deployment.md  InfiniBand deployment guide

Configuration

All behavior is configurable via environment variables. Key ones:

| Variable | Default | Purpose |
| --- | --- | --- |
| REDIS_URL | | Redis for cluster-wide GKD + peer discovery |
| MEMOPT_NODE_ID | hostname | Unique node identifier |
| MEMOPT_EVICT_HIGH | 0.90 | HBM eviction trigger threshold |
| MEMOPT_EVICT_LOW | 0.75 | HBM eviction target threshold |
| MEMOPT_GOSSIP_FANOUT | 5 | Peers per gossip round |
| MEMOPT_FETCH_RETRIES | 0 | Remote block fetch retry count |
| MEMOPT_NVME_MAX_GB | 500 | NVMe usage cap before eviction |
| MEMOPT_QP_DEBUG | 0 | Log RDMA QP state transitions |

See docs/architecture.md Section 25 for the complete list.
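
The paired eviction thresholds implement standard high/low watermark hysteresis: nothing happens until HBM utilization crosses MEMOPT_EVICT_HIGH, then eviction runs until utilization falls to MEMOPT_EVICT_LOW. A generic sketch of the idea (not memopt's eviction loop):

```python
import os

EVICT_HIGH = float(os.environ.get("MEMOPT_EVICT_HIGH", "0.90"))
EVICT_LOW = float(os.environ.get("MEMOPT_EVICT_LOW", "0.75"))

def blocks_to_evict(used: int, capacity: int, block: int) -> int:
    """How many fixed-size blocks to evict to reach the low watermark."""
    if used / capacity < EVICT_HIGH:
        return 0                        # below the trigger: hysteresis gap
    target = int(capacity * EVICT_LOW)  # evict down to the low watermark
    excess = used - target
    return -(-excess // block)          # ceiling division

# 95% full on an 80 GiB device with 2 MiB blocks:
gib = 1024**3
print(blocks_to_evict(76 * gib, 80 * gib, 2 * 1024**2))  # 8192 blocks
```

The gap between the two thresholds is what prevents thrashing: after an eviction pass, utilization must climb back from 75% to 90% before the next one fires.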

Building C++ Extensions

pip install pybind11 scikit-build-core cmake ninja

# Build all extensions
cd csrc && mkdir build && cd build
cmake .. -DMEMOPT_ENABLE_TESTS=ON
make -j$(nproc)

# Run C++ tests
ctest --output-on-failure

# Optional: CUDA, RDMA, AVX-512
cmake .. -DMEMOPT_ENABLE_RDMA=ON -DMEMOPT_ENABLE_AVX512=ON

Running Tests

# Python tests (no C++ required)
pytest --tb=short -q

# Wiring audit (verifies C++ integration)
python scripts/audit_wiring.py

Requirements

  • Python 3.10+
  • PyTorch 2.0+
  • Optional: CUDA 12.4+ (GPU kernels), pynvml (power measurement), redis-py (cluster mode)

Documentation

Memory substrate

memopt v1 ships Layer 1 of the memory substrate: a tenant-isolated, stream-aware allocator with pluggable backends (CUDA VMM, ROCm/HIP stub, Level Zero stub, CXL/NUMA, CPU). The public API is memopt.alloc / free / context / stats / observe plus MemoryHandle. See docs/substrate_v1_user_guide.md for usage and docs/substrate_v1_design.md for the spec.
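
To make that surface concrete, here is a self-contained toy that mimics the described alloc / free / context / stats shape. Everything below is a stand-in written for illustration; real signatures, handle fields, and semantics are defined in docs/substrate_v1_user_guide.md and may differ:

```python
from contextlib import contextmanager
from dataclasses import dataclass
from itertools import count

@dataclass
class MemoryHandle:                      # stand-in for memopt.MemoryHandle
    id: int
    nbytes: int
    tenant: str

_ids = count()
_live: dict[int, MemoryHandle] = {}
_tenant = "default"

@contextmanager
def context(tenant: str):
    """Tenant-isolated allocation scope, per the substrate description."""
    global _tenant
    prev, _tenant = _tenant, tenant
    try:
        yield
    finally:
        _tenant = prev

def alloc(nbytes: int) -> MemoryHandle:
    handle = MemoryHandle(next(_ids), nbytes, _tenant)
    _live[handle.id] = handle
    return handle

def free(handle: MemoryHandle) -> None:
    del _live[handle.id]

def stats() -> dict:
    per_tenant: dict[str, int] = {}
    for h in _live.values():
        per_tenant[h.tenant] = per_tenant.get(h.tenant, 0) + h.nbytes
    return {"live_handles": len(_live), "bytes_by_tenant": per_tenant}

with context("tenant-a"):
    h = alloc(1 << 20)                   # 1 MiB attributed to tenant-a
print(stats())
free(h)
```

The point of the sketch is the shape: allocations are attributed to the tenant whose context is active, and stats() reports per-tenant accounting.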

Orchestrator (Layer 2)

memopt v1.2 adds Layer 2: a tenant-aware observation/decision layer that sits on top of the substrate. In v1.0 (Phase A) it observes the substrate's event stream and exposes a public Policy protocol; it does NOT drive eviction yet (that ships behind MEMOPT_USE_ORCHESTRATOR=1 in Phase B). The public API is memopt.orchestrator.start / stop / stats / register_policy plus memopt.peek_handle. See docs/orchestrator_v1_user_guide.md for usage and docs/orchestrator_v1_design.md for the spec.
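
The observation-only design can be pictured with a toy stand-in: policies registered via register_policy see the substrate's event stream, but in Phase A nothing acts on their decisions. The Policy method names and event shape below are hypothetical, not memopt's actual protocol:

```python
from typing import Protocol

class Policy(Protocol):                  # toy stand-in for the Policy protocol
    def on_event(self, event: dict) -> None: ...

_policies: list[Policy] = []

def register_policy(policy: Policy) -> None:
    _policies.append(policy)

def pump(events: list[dict]) -> None:
    # Phase A: observation-only. Policies see every substrate event,
    # but nothing here acts on what they decide.
    for ev in events:
        for p in _policies:
            p.on_event(ev)

class LoggingPolicy:
    def __init__(self) -> None:
        self.seen: list[dict] = []
    def on_event(self, event: dict) -> None:
        self.seen.append(event)          # would drive eviction in Phase B

pol = LoggingPolicy()
register_policy(pol)
pump([{"kind": "alloc", "bytes": 4096}, {"kind": "free", "bytes": 4096}])
print(len(pol.seen))
```

Shipping the protocol before the actuation path lets policies be written and validated against real traffic before they are allowed to evict anything.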

About MEMOPT

MEMOPT (pronounced memm-opt) is a universal memory fabric for AI infrastructure. It solves one of the hardest problems in modern GPU serving: memory — KV-cache OOM under long contexts, redundant KV recomputation across requests, HBM stalls from un-fused kernels, multi-tier paging across HBM/DRAM/NVMe, cross-node block sharing, and auditable energy/cost accountability. memopt unifies all of these behind one API, with two pinned infrastructure layers (substrate + orchestrator) and eight product pillars layered on top.

It's released under Apache-2.0 so any AI infrastructure team can read the source, audit the security guarantees, fork it, contribute back, or run it in production without licensing friction.

About Sophisticates

memopt is built and open-sourced by Sophisticates (pronounced so-phis-ti-cates), a deep tech venture company founded by Lachu Man Basnet. Sophisticates builds companies from first principles across AI, Quantum Computing, Robotics, and Physics. MEMOPT is Sophisticates' flagship product in the AI infrastructure vertical.

If your team uses memopt in production, we'd love to hear about it — open a discussion on GitHub or reach out via sophisticatesai.com.

License

Apache License 2.0. See LICENSE for the full text and NOTICE for third-party attributions. A pinned dependency license audit lives at docs/license_audit.md.



Download files

Download the file for your platform.

Source Distribution

memopt_engine-1.3.0a2.tar.gz (558.4 kB)


Built Distribution


memopt_engine-1.3.0a2-py3-none-any.whl (639.7 kB)


File details

Details for the file memopt_engine-1.3.0a2.tar.gz.

File metadata

  • Download URL: memopt_engine-1.3.0a2.tar.gz
  • Upload date:
  • Size: 558.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for memopt_engine-1.3.0a2.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 71ffcb2cb55e4741080dd5a3e7990b552d243409ad273eb8f805724bba0053de |
| MD5 | 30baa4f3fcd65af825b4887671edf522 |
| BLAKE2b-256 | 589ea047b3a956648e7ae83aca64ba4db63238a523e3b521f815682345ec8c3d |


Provenance

The following attestation bundles were made for memopt_engine-1.3.0a2.tar.gz:

Publisher: publish.yml on basnetlachu/memopt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file memopt_engine-1.3.0a2-py3-none-any.whl.

File metadata

  • Download URL: memopt_engine-1.3.0a2-py3-none-any.whl
  • Upload date:
  • Size: 639.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for memopt_engine-1.3.0a2-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 8f81be594379211946935be1212aaaf8fde9d2e51ef56e027787332444006e82 |
| MD5 | ae39708869ccf6de57bc3cd4b5fbc0ab |
| BLAKE2b-256 | 434eaf8cc931ff46cfaa9010f3698271e68211d0068d409f1a193d15fb8f82c4 |


Provenance

The following attestation bundles were made for memopt_engine-1.3.0a2-py3-none-any.whl:

Publisher: publish.yml on basnetlachu/memopt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
