Python/Mojo interface for Google Gemma 3 using MAX Engine
mogemma
Python/Mojo interface for Google Gemma 3 with MAX Engine.
Features
- Embeddings — Generate dense vector embeddings through the Mojo backend
- Text generation — Synchronous and async streaming text generation with configurable sampling (temperature, top-k, top-p)
- HuggingFace Hub — Automatically resolves local paths and downloads missing HF IDs into the cache
- OpenTelemetry — Optional tracing instrumentation
- Lazy imports — Only loads what you use; optional extras keep the install slim
Installation
pip install mogemma
huggingface-hub is included in the base installation, so Hugging Face model IDs are supported by default.
Optional extras
| Extra | What it adds | Install |
|---|---|---|
| embed | numpy (for embeddings) | pip install 'mogemma[embed]' |
| text | numpy + tokenizers (for text generation) | pip install 'mogemma[text]' |
| telemetry | opentelemetry-api (for tracing) | pip install 'mogemma[telemetry]' |
| all | everything above | pip install 'mogemma[all]' |
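To see which extras are usable in the current environment, a small check like the following can help. It is a sketch, not part of mogemma: the `EXTRA_MODULES` mapping and the `available_extras` helper are hypothetical, and the import names (for example `opentelemetry` for opentelemetry-api) are assumptions based on the table above.

```python
from importlib.util import find_spec

# Mapping taken from the extras table; the import names are assumptions
# (for example, opentelemetry-api is assumed to import as "opentelemetry").
EXTRA_MODULES = {
    "embed": ["numpy"],
    "text": ["numpy", "tokenizers"],
    "telemetry": ["opentelemetry"],
}


def _is_importable(name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def available_extras(is_importable=_is_importable):
    """Return {extra: bool}, True when every module of the extra is importable."""
    return {
        extra: all(is_importable(module) for module in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ready in available_extras().items():
        print(f"{extra}: {'installed' if ready else 'missing'}")
```

The `is_importable` parameter exists only so the check can be exercised without installing anything.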
Model resolution behavior
- Model identifiers are accepted in namespace/model format (for example, google/gemma-3-1b).
- Missing IDs are downloaded into the local cache on first use unless disabled by your offline settings.
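The resolution order described above (local directory first, then Hub ID) can be sketched as follows. This is an illustration under assumptions, not mogemma's implementation: `classify_model_ref` and the `HUB_ID` pattern are hypothetical, and the download step is left out.

```python
import re
from pathlib import Path

# Hypothetical pattern for "namespace/model" IDs; not taken from mogemma.
HUB_ID = re.compile(r"^[A-Za-z0-9][\w.-]*/[A-Za-z0-9][\w.-]*$")


def classify_model_ref(ref):
    """Return ("local", Path) or ("hub", repo_id); raise ValueError otherwise."""
    path = Path(ref).expanduser()
    if path.is_dir():
        # An existing directory always wins over a Hub lookup.
        return ("local", path)
    if path.exists():
        raise ValueError(f"Model path {ref} exists but is not a directory")
    if HUB_ID.match(ref):
        # Not on disk, but shaped like a Hub ID: a caller would download it.
        return ("hub", ref)
    raise ValueError(f"{ref!r} is neither a directory nor a namespace/model ID")


if __name__ == "__main__":
    print(classify_model_ref("google/gemma-3-1b"))
```

Checking the filesystem before the ID pattern means a local checkout named like a Hub ID is used as-is, without touching the network.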
Quick start
Embeddings
from mogemma import EmbeddingConfig, EmbeddingModel
config = EmbeddingConfig(model_path="google/gemma-3-1b")
model = EmbeddingModel(config)
embeddings = model.embed(["Hello, world!", "Model outputs are computed by MAX Engine."])
print(embeddings.shape) # (2, hidden_dim)
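A common next step is comparing embeddings. The sketch below computes cosine similarity in pure Python on stand-in vectors; it assumes, as the shape comment above suggests, that each row returned by embed() is a sequence of floats. The `cosine_similarity` helper is illustrative and not part of mogemma.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Similarity with a zero vector is undefined; report 0.0 here.
        return 0.0
    return dot / (norm_a * norm_b)


# Stand-in rows; with real output you would pass rows of the embeddings array.
rows = [[1.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
print(cosine_similarity(rows[0], rows[1]))
print(cosine_similarity(rows[0], rows[2]))
```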
Text generation
from mogemma import GenerationConfig, SyncGemmaModel
config = GenerationConfig(
    model_path="google/gemma-3-1b",
    max_new_tokens=64,
    temperature=0.7,
)
model = SyncGemmaModel(config)
# Full generation
print(model.generate("Explain quantum computing in one sentence:"))
# Streaming
for token in model.generate_stream("Once upon a time"):
    print(token, end="", flush=True)
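For readers unfamiliar with the sampling parameters, here is a generic sketch of how temperature, top-k, and top-p are usually combined when picking the next token. It is a textbook illustration and makes no claim about how the Mojo backend implements sampling; `sample_token` and its defaults are hypothetical.

```python
import math
import random


def sample_token(logits, temperature=0.7, top_k=0, top_p=1.0, rng=random):
    """Pick an index from raw logits using temperature, top-k, and top-p."""
    if temperature <= 0:
        # Treat a non-positive temperature as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)

    # Top-k: keep only the k most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for prob, index in ranked:
        kept.append((prob, index))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Sample from the survivors; choices() renormalizes the weights.
    indices = [index for _, index in kept]
    probs = [prob for prob, _ in kept]
    return rng.choices(indices, weights=probs, k=1)[0]


if __name__ == "__main__":
    print(sample_token([1.0, 5.0, 2.0], temperature=0.7, top_k=2, top_p=0.9))
```

Lower temperatures sharpen the distribution toward the most likely token, while top-k and top-p trim the low-probability tail before sampling.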
Async streaming
import asyncio
from mogemma import GenerationConfig
from mogemma.model import AsyncGemmaModel
config = GenerationConfig(model_path="google/gemma-3-1b", max_new_tokens=64)
model = AsyncGemmaModel(config)
async def main():
    async for token in model.generate_stream("The future of AI is"):
        print(token, end="", flush=True)
asyncio.run(main())
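When the full text is needed rather than token-by-token output, the stream can be collected into a string and bounded with a timeout. The sketch below uses a stand-in generator (`fake_stream`) so it runs without a model; `collect` is a hypothetical helper, and it assumes only that the stream is an async iterator of strings, as the loop above implies.

```python
import asyncio


async def fake_stream(prompt):
    """Stand-in for a token stream; yields a few fixed tokens."""
    for token in (prompt, " bright", " and", " open", "."):
        await asyncio.sleep(0)  # Yield control, as a real stream would.
        yield token


async def collect(stream, max_tokens=None):
    """Join tokens from an async iterator, optionally stopping early."""
    parts = []
    async for token in stream:
        parts.append(token)
        if max_tokens is not None and len(parts) >= max_tokens:
            break
    return "".join(parts)


async def main():
    # wait_for bounds the total time spent waiting on the stream.
    stream = fake_stream("The future of AI is")
    return await asyncio.wait_for(collect(stream), timeout=5.0)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With a real model, the stand-in would be replaced by the object returned from generate_stream.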
Development
# Clone and install everything
git clone https://github.com/cofin/mogemma.git
cd mogemma
make install
# Run tests
make test
# Lint and type-check
make lint
# Run release preflight checks
make check-release
# Build the Mojo shared library
make build
Troubleshooting
- Model path ... exists but is not a directory: point to a model directory or use a valid Hugging Face model id.
- Offline mode: set HF_HUB_OFFLINE=0, ensure network access, then retry.
- HF_TOKEN: configure a token (huggingface-cli login) for private/restricted repos.
- Cached downloads are stored in ~/.cache/mogemma.
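Before retrying a failed download, it can help to inspect the environment variables mentioned above. The following is a minimal sketch: `hub_environment_report` is a hypothetical helper, and the set of values treated as "true" is an assumption. A token saved by huggingface-cli login lives on disk, so this check cannot see it.

```python
import os

# Values treated as "true" here are an assumption for this sketch.
TRUTHY = {"1", "ON", "YES", "TRUE"}


def hub_environment_report(env=None):
    """Return human-readable notes about Hub-related environment variables."""
    env = os.environ if env is None else env
    notes = []
    if env.get("HF_HUB_OFFLINE", "").strip().upper() in TRUTHY:
        notes.append("HF_HUB_OFFLINE is enabled: missing models cannot be downloaded.")
    if not env.get("HF_TOKEN"):
        notes.append("HF_TOKEN is not set: private or restricted repos may fail.")
    return notes


if __name__ == "__main__":
    for note in hub_environment_report() or ["No obvious environment issues."]:
        print(note)
```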
Implementation evidence
- Core output contracts and error paths: src/mo/tests/test_core_contract.py, src/py/tests/test_contracts_integration.py
- Runtime contract behavior (sync/async generation and embeddings): src/py/tests/test_gemma_model.py, src/py/tests/test_embeddings.py, src/py/tests/test_async.py
- Model delivery and hub error handling: src/py/tests/test_hub.py
License
MIT