Recipe-driven recommender training and serving on irspack.

Recotem

Recipe-driven recommender training and serving, built on irspack. One YAML recipe describes where the data lives, how to train, and where to write the result. Running recotem train produces a signed binary artifact; recotem serve mounts it under /v1/recipes/{name}:recommend (plus :recommend-related and batch verbs) and hot-swaps when a new artifact appears. No database, no message broker, no admin UI.

Why Recotem

Most recommender stacks pull in a service mesh of databases, queues, and control planes before you can train your first model. Recotem keeps the moving parts to a recipe file and a binary artifact:

  • Single package, two commands. recotem train runs as a batch job; recotem serve runs as a long-lived FastAPI process. They share nothing but the artifact file on disk (or in object storage).
  • Reproducible by construction. Recipes are versioned with your code; artifacts are HMAC-signed with a SHA-checked header you can inspect without loading the model.
  • Hot-swap, no restart. The serving process watches the artifact directory and atomically swaps the in-memory model when training emits a new file.
  • Bring-your-own scheduler. recotem train is a normal process — drive it from cron, Airflow, a Kubernetes CronJob, or anything else.
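
The signing idea in the list above can be sketched with the standard library. This is only an illustration, assuming HMAC-SHA256 over the raw payload bytes and a kid-to-key mapping; the actual Recotem artifact layout (header fields, framing, digest choice) is not described here, and sign and verify are hypothetical helper names.

```python
import hashlib
import hmac


def sign(payload: bytes, kid: str, keys: dict[str, bytes]) -> tuple[str, str]:
    """Return (kid, hex MAC) for payload, using the key registered under kid."""
    mac = hmac.new(keys[kid], payload, hashlib.sha256).hexdigest()
    return kid, mac


def verify(payload: bytes, kid: str, mac: str, keys: dict[str, bytes]) -> bool:
    """Check mac against the key named by kid; unknown kids are rejected."""
    key = keys.get(kid)
    if key is None:
        return False
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, mac)


# Two keys loaded at once: artifacts signed with either one still verify,
# which is what makes rotation possible without downtime.
keys = {"old": bytes.fromhex("00" * 32), "new": bytes.fromhex("11" * 32)}
payload = b"serialized-model-bytes"  # placeholder payload for the demo
kid, mac = sign(payload, "old", keys)
print(verify(payload, kid, mac, keys))
```

Keeping the key id next to the MAC lets a verifier pick the right key directly instead of trying every key it holds.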

Features

  • Recipe-driven: 1 YAML = 1 model = 1 /v1/recipes/{name}:recommend endpoint (with related/batch verbs)
  • Hyperparameter search across irspack algorithms via Optuna
  • Pluggable data sources (built-in: CSV / Parquet / BigQuery / SQL; extend via Python entry points)
  • HMAC-signed artifacts with multi-key rotation and a deterministic FQCN allow-list at deserialization time
  • API-key authentication (X-API-Key); keys hashed at rest
  • fsspec paths everywhere — local, S3, GCS, HTTPS, anything fsspec speaks
  • Optional Prometheus metrics endpoint, structured JSON logs with built-in secret redaction
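
An allow-list at deserialization time can be illustrated with the standard pickle module. The sketch below assumes a pickle-based payload, which this page does not confirm; ALLOWED, AllowListUnpickler, and safe_loads are hypothetical names, and the single allowed class is chosen only to keep the demo self-contained.

```python
import io
import pickle
from collections import OrderedDict
from fractions import Fraction

# Fully qualified class names that may be instantiated while loading.
ALLOWED = frozenset({"collections.OrderedDict"})


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module: str, name: str):
        fqcn = f"{module}.{name}"
        if fqcn not in ALLOWED:
            # Refuse before the class is ever imported or called.
            raise pickle.UnpicklingError(f"class not allowed: {fqcn}")
        return super().find_class(module, name)


def safe_loads(data: bytes):
    """Unpickle data, permitting only classes named in ALLOWED."""
    return AllowListUnpickler(io.BytesIO(data)).load()


print(safe_loads(pickle.dumps(OrderedDict(a=1))))
try:
    safe_loads(pickle.dumps(Fraction(1, 2)))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

Verifying the signature first and applying the allow-list second gives two independent checks before any model code runs.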

Data Sources

  • CSV / Parquet — local files or any fsspec-reachable URL (S3, GCS, Azure, HTTPS).
  • BigQuery — SQL queries with Storage Read API support.
  • SQL (PostgreSQL / MySQL / MariaDB / SQLite) — via SQLAlchemy 2. See docs/data-sources/sql.md.
  • Custom plugins — implement the DataSource Protocol and register via recotem.datasources entry-points.

Install

pip install recotem                 # core
pip install "recotem[bigquery]"     # BigQuery data source
pip install "recotem[metrics]"      # Prometheus metrics endpoint
pip install "recotem[postgres]"     # PostgreSQL via psycopg
pip install "recotem[mysql]"        # MySQL/MariaDB via PyMySQL
pip install "recotem[sqlite]"       # SQLite (stdlib)

Requires Python 3.12+. A multi-arch Docker image is published to ghcr.io/codelibs/recotem.

Quickstart

The repository ships with a self-contained example at examples/quickstart/ — recipe, dataset, and artifact directory all in one place. Train a TopPop recommender from a 60-user CSV in under a minute.

# 1. Set demo keys. DEMO ONLY — for production, generate fresh keys with
#    `recotem keygen --type signing` and `recotem keygen --type api`.
export RECOTEM_SIGNING_KEYS="dev:0000000000000000000000000000000000000000000000000000000000000000"
export RECOTEM_API_PLAINTEXT="recotem-quickstart-demo-key-0000"
export RECOTEM_API_KEYS="dev:sha256:21be5c3be85b8d68123df9f9b6a26d8e307db30350ea8bcc844883e22ebcf125"

# 2. Train, serve
recotem train examples/quickstart/recipe.yaml
recotem serve --recipes examples/quickstart/ &

# Wait for the server to become ready before sending traffic.
until curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/v1/health | grep -q "200"; do sleep 1; done

# 3. Recommend
# 3a. Recommend for a known user
curl -X POST http://localhost:8080/v1/recipes/top_picks:recommend \
  -H "X-API-Key: $RECOTEM_API_PLAINTEXT" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "u01", "limit": 5}'

# 3b. Recommend items related to a seed item
curl -X POST http://localhost:8080/v1/recipes/top_picks:recommend-related \
  -H "X-API-Key: $RECOTEM_API_PLAINTEXT" \
  -H "Content-Type: application/json" \
  -d '{"seed_items": ["i00"], "limit": 5}'

# Example response (values abridged):
{
  "request_id": "req_01HZX...",
  "recipe": "top_picks",
  "model_version": "sha256:abc...",
  "items": [{"item_id": "i00", "score": 0.91}]
}
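
The same calls can be made from Python with only the standard library. This is a minimal client sketch that mirrors the curl commands above; build_request, call, and main are hypothetical helper names, and the base URL assumes the quickstart server on localhost:8080.

```python
import json
import os
import urllib.request

BASE_URL = "http://localhost:8080"  # quickstart server address


def build_request(recipe: str, verb: str, body: dict, api_key: str) -> urllib.request.Request:
    """Build a POST to /v1/recipes/{recipe}:{verb} with the X-API-Key header."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/recipes/{recipe}:{verb}",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def call(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def main() -> None:
    # Requires a running `recotem serve` and RECOTEM_API_PLAINTEXT in the env.
    key = os.environ["RECOTEM_API_PLAINTEXT"]
    req = build_request("top_picks", "recommend", {"user_id": "u01", "limit": 5}, key)
    for item in call(req)["items"]:
        print(item["item_id"], item["score"])
```

Call main() once the health check succeeds; swapping the verb for "recommend-related" and the body for {"seed_items": ["i00"], "limit": 5} reproduces step 3b.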

The recipe itself is 11 lines — every other field has a sensible default. See examples/quickstart/recipe.yaml for the source of truth and docs/recipe-reference.md for the full schema.

Which env var is needed where?

Variable               | Required by     | Purpose
-----------------------|-----------------|--------
RECOTEM_SIGNING_KEYS   | train and serve | HMAC sign / verify artifact files (server keeps plaintext; needed for both sides)
RECOTEM_API_KEYS       | serve           | Authenticate /v1/recipes/* callers (server keeps hash only)
X-API-Key: <plaintext> | HTTP clients    | Sent by clients on every /v1/recipes/* call; server re-hashes and compares

Both environment variables accept multiple comma-separated entries (kid:value,kid2:value,…) to enable zero-downtime key rotation, which is why their names are plural.

Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                  recotem (single Python package)                       │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│   recipe.yaml ──▶ recotem train ──▶ artifact.recotem ──▶ recotem serve │
│                   (batch job)        (HMAC-signed)        (FastAPI,    │
│                                                            hot-swap)   │
│                                                                        │
│   any scheduler          local FS, S3,         POST /v1/recipes/{name} │
│   (cron / k8s / …)       GCS, fsspec               X-API-Key auth      │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

train and serve communicate only via signed artifact files. They can run on different machines; the watcher swaps models per recipe based on file mtime.
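
An mtime-driven swap can be sketched with a small polling holder. This is an illustration of the pattern rather than Recotem's watcher: HotSwapSlot and its loader callback are hypothetical, and signature verification is assumed to happen inside the loader.

```python
import os
import threading
from pathlib import Path
from typing import Any, Callable


class HotSwapSlot:
    """Holds one model and reloads it when the artifact's mtime changes."""

    def __init__(self, path: Path, loader: Callable[[Path], Any]) -> None:
        self._path = Path(path)
        self._loader = loader
        self._lock = threading.Lock()
        self._mtime_ns: int | None = None
        self._model: Any = None

    def poll(self) -> bool:
        """Reload if the file changed; return True when a swap happened."""
        try:
            mtime_ns = self._path.stat().st_mtime_ns
        except FileNotFoundError:
            return False  # keep serving the previous model
        if mtime_ns == self._mtime_ns:
            return False
        new_model = self._loader(self._path)  # slow work outside the lock
        with self._lock:
            # Readers see either the old or the new model, never a mix.
            self._model, self._mtime_ns = new_model, mtime_ns
        return True

    @property
    def model(self) -> Any:
        with self._lock:
            return self._model
```

A writer on the training side would typically write to a temporary name and then call os.replace, so the watcher never observes a half-written file.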

Contributing

Issues and pull requests welcome. Development uses uv for dependency management:

uv sync --all-extras
uv run pytest tests
uv run ruff check src tests

See CLAUDE.md (or the project guidelines therein) for the full contributor workflow.

License

Apache License 2.0.
