Skip to main content

🔥 Lightweight recommender for teaching and small-scale production

Project description

CI Python License: MIT PyPI version pre-commit

Lightweight recommender for teaching and small-scale production.

Why recsys-lite?

recsys-lite is a small, teach-first recommender toolkit that runs in constrained environments and scales to real data. Use it from the CLI or Python to load interaction data, train classic recommenders, evaluate, and export results reproducibly.

Highlights

  • Teach‑first UX – Interactive CLI, instructor notes, and Jupyter demos so classes and workshops can get hands‑on quickly.
  • Solid data handling – Validates schema, tolerates missing / malformed rows, supports dense or sparse input, and streams large CSV/Parquet files in chunks.
  • Runs small – Minimal deps; installs under ~50 MB* and trains sample models in <2 GB RAM; works offline once wheels are cached.
  • Multiple algorithms – SVD, implicit ALS, cosine KNN, global & per‑user bias, plus pluggable metrics and top‑N recommenders.
  • CLI ⇄ API parity – Same operations available programmatically and via recsys-lite commands; good for CI and teaching.
  • Reproducible deploy – Docker image and serve command expose a REST endpoint for scoring & top‑N; versioned model artifacts.
  • Docs & examples – Notebooks, educator guides, and generated API reference for fast onboarding.

🚀 Getting Started

Installation

pip install recsys_lite

Offline: See guide below.

Quick Example

import numpy as np
from recsys_lite import svd_reconstruct, top_n

ratings = np.array([[5, 3, 0, 1], [0, 0, 4, 5]], dtype=np.float32)
reconstructed = svd_reconstruct(ratings, k=2)
recommendations = top_n(reconstructed, ratings, n=2)
print(recommendations)

CLI: recsys --help

🎓 For Educators & Students

Teach recommendation systems without fancy hardware.

  • Interactive Mode: recsys teach --concept svd (prompts, examples)
  • Notebooks: examples/svd_math_demo.ipynb (math breakdowns, plots)
  • Low-Resource Demos: Generate data and run in <1s

Full guide: docs/educator_guide.md

🚀 For Small App Deployers

Build production recommenders for <10k users/items.

  • Deploy API: recsys deploy model.pkl
  • Offline Install: Download wheel, install via USB
  • Resource Efficient: Sparse matrices for low memory

Examples: Library book recommender, local e-commerce.

Full guide: docs/deployment_guide.md

🛠️ For Developers

Extend or contribute easily.

  • API: Clean, typed functions (svd_reconstruct, RecommenderSystem)
  • Contributing: make dev setup, tests, linting
  • Benchmarks: recsys benchmark

See CONTRIBUTING.md and API Reference.

ML Tooling Examples

ALS (Implicit Feedback):

from recsys_lite import RecommenderSystem
ratings = np.array([[1, 0, 1], [0, 1, 0]], dtype=np.float32)  # binary implicit
model = RecommenderSystem(algorithm="als")
model.fit(ratings, k=2)
preds = model.predict(ratings)

KNN (Cosine Similarity):

from recsys_lite import RecommenderSystem
ratings = np.array([[5, 3, 0], [0, 0, 4]], dtype=np.float32)
model = RecommenderSystem(algorithm="knn")
model.fit(ratings, k=2)
preds = model.predict(ratings)

Bias Handling (SVD):

from recsys_lite import RecommenderSystem
ratings = np.array([[5, 3, 0], [0, 0, 4]], dtype=np.float32)
model = RecommenderSystem(algorithm="svd")
model.fit(ratings, k=2)
preds = model.predict(ratings)  # Includes global/user/item bias

Chunked SVD:

from recsys_lite import svd_reconstruct
large_mat = np.random.rand(10000, 500)
reconstructed = svd_reconstruct(large_mat, k=10, use_sparse=True)

Metrics:

recs = [[1,2,3], [4,5]]
actual = [{1,3}, {4,6}]
print(precision_at_k(recs, actual, 3))
print(recall_at_k(recs, actual, 3))
print(ndcg_at_k(recs, actual, 3))

CV Split:

train, test = train_test_split_ratings(mat, test_size=0.2)

Pipeline:

from recsys_lite import RecsysPipeline, RecommenderSystem
pipe = RecsysPipeline([('model', RecommenderSystem())])
pipe.fit(mat, k=2)
recs = pipe.recommend(mat, n=3)

Grid Search:

result = grid_search_k(mat, [2,4], 'rmse')
print(result['best_k'])

Full details in API Reference.

📚 Use Cases

Education

  • University courses: Teach recommendation systems without expensive infrastructure
  • Self-learning: Students can run everything on personal laptops
  • Workshops: Quick demos that work offline
  • Research: Simple baseline implementation for papers

Small-Scale Production

  • School library: Recommend books to students (500 books, 200 students)
  • Local business: Product recommendations for small e-commerce
  • Community app: Match local services to residents
  • Personal projects: Add recommendations to your blog or app

Development

  • Prototyping: Test recommendation ideas quickly
  • Learning: Understand SVD by reading clean, documented code
  • Benchmarking: Compare against simple, fast baseline
  • Integration: Easy to embed in larger systems

📈 Performance

Resource Usage

Designed for resource-constrained environments:

Dataset Size RAM Usage Time (old laptop) Time (modern PC)
100 × 50 < 10 MB < 0.1s < 0.01s
1K × 1K < 50 MB < 1s < 0.1s
10K × 5K < 500 MB < 10s < 2s

Tested On

  • 10-year-old laptops (Core i3, 2GB RAM)
  • Raspberry Pi 4
  • Modern workstations
  • Cloud containers (minimal resources)

Memory Efficiency

  • Sparse matrix support: Handles 90% sparse data efficiently
  • Chunked processing: Works with limited RAM
  • Minimal dependencies: ~50MB total install size

🔧 API Reference

Core Functions

def svd_reconstruct(
    mat: FloatMatrix,
    *,
    k: Optional[int] = None,
    random_state: Optional[int] = None,
    use_sparse: bool = True,
) -> FloatMatrix:
    """Truncated SVD reconstruction for collaborative filtering. Supports chunked processing for large matrices."""

def top_n(
    est: FloatMatrix,
    known: FloatMatrix,
    *,
    n: int = 10
) -> np.ndarray:
    """Get top-N items for each user, excluding known items."""

def compute_rmse(predictions: FloatMatrix, actual: FloatMatrix) -> float:
    """Compute Root Mean Square Error between predictions and actual ratings."""

def compute_mae(predictions: FloatMatrix, actual: FloatMatrix) -> float:
    """Compute Mean Absolute Error between predictions and actual ratings."""

I/O Functions

def load_ratings(
    path: Union[str, Path],
    *,
    format: Optional[str] = None,
    sparse_format: bool = False,
    **kwargs: Any,
) -> FloatMatrix:
    """Load matrix from file with format detection."""

def save_ratings(
    matrix: Union[FloatMatrix, sparse.csr_matrix],
    path: Union[str, Path],
    *,
    format: Optional[str] = None,
    show_progress: bool = True,
    **kwargs: Any,
) -> None:
    """Save rating matrix with automatic format detection."""

def create_sample_ratings(
    n_users: int = 100,
    n_items: int = 50,
    sparsity: float = 0.8,
    rating_range: tuple[float, float] = (1.0, 5.0),
    random_state: Optional[int] = None,
    sparse_format: bool = False,
) -> Union[FloatMatrix, sparse.csr_matrix]:
    """Create a sample rating matrix for testing."""

RecommenderSystem Class

class RecommenderSystem:
    """Production-ready recommender system with model persistence. Supports SVD, ALS (implicit), and KNN algorithms, with bias handling."""

    def __init__(self, algorithm: str = "svd", ...):
        """algorithm: 'svd', 'als', or 'knn'"""
    def fit(self, ratings: FloatMatrix, k: Optional[int] = None) -> "RecommenderSystem":
        """Fit the model to training data."""
    def predict(self, ratings: FloatMatrix) -> FloatMatrix:
        """Generate predictions for the input matrix."""
    def recommend(self, ratings: FloatMatrix, n: int = 10, exclude_rated: bool = True) -> np.ndarray:
        """Generate top-N recommendations for each user."""
    def save(self, path: str) -> None:
        """Save model to file using secure joblib serialization."""
    @classmethod
    def load(cls, path: str) -> "RecommenderSystem":
        """Load model from file."""

🧩 Real-World Usage Examples

Using with Pandas

import pandas as pd
from recsys_lite import svd_reconstruct, top_n

# Load ratings from a DataFrame
ratings_df = pd.read_csv('ratings.csv', index_col=0)
ratings = ratings_df.values.astype(float)

# SVD recommendations
reconstructed = svd_reconstruct(ratings, k=20)
recommendations = top_n(reconstructed, ratings, n=5)
print(recommendations)

Integrating in a Web App (FastAPI Example)

from fastapi import FastAPI
from recsys_lite import svd_reconstruct, top_n
import numpy as np

app = FastAPI()
ratings = np.load('ratings.npy')
reconstructed = svd_reconstruct(ratings, k=10)

@app.get('/recommend/{user_id}')
def recommend(user_id: int, n: int = 5):
    recs = top_n(reconstructed, ratings, n=n)
    return {"user_id": user_id, "recommendations": recs[user_id].tolist()}

📚 Use Cases

Education

  • University courses: Teach recommendation systems without expensive infrastructure
  • Self-learning: Students can run everything on personal laptops
  • Workshops: Quick demos that work offline
  • Research: Simple baseline implementation for papers

Small-Scale Production

  • School library: Recommend books to students (500 books, 200 students)
  • Local business: Product recommendations for small e-commerce
  • Community app: Match local services to residents
  • Personal projects: Add recommendations to your blog or app

Development

  • Prototyping: Test recommendation ideas quickly
  • Learning: Understand SVD by reading clean, documented code
  • Benchmarking: Compare against simple, fast baseline
  • Integration: Easy to embed in larger systems

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • NumPy/SciPy: Core numerical computing
  • Numba: JIT compilation for performance
  • Rich: Beautiful terminal interface
  • Typer: Modern CLI framework

"Fast, secure, and production-ready recommender systems for everyone."

🗓️ Deprecation Policy

We strive to maintain backward compatibility and provide clear deprecation warnings. Deprecated features will:

  • Be marked in the documentation and code with a warning.
  • Remain available for at least one minor release cycle.
  • Be removed only after clear notice in the changelog and release notes.

If you rely on a feature that is marked for deprecation, please open an issue to discuss migration strategies.

👩‍💻 Developer Quickstart

To get started as a contributor or to simulate CI locally:

# 1. Set up your full dev environment (Poetry, pre-commit, all dev deps)
make dev

# 2. Run all linters
make lint

# 3. Run all tests with coverage
make test

# 4. Run all pre-commit hooks
make precommit

# 5. Build the Docker image
make docker-build

# 6. Run tests inside Docker (as CI does)
make docker-test

# 7. Simulate the full CI pipeline (lint, test, coverage, Docker)
make ci

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

recsys_lite-1.0.0.tar.gz (32.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

recsys_lite-1.0.0-py3-none-any.whl (33.2 kB view details)

Uploaded Python 3

File details

Details for the file recsys_lite-1.0.0.tar.gz.

File metadata

  • Download URL: recsys_lite-1.0.0.tar.gz
  • Upload date:
  • Size: 32.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.9.23 Linux/6.11.0-1018-azure

File hashes

Hashes for recsys_lite-1.0.0.tar.gz
Algorithm Hash digest
SHA256 5e6a27bed1cbe86c96f4fe22850880e958976bf8f3cf003b9c1aff0108cf7fe0
MD5 d9a0466c962bd06ff71616db278b8743
BLAKE2b-256 15a73255907bb0d002d23014ff3be938282de0ce32959f2f148b05cf1302927b

See more details on using hashes here.

File details

Details for the file recsys_lite-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: recsys_lite-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 33.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.9.23 Linux/6.11.0-1018-azure

File hashes

Hashes for recsys_lite-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 08afa2dc954cb60ffc41983506c1de09dcf8f2d367fda82f9fd13a8098fd57c0
MD5 3fc574d6a88468012c03236279295021
BLAKE2b-256 7a09c8481dc654f552ac686fb8d3083b78c7ca9f8f55a0853fcc4870e466a1ce

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page