
A hindsight logging database for MLOps


FlorDB: Log-First Context Management for ML Practitioners


FlorDB brings experiment tracking, provenance, and reproducibility to your ML workflow—using the one thing every engineer already writes: logs.

Unlike heavyweight MLOps platforms, FlorDB doesn’t ask you to adopt a new UI, schema, or service. Just import it, log as you normally would, and gain full history, lineage, and replay capabilities across your training runs.

🚀 Why FlorDB?

  • Log-Driven Experiment Tracking
    No dashboards to configure or schemas to design. FlorDB turns your existing print() or log() calls into structured, queryable metadata.

  • Hindsight Logging & Replay
    Missed a metric? Add a log after the fact and replay past runs to capture it—no rerunning from scratch.

  • Reproducibility Without Friction
    Every run is versioned via Git, every hyperparameter is recorded, and every model checkpoint is linked and queryable—automatically.

  • Works With Your Stack
    Makefiles, Airflow, Slurm, HuggingFace, PyTorch—you don’t change your workflow. FlorDB fits in.

📦 Installation

pip install flordb

For contributors or bleeding-edge features:

git clone https://github.com/ucbrise/flor.git
cd flor
pip install -e .

📝 First Log in 30 Seconds

Requires a Git repository for automatic versioning.

$ mkdir flor_sandbox
$ cd flor_sandbox
$ git init
$ ipython

In [1]: import flordb as flor

In [2]: flor.log("message", "Hello ML World!")
message: Hello ML World!
Changes committed successfully

Retrieve logs anytime:

flor.dataframe("message")
         projid              tstamp filename          message
0  flor_sandbox 2025-10-13 18:13:48  ipython  Hello ML World!

🧪 Track Experiments with Zero Overhead

Drop FlorDB into your existing training script:

import flordb as flor

# Hyperparameters: defaults here, overridable from the CLI
lr = flor.arg("lr", 1e-3)
batch_size = flor.arg("batch_size", 32)

# Checkpoint model and optimizer state so past runs can be replayed
with flor.checkpointing(model=net, optimizer=optimizer):
    for epoch in flor.loop("epoch", range(epochs)):
        for x, y in flor.loop("step", trainloader):
            ...  # forward/backward pass and optimizer step
            flor.log("loss", loss.item())  # recorded with epoch and step context

Change hyperparameters from the CLI:

python train.py --kwargs lr=5e-4 batch_size=64

View metrics across runs:

flor.dataframe("lr", "batch_size", "loss")
        projid              tstamp  filename  epoch  step      lr batch_size                 loss
0  ml_tutorial 2025-10-13 18:18:14  train.py      1   500  0.0005         64  0.20570574700832367
1  ml_tutorial 2025-10-13 18:18:14  train.py      2   500  0.0005         64   0.1964433193206787
2  ml_tutorial 2025-10-13 18:18:14  train.py      3   500  0.0005         64  0.11040152609348297
3  ml_tutorial 2025-10-13 18:18:14  train.py      4   500  0.0005         64    0.155434250831604
4  ml_tutorial 2025-10-13 18:18:14  train.py      5   500  0.0005         64   0.0741351768374443
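
The printed result reads like a standard pandas DataFrame, so (assuming that is what flor.dataframe returns, as the output format suggests) you can analyze it with ordinary pandas operations. A minimal sketch:

import flordb as flor

df = flor.dataframe("lr", "batch_size", "loss")
df["loss"] = df["loss"].astype(float)  # logged values may come back as strings
print(df.loc[df["loss"].idxmin()])     # the run and step with the lowest recorded loss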

🔍 Hindsight Logging: Fix It After You See It

Forgot to log gradient norms?

flor.log("grad_norm", ...)

Just add the logging statement to the script and run:

python -m flordb replay grad_norm

FlorDB replays only what’s needed, injecting the new log across copies of historical versions and committing results.
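
In the training loop above, the added statement might look like the following sketch; the gradient-norm computation is an illustrative assumption for a PyTorch model named net, not part of FlorDB's API:

# Inside the step loop, after the backward pass (illustrative)
grad_norm = sum(p.grad.norm() ** 2 for p in net.parameters()
                if p.grad is not None) ** 0.5
flor.log("grad_norm", float(grad_norm))

After the replay completes, flor.dataframe("grad_norm") should return the backfilled values alongside the original runs.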

🏗 Real ML Systems Built on FlorDB

FlorDB powers full AI/ML lifecycle tooling:

  • Feature Stores & Model Registries
  • Document Parsing & Feedback Loops
  • Continuous Training Pipelines

See our Scan Studio and Document Parser examples for real-world integration.

📚 Publications

FlorDB is based on research from UC Berkeley’s RISE Lab and Arizona State University.

  • Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle (CIDR 2025)
  • The Management of Context in the ML Lifecycle (UCB Tech Report 2024)
  • Hindsight Logging for Model Training (PVLDB 2021)

🛠 License

Apache v2 License — free to use, modify, and distribute.


💡 Get Involved

FlorDB is actively developed. Contributions, issues, and real-world use cases are welcome!

GitHub: https://github.com/ucbrise/flor
Tutorial Video: https://youtu.be/mKENSkk3S4Y

