a framework for building and using data science

======
Lore
======

Lore is a Python data science framework to design, fit, and exploit data science
models from development to production. It codifies best practices to simplify
collaborating on models developed on a laptop with Jupyter notebooks and
deploying them into high-availability distributed production data centers.


Example
=======

::

$ pip install lore

$ lore new my_project # create file structure & virtualenv
$ lore workon my_project # change working directory & virtualenv
$ lore install my_project # get dependencies (including raw data snapshots)

$ lore generate my_model # create a model

$ lore task fit my_model # train the model

$ lore serve all & # start a default api process

$ curl -X POST -d '{"feature_1": "true"}' http://0.0.0.0:3000/my_model
{"class": "positive"}
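
The ``curl`` call above can also be issued from Python. The following is a minimal sketch using only the standard library (Python 3), not part of Lore itself. The helper names ``build_prediction_request`` and ``predict`` are hypothetical, and it assumes the served endpoint accepts a JSON body with a ``Content-Type: application/json`` header and answers with JSON, which the example above suggests but does not guarantee.

```python
import json
import urllib.request


def build_prediction_request(features, model="my_model", host="http://0.0.0.0:3000"):
    """Build (but do not send) a JSON POST request for a served model.

    The host, port, and model path mirror the curl example; adjust them
    to match your own project.
    """
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url="{}/{}".format(host, model),
        data=body,
        # Assumption: the API expects JSON; curl -d alone sends form encoding.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, **kwargs):
    """Send the request and decode the response (needs `lore serve` running)."""
    request = build_prediction_request(features, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(predict({"feature_1": "true"}))
```

Separating request construction from sending keeps the payload easy to inspect or unit test without a running server.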


Features
========

* **Repeatable:** Lore allows developers to work on multiple projects with
different versions of dependencies without conflict, while preserving
similarity with production. It removes manual handling of anaconda, brew,
apt-get, pyenv, virtualenv, docker et al.

* **Sharp:** Lore is as simple as you want it to be. Getting started takes a
handful of commands, but you've just unlocked the full depth of cutting edge
machine learning.

* **Scalable:** Lore projects are horizontally scalable, but start with
vertical scalability in a single thread to ease into the learning curve.

* **Transparent:** Lore doesn't wrap or hide or abstract the libraries you've
already learned how to use. Its goal is to remove boilerplate and
inconsistency from a typical workflow, without reinventing the wheel.

* **Efficient:** Lore adds minimal overhead while gluing all of the underlying
libraries together. We test for <1% performance impact during critical phases,
such as fitting models on production size datasets.

* **Mature:** Lore has all the bells and whistles you expect to work out of
the box in every environment. IO (csv, sql, pickle...), logging, tracking,
reporting, timing, testing, and deploying are all available, safe and secure,
with minimal configuration.
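
One way to check an overhead budget like the <1% figure mentioned under **Efficient** is to time the same work with and without a wrapping layer. The sketch below is illustrative only and is not how Lore's own tests are written. ``fit_directly`` and ``fit_through_wrapper`` are hypothetical stand-ins for an underlying library call and the same call reached through a framework.

```python
import timeit


def best_time(func, repeat=5, number=10):
    """Return the fastest of several timing runs, which damps scheduler noise."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


def relative_overhead(baseline_seconds, wrapped_seconds):
    """Fractional slowdown of the wrapped call relative to the baseline."""
    return (wrapped_seconds - baseline_seconds) / baseline_seconds


def fit_directly():
    # Stand-in for calling an underlying library's fit routine.
    return sum(i * i for i in range(10000))


def fit_through_wrapper():
    # Stand-in for the same work reached through a thin framework layer.
    return fit_directly()


if __name__ == "__main__":
    overhead = relative_overhead(
        best_time(fit_directly), best_time(fit_through_wrapper)
    )
    print("relative overhead: {:.2%}".format(overhead))
```

On a toy workload this small the measurement is dominated by noise; a meaningful check would time a fit on a production-size dataset, as the feature description says.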


Lore stands on the shoulders of giants
======================================

Lore is designed to be as fast and powerful as the underlying libraries.
It seamlessly supports workflows that utilize:

* airflow
* tensorflow
* keras
* scikit-learn
* pandas
* numpy
* sqlalchemy
* psycopg2
* protobuf
* gunicorn
* hub
* mani
* virtualenv, pyenv, python (2.7, 3.3+)


Commands
========

::

$ lore new
$ lore install
$ lore update
$ lore workon
$ lore generate [**all**, api, model, notebook, task] NAME
$ lore task
$ lore console
$ lore serve
$ lore start
$ lore stop


Project Structure
=================

::

├── .env.template            <- Template for environment variables for developers (mirrors production)
├── .python-version          <- keeps dev and production in sync (pyenv)
├── README.md                <- The top-level README for developers using this project.
├── requirements.txt         <- keeps dev and production in sync (pip)

├── docs/                    <- generated from src

├── notebooks/               <- explorations of data and models
│   └── my_exploration/
│       └── exploration_1.ipynb

├── src/
│   ├── __init__.py          <- loads the various components (makes this a module)
│   │
│   ├── api/                 <- external entry points to runtime models
│   │   └── __init__.py      <- loads the various components (makes this a module)
│   │
│   ├── config/              <- environment, logging, exceptions, metrics initializers
│   │   └── __init__.py      <- loads the various components (makes this a module)
│   │
│   ├── tasks/               <- run manually, via cron, or via airflow
│   │   ├── __init__.py      <- loads the various components (makes this a module)
│   │   └── my_model/
│   │       ├── etl.py
│   │       └── train.py
│   │
│   ├── data/                <- Scripts to move data between sources
│   │   ├── __init__.py      <- loads the various components (makes this a module)
│   │   └── etl/             <- etl sql between DBs (local/production too)
│   │       └── table_name.sql
│   │
│   ├── features/            <- abstractions for dealing with processed data
│   │   ├── __init__.py      <- loads the various components (makes this a module)
│   │   └── my_features.py
│   │
│   └── models/              <- Code that makes predictions
│       ├── __init__.py      <- loads the various components (makes this a module)
│       └── my_objective/
│           ├── deep_model.py
│           └── linear_model.py

└── tests/
    ├── api/
    ├── tasks/
    └── models/
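
To make the layout concrete, here is a small sketch that recreates the skeleton with ``pathlib``. It is only an illustration of the tree above, not the implementation behind ``lore new``, which also sets up a virtualenv and is the command to use in practice. The ``scaffold`` function and the constant names are hypothetical.

```python
import tempfile
from pathlib import Path

# Directories that the tree shows with an __init__.py (importable packages).
PACKAGES = [
    "src",
    "src/api",
    "src/config",
    "src/tasks",
    "src/data",
    "src/features",
    "src/models",
]

# Plain directories without an __init__.py in the tree.
DIRECTORIES = [
    "docs",
    "notebooks",
    "src/data/etl",
    "tests/api",
    "tests/tasks",
    "tests/models",
]

# Top-level files that keep development and production in sync.
FILES = [".env.template", ".python-version", "README.md", "requirements.txt"]


def scaffold(root):
    """Create an empty project skeleton under `root` and return its path."""
    root = Path(root)
    for package in PACKAGES:
        (root / package).mkdir(parents=True, exist_ok=True)
        (root / package / "__init__.py").touch()
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        (root / name).touch()
    return root


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        project = scaffold(tmp)
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```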



Design Philosophies & Inspiration
=================================

* Personal Pain
* Minimal Abstraction
* No code is better than no code (https://blog.codinghorror.com/the-best-code-is-no-code-at-all/)
* Convention over configuration (https://xkcd.com/927/)
* Sharp Tools (https://www.schneems.com/2016/08/16/sharp-tools.html)
* Rails (https://en.wikipedia.org/wiki/Ruby_on_Rails)
* Cookiecutter Data Science (https://drivendata.github.io/cookiecutter-data-science/)
* Gene Roddenberry (https://www.youtube.com/watch?v=0JLgywxeaLM)
