Bluprint

Bluprint is a command line utility for streamlined exploratory data science projects. It gives scripts and notebooks seamless access to configuration, data and shared code in a project with the following structure, created by bluprint create my_project:

my_project
├── conf
│   └── data.yaml
├── data
│   ├── emailed
│   │   └── messy.xlsx
│   └── user_processed.csv
├── notebooks
│   └── process.ipynb
└── my_project
    └── shared_code.py

Storing paths relative to the my_project/data directory in conf/data.yaml:

emailed:
    messy: 'emailed/messy.xlsx'
user:
    processed: 'user_processed.csv'
remote:
    extras: 's3://path/to/extra_data.csv'

allows you to access them from a Python script or Jupyter notebook anywhere within the project:

from bluprint_conf import load_data_yaml

data = load_data_yaml() # By default loads conf/data.yaml
print(data)
#> {
#>   'emailed': {
#>     'messy': '/path/to/my_project/data/emailed/messy.xlsx'
#>   },
#>   'user': {
#>     'processed': '/path/to/my_project/data/user_processed.csv'
#>   },
#>   'remote': {
#>     'extras': 's3://path/to/extra_data.csv'
#>   },
#> }

# Load data in a portable manner
import pandas as pd
messy_df = pd.read_excel(data.emailed.messy)
# Reading s3:// paths with pandas requires the optional s3fs package
extras_df = pd.read_csv(data.remote.extras)

# Load shared code functions as Python modules
from my_project.shared_code import transform_data
transformed_df = transform_data(messy_df, extras_df)

# Save output
transformed_df.to_csv(data.user.processed)
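
The path handling shown above can be illustrated with a minimal sketch. This is not Bluprint's actual implementation: find_project_root and resolve_data_paths are hypothetical helper names, and the sketch assumes the project root is the nearest parent directory containing conf/data.yaml. It also assumes relative entries resolve against the data directory while URIs such as s3://... pass through unchanged.

```python
from pathlib import Path
from urllib.parse import urlparse


def find_project_root(start: Path, marker: str = "conf/data.yaml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    Hypothetical helper: assumes the marker file identifies the project root.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


def resolve_data_paths(config: dict, data_dir: Path) -> dict:
    """Prefix relative paths with `data_dir`; leave URIs and absolute paths as-is.

    Hypothetical helper: assumes every leaf value in `config` is a string.
    """
    resolved = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Nested sections (e.g. 'emailed', 'user') are resolved recursively
            resolved[key] = resolve_data_paths(value, data_dir)
        elif urlparse(value).scheme or Path(value).is_absolute():
            # Entries such as 's3://...' already point to a fixed location
            resolved[key] = value
        else:
            resolved[key] = str(data_dir / value)
    return resolved


# Same entries as the conf/data.yaml example, written as a plain dict
config = {
    "emailed": {"messy": "emailed/messy.xlsx"},
    "user": {"processed": "user_processed.csv"},
    "remote": {"extras": "s3://path/to/extra_data.csv"},
}
resolved = resolve_data_paths(config, Path("/path/to/my_project/data"))
print(resolved["remote"]["extras"])
```

Because the root is discovered by walking up from the current location, a notebook in notebooks/ and a script in the project root would resolve the same entry to the same file, which is what makes the notebooks portable.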

Features

  • Write portable notebooks by loading configs with load_data_yaml() and load_config_yaml()

  • R and Python packages are automatically version-locked using renv and PDM

  • Import shared code as Python modules

  • Install shared code across projects with pip install

  • Use both Python and R notebooks in a single project (see Python/R projects)

  • Share projects by copying a project directory and running pdm install

  • Works with common IDEs (RStudio, VSCode), notebook tools for linting (nbqa), notebook version control (nbstripout) or workflows (Ploomber)

Documentation

Full documentation is available at https://igor-sb.github.io/bluprint/.

Installation

Install pipx and PDM. Then run:

pipx install bluprint

References

Bluprint integrates:

Bluprint is heavily inspired by these resources:

License

Bluprint is released under the MIT license.
