This is a package for creating a data science project structure

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- Portuguese
Operating System
- OS Independent
Programming Language

Project description

Data Alfred: Structuring Data Projects in Seconds

About the Project

Data Alfred is a tool that streamlines the setup of new data projects by creating the essential folder and file structure in seconds. Aimed at data teams and data analysis projects, it establishes a solid, standardized foundation, allowing data scientists and analysts to focus on what truly matters: extracting valuable insights from data. Inspired by https://drivendata.github.io/cookiecutter-data-science/

Features

Creates directories for raw, preprocessed, and mischaracterized data.
Initializes directories for documentation, machine learning models, Jupyter notebooks, frontend for interactive visualizations, and references.
Prepares a source code directory structure for visualization, data manipulation, feature engineering, models, and testing.
Generates initial files, including .env for environment variables, .gitignore, README.md, requirements.txt for dependencies, setup.py for package installation, test_environment.py for testing, and a Dockerfile for containerization.

Prerequisites

To use Data Alfred, you will need to have Python installed on your system. The tool has been developed and tested in environments supporting Python 3.6 or newer.
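
A quick way to confirm the prerequisite is to check the interpreter version before installing. The sketch below is illustrative only and is not part of Data Alfred: the helper name meets_minimum_version is hypothetical, and the 3.6 threshold is simply taken from the statement above.

```python
import sys

# Minimum version taken from the prerequisite stated above.
MINIMUM_VERSION = (3, 6)


def meets_minimum_version(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if the given (or running) interpreter is new enough.

    Hypothetical helper for illustration; not provided by data-alfred.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only the (major, minor) pair.
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if meets_minimum_version():
        print("Python", current, "is supported")
    else:
        print("Python", current, "is older than 3.6; please upgrade")
```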

How to Install

pip install data-alfred

How to Use

import data_alfred

data_alfred.create_project_structure()

Data Alfred will take care of the rest, creating the necessary directory and file structure for your data project.
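
The description above does not say where the structure is created. Assuming it is written relative to the current working directory (an assumption, not something the project states), a small wrapper can keep the generated files inside a dedicated folder. The names working_directory and scaffold_in are hypothetical helpers written for this sketch.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to `path`."""
    previous = os.getcwd()
    os.makedirs(path, exist_ok=True)  # create the target folder if needed
    os.chdir(path)
    try:
        yield path
    finally:
        os.chdir(previous)  # always restore the original directory


def scaffold_in(path, scaffold):
    """Call `scaffold()` with `path` as the current working directory.

    `scaffold` is any zero-argument callable; the assumption here is that
    it creates files relative to the current working directory.
    """
    with working_directory(path):
        return scaffold()
```

With the package installed, this could be used as scaffold_in("my_project", data_alfred.create_project_structure), provided the assumption about the working directory holds.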

Created Directory Structure

data/: Subdivided into preprocessed, raw, and mischaracterized for different stages of data handling.
docs/: Contains mkdocs.md and config.yml for project documentation.
models/: Intended to store trained machine learning models.
notebooks/: For Jupyter notebooks used in data analysis and exploration.
frontend/: Includes __init__.py and streamlit_app.py for developing data visualization applications.
references/: To store project references and resources.
reports/: Intended for data analysis reports and visualizations.
src/: Contains subdirectories for visualization, data, features, models, and tests, along with an __init__.py to treat the contents as a Python package.
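
To check that a generated project matches the layout listed above, a short script can report anything that is missing. This is a sketch under assumptions: the exact nesting (for example src/visualization or data/raw) is inferred from the descriptions above rather than confirmed against the package, and missing_entries is a hypothetical helper, not part of Data Alfred.

```python
from pathlib import Path

# Directories named in the list above (nesting is an assumption).
EXPECTED_DIRS = [
    "data/raw",
    "data/preprocessed",
    "data/mischaracterized",
    "docs",
    "models",
    "notebooks",
    "frontend",
    "references",
    "reports",
    "src/visualization",
    "src/data",
    "src/features",
    "src/models",
    "src/tests",
]

# Files named in the list above.
EXPECTED_FILES = [
    "docs/mkdocs.md",
    "docs/config.yml",
    "frontend/__init__.py",
    "frontend/streamlit_app.py",
    "src/__init__.py",
]


def missing_entries(root="."):
    """Return the expected directories and files not found under `root`."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    for entry in missing_entries("."):
        print("missing:", entry)
```

An empty result means every listed directory and file was found; any reported entry may simply reflect a naming difference between this sketch and the installed version.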

Contributing

Contributions to Data Alfred are welcome! If you have a suggestion to improve this tool, feel free to open an issue or pull request on the project repository. Let's work together to make starting data projects a quick and effortless task!

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Alestan Alves - https://github.com/alestanalves

Project Link: https://github.com/TOTVS-Privacidade-de-Dados/data-alfred

Release history

- 0.2.6 (this version): Mar 1, 2024
- 0.2.5: Feb 26, 2024
- 0.2.4: Feb 23, 2024
- 0.2.2: Feb 23, 2024
- 0.2.1: Feb 23, 2024
- 0.2.0: Feb 23, 2024

Download files

Source Distribution

data-alfred-0.2.6.tar.gz (3.2 kB), uploaded Mar 1, 2024 (Source)

Built Distribution

data_alfred-0.2.6-py3-none-any.whl (3.7 kB), uploaded Mar 1, 2024 (Python 3)

Hashes for data-alfred-0.2.6.tar.gz

Algorithm	Hash digest
SHA256	`f20a9300c2a1af0f6c341827be21c146e0e09ac4515ac0b9ca3ae93669e2134a`
MD5	`ba80c07fe5a430ec9469383a15c98874`
BLAKE2b-256	`a04ca1b7ffc574723fb6378fcbce25819be3f936a6b9fb9910941a200c7ceda0`

Hashes for data_alfred-0.2.6-py3-none-any.whl

Algorithm	Hash digest
SHA256	`6d8c8390f32768ffc42213b705c48240c88a799f9f39271691280b3fae9fb21f`
MD5	`214586cc7067d15fa30db835567eca3b`
BLAKE2b-256	`aa376436381bbefc3bd78541e26d77d54df635cd418b111d608496b69d08c16f`