This is a package for creating a data science project structure
Data Alfred: Structuring Data Projects in Seconds
About the Project
Data Alfred is a tool that streamlines the setup of new data projects by creating the essential folder and file structure in seconds. Aimed at data teams and data analysis projects, it establishes a solid, standardized foundation so that data scientists and analysts can focus on what truly matters: extracting valuable insights from data.
Features
- Creates directories for raw, preprocessed, and mischaracterized data.
- Initializes directories for documentation, machine learning models, Jupyter notebooks, frontend for interactive visualizations, and references.
- Prepares a source code directory structure for visualization, data manipulation, feature engineering, models, and testing.
- Generates initial files, including .env for environment variables, .gitignore, README.md, requirements.txt for dependencies, setup.py for package installation, test_environment.py for testing, and a Dockerfile for containerization.
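As a rough illustration of what generating these starter files amounts to, the sketch below creates empty placeholders using the file names listed above. The helper name create_initial_files and the choice to leave every file empty are assumptions made for illustration; this is not Data Alfred's actual implementation, and the real tool may fill the files with template content.

```python
from pathlib import Path

# File names taken from the feature list above.
INITIAL_FILES = [
    ".env",
    ".gitignore",
    "README.md",
    "requirements.txt",
    "setup.py",
    "test_environment.py",
    "Dockerfile",
]


def create_initial_files(project_root):
    """Create empty placeholder files under project_root (hypothetical helper)."""
    root = Path(project_root)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name in INITIAL_FILES:
        path = root / name
        # touch(exist_ok=True) leaves the contents of an existing file untouched
        path.touch(exist_ok=True)
        created.append(path)
    return created
```

Because existing files are left alone, running the helper twice against the same folder should be harmless.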
Prerequisites
To use Data Alfred, you will need to have Python installed on your system. The tool has been developed and tested in environments supporting Python 3.6 or newer.
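If you are unsure whether your interpreter meets this requirement, a quick check along the following lines may help. It is a minimal sketch: the function name python_is_supported is hypothetical and not part of Data Alfred, and the only fact taken from this page is the 3.6 minimum.

```python
import sys

MIN_VERSION = (3, 6)  # minimum version stated in the prerequisites above


def python_is_supported(version_info=sys.version_info):
    """Return True if the given version tuple meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    if python_is_supported():
        print("Python version is supported")
    else:
        print("Python 3.6 or newer is required")
```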
How to Install
pip install data-alfred
How to Use
python data-alfred
- Data Alfred will take care of the rest, creating the necessary directory and file structure for your data project.
Created Directory Structure
- data/: Subdivided into preprocessed, raw, and mischaracterized for different stages of data handling.
- docs/: Contains mkdocs.md and config.yml for project documentation.
- models/: Intended to store trained machine learning models.
- notebooks/: For Jupyter notebooks of data analysis and exploration.
- frontend/: Includes __init__.py and streamlit_app.py for development of data visualization applications.
- references/: To store project references and resources.
- reports/: Intended for data analysis reports and visualizations.
- src/: Contains subdirectories for visualization, data, features, models, and tests, along with an __init__.py to treat the contents as a Python package.
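The layout described above could be reproduced by hand with a few lines of pathlib. The following is only a sketch under stated assumptions: build_structure is a hypothetical name, the files are created empty, and nothing here is taken from Data Alfred's source code. The directory and file names are transcribed from the list above.

```python
from pathlib import Path

# Directories transcribed from the list above.
DIRECTORIES = [
    "data/preprocessed",
    "data/raw",
    "data/mischaracterized",
    "docs",
    "models",
    "notebooks",
    "frontend",
    "references",
    "reports",
    "src/visualization",
    "src/data",
    "src/features",
    "src/models",
    "src/tests",
]

# Files are included only where the list above names them.
FILES = [
    "docs/mkdocs.md",
    "docs/config.yml",
    "frontend/__init__.py",
    "frontend/streamlit_app.py",
    "src/__init__.py",
]


def build_structure(project_root):
    """Create the directory tree and empty files (hypothetical helper)."""
    root = Path(project_root)
    for rel in DIRECTORIES:
        # parents=True creates intermediate folders such as data/ and src/
        (root / rel).mkdir(parents=True, exist_ok=True)
    for rel in FILES:
        (root / rel).touch(exist_ok=True)
    return root
```

Calling build_structure("my_project") would create the tree under a my_project folder in the current working directory.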
Contributing
Contributions to Data Alfred are welcome! If you have a suggestion to improve this tool, feel free to open an issue or pull request on the project repository. Let's work together to make starting data projects a quick and effortless task!
License
Distributed under the MIT License. See LICENSE for more information.
Contact
Alestan - privacidade@totvs.com.br
Project Link: https://github.com/TOTVS-Privacidade-de-Dados/data-alfred
Hashes for data_alfred-0.2.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 3b88e14c196cb51487491a200445d0453d4888e0a9ce8946ab54564c520e749e
MD5 | 98524cd46c3cc6e7a3ed55a7419c2782
BLAKE2b-256 | b3fec085dce2ab31eacea76d75d5ce26695f005efb5b3dce891babbe5a4fa375
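To check a downloaded wheel against the SHA256 digest listed above, something like the following sketch could be used. The function names are hypothetical, and the local file path is whatever location you saved the wheel to; only the digest itself comes from this page.

```python
import hashlib

# SHA256 digest published above for data_alfred-0.2.2-py3-none-any.whl
EXPECTED_SHA256 = "3b88e14c196cb51487491a200445d0453d4888e0a9ce8946ab54564c520e749e"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected digest."""
    return sha256_of_file(path) == expected
```

A mismatch suggests that the file is corrupted or is not the 0.2.2 wheel.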