Package for practical and efficient data science in Python. Initially written for the data-science-keras repo.
Data science projects with Keras (Poetry Version)
Author: Angel Martinez-Tenor
Repository: https://github.com/angelmtenor/data-science-keras
This repo contains a set of data science projects solved with artificial neural networks implemented in Keras. It is based on a set of use cases from Udacity, Coursera & Kaggle.
The repo also introduces a minimal package, ds_boost, initially implemented as a helper for this repo.
Disclaimer: This notebooks-based repo was developed in early 2018. Since July 2022, I have been updating it using the best practices I've learned from implementing solutions in production environments during my experience as a lead data scientist.
A non-poetry version of this repo is available in the branch no-poetry
Scenarios
Classification models
- Enron Scandal: Identifies Enron employees who may have committed fraud
- Property Maintenance Fines: Predicts the probability of a set of blight tickets to be paid on time
- Sentiment IMDB: Predicts positive or negative sentiments from movie reviews (NLP)
- Spam detector: Predicts the probability that a given email is a spam email (NLP)
- Student Admissions: Predicts student admissions to graduate school at UCLA
- Titanic: Predicts survival probabilities from the sinking of the RMS Titanic
Regression models
- Bike Rental: Predicts daily bike rental ridership
- House Prices: Predicts house sales prices from Ames Housing database
- Simple tickets: Predicts the number of tickets requested by different clients
Recurrent models
- Machine Translation: Translates sentences from English to French (NLP)
- Simple Stock Prediction: Predicts Alphabet Inc. stock price
- Text generator: Creates an English language sequence generator (NLP)
Social network models
- Network: Predicts missing salaries and new email connections from a company's email network
Setup & Usage
Python 3.8+ required. Conda environment with Python 3.10 suggested
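Before installing anything, the version requirement stated above can be checked from the interpreter itself. The following is a minimal sketch; the helper name meets_minimum is illustrative and is not part of ds_boost.

```python
import sys

MIN_VERSION = (3, 8)  # minimum Python version stated in this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Running it inside the activated conda environment confirms which interpreter the notebooks will use.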
- Clone the repository using git:
  git clone https://github.com/angelmtenor/data-science-keras.git
- Go to the root folder of the repo and use or create a new conda environment for development:
  conda create -n dev python=3.10 -y && conda activate dev
- Install the minimal package developed as a helper for this repo:
  pip install dist/ds_boost-0.1.0-py3-none-any.whl
- Open the desired project/s with Jupyter Notebook:
  cd data-science-keras
  jupyter notebook
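After the steps above, one way to confirm that the helper package landed in the active environment is to query its installed metadata. This is a sketch using only the standard library; the distribution name "ds_boost" is assumed from the wheel file name, and installed_version is a hypothetical helper, not a ds_boost function.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "ds_boost" is assumed from the wheel name ds_boost-0.1.0-py3-none-any.whl
    version = installed_version("ds_boost")
    if version is None:
        print("ds_boost is not installed in this environment")
    else:
        print(f"ds_boost version: {version}")
```

If the package is reported as missing, the most likely cause is that pip ran in a different environment from the one that launches Jupyter.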
Development Mode
In the root folder of the cloned repository, install all the required dev packages and the ds-boost mini package (Make required):
make setup
To install TensorFlow with GPU support, follow the instructions in this guide: Install TensorFlow GPU.
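Once TensorFlow is installed, a short check can show whether it actually sees a GPU. This sketch assumes TensorFlow 2.x and falls back to an empty list when TensorFlow is not importable; visible_gpus is an illustrative name, not part of this repo.

```python
def visible_gpus():
    """Return the GPU devices TensorFlow can see, or [] if TensorFlow is not installed."""
    try:
        import tensorflow as tf  # imported lazily so the check works without TensorFlow
    except ImportError:
        return []
    return list(tf.config.list_physical_devices("GPU"))


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"GPUs visible to TensorFlow: {len(gpus)}")
```

An empty result with TensorFlow installed usually points to a driver or CUDA library mismatch rather than a problem with the notebooks.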
QA (manual pre-commit):
make qa
Development Tools Required:
A container/machine with Conda, Git and Poetry, set up as closely as possible to the definition in .devcontainer/Dockerfile:
- This Dockerfile contains a non-root user, so the same configuration can be applied to a WSL Ubuntu machine and any Debian/Ubuntu cloud machine (Vertex AI Workbench, Azure VM ...)
- If you have an Ubuntu/Debian machine with a non-root user (e.g. Ubuntu in WSL, Vertex AI VM ...), just install the tools from the "non-root user (no sudo)" section of .devcontainer/Dockerfile (sudo apt-get install <software> may be required)
- A pre-configured cloud VM usually has Git and Conda pre-installed, so those steps can be skipped
- The development container defined in .devcontainer/Dockerfile can be used directly for a fast setup (Docker required). With Visual Studio Code, just open the root folder of this repo, press F1 and select the option Dev Containers: Open Workspace in Container. The container will open the same workspace after the Docker image is built.
Contributing
Check out the contributing guidelines
License
ds_boost was created by Angel Martinez-Tenor. It is licensed under the terms of the MIT license.
Credits
ds_boost was created from a Data Science Template developed by Angel Martinez-Tenor. The template was built upon py-pkgs-cookiecutter (https://github.com/py-pkgs/py-pkgs-cookiecutter).