My package description

Titanic package

Motivation

Many novice data scientists begin their journey by building models on the well-known Titanic dataset. They tend to do so in Jupyter notebooks, which are a nice tool for EDA and for building simple models. However, when it comes to pushing those models to production, notebooks become inconvenient.

In fact, several steps are needed to prepare a model for production, such as organizing the code into modules, writing tests, and adding linters and type checks. However, I noticed that the majority of my students are not aware of these steps.

Therefore, I created this repository to teach my students how to move from Jupyter notebooks to production code and wrap a model in a Python package, so that it can be used later in different applications, such as a web application. As an example, the model in this repo is built on the Titanic dataset, so the resulting package is called "titanic_model".

This repo is heavily influenced by the excellent Udemy course "Deployment of Machine Learning Models".

Code structure

Configs

The model parameters are set via configs, which are stored as YAML files. Parameter values can be set in the titanic_model/config.yml file. The configs are parsed and validated in the titanic_model/config/core.py module, using the StrictYAML library for parsing and the Pydantic library for type-checking the values.
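
The validate-on-load idea can be sketched with the standard library alone. This is not the code from titanic_model/config/core.py: a dataclass stands in for the Pydantic model, a plain dict stands in for the parsed YAML, and the field names (test_size, random_state, target) are hypothetical. The input values are strings because StrictYAML returns scalars as strings when no schema is given.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical config schema; the field names are illustrative only."""

    test_size: float
    random_state: int
    target: str


def validate_config(raw: dict) -> ModelConfig:
    """Coerce raw values to the declared types and fail loudly on bad input."""
    expected = {f.name: f.type for f in fields(ModelConfig)}
    missing = expected.keys() - raw.keys()
    unknown = raw.keys() - expected.keys()
    if missing or unknown:
        raise ValueError(
            f"missing keys: {sorted(missing)}, unknown keys: {sorted(unknown)}"
        )
    coerced = {}
    for name, typ in expected.items():
        try:
            coerced[name] = typ(raw[name])
        except (TypeError, ValueError) as exc:
            raise ValueError(f"{name!r} must be {typ.__name__}: {raw[name]!r}") from exc
    return ModelConfig(**coerced)


# Stand-in for the parsed YAML file: every value arrives as text.
raw_config = {"test_size": "0.2", "random_state": "42", "target": "Survived"}
config = validate_config(raw_config)
print(config)
```

Failing at load time means a mistyped value in the config file is reported before any training starts, rather than surfacing later as an obscure error inside the pipeline.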

Setting the pipeline and training

The pipeline is defined in titanic_model/pipeline.py, and training is implemented in titanic_model/train_pipeline.py. All data processing steps follow the same scikit-learn style, including the custom transformations stored in titanic_model/processing/features.py.
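
To illustrate what a transformation "in scikit-learn style" looks like, here is a minimal sketch of the fit/transform protocol. The class name ExtractTitle and its behavior are hypothetical and are not taken from features.py. To keep the sketch dependency-free it works on a list of dicts; a real transformer would typically subclass BaseEstimator and TransformerMixin from sklearn.base and operate on a DataFrame.

```python
import re


class ExtractTitle:
    """Hypothetical transformer: derive a 'Title' field from the 'Name' field."""

    def __init__(self, variable: str = "Name", new_variable: str = "Title"):
        self.variable = variable
        self.new_variable = new_variable

    def fit(self, rows, y=None):
        # Nothing is learned here; fit exists so the class follows the protocol.
        return self

    def transform(self, rows):
        out = []
        for row in rows:
            # Titanic names look like "Surname, Title. Given names".
            match = re.search(r",\s*([^.]+)\.", row[self.variable])
            title = match.group(1).strip() if match else "Other"
            # Return new dicts so the input rows are left unchanged.
            out.append({**row, self.new_variable: title})
        return out


rows = [
    {"Name": "Snyder, Mrs. John Pillsbury (Nelle Stevenson)"},
    {"Name": "Unknown"},
]
transformed = ExtractTitle().fit(rows).transform(rows)
print([row["Title"] for row in transformed])  # ['Mrs', 'Other']
```

Because every step exposes the same fit and transform methods, custom steps and library steps can be chained in one pipeline and fitted with a single call.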

Making predictions

The prediction code is in titanic_model/predict.py. The input data is validated before every prediction; the validation code can be found in titanic_model/processing/validation.py.
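
The validate-before-predict step might look roughly like the sketch below. It is an assumption-laden illustration, not the contents of validation.py: the function name validate_inputs, the REQUIRED_FIELDS mapping, and the chosen subset of fields are all hypothetical. The input uses the dict-of-lists shape that the package's example input also uses.

```python
# Hypothetical subset of required fields and the Python types accepted for each.
REQUIRED_FIELDS = {
    "Pclass": int,
    "Sex": str,
    "Age": (int, float),
    "Fare": (int, float),
}


def validate_inputs(input_data: dict) -> list:
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    lengths = set()
    for name, expected in REQUIRED_FIELDS.items():
        if name not in input_data:
            errors.append(f"missing field: {name}")
            continue
        values = input_data[name]
        lengths.add(len(values))
        for i, value in enumerate(values):
            if not isinstance(value, expected):
                errors.append(f"{name}[{i}] has unexpected value {value!r}")
    if len(lengths) > 1:
        errors.append("fields have different lengths")
    return errors


good = {"Pclass": [1], "Sex": ["female"], "Age": [23], "Fare": [82.2667]}
bad = {"Pclass": [1], "Sex": ["female"], "Fare": ["cheap"]}
print(validate_inputs(good))
print(validate_inputs(bad))
```

Collecting all errors instead of raising on the first one lets the caller, for example a web application, report every problem with a request at once.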

How to run the code

The code can be run via tox, a convenient tool that sets up the environment and Python paths automatically and runs the required commands from the command line. The tox configuration can be found in the tox.ini file. The following commands can be run from the command line using tox:

Run training: if the directory for saved models does not exist yet, create it with mkdir ./titanic_model/trained_models, then run tox -e train
Run testing (via pytest): tox -e test_package
Run typechecking (via mypy): tox -e typechecks
Run style checks (via black, isort, mypy and flake8): tox -e stylechecks
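
The directory-creation step before training can also be done from Python, which avoids an error when the directory already exists. This is a small sketch, not part of the package: the helper name ensure_model_dir is hypothetical, and a temporary directory stands in for the repository root so the example can run anywhere.

```python
import tempfile
from pathlib import Path


def ensure_model_dir(root: Path) -> Path:
    """Create <root>/titanic_model/trained_models if needed and return its path."""
    model_dir = root / "titanic_model" / "trained_models"
    # parents=True creates missing parent directories;
    # exist_ok=True makes repeated calls harmless.
    model_dir.mkdir(parents=True, exist_ok=True)
    return model_dir


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_model_dir(Path(tmp))
    second = ensure_model_dir(Path(tmp))  # already exists, so nothing changes
    print(first == second, first.is_dir())
```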

How to install the package

To install the package, run:

pip install titanic-model

After that, you can use the package to make predictions:

from titanic_model.predict import make_prediction

# Example input
input_dict = {'PassengerId': [0], 'Pclass': [1], 'Name': ['Snyder, Mrs. John Pillsbury (Nelle Stevenson)'], 
              'Sex': ['female'], 'Age': [23], 'SibSp': [1], 'Parch': [0], 'Ticket': [21228], 'Fare': [82.2667], 
              'Cabin': ['B45'], 'Embarked': ['S']}

result = make_prediction(input_data=input_dict)

print(result)

Web application

Link to the app: https://github.com/Emilien-mipt/titanic-webapp

Link to the deployed app on Heroku: https://titanicwebapp.herokuapp.com/

CI (Continuous Integration)

CI has been added to the project using GitHub Actions to automate package testing and uploading to PyPI. The CI files are located in the .github/workflows/ directory. The CI.yml file runs the package tests automatically on every pull request and every push to the main branch, while the PyPI.yml file automatically uploads the package to PyPI every time a release is made for the corresponding version of the package.

Release history

0.0.6 (May 22, 2025)
0.0.2 (May 21, 2025), this version

Download files

Source Distribution

titanic_model_ashmanova_startzev-0.0.2.tar.gz (13.5 kB)

Uploaded May 21, 2025 Source

Built Distribution

titanic_model_ashmanova_startzev-0.0.2-py3-none-any.whl (11.6 kB)

Uploaded May 21, 2025 Python 3

File details

Details for the file titanic_model_ashmanova_startzev-0.0.2.tar.gz.

File metadata

Download URL: titanic_model_ashmanova_startzev-0.0.2.tar.gz
Upload date: May 21, 2025
Size: 13.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.0

File hashes

Hashes for titanic_model_ashmanova_startzev-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`2cfdeb60cfbbed9ca5e4fa9ba325e530181c2eddf5e6bb73bbd7bcdbb146a770`
MD5	`b264039900ee580bf1d06dfa1baffe34`
BLAKE2b-256	`cf74eeeb6b9595f9de254aa43b2ab7c11ff86c2c1ff876caf811889fa28187d3`

File details

Details for the file titanic_model_ashmanova_startzev-0.0.2-py3-none-any.whl.

File metadata

Download URL: titanic_model_ashmanova_startzev-0.0.2-py3-none-any.whl
Upload date: May 21, 2025
Size: 11.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.0

File hashes

Hashes for titanic_model_ashmanova_startzev-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5be4b4770673a9655fe7ceeb4f036795c8fe483ac39e8373c99a5a2baa1c17cf`
MD5	`cece0a44fc7aca3ccf5120197c644025`
BLAKE2b-256	`ea559fa22bc8cfd641c00854c38f606e97acde6240c63478203b32eb86d7396d`
