MNIST auto-encoder
mnist_ae – From Notebook to Python Package
This guide walks you step-by-step through turning the CIML25_MNIST_Intro_v6.ipynb notebook into a distributable Python package that you can install anywhere (even on TSCC). It assumes you already know how to run a Jupyter notebook, and that you have Python ≥ 3.8 available (Python 3.11 recommended).
0 Clone the repository
git clone https://github.com/<your-username>/mnist_ae.git
cd mnist_ae
Feel free to fork the project first if you want your own remote.
1 Set up a clean Python environment
Windows (PowerShell or cmd)
:: create a virtual-env in the project root
python -m venv .venv
:: activate it from cmd
.venv\Scripts\activate
# or activate it from PowerShell
.\.venv\Scripts\Activate.ps1
macOS / Linux (bash / zsh)
python3 -m venv .venv
source .venv/bin/activate
Upgrade pip & install build-time tools:
pip install --upgrade pip nbdev build wheel twine
Install project requirements (to run the notebook)
The notebook itself depends on PyTorch and torchvision (plus NumPy, etc.). The easiest way is to use the pinned list that comes with the repo:
pip install -r requirements.txt # installs CPU wheels by default
If you already have GPU-enabled PyTorch, feel free to skip this step or install only the libraries you are missing:
pip install torch torchvision
🗒️ Why a venv? Keeping build tools isolated avoids polluting your base Python and makes the process reproducible.
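To confirm that the virtual environment is actually active before installing anything, a small check like the following can help. It is a minimal sketch: the function name `in_virtualenv` is illustrative and not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, activate `.venv` again before running `pip install`.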
1½ Place the notebook in nbs/
If your starting file is CIML25_MNIST_Intro_v6.ipynb, move (or copy) it into the nbs/ directory and rename it to the more compact 01_mnist_intro.ipynb so nbdev can pick it up.
Windows
move CIML25_MNIST_Intro_v6.ipynb nbs\01_mnist_intro.ipynb
macOS / Linux
mv CIML25_MNIST_Intro_v6.ipynb nbs/01_mnist_intro.ipynb
nbdev scans all notebooks inside nbs/. The numeric prefix (01_, 02_, …) also sets the order of the generated documentation.
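To preview the order implied by the numeric prefixes, you can list the notebooks yourself. This is only an illustrative sketch: `notebooks_in_order` is a hypothetical helper, and it assumes zero-padded prefixes so that a plain string sort gives the intended order.

```python
from pathlib import Path
from typing import List


def notebooks_in_order(nbs_dir: str = "nbs") -> List[str]:
    """List notebook file names sorted by their (zero-padded) numeric prefix."""
    # Zero-padded prefixes such as 01_, 02_ sort correctly as plain strings.
    return sorted(p.name for p in Path(nbs_dir).glob("*.ipynb"))


if __name__ == "__main__":
    for name in notebooks_in_order():
        print(name)
```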
2 Run & explore the notebook
jupyter notebook nbs/01_mnist_intro.ipynb
Execute a few cells to verify the model trains as expected (each epoch should take only a few seconds on CPU).
3 Export code with nbdev
nbdev turns specially-marked cells into a Python module. The two directives you need to know are:
- #| default_exp mnist_training – appears once, tells nbdev which module file to create (mnist_training.py).
- #| export – placed on any cell whose code you want included in the library.
The intro notebook already contains these directives, so exporting is a one-liner:
nbdev_export # generates mnist_ae/mnist_training.py
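For orientation, here is roughly what directive-marked cells look like inside a notebook. The cell boundaries are shown as comments, and `scale_pixels` is a hypothetical helper used only for illustration; it is not taken from the intro notebook.

```python
# --- cell 1: declares the target module (appears once per notebook) ---
#| default_exp mnist_training

# --- cell 2: exported, ends up in mnist_ae/mnist_training.py ---
#| export
from typing import List


def scale_pixels(pixels: List[int]) -> List[float]:
    """Map 8-bit pixel values (0-255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]


# --- cell 3: no `#| export`, so it stays in the notebook only ---
print(scale_pixels([0, 255]))
```

Because the directives are ordinary Python comments, the cells run unchanged in Jupyter; only nbdev gives them meaning.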
(Optional) update metadata in settings.ini – package name, version, runtime requirements, author, etc. nbdev will read this file when we build the wheel.
3½ Sync metadata & version (optional but recommended)
Before building, open settings.ini and update:
version = 0.0.2 # bump each release
requirements = torch torchvision # runtime deps only
Then run
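nbdev_prepare # re-export modules, run the notebooks as tests, clean notebook metadata, regenerate the README
Inspect what nbdev generated
nbdev_prepare re-exports the library modules, runs the notebooks as tests, strips notebook metadata, and regenerates the README; the package metadata itself is read from settings.ini at build time. Open the mnist_ae/ folder and look at the newly-created or updated modules.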
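If you want to double-check the values in settings.ini before building, the file can be read with the standard library. The sketch below assumes the usual nbdev layout with a single `[DEFAULT]` section; the `SAMPLE` text and the `read_release_info` helper are illustrative, not part of the repo.

```python
import configparser
from typing import Dict, List, Union

# Illustrative excerpt; a real settings.ini contains many more keys.
SAMPLE = """
[DEFAULT]
lib_name = mnist_ae
version = 0.0.2
requirements = torch torchvision
"""


def read_release_info(ini_text: str) -> Dict[str, Union[str, List[str]]]:
    """Extract the version and runtime requirements from settings.ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    defaults = parser["DEFAULT"]
    return {
        "version": defaults["version"],
        # Requirements are space-separated in settings.ini.
        "requirements": defaults["requirements"].split(),
    }


if __name__ == "__main__":
    print(read_release_info(SAMPLE))
```

To inspect the real file, pass `open("settings.ini").read()` instead of `SAMPLE`.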
Recommendations:
- Do not mark long training loops or plotting cells with #| export. Keep exploratory code in the notebook; only export reusable library functions and models. Module-level loops inside the package run every time someone imports it and can waste GPU/CPU hours.
- The exported file can be a single, monolithic script, since notebooks aren’t always written with clean architecture in mind. After export, audit the code (or ask an advanced LLM; o3 from ChatGPT is recommended, as is Gemini 2.5 or any other reasoning model) and refactor it into small, SOLID-compliant modules.
Use this starter prompt to guide the refactor:
You are a senior Python engineer. Rewrite the file `mnist_ae/mnist_training.py` so that:
• Each class/function has one clear responsibility (Single-Responsibility Principle).
• Related functionality is grouped into modules (e.g. data, model, training, cli).
• Internal helpers are made private (_prefix).
• No global execution at import-time; provide a `main()` entry point.
• Add type hints and docstrings.
Return the full, refactored code as a valid Python package structure.
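The "no global execution at import-time" requirement is the one that most often trips people up, so here is a minimal sketch of the target shape. The `--epochs` and `--batch_size` flags match the commands used later in this guide; the `train` body is a placeholder and does not reproduce the notebook's training loop.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line options for a training run."""
    parser = argparse.ArgumentParser(description="Train the MNIST auto-encoder.")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--batch_size", type=int, default=128)
    return parser.parse_args(argv)


def train(epochs: int, batch_size: int) -> None:
    """Placeholder for the real training loop exported from the notebook."""
    print(f"training for {epochs} epoch(s) with batch size {batch_size}")


def main(argv: Optional[List[str]] = None) -> None:
    """Entry point: nothing above this function runs work at import time."""
    args = parse_args(argv)
    train(args.epochs, args.batch_size)


if __name__ == "__main__":
    # Runs only when the module is executed, e.g. via `python -m`.
    main()
```

With this layout, `import mnist_ae.mnist_training` is cheap, while `python -m mnist_ae.mnist_training --epochs 1` still starts a run.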
What is SOLID? It’s a set of five design guidelines for maintainable OO code:
- S — Single Responsibility: each module/class/function does one job.
- O — Open/Closed: code is open for extension but closed for modification.
- L — Liskov Substitution: derived classes can stand in for their base without breaking behaviour.
- I — Interface Segregation: prefer many small, specific interfaces over one large general-purpose interface.
- D — Dependency Inversion: depend on abstractions (interfaces), not concrete implementations.
Spend some time on this step; clean structure pays off later.
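As a concrete illustration of two of these principles (Single Responsibility and Dependency Inversion), the sketch below keeps data access behind a small interface so the consuming function never depends on a concrete loader. All names here (`BatchSource`, `InMemorySource`, `average_batch_mean`) are hypothetical and not part of the package.

```python
from typing import Iterable, List, Protocol


class BatchSource(Protocol):
    """Abstraction the rest of the code depends on (Dependency Inversion)."""

    def batches(self) -> Iterable[List[float]]:
        ...


class InMemorySource:
    """One job only: hand out batches that are already in memory."""

    def __init__(self, data: List[List[float]]) -> None:
        self._data = data

    def batches(self) -> Iterable[List[float]]:
        return iter(self._data)


def average_batch_mean(source: BatchSource) -> float:
    """One job only: compute the mean of the per-batch means."""
    means = [sum(batch) / len(batch) for batch in source.batches()]
    return sum(means) / len(means)


if __name__ == "__main__":
    print(average_batch_mean(InMemorySource([[0.0, 1.0], [1.0, 1.0]])))
```

Swapping in a source backed by a real data loader would then require no change to `average_batch_mean`, which is the point of depending on the abstraction.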
4 Build the wheel (binary package)
python -m build --wheel # produces dist/mnist_ae-0.0.1-py3-none-any.whl
The file inside dist/ is a portable package that can be installed with pip install <file>.whl on any machine that has Python ≥ the minimum you set.
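The wheel's file name encodes what it is compatible with: distribution, version, Python tag, ABI tag, and platform tag. The sketch below splits a name into those parts; `parse_wheel_name` is an illustrative helper and deliberately ignores the optional build tag that some wheels carry.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its five mandatory components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        # A sixth component would be the optional build tag.
        raise ValueError("wheel names with a build tag are not handled in this sketch")
    return WheelName(*parts)


if __name__ == "__main__":
    print(parse_wheel_name("mnist_ae-0.0.1-py3-none-any.whl"))
```

Here `py3-none-any` means any Python 3 interpreter, no ABI requirement, and any platform, which is why a pure-Python wheel built on your laptop also installs on a cluster.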
4½ Test the wheel locally
pip install --force-reinstall dist/mnist_ae-*.whl
python -m mnist_ae.mnist_training --epochs 1 --batch_size 128 # quick sanity run
4¾ Run unit tests from source
If you’re working from the cloned repo rather than the installed wheel, install the package in editable mode so Python can find it:
pip install -e ".[dev]" # or just `pip install -e .` if you skipped dev extras
pytest --cov=mnist_ae -q # run tests **and** show coverage %
If mnist_ae is not importable you’ll get a ModuleNotFoundError; the editable install (or adding the repo root to PYTHONPATH) solves that.
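Whether the package came from the wheel or from an editable install, you can confirm which version the active environment actually sees. This is a minimal sketch using the standard library; `installed_version` is an illustrative helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "mnist_ae") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("mnist_ae is not installed in this environment")
    else:
        print("mnist_ae version:", version)
```

A `None` result usually means the virtual environment is not active or the install step was skipped.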
5 Publish to (Test)PyPI
(skip if you only need a local wheel)
- Create an account on pypi.org (and on test.pypi.org for dry-runs).
- Generate an API token: Settings → API tokens → New token.
- Upload:
# one-time: store credentials safely or export as env-vars
export TWINE_USERNAME="__token__"
export TWINE_PASSWORD="pypi-********************************"
# upload to TestPyPI first
python -m twine upload --repository testpypi dist/*
# if everything looks good, push to the real PyPI
python -m twine upload dist/*
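A failed upload is often just a missing environment variable. The sketch below checks for the two variables exported above before you call twine; `missing_twine_vars` is a hypothetical helper, not something twine provides.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("TWINE_USERNAME", "TWINE_PASSWORD")


def missing_twine_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_twine_vars()
    if missing:
        print("set these before uploading:", ", ".join(missing))
    else:
        print("twine credentials found in the environment")
```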
Once published, anyone can install with
pip install mnist_ae # replace with the final project name
6 Install & run on TSCC (or any HPC)
# inside a job script or interactive srun session
module load python3 cuda # adjust to cluster versions
python -m venv ~/mnist_env && source ~/mnist_env/bin/activate
# now install your package from PyPI
pip install mnist_ae
# (alternative) install a local wheel; copy your *.whl to TSCC first (e.g. with scp)
# pip install ~/dist/mnist_ae-0.0.1-py3-none-any.whl
# launch training
python -m mnist_ae.mnist_training --epochs 5 --batch_size 256
Check the time it takes for these 5 epochs and compare it to your local run. Do you spot any significant difference?
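For a fair comparison it helps to time each epoch the same way on both machines. The following is a minimal sketch: `time_epochs` and the `run_epoch` callback are hypothetical and would need to wrap whatever per-epoch function your refactored package exposes.

```python
import time
from typing import Callable, List


def time_epochs(run_epoch: Callable[[int], None], epochs: int) -> List[float]:
    """Run `run_epoch` once per epoch and return each duration in seconds."""
    durations = []
    for epoch in range(epochs):
        start = time.perf_counter()
        run_epoch(epoch)
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    # Stand-in workload; replace with a call into your training code.
    durations = time_epochs(lambda epoch: time.sleep(0.01), epochs=5)
    for epoch, seconds in enumerate(durations, start=1):
        print(f"epoch {epoch}: {seconds:.3f} s")
```

The first epoch is often slower than the rest because of data download and warm-up, so compare the later epochs as well as the total.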
Appendix – Common commands (Windows vs Unix)
| Task | Windows (PowerShell) | macOS / Linux (bash) |
|---|---|---|
| Activate venv | .\.venv\Scripts\Activate.ps1 | source .venv/bin/activate |
| Deactivate venv | deactivate | deactivate |
| Upgrade pip | python -m pip install --upgrade pip | pip install --upgrade pip |
| Run nbdev export | nbdev_export | nbdev_export |
| Build wheel | python -m build --wheel | python -m build --wheel |
| Upload with twine | python -m twine upload dist/* | same |
| Install wheel | pip install dist\mnist_ae-*.whl | pip install dist/mnist_ae-*.whl |
That’s it! You’ve gone from a Jupyter notebook to a published, pip-installable Python package 🎉