A template for nbdev-based project

buildNanoGPT

buildNanoGPT is based on Andrej Karpathy’s build-nanoGPT repo and his lecture Let’s reproduce GPT-2 (124M), with added notes and details for teaching purposes. It is built with nbdev, which enables package development, testing, documentation, and dissemination all in one place: Jupyter Notebook, or in my case Jupyter notebooks in Visual Studio Code 😄.

Literate Programming

flowchart LR
  A(Andrej's build-nanoGPT) --> C((Combination))
  B(Jeremy's nbdev) --> C
  C -->|Literate Programming| D(buildNanoGPT)

Disclaimers

buildNanoGPT is based on Andrej Karpathy’s build-nanoGPT and his “Neural Networks: Zero to Hero” lecture series. Andrej needs no introduction in the field of deep learning.

Andrej released a series of lectures called Neural Networks: Zero to Hero, which I found extremely educational and practical. I am reviewing the lectures and creating notes for myself and for teaching purposes.

I developed makemore2023 using nbdev, which was created by Jeremy Howard, who also needs no introduction in the field of deep learning. Jeremy also created the fastai deep learning library and courses, which are extremely influential. I highly recommend fastai if you are interested in starting your journey in ML and DL.

nbdev is a powerful tool for efficiently developing, building, testing, documenting, and distributing software packages all in one place: Jupyter Notebook, or Jupyter notebooks in VS Code, which is what I use.

If you study lectures by Andrej and Jeremy, you will probably notice that both are great educators who use top-down and bottom-up approaches in their teaching. Andrej predominantly uses the bottom-up approach, while Jeremy predominantly uses the top-down one. I am personally fascinated by both educators and have found value in both of them, and I hope you will too!

Usage

Prepare FineWeb-Edu-10B data

from buildNanoGPT import data
import tiktoken
import numpy as np

enc = tiktoken.get_encoding("gpt2")
eot = enc._special_tokens['<|endoftext|>']  # end-of-text token
eot
# Output: 50256

# Reference tokens stored as uint16 (every GPT-2 token id is below 2**16)
t_ref = [eot]
t_ref.extend(enc.encode("Hello, world!"))
t_ref = np.array(t_ref).astype(np.uint16)
t_ref
# Output: array([50256, 15496,    11,   995,     0], dtype=uint16)

# The same reference tokens stored as int32
t_ref = [eot]
t_ref.extend(enc.encode("Hello, world!"))
t_ref = np.array(t_ref).astype(np.int32)
t_ref
# Output: array([50256, 15496,    11,   995,     0], dtype=int32)

# Tokenize the same text with the package helper
doc = {"text": "Hello, world!"}
t_test = data.tokenize(doc)
t_test
# Output: array([50256, 15496,    11,   995,     0], dtype=uint16)

# The helper's output matches the reference tokens
assert np.all(t_ref == t_test)
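
The check above suggests what `data.tokenize` returns: the end-of-text id followed by the encoded text, stored as unsigned 16-bit integers. The sketch below mimics that behavior using only the standard library. It is an illustration under stated assumptions, not the package's actual implementation: `tokenize_sketch` and `replay_encoder` are hypothetical names, and the toy encoder simply replays the ids printed above instead of calling tiktoken.

```python
from array import array
from typing import Callable, Dict, List

EOT = 50256  # id of <|endoftext|> in the GPT-2 vocabulary, as printed above


def tokenize_sketch(doc: Dict[str, str], encode: Callable[[str], List[int]]) -> array:
    """Hypothetical stand-in for data.tokenize: EOT delimiter, then token ids, as uint16."""
    tokens = [EOT]                       # the delimiter comes first
    tokens.extend(encode(doc["text"]))   # then the ids of the document text
    if not all(0 <= t < 2**16 for t in tokens):
        raise ValueError("token id does not fit in 16 bits")
    return array("H", tokens)            # "H" = unsigned 16-bit integer


def replay_encoder(text: str) -> List[int]:
    """Toy encoder (assumption): replays the ids shown above for one known string."""
    table = {"Hello, world!": [15496, 11, 995, 0]}
    return table[text]


packed = tokenize_sketch({"text": "Hello, world!"}, replay_encoder)
print(list(packed))     # [50256, 15496, 11, 995, 0]
print(packed.itemsize)  # bytes per token
```

Storing ids in 16 bits halves the footprint compared with 32-bit integers, which is presumably why the package returns `uint16`; the range check guards against an id that would not fit.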
# Download and Prepare the FineWeb-Edu-10B sample Data
data.edu_fineweb10B_prep(is_test=True)
Resolving data files:   0%|          | 0/1630 [00:00<?, ?it/s]

Loading dataset shards:   0%|          | 0/98 [00:00<?, ?it/s]

'Hello from `prepare_edu_fineweb10B()`! if you want to download the dataset, set is_test=False and run again.'

Prepare HellaSwag Evaluation data

data.hellaswag_val_prep(is_test=True)
'Hello from `hellaswag_val_prep()`! if you want to download the dataset, set is_test=False and run again.'
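
Both prep calls above return a short message when `is_test=True` and ask you to set `is_test=False` to actually download. A dry-run guard of that kind can be sketched as follows. This is a guess at the pattern, not the package's code: `prep_sketch` and `fake_download` are hypothetical names, and the real functions may be structured differently.

```python
from typing import Callable, List


def prep_sketch(download: Callable[[], int], is_test: bool = True) -> str:
    """Hypothetical dry-run guard: skip the expensive download unless is_test is False."""
    if is_test:
        return "Dry run only. Set is_test=False and run again to download."
    n_files = download()  # only reached when the caller opts in
    return f"Downloaded {n_files} files."


calls: List[str] = []


def fake_download() -> int:
    """Stand-in for a real download; records that it was called."""
    calls.append("download")
    return 3


print(prep_sketch(fake_download))                 # dry run, nothing is fetched
print(prep_sketch(fake_download, is_test=False))  # runs the stand-in download
```

Defaulting to the dry run keeps a documentation build or test run from triggering a multi-gigabyte download by accident.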

How to install

The buildNanoGPT package is available on PyPI and can be installed with the following command.

pip install buildNanoGPT
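
After installing, one quick way to confirm that the package is visible to the current interpreter is to query its metadata. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("buildNanoGPT")
if found is None:
    print("buildNanoGPT is not installed in this environment")
else:
    print(f"buildNanoGPT {found} is installed")
```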

Developer install

If you want to develop buildNanoGPT yourself, please use an editable installation.

git clone https://github.com/hdocmsu/buildNanoGPT.git

pip install -e "buildNanoGPT[dev]"

You also need to use an editable installation of nbdev, fastcore, and execnb.

Happy Coding!!!

Note: buildNanoGPT is currently a work in progress (WIP).

Download files

Source Distribution

buildnanogpt-0.0.5.tar.gz (31.1 kB)

Uploaded Source

Built Distribution

buildNanoGPT-0.0.5-py3-none-any.whl (31.4 kB)

Uploaded Python 3
