babyGPT

An educational module for experimenting with unsupervised learning in large language modeling

Project description

Consult the module API page at

https://engineering.purdue.edu/kak/distBabyGPT/babyGPT-1.0.7.html

for all information about this module, including the latest changes to the code. That page lists all of the module functionality you can invoke in your own code.

Creating an instance of babyGPT:

        baby_gpt = babyGPT(
                            max_seq_length = max_seq_length,
                            batch_size = batch_size,
                            embedding_size = embedding_size,
                            num_basic_decoders = num_basic_decoders,
                            num_atten_heads = num_atten_heads,
                            optimizer_params = optimizer_params,
                            num_warmup_steps = num_warmup_steps,
                            masking = masking,
                            verify_text_corpus = False,
                            path_saved_model = {"decoder" : "./saved_decoder",
                                                "embedding_generator" : "./saved_embedding_generator",
                                               },
                          )
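
The constructor call above presumes that names such as max_seq_length and optimizer_params are already defined in your script. A minimal sketch of such definitions follows. Every value is an illustrative assumption rather than a default taken from the module, the key names inside optimizer_params are hypothetical, and per_head_size is a helper introduced here only for illustration. The one check it makes is the requirement, common to most multi-head attention implementations, that the embedding size divide evenly among the heads.

```python
# Illustrative hyperparameter values (assumptions, not module defaults).
max_seq_length = 64        # tokens per training sequence
batch_size = 16            # sequences per batch
embedding_size = 256       # dimensionality of each token embedding
num_basic_decoders = 4     # decoder blocks stacked in the master decoder
num_atten_heads = 4        # attention heads per decoder block
num_warmup_steps = 4000    # steps over which the learning rate ramps up
masking = True             # hide future tokens during training

# Hypothetical key names; check the API page for the ones the module expects.
optimizer_params = {"beta1": 0.9, "beta2": 0.98, "epsilon": 1e-9}


def per_head_size(embedding_size, num_atten_heads):
    """Return the slice of the embedding handled by each attention head."""
    if embedding_size % num_atten_heads != 0:
        raise ValueError("embedding_size must be divisible by num_atten_heads")
    return embedding_size // num_atten_heads


print(per_head_size(embedding_size, num_atten_heads))
```

Checking divisibility before building the model tends to surface a bad head count early, instead of as a tensor shape error deep inside training.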

Since babyGPT calls on TransformerFG for language modeling, you must also construct an instance of that class:

        xformer = baby_gpt.TransformerFG(
                            max_seq_length = max_seq_length,
                            embedding_size = embedding_size,
                            tokenizer_json = tokenizer_json,
                            num_warmup_steps = num_warmup_steps,
                            optimizer_params = optimizer_params,
                  )
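
The num_warmup_steps parameter suggests a learning-rate schedule that ramps up before decaying. As a sketch only, the function below implements the schedule from the original Transformer paper ("Attention Is All You Need"); whether babyGPT uses exactly this formula is an assumption, and the function name is hypothetical.

```python
def warmup_learning_rate(step, embedding_size, num_warmup_steps):
    """Learning rate at a 1-based training step.

    The rate grows linearly for num_warmup_steps steps and then decays
    in proportion to the inverse square root of the step number.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return embedding_size ** -0.5 * min(step ** -0.5,
                                        step * num_warmup_steps ** -1.5)


# The rate peaks at step == num_warmup_steps.
for step in (1, 2000, 4000, 8000):
    print(step, warmup_learning_rate(step, 256, 4000))
```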

Alongside TransformerFG, it is the MasterDecoderWithMasking class that is needed for next-token prediction in self-supervised learning:

        master_decoder = baby_gpt.MasterDecoderWithMasking(
                            xformer,
                            num_basic_decoders = num_basic_decoders,
                            num_atten_heads = num_atten_heads,
                            masking = masking
                         )
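
To make the role of the masking flag concrete, here is a minimal pure-Python sketch of causal masking as it is generally applied to attention scores. It is not the module's implementation; the function names are hypothetical. Position i is allowed to attend only to positions j <= i, so the prediction for the next token cannot peek at future tokens.

```python
import math


def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out positions."""
    weights = []
    for score_row, mask_row in zip(scores, mask):
        visible = [s for s, keep in zip(score_row, mask_row) if keep]
        peak = max(visible)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) if keep else 0.0
                for s, keep in zip(score_row, mask_row)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


scores = [[0.0, 1.0, 2.0],
          [1.0, 1.0, 5.0],
          [0.5, 0.5, 0.5]]
attention = masked_softmax(scores, causal_mask(3))
for row in attention:
    print(row)
```

In the first row only the first position is visible, so it receives all of the weight regardless of the raw scores for later positions.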


Finally, here is an instance of the dataloader you're going to need:

        dataloader = baby_gpt.ArticleDatasetWithBufferedContext(
                            gpt = baby_gpt,
                            tokenizer_json = tokenizer_json,
                            context_window_size = context_window_size,
                            context_buffer_size = context_buffer_size,
                            articles_dir = articles_dir,
                     )
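
The sketch below illustrates one plausible reading of the context_window_size and context_buffer_size parameters: each training window overlaps the previous one by context_buffer_size tokens, and the target sequence is the input shifted by one token. This interpretation and the function name are assumptions; the API page describes what the dataloader actually does.

```python
def buffered_context_windows(token_ids, context_window_size, context_buffer_size):
    """Yield (input, target) pairs for next-token prediction.

    Each window after the first begins with the last context_buffer_size
    tokens of the previous window (an assumed reading of "buffered context").
    """
    if not 0 <= context_buffer_size < context_window_size:
        raise ValueError("need 0 <= context_buffer_size < context_window_size")
    stride = context_window_size - context_buffer_size
    # One extra token is needed per window so the target can be shifted by one.
    for start in range(0, len(token_ids) - context_window_size, stride):
        chunk = token_ids[start:start + context_window_size + 1]
        yield chunk[:-1], chunk[1:]


pairs = list(buffered_context_windows(list(range(10)), 4, 2))
for inputs, targets in pairs:
    print(inputs, targets)
```

With ten tokens, a window of four, and a buffer of two, the windows start at tokens 0, 2, and 4, so consecutive inputs share two tokens.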

Release history

1.1.4    Apr 14, 2026
1.1.3    Sep 17, 2025
1.1.2    Aug 24, 2025
1.1.1    Aug 20, 2025
1.1.0    Aug 7, 2025
1.0.9    Aug 7, 2025
1.0.8    Aug 1, 2025
1.0.7    May 29, 2025 (this version)
1.0.6    Apr 23, 2025
1.0.5    Apr 13, 2025
1.0.4    Apr 13, 2025 (yanked: error in homepage URL)

Download files

Source Distribution

babyGPT-1.0.7.tar.gz (647.8 kB), uploaded May 29, 2025

File details

Details for the file babyGPT-1.0.7.tar.gz.

File metadata

Download URL: babyGPT-1.0.7.tar.gz
Upload date: May 29, 2025
Size: 647.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for babyGPT-1.0.7.tar.gz

Algorithm      Hash digest
SHA256         671d3af15ed389ce81b1a9b3335a609f5aded123a1278f3d8efe6e25dec4d5ba
MD5            033a03d8ea8ff92c51506df164a59a6d
BLAKE2b-256    52e73dd9fa743a248e599e985ff36be72b6d713d81175f52dbef0d3acd2c399f
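
One way to check a downloaded archive against the SHA256 digest listed above is sketched here using Python's standard hashlib module. The function names and the local file path in the usage note are hypothetical.

```python
import hashlib

# SHA256 digest published for babyGPT-1.0.7.tar.gz.
EXPECTED_SHA256 = "671d3af15ed389ce81b1a9b3335a609f5aded123a1278f3d8efe6e25dec4d5ba"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

Assuming the archive was saved in the current directory, calling matches_expected("babyGPT-1.0.7.tar.gz") should return True for an intact download.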
