See https://github.com/noanabeshima/tiny_model
TinyModel is a 44M parameter model trained on TinyStories V2 for mechanistic interpretability.
It has 4 layers, uses ReLU activations, and has no layernorms.
It was trained for 3 epochs on a preprocessed version of TinyStoriesV2.
```python
from tiny_model import TinyModel, tokenizer

lm = TinyModel()

# for inference
tok_ids, attn_mask = tokenizer(['Once upon a time', 'In the forest'])
logprobs = lm(tok_ids)

# or
lm.generate('Once upon a time, Ada was happily walking through a magical forest with')

# To decode tok_ids you can use
tokenizer.decode(tok_ids)
```
Tokenization is done as follows:
- The 10K most frequent tokens under the GPT-NeoX tokenizer are selected and sorted by frequency.
- To tokenize a document, first tokenize it with the GPT-NeoX tokenizer. Then replace every token outside the top 10K with a special [UNK] token id. All token ids are then remapped to the range 1 to 10K, roughly ordered from most to least frequent.
- Finally, prepend the document with a [BEGIN] token id.
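
The steps above can be sketched in plain Python. This is only an illustration, not the code used by `tiny_model`: the helper names `build_vocab` and `remap`, the toy input ids, and the specific ids chosen for [BEGIN] and [UNK] are all assumptions. The lists of integers stand in for the ids a GPT-NeoX tokenizer would produce.

```python
from collections import Counter

BEGIN_ID = 0  # hypothetical id for [BEGIN]; the real value may differ


def build_vocab(corpus_ids, k):
    """Map the k most frequent base-token ids to new ids 1..k, most frequent first."""
    counts = Counter(tok for doc in corpus_ids for tok in doc)
    return {tok: rank for rank, (tok, _) in enumerate(counts.most_common(k), start=1)}


def remap(doc_ids, vocab):
    """Replace out-of-vocabulary ids with [UNK], remap the rest, and prepend [BEGIN]."""
    unk_id = len(vocab) + 1  # hypothetical id for [UNK]; the real value may differ
    return [BEGIN_ID] + [vocab.get(tok, unk_id) for tok in doc_ids]


# Toy base-token ids standing in for GPT-NeoX tokenizer output.
corpus = [[50, 7, 7, 301], [7, 50, 7, 12]]
vocab = build_vocab(corpus, k=2)       # keep only the 2 most frequent tokens
remapped = remap([7, 301, 50], vocab)  # 301 is outside the top 2, so it becomes [UNK]
```

With `k=2`, the most frequent base id (7) gets new id 1 and the next (50) gets new id 2, so frequent tokens end up with small ids. In the real setup `k` would be on the order of 10K.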