Cirilla is a simple way to introduce optimized single-GPU training into your project

Project description

[!IMPORTANT]
For a much nicer README visit Cirilla

Ciri from The Witcher 4 trailer

Cirilla

Cirilla is an open-source learning project aimed at implementing various LLMs. It focuses on showing how to build, train, run inference on, and deploy an LLM from scratch using PyTorch and a budget-friendly GPU (RTX 4060 Ti 16 GiB, ~$500).

Who is Cirilla

Cirilla Fiona Elen Riannon, known as Ciri, is one of the central characters in The Witcher saga by Andrzej Sapkowski and its adaptations.
She is the princess of Cintra, granddaughter of Queen Calanthe, and the sole heir to a powerful lineage marked by the mysterious Elder Blood.

Ciri is defined by her destiny, adaptability, and potential. Unlike kings who wield authority by birthright, her strength comes from surviving chaos, learning from mentors like Geralt and Yennefer, and unlocking extraordinary powers.

Her unique abilities make her one of the most pivotal figures in the saga. Known as the Lady of Space and Time, the Lion Cub of Cintra, and the Child of the Elder Blood, she can manipulate space and time, travel between worlds, and influence the course of events in ways few can.

Fig.1 Ciri Gwent card by Bogna Gawrońska

Why name an LLM Cirilla

Unlike rulers who inherit authority, Cirilla embodies potential realized through learning, experience, and adaptability. She is resilient, capable of navigating complex and unpredictable worlds, and able to respond to challenges with skill and precision - qualities that mirror how a language model can shift between tasks, domains, and contexts.

Guided by mentors and shaped by hardships, Ciri develops her abilities quickly, mastering both strategy and instinct while remaining flexible in the face of unforeseen circumstances.

Her combination of innate talent, adaptability, and the capacity for growth makes her a fitting symbol for a language model designed to acquire knowledge, evolve over time, and connect information across domains.

Fig.2 Ciri Gwent card by Anna Podedworna

What is an LLM

On a high level: imagine a toddler with a huge amount of knowledge but still possessing a toddler-like way of reasoning and understanding.

On a lower level: an LLM is a neural network trained on so-called big data to recognize patterns, generate human-like responses, and predict the most likely next word in a given context. While it can process and recall information efficiently, it lacks true understanding, reasoning, or consciousness, relying on statistical correlations rather than genuine comprehension. The reasoning abilities of LLMs are being improved in projects such as (most notably) DeepSeek, which focus on enhancing the ability to understand context and simulate human-like reasoning.
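To make "predict the most likely next word" concrete, here is a toy bigram model in plain Python (an illustration of the statistical idea, not the project's code): it counts which word follows which in a corpus and always predicts the most frequent successor.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """For each word, count how often each other word follows it."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for word, nxt in zip(words, words[1:]):
        counts[word][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the statistically most likely next word, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ate"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once -> cat
```

A real LLM replaces the raw counts with a neural network over tokens and a much longer context, but the training objective - maximize the probability of the observed next token - is the same in spirit.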

Repo organization:

Cirilla - a LLM made on a budget/
├── BERT/                           # overview of BERT
│   └── RAG/                        # overview of RAG
├── Cirilla_model/                  # implementation of Cirilla LLM
│   ├── model.py
│   ...
├── Decoder_only_architecture/      # overview of decoder-only transformer architecture
│   ├── Llama2/                     # implementation of Llama 2 inference loop
│   └── Mistral/                    # overview of the Mistral 7B architecture and inference tricks
├── LLM_pieces/                     # elements of a decoder-only model you can use
│   ├── SMoE.py                     # Sparse Mixture of Experts
│   ...
├── synth_data/
│   ├── multi_turn_vllm.py          # create multi-turn instructions with vLLM
│   ├── Ollama_create.py            # synthetic data creation with Ollama
│   ├── reason_gym_synthetic.py     # create synthetic reasoning dataset with reasoning_gym
│   ├── rm_duplicate_instruct.py    # remove duplicate instructions from Ollama
│   └── witcher_mr_gather.py        # create multi-turn instructions with Witcher
├── Training_optimizations/
│   ├── FlexAttention/              # overview of PyTorch's FlexAttention
│   └── HF_kernels/                 # overview of HF's kernel hub
└── Transformer_from_scratch/       # transformer implementation
    ├── model.py                    # transformer model
    ├── dataset.py                  # dataset for MLM - masked language modelling
    ├── train.py                    # main transformer training loop
    └── LongNet.py                  # LongNet - crude dilated attention implementation
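One building block listed above, the sparse mixture of experts, can be sketched in plain Python (an illustrative toy, not the repo's `SMoE.py`): a gate scores every expert for the current input, only the top-k experts actually run, and their outputs are mixed with the renormalized gate probabilities.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sparse_moe(x, experts, gate_weights, k=2):
    """Route input x to the top-k experts by gate score and combine
    their outputs weighted by the renormalized gate probabilities."""
    # gate score per expert: dot product of x with that expert's gate vector
    scores = [sum(xi * wi for xi, wi in zip(x, w)) for w in gate_weights]
    topk = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in topk])
    out = [0.0] * len(x)
    for p, i in zip(probs, topk):  # only the selected experts compute
        y = experts[i](x)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out

# hypothetical experts: each just scales its input by a constant
experts = [lambda x, s=s: [s * xi for xi in x] for s in (1.0, 2.0, 3.0)]
gates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(sparse_moe([0.5, 1.0], experts, gates, k=2))
```

The point of the sparsity is that the skipped experts cost nothing at inference time, which is how MoE models grow parameter count without growing per-token compute.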

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cirilla-0.1.1.tar.gz (6.9 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cirilla-0.1.1-py3-none-any.whl (6.7 kB)

Uploaded Python 3

File details

Details for the file cirilla-0.1.1.tar.gz.

File metadata

  • Download URL: cirilla-0.1.1.tar.gz
  • Upload date:
  • Size: 6.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.0

File hashes

Hashes for cirilla-0.1.1.tar.gz:

  • SHA256: 4f4f4c0befec2e6acb065b022bff0eb3bbd22cf0eeec40099d7d1d8b831de51f
  • MD5: 2b9117176ee3280a57fced5490e2166d
  • BLAKE2b-256: d27c5effa0cb027f7f0a8e04257beb233716e6631fea641a8f04441e9c6c8a93
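To check a downloaded archive against the digests above, Python's standard hashlib is enough (a generic sketch; point it at whichever file you downloaded):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files never load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# compare case-insensitively against the published SHA256 digest:
# sha256_of("cirilla-0.1.1.tar.gz").lower() == "4f4f4c0b..."  # full digest above
```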


File details

Details for the file cirilla-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: cirilla-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 6.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.0

File hashes

Hashes for cirilla-0.1.1-py3-none-any.whl:

  • SHA256: 11e548e5d4a98b93bd13cfa98efaef893435d82b85bbe4418f6219c213d442a8
  • MD5: 1b752a2b34c9b0e660e2eec384cc7c75
  • BLAKE2b-256: 27dc258639f015f33daf358ddb3bed592897fd39979527be2e5c7d33b44e406e

