Gradient-Efficient Knowledge Optimization - Smart training for any LLM
GEKO: Gradient-Efficient Knowledge Optimization
A plug-and-play training framework that makes LLM training more efficient.
Just as LoRA revolutionized fine-tuning, GEKO aims to revolutionize training.
Key Insight
Traditional training treats all samples equally:
$$\mathcal{L}_{standard} = \frac{1}{N} \sum_{i=1}^{N} \ell(x_i, y_i)$$
GEKO weights samples by their learning value:
$$\mathcal{L}_{GEKO} = \frac{1}{N} \sum_{i=1}^{N} w_i \cdot \ell(x_i, y_i) \quad \text{where} \quad w_i = f(\text{bucket}_i)$$
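In plain Python, the weighted objective looks like this. The bucket names and weights follow the bucket table later in this README; `weighted_loss` itself is an illustrative sketch, not GEKO's actual API:

```python
def weighted_loss(losses, buckets):
    """Mean per-sample loss scaled by w_i = f(bucket_i).

    Illustrative sketch: weights follow the GEKO bucket table
    (FREEZE/LIGHT contribute nothing; HARD counts triple).
    """
    bucket_weight = {"FREEZE": 0.0, "LIGHT": 0.0, "FOCUS": 1.0, "HARD": 3.0}
    return sum(bucket_weight[b] * l for b, l in zip(buckets, losses)) / len(losses)

# Frozen samples drop out of the average; hard ones dominate it.
print(weighted_loss([0.1, 0.5, 2.0, 3.0], ["FREEZE", "LIGHT", "FOCUS", "HARD"]))  # 2.75
```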
Installation
```shell
pip install gekolib
```
Quick Start
```python
from geko import GEKOTrainer, GEKOConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

trainer = GEKOTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=your_dataset,
)
trainer.train()
print(trainer.get_efficiency_report())
```
The GEKO Algorithm
Sample Partitioning
```mermaid
flowchart TD
    A[Sample] --> B{Correct?}
    B -->|Yes| C{Confident & High Quality?}
    B -->|No| D{High Confidence?}
    C -->|Yes| E[🔵 FREEZE<br/>w = 0<br/>Never train]
    C -->|No| F[🟢 LIGHT<br/>w = 0<br/>Low priority]
    D -->|Yes| G[🔴 HARD<br/>w = 3<br/>Highest priority]
    D -->|No| H[🟠 FOCUS<br/>w = 1<br/>Medium priority]
    style E fill:#3498db,color:#fff
    style F fill:#2ecc71,color:#fff
    style G fill:#e74c3c,color:#fff
    style H fill:#f39c12,color:#fff
```
Bucket Definitions
| Bucket | Condition | Weight | Description |
|---|---|---|---|
| 🔵 FREEZE | $correct \land c > 0.85 \land q > 0.80$ | $w = 0$ | Mastered; excluded from training |
| 🟢 LIGHT | $correct \land (c \leq 0.85 \lor q \leq 0.80)$ | $w = 0$ | Correct but uncertain |
| 🟠 FOCUS | $\neg correct \land c \leq 0.60$ | $w = 1$ | Wrong, low confidence |
| 🔴 HARD | $\neg correct \land c > 0.60$ | $w = 3$ | Confidently wrong |
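The bucket rules above can be expressed as a small pure function. This is a sketch using the thresholds from the table; the function name and signature are illustrative, not GEKO's actual API:

```python
def assign_bucket(correct, confidence, quality):
    """Map a sample's stats to (bucket, weight) using the table's thresholds.

    `confidence` is the model confidence c and `quality` the sample-quality
    score q, both in [0, 1]. Illustrative sketch, not GEKO's real partitioner.
    """
    if correct:
        if confidence > 0.85 and quality > 0.80:
            return "FREEZE", 0.0  # mastered: skip entirely
        return "LIGHT", 0.0       # correct but uncertain
    if confidence > 0.60:
        return "HARD", 3.0        # confidently wrong: highest priority
    return "FOCUS", 1.0           # wrong with low confidence
```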
Mountain Curriculum
```mermaid
%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#3498db'}}}%%
xychart-beta
    title "Mountain Curriculum - Difficulty vs Progress"
    x-axis "Training Progress" [0, 0.15, 0.35, 0.65, 0.85, 1.0]
    y-axis "Difficulty" 0 --> 1
    line [0.2, 0.5, 1.0, 1.0, 0.5, 0.2]
```
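The curve is piecewise linear between the six anchor points plotted above, so the target difficulty at any progress value can be recovered by interpolation. This is a sketch of that schedule; GEKO's internal implementation may differ:

```python
def curriculum_difficulty(progress):
    """Piecewise-linear difficulty for progress in [0, 1], matching the chart."""
    xs = [0.0, 0.15, 0.35, 0.65, 0.85, 1.0]   # training progress anchors
    ys = [0.2, 0.5, 1.0, 1.0, 0.5, 0.2]       # difficulty at each anchor
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if progress <= x1:
            return y0 + (y1 - y0) * (progress - x0) / (x1 - x0)
    return ys[-1]

# Difficulty peaks in the middle of training and tapers at both ends.
print(curriculum_difficulty(0.5))  # 1.0
```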
Five Phases
```mermaid
gantt
    title Mountain Curriculum Phases
    dateFormat X
    axisFormat %s
    section Difficulty
    WARMUP (Easy)       :a1, 0, 15
    ASCENT (Medium)     :a2, 15, 35
    PEAK (Hard)         :a3, 35, 65
    DESCENT (Medium)    :a4, 65, 85
    CONSOLIDATE (Easy)  :a5, 85, 100
```
| Phase | Progress | HARD | FOCUS | LIGHT | Strategy |
|---|---|---|---|---|---|
| WARMUP | 0-15% | 1 | 2 | 3 | Build foundation |
| ASCENT | 15-35% | 2 | 3 | 1 | Increase difficulty |
| PEAK | 35-65% | 5 | 2 | 0 | Maximum learning |
| DESCENT | 65-85% | 2 | 3 | 1 | Reduce difficulty |
| CONSOLIDATE | 85-100% | 1 | 2 | 3 | Reinforce |
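A minimal lookup for the phase table, treating the HARD/FOCUS/LIGHT columns as per-bucket sampling weights. The data comes from the table above; the structure and names are illustrative:

```python
PHASES = [
    # (progress upper bound, phase name, per-bucket weight)
    (0.15, "WARMUP",      {"HARD": 1, "FOCUS": 2, "LIGHT": 3}),
    (0.35, "ASCENT",      {"HARD": 2, "FOCUS": 3, "LIGHT": 1}),
    (0.65, "PEAK",        {"HARD": 5, "FOCUS": 2, "LIGHT": 0}),
    (0.85, "DESCENT",     {"HARD": 2, "FOCUS": 3, "LIGHT": 1}),
    (1.00, "CONSOLIDATE", {"HARD": 1, "FOCUS": 2, "LIGHT": 3}),
]

def phase_weights(progress):
    """Return (phase name, bucket weights) for training progress in [0, 1]."""
    for upper, name, weights in PHASES:
        if progress <= upper:
            return name, weights
    return PHASES[-1][1], PHASES[-1][2]
```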
Q-Value Learning
Each sample maintains a Q-value representing "learnability":
$$Q_{t+1}(s) = (1 - \alpha) \cdot Q_t(s) + \alpha \cdot \left(1 - \frac{\ell_t(s)}{\ell_{max}}\right)$$
```mermaid
graph LR
    A[Sample Loss ↓] --> B[Q-Value ↑]
    B --> C{Q > threshold?}
    C -->|Yes| D[Move to FREEZE]
    C -->|No| E[Stay trainable]
    style D fill:#3498db,color:#fff
    style E fill:#f39c12,color:#fff
```
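The exponential-moving-average update above is one line of code. Note that `alpha=0.1` is an assumed default for illustration; the source does not specify a value:

```python
def update_q(q, loss, loss_max, alpha=0.1):
    """Q_{t+1} = (1 - alpha) * Q_t + alpha * (1 - loss / loss_max).

    Low recent loss pushes Q toward 1 (near-mastered); high loss pulls
    it toward 0 (still learnable). alpha=0.1 is an assumed default.
    """
    return (1 - alpha) * q + alpha * (1 - loss / loss_max)
```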
Efficiency Analysis
Compute Savings Over Time
```mermaid
%%{init: {'theme': 'base'}}%%
pie showData
    title "Bucket Distribution (Epoch 10)"
    "FREEZE (Saved)" : 80
    "LIGHT" : 15
    "FOCUS" : 4
    "HARD" : 1
```
Training Progression
| Epoch | FREEZE | LIGHT | FOCUS | HARD | Compute Saved |
|---|---|---|---|---|---|
| 1 | 0% | 20% | 60% | 20% | 0% |
| 2 | 15% | 25% | 45% | 15% | 15% |
| 3 | 35% | 30% | 25% | 10% | 35% |
| 5 | 55% | 25% | 15% | 5% | 55% |
| 10 | 80% | 15% | 4% | 1% | 80% |
Architecture
```mermaid
flowchart TB
    subgraph Input
        A[Any LLM Model]
        B[Training Dataset]
    end
    subgraph GEKO["GEKO Framework"]
        C[GEKOTrainer]
        D[Sample Partitioner]
        E[Mountain Curriculum]
        F[Sample States]
        C --> D
        C --> E
        D --> F
        E --> F
    end
    subgraph Output
        G[Efficient Training]
        H[Compute Savings]
    end
    A --> C
    B --> C
    C --> G
    C --> H
    style GEKO fill:#f5f5f5,stroke:#333
```
Theoretical Guarantees
Convergence
Under standard stochastic-approximation assumptions, GEKO converges: every sample outside the FREEZE bucket receives unbounded cumulative weight, so no trainable sample is starved of updates:
$$\sum_{t=1}^{\infty} w_t^{(s)} = \infty \quad \forall s \notin \text{FREEZE}$$
Efficiency Bound
$$T_{GEKO} \leq T_{standard} \cdot (1 - \mathbb{E}[F])$$
where $\mathbb{E}[F]$ is the expected fraction of samples in the FREEZE bucket.
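As a worked example of the bound (the numbers are hypothetical, not measured results):

```python
def geko_time_bound(t_standard, expected_freeze_frac):
    """Upper bound on GEKO training time: T_standard * (1 - E[F])."""
    return t_standard * (1 - expected_freeze_frac)

# If 55% of samples are expected to freeze, a 100-GPU-hour baseline run
# is bounded by roughly 45 GPU-hours under GEKO.
print(geko_time_bound(100.0, 0.55))
```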
Results
| Metric | Standard | GEKO | Improvement |
|---|---|---|---|
| Training Time | 100% | 50-70% | 30-50% faster |
| Compute Cost | 100% | 50-70% | 30-50% cheaper |
| Final Loss | $\ell^*$ | $\leq \ell^*$ | Equal or better |
Citation
```bibtex
@software{geko2026,
  author = {Syed Abdur Rehman},
  title  = {GEKO: Gradient-Efficient Knowledge Optimization},
  year   = {2026},
  url    = {https://github.com/ra2157218-boop/GEKO}
}
```
License
Apache 2.0
GEKO - Train smarter, not harder.
File details
Details for the file gekolib-0.1.0.tar.gz.
File metadata
- Download URL: gekolib-0.1.0.tar.gz
- Upload date:
- Size: 21.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2d6dd63c8a1ec7a45b194cb227f376e9837529f36a5e9c38306bc814d57a12af` |
| MD5 | `3e927029ff20dcf3af4bb412c82a9781` |
| BLAKE2b-256 | `42df252b9e4689339f6562ad1f36c691cbaf1ab35adcf3e22d0cc6188707410d` |
File details
Details for the file gekolib-0.1.0-py3-none-any.whl.
File metadata
- Download URL: gekolib-0.1.0-py3-none-any.whl
- Upload date:
- Size: 20.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `9861d6b4252d3246694a2bce5e56abe39ca2c39f5357eaa4e9821297886cdbc8` |
| MD5 | `a601aeb8984637f519d0a3f69c8052fe` |
| BLAKE2b-256 | `7b85fb01550877e26b9e342a843eae2295e58f4b4aa041f893ced13eac7a7b17` |