Design, verify, and generate LLM architectures before you waste GPU compute
ArchGene
Verify your LLM architecture before you waste $50K on compute.
Training an LLM can cost $10K–$100K or more. A common and avoidable cause of failed runs is architecture misconfiguration: hidden dimension misalignment, attention bugs, and incompatible layer configurations.
ArchGene catches these issues BEFORE you spend on GPU time.
The Problem
You spend $50K on a GPU cluster
↓
Start training
↓
Day 3: OOM errors, NaN outputs, training crashes
↓
Why? Hidden dimension not divisible by the number of attention heads
↓
$50K wasted
ArchGene prevents this.
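The failure in the flow above comes down to simple arithmetic: multi-head attention splits the hidden dimension evenly across heads, so the hidden dimension must be a multiple of the head count. The sketch below shows that one check in plain Python. It is an illustration only, not ArchGene's Z3-based implementation, and the function name check_head_split is hypothetical.

```python
def check_head_split(hidden_dim: int, num_heads: int) -> list[str]:
    """Return human-readable problems; an empty list means the split is valid.

    Hypothetical helper for illustration, not part of the archgene package.
    """
    problems = []
    if hidden_dim <= 0 or num_heads <= 0:
        problems.append("hidden_dim and num_heads must be positive")
        return problems
    remainder = hidden_dim % num_heads
    if remainder != 0:
        problems.append(
            f"hidden_dim={hidden_dim} is not divisible by "
            f"num_heads={num_heads} (remainder {remainder})"
        )
    return problems


if __name__ == "__main__":
    print(check_head_split(4096, 32))  # valid: each head gets 4096 / 32 = 128 dims
    print(check_head_split(4096, 30))  # invalid: 4096 % 30 leaves a remainder of 16
```

Returning a list of problems rather than a boolean mirrors the "Here's what's broken" style of report: the caller sees every violated constraint, not just a pass or fail.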
What It Does
| Feature | What It Tells You |
|---|---|
| Z3 Verification | "Your architecture is mathematically valid" or "Here's what's broken" |
| Cost Estimation | "This will cost $12K to train on 8x A100s" |
| Benchmark Projections | "Expected MMLU score: ~42%" |
| Model Zoo | Compare against GPT-2, Llama-2, Mistral, etc. |
| Design Session | Conversational Q&A that designs a verified architecture for your use case |
| Kernel Generation | Generates runnable PyTorch model.py, config.json, and train.py |
Quick Start
# Install
pip install archgene
# Design an architecture through conversational Q&A
archgene design
# Verify your architecture BEFORE training
archgene verify --hidden 4096 --heads 32 --layers 24
# Generate runnable PyTorch code
archgene generate --session 0
# Get cost estimate
archgene cost gpt2 --gpu A100
# Check against known architectures
archgene zoo-evaluate llama2_7b
Why This Matters
- Don't waste compute: Catch bugs before GPU costs begin
- Know your bill: Estimate training cost before you start
- Validate fast: Z3 checks the architecture's constraints formally instead of relying on a trial run
Use Cases
- Building a custom LLM? Verify the architecture before training
- Fine-tuning an existing model? Check that your config is valid
- Comparing architectures? Benchmark against the model zoo
CLI Examples
# Verify custom architecture
archgene verify --hidden 4096 --heads 32 --layers 24
# Cost estimation
archgene cost gpt2 --gpu H100 --batch-size 16
# List known architectures in the model zoo
archgene zoo-list
# Benchmark estimate
archgene benchmark llama2_7b
# Design an architecture through conversational Q&A
archgene design
# Generate runnable PyTorch code from a design
archgene generate -d 4096 -l 32 -n 16 -i 11008
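To run the verification automatically before a training job, the verify command above can be wrapped in a small pre-flight script. The sketch below builds the command from a Hugging Face-style config dictionary (keys hidden_size, num_attention_heads, num_hidden_layers) using only the flags shown in this README. The helper names are hypothetical, and treating a non-zero exit code as a failed verification is an assumption, not documented ArchGene behavior.

```python
import shutil
import subprocess


def build_verify_command(config: dict) -> list[str]:
    """Map Hugging Face-style config keys onto the `archgene verify` flags."""
    return [
        "archgene", "verify",
        "--hidden", str(config["hidden_size"]),
        "--heads", str(config["num_attention_heads"]),
        "--layers", str(config["num_hidden_layers"]),
    ]


def preflight(config: dict) -> bool:
    """Run the check before launching training.

    ASSUMPTION: a non-zero exit code signals a failed verification.
    """
    if shutil.which("archgene") is None:
        raise RuntimeError("archgene is not on PATH; run `pip install archgene` first")
    result = subprocess.run(build_verify_command(config), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    cfg = {"hidden_size": 4096, "num_attention_heads": 32, "num_hidden_layers": 24}
    # Print the command rather than running it, so the sketch works without archgene installed.
    print(" ".join(build_verify_command(cfg)))
```

Passing the arguments as a list avoids shell quoting issues, and keeping command construction separate from execution makes the mapping easy to test.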
Architecture Parameters
| Parameter | Description | Example Values |
|---|---|---|
| vocab_dim | Vocabulary size | 32000, 50257 |
| hidden_dim | Hidden dimension | 768, 4096, 8192 |
| num_layers | Layer count | 12, 24, 32 |
| num_heads | Attention heads | 8, 16, 32 |
| head_dim | Per-head dimension (typically hidden_dim / num_heads) | 64, 128 |
| intermediate_size | FFN hidden dimension | 2048, 11008 |
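These parameters are enough to estimate a model's size by hand. The sketch below groups them in a dataclass and counts the weights of a decoder-only transformer. It assumes standard multi-head attention (no grouped-query attention) and ignores normalization weights and biases. ArchConfig, estimate_params, gated_ffn, and tied_embeddings are hypothetical names, and ArchGene's own counting may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchConfig:
    """Hypothetical container using the parameter names from the table above."""
    vocab_dim: int
    hidden_dim: int
    num_layers: int
    num_heads: int
    intermediate_size: int

    @property
    def head_dim(self) -> int:
        if self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self.hidden_dim // self.num_heads


def estimate_params(cfg: ArchConfig, gated_ffn: bool = True,
                    tied_embeddings: bool = False) -> int:
    """Rough weight count for a decoder-only transformer (no norms or biases)."""
    attention = 4 * cfg.hidden_dim * cfg.hidden_dim  # Q, K, V, and output projections
    ffn_matrices = 3 if gated_ffn else 2  # a gated FFN adds a third projection
    ffn = ffn_matrices * cfg.hidden_dim * cfg.intermediate_size
    embedding_copies = 1 if tied_embeddings else 2  # input table plus output head
    embeddings = embedding_copies * cfg.vocab_dim * cfg.hidden_dim
    return cfg.num_layers * (attention + ffn) + embeddings


if __name__ == "__main__":
    cfg = ArchConfig(vocab_dim=32000, hidden_dim=4096, num_layers=32,
                     num_heads=32, intermediate_size=11008)
    print(cfg.head_dim)                         # 128
    print(f"{estimate_params(cfg) / 1e9:.2f}B")  # 6.74B
```

Most of the total comes from the per-layer attention and FFN matrices, so hidden_dim, intermediate_size, and num_layers dominate the count, while num_heads only changes how the attention matrices are partitioned.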
Cost Reference
| Model | Parameters | VRAM (FP16) | Training Cost (1T tokens) |
|---|---|---|---|
| GPT-2 | 176M | 0.4 GB | ~$50 |
| Llama-2-7B | 6.4B | 14 GB | ~$2,500 |
| Llama-2-70B | 70B | 145 GB | ~$25,000 |
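For a rough sanity check of the VRAM column, the memory needed just to hold the weights is the parameter count times the bytes per parameter (2 bytes in FP16). The sketch below computes that figure for nominal 7B and 70B models. It is a weights-only lower bound: training also needs memory for gradients, optimizer state, and activations, and ArchGene's estimator may include overheads that are not modeled here. The function name weight_memory_gb is hypothetical.

```python
# Bytes used to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Nominal round sizes, chosen for illustration only.
    for name, params in [("7B", 7e9), ("70B", 70e9)]:
        print(f"{name}: {weight_memory_gb(params):.0f} GB in FP16")
```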
Tech Stack
- Python 3.12+
- Z3 theorem prover (formal verification)
- PyTorch (code generation)
- Streamlit (optional web UI: pip install archgene[web])
License
MIT