BiDoRA/LoRA fine-tuning toolkit for 3D code generation and spatial intelligence

BiDoRA: Bi-Level Optimization for Parameter-Efficient Fine-Tuning

BiDoRA is a Python package implementing true BiDoRA (Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation) for efficient fine-tuning of Large Language Models. It is specifically optimized for:

  • 3D Code Generation (Rust, Blender, CAD)
  • Spatial Intelligence Tasks
  • Small Datasets (<10k samples)
  • Automatic Hardware Adaptation (Laptop to A100)

🔬 What is BiDoRA?

BiDoRA uses bi-level optimization to separately optimize magnitude and direction components of weight updates:

W' = m ⊙ (W₀ + BA) / ||W₀ + BA||

where m is the magnitude component (optimized at the upper level) and (W₀ + BA) / ||W₀ + BA|| is the direction component (optimized at the lower level).

Training Process:

  1. Lower Level: Optimize direction (A, B matrices) on training set
  2. Upper Level: Optimize magnitude (m) on validation set via hypergradients
  3. Final Phase: Direction fine-tuning on combined data with fixed magnitude
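
As a rough illustration only (a simplified sketch, not the package's actual implementation), the following alternates the two levels on a single linear layer; the upper-level hypergradient step is approximated here by a plain gradient step on a validation batch:

# Simplified BiDoRA-style update for one linear layer (illustration only).
# Assumes column-wise normalization; real BiDoRA computes hypergradients for m.
import torch

d_out, d_in, rank = 64, 64, 8
W0 = torch.randn(d_out, d_in)                              # frozen pretrained weight
A = (0.01 * torch.randn(rank, d_in)).requires_grad_(True)  # direction (low-rank), lower level
B = torch.zeros(d_out, rank, requires_grad=True)
m = W0.norm(dim=0, keepdim=True).clone().requires_grad_(True)  # magnitude, upper level

def adapted_weight():
    V = W0 + B @ A                               # updated direction before normalization
    return m * V / V.norm(dim=0, keepdim=True)   # W' = m ⊙ (W0 + BA) / ||W0 + BA||

def loss_fn(W, batch):
    x, y = batch
    return torch.nn.functional.mse_loss(x @ W.T, y)

lower_opt = torch.optim.AdamW([A, B], lr=2e-4)      # direction, trained on the training set
upper_opt = torch.optim.AdamW([m], lr=2e-4 * 2.0)   # magnitude, trained on the validation set

for step in range(100):
    # 1. Lower level: optimize direction (A, B) on a training batch
    train_batch = (torch.randn(16, d_in), torch.randn(16, d_out))
    lower_opt.zero_grad()
    loss_fn(adapted_weight(), train_batch).backward()
    lower_opt.step()
    # 2. Upper level: optimize magnitude m on a validation batch
    val_batch = (torch.randn(16, d_in), torch.randn(16, d_out))
    upper_opt.zero_grad()
    loss_fn(adapted_weight(), val_batch).backward()
    upper_opt.step()
# 3. Final phase (not shown): freeze m and fine-tune A, B on train + validation combined.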

Benefits:

  • ✅ Reduces overfitting on small datasets (<10k samples)
  • ✅ Better alignment with full fine-tuning (correlation: -8.042 vs -1.784 for DoRA)
  • ✅ Statistically significant improvements on GLUE (p < 0.001)

Important Notes:

  • ⚠️ Training Time: 3-4x slower than standard LoRA due to bi-level optimization
  • ⚠️ No Quantization: BiDoRA requires full precision (bfloat16); quantization is disabled automatically
  • ⚠️ Memory: Uses an 8-bit AdamW optimizer (roughly 75% less optimizer-state memory) to compensate, as sketched after this list
  • Best For: Small, specialized datasets where quality matters more than speed
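
The memory note above refers to optimizer state. A minimal sketch of what an 8-bit AdamW looks like with bitsandbytes (bidora wires this up internally; the Linear layer here is just a stand-in for a full model):

import bitsandbytes as bnb
import torch

# Stand-in for a full bf16 model; this only illustrates the optimizer choice
model = torch.nn.Linear(4096, 4096, dtype=torch.bfloat16)

# AdamW8bit keeps its moment estimates in 8-bit, cutting optimizer-state memory roughly 4x
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=2e-4)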

🚀 Features

  • BiDoRA Bi-Level Optimization: True magnitude-direction decomposition
  • Auto Hardware Detection: Automatically adapts config to available hardware
  • Full Precision Training: Optimized for bfloat16 (quantization is not used with BiDoRA)
  • Flexible Data Formats: JSONL, HuggingFace Datasets
  • Type-Safe Config: Pydantic-validated configuration
  • CLI Interface: Simple command-line interface with Typer

📦 Installation

From PyPI (recommended)

pip install bidora

As a project dependency

# With uv (recommended)
uv add bidora

# With pip
pip install bidora

From source (for development)

git clone https://github.com/bjoernbethge/bidora.git
cd bidora
uv sync --dev

🎯 Quick Start

1. Show hardware info

bidora info

Shows available hardware and recommended configuration.

2. Show recommended models

bidora list-models

3. Start BiDoRA training

Important: BiDoRA requires separate train and validation files for bi-level optimization.

Basic training

bidora train \
  --train-file data/train.jsonl \
  --val-file data/val.jsonl \
  --model Qwen/Qwen3-4B \
  --output ./output \
  --rank 8 \
  --epochs 3

With custom learning rates

bidora train \
  --train-file data/train.jsonl \
  --val-file data/val.jsonl \
  --model Qwen/Qwen3-4B \
  --lr 2e-4 \
  --upper-lr-mult 2.0 \
  --rank 8

With HuggingFace dataset

bidora train \
  --dataset "code_search_net" \
  --model Qwen/Qwen3-8B \
  --output ./output \
  --rank 8

📊 Data Format

JSONL Format (Instruction-Tuning)

{"instruction": "Generate a Rust function to create a 3D cube mesh", "output": "fn create_cube() -> Mesh { ... }"}
{"instruction": "Write Blender Python code to add a sphere", "input": "radius: 2.0", "output": "import bpy\nbpy.ops.mesh.primitive_uv_sphere_add(radius=2.0)"}

JSONL Format (Code Completion)

{"prompt": "// Generate 3D mesh\nfn create_mesh()", "completion": " -> Mesh {\n    let vertices = vec![...];\n    Mesh::new(vertices)\n}"}

JSONL Format (Code-Only)

{"code": "use bevy::prelude::*;\n\nfn setup_3d_scene(mut commands: Commands) { ... }"}

⚙️ Hardware-Specific Setups

Laptop (8GB GPU)

bidora train \
  --train-file data/train.jsonl \
  --val-file data/val.jsonl \
  --model Qwen/Qwen3-4B \
  --rank 4 \
  --batch-size 1 \
  --auto-hardware  # Automatic adaptation

Config automatically adjusted:

  • Precision: bfloat16 (full precision - BiDoRA requirement)
  • Batch Size: 1-2
  • Gradient Accumulation: 8-16
  • Max Seq Length: 1024-2048

Desktop (16GB GPU)

bidora train \
  --train-file data/train.jsonl \
  --val-file data/val.jsonl \
  --model Qwen/Qwen3-8B \
  --rank 16 \
  --batch-size 2 \
  --auto-hardware

Auto-Config:

  • Precision: bfloat16 (full precision - BiDoRA requirement)
  • Batch Size: 2-4
  • Gradient Accumulation: 4-8
  • Max Seq Length: 2048

A100 (40GB)

bidora train \
  --train-file data/train.jsonl \
  --val-file data/val.jsonl \
  --model Qwen/Qwen3-32B \
  --rank 16 \
  --batch-size 8 \
  --auto-hardware

Auto-Config:

  • Precision: bfloat16 (full precision - BiDoRA requirement)
  • Batch Size: 4-8
  • Gradient Accumulation: 2-4
  • Max Seq Length: 4096
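
Across all three tiers, the effective batch size is the per-device batch size multiplied by the gradient accumulation steps. A quick sanity check of the ranges above (plain arithmetic, not package code):

# effective batch size = per-device batch size x gradient accumulation steps
configs = {
    "Laptop (8GB)":   (2, 8),   # batch size 1-2, accumulation 8-16
    "Desktop (16GB)": (4, 4),   # batch size 2-4, accumulation 4-8
    "A100 (40GB)":    (8, 2),   # batch size 4-8, accumulation 2-4
}
for tier, (batch, accum) in configs.items():
    print(f"{tier}: effective batch size = {batch * accum}")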

🎛️ Advanced Options

All CLI Parameters

bidora train --help

Most Important Parameters:

Parameter          Description                     Default
--model, -m        Model name or path              Qwen/Qwen3-4B
--train-file, -t   Training JSONL file             Required
--val-file, -v     Validation JSONL file           Required for BiDoRA
--dataset, -d      HuggingFace dataset             -
--output, -o       Output directory                ./output
--rank, -r         LoRA rank                       8
--epochs, -e       Training epochs                 3
--batch-size, -b   Batch size                      4
--lr               Learning rate (lower level)     2e-4
--upper-lr-mult    Upper-level LR multiplier       2.0
--max-samples      Max training samples            All
--auto-hardware    Automatic hardware adjustment   True

Manual Config (without Auto-Hardware)

bidora train \
  --train-file data/train.jsonl \
  --val-file data/val.jsonl \
  --model Qwen/Qwen3-8B \
  --rank 16 \
  --batch-size 8 \
  --lr 3e-4 \
  --epochs 5 \
  --no-auto-hardware  # Manual config

💾 Memory Requirements

Qwen3 Model Sizes (BiDoRA - Full Precision)

⚠️ Note: BiDoRA requires full precision (bfloat16) with no quantization, so memory requirements are higher than with standard LoRA.

Model        Parameters   VRAM (bf16)   Training VRAM   Recommended For
Qwen3-0.6B   0.6B         ~2GB          ~6GB            Laptop GPU (6-8GB)
Qwen3-1.7B   1.7B         ~4GB          ~10GB           Laptop GPU (8GB+)
Qwen3-4B     4B           ~8GB          ~16GB           Desktop GPU (12-16GB)
Qwen3-8B     8B           ~16GB         ~24GB           Desktop GPU (24GB+) / A100
Qwen3-14B    14B          ~28GB         ~40GB           A100 (40GB)
Qwen3-32B    32B          ~64GB         ~80GB           A100 (80GB)

💡 Memory Optimization: An 8-bit AdamW optimizer (roughly 75% less optimizer-state memory) compensates for the full-precision requirement.
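
The VRAM (bf16) column follows directly from 2 bytes per parameter. A rough check (assumed arithmetic, not package code); the training column is larger because of adapter gradients, optimizer state, and activations:

def bf16_weight_gb(num_params: float) -> float:
    # bf16 stores 2 bytes per parameter; weights only, no activations or optimizer state
    return num_params * 2 / 1e9

for name, params in [("Qwen3-4B", 4e9), ("Qwen3-8B", 8e9), ("Qwen3-32B", 32e9)]:
    print(f"{name}: ~{bf16_weight_gb(params):.0f} GB of weights in bf16")
# Qwen3-4B: ~8 GB, Qwen3-8B: ~16 GB, Qwen3-32B: ~64 GB, matching the VRAM (bf16) column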

Trainable Parameters (LoRA Rank=8)

Base Model   LoRA Params   Reduction
7B           ~2M           3500×
14B          ~4M           3500×
32B          ~8M           4000×
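
The adapter counts above follow from the LoRA parameterization: each adapted projection adds rank × (d_in + d_out) parameters. A back-of-the-envelope sketch (illustrative dimensions; exact numbers depend on rank and which modules are targeted):

def lora_params(d_model: int, n_layers: int, rank: int = 8, adapted_per_layer: int = 1) -> int:
    # each adapted d_model x d_model projection adds rank * (d_in + d_out) parameters
    return n_layers * adapted_per_layer * rank * (d_model + d_model)

n = lora_params(d_model=4096, n_layers=32)  # dimensions typical of a 7B-class model
print(f"{n:,} adapter params, ~{round(7e9 / n)}x fewer than full fine-tuning")
# 2,097,152 adapter params, roughly the ~2M / 3500x row in the table above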

🧪 Example Workflow: 3D Rust Code Fine-Tuning

1. Prepare data

# data/rust_3d_train.jsonl
{"instruction": "Create a three-rs mesh for a cube", "output": "use three::*;\n\nfn create_cube(size: f32) -> Mesh {\n    let geometry = Geometry::cuboid(size, size, size);\n    Mesh::new(geometry, Material::default())\n}"}
{"instruction": "Generate Bevy 3D scene setup", "output": "use bevy::prelude::*;\n\nfn setup(mut commands: Commands) {\n    commands.spawn(Camera3dBundle::default());\n    commands.spawn(PbrBundle {\n        mesh: meshes.add(Mesh::from(shape::Cube { size: 1.0 })),\n        ..default()\n    });\n}"}

2. Start training

bidora train \
  --train-file data/rust_3d_train.jsonl \
  --val-file data/rust_3d_val.jsonl \
  --model Qwen/Qwen3-4B \
  --output ./rust_3d_model \
  --rank 8 \
  --epochs 3 \
  --batch-size 2

3. Use model

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load base model with BiDoRA adapters
model = AutoModelForCausalLM.from_pretrained(
    "./rust_3d_model/final_model",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")

# Generate
prompt = "### Instruction:\nCreate a three-rs function to render a sphere\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

🔧 Programmatic Usage

from bidora import (
    FullConfig, ModelConfig, BiDoRAConfig, TrainingConfig, DataConfig,
    load_model_and_tokenizer, prepare_bidora_model,
    load_and_prepare_dataset, prepare_dataset_for_training,
    train_bidora
)
from pathlib import Path

# Create config
config = FullConfig(
    model=ModelConfig(
        model_name="Qwen/Qwen3-4B",
        quantization="none"  # BiDoRA requires full precision (bfloat16)
    ),
    bidora=BiDoRAConfig(
        rank=8,
        use_bidora=True,  # Enable BiDoRA bi-level optimization
        upper_lr_multiplier=2.0
    ),
    training=TrainingConfig(
        batch_size=2,
        learning_rate=2e-4,
        num_epochs=3
    ),
    data=DataConfig(
        train_file=Path("data/train.jsonl"),
        val_file=Path("data/val.jsonl")  # Required for BiDoRA
    ),
    output_dir=Path("./output")
)

# Auto-adjust for hardware (will keep full precision for BiDoRA)
config.auto_adjust_for_hardware()

# Load model with BiDoRA layers
model, tokenizer = load_model_and_tokenizer(config.model)
model = prepare_bidora_model(model, config.bidora, quantized=False)

# Load data
dataset = load_and_prepare_dataset(config.data)
tokenized_dataset = prepare_dataset_for_training(
    dataset, tokenizer, config.training.max_seq_length
)

# Train with bi-level optimization
trainer = train_bidora(model, tokenizer, tokenized_dataset, config)

🐛 Troubleshooting

CUDA Out of Memory

# Reduce batch size
bidora train --batch-size 1 ...

# Or use smaller model
bidora train --model Qwen/Qwen3-1.7B ...

# Note: BiDoRA cannot use quantization (requires full precision)

Flash Attention Error

If Flash Attention 2 is not available:

  • It is disabled automatically
  • Or disable it manually by setting use_flash_attention=False in ModelConfig, as in the sketch below
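
A minimal sketch of the manual option (assuming ModelConfig accepts this field as a constructor keyword; the other fields follow the programmatic example above):

from bidora import ModelConfig

model_config = ModelConfig(
    model_name="Qwen/Qwen3-4B",
    quantization="none",          # BiDoRA requires full precision
    use_flash_attention=False,    # fall back to standard attention
)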

Import Errors

# Reinstall dependencies
uv pip install --force-reinstall transformers accelerate peft bitsandbytes

📖 Citation

If you use BiDoRA in your research, please cite:

@article{liu2024bidora,
  title={BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation},
  author={Liu, Peiran and Wang, Luning and Sun, Yanchao and Tang, Zhongwei and Xu, Dawei and Li, Jiaxi and Xu, Zhili},
  journal={arXiv preprint arXiv:2410.09758},
  year={2024}
}

📝 License

MIT License - see LICENSE file.
