Config-driven LLM fine-tuning with safety evaluation, EU AI Act compliance, 6 alignment methods, and one-command bundled quickstart templates.

Maintainers

cemililik

ForgeLM

A config-driven LLM fine-tuning toolkit for everyone — from solo researchers to enterprise platform teams. SFT → DPO → SimPO → KTO → ORPO → GRPO, with safety evaluation, EU AI Act compliance, and CI/CD-native design baked in. YAML in — fine-tuned model + audit artefacts out.

Use it interactively from a Jupyter notebook, drop it into a CI/CD pipeline, or run it from the terminal — the same YAML and the same Python API drive every entry point. Runs on Linux, macOS, and Windows.[^1]

Quick Start

pip install forgelm

# Fastest path: a bundled template that runs on a 12 GB GPU
forgelm quickstart customer-support

# Or generate a config interactively
forgelm --wizard

# Validate, fit-check, then train
forgelm --config my_config.yaml --dry-run
forgelm --config my_config.yaml --fit-check
forgelm --config my_config.yaml

# After training: chat, export, deploy
forgelm chat ./checkpoints/final_model
forgelm export ./checkpoints/final_model --quant q4_k_m
forgelm deploy ./checkpoints/final_model --target ollama

See the Quick Start Guide for the full walkthrough.

Why ForgeLM

Config-driven. Behaviour is set in validated YAML — reproducible across notebooks, terminals, and CI runs with no hidden env-var flags.
Full alignment stack. Every modern post-training method in one tool, one schema.
Safety and compliance are first-class. Not an afterthought, not a separate product.
CI/CD-native. Stable exit codes (0/1/2/3/4/5), JSON output, append-only audit log, deterministic dry-runs.
Bring-your-own-data. PDF / DOCX / EPUB / Markdown → SFT-ready JSONL with a single command.
Open source. Apache-2.0, no telemetry, no required cloud service.
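
The stable exit codes make it straightforward to gate a pipeline from a script. The following is a minimal sketch, assuming only that 0 means success and that any non-zero code should stop the pipeline; the meaning of each individual code is not listed here, so the sketch does not interpret it. `run_step` and `run_pipeline` are illustrative helper names, not part of ForgeLM.

```python
import subprocess
import sys


def run_step(cmd: list[str]) -> int:
    """Run one CLI step and return its exit code without raising."""
    return subprocess.run(cmd, check=False).returncode


def run_pipeline(steps: list[list[str]]) -> int:
    """Run steps in order; stop at the first non-zero exit code and return it."""
    for cmd in steps:
        code = run_step(cmd)
        if code != 0:
            # Surface the failing step so the CI log shows where the gate tripped.
            print(f"step failed with exit code {code}: {' '.join(cmd)}", file=sys.stderr)
            return code
    return 0
```

Passing the three Quick Start commands (`--dry-run`, `--fit-check`, then the training run) as `steps` would stop before training whenever validation or the fit check fails.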

Features

Training

6 trainer types: SFT, DPO, SimPO, KTO, ORPO, GRPO
Memory-efficient methods: 4-bit QLoRA, DoRA, PiSSA, rsLoRA, GaLore
Backends: Unsloth (2–5× faster) or standard Transformers
Distributed: DeepSpeed ZeRO-2/3, FSDP, multi-GPU, MoE-aware (Qwen3, Mixtral, DeepSeek)
Long-context: RoPE scaling, NEFTune, sliding-window attention, sample packing
Multi-dataset mixing and synthetic data distillation (teacher → student)
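
Sample packing concatenates short examples into fixed-length sequences so that fewer tokens are spent on padding. The sketch below illustrates the general idea with a greedy first-fit-decreasing bin packer over token counts. It is not ForgeLM's implementation; `pack_sequences` and its parameters are hypothetical names chosen for the example.

```python
def pack_sequences(lengths: list[int], max_len: int) -> list[list[int]]:
    """Group sample indices into bins whose total token count is <= max_len."""
    bins: list[list[int]] = []
    free: list[int] = []  # remaining capacity of each bin
    # Place the longest samples first; this usually wastes less space.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_len:
            raise ValueError(f"sample {i} has {n} tokens, more than max_len={max_len}")
        for b, room in enumerate(free):
            if n <= room:
                bins[b].append(i)
                free[b] -= n
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([i])
            free.append(max_len - n)
    return bins
```

Each returned bin lists the samples that would share one packed sequence; a real trainer would also need attention masking or position resets at the sample boundaries.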

Data Pipeline

forgelm ingest — PDF / DOCX / EPUB / TXT / Markdown → SFT-ready JSONL, with token-aware and markdown-aware chunking
forgelm audit — length, language, near-duplicate detection (SimHash + optional MinHash LSH), cross-split leakage, PII (TR / DE / FR / US-SSN, Luhn-validated), and a 9-family secrets scan
PII masking on ingest (emails, phones, cards, IBAN, national IDs) and secrets masking before chunks land in the JSONL
Croissant 1.0 dataset cards — the same JSON doubles as your EU AI Act Article 10 governance artefact
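
To make the near-duplicate detection step concrete, here is a minimal SimHash sketch using only the standard library. It assumes word-level tokens, a 64-bit fingerprint, and a Hamming-distance threshold of 3; all three are illustrative choices, and the tokenisation and thresholds ForgeLM actually uses are not described here.

```python
import hashlib
import re


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint from lowercase word tokens (bits must be a multiple of 8)."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each token votes +1 or -1 on every bit position.
        for bit in range(bits):
            weights[bit] += 1 if (h >> bit) & 1 else -1
    # A bit is set in the fingerprint when the positive votes win.
    return sum(1 << bit for bit in range(bits) if weights[bit] > 0)


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Treat two texts as near-duplicates when their fingerprints are close."""
    return hamming_distance(simhash(a), simhash(b)) <= max_distance
```

Because similar texts share most tokens, their fingerprints differ in only a few bits, which is what lets an audit compare documents without pairwise full-text comparison.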

Evaluation & Safety

Benchmarks via lm-evaluation-harness
LLM-as-Judge scoring (OpenAI API or local model)
Llama Guard safety classifier with S1–S14 harm categories, severity tiers, and cross-run trend tracking
Auto-revert — runs that fail loss, benchmark, or safety thresholds are discarded before artefacts are written
VRAM fit-check — pre-flight FITS / TIGHT / OOM / UNKNOWN estimator with concrete recommendations
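
As a rough illustration of what a pre-flight fit check has to reason about, the sketch below estimates the memory taken by model weights alone and maps a requirement onto the four verdicts named above. The 10% headroom rule and both function names are assumptions made for this example; the estimator's actual logic is not described here, and a real check would also account for activations, optimizer state, and framework overhead.

```python
from typing import Optional


def estimate_weights_gib(n_params: float, bits_per_param: int) -> float:
    """Memory for the weights only, in GiB (ignores activations and optimizer state)."""
    return n_params * bits_per_param / 8 / 2**30


def classify_fit(
    required_gib: Optional[float],
    available_gib: Optional[float],
    headroom: float = 0.10,  # hypothetical safety margin
) -> str:
    """Map a memory requirement onto FITS / TIGHT / OOM / UNKNOWN."""
    if required_gib is None or available_gib is None:
        return "UNKNOWN"
    if required_gib > available_gib:
        return "OOM"
    if required_gib > available_gib * (1 - headroom):
        return "TIGHT"
    return "FITS"
```

Under these assumptions, the weights of a 7-billion-parameter model at 4 bits per parameter come to roughly 3.26 GiB, which is one reason 4-bit methods leave room for training on a 12 GB GPU.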

Production & Deployment

forgelm chat — streaming REPL with slash commands and optional safety routing
forgelm export — GGUF export (6 quant levels) via llama-cpp-python
forgelm deploy — generates Ollama, vLLM, TGI, or HF Endpoints configs
Model merging (TIES, DARE, SLERP, linear) and auto-generated HF model cards
Webhooks (Slack / Teams) and tracking via W&B / MLflow / TensorBoard
Stable Python API: from forgelm import ForgeTrainer, audit_dataset, verify_audit_log, ... — every CLI surface has a typed entry point
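
Of the merging methods listed, SLERP is the easiest to show in a few lines: it interpolates along the arc between two weight vectors instead of along the straight line between them. The sketch below is a generic, pure-Python version on flat lists of floats, not ForgeLM's code; merge tools typically apply the same formula tensor by tensor.

```python
import math


def slerp(a: list[float], b: list[float], t: float, eps: float = 1e-8) -> list[float]:
    """Spherical linear interpolation between two equally sized vectors."""
    linear = [(1 - t) * x + t * y for x, y in zip(a, b)]
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        return linear  # the angle is undefined for a zero vector
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding drift
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return linear  # (anti-)parallel vectors: fall back to linear interpolation
    wa = math.sin((1 - t) * omega) / sin_omega
    wb = math.sin(t * omega) / sin_omega
    return [wa * x + wb * y for x, y in zip(a, b)]
```

With `t=0` the result is the first vector and with `t=1` the second; intermediate values preserve the norm better than a plain weighted average when the two vectors point in different directions.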

Compliance & Safety

Most fine-tuning tools stop at "the model trained." ForgeLM produces the artefacts an auditor will ask for next:

EU AI Act — auto-generated Annex IV technical documentation, Article 10 data governance, Article 14 human-oversight staging gate
GDPR — forgelm purge (Article 17 right-to-erasure) and forgelm reverse-pii (Article 15 right-of-access)
Model & log integrity — a SHA-256 manifest per trained model (forgelm verify-integrity) and a tamper-evident audit chain (forgelm verify-audit) give you a one-command proof-of-integrity before you ship
Append-only audit log — HMAC-chained when FORGELM_AUDIT_SECRET is configured; every decision gate emits a structured event
Supply-chain hardening — CycloneDX 1.5 SBOM per release, nightly pip-audit + bandit, gitleaks pre-commit
ISO 27001 / SOC 2 alignment — software cannot be certified, but ForgeLM produces the change-management, data-lineage, and audit-trail evidence your deployer's auditor needs. See the Deployer Audit Guide.

Full details: Safety & Compliance Guide · Supply-Chain Security
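
The idea behind an HMAC-chained, append-only log can be sketched in a few lines: each entry's MAC covers the previous entry's MAC, so editing or removing an earlier record invalidates everything after it. The record layout and the `append_event` and `verify_chain` helpers below are hypothetical and are not ForgeLM's on-disk format; in practice the key would come from a secret such as FORGELM_AUDIT_SECRET.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder MAC that precedes the first entry


def _mac(secret: bytes, prev_mac: str, event: dict) -> str:
    """MAC over the previous MAC plus a canonical JSON encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, (prev_mac + payload).encode("utf-8"), hashlib.sha256).hexdigest()


def append_event(log: list[dict], event: dict, secret: bytes) -> dict:
    """Append an event whose MAC is chained to the last entry."""
    prev_mac = log[-1]["mac"] if log else GENESIS
    entry = {"event": event, "prev_mac": prev_mac, "mac": _mac(secret, prev_mac, event)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict], secret: bytes) -> bool:
    """Recompute every MAC in order; any edit, reorder, or removal breaks the chain."""
    prev_mac = GENESIS
    for entry in log:
        expected = _mac(secret, prev_mac, entry["event"])
        if entry["prev_mac"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True
```

Verification needs the same secret that wrote the log, which is why a chain like this is tamper-evident only to parties who hold the key.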

Documentation

Topic	English	Türkçe
Quick Start	quickstart.md	—
Document Ingestion	ingestion.md	ingestion-tr.md
Dataset Audit	data_audit.md	data_audit-tr.md
Alignment (DPO / SimPO / KTO / GRPO)	alignment.md	—
Multi-Stage Pipelines	pipeline.md	pipeline-tr.md
CI/CD Integration	cicd_pipeline.md	cicd_pipeline-tr.md
Enterprise Deployment	enterprise_deployment.md	—
Safety & Compliance	safety_compliance.md	—
Troubleshooting & FAQ	troubleshooting.md	—
Architecture Reference	architecture.md	architecture-tr.md
Configuration Reference	configuration.md	configuration-tr.md
Product Strategy & Roadmap	product_strategy.md · roadmap.md	product_strategy-tr.md · roadmap-tr.md

Notebooks

Featured walkthroughs are runnable in Colab on a free T4 GPU. See notebooks/ for the full set (DPO, KTO, multi-dataset, GaLore, synthetic data, post-training workflow, data curation).

Installation

# From PyPI
pip install forgelm

# From source
git clone https://github.com/HodeTech/ForgeLM.git
cd ForgeLM
pip install -e .

Prerequisites: Python 3.10+, torch>=2.2.0. Platform-specific notes are in the installation guide.

Optional extras

pip install "forgelm[qlora]"            # 4-bit quantization (Linux)
pip install "forgelm[unsloth]"          # Unsloth backend (Linux)
pip install "forgelm[eval]"             # lm-evaluation-harness
pip install "forgelm[tracking]"         # W&B / MLflow
pip install "forgelm[distributed]"      # DeepSpeed
pip install "forgelm[merging]"          # model merging (TIES/DARE/SLERP — native, no extra deps)
pip install "forgelm[ingestion]"        # PDF / DOCX / EPUB / Markdown
pip install "forgelm[ingestion-scale]"  # MinHash LSH for large corpora
pip install "forgelm[ingestion-pii-ml]" # Presidio NER (also needs spaCy model)
pip install "forgelm[export]"           # GGUF via llama-cpp-python
pip install "forgelm[chat]"             # Rich terminal rendering

Docker

docker build -t forgelm --build-arg INSTALL_EVAL=true .

docker run --gpus all \
  -v "$(pwd)/my_config.yaml:/workspace/config.yaml" \
  -v "$(pwd)/output:/workspace/output" \
  forgelm --config /workspace/config.yaml

Multi-GPU and air-gapped deployment patterns are documented in the Enterprise Deployment Guide.

Contributing & License

Contributions are welcome — start with CONTRIBUTING.md and the engineering standards in docs/standards/.

Licensed under the Apache License 2.0.

[^1]: qlora and unsloth extras depend on Linux-only upstream wheels; on macOS and Windows the install succeeds but those backends are skipped via a sys_platform == 'linux' marker. All other extras are cross-platform.

Release history

Version	Released
0.8.0 (this version)	Jun 16, 2026
0.7.0	May 15, 2026
0.6.0	May 11, 2026
0.5.7	May 11, 2026
0.5.6	May 10, 2026
0.5.5	May 10, 2026
0.5.0	Apr 29, 2026
0.4.5	Apr 26, 2026
0.4.0	Apr 26, 2026
0.3.0	Mar 28, 2026

Download files

Source Distribution

forgelm-0.8.0.tar.gz (1.0 MB)

Uploaded Jun 16, 2026 Source

Built Distribution

forgelm-0.8.0-py3-none-any.whl (592.1 kB)

Uploaded Jun 16, 2026 Python 3

File details

Details for the file forgelm-0.8.0.tar.gz.

File metadata

Download URL: forgelm-0.8.0.tar.gz
Upload date: Jun 16, 2026
Size: 1.0 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for forgelm-0.8.0.tar.gz
Algorithm	Hash digest
SHA256	`50efd6ec47e9edcac0bcef6cc37210b8e06386ec6b89fdf81651e7218c36a658`
MD5	`6426b4f4e0196347a3128ec67b4c978a`
BLAKE2b-256	`8efc6b39ee07cdf2e432877887290c0946f7c8fb2b35454d9d67f7438850b898`
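
A downloaded file can be checked against the published SHA256 digest with the standard library alone. This is a minimal sketch; the `sha256_of` and `matches` helper names are illustrative, and the file is read in chunks so large archives do not need to fit in memory.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published value, ignoring hex case."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

Calling `matches(Path("forgelm-0.8.0.tar.gz"), "<SHA256 value from the table>")` should return `True` for an intact download. pip's hash-checking mode (`--require-hashes` with `--hash=sha256:...` entries in a requirements file) performs the same comparison at install time.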

Provenance

The following attestation bundles were made for forgelm-0.8.0.tar.gz:

Publisher: publish.yml on HodeTech/ForgeLM

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: forgelm-0.8.0.tar.gz
- Subject digest: 50efd6ec47e9edcac0bcef6cc37210b8e06386ec6b89fdf81651e7218c36a658
- Sigstore transparency entry: 1832621574
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: HodeTech/ForgeLM@fb2f47156109e0ed4b538b700cd3598145006352
- Branch / Tag: refs/tags/v0.8.0
- Owner: https://github.com/HodeTech
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@fb2f47156109e0ed4b538b700cd3598145006352
- Trigger Event: push

File details

Details for the file forgelm-0.8.0-py3-none-any.whl.

File metadata

Download URL: forgelm-0.8.0-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 592.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for forgelm-0.8.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c25efc834cc893e538e2c6c2b3f645b01c69bfb6c1a9f22617495339ba341552`
MD5	`f132a31bc668e7d3b5aa848821378587`
BLAKE2b-256	`e0469e540970919019aee0ce272cd41f80b930dbb9deb29224ac3e18a5646844`

Provenance

The following attestation bundles were made for forgelm-0.8.0-py3-none-any.whl:

Publisher: publish.yml on HodeTech/ForgeLM

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: forgelm-0.8.0-py3-none-any.whl
- Subject digest: c25efc834cc893e538e2c6c2b3f645b01c69bfb6c1a9f22617495339ba341552
- Sigstore transparency entry: 1832621678
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: HodeTech/ForgeLM@fb2f47156109e0ed4b538b700cd3598145006352
- Branch / Tag: refs/tags/v0.8.0
- Owner: https://github.com/HodeTech
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@fb2f47156109e0ed4b538b700cd3598145006352
- Trigger Event: push
