
Create, retrain, ablate, and convert models for local Ollama

Project description

ollama-forge


Get models from Hugging Face, convert them, add adapters, and run them in Ollama — without needing deep expertise. One place for fetch, convert, adapters, and recipes.

Install: pip install ollama-forge or uv tool install ollama-forge. From this repo: uv sync then uv run ollama-forge; or uv tool install . to use the CLI from anywhere.

Quick start:

ollama-forge fetch TheBloke/Llama-2-7B-GGUF --name my-model && ollama run my-model

Or the shortest path: ollama-forge start --name my-model then ollama run my-model.


Documentation (Wiki)

Detailed guides live in the wiki/:

  • Installation: Setup, check, doctor, setup-llama-cpp
  • Quick Start: start / quickstart, profiles, task presets
  • Auto & Plan: Auto-detect source, dry-run planner
  • Fetch & Convert: GGUF from HF, GGUF file → Ollama
  • Recipes: One-file YAML/JSON build
  • Modelfile: Ollama Modelfile basics
  • Adapters: LoRA (search, recommend, fetch-adapter, retrain)
  • Training Data: JSONL validate, prepare, train script
  • Retrain Pipeline: Data → adapter → Ollama
  • Abliterate: Refusal removal
  • Security Eval: LLM security evaluation (prompt sets, KPIs, UI)
  • Downsizing: Teacher → student distillation
  • Hugging Face without GGUF: When the repo has no GGUF
  • Quantization: Smaller/faster GGUF (Q4_K_M, Q8_0, etc.)
  • CI / Automation: Example GitHub Actions
  • Command Reference: All commands

Why ollama-forge

  • One place — Fetch, convert, adapters, recipes; no scattered scripts.
  • Simple — Clear commands and docs; try things without being an ML expert.
  • Local-first — Get models running in Ollama on your machine.

Setup (one-time)

  • Python 3.10+. From PyPI: pip install ollama-forge or uv tool install ollama-forge. From the repo: uv sync then uv run ollama-forge; use uv tool install . from the repo root to put ollama-forge on your PATH.
  • Ollama: Install it and ensure ollama is on your PATH.
  • Verify: ollama-forge check — see what’s installed. ollama-forge doctor for diagnosis; doctor --fix to apply safe fixes. See Installation for optional llama.cpp (finetune/quantize).
  • Optional extras: pip install ollama-forge[net] adds requests for HTTP paths (proxy, security-eval, download-lists); ollama-forge[abliterate] for abliterate run/proxy (see Abliterate).
  • Optional: Run Ruff and tests before commit/push: git config core.hooksPath .githooks. See .githooks/README.md. To fix lint before pushing without hooks: ./scripts/lint-fix.sh.

Commands at a glance

  • Easiest one-command start: start or quickstart [--name my-model]
  • Auto-detect source and run: auto <source> [--name my-model]
  • Preview operations (dry-run): plan <quickstart|auto|doctor-fix|adapters-apply> ...
  • GGUF from HF → Ollama: fetch <repo_id> --name <name>
  • GGUF file → Ollama: convert --gguf <path> --name <name>
  • Find / use adapters: adapters search, adapters recommend, fetch-adapter, retrain
  • One-file config build: build recipe.yaml
  • Check / fix environment: check, doctor [--fix]
  • Install llama.cpp: setup-llama-cpp

Full list: Command Reference. Run ollama-forge --help for options.


Simplest workflows

Beginner (one command):

uv run ollama-forge start --name my-model
ollama run my-model

Uses default model + balanced profile. Use --profile fast|balanced|quality|low-vram and --task chat|coding|creative. See Quick Start.

Auto (any source): Recipe, GGUF path, HF repo, base model, or adapter — the tool detects and runs the right flow:

uv run ollama-forge auto ./recipe.yaml
uv run ollama-forge auto TheBloke/Llama-2-7B-GGUF --name my-model
uv run ollama-forge auto llama3.2 --name my-assistant --system "You are helpful."

See Auto & Plan.

Fetch from Hugging Face: When the repo has GGUF files:

uv run ollama-forge fetch TheBloke/Llama-2-7B-GGUF --name my-model
ollama run my-model

Use --quant Q4_K_M to pick size. For gated or private repos, set HF_TOKEN or run huggingface-cli login. See Fetch & Convert.

Local GGUF: uv run ollama-forge convert --gguf /path/to/model.gguf --name my-model. Optional --quantize Q4_K_M (needs llama.cpp on PATH). See Quantization.

Recipe (one file): uv run ollama-forge build recipe.yaml. See Recipes for format and examples. Sampling options (temperature, top_p, repeat_penalty) are available on fetch, convert, build, and create-from-base (Modelfile, Recipes).
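The exact recipe schema is documented on the Recipes wiki page; as a rough sketch, a recipe.yaml might look like the following (field names here are illustrative assumptions, not the verified schema):

```yaml
# Hypothetical recipe sketch — field names are assumptions; see the
# Recipes wiki page for the actual schema.
name: my-model
source: TheBloke/Llama-2-7B-GGUF
quant: Q4_K_M
system: "You are a helpful assistant."
parameters:          # sampling options mentioned in the docs
  temperature: 0.7
  top_p: 0.9
  repeat_penalty: 1.1
```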

Adapters: adapters search "llama lora", then fetch-adapter <repo> --base <base> --name <name>, or retrain --base <base> --adapter <path> --name <name>. See Adapters.

Training data → model: Validate JSONL, prepare for trainer, generate script: train --data ./data/ --base llama3.2 --name my-model --write-script train.sh. See Training Data and Retrain Pipeline.
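Training data is JSONL: one example per line. The exact fields that validate expects are specified on the Training Data wiki page; a common instruction-tuning shape (field names are an assumption) looks like:

```jsonl
{"prompt": "What is the capital of France?", "response": "Paris."}
{"prompt": "List three primary colors.", "response": "Red, yellow, and blue."}
```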


Other topics

  • Hugging Face repo without GGUF — Convert to GGUF with llama.cpp first, then run convert. Wiki.
  • Refusal removal (abliterate) — abliterate compute-dir; optional deps: uv sync --extra abliterate. For agents with tool support use the lightweight proxy: abliterate proxy --name <name>. Wiki.
  • Downsizing (distillation) — downsize --teacher <hf> --student <hf> --name <name>. Wiki.
  • LLM security evaluation — Run prompt sets against Ollama/serve, score refusal/compliance, get ASR and KPIs: security-eval run <prompt_set>. Optional UI: uv sync --extra security-eval-ui then security-eval ui. Wiki: Security Eval.
  • CI — Example GitHub Actions in CI / Automation.
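The CI / Automation wiki page has the maintained example; a minimal GitHub Actions sketch along those lines (job and step names are assumptions, not the wiki's exact workflow) could be:

```yaml
# Minimal sketch, assuming a plain environment check on push;
# see the CI / Automation wiki page for the maintained example.
name: forge-check
on: [push]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install ollama-forge
      - run: ollama-forge check   # report what's installed
```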

Project details


Download files

Download the file for your platform.

Source Distribution

ollama_forge-1.0.1.tar.gz (1.0 MB)


Built Distribution


ollama_forge-1.0.1-py3-none-any.whl (857.7 kB)


File details

Details for the file ollama_forge-1.0.1.tar.gz.

File metadata

  • Download URL: ollama_forge-1.0.1.tar.gz
  • Upload date:
  • Size: 1.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for ollama_forge-1.0.1.tar.gz

  • SHA256: d02ef925e6b91c11fabc94e2096a42ed36acad2a64070b7d0dfc8288d0591a11
  • MD5: d2d328fa278cbdb5076f036dba28a8df
  • BLAKE2b-256: 93ea13c2c92dc22746e93f2a077ae50c80209b86a0851575f7e442b4c3dbf1a8


File details

Details for the file ollama_forge-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: ollama_forge-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 857.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for ollama_forge-1.0.1-py3-none-any.whl

  • SHA256: 9d823b39bcc3f7fcf5878fe1e7f864c8d6ceb3e7e980ec63fc67c26bc9cd008c
  • MD5: 00b00f2801bb2d90177d363c97c11f6f
  • BLAKE2b-256: 3baf06a538d4423f7c008c9370c44705ab6db90b6d5e60f95f2b0dcd8f9bdeb2
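A downloaded archive can be checked against its published SHA256 with sha256sum -c, which reads "hash  filename" pairs from stdin. A self-contained sketch using a stand-in file (for the real check, substitute the archive name and the SHA256 value listed above):

```shell
# Demonstrate sha256sum -c with a stand-in file; the hash below is the
# SHA256 of the literal string "hello", not of any release artifact.
printf 'hello' > artifact.bin
echo "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824  artifact.bin" | sha256sum -c -
# prints: artifact.bin: OK
```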

