Download a HuggingFace model and convert it to GGUF for Ollama.
hf2ollama
Run any HuggingFace text model inside Ollama with one command.
Point hf2ollama at a HuggingFace repo — it fetches the model, converts it
to the GGUF format Ollama needs, and prints the two ollama commands you
run to register and chat with it. No manual convert_hf_to_gguf.py,
no hand-written Modelfile.
Requires Python 3.11+ and a working Ollama install.
Install
pip install hf2ollama
Usage
Put your HuggingFace token in a .env next to where you'll run the command,
then point the tool at a repo:
echo "HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxx" > .env
hf2ollama SicariusSicariiStuff/Assistant_Pepe_70B
When it finishes, the tool prints two commands. Run them and you're chatting:
ollama create assistant-pepe-70b -f <path>/Modelfile
ollama run assistant-pepe-70b
(Get a HuggingFace token at https://huggingface.co/settings/tokens with Read scope.
It's needed only for private and gated models, but having one set never hurts.)
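The example above turns the repo name Assistant_Pepe_70B into the Ollama name assistant-pepe-70b. Below is a minimal sketch of one normalization that reproduces that mapping. It assumes the org prefix is dropped, the name is lowercased, and other characters become hyphens. The function name default_ollama_name is hypothetical, and hf2ollama's actual rule may differ.

```python
import re


def default_ollama_name(repo_id: str) -> str:
    """Derive an Ollama-friendly name from a HuggingFace repo id (illustrative only)."""
    # Keep only the part after the org, e.g. "Assistant_Pepe_70B", and lowercase it.
    name = repo_id.rsplit("/", 1)[-1].lower()
    # Collapse runs of anything other than letters, digits, and dots into one hyphen.
    name = re.sub(r"[^a-z0-9.]+", "-", name)
    return name.strip("-")


print(default_ollama_name("SicariusSicariiStuff/Assistant_Pepe_70B"))
```

If the derived name is not what you want, --ollama-name overrides it.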
Optional flags
# See which GGUF quants are inside a *-GGUF repo (no download):
hf2ollama some-user/some-model-GGUF --list
# Download a single quant — other .gguf files are skipped:
hf2ollama some-user/some-model-GGUF --quant Q5_K_M
# Custom Ollama model name:
hf2ollama some-user/some-model --ollama-name my-model
Install from git
For the latest unreleased changes:
pip install git+https://github.com/wachawo/hf2ollama.git
# or, over SSH:
pip install git+ssh://git@github.com/wachawo/hf2ollama.git
Install from source
For local development:
git clone git@github.com:wachawo/hf2ollama.git
cd hf2ollama
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
cp .env.example .env # then put your HF_TOKEN into .env
hf2ollama --help
Configuration
.env in the directory you run hf2ollama from:
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional. f16 (default) | f32 | bf16 | q8_0 | auto
OUTTYPE=f16
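For readers who want to check a .env before a long download, the sketch below parses the KEY=VALUE format shown above and validates OUTTYPE against the listed values. It is a simplified stand-in: how hf2ollama itself loads .env is not stated here, and load_env and resolve_outtype are hypothetical names.

```python
ALLOWED_OUTTYPES = {"f16", "f32", "bf16", "q8_0", "auto"}


def load_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_outtype(env: dict[str, str]) -> str:
    """Return the requested OUTTYPE, defaulting to f16 as documented above."""
    outtype = env.get("OUTTYPE", "f16").lower()
    if outtype not in ALLOWED_OUTTYPES:
        raise ValueError(f"OUTTYPE must be one of {sorted(ALLOWED_OUTTYPES)}, got {outtype!r}")
    return outtype


sample = "HF_TOKEN=hf_xxxx\n# Optional\nOUTTYPE=q8_0\n"
print(resolve_outtype(load_env(sample)))
```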
Path overrides
Everything is written under the current working directory. Override with env vars if you want to share things between workspaces:
| Variable | Default | Purpose |
|---|---|---|
| HF2OLLAMA_WORKSPACE | $PWD | Base directory for everything below. |
| HF2OLLAMA_HF_DIR | <workspace>/hf | Where HF snapshots and GGUFs go. |
| HF2OLLAMA_CACHE_DIR | <workspace>/.hf_cache | huggingface_hub cache. |
| HF2OLLAMA_LLAMA_CPP_DIR | <workspace>/llama.cpp | Where to clone llama.cpp. |
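The defaults in the table can be read as a simple fallback chain: each directory comes from its own variable if set, otherwise from the workspace. The sketch below shows that logic under the assumption that an empty variable counts as unset. resolve_paths is a hypothetical helper, not part of hf2ollama's documented API.

```python
import os
from pathlib import Path


def resolve_paths(env=None, cwd=None) -> dict[str, Path]:
    """Resolve workspace directories from env vars, falling back to the table's defaults."""
    env = os.environ if env is None else env
    workspace = Path(env.get("HF2OLLAMA_WORKSPACE") or cwd or Path.cwd())
    return {
        "workspace": workspace,
        "hf_dir": Path(env.get("HF2OLLAMA_HF_DIR") or workspace / "hf"),
        "cache_dir": Path(env.get("HF2OLLAMA_CACHE_DIR") or workspace / ".hf_cache"),
        "llama_cpp_dir": Path(env.get("HF2OLLAMA_LLAMA_CPP_DIR") or workspace / "llama.cpp"),
    }


# Share one llama.cpp clone while keeping everything else under the workspace.
paths = resolve_paths({"HF2OLLAMA_LLAMA_CPP_DIR": "/opt/llama.cpp"}, cwd="/work")
for key, value in paths.items():
    print(f"{key}: {value}")
```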
What ends up on disk
<workspace>/
├── .env                    # HF_TOKEN, OUTTYPE
├── hf/                     # HF snapshots land here
│   └── <org>/<name>/       # source files + resulting GGUF + Modelfile, all in one folder
│       ├── config.json
│       ├── model.safetensors
│       ├── ...
│       ├── <name>.f16.gguf
│       └── Modelfile
└── .hf_cache/              # local huggingface_hub cache
<workspace>/llama.cpp/      # cloned on first conversion run
The first run that needs conversion also clones
llama.cpp into your workspace so
the next run is fast. Only repos that ship prebuilt GGUF files skip this step.
To share one clone across several workspaces, set HF2OLLAMA_LLAMA_CPP_DIR
(e.g. HF2OLLAMA_LLAMA_CPP_DIR=../llama.cpp).
Example output
======================================================================
GGUF: <workspace>/hf/SicariusSicariiStuff/Assistant_Pepe_70B/Assistant_Pepe_70B.f16.gguf
Modelfile: <workspace>/hf/SicariusSicariiStuff/Assistant_Pepe_70B/Modelfile
Done. To import the model into Ollama, run these 2 commands:
ollama create assistant-pepe-70b -f <workspace>/hf/SicariusSicariiStuff/Assistant_Pepe_70B/Modelfile
ollama run assistant-pepe-70b
======================================================================
ollama create copies the layers into ~/.ollama/models/blobs/ and creates
a manifest itself. Do not manually copy files into ~/.ollama/models/ —
Ollama stores blobs by sha256, and a manual copy will break its index.
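To illustrate why a hand-copied file does not fit into that store, the sketch below computes the content digest that a blob is keyed by. The sha256-<hex> file name format is my understanding of Ollama's on-disk layout and should be treated as an assumption. blob_name is a hypothetical helper, and this is not a way to install models; ollama create remains the supported path.

```python
import hashlib
import tempfile
from pathlib import Path


def blob_name(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return a content-addressed name."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so multi-gigabyte GGUF files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return f"sha256-{digest.hexdigest()}"


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "example.gguf"
    sample.write_bytes(b"not a real model")
    print(blob_name(sample))
```

A file dropped into the directory under its original name has no matching digest entry in any manifest, so Ollama has nothing that points at it.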
Troubleshooting
RepositoryNotFoundError
The repo does not exist on HuggingFace. Some models are published elsewhere —
e.g. xai/grok-* lives behind the xAI API rather than HF, and this pipeline
cannot fetch it.
GatedRepoError
The model requires accepting a license. Open the model page on HF, click
"Agree and access", and make sure the token in .env has access to that repo.
No .safetensors files found
The repo only ships weights in the legacy pytorch_model.bin format. By default
.bin is excluded to avoid duplicating safetensors. Remove *.bin and *.pth
from IGNORE_PATTERNS in hf2ollama.py and rerun.
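The effect of the ignore list can be seen with a small glob filter. This sketch assumes IGNORE_PATTERNS holds shell-style globs like the two named above (huggingface_hub's snapshot_download accepts such a list through its ignore_patterns argument). The list contents and the helper files_to_download are illustrative only.

```python
from fnmatch import fnmatch

# Assumed contents; the real IGNORE_PATTERNS in hf2ollama.py may list more entries.
IGNORE_PATTERNS = ["*.bin", "*.pth"]


def files_to_download(repo_files: list[str], ignore_patterns: list[str]) -> list[str]:
    """Drop every file that matches at least one ignore glob."""
    return [f for f in repo_files if not any(fnmatch(f, p) for p in ignore_patterns)]


legacy_repo = ["config.json", "tokenizer.json", "pytorch_model.bin"]

# With the default patterns the only weight file is filtered out.
print(files_to_download(legacy_repo, IGNORE_PATTERNS))
# With the patterns removed, the legacy weights are fetched as well.
print(files_to_download(legacy_repo, []))
```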
Out of disk or VRAM
f16 for a 70B model is roughly 140 GB on disk and the same in VRAM when
loading. Two ways out:
- Convert to f16 first, then quantize to Q4_K_M (see below). Q4_K_M shrinks the file to roughly a third of its f16 size with minimal quality loss.
- Look for a community-built GGUF of the same model (<org>/<name>-GGUF). Then --list will show available quants, and --quant Q4_K_M downloads just the one you want.
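The 140 GB figure follows from parameters × bits per weight ÷ 8. The sketch below applies the same arithmetic to a few formats. The f16 and f32 widths are exact, 8.5 bits for q8_0 comes from its block layout, and 4.85 bits for Q4_K_M is an approximate average that varies by model. Real files also carry metadata, so treat the output as a ballpark.

```python
def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Approximate bits per weight; the Q4_K_M value is an assumption, not a spec.
BITS_PER_WEIGHT = {"f32": 32.0, "f16": 16.0, "q8_0": 8.5, "Q4_K_M": 4.85}

for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name:>7}: ~{estimated_size_gb(70e9, bpw):.0f} GB")
```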
Manual quantization (optional)
If you've converted to f16 and want a smaller Q4_K_M:
cd llama.cpp # or wherever HF2OLLAMA_LLAMA_CPP_DIR points
cmake -B build && cmake --build build --target llama-quantize -j
./build/bin/llama-quantize \
<workspace>/hf/<org>/<name>/<name>.f16.gguf \
<workspace>/hf/<org>/<name>/<name>.Q4_K_M.gguf \
Q4_K_M
Then create a new Modelfile pointing at the Q4_K_M.gguf and run
ollama create with a new name.
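A Modelfile only needs a FROM line that points at the GGUF, so the new one can be generated in a few lines. This is a minimal sketch: the Modelfile that hf2ollama writes may contain more (for example a template or parameters), and write_modelfile and the model name my-model-q4 are placeholders.

```python
import tempfile
from pathlib import Path


def write_modelfile(gguf_path: Path, modelfile_path: Path) -> Path:
    """Write a one-line Modelfile whose FROM points at the given GGUF."""
    modelfile_path.write_text(f"FROM {gguf_path.resolve()}\n", encoding="utf-8")
    return modelfile_path


# Demo in a temporary directory; in practice use <workspace>/hf/<org>/<name>/.
with tempfile.TemporaryDirectory() as tmp:
    gguf = Path(tmp) / "model.Q4_K_M.gguf"
    gguf.touch()  # stand-in for the quantized file
    modelfile = write_modelfile(gguf, Path(tmp) / "Modelfile.q4")
    print(modelfile.read_text(encoding="utf-8"), end="")
    print(f"ollama create my-model-q4 -f {modelfile}")
```

Writing to a separate file name keeps the original f16 Modelfile intact, so both variants can be registered side by side.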