Download and quantize the HyperNix PyTorch model to GGUF (fp32/fp16/Q8_0/Q6_K/Q4_K_M).
hypernix
Download the ray0rf1re/hyper-nix.1
PyTorch model and export GGUF files at fp32 and fp16 precision on
Ubuntu, Arch, Fedora, openSUSE, Alpine, NixOS, or anything else with CPython
3.10+ and PyTorch 2.7+. k-quants (Q8_0, Q6_K, Q4_K_M, Q5_K_M) are
opt-in via --quants when a llama-quantize binary is
available locally.
The converter is architecture-agnostic: it introspects the state dict, so any HyperNix checkpoint works regardless of depth, hidden size, head count, FFN width, or vocabulary size.
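For illustration, a minimal sketch of that introspection, assuming Llama-style key names (the real converter pattern-matches several layouts; see "How it works" below):

import re
import torch

def infer_dims(state_dict: dict[str, torch.Tensor]) -> dict[str, int]:
    # Embedding shape gives vocab and hidden size directly.
    embed = state_dict["model.embed_tokens.weight"]        # (vocab, hidden)
    vocab_size, hidden_size = embed.shape
    # Count distinct "layers.N." indices to get the depth.
    layer_ids = {
        int(m.group(1))
        for k in state_dict
        if (m := re.search(r"layers\.(\d+)\.", k))
    }
    # The FFN up-projection gives the intermediate width.
    up = state_dict["model.layers.0.mlp.up_proj.weight"]   # (ffn, hidden)
    return {
        "vocab_size": vocab_size,
        "hidden_size": hidden_size,
        "num_hidden_layers": max(layer_ids) + 1,
        "intermediate_size": up.shape[0],
    }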
Install
From PyPI (recommended):
pip install "hypernix[llama-cpp]"
From a GitHub Release download (the .whl or .tar.gz attached to a
release) — these are real Python distributions, pip accepts them
directly:
pip install hypernix-0.1.2-py3-none-any.whl
# or:
pip install hypernix-0.1.2.tar.gz
From a GitHub Actions artifact download (the hypernix-dist-*.zip
you get from the Actions tab) — GitHub wraps every artifact in a .zip
on download, so you must extract first:
unzip hypernix-dist-0.1.2-*.zip -d hypernix-dist
pip install hypernix-dist/hypernix-0.1.2-py3-none-any.whl
Each artifact zip contains an INSTALL.txt with platform-specific
one-liners; the build also publishes hypernix-wheel-<ver> and
hypernix-sdist-<ver> artifacts so you can grab the raw wheel or sdist
without unzipping a multi-file bundle.
Or install everything with the distro bootstrap script:
./scripts/install_deps.sh # handles apt, pacman, dnf, zypper, apk, nix
source .venv/bin/activate
hypernix doctor # sanity-check the environment
Requires:
- Linux (x86_64 / aarch64)
- Python 3.10 – 3.13 (3.12 recommended)
- PyTorch 2.7 or newer (CPU, CUDA 11.8, CUDA 12.x, or ROCm)
- llama-quantize: shipped via the [llama-cpp] extra, or from pacman -S llama.cpp / dnf install llama-cpp / a source build. If none of those are available, hypernix auto-downloads a prebuilt CPU binary from the upstream ggml-org/llama.cpp GitHub release and caches it under ~/.cache/hypernix/bin/. Disable with --no-auto-fetch, or pre-seed the cache with hypernix fetch-llama-quantize. (Resolution order sketched below.)
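A sketch of that resolution order; the cache path is the documented one, while the function name and return contract are illustrative:

import shutil
from pathlib import Path

CACHE = Path.home() / ".cache" / "hypernix" / "bin" / "llama-quantize"

def find_llama_quantize(auto_fetch: bool = True) -> str | None:
    # 1. Anything on PATH: the [llama-cpp] extra, a distro package, a source build.
    if path := shutil.which("llama-quantize"):
        return path
    # 2. A previously fetched binary in the documented cache directory.
    if CACHE.is_file():
        return str(CACHE)
    # 3. With auto-fetch enabled, download a prebuilt CPU binary from the
    #    ggml-org/llama.cpp release into CACHE (omitted here);
    #    --no-auto-fetch skips this step.
    return None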
Picking a torch build (CPU / CUDA 11 / CUDA 12)
pip install hypernix pulls the default torch wheel for your platform,
which is currently a CUDA-12 build on Linux x86_64. If you have an older
CUDA 11 driver (or a GPU that doesn't have CUDA-12 support) install
torch from the CUDA-11.8 index first, then install hypernix — pip
will reuse the already-installed torch instead of replacing it:
# CUDA 11.8 (older drivers / Kepler-through-Pascal GPUs on old stacks)
pip install --index-url https://download.pytorch.org/whl/cu118 torch
pip install hypernix
# CUDA 12.x (modern default)
pip install --index-url https://download.pytorch.org/whl/cu124 torch
pip install hypernix
# CPU-only (no GPU, or you don't want to pull CUDA deps)
pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install hypernix
hypernix doctor prints the installed torch build (cuda=11.8,
cuda=12.4, or cpu) so you can confirm after install.
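The same information is available directly from torch's public attributes, if you want to check without hypernix:

import torch

print(torch.__version__)          # e.g. "2.7.1+cu124" or "2.7.1+cpu"
print(torch.version.cuda)         # "12.4", "11.8", or None on CPU builds
print(torch.cuda.is_available())  # True only with a usable GPU + driver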
Quickstart
# Default: fp32 + fp16 only — no external binary required.
hypernix \
--repo-id ray0rf1re/hyper-nix.1 \
--output-dir ./hypernix-gguf
# Opt in to k-quants (needs llama-quantize on PATH, or the [llama-cpp] extra):
hypernix \
--repo-id ray0rf1re/hyper-nix.1 \
--output-dir ./hypernix-gguf \
--quants fp32 fp16 q8_0 q6_k q4_k_m
Python API:
from hypernix import download_model, convert_to_gguf, quantize_gguf

model_dir = download_model("ray0rf1re/hyper-nix.1")                     # HF snapshot
fp16 = convert_to_gguf(model_dir, "hyper-nix-fp16.gguf", dtype="fp16")  # single fp16 intermediate
quantize_gguf(fp16, "hyper-nix-q4_k_m.gguf", "q4_k_m")                  # fp16 reused for each k-quant
quantize_gguf(fp16, "hyper-nix-q6_k.gguf", "q6_k")
quantize_gguf(fp16, "hyper-nix-q8_0.gguf", "q8_0")
Publish all artifacts to ray0rf1re/HyperNix.1-gguf:
HF_TOKEN=hf_xxx hypernix --upload-to ray0rf1re/HyperNix.1-gguf
Supported distros
| Distro | Install |
|---|---|
| Ubuntu 22.04+ / Debian 12+ | sudo apt install python3.12 python3.12-venv && pip install "hypernix[llama-cpp]" |
| Arch / Manjaro / EndeavourOS | sudo pacman -S python python-pip llama.cpp && pip install hypernix |
| Fedora / RHEL / Alma / Rocky | sudo dnf install python3.12 llama-cpp && pip install hypernix |
| openSUSE Tumbleweed | sudo zypper install python312 llama.cpp && pip install hypernix |
| Alpine 3.20+ | sudo apk add python3 py3-pip bash && pip install "hypernix[llama-cpp]" |
| NixOS | nix-shell -p python312 llama-cpp --run 'pip install hypernix' |
hypernix doctor prints an environment report with the exact llama-quantize
path it will use, Python / PyTorch / dependency versions, and distro id.
Laptop-grade CPUs
scripts/quantize_i7_7660u.sh is tuned for a 2C/4T Kaby Lake ultrabook
(Intel i7-7660U) and works on any faster CPU:
# fp16 + Q8_0 + Q6_K + Q4_K_M into ./hypernix-gguf
./scripts/quantize_i7_7660u.sh
# build, then push to ray0rf1re/HyperNix.1-gguf
HF_TOKEN=hf_xxx ./scripts/quantize_i7_7660u.sh --upload
It caps BLAS/OpenMP to 4 threads, runs at reduced CPU + I/O priority
when nice/ionice are available, and only keeps a single fp16
intermediate on disk.
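For reference, the same caps applied from Python (the shipped script sets these in shell; the env vars are the standard OpenMP/OpenBLAS/MKL knobs and must be set before torch imports):

import os

for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ.setdefault(var, "4")

import torch

torch.set_num_threads(4)  # cap intra-op parallelism at 4 hardware threads
os.nice(10)               # lower CPU priority, similar to `nice -n 10`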
CLI reference
hypernix <subcommand> [options]
Subcommands (all script-friendly, each wraps one library function):
all download -> convert -> [quantize] (default)
download fetch a HuggingFace snapshot
convert produce an fp32 or fp16 GGUF from a snapshot
quantize run llama-quantize on an fp16/fp32 GGUF
verify read-validate a GGUF and print its headers
info package + optional GGUF header summary
upload push files to a HuggingFace repo
doctor environment diagnostic
fetch-llama-quantize pre-seed the llama-quantize cache
train init create a fresh HyperNix snapshot
train expand warm-start a bigger model from a smaller one
train run minimal causal-LM training loop
hypernix with only flags (no subcommand) still runs the full all
pipeline, so existing scripts keep working.
| Quant alias | llama.cpp enum |
|---|---|
| fp32, f32 | F32 |
| fp16, f16 | F16 |
| q8, q8_0 | Q8_0 |
| q6, q6_k | Q6_K |
| q4km, q4_k_m | Q4_K_M |
| q5km, q5_k_m | Q5_K_M |
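A literal rendering of that table as a lookup dict, if you need the same normalization in your own scripts (illustrative, not the package's internal table):

QUANT_ALIASES = {
    "fp32": "F32", "f32": "F32",
    "fp16": "F16", "f16": "F16",
    "q8": "Q8_0", "q8_0": "Q8_0",
    "q6": "Q6_K", "q6_k": "Q6_K",
    "q4km": "Q4_K_M", "q4_k_m": "Q4_K_M",
    "q5km": "Q5_K_M", "q5_k_m": "Q5_K_M",
}

def normalize(alias: str) -> str:
    # Case-insensitive lookup; raises KeyError for unknown aliases.
    return QUANT_ALIASES[alias.strip().lower()]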
Training larger HyperNix models
The train subcommand is a small scaffold for standing up a same-size
or bigger HyperNix model. Install with:
pip install "hypernix[train]"
Initialize a new HyperNix at a chosen shape:
hypernix train init \
--out-dir ./hyper-nix-v2 \
--tokenizer-source ./hyper-nix-v1 \
--hidden-size 1536 --intermediate-size 6144 \
--num-hidden-layers 24 --num-attention-heads 24
Warm-start a bigger model from an existing smaller checkpoint —
overlapping rows/columns copy over, new slots init from N(0, std),
extra blocks duplicate the last old block:
hypernix train expand \
--src-dir ./hyper-nix-v1 \
--dst-dir ./hyper-nix-v2 \
--hidden-size 1536 --intermediate-size 6144 \
--num-hidden-layers 24
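A minimal sketch of the expansion rule for a single weight matrix, assuming a 0.02 init std (the real command additionally duplicates the last old block for any extra layers):

import torch

def expand_tensor(old: torch.Tensor, new_shape: tuple[int, ...],
                  std: float = 0.02) -> torch.Tensor:
    new = torch.randn(new_shape) * std  # new slots drawn from N(0, std)
    # Copy the overlapping block from the old tensor.
    overlap = tuple(slice(0, min(o, n)) for o, n in zip(old.shape, new_shape))
    new[overlap] = old[overlap]
    return new

w_old = torch.randn(1024, 1024)
w_new = expand_tensor(w_old, (1536, 1536))
assert torch.equal(w_new[:1024, :1024], w_old)  # old weights preserved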
Run a minimal causal-LM training loop on a raw-text file (smoke-test / short continue-pretrain, not a full trainer):
hypernix train run \
--model-dir ./hyper-nix-v2 \
--dataset ./corpus.txt \
--out-dir ./hyper-nix-v2-trained \
--steps 1000 --batch-size 2 --context-length 512
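A rough picture of what such a loop does, using transformers directly; the values mirror the CLI example above, and a custom architecture may need trust_remote_code=True:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir, context_length, batch_size, steps = "./hyper-nix-v2", 512, 2, 1000
tok = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
opt = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Tokenize the whole raw-text corpus once into a flat id tensor.
ids = tok(open("./corpus.txt").read(), return_tensors="pt").input_ids[0]
model.train()
for step in range(steps):
    # Sample random windows of `context_length` tokens from the corpus.
    starts = torch.randint(0, len(ids) - context_length - 1, (batch_size,))
    batch = torch.stack([ids[s : s + context_length] for s in starts])
    loss = model(input_ids=batch, labels=batch).loss  # shifted-label LM loss
    loss.backward()
    opt.step()
    opt.zero_grad()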
The output of train init/train expand/train run is a standard
HuggingFace snapshot directory, so you can feed it straight into
hypernix convert (or hypernix all --model-dir).
How it works
- huggingface_hub.snapshot_download pulls weights + tokenizer files.
- The converter loads the state dict, infers dimensions from tensor shapes (so any HyperNix size works), and maps tensor names onto llama.cpp's canonical GGUF layout when a recognizable pattern matches (Llama, GPT-NeoX, GPT-2, nanoGPT). Unknown names round-trip verbatim.
- llama-quantize consumes the fp16 GGUF to produce each k-quant.
The CLI emits exactly one fp16 intermediate and reuses it for every k-quant in the plan.
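A sketch of that name mapping for a GPT-2-style checkpoint; the target names ("token_embd.weight", "blk.N.*", "output_norm.weight") are llama.cpp's canonical GGUF layout, while the source patterns and rule set here are illustrative:

import re

RULES = [
    (r"^transformer\.wte\.weight$", "token_embd.weight"),
    (r"^transformer\.h\.(\d+)\.attn\.c_attn\.weight$", r"blk.\1.attn_qkv.weight"),
    (r"^transformer\.h\.(\d+)\.mlp\.c_fc\.weight$", r"blk.\1.ffn_up.weight"),
    (r"^transformer\.ln_f\.weight$", "output_norm.weight"),
]

def map_name(name: str) -> str:
    for pattern, target in RULES:
        if re.match(pattern, name):
            return re.sub(pattern, target, name)
    return name  # unknown names round-trip verbatim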
Examples
- examples/quickstart.py: 5-line Python API demo.
- examples/custom_arch.py: arbitrary-size HyperNix.
- examples/upload_to_hub.py: publish to the Hub.
Build / release
Local build:
pip install build twine
python -m build # produces dist/*.whl and dist/*.tar.gz
twine check --strict dist/*
GitHub Actions
Four workflows live under .github/workflows/:
| Workflow | Trigger | Does |
|---|---|---|
| ci.yml | push / PR | ruff lint, pytest across Python 3.10–3.13 on ubuntu-latest + ubuntu-22.04, editable install, setup.py --version compat, plus a build-check job that verifies the sdist really contains tests/, examples/, scripts/, .github/workflows/, MANIFEST.in, setup.py, setup.cfg. |
| build.yml | reusable (workflow_call) + manual workflow_dispatch + push to main touching packaging | Builds sdist + wheel, runs twine check --strict, test-installs both the wheel and the sdist in clean venvs, bundles scripts/ + examples/ + README + LICENSE into an extra tarball for non-pip users, generates SHA256SUMS, and uploads a single 90-day-retention artifact containing all four files (*.whl, *.tar.gz, *-scripts-examples.tar.gz, SHA256SUMS). |
| release.yml | tag vX.Y.Z (or vX.Y.Z-rc1 / -pre / aN / bN) | Calls build.yml for a single source of truth, verifies the tag matches pyproject.toml, classifies stable vs prerelease, creates a GitHub Release with all artifacts attached (prerelease flag set automatically), then publishes to PyPI (stable tags) or TestPyPI (prerelease tags) via Trusted Publishing, so no API token is needed. Manual workflow_dispatch can also push to TestPyPI for smoke-testing. |
| public-release.yml | manual workflow_dispatch with version input | One-click public release: validates the PEP 440 version, bumps pyproject.toml + setup.cfg + src/hypernix/__init__.py, runs ruff + pytest + python -m build, commits the bump, creates an annotated tag with an auto-generated changelog from git log, and pushes, which fires release.yml. Has a dry_run toggle that preserves the built artifacts as an artifact for inspection. |
Trusted Publishing setup (one-time, per registry):
- On PyPI → add publisher: repo minerofthesoal/hypernix-pip, workflow release.yml, environment pypi.
- On TestPyPI → same repo, workflow release.yml, environment testpypi.
Cutting a release:
# bump version in pyproject.toml AND setup.cfg -> e.g. 0.2.0
git commit -am "hypernix 0.2.0"
git tag -a v0.2.0 -m "hypernix 0.2.0"
git push origin main v0.2.0 # triggers release.yml
For a prerelease:
git tag -a v0.2.0-rc1 -m "hypernix 0.2.0-rc1"
git push origin v0.2.0-rc1 # -> TestPyPI + prerelease GitHub Release
Users can verify downloads with the attached SHA256SUMS:
sha256sum --check SHA256SUMS
License
Apache-2.0.
File details

Details for the file hypernix-0.31.1.tar.gz.

File metadata
- Download URL: hypernix-0.31.1.tar.gz
- Size: 68.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | f143ada6ab492ec6373f06214d67b41fb3b4ae8c34146fe01a3b94c266339eb5 |
| MD5 | 7688c6656d56a169c72811ab83cd5715 |
| BLAKE2b-256 | 315ac5997723092efa8db5108aa0715849bfd06e601ecbb8e53c7071d7dc283c |
Provenance

The following attestation bundles were made for hypernix-0.31.1.tar.gz:

Publisher: public-release.yml on minerofthesoal/HyperNix-pip

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hypernix-0.31.1.tar.gz
- Subject digest: f143ada6ab492ec6373f06214d67b41fb3b4ae8c34146fe01a3b94c266339eb5
- Sigstore transparency entry: 1354942672
- Permalink: minerofthesoal/HyperNix-pip@e0669c5d11d5d01442207262625e3193428f849e
- Branch / Tag: refs/heads/claude/pytorch-quantization-package-cJMQp
- Owner: https://github.com/minerofthesoal
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: public-release.yml@e0669c5d11d5d01442207262625e3193428f849e
- Trigger Event: workflow_dispatch
File details

Details for the file hypernix-0.31.1-py3-none-any.whl.

File metadata
- Download URL: hypernix-0.31.1-py3-none-any.whl
- Size: 55.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | f230a51aa79ff4992f88f5cde80adf9e8f247fbf6e219c55ca5ff9ffcbe586dd |
| MD5 | aa78d59100900109de14d6ee2d3b4318 |
| BLAKE2b-256 | 5c3d20b1fd66b7f5bd079a8256e34e5e0f68a617e7acc0ca2abb7e7d26ee0cef |
Provenance

The following attestation bundles were made for hypernix-0.31.1-py3-none-any.whl:

Publisher: public-release.yml on minerofthesoal/HyperNix-pip

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hypernix-0.31.1-py3-none-any.whl
- Subject digest: f230a51aa79ff4992f88f5cde80adf9e8f247fbf6e219c55ca5ff9ffcbe586dd
- Sigstore transparency entry: 1354943059
- Permalink: minerofthesoal/HyperNix-pip@e0669c5d11d5d01442207262625e3193428f849e
- Branch / Tag: refs/heads/claude/pytorch-quantization-package-cJMQp
- Owner: https://github.com/minerofthesoal
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: public-release.yml@e0669c5d11d5d01442207262625e3193428f849e
- Trigger Event: workflow_dispatch