Automated setup for AMD Strix Halo APU: ROCm, MIGraphX, custom ORT, and AI model deployment.
stxh-setup-amd
Automated setup for AMD Strix Halo APU (gfx1151 / RDNA 3.5).
Takes a fresh Ubuntu 24.04 machine from zero to running 23 AI models on the ONNX Runtime (ORT) MIGraphX Execution Provider. The setup covers ROCm 7.2, MIGraphX, about 400 GB of swap, GRUB tuning, the Python ML stack, and a custom ONNX Runtime build with unified-memory fixes.
Install
pip install stxh-setup-amd
Usage
# Full setup (all 8 stages)
export SMB_USER="user@amd.com"
export SMB_PASS="password"
sudo -E stxh-setup --all
# Resume from a specific stage
sudo stxh-setup --from-stage 5
# Run a single stage
sudo stxh-setup --stage 3 # ROCm only
# Skip slow steps
sudo stxh-setup --all --skip-copy --skip-ort-build
# Check what's already done
sudo stxh-setup --status
# Verify everything works
sudo stxh-setup --verify
Fallback if stxh-setup is not on PATH:
sudo python3 -m stxh_setup --all
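Once setup has finished, one quick manual check is to ask ONNX Runtime which execution providers the installed wheel exposes. The sketch below is not part of `stxh-setup` and says nothing about what `--verify` does internally; `pick_providers` and `detect_providers` are hypothetical helper names. It only assumes the standard `onnxruntime.get_available_providers()` call and the provider name `MIGraphXExecutionProvider`.

```python
"""Prefer the MIGraphX execution provider when the installed ORT exposes it."""

# Priority order: MIGraphX first, CPU as the fallback.
PREFERRED = ["MIGraphXExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available):
    """Return the preferred providers that are actually available, in priority order."""
    return [name for name in PREFERRED if name in available]


def detect_providers():
    """Query the installed onnxruntime (imported lazily so this file loads without it)."""
    import onnxruntime as ort

    return pick_providers(ort.get_available_providers())
```

On a machine where stage 8 succeeded, `detect_providers()` would be expected to list `MIGraphXExecutionProvider` first. The result can be passed as the `providers=` argument of `onnxruntime.InferenceSession`.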
The 8 Stages
| Stage | What it does | Time |
|---|---|---|
| 1 | System update + base packages (build-essential, cmake, ffmpeg, ...) | 5 min |
| 2 | Copy R1models from network share (~206 GB) | 30-90 min |
| 3 | GPU driver + ROCm 7.2 (requires reboot) | 10 min |
| 4 | MIGraphX | 2 min |
| 5 | Swap space (~400 GB for JIT compilation) | 5 min |
| 6 | GRUB kernel parameters (amdgpu.gttsize=28672, requires reboot) | 1 min |
| 7 | Python dependencies (PyTorch+ROCm, JAX+ROCm, TF, HuggingFace, ...) | 15 min |
| 8 | Custom ORT 1.25.0 build with MIGraphX fixes (or installs existing wheel) | 60-90 min |
Each stage is idempotent -- if it detects the work is already done, it skips.
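The idempotent, resumable behavior described above can be sketched as a small stage runner. This is an illustration of the pattern only, not the package's actual code: `Stage`, `is_done`, `run`, and `run_stages` are hypothetical names, and the real probes used by `stxh-setup` are not shown in this README.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    number: int
    name: str
    is_done: Callable[[], bool]  # cheap probe: is the work already in place?
    run: Callable[[], None]      # performs the work


def run_stages(stages: List[Stage], from_stage: int = 1) -> List[str]:
    """Run stages in order, skipping any whose probe reports the work is done."""
    log = []
    for stage in sorted(stages, key=lambda s: s.number):
        if stage.number < from_stage:
            continue  # mirrors the idea behind --from-stage
        if stage.is_done():
            log.append(f"stage {stage.number} ({stage.name}): skipped")
            continue
        stage.run()
        log.append(f"stage {stage.number} ({stage.name}): ran")
    return log
```

Because each stage checks real state rather than a "has run" flag, running the sequence a second time is harmless: every probe passes and nothing is redone.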
Custom ORT Fixes
The stock onnxruntime-migraphx package from PyPI runs out of memory on any
model larger than 512 MB because the Strix Halo APU has only 512 MB of
dedicated VRAM. Stage 8 applies two source patches before building:
| Fix | What |
|---|---|
| B | hipMalloc -> hipMallocManaged -- spill to 28 GB GTT when VRAM is full |
| C | [=] -> [=, this] -- C++20 lambda capture build fix |
Fix A (skip constant initializer double-allocation) is already upstream in ORT 1.25.0.
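To make the two patches concrete, here is a minimal sketch of the text substitutions they describe, applied to C++ source held in a Python string. It is an assumption about the shape of the edits, not the patch code shipped in the package; `apply_fix_b` and `apply_fix_c` are hypothetical names. Fix B relies on `hipMallocManaged` accepting the same first two arguments as `hipMalloc` (its third `flags` argument has a default), and Fix C reflects that C++20 deprecates implicit capture of `this` through `[=]`.

```python
import re

# \b plus the "(" lookahead leaves hipMallocManaged, hipMallocAsync, etc. untouched.
_HIP_MALLOC_CALL = re.compile(r"\bhipMalloc(?=\s*\()")


def apply_fix_b(source: str) -> str:
    """Fix B: allocate managed memory so buffers can spill from VRAM into GTT."""
    return _HIP_MALLOC_CALL.sub("hipMallocManaged", source)


def apply_fix_c(source: str) -> str:
    """Fix C: make the `this` capture explicit in by-copy lambdas.

    A blanket replace is only safe for lambdas inside member functions; a real
    patch would target the specific lines that fail to build.
    """
    return source.replace("[=]", "[=, this]")
```

The lookahead matters for idempotency: running `apply_fix_b` twice does not produce `hipMallocManagedManaged`, because an already patched call no longer matches the pattern.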
Hardware Target
- APU: AMD Strix Halo (gfx1151 / RDNA 3.5)
- VRAM: 512 MB dedicated + 28 GB GTT (unified memory)
- OS: Ubuntu 24.04 (Noble), kernel 6.18.0+
- ROCm: 7.2
- Key setting: HSA_XNACK=1 (demand paging for unified memory)
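The GTT figure and the GRUB parameter line up arithmetically: `amdgpu.gttsize` is given in MiB, so 28672 / 1024 = 28 GiB. The following sketch checks both settings on a running machine. The helper names are hypothetical and this is not the package's own verification logic; it assumes the parameter appears on the kernel command line (readable from `/proc/cmdline`) and that `HSA_XNACK` is set in the process environment.

```python
import os


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {key: value}; bare flags map to None.

    Quoted values containing spaces are not handled in this sketch.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def gtt_size_gib(cmdline: str):
    """Return amdgpu.gttsize converted from MiB to GiB, or None if it is not set."""
    value = parse_cmdline(cmdline).get("amdgpu.gttsize")
    return int(value) / 1024 if value else None


def xnack_enabled(environ=os.environ) -> bool:
    """True when HSA_XNACK=1 is present in the given environment mapping."""
    return environ.get("HSA_XNACK") == "1"
```

On the target machine, `gtt_size_gib(open("/proc/cmdline").read())` would be expected to return `28.0` after stage 6 and the following reboot.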
License
Apache 2.0
Copyright 2026 Sudheer Ibrahim Daniel Devu