shortfin - SHARK inference library and serving engine

The shortfin project is SHARK's open source, high performance inference library and serving engine. Shortfin consists of these major components:

  • The "libshortfin" inference library written in C/C++ and built on IREE
  • Python bindings for the underlying inference library
  • Example applications in 'shortfin_apps' built using the python bindings

Prerequisites

  • Python 3.11+

Simple user installation

Install the latest stable version:

pip install shortfin

Developer guides

Quick start: install local packages and run tests

After cloning this repository, from the shortfin/ directory:

pip install -e .

Install test requirements:

pip install -r requirements-tests.txt

Run tests:

pytest -s tests/

Simple dev setup

We recommend this development setup for core contributors:

  1. Check out this repository as a sibling to IREE if you already have an IREE source checkout. Otherwise, a pinned version will be downloaded for you.
  2. Ensure that python --version reads 3.11 or higher (3.12 preferred).
  3. Run ./dev_me.py to build and install the shortfin Python package with both a tracing-enabled and default build. Run it again for an incremental build, or delete the build/ directory to start over.
  4. Run tests with python -m pytest -s tests/
  5. Test optional features:
    • pip install iree-base-compiler to run a small suite of model tests intended to exercise the runtime (or use a source build of IREE).
    • pip install onnx to run some more model tests that depend on downloading ONNX models.
    • Run tests on devices other than the CPU with flags like: --system amdgpu --compile-flags="--iree-hal-target-device=hip --iree-hip-target=gfx1100"
    • Use the tracy instrumented runtime to collect execution traces: export SHORTFIN_PY_RUNTIME=tracy
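The SHORTFIN_PY_RUNTIME variable above switches between the default runtime and the tracy-instrumented one. Conceptually, the selection resembles the following sketch (the module names _shortfin_default and _shortfin_tracy and the dispatch logic are illustrative assumptions, not the actual implementation):

```python
import os

def select_runtime_module(env=os.environ) -> str:
    """Pick a runtime flavor based on SHORTFIN_PY_RUNTIME (sketch only)."""
    flavor = env.get("SHORTFIN_PY_RUNTIME", "default")
    if flavor not in ("default", "tracy"):
        raise ValueError(f"Unknown SHORTFIN_PY_RUNTIME value: {flavor}")
    # The tracy flavor would load the tracing-enabled native library
    # produced by the dev_me.py dual build.
    return "_shortfin_tracy" if flavor == "tracy" else "_shortfin_default"
```

Unset or "default" selects the normal runtime; any other value besides "tracy" is rejected rather than silently ignored.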

Refer to the advanced build options below for other scenarios.

Advanced build options

  1. Native C++ build
  2. Local Python release build
  3. Package Python release build
  4. Python dev build

Prerequisites

  • A modern C/C++ compiler, such as clang 18 or gcc 12
  • A modern Python, such as Python 3.12

Native C++ builds

cmake -GNinja -S. -Bbuild \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_LINKER_TYPE=LLD
cmake --build build --target all

If Python bindings are enabled in this mode (-DSHORTFIN_BUILD_PYTHON_BINDINGS=ON), then pip install -e build/ will install from the build directory (and supports an incremental build/continue workflow).

Package Python release builds

  • To build wheels for Linux using a manylinux Docker container:

    sudo ./build_tools/build_linux_package.sh
    
  • To build a wheel for your host OS/arch manually:

    # Build shortfin.*.whl into the dist/ directory
    #   e.g. `shortfin-0.9-cp312-cp312-linux_x86_64.whl`
    python3 -m pip wheel -v -w dist .
    
    # Install the built wheel.
    python3 -m pip install dist/*.whl
    

Python dev builds

# Install build system pre-reqs (since we are building in dev mode, this
# is not done for us). See source of truth in pyproject.toml:
pip install setuptools wheel

# Optionally install cmake and ninja if you don't have them or need a newer
# version. If doing heavy development in Python, it is strongly recommended
# to install these natively on your system as it will make it easier to
# switch Python interpreters and build options (and the launcher in debug/asan
# builds of Python is much slower). Note CMakeLists.txt for minimum CMake
# version, which is usually quite recent.
pip install cmake ninja

SHORTFIN_DEV_MODE=ON pip install --no-build-isolation -v -e .

Note that the --no-build-isolation flag is useful in development setups because it avoids creating an intermediate venv that would keep later invocations of cmake/ninja from working at the command line. If just doing a one-shot build, it can be omitted.

Once built the first time, the cmake, ninja, and ctest commands can be run directly from build/cmake, and changes will apply directly to the next process launch.

Several optional environment variables can be used with setup.py:

  • SHORTFIN_CMAKE_BUILD_TYPE=Debug : Sets the CMAKE_BUILD_TYPE. Defaults to Debug for dev mode and Release otherwise.
  • SHORTFIN_ENABLE_ASAN=ON : Enables an ASAN build. Requires a Python runtime setup that is ASAN clean (either by env vars to preload libraries or set suppressions or a dev build of Python with ASAN enabled).
  • SHORTFIN_IREE_SOURCE_DIR=$(pwd)/../../iree : Builds against a local IREE source checkout instead of the pinned version.
  • SHORTFIN_RUN_CTESTS=ON : Runs ctest as part of the build. Useful for CI as it uses the version of ctest installed in the pip venv.
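As a sketch of how the build-type default above behaves (illustrative only, not the actual setup.py code):

```python
import os

def effective_build_type(dev_mode: bool, env=os.environ) -> str:
    """Resolve CMAKE_BUILD_TYPE: Debug for dev mode, Release otherwise,
    unless SHORTFIN_CMAKE_BUILD_TYPE overrides it."""
    default = "Debug" if dev_mode else "Release"
    return env.get("SHORTFIN_CMAKE_BUILD_TYPE", default)
```

So a plain SHORTFIN_DEV_MODE=ON build lands on Debug, while setting SHORTFIN_CMAKE_BUILD_TYPE=RelWithDebInfo wins in either mode.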

Running tests

The project uses a combination of ctest for native C++ tests and pytest. Much of the functionality is only tested via the Python tests, using the _shortfin.lib internal implementation directly. In order to run these tests, you must have installed the Python package as per the above steps.

Which style of test is used is pragmatic and geared at achieving good test coverage with a minimum of duplication. Since it is often much more expensive to build native tests of complicated flows, many things are only tested via Python. This does not preclude having other language bindings later, but it does mean that the C++ core of the library must always be built with the Python bindings to test most behavior. Given the target of the project, this is not considered a significant issue.

Python tests

Run platform independent tests only:

pytest tests/

Run tests for a specific platform as well (in this example, a gfx1100 AMD GPU). Note that not all tests are system aware yet; some may only run on the CPU:

pytest tests/ --system amdgpu \
    --compile-flags="--iree-hal-target-device=hip --iree-hip-target=gfx1100"
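The --compile-flags value is a single string carrying several compiler arguments. A hypothetical helper (not part of shortfin) shows how such a string splits into individual flags, including correct handling of quoted values:

```python
import shlex

def split_compile_flags(flags: str) -> list:
    """Split a --compile-flags string into individual IREE compiler arguments."""
    return shlex.split(flags)
```

For example, the string from the command above yields two separate arguments suitable for passing to a compiler invocation.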

8b Accuracy Test

You can launch an accuracy test against meta_llama3.1_8b_fp16 to verify that changes in sharktank and/or shortfin do not cause accuracy regressions.

This tests our server end-to-end against a dataset of custom prompts and validates the output against known good outputs.

For testing against GPUs, for example (gfx942):

IRPA_PATH=/path/to/your/irpa \
TOKENIZER_PATH=/path/to/your/tokenizer.json \
pytest -s app_tests/integration_tests/llm/shortfin/accuracy/accuracy_test.py \
  --log-cli-level=INFO \
  --test_device=gfx942

Production library building

In order to build a production library, we typically recommend these additional build steps:

  • Compile all deps with the same compiler/linker for LTO compatibility
  • Provide library dependencies manually and compile them with LTO
  • Compile dependencies with -fvisibility=hidden
  • Enable LTO builds of libshortfin
  • Set flags to enable symbol versioning
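For example, the LTO and visibility recommendations above might translate into a CMake configuration along these lines (a sketch using standard CMake options, not a tested production recipe):

```shell
cmake -GNinja -S. -Bbuild \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
    -DCMAKE_LINKER_TYPE=LLD \
    -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON \
    -DCMAKE_C_VISIBILITY_PRESET=hidden \
    -DCMAKE_CXX_VISIBILITY_PRESET=hidden
cmake --build build --target all
```

Providing dependencies manually so they compile under the same toolchain and LTO settings, and adding symbol-versioning link flags, remain project-specific steps beyond this sketch.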

Miscellaneous build topics

Free-threaded Python

Support for free-threaded Python builds (aka "nogil") is in progress. It is currently being tested via CPython 3.13 with the --disable-gil option set. There are multiple ways to acquire such an environment.
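To check whether a given interpreter is such a build, a small probe like this can help (sys._is_gil_enabled() only exists on CPython 3.13+, and Py_GIL_DISABLED is only set on free-threaded-capable builds; this is a generic diagnostic, not shortfin API):

```python
import sys
import sysconfig

def gil_status() -> str:
    """Report whether this interpreter is free-threaded and whether the GIL is on."""
    if not sysconfig.get_config_var("Py_GIL_DISABLED"):
        return "standard build (GIL always enabled)"
    check = getattr(sys, "_is_gil_enabled", None)
    if check is not None and check():
        return "free-threaded build, but GIL re-enabled at runtime"
    return "free-threaded build, GIL disabled"
```

On a standard interpreter this reports the GIL as always enabled; on a free-threaded 3.13 build it distinguishes whether the GIL was re-enabled at runtime (e.g. by an incompatible extension module).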
