SHARK inference library and serving engine
shortfin - SHARK inference library and serving engine
The shortfin project is SHARK's open source, high performance inference library and serving engine. Shortfin consists of these major components:
- The "libshortfin" inference library written in C/C++ and built on IREE
- Python bindings for the underlying inference library
- Example applications in 'shortfin_apps' built using the python bindings
Prerequisites
- Python 3.11+
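A script that depends on shortfin can check the interpreter version before importing anything else. The following is a minimal sketch: the helper name meets_minimum is hypothetical, and the (3, 11) floor is taken from the prerequisite above.

```python
import sys

# Minimum (major, minor) version, taken from the prerequisite listed above.
MIN_PYTHON = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) pair of version_info is at least minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that is running this snippet.
print("Python version OK" if meets_minimum() else "Python 3.11+ is required")
```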
Simple user installation
Install the latest stable version:
pip install shortfin
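To confirm which version pip installed, the standard library's importlib.metadata can be queried by distribution name. This is a small sketch; the helper name installed_version is hypothetical, and it only reads package metadata rather than importing shortfin itself.

```python
from importlib import metadata


def installed_version(dist_name="shortfin"):
    """Return the installed version string of dist_name, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string, or None when the package is not installed.
print(installed_version())
```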
Developer guides
Quick start: install local packages and run tests
After cloning this repository, from the shortfin/ directory:
pip install -e .
Install test requirements:
pip install -r requirements-tests.txt
Run tests:
pytest -s tests/
Simple dev setup
We recommend this development setup for core contributors:
- Check out this repository as a sibling to IREE if you already have an IREE source checkout. Otherwise, a pinned version will be downloaded for you
- Ensure that python --version reads 3.11 or higher (3.12 preferred).
- Run ./dev_me.py to build and install the shortfin Python package with both a tracing-enabled and default build. Run it again to do an incremental build, or delete the build/ directory to start over.
- Run tests with python -m pytest -s tests/
- Test optional features:
  - pip install iree-base-compiler to run a small suite of model tests intended to exercise the runtime (or use a source build of IREE).
  - pip install onnx to run some more model tests that depend on downloading ONNX models.
  - Run tests on devices other than the CPU with flags like:
    --system amdgpu --compile-flags="--iree-hal-target-backends=rocm --iree-hip-target=gfx1100"
  - Use the tracy instrumented runtime to collect execution traces:
    export SHORTFIN_PY_RUNTIME=tracy
Refer to the advanced build options below for other scenarios.
Advanced build options
- Native C++ build
- Local Python release build
- Package Python release build
- Python dev build
Prerequisites
- A modern C/C++ compiler, such as clang 18 or gcc 12
- A modern Python, such as Python 3.12
Native C++ builds
cmake -GNinja -S. -Bbuild \
-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_LINKER_TYPE=LLD
cmake --build build --target all
If Python bindings are enabled in this mode (-DSHORTFIN_BUILD_PYTHON_BINDINGS=ON), then pip install -e build/ will install from the build dir (and support build/continue).
Package Python release builds
- To build wheels for Linux using a manylinux Docker container:
  sudo ./build_tools/build_linux_package.sh
- To build a wheel for your host OS/arch manually:
  # Build shortfin.*.whl into the dist/ directory
  # e.g. `shortfin-0.9-cp312-cp312-linux_x86_64.whl`
  python3 -m pip wheel -v -w dist .
  # Install the built wheel.
  python3 -m pip install dist/*.whl
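The example wheel name above follows the standard wheel naming layout of name, version, Python tag, ABI tag, and platform tag. The sketch below splits such a name into its parts, which can help when checking that a built wheel matches the interpreter you intend to install it into. The names WheelTags and parse_wheel_filename are hypothetical, and the sketch assumes a filename without the optional build tag.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename (without a build tag) into its five components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    # Drop the extension, then split on the hyphens that separate the fields.
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_filename("shortfin-0.9-cp312-cp312-linux_x86_64.whl")
print(tags.python_tag, tags.platform_tag)  # prints: cp312 linux_x86_64
```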
Python dev builds
# Install build system pre-reqs (since we are building in dev mode, this
# is not done for us). See source of truth in pyproject.toml:
pip install setuptools wheel
# Optionally install cmake and ninja if you don't have them or need a newer
# version. If doing heavy development in Python, it is strongly recommended
# to install these natively on your system as it will make it easier to
# switch Python interpreters and build options (and the launcher in debug/asan
# builds of Python is much slower). Note CMakeLists.txt for minimum CMake
# version, which is usually quite recent.
pip install cmake ninja
SHORTFIN_DEV_MODE=ON pip install --no-build-isolation -v -e .
Note that the --no-build-isolation flag is useful in development setups because it does not create an intermediate venv that will keep later invocations of cmake/ninja from working at the command line. If just doing a one-shot build, it can be omitted.
Once built the first time, cmake, ninja, and ctest commands can be run directly from build/cmake and changes will apply directly to the next process launch.
Several optional environment variables can be used with setup.py:
- SHORTFIN_CMAKE_BUILD_TYPE=Debug: Sets the CMAKE_BUILD_TYPE. Defaults to Debug for dev mode and Release otherwise.
- SHORTFIN_ENABLE_ASAN=ON: Enables an ASAN build. Requires a Python runtime setup that is ASAN clean (either by env vars to preload libraries or set suppressions or a dev build of Python with ASAN enabled).
- SHORTFIN_IREE_SOURCE_DIR=$(pwd)/../../iree
- SHORTFIN_RUN_CTESTS=ON: Runs ctest as part of the build. Useful for CI as it uses the version of ctest installed in the pip venv.
Running tests
The project uses a combination of ctest for native C++ tests and pytest. Much of the functionality is only tested via the Python tests, using the _shortfin.lib internal implementation directly. In order to run these tests, you must have installed the Python package as per the above steps.
The choice of test style is pragmatic and geared toward achieving good test coverage with a minimum of duplication. Since it is often much more expensive to build native tests of complicated flows, many things are only tested via Python. This does not preclude having other language bindings later, but it does mean that the C++ core of the library must always be built with the Python bindings to test most of its behavior. Given the target of the project, this is not considered to be a significant issue.
Python tests
Run platform independent tests only:
pytest tests/
Run tests, including those for a specific platform (in this example, a gfx1100 AMDGPU). Note that not all tests are system aware yet and some may only run on the CPU:
pytest tests/ --system amdgpu \
--compile-flags="--iree-hal-target-backends=rocm --iree-hip-target=gfx1100"
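If you switch between targets often, the pytest arguments shown above can be generated instead of typed by hand. This is a sketch under assumptions: device_test_args is a hypothetical helper, and it only reproduces the --system and --compile-flags options shown in this document.

```python
import shlex


def device_test_args(system=None, compile_flags=(), test_dir="tests/"):
    """Build a pytest argument list; with no system, only the test directory is passed."""
    args = ["pytest", test_dir]
    if system is not None:
        args += ["--system", system]
    if compile_flags:
        # All compiler flags travel as one space-separated option value.
        args.append("--compile-flags=" + " ".join(compile_flags))
    return args


args = device_test_args(
    system="amdgpu",
    compile_flags=["--iree-hal-target-backends=rocm", "--iree-hip-target=gfx1100"],
)
# shlex.join quotes the flag value so the line can be pasted into a shell.
print(shlex.join(args))
```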
Production library building
In order to build a production library, additional build steps are typically recommended:
- Compile all deps with the same compiler/linker for LTO compatibility
- Provide library dependencies manually and compile them with LTO
- Compile dependencies with -fvisibility=hidden
- Enable LTO builds of libshortfin
- Set flags to enable symbol versioning
Miscellaneous build topics
Free-threaded Python
Support for free-threaded Python builds (a.k.a. "nogil") is in progress. It is currently being tested via CPython 3.13 with the --disable-gil option set.
There are multiple ways to acquire such an environment:
- Generally, see the documentation at https://py-free-threading.github.io/installing_cpython/
- If using pyenv:
  # Install a free-threaded 3.13 version.
  pyenv install 3.13t
  # Test (should print "False").
  pyenv shell 3.13t
  python -c 'import sys; print(sys._is_gil_enabled())'
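The one-line check above fails on interpreters older than 3.13, where sys._is_gil_enabled() does not exist. A slightly more defensive sketch is shown below; the function name free_threading_status is hypothetical, and it treats a missing sys._is_gil_enabled as "GIL enabled".

```python
import sys
import sysconfig


def free_threading_status():
    """Return (build_is_free_threaded, gil_enabled_at_runtime) as booleans."""
    # Py_GIL_DISABLED is set to 1 for builds configured with --disable-gil.
    build_is_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in CPython 3.13; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(check()) if check is not None else True
    return build_is_free_threaded, gil_enabled


free_threaded, gil_enabled = free_threading_status()
print(f"free-threaded build: {free_threaded}, GIL enabled: {gil_enabled}")
```

Reporting both values is useful because a free-threaded build can still run with the GIL enabled at runtime.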