SHARK layers and inference models for genai

Project description

SHARK Tank

WARNING: This is an early preview that is in progress. It is not ready for general use.

Lightweight, inference-optimized layers and models for popular genai applications.

This sub-project is a work in progress. It is intended to be a repository of layers, model recipes, and tools for converting from popular LLM quantization tooling.

Project Status

CI - Perplexity

Examples

The repository will ultimately grow a curated set of models and tools for constructing them, but for the moment it mostly contains CLI examples. These are all under active development and should not yet be expected to work.

Perform batched inference in PyTorch on a paged, Llama-derived LLM:

python -m sharktank.examples.paged_llm_v1 \
  --hf-dataset=open_llama_3b_v2_f16_gguf \
  "Prompt 1" \
  "Prompt 2" ...

Export an IREE-compilable batched LLM for serving:

python -m sharktank.examples.export_paged_llm_v1 \
  --hf-dataset=open_llama_3b_v2_f16_gguf \
  --output-mlir=/tmp/open_llama_3b_v2_f16.mlir \
  --output-config=/tmp/open_llama_3b_v2_f16.json

Dump parsed information about a model from a GGUF file:

python -m sharktank.tools.dump_gguf --hf-dataset=open_llama_3b_v2_f16_gguf
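
For orientation on what such a dump starts from, here is a minimal sketch of reading the fixed-size header at the start of a GGUF file. It assumes the GGUF v2/v3 layout (4-byte magic, then a little-endian uint32 version, uint64 tensor count, and uint64 metadata key/value count). The function name `read_gguf_header` is hypothetical and this is not how `sharktank.tools.dump_gguf` is implemented; it only illustrates the file format.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_header(stream):
    """Read the fixed-size GGUF header (assumes the v2/v3 layout)."""
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata KV count.
    raw = stream.read(20)
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Demonstrate on an in-memory header rather than a real model file.
fake_header = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
header = read_gguf_header(io.BytesIO(fake_header))
print(header)
```

To inspect a real file, open it in binary mode and pass the file object in place of the `io.BytesIO` stream. The metadata key/value pairs and tensor descriptors that follow the header are not parsed here.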

Package Python Release Builds

  • To build wheels for Linux:

    ./build_tools/build_linux_package.sh
    

    That should produce build_tools/wheelhouse/sharktank-{X.Y.Z}.dev0-py3-none-any.whl, which can then be installed with:

    python3 -m pip install build_tools/wheelhouse/sharktank-{X.Y.Z}.dev0-py3-none-any.whl
    
  • To build a wheel for your host OS/arch manually:

    # Build sharktank.*.whl into the dist/ directory
    #   e.g. `sharktank-3.0.0.dev0-py3-none-any.whl`
    python3 -m pip wheel -v -w dist .
    
    # Install the built wheel.
    python3 -m pip install dist/*.whl
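
The wheel names above follow the standard wheel filename convention: distribution name, version, an optional build tag, then the Python, ABI, and platform tags. As a rough sketch, the helper below (a hypothetical `parse_wheel_filename`, not part of sharktank or pip) splits a filename into those fields, which shows why `py3-none-any` marks a pure-Python wheel that installs on any platform.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename("sharktank-3.0.0.dev0-py3-none-any.whl")
print(info["name"], info["version"], info["python_tag"], info["abi_tag"], info["platform_tag"])
```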
    

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

sharktank-2.9.2-py3-none-any.whl (253.0 kB)

Uploaded Python 3

File details

Details for the file sharktank-2.9.2-py3-none-any.whl.

File metadata

  • Download URL: sharktank-2.9.2-py3-none-any.whl
  • Upload date:
  • Size: 253.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.10

File hashes

Hashes for sharktank-2.9.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       49fbd54dcd77a5ee0986cd76c02c804c80880f50d1ac10d709b5a643ac8d9615
MD5          c79fda3e7d320f3b5fe1c2e0cd176268
BLAKE2b-256  503351f9fe9cafc6a597aeeb3d8632db8bc9363a5068d4b97fdf9c53868e7dee
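
A downloaded wheel can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of_file` and `verify_wheel` are illustrative, and the local path in the usage comment is an assumption about where the file was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected_sha256):
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# Usage (assumes the wheel was downloaded to the current directory):
# EXPECTED = "49fbd54dcd77a5ee0986cd76c02c804c80880f50d1ac10d709b5a643ac8d9615"
# print(verify_wheel("sharktank-2.9.2-py3-none-any.whl", EXPECTED))
```

pip can also enforce hashes itself through its hash-checking mode, which avoids a separate manual step when installing from a requirements file.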
