
Attention kernels

Project description

attention-kernels

attention-kernels is a standalone package with the paged attention and cache reshape kernels from vLLM, with modifications for TGI.
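For readers new to the term, the idea behind paged attention can be sketched in plain Python. This is a conceptual illustration only, not the package's API: the function names, argument names, and the simplified cache layout (a list of fixed-size blocks, single head, single query) are all assumptions made for the example, and the real kernels run on the GPU with a different memory layout. The sketch shows the key and value cache stored in fixed-size blocks, with a per-sequence block table mapping logical token positions to physical blocks.

```python
import math


def gather_paged(cache, block_table, context_len):
    """Collect the first `context_len` token vectors of one sequence.

    cache: list of physical blocks, each a list of `block_size` vectors
           (hypothetical, simplified layout).
    block_table: physical block index for each logical block of the sequence.
    """
    block_size = len(cache[0])
    rows = []
    for pos in range(context_len):
        physical_block = block_table[pos // block_size]
        rows.append(cache[physical_block][pos % block_size])
    return rows


def paged_attention(query, key_cache, value_cache, block_table, context_len):
    """Single-query, single-head scaled dot-product attention over a paged cache."""
    keys = gather_paged(key_cache, block_table, context_len)
    values = gather_paged(value_cache, block_table, context_len)

    # Scaled dot-product scores between the query and every cached key.
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]

    # Numerically stable softmax over the scores.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)

    # Weighted sum of the cached values.
    out = [0.0] * len(values[0])
    for w, value in zip(weights, values):
        for i, v in enumerate(value):
            out[i] += (w / total) * v
    return out
```

Because the block table is the only link between a sequence and its cache blocks, the blocks for one sequence do not need to be contiguous in memory, which is the property the paged layout is designed to provide.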

Download files

Source Distribution

attention_kernels-0.1.0.tar.gz (49.8 kB)

File details

Details for the file attention_kernels-0.1.0.tar.gz.

File metadata

  • Download URL: attention_kernels-0.1.0.tar.gz
  • Upload date:
  • Size: 49.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for attention_kernels-0.1.0.tar.gz
Algorithm     Hash digest
SHA256        90c40acc4ce625ca293f28136194de5da6dc335c898a7adebda06d22db5bcc96
MD5           3e0faa3af2e6b164e214ac5a1d4580f8
BLAKE2b-256   1227ff9833e02e7c025fba2a48abe764080467a557b1b989ec34fd1924b590ec
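
One way to use the digests above is to check a downloaded archive before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `verify_sha256` are made up for this example, and it assumes the archive has already been downloaded to a local path.

```python
import hashlib

# SHA256 digest listed above for attention_kernels-0.1.0.tar.gz.
EXPECTED_SHA256 = "90c40acc4ce625ca293f28136194de5da6dc335c898a7adebda06d22db5bcc96"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected):
    """Return True if the file's SHA256 digest matches `expected` (case-insensitive)."""
    return sha256_of(path) == expected.strip().lower()
```

For example, `verify_sha256("attention_kernels-0.1.0.tar.gz", EXPECTED_SHA256)` returns `True` only if the local file matches the published digest. Reading in chunks keeps memory use constant regardless of file size.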
