Attention kernels

Project description

attention-kernels

attention-kernels is a standalone package that provides the paged attention and cache reshape kernels from vLLM, modified for use with Text Generation Inference (TGI).

Download files

Source Distribution

attention_kernels-0.2.0.post2.tar.gz (53.3 kB)

Uploaded Source

File details

Details for the file attention_kernels-0.2.0.post2.tar.gz.

File metadata

  • Download URL: attention_kernels-0.2.0.post2.tar.gz
  • Upload date:
  • Size: 53.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.8

File hashes

Hashes for attention_kernels-0.2.0.post2.tar.gz
Algorithm     Hash digest
SHA256        876024d93b0312b9792c0c0b0678eee2b814b6d4c48b1c1c5ce2b4651a827235
MD5           09213fc0abe1b9d2f288a71331ee797f
BLAKE2b-256   b379a4f57e0a5b8eaeef4701d8f599956ffe45a953e703d98cd563880fa49c46
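
The digests above can be used to check that a downloaded copy of the source distribution is intact. The following is a minimal sketch using only Python's standard hashlib module. The helper names sha256_of_file and matches_expected are hypothetical and not part of attention-kernels, and the sketch assumes the archive has already been downloaded to a local path.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the "File hashes" table for
# attention_kernels-0.2.0.post2.tar.gz.
EXPECTED_SHA256 = "876024d93b0312b9792c0c0b0678eee2b814b6d4c48b1c1c5ce2b4651a827235"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Hypothetical helper: True if the file's SHA256 equals expected_hex."""
    return sha256_of_file(path) == expected_hex.lower()
```

For example, calling matches_expected("attention_kernels-0.2.0.post2.tar.gz") on a file saved in the current directory should return True only if its contents match the published SHA256 digest. pip can perform a comparable check at install time through its hash-checking mode.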
