TileOPs kernels for efficient inference
TileOPs (TOP)
TileOPs (TOP) is a collection of high-performance machine learning operators built to run on the TileLang backend. It provides efficient, modular, and composable implementations optimized for AI workloads.
📦 Installation
Requirements
- Python 3.8+
- PyTorch >= 2.1
- TileLang
- Triton (optional, for selected fast kernels)
Install (editable mode for development)
git clone https://github.com/tile-ai/TileOPs
cd TileOPs
git submodule update --init --recursive
pip install -e .
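After installing, a quick environment check can confirm that the requirements listed above are importable. The following is a minimal sketch rather than part of TileOPs: `check_environment` is a hypothetical helper, and the module names `tilelang` and `triton` are assumed to be the import names of those packages (`top` is the import name used in the usage example below).

```python
import importlib.util
import sys


def check_environment(modules=("torch", "tilelang", "triton", "top")):
    """Report whether the interpreter and the listed modules look usable.

    Hypothetical helper for illustration; module names are assumptions.
    """
    # README requirement: Python 3.8 or newer.
    report = {"python_ok": sys.version_info >= (3, 8)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

Triton is listed as optional, so a `missing` result for it does not necessarily indicate a broken install.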
🚀 Quick Usage
import torch
import top
from top import MLAKernel
device = "cuda"
dtype = torch.float16
batch = 128
heads = 64
kv_heads = 1
kv_ctx = 8192
dim = 512
pe_dim = 64
# Query input: [batch, heads, dim]
q = torch.randn(batch, heads, dim, device=device, dtype=dtype)
# Query positional encoding: [batch, heads, pe_dim]
q_pe = torch.randn(batch, heads, pe_dim, device=device, dtype=dtype)
# KV cache input: [batch, kv_ctx, kv_heads, dim]
kv = torch.randn(batch, kv_ctx, kv_heads, dim, device=device, dtype=dtype)
# KV positional encoding: [batch, kv_ctx, kv_heads, pe_dim]
k_pe = torch.randn(batch, kv_ctx, kv_heads, pe_dim, device=device, dtype=dtype)
# Use MLA kernel
block_N = 64
block_H = 64
num_split = 1
mla = MLAKernel(batch, heads, kv_heads, kv_ctx, dim, pe_dim, block_N, block_H, num_split)
out = mla(q, q_pe, kv, k_pe)
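To sanity-check the kernel output, it can help to compare it against a plain PyTorch reference. The sketch below is not taken from TileOPs. It assumes the conventional MLA decode formulation: scores are the sum of the `q`·`kv` and `q_pe`·`k_pe` dot products, scaled by `1/sqrt(dim + pe_dim)`, with `kv` serving as both keys and values, and with query heads grouped contiguously per KV head. The function name `mla_decode_reference` is hypothetical.

```python
import math

import torch


def mla_decode_reference(q, q_pe, kv, k_pe):
    """Naive MLA decode reference (illustrative; semantics are assumed).

    q:    [batch, heads, dim]
    q_pe: [batch, heads, pe_dim]
    kv:   [batch, kv_ctx, kv_heads, dim]
    k_pe: [batch, kv_ctx, kv_heads, pe_dim]
    Returns a tensor of shape [batch, heads, dim].
    """
    batch, heads, dim = q.shape
    pe_dim = q_pe.shape[-1]
    kv_heads = kv.shape[2]
    groups = heads // kv_heads  # query heads sharing one KV head
    scale = 1.0 / math.sqrt(dim + pe_dim)

    # Compute in float32 for a stable reference, then cast back.
    qf = q.float().reshape(batch, kv_heads, groups, dim)
    qpf = q_pe.float().reshape(batch, kv_heads, groups, pe_dim)
    kvf = kv.float()
    kpf = k_pe.float()

    # Attention scores over the KV context: content part plus positional part.
    scores = torch.einsum("bhgd,bnhd->bhgn", qf, kvf)
    scores = scores + torch.einsum("bhgp,bnhp->bhgn", qpf, kpf)
    weights = torch.softmax(scores * scale, dim=-1)

    # Weighted sum over the context, using kv as the values.
    out = torch.einsum("bhgn,bnhd->bhgd", weights, kvf)
    return out.reshape(batch, heads, dim).to(q.dtype)
```

If these assumptions match what `MLAKernel` computes, `torch.testing.assert_close(out, mla_decode_reference(q, q_pe, kv, k_pe), rtol=1e-2, atol=1e-2)` is one way to compare the two; the tolerances are a guess for float16 and may need adjusting. The reference materializes the full score tensor, so smaller `batch` and `kv_ctx` values than those in the example are advisable when running it.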
Acknowledgments
Download files
Source Distribution
tileops-0.0.1.dev0.tar.gz (7.3 kB)
Built Distribution
File details
Details for the file tileops-0.0.1.dev0.tar.gz.
File metadata
- Download URL: tileops-0.0.1.dev0.tar.gz
- Upload date:
- Size: 7.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f6184371639498039652983d1bcd035bce5c09505c21497827ffb4d05fac8957 |
| MD5 | 16a991aac55afaae7e676e5b8ecc75bd |
| BLAKE2b-256 | d00ced8727b8c6f6ef4a9fc2a5320792e27fc87a8c21d686fe787b4890755d6d |
File details
Details for the file tileops-0.0.1.dev0-py3-none-any.whl.
File metadata
- Download URL: tileops-0.0.1.dev0-py3-none-any.whl
- Upload date:
- Size: 3.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3938e7d6787513d85db2eaaf475a1957bea3e1ed0fc1bc969eef58e9c21a38e1 |
| MD5 | 4a1e5575457bc69a63a7db0979aa3dc6 |
| BLAKE2b-256 | dc0140be1c0a495a2778f245a65e755a823a609f9e32aede7b2e32a75c565095 |