Causal depthwise conv1d in CUDA, with a PyTorch interface
Features:
- Supports fp32, fp16, and bf16.
- Supports kernel sizes 2, 3, and 4.
How to use
from causal_conv1d import causal_conv1d_fn
def causal_conv1d_fn(x, weight, bias=None, activation=None):
    """
    x: (batch, dim, seqlen)
    weight: (dim, width)
    bias: (dim,)
    activation: either None or "silu" or "swish"
    out: (batch, dim, seqlen)
    """
Equivalent to:
import torch.nn.functional as F
F.conv1d(x, weight.unsqueeze(1), bias, padding=width - 1, groups=dim)[..., :seqlen]
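To make the semantics of the expression above concrete, here is a minimal pure-Python sketch of the same computation on nested lists. It is not the library's implementation: the name `causal_conv1d_ref_py` is hypothetical, it ignores dtypes and performance, and it assumes that "silu" and "swish" both mean `y * sigmoid(y)`. Each output position `t` only reads inputs at positions `t - (width - 1)` through `t`, which is what left-padding by `width - 1` and truncating to `seqlen` achieves.

```python
import math


def causal_conv1d_ref_py(x, weight, bias=None, activation=None):
    """Illustrative reference for a causal depthwise 1D convolution.

    x: nested list with shape (batch, dim, seqlen)
    weight: nested list with shape (dim, width)
    bias: list with shape (dim,) or None
    activation: None, "silu", or "swish"
    returns: nested list with shape (batch, dim, seqlen)
    """
    if activation not in (None, "silu", "swish"):
        raise ValueError('activation must be None, "silu", or "swish"')
    out = []
    for sample in x:
        out_sample = []
        # Depthwise: channel d is filtered only by its own taps weight[d].
        for d, channel in enumerate(sample):
            taps = weight[d]
            width = len(taps)
            row = []
            for t in range(len(channel)):
                acc = bias[d] if bias is not None else 0.0
                for k, w in enumerate(taps):
                    # Index into the input as if it were left-padded
                    # with (width - 1) zeros; negative indices are padding.
                    src = t - (width - 1) + k
                    if src >= 0:
                        acc += w * channel[src]
                if activation is not None:
                    # Assumed definition: silu(y) = y * sigmoid(y).
                    acc = acc / (1.0 + math.exp(-acc))
                row.append(acc)
            out_sample.append(row)
        out.append(out_sample)
    return out


x = [[[1.0, 2.0, 3.0]]]      # batch=1, dim=1, seqlen=3
weight = [[1.0, 10.0]]       # dim=1, width=2
print(causal_conv1d_ref_py(x, weight))  # [[[10.0, 21.0, 32.0]]]
```

The last tap (`weight[d][width - 1]`) multiplies the current input and earlier taps multiply older inputs, so changing a later element of `x` never changes an earlier output.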
Additional Prerequisites for AMD cards
Patching ROCm
If you are on ROCm 6.0, run the following steps to avoid errors during compilation. This is not required for ROCm 6.1 onwards.
- Locate your ROCm installation directory. This is typically found at /opt/rocm/, but may vary depending on your installation.
- Apply the patch. Run with sudo in case you encounter permission issues.
  patch /opt/rocm/include/hip/amd_detail/amd_hip_bf16.h < rocm_patch/rocm6_0.patch
Download files
Source Distribution
causal_conv1d-1.6.0.tar.gz (29.4 kB)
File details
Details for the file causal_conv1d-1.6.0.tar.gz.
File metadata
- Download URL: causal_conv1d-1.6.0.tar.gz
- Upload date:
- Size: 29.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4eae3220d08e1e88238f3a0a88783147cbdf47f612cc610add75127c7a37ca3e |
| MD5 | 2f9e2e04e172b0741408175f2ca1f02d |
| BLAKE2b-256 | dbdf63a384c49743b9fc8fec4c05dbd0b515e1c1c2b07e4559acc4fc37c69223 |
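The digests listed above can be used to check a downloaded copy of the source distribution. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the script assumes the archive was saved in the current directory under its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digest copied from the file hashes listed for causal_conv1d-1.6.0.tar.gz.
EXPECTED_SHA256 = "4eae3220d08e1e88238f3a0a88783147cbdf47f612cc610add75127c7a37ca3e"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    sdist = Path("causal_conv1d-1.6.0.tar.gz")
    if sdist.exists():
        ok = matches_expected(sdist, EXPECTED_SHA256)
        print("SHA256 OK" if ok else "SHA256 MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of archive size. A mismatch means the file differs from the one these digests describe and should not be installed.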