Causal depthwise conv1d in CUDA, with a PyTorch interface
Features:
- Supports fp32, fp16, and bf16.
- Supports kernel sizes 2, 3, and 4.
How to use
from causal_conv1d import causal_conv1d_fn
def causal_conv1d_fn(x, weight, bias=None, activation=None):
    """
    x: (batch, dim, seqlen)
    weight: (dim, width)
    bias: (dim,)
    activation: either None or "silu" or "swish"
    out: (batch, dim, seqlen)
    """
Equivalent to:
import torch.nn.functional as F
F.conv1d(x, weight.unsqueeze(1), bias, padding=width - 1, groups=dim)[..., :seqlen]
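To make the indexing behind the expression above concrete, here is a minimal pure-Python sketch of the same computation on nested lists. It is not the package's implementation: the name `causal_conv1d_ref` and the helper `_silu` are hypothetical, and it assumes that "silu" and "swish" name the same function, y * sigmoid(y). Each output at position t only reads inputs at positions t - (width - 1) through t, which is what left-padding by width - 1 and truncating to seqlen achieves.

```python
import math


def _silu(y):
    # Numerically stable y * sigmoid(y); hypothetical helper for this sketch.
    if y >= 0:
        return y / (1.0 + math.exp(-y))
    e = math.exp(y)
    return y * e / (1.0 + e)


def causal_conv1d_ref(x, weight, bias=None, activation=None):
    """Illustrative reference, not the CUDA kernel.

    x:      nested lists shaped (batch, dim, seqlen)
    weight: nested lists shaped (dim, width)
    bias:   list of length dim, or None
    activation: None, "silu", or "swish" (assumed to be the same function)
    """
    if activation not in (None, "silu", "swish"):
        raise ValueError("activation must be None, 'silu', or 'swish'")
    out = []
    for sample in x:
        out_sample = []
        for d, channel in enumerate(sample):
            taps = weight[d]          # one filter per channel (depthwise)
            width = len(taps)
            row = []
            for t in range(len(channel)):
                acc = bias[d] if bias is not None else 0.0
                for k in range(width):
                    src = t - (width - 1) + k   # never greater than t
                    if src >= 0:                # positions before 0 are zero padding
                        acc += taps[k] * channel[src]
                row.append(_silu(acc) if activation is not None else acc)
            out_sample.append(row)
        out.append(out_sample)
    return out
```

With `x = [[[1.0, 2.0, 3.0, 4.0]]]` and `weight = [[1.0, 10.0]]`, the last tap multiplies the current element and the first tap multiplies the previous one, so changing `x[..., 3]` leaves the first three outputs untouched.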
Additional Prerequisites for AMD cards
Patching ROCm
If you are on ROCm 6.0, run the following steps to avoid errors during compilation. This is not required for ROCm 6.1 onwards.
- Locate your ROCm installation directory. This is typically found at /opt/rocm/, but may vary depending on your installation.
- Apply the patch. Run with sudo in case you encounter permission issues.
  patch /opt/rocm/include/hip/amd_detail/amd_hip_bf16.h < rocm_patch/rocm6_0.patch
Download files
Source Distribution
causal_conv1d-1.5.1.tar.gz (21.3 kB)
File details
Details for the file causal_conv1d-1.5.1.tar.gz.
File metadata
- Download URL: causal_conv1d-1.5.1.tar.gz
- Upload date:
- Size: 21.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d30091ee40e04b93c4d4db276ce3426b9d9d540cbf07eeaf1cf821f92213b7ae |
| MD5 | d2cbd50e80361b9df5e20be418475a53 |
| BLAKE2b-256 | 6add7ce4a771e973795406000be31d6472d4a3e3bd661e23a9fec6dcb02d4c9f |
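A downloaded archive can be checked against the SHA256 digest listed above before installing. The following is a minimal sketch using only Python's standard `hashlib`. The function names `sha256_of_file` and `matches_expected` are hypothetical, and the path you pass in is assumed to be wherever you saved causal_conv1d-1.5.1.tar.gz.

```python
import hashlib

# SHA256 digest copied from the File hashes table for causal_conv1d-1.5.1.tar.gz.
EXPECTED_SHA256 = "d30091ee40e04b93c4d4db276ce3426b9d9d540cbf07eeaf1cf821f92213b7ae"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Compare the file's digest with the expected one, ignoring hex case."""
    return sha256_of_file(path) == expected_hex.lower()


# Example (assumes the archive is in the current directory):
#     matches_expected("causal_conv1d-1.5.1.tar.gz")
```

Reading in chunks keeps memory use flat regardless of file size. A `False` result means the file differs from the one the digest was computed for and should be downloaded again.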