CKKS Homomorphic Encryption backend with CUDA 12.8 GPU acceleration

CuKKS

GPU-accelerated CKKS Homomorphic Encryption for PyTorch

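Run trained PyTorch models on encrypted data, so inputs stay private during inference. Nonlinear activations are replaced by polynomial approximations, so outputs track the plaintext model closely but not exactly.
Built on OpenFHE with CUDA acceleration.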


Quick Start

import torch
import torch.nn as nn
import cukks

# 1. Define and train your model (standard PyTorch)
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

# 2. Convert to encrypted model (polynomial ReLU approximation)
enc_model, ctx = cukks.convert(model)

# 3. Run encrypted inference
test_input = torch.randn(784)  # placeholder; use a real flattened 28x28 sample
enc_input = ctx.encrypt(test_input)
enc_output = enc_model(enc_input)
output = ctx.decrypt(enc_output)

Installation

Automatic (Recommended)

pip install cukks        # Auto-detects PyTorch's CUDA and installs matching backend

pip install cukks detects the CUDA version your PyTorch was built with and automatically installs the matching cukks-cuXXX GPU backend. No manual version matching needed.

Manual

pip install cukks-cu121  # Explicitly install for CUDA 12.1

| Package | CUDA | Supported GPUs |
|---|---|---|
| cukks-cu118 | 11.8 | V100, T4, RTX 20/30/40xx, A100, H100 |
| cukks-cu121 | 12.1 | V100, T4, RTX 20/30/40xx, A100, H100 |
| cukks-cu124 | 12.4 | V100, T4, RTX 20/30/40xx, A100, H100 |
| cukks-cu128 | 12.8 | All above + RTX 50xx |

Or use extras: pip install cukks[cu121]
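
To see which backend the package table above points to for your environment, a small lookup like the following may help. It is only a sketch: the mapping is copied from the table, `backend_for` is a hypothetical helper, and the detection logic that `pip install cukks` actually uses is not documented here. `torch.version.cuda` is the standard PyTorch attribute reporting the CUDA version PyTorch was built with (it is `None` for CPU-only builds).

```python
# Mapping copied from the package table above (CUDA version -> backend package).
BACKENDS = {
    "11.8": "cukks-cu118",
    "12.1": "cukks-cu121",
    "12.4": "cukks-cu124",
    "12.8": "cukks-cu128",
}


def backend_for(cuda_version):
    """Hypothetical helper: return the backend package for a CUDA version string.

    Returns None when the version is missing (CPU-only PyTorch) or not in the table.
    """
    if cuda_version is None:
        return None
    major_minor = ".".join(cuda_version.split(".")[:2])
    return BACKENDS.get(major_minor)


if __name__ == "__main__":
    try:
        import torch
        detected = torch.version.cuda  # e.g. "12.1", or None for CPU builds
    except ImportError:
        detected = None
    print("PyTorch CUDA version:", detected)
    print("Matching backend:", backend_for(detected))
```

If the lookup returns None, either PyTorch is a CPU-only build or its CUDA version has no listed backend; in that case the manual install or CUKKS_NO_BACKEND=1 described in this section are the options to consider.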

Post-install CLI & environment variables

cukks-install-backend             # Auto-detect & install
cukks-install-backend cu128       # Install specific backend
cukks-install-backend --status    # Show CUDA compatibility status
cukks-install-backend --dry-run   # Preview without installing

| Variable | Effect |
|---|---|
| CUKKS_BACKEND=cukks-cu128 | Force a specific backend |
| CUKKS_NO_BACKEND=1 | Skip backend (CPU-only) |

Docker images

| CUDA | Compatible Docker Image |
|---|---|
| 11.8 | pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime |
| 12.1 | pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime |
| 12.4 | pytorch/pytorch:2.4.0-cuda12.4-cudnn9-runtime |
| 12.8 | nvidia/cuda:12.8.0-cudnn9-runtime-ubuntu22.04 |

docker run --gpus all -it pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime bash
pip install cukks  # auto-detects CUDA 12.1

Build from source

git clone https://github.com/devUuung/CuKKS.git && cd CuKKS
pip install -e .

# Build OpenFHE backend
cd openfhe-gpu-public && mkdir build && cd build
cmake .. -DWITH_CUDA=ON && make -j$(nproc)

cd ../../bindings/openfhe_backend
pip install -e .

Features

| Feature | Description |
|---|---|
| PyTorch API | Familiar interface: just call cukks.convert(model) |
| GPU Acceleration | CUDA-accelerated HE operations via OpenFHE |
| Auto Optimization | BatchNorm folding, BSGS matrix multiplication |
| Wide Layer Support | Linear, Conv2d, ReLU/GELU/SiLU, Pool, LayerNorm, Attention |
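
BatchNorm folding, listed under Auto Optimization, rests on a standard identity: at inference time a BatchNorm layer is a fixed per-channel affine map, so it can be absorbed into the weights and bias of the preceding linear layer and costs no extra ciphertext operations. The sketch below shows that arithmetic in plain Python. `fold_batchnorm` is a hypothetical helper, not the CuKKS implementation, and the default `eps` is an assumption matching PyTorch's BatchNorm default.

```python
import math


def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (Wx + b - mean) / sqrt(var + eps) + beta into one linear layer.

    weight is a list of rows (one per output channel); the other arguments are
    per-output-channel lists. Returns (folded_weight, folded_bias).
    """
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    folded_weight = [[s * w for w in row] for s, row in zip(scale, weight)]
    folded_bias = [s * (b - m) + be
                   for s, b, m, be in zip(scale, bias, mean, beta)]
    return folded_weight, folded_bias


def linear(weight, bias, x):
    """Plain matrix-vector product plus bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm using fixed running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + be
            for yi, g, be, m, v in zip(y, gamma, beta, mean, var)]


# Illustrative values only.
W = [[0.5, -1.0], [2.0, 0.25]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [1.0, 4.0]
x = [1.0, 2.0]

reference = batchnorm(linear(W, b, x), gamma, beta, mean, var)
W_folded, b_folded = fold_batchnorm(W, b, gamma, beta, mean, var)
folded = linear(W_folded, b_folded, x)
print(reference, folded)
```

The two printed vectors agree up to floating-point rounding, which is why the layer tables mark BatchNorm as "Folded" rather than listing an encrypted layer for it.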

Supported Layers

| Layer | Encrypted Version | Notes |
|---|---|---|
| nn.Linear | EncryptedLinear | BSGS optimization |
| nn.Conv2d | EncryptedConv2d | im2col method |
| nn.ReLU/GELU/SiLU | Polynomial approx | Configurable degree |
| nn.AvgPool2d | EncryptedAvgPool2d | Rotation-based |
| nn.BatchNorm1d/2d | Folded | Merged into preceding layer |
| nn.LayerNorm | EncryptedLayerNorm | Polynomial approx |
| nn.MultiheadAttention | EncryptedApproxAttention | seq_len=1 |

Full layer support table

| PyTorch Layer | Encrypted Version | Notes |
|---|---|---|
| nn.Linear | EncryptedLinear | Full support with BSGS optimization |
| nn.Conv2d | EncryptedConv2d | Via im2col method |
| nn.ReLU | EncryptedReLU | Polynomial approximation |
| nn.GELU | EncryptedGELU | Polynomial approximation |
| nn.SiLU | EncryptedSiLU | Polynomial approximation |
| nn.Sigmoid | EncryptedSigmoid | Polynomial approximation |
| nn.Tanh | EncryptedTanh | Polynomial approximation |
| nn.AvgPool2d | EncryptedAvgPool2d | Full support |
| nn.MaxPool2d | EncryptedMaxPool2d | Approximate via polynomial |
| nn.Flatten | EncryptedFlatten | Logical reshape |
| nn.BatchNorm1d/2d | Folded | Merged into preceding layer |
| nn.Sequential | EncryptedSequential | Full support |
| nn.Dropout | EncryptedDropout | No-op during inference |
| nn.LayerNorm | EncryptedLayerNorm | Pure HE polynomial approximation |
| nn.MultiheadAttention | EncryptedApproxAttention | Polynomial softmax (seq_len=1) |
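
The BSGS (baby-step giant-step) optimization noted for EncryptedLinear is a general technique for matrix-vector products on packed ciphertexts, where the only available operations are slot-wise add, slot-wise multiply, and cyclic rotation. The sketch below simulates it on plain Python lists using the generalized-diagonal method. It illustrates the technique under that assumption and is not CuKKS's code; `rotate`, `diagonal`, and `bsgs_matvec` are hypothetical names.

```python
def rotate(vec, r):
    """Cyclic left rotation: result[t] = vec[(t + r) % n]."""
    n = len(vec)
    r %= n
    return vec[r:] + vec[:r]


def diagonal(matrix, i):
    """i-th generalized diagonal: d[t] = matrix[t][(t + i) % n]."""
    n = len(matrix)
    return [matrix[t][(t + i) % n] for t in range(n)]


def bsgs_matvec(matrix, vec, baby):
    """Compute matrix @ vec for a square matrix using baby-step giant-step.

    Uses about (baby + n // baby) rotations of encrypted data
    instead of the n rotations the plain diagonal method needs.
    """
    n = len(vec)
    if n % baby != 0:
        raise ValueError("baby must divide the vector length")
    giant = n // baby
    # Baby steps: rotations of the (encrypted) input, computed once and reused.
    baby_rotations = [rotate(vec, j) for j in range(baby)]
    result = [0] * n
    for k in range(giant):
        inner = [0] * n
        for j in range(baby):
            # Diagonals are plaintext, so this pre-rotation is free at runtime.
            diag = rotate(diagonal(matrix, k * baby + j), -k * baby)
            inner = [acc + d * v
                     for acc, d, v in zip(inner, diag, baby_rotations[j])]
        # Giant step: one rotation per group of `baby` diagonals.
        outer = rotate(inner, k * baby)
        result = [acc + o for acc, o in zip(result, outer)]
    return result


def naive_matvec(matrix, vec):
    """Reference row-by-row product for comparison."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


M = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
v = [1, 0, 2, -1]
print(bsgs_matvec(M, v, baby=2), naive_matvec(M, v))
```

Choosing `baby` near the square root of the dimension balances the two rotation counts, which matters because rotations are among the more expensive homomorphic operations.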

Activation Functions

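CKKS natively evaluates only additions and multiplications on ciphertexts, so every function in the network must be expressed as a polynomial. CuKKS therefore replaces non-polynomial activations (ReLU, GELU, SiLU, etc.) with fitted polynomial approximations: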

# Default: degree-4 polynomial approximation (recommended)
enc_model, ctx = cukks.convert(model)

# Higher degree for better accuracy (costs more multiplicative depth)
enc_model, ctx = cukks.convert(model, activation_degree=8)

The default activation_degree=4 provides a good balance between accuracy and depth consumption. Higher degrees approximate the original activation more closely but require deeper circuits.
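
To get a feel for the accuracy side of this trade-off, the sketch below approximates ReLU on [-1, 1] with Chebyshev interpolation at degrees 4 and 8 and reports the worst-case error on a grid. This is an illustration in pure Python, not CuKKS's fitting procedure: the method, the interval, and the helper names are all assumptions, since the source does not state how the polynomials are fitted.

```python
import math


def chebyshev_coeffs(f, degree):
    """Coefficients c_0..c_degree of the Chebyshev interpolant of f on [-1, 1]."""
    n = degree + 1
    thetas = [math.pi * (j + 0.5) / n for j in range(n)]
    values = [f(math.cos(t)) for t in thetas]  # f sampled at Chebyshev nodes
    return [2.0 / n * sum(v * math.cos(k * t) for v, t in zip(values, thetas))
            for k in range(n)]


def chebyshev_eval(coeffs, x):
    """Evaluate c_0/2 + sum_{k>=1} c_k T_k(x) with the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + c, b1
    return x * b1 - b2 + coeffs[0] / 2.0


def relu(x):
    return max(x, 0.0)


def max_error(degree, samples=2001):
    """Largest |relu(x) - p(x)| over an evenly spaced grid on [-1, 1]."""
    coeffs = chebyshev_coeffs(relu, degree)
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(relu(x) - chebyshev_eval(coeffs, x)) for x in grid)


for degree in (4, 8):
    print(f"degree {degree}: max error on [-1, 1] = {max_error(degree):.4f}")
```

The error shrinks as the degree grows, but slowly, because ReLU has a kink at zero. The approximation also only holds inside the fitted interval, so activations that fall outside it can produce large errors regardless of degree.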

GPU Acceleration

| Operation | Accelerated |
|---|---|
| Add/Sub/Mul/Square | ✅ GPU |
| Rotate/Rescale | ✅ GPU |
| Bootstrap | ✅ GPU |
| Encrypt/Decrypt | CPU |

from ckks.torch_api import CKKSContext, CKKSConfig

config = CKKSConfig(poly_mod_degree=8192, scale_bits=40)
ctx = CKKSContext(config, enable_gpu=True)  # GPU enabled by default

Examples

# Quick demo (no GPU required)
python -m cukks.examples.encrypted_inference --demo conversion

# MNIST encrypted inference
python examples/mnist_encrypted.py --hidden 64 --samples 5

CNN example

import torch.nn as nn
import cukks

class MNISTCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)
        self.act1 = nn.ReLU()
        self.pool1 = nn.AvgPool2d(2)
        self.flatten = nn.Flatten()
        self.fc = nn.Linear(8 * 14 * 14, 10)
    
    def forward(self, x):
        return self.fc(self.flatten(self.pool1(self.act1(self.conv1(x)))))

model = MNISTCNN()
enc_model, ctx = cukks.convert(model)

# `image` is a plaintext input tensor for the model; it is not defined in this snippet
enc_input = ctx.encrypt(image)
prediction = ctx.decrypt(enc_model(enc_input)).argmax()

Note: All operations in forward() must be layer attributes (e.g., self.act1), not inline operations like x ** 2.

Batch processing

import torch

# Assumes enc_model and ctx come from cukks.convert(...) on a model with 784 input features
# Pack multiple samples into a single ciphertext (SIMD)
samples = [torch.randn(784) for _ in range(8)]
enc_batch = ctx.encrypt_batch(samples)
enc_output = enc_model(enc_batch)
outputs = ctx.decrypt_batch(enc_output, num_samples=8)
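
SIMD packing works because a CKKS ciphertext with ring dimension N holds N/2 independent slots, and each homomorphic operation acts on all slots at once. The sketch below models the bookkeeping in plain Python. It assumes a simple contiguous layout (sample 0, then sample 1, and so on, zero-padded), which may differ from what encrypt_batch does internally; `slot_count`, `pack_batch`, and `unpack_batch` are hypothetical helpers.

```python
def slot_count(poly_mod_degree):
    """CKKS packs N/2 values into one ciphertext of ring dimension N."""
    return poly_mod_degree // 2


def pack_batch(samples, poly_mod_degree):
    """Lay samples out back to back in one slot vector (assumed layout), zero-padded."""
    slots = slot_count(poly_mod_degree)
    flat = [x for sample in samples for x in sample]
    if len(flat) > slots:
        raise ValueError(f"{len(flat)} values do not fit in {slots} slots")
    return flat + [0.0] * (slots - len(flat))


def unpack_batch(slot_values, num_samples, sample_len):
    """Split the leading slots back into per-sample lists."""
    return [slot_values[i * sample_len:(i + 1) * sample_len]
            for i in range(num_samples)]


packed = pack_batch([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], poly_mod_degree=16)
print(packed)
print(unpack_batch(packed, num_samples=2, sample_len=3))
```

Under this contiguous-layout assumption, the number of samples per ciphertext is bounded by the slot count divided by the sample length, so a batch that fits at one poly_mod_degree may not fit at a smaller one.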

Troubleshooting

| Issue | Solution |
|---|---|
| Out of Memory | Reduce poly_mod_degree (8192 instead of 16384) |
| Low Accuracy | Increase activation_degree (e.g., 8 or 16) for better approximation |
| Slow Performance | Enable batch processing, reduce network depth |
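
The Out of Memory advice follows from how ciphertext size scales. In RNS-based CKKS, a fresh ciphertext consists of two polynomials, each storing N coefficients per modulus limb, typically as 64-bit words. The estimate below is a rough sketch under those assumptions: the limb count of 6 is purely illustrative, `ciphertext_bytes` is a hypothetical helper, and the figure excludes evaluation and rotation keys, which usually take considerably more memory than the ciphertexts themselves.

```python
def ciphertext_bytes(poly_mod_degree, num_limbs, word_bytes=8):
    """Rough size of one fresh CKKS ciphertext: 2 polynomials x N coefficients x limbs."""
    return 2 * poly_mod_degree * num_limbs * word_bytes


# num_limbs=6 is an illustrative assumption, not a CuKKS default.
for n in (8192, 16384):
    size = ciphertext_bytes(n, num_limbs=6)
    print(f"poly_mod_degree={n}: {size / 2**20:.2f} MiB per ciphertext")
```

Halving poly_mod_degree halves the per-ciphertext footprint, but it also halves the slot count and lowers the total modulus size that is secure, which in turn limits the multiplicative depth available to the model.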

License

Apache License 2.0

Citation

@software{cukks,
  title = {CuKKS: PyTorch-compatible Encrypted Deep Learning},
  year = {2024},
  url = {https://github.com/devUuung/CuKKS}
}
