CKKS Homomorphic Encryption backend with CUDA 12.8 GPU acceleration

CuKKS

GPU-accelerated CKKS Homomorphic Encryption for PyTorch

Supported Python versions: 3.10-3.13

Run trained PyTorch models on encrypted data — no decryption needed, no privacy compromised.
Built on OpenFHE with CUDA acceleration.


Why CuKKS?

Traditional machine learning requires access to raw input data — a privacy risk for sensitive domains like healthcare, finance, and biometrics. CuKKS lets you deploy models that never see plaintext:

User: encrypt(input) → [ciphertext] → Server: model([ciphertext]) → [encrypted output] → User: decrypt

The server performs full inference without ever decrypting the data. CuKKS makes this practical with:

  • One-line conversion: cukks.convert(model) transforms a trained PyTorch model built from supported layers
  • GPU acceleration: CUDA-accelerated HE operations via OpenFHE
  • 37 layer types: from Linear and Conv2d to Attention, GroupNorm, and ConvTranspose2d

Quick Start

import torch
import torch.nn as nn
import cukks

# 1. Train your model normally (training loop omitted here)
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

# 2. Convert to encrypted model
enc_model, ctx = cukks.convert(model)

# 3. Run encrypted inference
test_input = torch.randn(784)  # placeholder input (shape assumed); substitute real data
enc_input = ctx.encrypt(test_input)
enc_output = enc_model(enc_input)
output = ctx.decrypt(enc_output)  # Approximates the plaintext output; the server never decrypts

Installation

pip install cukks[cu121]  # Match your PyTorch CUDA version: cu118, cu121, cu124, cu128

Command                   CUDA  Compute Capability
pip install cukks[cu118]  11.8  sm_50 – sm_90 (Maxwell to Hopper)
pip install cukks[cu121]  12.1  sm_50 – sm_90 (Maxwell to Hopper)
pip install cukks[cu124]  12.4  sm_50 – sm_90a (Maxwell to Hopper)
pip install cukks[cu128]  12.8  sm_50 – sm_100 (Maxwell to Blackwell)

Not sure which CUDA version? Run python -c "import torch; print(torch.version.cuda)".
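
The mapping from the reported CUDA version to an install extra can be sketched as follows. This is a minimal illustration, not part of CuKKS: the helper name cuda_extra is hypothetical, and the list of extras is copied from the table above.

```python
# Extras listed in the installation table above.
SUPPORTED_EXTRAS = ("cu118", "cu121", "cu124", "cu128")


def cuda_extra(cuda_version):
    """Map a CUDA version string such as '12.1' to an extra such as 'cu121'.

    Hypothetical helper for illustration; pass it torch.version.cuda.
    """
    if cuda_version is None:
        # torch.version.cuda is None on CPU-only PyTorch builds.
        raise ValueError("PyTorch was built without CUDA support")
    major, minor = cuda_version.split(".")[:2]
    extra = f"cu{major}{minor}"
    if extra not in SUPPORTED_EXTRAS:
        raise ValueError(f"no cukks extra listed for CUDA {cuda_version}")
    return extra


# Typical use on a machine with PyTorch installed:
#   import torch
#   print(f"pip install cukks[{cuda_extra(torch.version.cuda)}]")
```

The cukks-install-backend command described under "Auto-install CLI" performs this detection for you; the sketch only makes the naming rule explicit.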

Docker, CLI tools, and building from source

Docker

docker run --gpus all -it pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime bash
pip install cukks[cu121]

Auto-install CLI

pip install cukks
cukks-install-backend  # Auto-detects PyTorch CUDA and installs matching backend

Build from source

git clone https://github.com/devUuung/CuKKS.git && cd CuKKS
pip install -e .

# Build OpenFHE backend
cd openfhe-gpu-public && mkdir build && cd build
cmake .. -DWITH_CUDA=ON && make -j$(nproc)

cd ../../bindings/openfhe_backend
pip install -e .

Key Features

Drop-in Model Conversion

No model rewriting. No custom HE code. Just call cukks.convert(model):

# Any PyTorch model — MLP, CNN, Transformer — converts automatically
enc_model, ctx = cukks.convert(model, activation_degree=4)

BatchNorm folding, BSGS matrix multiplication, and CNN optimizations are applied automatically.
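
BatchNorm folding can be illustrated in plain NumPy. The sketch below shows the standard algebra for merging an inference-mode BatchNorm into the preceding linear layer; it is not CuKKS's code, and the function name fold_batchnorm is made up for this example. Folding matters under HE because it removes a separate multiplication, which would otherwise consume multiplicative depth.

```python
import numpy as np


def fold_batchnorm(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma * (Wx + b - mean) / sqrt(var + eps) + beta into one affine layer."""
    scale = gamma / np.sqrt(var + eps)      # one scale factor per output channel
    W_folded = W * scale[:, None]           # scale each output row of W
    b_folded = (b - mean) * scale + beta    # shift absorbed into the bias
    return W_folded, b_folded


rng = np.random.default_rng(0)
W, b = rng.standard_normal((4, 6)), rng.standard_normal(4)
gamma, beta = rng.standard_normal(4), rng.standard_normal(4)
mean, var = rng.standard_normal(4), rng.uniform(0.5, 2.0, 4)
x = rng.standard_normal(6)

# Reference: linear layer followed by BatchNorm in inference mode.
reference = gamma * ((W @ x + b) - mean) / np.sqrt(var + 1e-5) + beta

# Folded: a single linear layer with adjusted weights and bias.
W_f, b_f = fold_batchnorm(W, b, gamma, beta, mean, var)
folded = W_f @ x + b_f
```

Both paths produce the same output up to floating-point rounding, so the folded layer can replace the pair.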

37 Supported Layer Types

Category       Layers
Linear         Linear, BlockDiagonalLinear, BlockDiagLowRankLinear
Convolution    Conv1d, Conv2d, ConvTranspose2d
Pooling        AvgPool2d, MaxPool2d, AdaptiveAvgPool2d
Normalization  LayerNorm, GroupNorm, InstanceNorm1d/2d, BatchNorm (folded)
Activation     ReLU, GELU, SiLU, Sigmoid, Tanh, Square
Attention      MultiheadAttention (seq_len ≤ 8)
Embedding      Embedding
Spatial        Upsample, PixelShuffle, PixelUnshuffle
Padding        ZeroPad2d, ConstantPad2d, ReflectionPad2d, ReplicationPad2d
Other          Flatten, Dropout, Sequential, ResidualBlock

Full layer table: see the Supported Layers section below.

GPU-Accelerated HE Operations

All core HE operations run on GPU:

Operation                 GPU
Add / Sub / Mul / Square  Yes
Rotate / Rescale          Yes
Bootstrap                 Yes
Plaintext cache           Yes

Polynomial Activation Approximation

CKKS evaluates only additions and multiplications on ciphertexts, so only polynomial functions can be computed directly. CuKKS approximates non-polynomial activations (ReLU, GELU, SiLU, etc.) using Chebyshev polynomial fitting:

# Default: degree-4 (good accuracy / depth balance)
enc_model, ctx = cukks.convert(model)

# Higher degree for better accuracy (costs more depth)
enc_model, ctx = cukks.convert(model, activation_degree=8)
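
To see the accuracy versus depth trade-off, the sketch below fits ReLU with a least-squares polynomial in the Chebyshev basis using NumPy. It is an illustration only: the fitting interval [-5, 5], the sample count, and the helper names are assumptions, and CuKKS's own fitting procedure may differ.

```python
import math

import numpy as np
from numpy.polynomial import Chebyshev


def fit_activation(fn, degree, domain=(-5.0, 5.0), n_points=2001):
    """Least-squares fit of fn in the Chebyshev basis over an assumed input range."""
    x = np.linspace(domain[0], domain[1], n_points)
    return Chebyshev.fit(x, fn(x), degree, domain=list(domain))


def max_error(poly, fn, domain=(-5.0, 5.0), n_points=2001):
    """Largest absolute deviation between the polynomial and fn on the sample grid."""
    x = np.linspace(domain[0], domain[1], n_points)
    return float(np.max(np.abs(poly(x) - fn(x))))


def relu(x):
    return np.maximum(x, 0.0)


p4 = fit_activation(relu, degree=4)
p8 = fit_activation(relu, degree=8)
err4 = max_error(p4, relu)
err8 = max_error(p8, relu)

# Reaching x**d by repeated multiplication needs at least ceil(log2(d)) levels.
min_depth = {d: math.ceil(math.log2(d)) for d in (4, 8)}

print(f"degree 4: max error {err4:.3f}, min depth {min_depth[4]}")
print(f"degree 8: max error {err8:.3f}, min depth {min_depth[8]}")
```

The degree-8 fit tracks ReLU more closely than the degree-4 fit but needs at least one more multiplicative level. Inputs outside the fitted interval are not covered by the approximation, so the assumed range should match the activations the model actually sees.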

Packed Batch Inference

Process multiple samples in a single ciphertext:

samples = [torch.randn(784) for _ in range(8)]
enc_batch = ctx.encrypt_batch(samples)
enc_output = enc_model(enc_batch)
outputs = ctx.decrypt_batch(enc_output, sample_shape=(8,))
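
Packing works because a CKKS ciphertext holds a vector of slots (N/2 of them for ring dimension N), and every homomorphic operation acts on all slots at once. The plaintext sketch below shows one possible layout, with each sample padded to a power-of-two stride. The layout, the stride, and the names pack and unpack are assumptions for illustration, not CuKKS internals.

```python
import numpy as np


def pack(samples, stride):
    """Place each sample at offset i * stride inside one flat slot vector."""
    packed = np.zeros(len(samples) * stride)
    for i, sample in enumerate(samples):
        packed[i * stride : i * stride + len(sample)] = sample
    return packed


def unpack(packed, n_samples, stride, sample_len):
    """Recover the per-sample vectors from the packed slot vector."""
    return [packed[i * stride : i * stride + sample_len] for i in range(n_samples)]


rng = np.random.default_rng(0)
samples = [rng.standard_normal(784) for _ in range(8)]

stride = 1024                      # next power of two >= 784
packed = pack(samples, stride)     # 8 * 1024 = 8192 slots in total
restored = unpack(packed, n_samples=8, stride=stride, sample_len=784)
```

With this layout, eight 784-element samples occupy 8192 slots, so a single encrypted operation processes all eight samples.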

Examples

# MNIST classification (MLP)
python examples/mnist_encrypted.py --hidden 64 --samples 5

# UNet-style segmentation (ConvTranspose2d, AdaptiveAvgPool2d, Upsample)
python examples/unet_encrypted.py --samples 2

# ResNet-style classification (GroupNorm, AdaptiveAvgPool2d)
python examples/resnet_encrypted.py --samples 1

# Transformer-style NLP (Embedding, LayerNorm)
python examples/transformer_encrypted.py --samples 2

See examples/ for full scripts.

Benchmarking

CuKKS includes a benchmark suite for measuring encrypted inference performance:

# Run all benchmarks
python benchmarks/run_benchmarks.py

# Benchmark a specific model
python benchmarks/run_benchmarks.py --model mlp

# Save results to JSON
python benchmarks/run_benchmarks.py --output results.json

Supported models:

Model        Params  Input           Architecture
MLP          50,890  (1, 784)        Linear(784→64) → ReLU → Linear(64→10)
CNN          15,770  (1, 1, 28, 28)  Conv2d(1→8, 3×3) → ReLU → AvgPool2d(2) → Linear(1568→10)
ResNet       1,300   (1, 1, 8, 8)    Conv2d(1→8, 3×3) → GroupNorm → ReLU → Conv2d(8→16, 1×1) → GroupNorm → ReLU → AdaptiveAvgPool2d(4×4) → Linear(256→4)
Transformer  212     (1, 8)          Linear(8→16) → ReLU → Linear(16→4)
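
The parameter counts in the table can be reproduced from the listed architectures. The arithmetic below assumes every Linear and Conv2d layer has a bias and every GroupNorm has a learnable scale and shift per channel; the helper names are illustrative.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(c_in, c_out, kernel):
    """Weights plus biases of a square-kernel 2D convolution."""
    return c_in * c_out * kernel * kernel + c_out


def groupnorm_params(channels):
    """One scale and one shift per channel."""
    return 2 * channels


counts = {
    "MLP": linear_params(784, 64) + linear_params(64, 10),
    "CNN": conv2d_params(1, 8, 3) + linear_params(1568, 10),
    "ResNet": (
        conv2d_params(1, 8, 3) + groupnorm_params(8)
        + conv2d_params(8, 16, 1) + groupnorm_params(16)
        + linear_params(256, 4)
    ),
    "Transformer": linear_params(8, 16) + linear_params(16, 4),
}

for name, total in counts.items():
    print(f"{name}: {total:,}")
```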

Example output:

Model           Plain (ms)   Encrypted (ms)  Overhead   MAE       
--------------------------------------------------------------
mlp             0.01         84.16           6858      x 0.075594
cnn             0.03         2820.15         94053     x 0.048294
resnet          0.04         11444.55        269018    x 0.098026
transformer     0.01         54.27           7794      x 0.087978
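
The Overhead and MAE columns can be read as the ratio of encrypted to plaintext time and the mean absolute error between plaintext and decrypted outputs. The sketch below spells out those definitions, which are an assumption about what the benchmark script reports; the input values are toy numbers, not CuKKS measurements. Dividing the displayed columns will not reproduce the Overhead column exactly (84.16 / 0.01 is not 6858), presumably because the plaintext time is rounded for display.

```python
import numpy as np


def overhead(plain_ms, encrypted_ms):
    """Slowdown factor of encrypted inference relative to plaintext inference."""
    return encrypted_ms / plain_ms


def mean_absolute_error(plain_out, decrypted_out):
    """Average absolute difference between plaintext and decrypted outputs."""
    plain_out = np.asarray(plain_out, dtype=float)
    decrypted_out = np.asarray(decrypted_out, dtype=float)
    return float(np.mean(np.abs(plain_out - decrypted_out)))


# Toy values for illustration only.
plain = [0.10, -0.40, 0.75]
decrypted = [0.12, -0.43, 0.70]

mae = mean_absolute_error(plain, decrypted)
ratio = overhead(plain_ms=0.5, encrypted_ms=100.0)
```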

Note: Benchmarks require a GPU backend with OpenFHE. Run on a machine with CUDA support for accurate timing.

Supported Layers

PyTorch Layer          Encrypted Version            Notes
nn.Linear              EncryptedLinear              BSGS optimization
nn.Conv1d              EncryptedConv1d              1D im2col
nn.Conv2d              EncryptedConv2d              im2col, BSGS
nn.ConvTranspose2d     EncryptedConvTranspose2d     Transposed convolution
nn.ReLU                EncryptedReLU                Polynomial approx
nn.GELU                EncryptedGELU                Polynomial approx
nn.SiLU                EncryptedSiLU                Polynomial approx
nn.Sigmoid             EncryptedSigmoid             Polynomial approx
nn.Tanh                EncryptedTanh                Polynomial approx
nn.AvgPool2d           EncryptedAvgPool2d           Rotation-based
nn.MaxPool2d           EncryptedMaxPool2d           Polynomial approx
nn.AdaptiveAvgPool2d   EncryptedAdaptiveAvgPool2d   Global-pool fast path
nn.Flatten             EncryptedFlatten             Logical reshape
nn.BatchNorm1d/2d      Folded                       Merged into preceding layer
nn.GroupNorm           EncryptedGroupNorm           Per-group polynomial
nn.InstanceNorm1d/2d   EncryptedInstanceNorm1d/2d   Per-channel polynomial
nn.LayerNorm           EncryptedLayerNorm           Polynomial 1/sqrt
nn.MultiheadAttention  EncryptedApproxAttention     seq_len ≤ 8
nn.Embedding           EncryptedEmbedding           One-hot matmul
nn.Upsample            EncryptedUpsample            Nearest / bilinear
nn.PixelShuffle        EncryptedPixelShuffle        Channel-to-spatial
nn.PixelUnshuffle      EncryptedPixelUnshuffle      Spatial-to-channel
nn.ZeroPad2d           EncryptedZeroPad2d           Scatter matrix
nn.ConstantPad2d       EncryptedConstantPad2d       Scatter + constant
nn.ReflectionPad2d     EncryptedReflectionPad2d     Reflection mapping
nn.ReplicationPad2d    EncryptedReplicationPad2d    Replication mapping
nn.Sequential          EncryptedSequential          Full support
nn.Dropout             EncryptedDropout             No-op during inference
nn.ResidualBlock       EncryptedResidualBlock       Skip connection
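
The "BSGS optimization" noted for the linear and convolution layers refers to baby-step giant-step matrix-vector multiplication. The plaintext sketch below shows the general technique, with np.roll standing in for a ciphertext rotation. It is not CuKKS's implementation, and the names rot, diagonals, and bsgs_matvec are illustrative.

```python
import numpy as np


def rot(v, k):
    """Cyclic left rotation by k slots: rot(v, k)[j] == v[(j + k) % n]."""
    return np.roll(v, -k)


def diagonals(M):
    """Generalized diagonals of a square matrix: diag_i[j] = M[j, (j + i) % n]."""
    n = M.shape[0]
    idx = np.arange(n)
    return [M[idx, (idx + i) % n] for i in range(n)]


def bsgs_matvec(M, x, n1):
    """Compute M @ x using only rotations, elementwise products, and additions."""
    n = M.shape[0]
    assert n % n1 == 0, "baby-step count must divide the dimension"
    n2 = n // n1
    diags = diagonals(M)
    # Baby steps: n1 - 1 rotations of the (encrypted) input, reused by every giant step.
    baby = [rot(x, b) for b in range(n1)]
    y = np.zeros(n)
    for g in range(n2):
        inner = np.zeros(n)
        for b in range(n1):
            # The plaintext diagonal is pre-rotated offline, so this costs no ciphertext rotation.
            inner += rot(diags[g * n1 + b], -g * n1) * baby[b]
        # Giant step: one rotation per group (none needed for g == 0).
        y += rot(inner, g * n1)
    return y


rng = np.random.default_rng(0)
M = rng.standard_normal((16, 16))
x = rng.standard_normal(16)
result = bsgs_matvec(M, x, n1=4)
```

For a 16×16 matrix with n1 = 4, this uses 3 baby-step and 3 giant-step rotations instead of the 15 required by the plain diagonal method. Rotations are among the more expensive CKKS operations, which is why the rotation count is the quantity to minimize.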

License

Apache License 2.0

Citation

@software{cukks,
  title = {CuKKS: PyTorch-compatible Encrypted Deep Learning},
  year = {2024},
  url = {https://github.com/devUuung/CuKKS}
}
