# Candle

*CANN can handle — not as bright as a torch, but light enough to carry anywhere.*

A pure-Python deep learning framework that runs your PyTorch code — no rewrite needed.

[Getting Started](#getting-started) | [Why Candle](#why-candle) | [Backends](#backends) | [Roadmap](#roadmap) | [Contributing](#contributing)
## Why Candle
PyTorch is powerful — but it's also 2GB+ of C++ binaries, hard to install on edge devices, and locked to CUDA. Candle takes a different approach:
| | PyTorch | Candle |
|---|---|---|
| Install size | ~2 GB | ~10 MB |
| Build from source | C++ toolchain required | `pip install candle` |
| Ascend NPU | Community fork | First-class ACLNN kernels |
| Apple MPS | Partial | Native Metal shaders |
| Run existing `import torch` code | — | Zero-change drop-in |
## Getting Started

### Install

```bash
pip install candle
```
That's it. No CUDA toolkit, no compiler, no 10-minute build.
### Write code the way you already know

```python
import candle as torch
import candle.nn as nn

# Tensors, autograd, nn — all the APIs you're used to
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

x = torch.randn(2, 784, requires_grad=True)
out = model(x)
out.sum().backward()
print(x.grad.shape)  # (2, 784)
```
### Or just `import torch` — seriously

Candle ships an import hook. Existing PyTorch code runs without changing a single line:

```python
import torch                     # resolved to candle
import torch.nn.functional as F  # resolved to candle.nn.functional

x = torch.randn(3, 4)
y = F.relu(x)
```
How it works:

| `USE_CANDLE` env var | PyTorch installed? | `import torch` gives you |
|---|---|---|
| `1` / `true` / `yes` | doesn't matter | Candle |
| `0` / `false` / `no` | doesn't matter | PyTorch (or `ImportError`) |
| not set | No | Candle |
| not set | Yes | PyTorch |
If PyTorch isn't installed, Candle is picked up automatically. If both are installed, set `USE_CANDLE=1`:

```bash
USE_CANDLE=1 python train.py
```
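The resolution rules above can be sketched in plain Python. The code below is an illustrative stand-in, not Candle's actual implementation: `prefer_candle` mirrors the decision table, and `AliasFinder` shows the general `sys.meta_path` technique an import hook like this relies on. The demo routes a made-up alias `faketorch` to the standard-library `json` module, since the real packages are not assumed to be installed.

```python
import importlib
import importlib.abc
import importlib.machinery
import os
import sys


def prefer_candle(pytorch_installed: bool) -> bool:
    """Mirror the resolution table: USE_CANDLE wins when set;
    otherwise Candle is used only when PyTorch is absent."""
    raw = os.environ.get("USE_CANDLE")
    if raw is not None:
        return raw.strip().lower() in {"1", "true", "yes"}
    return not pytorch_installed


class _AliasLoader(importlib.abc.Loader):
    """Load the alias by importing the real module and reusing it."""

    def __init__(self, real_name: str):
        self.real_name = real_name

    def create_module(self, spec):
        # The real module object is shared under both names.
        return importlib.import_module(self.real_name)

    def exec_module(self, module):
        pass  # already initialised by the real import


class AliasFinder(importlib.abc.MetaPathFinder):
    """Serve `import <alias>` (and submodules) from `target`."""

    def __init__(self, alias: str, target: str):
        self.alias, self.target = alias, target

    def find_spec(self, fullname, path=None, target=None):
        if fullname != self.alias and not fullname.startswith(self.alias + "."):
            return None
        real = self.target + fullname[len(self.alias):]
        return importlib.machinery.ModuleSpec(fullname, _AliasLoader(real))


# Demo: route "faketorch" to the stdlib json module.
sys.meta_path.insert(0, AliasFinder("faketorch", "json"))
import faketorch

print(faketorch.dumps({"ok": True}))  # → {"ok": true}
```

Installing the finder at the front of `sys.meta_path` lets it intercept the alias before the normal path-based finders run, which is why no change to user code is needed.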
## Backends

Candle runs on multiple hardware backends with a single API:

```python
candle.device("cpu")   # NumPy — works everywhere
candle.device("cuda")  # NVIDIA GPU
candle.device("mps")   # Apple Silicon GPU (Metal)
candle.device("npu")   # Huawei Ascend (ACLNN)
```
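When code must run on whichever of these backends is present, a common pattern is to probe each in preference order and fall back to CPU. The helper below is a generic sketch, not part of Candle's API — the availability probes are passed in as callables, and the lambdas here are stand-ins for real checks (e.g. trying to load a backend's driver library):

```python
from typing import Callable, Sequence, Tuple


def pick_device(candidates: Sequence[Tuple[str, Callable[[], bool]]],
                fallback: str = "cpu") -> str:
    """Return the first backend whose availability probe succeeds.

    A probe that raises (e.g. a driver library fails to load) is
    treated the same as one that returns False.
    """
    for name, probe in candidates:
        try:
            if probe():
                return name
        except Exception:
            continue
    return fallback


# Example: prefer NPU, then MPS, then CUDA, else CPU.
device = pick_device([
    ("npu",  lambda: False),   # stand-in probes; a real check might
    ("mps",  lambda: False),   # attempt to initialise the backend
    ("cuda", lambda: True),
])
print(device)  # → cuda
```

Swallowing probe exceptions keeps a broken or absent driver from taking down device selection for the backends that do work.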
### Ascend NPU — First-Class Support

Unlike wrapper libraries, Candle calls ACLNN large kernels directly via `ctypes` — no framework overhead, no Python-to-C++ bridge:

```python
import candle as torch

x = torch.randn(1024, 1024, device="npu")
y = torch.matmul(x, x.T)  # runs a native ACLNN kernel
```
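Calling a shared library through `ctypes` follows a standard pattern regardless of the library: load it, declare the C argument and return types, then call. The sketch below demonstrates the pattern with libc's `strlen` as a stand-in; the actual ACLNN library and symbol names are not shown here:

```python
import ctypes
import ctypes.util

# Load a shared library by name. For ACLNN this would be the
# CANN runtime library rather than libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature so ctypes converts arguments correctly:
#   size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"candle"))  # → 6
```

Declaring `argtypes`/`restype` up front is what makes the call both safe and fast: ctypes validates and converts arguments once per call with no intermediate C++ binding layer.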
## Features

- **Pure Python** — No C++ extensions. Install in seconds, debug in Python, deploy anywhere.
- **PyTorch-Compatible API** — `Tensor`, `nn.Module`, `autograd`, `optim` — the full stack.
- **`import torch` Drop-in** — Built-in import hook. Zero code changes for existing projects.
- **Multi-Backend** — CPU, CUDA, Apple MPS, Ascend NPU from one codebase.
- **Ascend NPU Native** — Direct ACLNN kernel integration, not a bolted-on afterthought.
- **Agentic AI Ready** — Lightweight enough to embed in AI agent runtimes.
## Roadmap
Most frameworks stop at tensors. Candle doesn't — the end goal is a self-hosting agentic kernel: a system that deploys local models on Ascend, then uses those same models to debug, optimize, and evolve itself.
### Phase 1 — Foundation (current)

A PyTorch-compatible pure-Python framework with multi-backend support.

- Core tensor ops, autograd, `nn.Module`, optimizers
- CPU backend (NumPy)
- Ascend NPU backend (ACLNN native kernels)
- Apple MPS backend (Metal shaders)
- `import torch` zero-change drop-in hook
- CUDA backend
- `torch.compile` graph-mode acceleration
- Distributed training (`DistributedDataParallel`)
- Full TorchVision / TorchAudio model compatibility
### Phase 2 — Cognitive Runtime
Local model deployment becomes a first-class primitive, not an afterthought.
- Local model loading, quantization & serving on Ascend
- Role-based model router (debug model, generation model, judge model)
- Multi-model inference policy (local-first, cloud fallback)
- Self-hosted reasoning & tool-use runtime
### Phase 3 — Agentic Kernel
The framework gains the ability to observe, diagnose, and act on itself.
- **Dev Layer** — bug detection, repro construction, fix suggestions (powered by the local debug model)
- **Bootstrap Layer** — candidate generation, distillation, config search (powered by the local generation model)
- **ModelOps Layer** — evaluation, promotion, rollback, lineage tracking (powered by the local judge model)
### Phase 4 — Self-Hosting
Local models are no longer just managed artifacts — they become the execution engine of the kernel itself.
- Local-first agentic execution: Dev / Bootstrap / ModelOps agents default to self-deployed models
- Continuous self-improvement loop: trace → evaluate → generate fix → test → promote
- Model-as-kernel: the system uses its own models to improve its own models
```text
┌─────────────────────────────────────────┐
│              Applications               │
│     train · debug · infer · deploy      │
├─────────────────────────────────────────┤
│             Agentic Kernel              │
│    Dev Layer · Bootstrap · ModelOps     │
├─────────────────────────────────────────┤
│            Cognitive Runtime            │
│    Local Model RT · Router · Policy     │
├─────────────────────────────────────────┤
│         Intelligence Substrate          │
│    TraceStore · Evaluator · Registry    │
├─────────────────────────────────────────┤
│               Foundation                │
│    tensor · autograd · nn · compiler    │
│      CPU · CUDA · MPS · Ascend NPU      │
└─────────────────────────────────────────┘
```
See `docs/support-matrix.md` for the full 0.1.x op support matrix.
## Used By
Building something with Candle? Open an issue and we'll add you here!
## Contributing

```bash
# Clone and install in dev mode
git clone https://github.com/candle-org/candle.git
cd candle
pip install -e ".[test]"

# Run tests
pytest tests/cpu/ tests/contract/ -v --tb=short

# Lint
pip install -e ".[lint]"
pylint src/candle --rcfile=.github/pylint.conf
```
We welcome contributions! Whether it's new ops, backend support, bug fixes, or docs — open an issue or submit a PR.
## License
## File details

Details for the file `candle_python-0.1.0a1-py3-none-any.whl`:

- Upload date:
- Size: 563.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing: No
- Uploaded via: twine/6.2.0, CPython/3.10.19
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `f63fda31c6b70ddbca82cba4653dbde719827b3dea31d4bacc5b56bc7f09da5a` |
| MD5 | `3275575105a100c238097dc4b72135ce` |
| BLAKE2b-256 | `f501de528b17d75136972c7a5df72165eb58621a30286d11d51ec51a731e7fa6` |