Taichi Forge - a community-maintained fork of the Taichi Programming Language (import name: taichi_forge).

Taichi Forge

A community-maintained fork of taichi focused on compile-time performance, modern toolchains (LLVM 20, VS 2026, Python 3.14), and tighter compile-time safety rails.

Install

pip install taichi-forge

The import name is unchanged:

import taichi as ti
ti.init(arch=ti.cuda)

Every public API from upstream Taichi 1.7.x that we still ship behaves the same way — existing user code runs without modification.

Heads up. taichi-forge and the upstream taichi distribution install the same top-level Python package (taichi/). Pick one of them in any given virtual environment; do not install both side by side.
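
One way to check an environment for the clash described above is to query installed distribution metadata. This is a minimal sketch using only the standard library; the helper names `installed_versions` and `has_conflict` are hypothetical and not part of either package.

```python
from importlib import metadata

# Distribution names as given in this README; both install the same
# top-level `taichi/` package.
CONFLICTING_DISTRIBUTIONS = ("taichi", "taichi-forge")


def installed_versions(names=CONFLICTING_DISTRIBUTIONS):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment
    return found


def has_conflict(found):
    """True when more than one of the conflicting distributions is present."""
    return len(found) > 1


if __name__ == "__main__":
    found = installed_versions()
    if has_conflict(found):
        print("Both distributions are installed; uninstall one:", found)
    else:
        print("OK:", found or "neither distribution installed")
```

Running it inside the target virtual environment before importing taichi shows which of the two distributions pip has recorded.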


Why a fork?

Upstream Taichi 1.7.4 shipped in mid-2024 against LLVM 15, Python ≤ 3.12, and the legacy LLVM PassManager / typed-pointer API. Since then the JIT ecosystem has moved on:

  • LLVM 15 no longer compiles cleanly with current CUDA / NVPTX toolchains, and typed pointers were fully removed in LLVM 17.
  • Python 3.12 removed distutils (PEP 632), and 3.13 and 3.14 each remove further deprecated standard-library modules and APIs.
  • Modern Windows developer setups default to VS 2026, whose MSVC toolset rejects older headers that Taichi's build scripts had hard-wired.

Taichi Forge is the rolling result of those maintenance upgrades plus a compile-time-performance work stream (the P1–P5 phases in the commit log) that reduces cold-start and warm-start compile latency.


Differences vs. upstream Taichi 1.7.4

Toolchain

| Area | Upstream 1.7.4 | Taichi Forge 0.1.2 |
|---|---|---|
| LLVM | 15.x | 20.1.7 (Phase A.1 → A.4 migration: typed → opaque pointers, legacy PM → NewPM, nvvm_ldg_global_{f,i} → load + !invariant.load) |
| CUDA PTX | via LLVM 15 NVPTX | via LLVM 20 NVPTX (NVCC 12.x compatible) |
| Python | 3.9 – 3.12 | 3.10 – 3.14 (3.9 dropped; 3.13/3.14 added) |
| Windows MSVC | VS 2019 / 2022 | VS 2026 (Visual Studio 18 2026, MSVC 14.50+) |
| Build backend | legacy scikit-build + setup.py bdist_wheel | scikit-build-core via python -m build |
| CMake floor | 3.15 | 3.20 |
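
The Python range in the table can be turned into a quick interpreter guard. This is a sketch only: the bounds come from the table above, while the helper name `python_supported` is hypothetical and not something the package exports.

```python
import sys

# Inclusive (major, minor) bounds taken from the toolchain table.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 14)


def python_supported(version_info=None):
    """Return True if the (major, minor) pair lies within the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    print("supported" if python_supported() else "unsupported",
          "Python", ".".join(map(str, sys.version_info[:2])))
```

Tuple comparison keeps the check correct across minor versions, which a string comparison such as "3.9" < "3.10" would not.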

Public API additions

These are new, fork-only APIs. Nothing in this list breaks existing 1.7.4 programs; they are all strictly additive.

| Symbol | Introduced | Purpose |
|---|---|---|
| ti.compile_kernels(kernels_iterable) | P5.b | Parallel multi-kernel pre-compile. Submits a batch of kernels to num_compile_threads worker threads for cross-kernel compile parallelism on CPU (LLVM), CUDA and Vulkan. Accepts either decorated kernels or (kernel, args_tuple) specialization pairs. Returns the number of kernels submitted. |
| ti cache warmup script.py [-- args...] | P1 | CLI entry point that pre-runs script.py with offline cache forced on, populating disk-backed kernel artifacts for later cold-start re-use. |
| ti.CompileConfig.compile_tier | P2.c / P3.d | Enum-valued knob: "fast" / "balanced" / "full". fast caps LLVM at -O0 (floor -O1 on NVPTX/AMDGCN) and SPIR-V optimization at level 1; full preserves pre-fork behaviour. |
| ti.CompileConfig.llvm_opt_level | P1 | Explicit LLVM -O override (0–3); takes precedence over compile_tier. |
| ti.CompileConfig.spv_opt_level | P1.d | Same for SPIR-V (spirv-opt -O0..-O3). |
| ti.CompileConfig.num_compile_threads | P5.b | Thread pool size for ti.compile_kernels. Defaults to the machine's logical-core count. |
| ti.cfg.unrolling_hard_limit | P3.a | Per-ti.static(for ...) iteration cap. When a single unroll would emit more than this many iterations, compilation aborts with TaichiCompilationError instead of quietly taking tens of seconds. 0 (default) preserves 1.7.4 behaviour. |
| ti.cfg.unrolling_kernel_hard_limit | P3.a | Cumulative iteration cap across all unrolls in a single kernel. Catches pathological nested static fors (e.g. 27³ = 19683) that individually stay below unrolling_hard_limit. |
| ti.cfg.func_inline_depth_limit | P3.b | Hard cap on non-real @ti.func inline recursion depth. |
| SNode.snode_tree_id | (inherited from 1.8.0 backport) | Numeric ID of the owning SNode tree; available on upstream master but not released in 1.7.4. |
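
The difference between the two unrolling caps is easiest to see as arithmetic. The sketch below is a plain-Python model, not the fork's implementation: it assumes nested static loops multiply their trip counts, and that a limit of 0 disables a check (the README states this only for unrolling_hard_limit). The function name `unroll_check` and the limit values in the example are hypothetical.

```python
from math import prod


def unroll_check(trip_counts, per_loop_limit, kernel_limit):
    """Model the two caps for a nest of static loops.

    trip_counts: iterations of each nested static loop, outermost first.
    Returns (per_loop_ok, kernel_ok, total_unrolled_iterations).
    A limit of 0 is treated as "no cap" (assumption for the kernel limit).
    """
    largest_single = max(trip_counts)
    total = prod(trip_counts)  # nested unrolls multiply
    per_loop_ok = per_loop_limit == 0 or largest_single <= per_loop_limit
    kernel_ok = kernel_limit == 0 or total <= kernel_limit
    return per_loop_ok, kernel_ok, total


# Three nested 27-iteration static loops: each stays under a per-loop cap
# of 64, but together they exceed a per-kernel cap of 10,000.
print(unroll_check([27, 27, 27], per_loop_limit=64, kernel_limit=10_000))
# (True, False, 19683)
```

In this model only the cumulative cap flags the 27³ nest, which matches the case the table describes for unrolling_kernel_hard_limit.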

Behavioural changes

These changes alter observable behaviour relative to 1.7.4. Most are performance-positive and have no API surface change; a few are documented below because they can shift numerical results at the bit level.

| Area | Change | Impact |
|---|---|---|
| Offline cache key | Dropped random_seed (P1.a) | Two runs with the same kernel but different seeds now share cache entries, eliminating spurious recompiles between RNG-using iterations. |
| Offline cache loader | In-process bytes mirror (P1.b) | Hot-start repeat ti.init(arch=ti.cuda) in the same process is up to 1.14× faster. |
| CUDA __ldg intrinsic | nvvm_ldg_global_{f,i} intrinsics replaced with load + !invariant.load metadata | Generated PTX still emits ld.global.nc; no perf delta observed, but IR differs. |
| IR passes | simplify is deduped via dirty-flag in compile_to_offloads (P2.a); loop_invariant_code_motion is guarded by first-iteration check (P2.b); WholeKernelCSE's MarkUndone walker is O(users) instead of O(N) (pre-P2) | CPU compile 0.89×–1.00×, CUDA 0.99×–1.30×, Vulkan 1.30×–1.38× faster on the heavy-kernel suite. |
| scalarize pass | Typed-stmt visit bug in HasMatrixStmt fixed (P3.c). Early-exit experiment was reverted (miscompile risk). | Behaves correctly in presence of typed matrix stmts. No change if you weren't hitting the miscompile. |
| Kernel compilation thread safety | KernelCompilationManager now holds an internal mutex (P5.a) | Enables ti.compile_kernels. Single-threaded callers pay one uncontended mutex lock per compile. |
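
The offline cache key change can be pictured with a toy hash. This is an illustrative sketch only: the real key derivation lives in the C++ compiler and is not shown in this README, and the function `cache_key`, the config field names other than random_seed, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json

# Fields deliberately left out of the key, mirroring the P1.a change.
IGNORED_FIELDS = frozenset({"random_seed"})


def cache_key(kernel_source, config):
    """Hash the kernel source plus every config field that affects codegen."""
    relevant = {k: v for k, v in config.items() if k not in IGNORED_FIELDS}
    payload = json.dumps({"source": kernel_source, "config": relevant},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


src = "def add(a, b, c): ..."
key_seed1 = cache_key(src, {"arch": "cuda", "random_seed": 1})
key_seed2 = cache_key(src, {"arch": "cuda", "random_seed": 2})
key_vulkan = cache_key(src, {"arch": "vulkan", "random_seed": 1})

print(key_seed1 == key_seed2)   # True: the seed no longer splits the cache
print(key_seed1 == key_vulkan)  # False: the backend still matters
```

Excluding a field from the key is only safe when it does not change the generated code, which is the premise of the change described in the table.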

Removed / deprecated

| Symbol | Status |
|---|---|
| Python 3.9 support | Removed. Minimum is 3.10. |
| wheel direct build-system dependency | Removed; scikit-build-core integrates bdist_wheel natively. |
| setup.py bdist_wheel invocation | Still works via a compatibility shim that delegates to python -m build. Use PEP 517 entry points (pip install ., python -m build -w) in new code. |

Not yet validated in this fork

The main branch is tested end-to-end on Linux x86_64 and Windows x86_64 with the CUDA, Vulkan, OpenGL, GLES and CPU backends. The following paths build but have not been regression-tested since the LLVM 20 migration:

  • macOS (Apple Silicon / Intel) — Metal backend
  • AMDGPU backend
  • Android ARM64 C-API

Patches and reports welcome.


Quick start

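import numpy as np
import taichi as ti

# "fast" is the fork's lowest-latency compile tier (see compile_tier above).
ti.init(arch=ti.cuda, compile_tier="fast")

@ti.kernel
def add(a: ti.types.ndarray(), b: ti.types.ndarray(), c: ti.types.ndarray()):
    # Element-wise sum; Taichi parallelizes the outermost loop.
    for i in a:
        c[i] = a[i] + b[i]

n = 1 << 20
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
c = np.empty_like(a)
add(a, b, c)  # NumPy arrays are passed directly as ndarray arguments.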
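
The quick start selects compile_tier="fast". How that tier interacts with an explicit llvm_opt_level can be modelled in a few lines, based only on what the API table states: the override wins, and fast means -O0 with a floor of -O1 on NVPTX/AMDGCN. This is a sketch, not the fork's code; the function name, the target strings, and the use of None for "no override" or "level not documented here" are assumptions.

```python
GPU_LLVM_TARGETS = frozenset({"nvptx", "amdgcn"})
KNOWN_TIERS = ("fast", "balanced", "full")


def effective_llvm_opt_level(compile_tier, target, llvm_opt_level=None):
    """Return the modelled LLVM -O level, or None where this README is silent."""
    if compile_tier not in KNOWN_TIERS:
        raise ValueError(f"unknown compile_tier: {compile_tier!r}")
    if llvm_opt_level is not None:
        if not 0 <= llvm_opt_level <= 3:
            raise ValueError("llvm_opt_level must be in 0..3")
        return llvm_opt_level  # explicit override takes precedence
    if compile_tier == "fast":
        # -O0 on CPU, but never below -O1 on the GPU LLVM targets.
        return 1 if target in GPU_LLVM_TARGETS else 0
    # Levels for "balanced" and "full" are not given in this README.
    return None


print(effective_llvm_opt_level("fast", "x86_64"))                   # 0
print(effective_llvm_opt_level("fast", "nvptx"))                    # 1
print(effective_llvm_opt_level("fast", "nvptx", llvm_opt_level=3))  # 3
```

The last call shows the precedence rule: an explicit level is returned unchanged regardless of the tier.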

Pre-compiling a batch of kernels (fork-only)

import taichi as ti
ti.init(arch=ti.cuda)

# Placeholder kernel bodies, for illustration only.
@ti.kernel
def k1(x: ti.types.ndarray()):
    for i in x:
        x[i] = 2.0 * x[i]

@ti.kernel
def k2(x: ti.types.ndarray(), y: ti.types.ndarray()):
    for i in x:
        y[i] = x[i] + 1.0

# Specialize + compile both on the thread pool before the hot loop.
ti.compile_kernels([k1, k2])
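
The shape of a batch pre-compile (submit everything to a worker pool, guard shared state with a mutex, return the number submitted) can be sketched in plain Python. This is an analogy to the behaviour described in the tables above, not the fork's implementation; `ToyCompilationManager` and `compile_batch` are hypothetical names, and the "compile" step is a stand-in string operation.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyCompilationManager:
    """Stand-in for a compilation manager whose cache is mutex-protected."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}

    def compile(self, name):
        artifact = f"compiled<{name}>"  # pretend this is the expensive part
        with self._lock:  # one short critical section per compile
            self._cache.setdefault(name, artifact)
        return artifact

    def cached(self):
        with self._lock:
            return dict(self._cache)


def compile_batch(manager, names, num_threads=None):
    """Compile all names on a thread pool; return how many were submitted."""
    names = list(names)
    workers = num_threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so worker exceptions surface here.
        list(pool.map(manager.compile, names))
    return len(names)


manager = ToyCompilationManager()
print(compile_batch(manager, ["k1", "k2"], num_threads=2))  # 2
print(sorted(manager.cached()))  # ['k1', 'k2']
```

Holding the lock only around the cache update keeps the expensive work parallel, which is why a single-threaded caller pays just one uncontended lock per compile.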

Command-line cache warmup (fork-only)

ti cache warmup train.py -- --epochs 1
# Subsequent `python train.py` runs start with a populated offline cache.

Building from source

git clone https://github.com/fancifulland2718/taichi-forge.git
cd taichi-forge
python -m pip install -r requirements_dev.txt
python -m pip install -e . --no-build-isolation -v

The build is driven entirely by pyproject.toml / scikit-build-core. See docs/design/pypi_release.md and compile_doc/LLVM20_升级分析.md for the toolchain details. Windows developers can run scripts/build_llvm20_local.ps1 to produce a local LLVM 20 snapshot under dist/taichi-llvm-20/ before building the wheel.


Versioning

Taichi Forge uses its own SemVer track starting at 0.1.2. Fork release numbers do not match upstream taichi versions.

  • 0.1.x — LLVM 20 + VS 2026 + Python 3.14 + compile-perf work (P1–P5). Backend coverage: Linux/Windows x86_64, CUDA, Vulkan, OpenGL, GLES, CPU.
  • 0.2.x — planned: macOS/Metal regression suite, scikit-build-core wheel tags for manylinux_2_28.

License

Apache 2.0, same as upstream. See LICENSE. All upstream copyright notices are preserved.


Acknowledgements

Taichi Forge is built on top of the work of the upstream Taichi developers at taichi-dev/taichi — the core compiler, runtime, and the vast majority of the Python frontend are theirs. This fork only carries the delta described above.

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge
