
Taichi Forge - a community-maintained fork of the Taichi Programming Language (import name: taichi_forge).

Project description

Taichi Forge

A community-maintained fork of taichi focused on compile-time performance, modern toolchains (LLVM 20, VS 2026, Python 3.10-3.14), and tighter compile-time safety rails.



Install

pip install taichi-forge

The import name is unchanged — existing code continues to work as-is:

import taichi as ti
ti.init(arch=ti.cuda)

Every upstream Taichi 1.7.4 public API that we still ship behaves identically.


Why a fork?

Upstream Taichi 1.7.4 shipped against LLVM 15, Python ≤ 3.12, and the Visual Studio 2019/2022 toolchain. Since then the JIT ecosystem has moved on:

  • LLVM 15 no longer compiles cleanly with current CUDA / NVPTX toolchains.
  • Python 3.12 removed distutils; 3.13 and 3.14 remove further deprecated stdlib APIs.
  • Modern Windows developer setups default to VS 2026 (MSVC 14.50+), which rejects some headers hard-wired in the original build scripts.

Taichi Forge is the rolling result of those maintenance upgrades, along with compile-time performance improvements that reduce cold-start and warm-start latency.


Supported toolchain

Area             Requirement
Python           3.10 – 3.14 (3.9 dropped)
Windows MSVC     VS 2026 (MSVC 14.50+)
LLVM             20.1.7 (bundled in the wheel)
CMake            3.20+
CUDA (optional)  NVCC 12.x

Validated backends

End-to-end tested on Linux x86_64 and Windows x86_64:

  • ✅ CPU (LLVM JIT)
  • ✅ CUDA
  • ✅ Vulkan
  • ✅ OpenGL / GLES

Not yet regression-tested since the LLVM 20 migration:

  • ⚠️ macOS (Apple Silicon / Intel) — Metal backend
  • ⚠️ AMDGPU backend
  • ⚠️ Android ARM64 (C-API)

Patches and reports welcome.


New APIs and settings (fork-only)

All additions are strictly opt-in; default values preserve bit-identical behaviour vs. upstream 1.7.4.

New functions

ti.compile_kernels(kernels)
    Pre-compiles a list of kernels on a background thread pool before the hot loop. Accepts decorated kernels or (kernel, args_tuple) pairs; returns the number of kernels submitted.

ti cache warmup script.py
    CLI command: runs script.py once with the offline cache forced on, warming kernel artifacts for subsequent cold starts.

ti.compile_profile()
    Context manager: on exit, prints a per-pass timing report and can optionally write a CSV or a Chrome trace.

@ti.kernel(opt_level=...)
    Per-kernel LLVM optimization-level override ("fast" / "balanced" / "full", or 0–3). Each override gets its own isolated cache key.
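What a per-pass timing report from ti.compile_profile() boils down to can be sketched in plain Python. Everything below (the CompileProfile class, the record method, the pass labels) is a hypothetical stand-in for illustration, not the fork's implementation:

```python
import time

class CompileProfile:
    """Illustrative stand-in for ti.compile_profile(): collect named
    timings and print a per-pass report on exit."""
    def __init__(self):
        self.timings = {}

    def record(self, pass_name):
        return _Timer(self, pass_name)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # Per-pass report, slowest first.
        for name, secs in sorted(self.timings.items(), key=lambda kv: -kv[1]):
            print(f"{name:<24s} {secs * 1e3:8.2f} ms")
        return False

class _Timer:
    def __init__(self, prof, name):
        self.prof, self.name = prof, name

    def __enter__(self):
        self.start = time.perf_counter()

    def __exit__(self, *exc):
        elapsed = time.perf_counter() - self.start
        self.prof.timings[self.name] = self.prof.timings.get(self.name, 0.0) + elapsed
        return False

with CompileProfile() as prof:
    with prof.record("full_simplify"):
        sum(range(100_000))   # stand-in work for a compiler pass
    with prof.record("codegen"):
        sum(range(50_000))
```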

ti.init(...) / CompileConfig knobs

compile_tier (default "balanced")
    "fast" lowers LLVM to -O0 (floor -O1 on NVPTX/AMDGCN) and the SPIR-V optimizer to level 1; "full" preserves pre-fork behaviour.

llvm_opt_level (default -1: use tier)
    Explicit LLVM -O override (0–3).

spv_opt_level (default -1: use tier)
    Explicit spirv-opt optimization-level override.

num_compile_threads (default: logical-core count)
    Thread-pool size for ti.compile_kernels.

unrolling_hard_limit (default 0: off)
    Per-ti.static(for ...) unroll-iteration cap. Aborts with TaichiCompilationError instead of silently burning seconds.

unrolling_kernel_hard_limit (default 0: off)
    Total unroll-iteration cap across a single kernel.

func_inline_depth_limit (default: upstream default)
    Hard cap on @ti.func inline recursion depth.

cache_loop_invariant_global_vars (default False)
    Set True to opt in to SNode loop-invariant caching in hot loops. (The default matches vanilla 1.7.4.)

use_fused_passes (default False)
    Enables the pipeline_dirty short-circuit that skips redundant full_simplify invocations. Numerically bit-identical to off.

tiered_full_simplify (default True)
    Splits full_simplify into a local fixed-point pass followed by a single global round per iteration. Set False to match the legacy cadence.

compile_dag_scheduler (default True)
    Anti-saturation scheduler for ti.compile_kernels batches; balances the inner LLVM thread pool against the outer kernel pool. Set False for the legacy two-tier model.

spirv_parallel_codegen (default False)
    Opt-in task-level parallel SPIR-V codegen per kernel.

spirv_disabled_passes (default [])
    Per-call list of individual spirv-opt passes to disable (e.g. ["loop-unroll"]).

auto_real_function (default False)
    Auto-promotes expensive @ti.func instances to is_real_function=True (LLVM-only, non-autodiff).

auto_real_function_threshold_us (default 1000)
    Promotion threshold, in microseconds of estimated compile cost.
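Taken together, a latency-focused development setup might combine several of these knobs. The fragment below is illustrative only: the knob names come from the table above, but the specific values are arbitrary examples, not recommendations:

```python
import taichi as ti

# Illustrative latency-focused configuration (fork-only knobs from the
# table above; the values here are examples, not recommendations).
ti.init(
    arch=ti.vulkan,
    compile_tier="fast",                    # -O0 LLVM, SPIR-V optimizer level 1
    num_compile_threads=8,                  # pool used by ti.compile_kernels
    unrolling_hard_limit=4096,              # cap runaway ti.static(for ...) unrolls
    spirv_disabled_passes=["loop-unroll"],  # skip the heaviest spirv-opt pass
)
```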

Compatibility note

  • SNode.snode_tree_id — backported from upstream master (not in 1.7.4 release); available on all backends.
  • offline_cache_l_sem — internal/testing flag, default off. Not for production use.

Quick start

import taichi as ti

ti.init(arch=ti.cuda, compile_tier="fast")

@ti.kernel
def add(a: ti.types.ndarray(), b: ti.types.ndarray(), c: ti.types.ndarray()):
    for i in a:
        c[i] = a[i] + b[i]

import numpy as np
n = 1 << 20
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
c = np.empty_like(a)
add(a, b, c)

Pre-compiling a batch of kernels (fork-only)

import taichi as ti
ti.init(arch=ti.cuda)

@ti.kernel
def k1(x: ti.types.ndarray()): ...
@ti.kernel
def k2(x: ti.types.ndarray(), y: ti.types.ndarray()): ...

# Specialize + compile both on the thread pool before the hot loop.
ti.compile_kernels([k1, k2])
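Conceptually, ti.compile_kernels submits specialization work to a thread pool and returns the number of kernels submitted. The plain-Python sketch below illustrates that pattern, including the (kernel, args_tuple) pair form; the stand-in compile_kernels function and the toy kernels are hypothetical, not Taichi internals:

```python
from concurrent.futures import ThreadPoolExecutor

def compile_kernels(kernels, num_threads=4):
    """Stand-in for ti.compile_kernels: submit each kernel (optionally
    paired with an args tuple for specialization) to a pool and return
    the number of kernels submitted."""
    pool = ThreadPoolExecutor(max_workers=num_threads)
    futures = []
    for entry in kernels:
        kernel, args = entry if isinstance(entry, tuple) else (entry, ())
        futures.append(pool.submit(kernel, *args))  # "compile" = call the stand-in
    pool.shutdown(wait=True)                        # block until the batch is warm
    return len(futures)

compiled = []
def k1():
    compiled.append("k1")
def k2(x):
    compiled.append(("k2", x))

n = compile_kernels([k1, (k2, (42,))])
```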

Command-line cache warmup (fork-only)

ti cache warmup train.py -- --epochs 1
# Subsequent `python train.py` runs start with a populated offline cache.

Building from source

git clone https://github.com/fancifulland2718/taichi-forge.git
cd taichi-forge
python -m pip install -r requirements_dev.txt
python -m pip install -e . --no-build-isolation -v

The build is driven entirely by pyproject.toml / scikit-build-core. On Windows, build a local LLVM 20 snapshot first:

.\scripts\build_llvm20_local.ps1   # produces dist\taichi-llvm-20\

Versioning

Taichi Forge uses its own SemVer track starting at 0.1.2. Fork release numbers do not match upstream taichi versions.

  • 0.1.x — LLVM 20 + VS 2026 + Python 3.14 + initial compile-performance improvements. Backends: Linux/Windows x86_64, CUDA, Vulkan, OpenGL, GLES, CPU.
  • 0.2.x — deeper compile-time, runtime-cache, and toolchain-modernization work. Current line.
  • 0.3.x — planned: additional feature support on top of the 0.2.x stabilization line.

Release notes

0.2.4 (current)

This release rolls up the full 0.2.x compile-time, runtime-cache, IR-pass, and dependency-modernization work into a single wheel. All new behaviour is opt-in via ti.init(...) / CompileConfig knobs (already documented above); defaults remain bit-identical to upstream Taichi 1.7.4.

Compile-time performance

  • Fused-pass driver: use_fused_passes adds a pipeline_dirty short-circuit around full_simplify so that no-op iterations are skipped. On representative workloads, roughly 48.6% of full_simplify invocations were measured to be observable no-ops.
  • Tiered full_simplify (tiered_full_simplify, default on): splits the legacy fixed-point loop into a local fixed-point phase plus a single global round per outer iteration, while preserving final IR.
  • DAG-aware scheduler for ti.compile_kernels (compile_dag_scheduler, default on): balances the inner LLVM thread pool against the outer kernel pool to avoid thread oversubscription on batch warm-up.
  • Single-offload bypass on the LLVM CPU path: removes the prior 0.89× CPU regression introduced by earlier batch-compile work.
  • Per-kernel opt_level= override and compile_tier="fast"|"balanced"|"full" presets, with isolated cache keys so mixed-tier batches do not poison each other.
  • The SPIR-V pipeline gains a per-call spirv_disabled_passes disable list, with cache-key isolation. Disabling loop-unroll alone gives ~54% SPIR-V codegen wall-time reduction on the validated Vulkan suite; disabling the three heaviest passes gives ~61%, with byte-identical kernel results.
  • Optional task-level parallel SPIR-V codegen per kernel (spirv_parallel_codegen).
  • Auto real-function promotion (auto_real_function + auto_real_function_threshold_us) and budget-aware inlining fallback in the LLVM-only path; both default off.
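The fused-pass short-circuit above reduces to a dirty flag that mutating passes OR-combine and that gates full_simplify. A minimal pure-Python sketch of the idea (the Pipeline class and its method names are illustrative, not the fork's IR code):

```python
class Pipeline:
    """Toy IR pipeline: mutating passes set a dirty flag, and
    full_simplify is skipped outright when nothing has changed since
    the last run -- the use_fused_passes short-circuit in miniature."""
    def __init__(self):
        self.dirty = True            # fresh IR always needs one pass
        self.simplify_runs = 0

    def mutating_pass(self, changed_ir: bool):
        # OR-combine: a no-op call site must not mark the pipeline dirty.
        self.dirty = self.dirty or changed_ir

    def full_simplify(self):
        if not self.dirty:
            return False             # observable no-op invocation skipped
        self.simplify_runs += 1
        self.dirty = False
        return True

p = Pipeline()
ran_first = p.full_simplify()        # runs: initial IR is dirty
p.mutating_pass(changed_ir=False)    # no-op pass leaves the flag clear
ran_second = p.full_simplify()       # short-circuited
p.mutating_pass(changed_ir=True)
ran_third = p.full_simplify()        # runs again
```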

Offline cache and runtime caches

  • Parallel disk-read for offline cache: metadata-hit but ckd-miss path now reads outside the cache mutex and serializes duplicate requests via an in-progress key set. Validated 12-kernel × Vulkan double-process cold start: 290.1 ms (prime) → 83.1 ms (hit), 3.49× faster with byte-identical per-kernel artifacts.
  • CompileConfig key audit + offline-cache schema versioning: unrecognized cache versions now fall back to recompile cleanly instead of crashing.
  • rhi_cache.bin now uses atomic write-then-rename to eliminate half-written cache files after abrupt termination.
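The write-then-rename pattern behind the rhi_cache.bin fix can be sketched in a few lines of Python: write to a temporary sibling file, fsync, then os.replace, which swaps the file atomically on both POSIX and Windows. This is a generic sketch of the technique, not the fork's C++ implementation:

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Write-then-rename: readers either see the old complete file or
    the new complete file, never a half-written one."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".rhi_cache.")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # ensure bytes hit disk before the rename
        os.replace(tmp, path)         # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise

atomic_write("rhi_cache.bin", b"\x00cache-payload")
```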

IR / passes

  • pipeline_dirty is now explicit and OR-combined across the five mutating passes that can dirty the pipeline, removing spurious dirty marks at no-op call sites. Validated across CPU / CUDA / Vulkan smoke matrices with no regression.
  • Defensive assert + "type-query forbidden zone" notes on linking_context_data->llvm_context to catch accidental cross-context type queries early.

Toolchain and third-party libraries

  • spdlog 1.14.1 → 1.15.3.
  • Vulkan-Headers / volk / SPIRV-Headers / SPIRV-Tools aligned to Vulkan SDK 1.4.341 as a single coordinated bump.
  • googletest 1.10.0 → 1.17.0 (test-only, no runtime impact).
  • glm 0.9.9.8+187 → 1.0.3.
  • imgui v1.84 (WIP) → v1.91.9b (non-docking branch). The Vulkan backend was migrated to the new ImGui_ImplVulkan_InitInfo layout (RenderPass + ApiVersion fields, self-managed font texture, LoadFunctions(api_version, loader) signature). GGUI visual-regression suite: 90 / 90 passing on Vulkan + CUDA backends.

Compatibility

  • All public Python and C-API surfaces from upstream Taichi 1.7.4 remain unchanged. New configuration knobs are additive; their defaults preserve pre-fork behaviour.
  • Build toolchain: LLVM 20.1.7, MSVC 14.50+ (VS 2026), Python 3.10–3.14 — unchanged from 0.1.x.

License

Apache 2.0, same as upstream. See LICENSE. All upstream copyright notices are preserved.


Acknowledgements

Taichi Forge is built on top of the work of the upstream Taichi developers at taichi-dev/taichi. The core compiler, runtime, and the vast majority of the Python frontend are theirs. This fork carries only the delta described above.



Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions


taichi_forge-0.2.4-cp314-cp314-win_amd64.whl (87.4 MB)
    CPython 3.14, Windows x86-64

taichi_forge-0.2.4-cp314-cp314-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl (67.3 MB)
    CPython 3.14, manylinux (glibc 2.34+ / 2.35+), x86-64

taichi_forge-0.2.4-cp313-cp313-win_amd64.whl (85.2 MB)
    CPython 3.13, Windows x86-64

taichi_forge-0.2.4-cp313-cp313-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl (67.3 MB)
    CPython 3.13, manylinux (glibc 2.34+ / 2.35+), x86-64

taichi_forge-0.2.4-cp312-cp312-win_amd64.whl (85.2 MB)
    CPython 3.12, Windows x86-64

taichi_forge-0.2.4-cp312-cp312-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl (67.3 MB)
    CPython 3.12, manylinux (glibc 2.34+ / 2.35+), x86-64

taichi_forge-0.2.4-cp311-cp311-win_amd64.whl (85.2 MB)
    CPython 3.11, Windows x86-64

taichi_forge-0.2.4-cp311-cp311-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl (67.3 MB)
    CPython 3.11, manylinux (glibc 2.34+ / 2.35+), x86-64

taichi_forge-0.2.4-cp310-cp310-win_amd64.whl (85.2 MB)
    CPython 3.10, Windows x86-64

taichi_forge-0.2.4-cp310-cp310-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl (67.3 MB)
    CPython 3.10, manylinux (glibc 2.34+ / 2.35+), x86-64

File details

Details for the file taichi_forge-0.2.4-cp314-cp314-win_amd64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 1174061c91b40d2e5471575c05f622772dd3dfae47eb0870caaaca6e175dc5e0
MD5 952e58e9b7935e1d90ba9472963eebe3
BLAKE2b-256 11d49cb7c9495ce47f016ae64ecabb68d5c27e1bd1076015818dee5732162f48


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp314-cp314-win_amd64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file taichi_forge-0.2.4-cp314-cp314-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp314-cp314-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 e55b6a61d534d177aa5272c6536bac4cacfb8f0831d2164022a603ff800a7abf
MD5 0048e0c4f933170903edd8d1cba5cadc
BLAKE2b-256 b3ff447df11c0cd35da47db36cafe225cd7acbee0113ae3c2f0d38e28a4a0b59


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp314-cp314-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp313-cp313-win_amd64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 12432c9ed17259945d9afb77350871b19837f8e7a3ef9cd447a4dba42fd89ce6
MD5 969dbd54a175388c9115acd0a20508b3
BLAKE2b-256 fe8789fb68f5e2b0b0d0da6500f946a8174a3c6bd5b842940aef189a5ad03948


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp313-cp313-win_amd64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp313-cp313-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp313-cp313-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 79e795e5da451ddc5c870f05092fff95402a5f3e55980b5128ee76a4abfdcf88
MD5 db31cb07897558ccc464e7fc7a267810
BLAKE2b-256 4560ddc77c5e3baf823c8d9ad99e436883739eafcfea645454cf71807970be6f


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp313-cp313-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 04f29cbd534e4bbf057e834a30423379911e6d8e00155729adcdca2dd6eb63e9
MD5 a876f5f9df56c75056a74b051123cb6c
BLAKE2b-256 fd2cbf93585ea817d562b5bc6546ef83021ac42b9a2d9432c3afa63e5e86da91


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp312-cp312-win_amd64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp312-cp312-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp312-cp312-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 4c9f1b06b974abd3db99dac1785a63ed3731e59f562f8865d092f00a5ee29f73
MD5 2ea580b3b59b94fba817344587cdbfea
BLAKE2b-256 3cf6a132f29a111cfadcbd46125353dacfd4ce47dfba20c68df5f73391d4f879


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp312-cp312-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp311-cp311-win_amd64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 692c99b1189fdafee3fbe1c4e6234f0ee0109e6b37b361c7c0d12b5101ef01a6
MD5 bed21e25f1b98935db0da8dc9f7731d3
BLAKE2b-256 b0a8ee77bd7f900a83c70971f27643b9d399e64ad2d32d97795c2361a1332762


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp311-cp311-win_amd64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp311-cp311-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp311-cp311-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 a06d5a580cf143f2a145d205cc1d4bda0cf8c238e233ca952629f6598c1df489
MD5 c26c2fa4f26f27ac11dabf57089538a8
BLAKE2b-256 42b4cdb4d507b7f342332125aaa441daff7c7ca9a00f57ab831a8eab45155707


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp311-cp311-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 bf03b5b830ad193d65142afd8663e633876862b23b75e22bcb2fd39b39d4ad21
MD5 3dcb0678888de3c80f88370a3d714deb
BLAKE2b-256 1d5d91fcc8c00041f4f503116728401428918f78dd0d36e93290c302f6c7d311


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp310-cp310-win_amd64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge


File details

Details for the file taichi_forge-0.2.4-cp310-cp310-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl.

File metadata

File hashes

Hashes for taichi_forge-0.2.4-cp310-cp310-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 1ac9da66cf8ca621e282f5f260d91c19ee77698d1830718804bbc07f02d338ab
MD5 3ab16c416360c2ddad99963260cec07b
BLAKE2b-256 6b3379eede53b3c3cb7cec7e8c3a4bc004f17796c503cb36795e8f7c16e7dfb0


Provenance

The following attestation bundles were made for taichi_forge-0.2.4-cp310-cp310-manylinux_2_34_x86_64.manylinux_2_35_x86_64.whl:

Publisher: publish_pypi.yml on fancifulland2718/taichi-forge

