Static Local Linearization for Differentiable Discrete Programming


🔷 SLL-Core: Static Local Linearization

A zero-intrusion differentiation engine for discrete programs




🤯 The Problem: Why Can't Discrete Programs Be Differentiated Automatically?

Discrete decisions are everywhere in deep learning:

  • Quantization: round(x), floor(x)
  • Thresholding: sign(x), x > 0
  • Categorical selection: argmax(x)

But these operations share a fatal property: their gradient is zero almost everywhere, so standard backpropagation fails outright.

import torch

x = torch.tensor([0.5], requires_grad=True)
target = torch.tensor([1.0])

y = torch.sign(x)   # ❌ gradient is 0; the parameter can never update
loss = (y - target).pow(2).sum()
loss.backward()
print(x.grad)       # tensor([0.]) ← dead

Drawbacks of Traditional Approaches

| Method | Code changes required | Deployment residue | Gradient quality | Convergence stability |
|---|---|---|---|---|
| Training hard functions directly | ✅ None | ✅ None | ❌ Zero gradient, untrainable | ❌ Never converges |
| Sigmoid / Softmax relaxation | ❌ Rewrite the model | ❌ Approximation error remains | ⚠️ Vanishing/exploding gradients | ⚠️ Hard to tune |
| Straight-Through Estimator (STE) | ❌ Hand-written custom gradients | ✅ None | ⚠️ Inaccurate gradient direction | ⚠️ Prone to oscillation |
| Reparameterization / Gumbel-Softmax | ❌ Change the model structure | ❌ Temperature parameter remains | ⚠️ High variance | ⚠️ Slow |
| ⭐ SLL (Static Local Linearization) | ✅ Zero intrusion | ✅ Hard logic strictly restored | ✅ Constant gradient, no vanishing | ✅ Converges stably |

SLL's core insight: there is no need to approximate over the entire domain. Linearize locally only within an ε-interval around each decision boundary, and keep the original hard logic everywhere else. As ε → 0, the optimal solution converges to that of the original discrete problem.
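The idea fits in a few lines of plain PyTorch. This is an illustrative sketch, not the sll implementation; sll_heaviside is a hypothetical helper:

```python
import torch

def sll_heaviside(x: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Heaviside step with an eps-wide linearized band around x = 0."""
    hard = (x > 0).to(x.dtype)          # original hard step; gradient is 0
    ramp = 0.5 + x / (2 * eps)          # linear ramp; gradient is 1/(2*eps)
    return torch.where(x.abs() <= eps, ramp, hard)

x = torch.tensor([-1.0, 0.0, 1.0], requires_grad=True)
y = sll_heaviside(x, eps=1e-2)
y.sum().backward()
print(y.detach())   # hard outputs outside the band, ramp value 0.5 at x = 0
print(x.grad)       # 0 outside the band, constant 1/(2*eps) = 50 inside
```

torch.where routes the gradient to whichever branch was selected, so inputs inside the band receive the constant ramp gradient while everything else keeps the exact hard output.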


⚡ The One-Line Fix

import torch
import sll

x = torch.tensor([-1.0, 0.0, 1.0], requires_grad=True)

@sll.linearize(eps=1e-2)
def compute(x):
    y = torch.sign(x)              # automatically differentiable!
    z = torch.round(y * 10)
    return z.sum()

loss = compute(x)
loss.backward()

print(x.grad)                      # gradients flow normally ✅

Once outside the decorator, torch.sign automatically reverts to the original hard logic: differentiable during training, zero overhead at deployment.
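One way this kind of zero-intrusion patching can work (a hypothetical mechanism sketch, not the actual sll implementation) is to swap torch.sign for a linearized version only for the duration of the call and restore it afterwards:

```python
import contextlib
import torch

@contextlib.contextmanager
def linearized_sign(eps: float):
    """Temporarily replace torch.sign with a version that is linear
    inside the eps-band around zero, then restore the original."""
    hard_sign = torch.sign
    def soft_sign(x):
        return torch.where(x.abs() <= eps, x / eps, hard_sign(x))
    torch.sign = soft_sign
    try:
        yield
    finally:
        torch.sign = hard_sign               # hard logic restored on exit

x = torch.tensor([0.0], requires_grad=True)
with linearized_sign(eps=1e-2):
    torch.sign(x).sum().backward()           # in-band gradient is 1/eps
print(x.grad)                                # tensor([100.])
print(torch.sign(torch.tensor([-2.0])))      # tensor([-1.]) : original restored
```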


🚀 Installation

pip install sll-core

Requirements: Python ≥ 3.8, PyTorch ≥ 1.9.0


🎯 Quick Start

Option 1: Decorator (Recommended)

import torch
import sll

@sll.linearize(eps=1e-3)
def my_custom_algorithm(x):
    mask = (x > 0.5).float()       # discovered and softened automatically
    y = torch.sign(x)              # discovered and softened automatically
    return mask * y

x = torch.tensor([-0.5, 0.5], requires_grad=True)
y = my_custom_algorithm(x)
y.sum().backward()                  # gradients flow normally ✅

Option 2: Auto-Discovery

Discrete operations are probed and softened automatically at runtime:

import torch
from sll.discovery import auto_discover

@auto_discover(eps=1e-3)
def algorithm(x):
    a = torch.sign(x)
    b = torch.round(a * 10)
    return b

Option 3: Manual Operators

Use the predefined SLL operators directly:

import torch
from sll.ops import heaviside, sign, round, floor, ceil

x = torch.tensor([0.0], requires_grad=True)
y = sign(x, eps=1e-3)
y.backward()
print(x.grad)                      # tensor([500.])

📊 Why Is SLL Better?

Gradient Quality Comparison

| | Hard function | STE | Sigmoid relaxation | SLL |
|---|---|---|---|---|
| Forward output | [-1, 0, 1] | [-1, 0, 1] | Continuous (approximate) | Exact hard output |
| Gradient near the boundary | 0 | 1 (constant) | Gaussian peak (prone to vanishing) | Constant 1/(2ε) |
| Gradient far from the boundary | 0 | 1 | ≈ 0 | 0 (hard logic) |
| Temperature tuning required | No | No | Yes (β must be tuned) | No |

Mathematical Principle

SLL builds a local linearization interval around each discrete decision boundary:

$$ y(x) = \begin{cases} 0.5 + x/(2\epsilon) & |x| \leq \epsilon \\ H(x) & \text{otherwise} \end{cases} $$

where H(x) is the original Heaviside step function. As ε → 0, y(x) → H(x), and the optimal solution converges to that of the original problem.
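The "strict recovery" claim can be checked numerically: once ε drops below an input's distance from the boundary, the piecewise function returns H(x) exactly. A sketch in plain PyTorch, with a hypothetical sll_heaviside helper standing in for the library:

```python
import torch

def sll_heaviside(x, eps):
    # piecewise form from the formula above
    hard = (x > 0).to(x.dtype)
    return torch.where(x.abs() <= eps, 0.5 + x / (2 * eps), hard)

x = torch.tensor([0.25])
H = (x > 0).to(x.dtype)                      # exact hard output: tensor([1.])
for eps in (1.0, 0.5, 0.1, 0.01):
    y = sll_heaviside(x, eps)
    print(f"eps={eps:<5} y={y.item():.4f} exact={bool(torch.equal(y, H))}")
# Once eps < |x| = 0.25, the output equals H(x) exactly; no residual approximation.
```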


📋 Supported Differentiable Discrete Operators

Built-in Operators (Ready to Use)

| Operator | Description | Example |
|---|---|---|
| heaviside | Heaviside step function | sll.heaviside(x) |
| sign | Sign function | sll.sign(x) |
| round | Round to nearest integer | sll.round(x) |
| floor | Round down | sll.floor(x) |
| ceil | Round up | sll.ceil(x) |
| threshold | Generic threshold function | sll.threshold(x, threshold=0.5) |

Auto-Discovery Mechanism

Through runtime probing, SLL can automatically identify and soften:

  • ✅ User-defined discrete functions
  • ✅ Complex composite discrete logic

🔬 Real-World Use Cases

Use Case 1: Quantization-Aware Training (QAT)

import torch
import sll

def quantize(x, levels=256):
    scale = (levels - 1) / (x.max() - x.min() + 1e-10)
    return torch.round((x - x.min()) * scale) / scale + x.min()

x = torch.randn(10, requires_grad=True)

@sll.linearize(eps=1e-3)
def forward(x):
    return quantize(x)

y = forward(x)
y.sum().backward()
print("Quantization gradients:", x.grad)   # ✅ gradients flow normally

Use Case 2: Combinatorial Optimization (Knapsack Problem)

import torch
import sll

item_weights = torch.tensor([2, 3, 4, 5], dtype=torch.float32)
item_values = torch.tensor([3, 4, 5, 6], dtype=torch.float32)
capacity = torch.tensor(8.0)

@sll.linearize(eps=1e-2)
def knapsack(probabilities):
    selected = (probabilities > 0.5).float()
    total_weight = (selected * item_weights).sum()
    total_value = (selected * item_values).sum()
    penalty = torch.max(torch.tensor(0.0), total_weight - capacity) * 100
    return total_value - penalty

probabilities = torch.sigmoid(torch.randn(4)).detach().requires_grad_(True)  # leaf tensor the optimizer can update
optimizer = torch.optim.Adam([probabilities], lr=1e-2)

for epoch in range(100):
    optimizer.zero_grad()
    total_value = knapsack(probabilities)
    (-total_value).backward()
    optimizer.step()

print("Best value:", total_value.item())   # ✅ optimized via gradients
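After training, the learned probabilities are still continuous. A natural final step (our addition for illustration, not part of the sll API) is to threshold them back into a hard 0/1 selection and verify feasibility against the capacity, shown here with example post-training values:

```python
import torch

item_weights = torch.tensor([2., 3., 4., 5.])
item_values  = torch.tensor([3., 4., 5., 6.])
capacity = 8.0

probabilities = torch.tensor([0.2, 0.9, 0.1, 0.8])   # example learned values
selected = (probabilities > 0.5).float()             # hard 0/1 decode
weight = (selected * item_weights).sum().item()
value  = (selected * item_values).sum().item()
print(f"selection={selected.tolist()}  weight={weight}  "
      f"value={value}  feasible={weight <= capacity}")
```

Because the SLL objective already penalizes capacity violations during training, the decoded selection is expected to be feasible; this check makes that explicit.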

⚙️ Parameters

  • eps: half-width of the linearization interval, default 1e-3
    • Inputs within eps of a hard boundary: the linearized approximation is used
    • Inputs farther than eps from a hard boundary: the original hard logic is used
    • Smaller eps: closer to the hard logic, but a narrower gradient region
    • Larger eps: smoother transitions, but a wider approximation region

🏛️ Project Structure

sll-core/
├── sll/
│   ├── __init__.py          # module exports
│   ├── core.py              # core API (linearize)
│   ├── discovery.py         # auto-discovery decorator
│   └── ops.py               # SLL operator implementations (incl. factory functions)
├── tests/
│   ├── test_discovery.py    # discrete-op discovery tests
│   ├── test_gradcheck.py    # gradient-check tests
│   ├── test_ops.py          # operator tests
│   └── test_large_scale.py  # large-scale scenario tests
├── README.md
├── README_EN.md
├── LICENSE
└── pyproject.toml

📄 License

MIT License


🤝 Contributing

Issues and pull requests are welcome!

Development Setup

git clone https://github.com/jacksong-sourse/sll-core.git
cd sll-core
pip install -e ".[dev]"

Running Tests

pytest tests/ -v

📚 Citation

If you use SLL in your research, please cite:

@software{sll-core,
  title = {SLL-Core: Static Local Linearization for Differentiable Discrete Programming},
  author = {Jacksong},
  year = {2026},
  url = {https://github.com/jacksong-sourse/sll-core},
}
