DuckDB-powered factor research pipeline: formula engine, factor store, layered backtest
alpha101-pipeline
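An end-to-end, DuckDB-based toolchain for intraday factor research on China A-shares: formula engine → factor store → layered backtest → visualization.
Installation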
pip install alpha101-pipeline
# With plotting support:
pip install "alpha101-pipeline[plot]"
Quick start
# 1. Reorder the panel data (sorted by datetime, code; TIMESTAMP type; multiple row groups)
alpha101-reorder --input raw.parquet --output panel_sorted.parquet
# 2. Compute factors and save them to the store
alpha101-store add \
  --source panel_sorted.parquet \
  --store data/factors \
  --formula mom_12='ts_mean(delta(close,1),12)' \
  --formula rev_12='ts_mean(delta(close,1),12) * -1'
# 3. Backtest all factors
alpha101-backtest \
  --store data/factors \
  --out-dir output/backtest \
  --forward 12 --groups 10
# 4. Plot the layered (quantile-group) charts
alpha101-plot batch \
  --returns-root output/backtest/series \
  --reports-dir output/backtest/reports \
  --out-dir output/backtest/plots
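Forward return modes
Intraday backtests support two ways of computing forward returns:
# Default: forward returns do not cross trading days (avoids contamination from overnight gaps)
alpha101-backtest --store data/factors --out-dir output --forward 12 --groups 10
# Cross-day close-to-close returns (suited to long-horizon factors)
alpha101-backtest --store data/factors --out-dir output --forward 60 --groups 10 \
  --no-intraday-only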
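| Parameter | Description |
|---|---|
| `--intraday-only` / `--no-intraday-only` | Enabled by default. Restricts forward returns to the same trading day. `--no-intraday-only` allows cross-day close-to-close returns |
| `--bars-per-day` | Number of bars per trading day (default 48, i.e. 5-minute bars). When `--intraday-only` is enabled, `--forward` must be < `--bars-per-day`; otherwise an error is raised |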
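The two modes can be illustrated with a minimal pure-Python sketch. It is not the package's implementation: the function name `forward_returns` is hypothetical, and it assumes one stock's closes arrive in time order with exactly `bars_per_day` bars on every day.

```python
from typing import List, Optional


def forward_returns(
    closes: List[float],
    forward: int,
    bars_per_day: int = 48,
    intraday_only: bool = True,
) -> List[Optional[float]]:
    """Forward simple return close[t + forward] / close[t] - 1 for one stock.

    Assumption: every trading day contributes exactly `bars_per_day` bars,
    so bar i belongs to day i // bars_per_day.
    """
    if intraday_only and forward >= bars_per_day:
        # Mirrors the documented rule: --forward must be < --bars-per-day.
        raise ValueError("forward must be < bars_per_day when intraday_only is on")

    out: List[Optional[float]] = []
    for i, price in enumerate(closes):
        j = i + forward
        if j >= len(closes):
            out.append(None)  # no future bar available
        elif intraday_only and j // bars_per_day != i // bars_per_day:
            out.append(None)  # target bar falls on a later trading day
        else:
            out.append(closes[j] / price - 1.0)
    return out


# Two days of two bars each, one-bar horizon.
closes = [1.0, 2.0, 4.0, 8.0]
intraday = forward_returns(closes, forward=1, bars_per_day=2)
cross_day = forward_returns(closes, forward=1, bars_per_day=2, intraday_only=False)
```

In the intraday case the last bar of each day has no forward return, which is what keeps the overnight gap out of the sample.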
Python API
from alpha101_pipeline import FactorStore, run_intraday_multi
from pathlib import Path
# Compute factors
store = FactorStore(Path("data/factors"))
store.add_factors(
    [("mom_12", "ts_mean(delta(close,1),12)")],
    source=Path("panel_sorted.parquet"),
)
# Backtest
reports = run_intraday_multi(
    store.store_dir,
    ["mom_12"],
    source_panel=Path("panel_sorted.parquet"),
    factor_files={"mom_12": store.factor_path("mom_12")},
)
# Cross-day forward returns (by default, intraday_only=True computes intraday returns only)
reports = run_intraday_multi(
    store.store_dir,
    ["mom_12"],
    forward_period=60,  # crosses days (exceeds bars_per_day=48)
    bars_per_day=48,
    intraday_only=False,  # allow cross-day close-to-close returns
    source_panel=Path("panel_sorted.parquet"),
    factor_files={"mom_12": store.factor_path("mom_12")},
)
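Supported functions
Time-series window functions (PARTITION BY stock ORDER BY date)
| Function | Description |
|---|---|
| `delay(x, d)` | Value from d periods ago |
| `delta(x, d)` | Difference from the value d periods ago |
| `ts_sum(x, d)` / `sum(x, d)` | Rolling sum |
| `ts_mean(x, d)` / `mean(x, d)` / `sma(x, d)` | Rolling mean |
| `ts_min(x, d)` / `min(x, d)` | Rolling minimum |
| `ts_max(x, d)` / `max(x, d)` | Rolling maximum |
| `ts_stddev(x, d)` / `stddev(x, d)` | Rolling standard deviation |
| `ts_variance(x, d)` / `variance(x, d)` | Rolling variance |
| `ts_count(x, d)` | Rolling count of non-null values |
| `ts_count_not_nan(x, d)` | Rolling count of non-NaN values |
| `ts_zscore(x, d)` | Rolling z-score |
| `ts_pct_change(x, d)` | Rolling percentage change |
| `product(x, d)` | Rolling product |
| `decay_linear(x, d)` | Linearly decaying weighted sum |
| `ts_corr(x, y, d)` / `correlation(x, y, d)` | Rolling Pearson correlation |
| `ts_covariance(x, y, d)` / `covariance(x, y, d)` | Rolling covariance |
| `bollinger_upper(x, d)` | Upper Bollinger band |
| `bollinger_lower(x, d)` | Lower Bollinger band |
| `ts_median(x, d)` / `median(x, d)` | Rolling median |
| `ts_quantile(x, d, q)` / `quantile(x, d, q)` | Rolling quantile |
| `wma(x, d)` | Weighted moving average |
| `ts_skew(x, d)` / `skew(x, d)` | Rolling skewness |
| `ts_kurt(x, d)` / `kurt(x, d)` | Rolling kurtosis |
| `ts_mad(x, d)` / `mad(x, d)` | Rolling mean absolute deviation |
| `ts_rank(x, d)` | Rolling time-series rank |
| `slope(x, y, d)` / `regr_slope(x, y, d)` | Rolling regression slope |
| `rsquare(x, y, d)` / `regr_r2(x, y, d)` | Rolling regression R² |
| `resi(x, y, d)` / `regr_resid(x, y, d)` | Rolling regression residual |
| `idxmax(x, d)` / `ts_argmax(x, d)` | Position of the maximum within the rolling window |
| `idxmin(x, d)` / `ts_argmin(x, d)` | Position of the minimum within the rolling window |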
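To make the window semantics concrete, here is a small pure-Python sketch of `delay`, `delta`, and `ts_mean` applied to one stock's series. It only mirrors the descriptions in the table above; the package itself evaluates formulas in DuckDB, and the handling of incomplete windows here (returning `None` until d values are available) is an assumption, not documented behavior.

```python
from typing import List, Optional

Series = List[Optional[float]]


def delay(x: Series, d: int) -> Series:
    """Value from d periods ago; None while there is no history yet."""
    return [x[i - d] if i >= d else None for i in range(len(x))]


def delta(x: Series, d: int) -> Series:
    """Difference between the current value and the value d periods ago."""
    return [
        a - b if a is not None and b is not None else None
        for a, b in zip(x, delay(x, d))
    ]


def ts_mean(x: Series, d: int) -> Series:
    """Rolling mean over the last d values (assumption: None until the window is full)."""
    out: Series = []
    for i in range(len(x)):
        window = x[max(0, i - d + 1): i + 1]
        if len(window) < d or any(v is None for v in window):
            out.append(None)
        else:
            out.append(sum(window) / d)
    return out


# Same shape as the quick-start formula ts_mean(delta(close,1),12), with a window of 2.
close = [10.0, 11.0, 13.0, 12.0, 15.0]
mom_2 = ts_mean(delta(close, 1), 2)
```

Each function works on a single stock's history in time order, which corresponds to the PARTITION BY stock ORDER BY date clause in the heading.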
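Cross-sectional functions (PARTITION BY date)
| Function | Description |
|---|---|
| `rank(x)` | Cross-sectional rank (percentile) |
| `scale(x)` | Cross-sectional scaling to [0, 1] |
| `zscore(x)` | Cross-sectional z-score |
| `demean(x)` | Cross-sectional demeaning |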
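Cross-sectional functions operate across all stocks at a single timestamp rather than along one stock's history. The sketch below shows one plausible reading of `demean`, `zscore`, and `rank` for a single cross-section. The choice of the sample standard deviation and of `rank / n` as the percentile are assumptions made for illustration; the package's exact conventions are not stated in this README.

```python
from statistics import mean, stdev
from typing import List


def demean(xs: List[float]) -> List[float]:
    """Subtract the cross-sectional mean."""
    m = mean(xs)
    return [x - m for x in xs]


def zscore(xs: List[float]) -> List[float]:
    """(x - mean) / std, using the sample standard deviation (assumption)."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def rank(xs: List[float]) -> List[float]:
    """Percentile rank in (0, 1]; ties share the lowest rank (assumption)."""
    ordered = sorted(xs)
    n = len(xs)
    return [(ordered.index(x) + 1) / n for x in xs]


# Factor values of three stocks at one timestamp.
cross_section = [30.0, 10.0, 20.0]
demeaned = demean(cross_section)
standardized = zscore(cross_section)
percentiles = rank(cross_section)
```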
Group functions (PARTITION BY date + group column)
| Function | Description |
|---|---|
| `group_mean(x, group)` | Group mean |
| `group_rank(x, group)` | Rank within group |
| `group_neutralize(x, group)` / `indneutralize(x, group)` | Group neutralization |
| `group_zscore(x, group)` | Z-score within group |
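Group neutralization is commonly understood as subtracting each group's mean from its members, so that every group (for example, an industry) averages to zero at a given timestamp. A minimal sketch under that assumption, for one cross-section, with hypothetical group labels:

```python
from collections import defaultdict
from typing import Hashable, List


def group_neutralize(values: List[float], groups: List[Hashable]) -> List[float]:
    """Subtract the mean of each value's group (one timestamp's cross-section)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [value - totals[group] / counts[group] for value, group in zip(values, groups)]


# Two hypothetical industries with two stocks each.
values = [1.0, 3.0, 10.0, 20.0]
industries = ["bank", "bank", "tech", "tech"]
neutral = group_neutralize(values, industries)
```

After neutralization the large level difference between the two groups is gone and only within-group differences remain.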
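Math functions (scalar)
| Function | Description |
|---|---|
| `abs(x)` | Absolute value |
| `log(x)` | Natural logarithm |
| `sqrt(x)` | Square root |
| `sign(x)` | Sign function |
| `exp(x)` | Exponential |
| `round(x)` | Rounding |
| `floor(x)` | Round down |
| `ceil(x)` | Round up |
| `sin(x)` | Sine |
| `cos(x)` | Cosine |
| `tan(x)` | Tangent |
| `signed_power(x, n)` / `power(x, n)` / `pow(x, n)` | Power (sign-preserving) |
| `min(x, y)` | Smaller of two values |
| `max(x, y)` | Larger of two values |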
Utility functions
| Function | Description |
|---|---|
| `if(cond, then, else)` | Conditional selection |
| `fillna(x, val)` | Fill null values |
| `clip(x, lo, hi)` | Clip to [lo, hi] |
| `is_finite(x)` | Whether the value is finite |
Unsupported functions (recursive or stateful; cannot be expressed in pure SQL)
ema, rsi, macd, atr, roc, obv, cci, mfi
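The difference between a window function and a recursive one can be seen in a short sketch of an exponential moving average. This uses a common textbook convention (alpha = 2 / (span + 1), seeded with the first observation) and is not part of the package. Each output depends on the previous output rather than on a fixed number of past inputs, which is why it does not map onto a fixed-size window frame.

```python
from typing import List


def ema(xs: List[float], span: int) -> List[float]:
    """Exponential moving average: y[t] = alpha * x[t] + (1 - alpha) * y[t - 1]."""
    alpha = 2.0 / (span + 1)
    out: List[float] = []
    previous = None
    for x in xs:
        # The state carried between iterations is the previous output.
        previous = x if previous is None else alpha * x + (1 - alpha) * previous
        out.append(previous)
    return out


smoothed = ema([1.0, 2.0, 3.0], span=3)
```

If such an indicator is needed, one option is to compute it outside the formula engine and add it to the source panel as an ordinary column.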
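Operators
| Precedence | Operator | Description |
|---|---|---|
| 1 (highest) | `()` | Parentheses |
| 2 | `^` | Exponentiation (right-associative: `2^3^2 = 2^9 = 512`) |
| 3 | `-x` | Unary minus |
| 4 | `*` `/` | Multiplication and division |
| 5 | `+` `-` | Addition and subtraction |
| 6 | `>` `<` `>=` `<=` `==` `!=` | Comparison (returns 1.0 / 0.0) |
| 7 (lowest) | `? :` | Ternary conditional (`close > 100 ? 1 : 0`) |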
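The precedence table can be checked with a small recursive-descent evaluator, written here as an illustration. It is not the package's parser (which targets SQL rather than scalar evaluation); it handles numbers, named variables, and the operators listed above, with one method per precedence level.

```python
import operator
import re
from typing import Dict, List, Optional

TOKEN = re.compile(r"\s*(\d+\.?\d*|[A-Za-z_]\w*|>=|<=|==|!=|[-+*/^()?:<>])")
CMP = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def tokenize(text: str) -> List[str]:
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per precedence level, from lowest (ternary) to highest (atom)."""

    def __init__(self, tokens: List[str], env: Dict[str, float]):
        self.tokens, self.pos, self.env = tokens, 0, env

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected: Optional[str] = None) -> str:
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def ternary(self) -> float:  # level 7: cond ? a : b
        cond = self.comparison()
        if self.peek() != "?":
            return cond
        self.take("?")
        if_true = self.ternary()  # both branches are evaluated in this sketch
        self.take(":")
        if_false = self.ternary()
        return if_true if cond != 0.0 else if_false

    def comparison(self) -> float:  # level 6: yields 1.0 or 0.0
        left = self.additive()
        while self.peek() in CMP:
            compare = CMP[self.take()]
            left = 1.0 if compare(left, self.additive()) else 0.0
        return left

    def additive(self) -> float:  # level 5
        left = self.term()
        while self.peek() in ("+", "-"):
            left = left + self.term() if self.take() == "+" else left - self.term()
        return left

    def term(self) -> float:  # level 4
        left = self.unary()
        while self.peek() in ("*", "/"):
            left = left * self.unary() if self.take() == "*" else left / self.unary()
        return left

    def unary(self) -> float:  # level 3: binds looser than ^, so -2^2 is -(2^2)
        if self.peek() == "-":
            self.take("-")
            return -self.unary()
        return self.power()

    def power(self) -> float:  # level 2: right-associative via recursion
        base = self.atom()
        if self.peek() == "^":
            self.take("^")
            return base ** self.unary()
        return base

    def atom(self) -> float:  # level 1: parentheses, variables, numbers
        tok = self.take()
        if tok == "(":
            value = self.ternary()
            self.take(")")
            return value
        return float(self.env[tok]) if tok in self.env else float(tok)


def evaluate(text: str, env: Optional[Dict[str, float]] = None) -> float:
    parser = Parser(tokenize(text), env or {})
    value = parser.ternary()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

For example, `evaluate("2^3^2")` follows the right-associative rule and `evaluate("close > 100 ? 1 : 0", {"close": 101})` exercises the comparison and ternary levels together.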
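Backtest output metrics
| Metric | Description |
|---|---|
| IC mean / ICIR | Mean and information ratio of the daily Spearman rank-correlation IC |
| MS (monotonicity score) | Share of adjacent groups whose returns move in the same direction (0 to 1; 1.0 = perfectly monotonic) |
| Spearman | Rank correlation between group number and annualized return (-1 to +1) |
| Long-short Sharpe / annualized return | Performance of the long-short portfolio (top group minus bottom group) |
| Group annualized return | Annualized return of each layered group |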
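The sketch below shows how metrics of this kind are typically computed, using only the standard library. It is an illustration under stated assumptions, not the package's code: ICIR is taken as mean(IC) / sample std(IC) without annualization, and the monotonicity score is read as the share of adjacent group-return differences that point in the dominant direction. All function names are hypothetical.

```python
from statistics import mean, stdev
from typing import List


def ranks(xs: List[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant inputs."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a: List[float], b: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def icir(ics: List[float]) -> float:
    """Information ratio of an IC series (assumption: not annualized)."""
    return mean(ics) / stdev(ics)


def monotonicity_score(group_returns: List[float]) -> float:
    """Share of adjacent differences in the dominant direction (one possible reading)."""
    diffs = [b - a for a, b in zip(group_returns, group_returns[1:])]
    ups = sum(d > 0 for d in diffs)
    downs = sum(d < 0 for d in diffs)
    return max(ups, downs) / len(diffs)


# One cross-section: factor values vs. forward returns of four stocks.
ic = spearman([1.0, 2.0, 3.0, 4.0], [0.01, 0.02, 0.03, 0.04])
# Group number vs. annualized return of four layered groups.
group_returns = [0.01, 0.03, 0.02, 0.05]
group_spearman = spearman([1.0, 2.0, 3.0, 4.0], group_returns)
ms = monotonicity_score(group_returns)
```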
License
MIT