FactorLens
Factor attribution and analytics CLI
FactorLens is an offline-first factor attribution assistant in Rust.
It computes statistical factors via PCA from price history, writes artifacts, and supports explainability through a pluggable LLM backend interface (local and Bedrock).
MVP Features
- Price ingestion from CSV
- PCA factor model fitting
- Portfolio factor attribution
- Residual outlier detection
- Artifact outputs (JSON + CSV)
- Markdown report generation
- Explain command using a local llama.cpp backend (`llama-cli`) with a Bedrock-ready backend contract
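The fit/attribution/outlier steps above can be sketched conceptually. This is a minimal NumPy sketch, not FactorLens's actual Rust implementation; the function names and the 2-sigma outlier threshold are illustrative assumptions:

```python
import numpy as np

def fit_pca_factors(returns, k):
    """Sketch of a PCA factor fit: returns is a (T days x N assets) matrix."""
    R = returns - returns.mean(axis=0)           # demean each asset
    _, _, vt = np.linalg.svd(R, full_matrices=False)
    B = vt[:k].T                                 # (N x k) factor loadings
    F = R @ B                                    # (T x k) factor return series
    E = R - F @ B.T                              # residuals (unexplained moves)
    return B, F, E

def attribute(B, F, E, w):
    """Per-day factor contributions for portfolio weights w."""
    beta = B.T @ w                               # portfolio factor exposures
    contrib = F * beta                           # (T x k) daily contributions
    resid = E @ w                                # residual portfolio return
    return contrib, resid

def residual_outliers(resid, z=2.0):
    """Flag days whose residual exceeds z standard deviations."""
    return np.where(np.abs(resid - resid.mean()) > z * resid.std())[0]
```

By construction the demeaned portfolio return decomposes exactly: `Rc @ w == contrib.sum(axis=1) + resid`, which is what makes per-day attribution additive.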
Workspace Layout
- `crates/factor_core`: returns, PCA, attribution math
- `crates/factor_io`: CSV IO and artifact writing
- `crates/factor_cli`: CLI binary (`factorlens`)
- `crates/llm_local`: `LLMClient` trait + local/Bedrock backends
- `crates/report`: Markdown report generation
Build Instructions
For advanced build/release details, see BUILD_INSTRUCTIONS.md.
Quick local build:
cargo build -p factor_cli
cargo build -p factor_cli --release
Input Formats
prices.csv
Columns: `date` (YYYY-MM-DD), `ticker`, `close`
portfolio.csv (optional)
Columns: `ticker`, `weight`
holdings.csv (optional alternative to portfolio.csv)
Columns: `ticker`, plus either `market_value` or both `shares` and `price`
factors.csv (for known-factor regression mode)
Columns: `date` (YYYY-MM-DD), plus one or more numeric factor columns (for example `MKT`, `SMB`, `HML`)
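For illustration, minimal input files might look like the following (the tickers, dates, and weights are placeholders, and the `# filename` lines are annotations here, not part of the files):

```
# prices.csv
date,ticker,close
2024-01-02,AAA,101.50
2024-01-02,BBB,55.20
2024-01-03,AAA,102.10
2024-01-03,BBB,54.90

# portfolio.csv
ticker,weight
AAA,0.6
BBB,0.4
```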
Quick Start
cargo run -p factor_cli -- factors fit \
--prices data/prices.csv \
--k 3 \
--out artifacts/ \
--portfolio data/portfolio.csv
# safer residual analysis: auto-pick k (< number of assets)
cargo run -p factor_cli -- factors fit \
--prices data/prices.csv \
--k-auto \
--out artifacts/ \
--portfolio data/portfolio.csv
# alternative: derive weights automatically from holdings
cargo run -p factor_cli -- factors fit \
--prices data/prices.csv \
--k 3 \
--out artifacts/ \
--holdings data/holdings.csv
cargo run -p factor_cli -- report \
--artifacts artifacts/ \
--format markdown \
--out artifacts/report.md
# known-factor regression mode (MKT/SMB/HML-style)
cargo run -p factor_cli -- factors regress \
--prices data/prices.csv \
--factors data/factors.csv \
--out artifacts/ \
--portfolio data/portfolio.csv
cargo run -p factor_cli -- explain \
--backend local \
--model models/llama.gguf \
--artifacts artifacts/ \
--question "What drove the largest drawdown?"
Notes
- `explain --backend local` expects `llama-cli` on your PATH.
- `explain --backend bedrock` uses AWS Bedrock via the AWS CLI (`aws bedrock-runtime converse`).
- This project is designed for explainability of computed analytics, not market prediction.
Explainability Notes
- `factors fit` excludes weekend dates by default. Pass `--include-weekends` if your dataset intentionally includes weekend trading.
- `explain` supports focused analysis with `--focus-factors`.
Examples:
cargo run -p factor_cli -- factors fit --prices data/prices.csv --k 3 --out artifacts/ --portfolio data/portfolio.csv
cargo run -p factor_cli -- factors fit --prices data/prices.csv --k 3 --out artifacts/ --portfolio data/portfolio.csv --include-weekends
cargo run -p factor_cli -- explain --backend local --model models/llama_instruct.gguf --artifacts artifacts/ --question "What drove the largest drawdown?" --focus-factors factor_1,factor_2
Custom Factor Names
By default, FactorLens auto-generates factor names from your dataset loadings (top positive and negative loading tickers per factor), so it works on any dataset.
You can still override labels with a CSV or TSV file via --factor-labels.
Example data/factor_labels.csv:
factor,label
factor_1_contrib,Broad Market Beta
factor_2_contrib,Growth vs Value Rotation
factor_3_contrib,Idiosyncratic Spread
Use in explain:
cargo run -p factor_cli -- explain --backend local --model models/llama_instruct.gguf --artifacts artifacts/ --question "What drove the largest drawdown?" --factor-labels data/factor_labels.csv
Notes:
- Factor keys may be `factor_1`, `factor_1_contrib`, or just `1`.
- `#` comment lines are ignored.
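The auto-naming described above (top positive and negative loading tickers per factor) can be sketched as follows. This is a conceptual sketch; the `X+Y vs Z` label format is an illustrative assumption, not FactorLens's exact scheme:

```python
import numpy as np

def auto_factor_names(loadings, tickers, n=2):
    """Label each factor column by its strongest positive and negative loadings.

    loadings: (N assets x K factors) matrix; tickers: list of N names.
    """
    names = []
    for j in range(loadings.shape[1]):
        order = np.argsort(loadings[:, j])       # ascending by loading
        neg = [tickers[i] for i in order[:n] if loadings[i, j] < 0]
        pos = [tickers[i] for i in order[::-1][:n] if loadings[i, j] > 0]
        names.append(f"{'+'.join(pos)} vs {'+'.join(neg)}" if neg else "+".join(pos))
    return names
```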
Suggested Questions
- What was the worst modeled drawdown day, and what factors drove it?
- On the worst day, what percentage came from each factor?
- Which factor is my largest average downside contributor over the full sample?
- Which dates had the biggest positive factor-driven gains?
- Which 5 days had the largest residuals (moves not explained by factors)?
- Did my risk concentration increase in the last month?
- Is my portfolio dominated by one factor or diversified across factors?
- How stable are exposures across time windows?
- Which factor changed direction most often?
- Which factor contributed most to volatility, not just returns?
- If I remove `factor_1`, how much modeled downside is left?
- Compare drawdown drivers with and without weekends included.
- Using only `factor_1` and `factor_2`, what drove the drawdown?
- Which assets are most aligned with `factor_1` loadings?
- Which assets increased my exposure to downside factors most?
Generic Table Analysis
Analyze any CSV table by grouping columns and numeric metrics you choose:
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--group-by region,product_line,channel \
--out artifacts/analysis.md
# profile-based quick starts
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile exec \
--out artifacts/analysis_exec.md
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile segment \
--out artifacts/analysis_segment.md
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile supplier \
--out artifacts/analysis_supplier.md
# custom profile config (recommended for private/domain fields)
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile exec_custom \
--profile-config profiles/profiles.example.toml \
--out artifacts/analysis.md
# filtered + ranked view
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--where region=US \
--rank-by revenue_usd \
--agg median \
--percentiles p50,p90 \
--top 10 \
--min-records 20 \
--out artifacts/analysis_filtered_ranked.md
# text normalization for name/title grouping + JSON-only output
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--group-by title \
--metrics revenue_usd \
--normalize-text-groups \
--word-freq \
--output-format json \
--out artifacts/analysis_title.json
Auto-detect useful grouping columns (if --group-by is omitted):
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--out artifacts/analysis_auto.md
Or analyze directly from Postgres:
# option 1: inline query
factorlens analyze \
--postgres-url "$DATABASE_URL" \
--query "SELECT region, channel, revenue_usd, cost_usd FROM analytics.sales" \
--postgres-ssl-mode require \
--postgres-ca-file /path/to/rds-ca-bundle.pem \
--profile exec_custom \
--profile-config profiles/profiles.example.toml \
--out artifacts/analysis.md
# option 2: query file
factorlens analyze \
--postgres-url "$DATABASE_URL" \
--query-file sql/sales_analysis.sql \
--profile exec_custom \
--profile-config profiles/profiles.example.toml \
--out artifacts/analysis.md
Notes:
- Outputs both markdown and JSON (`<out>.json`).
- If `--metrics` is omitted, numeric metrics are auto-detected from the input file.
- `--profile` built-ins (`exec`, `segment`, `supplier`) are generic (no hardcoded domain columns).
- Use `--profile-config <path.toml>` for your own private, file-specific profile mappings.
- Input source is exclusive: use either `--input <csv>` or `--postgres-url` + (`--query` or `--query-file`).
- `--postgres-url` can be omitted if the `DATABASE_URL` env var is set.
- `--postgres-ssl-mode` supports `prefer` (default), `require`, or `disable`.
- `--postgres-ca-file` optionally adds PEM CA certificates for DB TLS verification.
- Recommended layout: commit `profiles/profiles.example.toml`; keep private variants as `profiles/*.local.toml` or `profiles/*.private.toml` (gitignored).
- `--where` accepts comma-separated `column=value` filters (AND semantics).
- `--rank-by` ranks groups by a chosen metric (default ranking is by count).
- `--agg` controls metric aggregation: `sum` (default), `mean`, or `median`.
- `--percentiles` adds optional metric columns (`p50`, `p90`) per metric.
- `--top` controls how many groups are listed in the report.
- `--normalize-text-groups` normalizes group values for columns like `name`/`title` (lowercase + punctuation cleanup).
- `--word-freq` adds a Top Words section/counts for `name`/`title`-style grouping columns.
- `--output-format` supports `md`, `json`, or `both` (default).
- `--min-records` drops tiny segments before ranking (useful to avoid one-record outliers).
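The filter/group/aggregate/rank semantics of `analyze` can be modeled roughly in Python. This is a conceptual sketch of the documented flag behavior, not the actual Rust implementation; the nearest-rank percentile and upper-median choices are simplifying assumptions:

```python
import math
from collections import defaultdict

def percentile(values, q):
    """Nearest-rank percentile on a sorted copy (a sketch, not NumPy-exact)."""
    xs = sorted(values)
    return xs[max(0, math.ceil(q * len(xs)) - 1)]

def analyze(rows, group_by, metric, where=None, agg="sum", min_records=1, top=10):
    # --where: AND semantics over column=value filters
    if where:
        rows = [r for r in rows if all(r[c] == v for c, v in where.items())]
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[c] for c in group_by)].append(r[metric])
    out = []
    for key, vals in groups.items():
        if len(vals) < min_records:              # --min-records: drop tiny segments
            continue
        agg_val = {"sum": sum,
                   "mean": lambda v: sum(v) / len(v),
                   "median": lambda v: sorted(v)[len(v) // 2]}[agg](vals)
        out.append({"group": key, "count": len(vals), agg: agg_val,
                    "p50": percentile(vals, 0.50), "p90": percentile(vals, 0.90)})
    out.sort(key=lambda g: g[agg], reverse=True)  # --rank-by equivalent
    return out[:top]                              # --top
```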
Example --profile-config file:
[profiles.exec_custom]
group_by = ["region", "channel"]
metrics = ["revenue_usd"]
rank_by = "revenue_usd"
top = 12
min_records = 20
auto_group_k = 3
pip Package Usage
For packaging/build/publish details, see BUILD_INSTRUCTIONS.md.
Install from PyPI:
pip install --upgrade factorlens==0.1.3
factorlens --help
Local model:
factorlens explain \
--backend local \
--model /path/to/model.gguf \
--artifacts /path/to/artifacts \
--question "What drove the largest drawdown?"
Bedrock:
export AWS_REGION=us-east-1
factorlens explain \
--backend bedrock \
--model anthropic.claude-3-5-sonnet-20240620-v1:0 \
--artifacts /path/to/artifacts \
--question "What drove the largest drawdown?"
What Bedrock Step Is Doing
`factorlens explain --backend bedrock` does not compute analytics; it only explains already-computed artifacts.
Step-by-step:
- You run analytics first (`factors fit` or `analyze`) to produce artifacts.
- `explain` loads artifact context (for factor mode: `factors.json`, `attribution.csv`, `outliers.csv`).
- FactorLens builds a constrained prompt from that context.
- FactorLens calls AWS Bedrock through the AWS CLI (`aws bedrock-runtime converse`).
- Bedrock returns a plain-text explanation grounded in the provided artifact context.
Important:
- `analyze` command = pure Rust analytics, no LLM used.
- `explain` command = LLM narrative layer over artifacts.
- For table-analysis markdown (`analysis.md`), you can optionally call Bedrock directly with the AWS CLI by passing the report text as the prompt.
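The constrained-prompt step above can be sketched as follows. This is a hypothetical illustration: the `factors.json` filename comes from the docs above, but the JSON fields, prompt wording, and `build_prompt` function are assumptions, not FactorLens's actual code:

```python
import json

def build_prompt(artifacts_dir, question, focus_factors=None):
    """Assemble a prompt that is constrained to the artifact context."""
    with open(f"{artifacts_dir}/factors.json") as f:
        factors = json.load(f)
    context = json.dumps(factors, indent=2)
    focus = (f"Focus only on factors: {', '.join(focus_factors)}.\n"
             if focus_factors else "")
    return ("Answer strictly from the artifact context below. "
            "Do not speculate beyond it.\n"
            f"{focus}"
            f"Context:\n{context}\n\nQuestion: {question}")
```

Grounding the prompt in the artifact JSON is what keeps the LLM a narrative layer rather than a source of new numbers.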
File details
Details for the file factorlens-0.1.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: factorlens-0.1.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 3.0 MB
- Tags: Python 3, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `23effb78aee73bbb0b5a27ef561c511509beff19f6cefb523cf0c601e3bf5340` |
| MD5 | `affd14c49208c76c9eeee37ba6eb97be` |
| BLAKE2b-256 | `fbd7572581219a8acff3cd26904752d50c944db5b630f0b8ff1162dc42a16b8c` |
Provenance
The following attestation bundles were made for factorlens-0.1.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:
Publisher: release.yml on kraftaa/factorlens
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: factorlens-0.1.8-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Subject digest: 23effb78aee73bbb0b5a27ef561c511509beff19f6cefb523cf0c601e3bf5340
- Sigstore transparency entry: 1053166832
- Sigstore integration time:
- Permalink: kraftaa/factorlens@4bab16460bf8c41d58636d36ff02169a8231cbca
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/kraftaa
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@4bab16460bf8c41d58636d36ff02169a8231cbca
- Trigger Event: push