FactorLens
Factor attribution and analytics CLI
FactorLens is a Rust CLI that explains why metrics changed.
Dashboards show that metrics moved. FactorLens decomposes those changes into driver contributions using deterministic math, then optionally generates narrative explanations.
Typical flow:
metric change -> driver contributions -> closure check -> residual segments
Quick Start
pip install factorlens
factorlens --help
factorlens analyze \
--input data/factorlens_demo_sales_100.csv \
--group-by region,channel,product_line \
--metrics revenue_usd,cost_usd,orders \
--rank-by revenue_usd
factorlens analyze-drivers \
--input data/demo_revenue.csv \
--metric revenue_usd \
--date-column date \
--time-grain month \
--period last \
--anchor-date 2026-04-15
Example
factorlens analyze-drivers \
--input data/demo_revenue_residual.csv \
--metric revenue_usd \
--date-column date \
--time-grain month \
--period last \
--anchor-date 2026-04-15
Output:
revenue_usd change: -16.4%
Window: 2026-03-01..2026-03-31 vs 2026-02-01..2026-02-28
Inferred identity
- revenue_usd ≈ orders * avg_price_usd
- fit MAPE: 1.18% across 56 rows
Driver contributions
- orders: -15.9%
- avg_price_usd: -2.0%
Closure check
- explained: -17.9%
- residual: +1.5% (+77,765.73)
Residual segments
- campaign = spring_launch: mean residual +5,151.67 (16 rows)
- channel = Marketplace: mean residual +5,151.67 (16 rows)
- device_type = mobile: mean residual +5,151.67 (16 rows)
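The closure check above is plain arithmetic: per-driver contributions should roughly sum to the observed change, and whatever is left over is the residual. A minimal sketch of that accounting for a multiplicative identity (one-at-a-time attribution with made-up numbers; an assumed illustration, not FactorLens internals):

```python
# Sketch of a closure check for metric = orders * avg_price.
# Numbers are hypothetical, chosen to mirror the shape of the demo output.
base = {"orders": 1000.0, "avg_price": 50.0}   # baseline period totals
new  = {"orders": 841.0,  "avg_price": 49.0}   # current period totals

m_base = base["orders"] * base["avg_price"]
m_new  = new["orders"]  * new["avg_price"]
total_change = m_new / m_base - 1.0            # observed metric change

# One-at-a-time attribution: move each driver alone, holding the others at base.
contrib = {}
for k in base:
    moved = dict(base)
    moved[k] = new[k]
    contrib[k] = (moved["orders"] * moved["avg_price"]) / m_base - 1.0

explained = sum(contrib.values())
residual = total_change - explained            # interaction term for a*b identities

print(f"total {total_change:+.1%}, explained {explained:+.1%}, residual {residual:+.1%}")
# total -17.6%, explained -17.9%, residual +0.3%
```

As in the demo output, a pure product identity can "over-explain" the change (explained beyond 100%), leaving a small residual of the opposite sign.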
Real Use Cases
- Revenue debugging: decompose changes into orders, price, or mix effects.
- Growth analytics: explain movement in conversion, CAC, or AOV.
- Data pipeline sanity checks: large residuals often reveal joins, missing data, or definition drift.
- CI metric monitoring: run FactorLens in pipelines to catch unusual metric behavior.
Workflow
Typical workflow:
- analyze: explore segments and concentration.
- investigate: guided multi-step drill-down for period/snapshot change analysis.
- explain-analyze: add optional narrative explanation.
Primary commands:
| Command | Purpose |
|---|---|
| analyze | factor/segment attribution from CSV or Postgres |
| investigate | guided drill-down investigation across snapshots/periods (deterministic or LLM planner) |
| explain-analyze | executive narrative and actions from computed JSON |
Specialized / legacy-compatible commands:
| Command | Purpose |
|---|---|
| analyze-investigate | legacy-compatible numeric driver decomposition (curated driver sets) |
| analyze-drivers | automatic metric identity detection and driver decomposition |
| analyze-suggest | infer likely dimensions/metrics/date and generate starter profile config (toml or json) |
| analyze-compare | snapshot delta analysis (biggest movers) |
| factors fit / factors regress | statistical factors (PCA) or known-factor regression |
When To Use Which
- analyze answers: which groups changed?
- investigate answers: where should I drill next, and what likely drove the change?
- analyze-investigate answers: which numeric drivers account for the metric change? (curated numeric-driver mode)
- analyze-drivers answers: what metric identity or formula explains the change?
Tip: investigate can auto-route from question text, or you can force mode with
--mode change_drivers|concentration_drivers|compare_snapshots|recommend_next.
To avoid long investigate commands, store defaults in a TOML config and pass
--profile default --profile-config profiles/investigate.example.toml
(or legacy --config profiles/investigate.example.toml).
Practical rule:
- Start with analyze for wide business tables.
- Use investigate for end-to-end drill-down across dimensions and periods/snapshots.
- Use analyze-investigate when you have a curated dataset with a few meaningful numeric drivers.
- Use analyze-drivers when the metric likely has a formula such as revenue ≈ orders * avg_price.
Design Principles
FactorLens follows a few simple design rules:
- Math-first, AI-second – deterministic factor attribution produces the artifacts, AI only explains them.
- CLI-first workflows – designed to run locally, in scripts, or inside pipelines.
- Structured outputs – results can be exported as Markdown, JSON, or HTML for humans and automation.
- Composable commands – analysis, comparison, and explanation steps can be combined in workflows.
Demo Workflow
# 1) baseline snapshot (100 rows)
factorlens analyze \
--input data/factorlens_demo_sales_100.csv \
--group-by region,channel,product_line,plan_tier \
--metrics revenue_usd,cost_usd,orders \
--rank-by revenue_usd
# 2) new snapshot (150 rows)
factorlens analyze \
--input data/factorlens_demo_sales_150.csv \
--group-by region,channel,product_line,plan_tier \
--metrics revenue_usd,cost_usd,orders \
--rank-by revenue_usd
# 3) compare + explain
factorlens analyze-compare \
--base artifacts/demo_sales_100.json \
--new artifacts/demo_sales_150.json
factorlens explain-analyze \
--backend bedrock \
--model anthropic.claude-3-haiku-20240307-v1:0 \
--analysis-json artifacts/analysis_compare.json \
--question "What are the top concentration risks and what 3 actions should we take in the next 30 days?" \
--strict-facts true \
--max-bullets 5
One-command runner:
./scripts/demo_sales.sh
# optional Bedrock:
RUN_BEDROCK=1 AWS_REGION=eu-central-1 ./scripts/demo_sales.sh
Demo Data
Public-safe demo files included:
- data/factorlens_demo_sales_100.csv
- data/factorlens_demo_sales_150.csv (use for compare)
Optional Postgres load:
psql "$DATABASE_URL" -c "
create schema if not exists demo;
drop table if exists demo.factorlens_demo_sales_100;
drop table if exists demo.factorlens_demo_sales_150;
create table demo.factorlens_demo_sales_100 (
order_date date,
region text,
channel text,
product_line text,
plan_tier int,
revenue_usd numeric(14,2),
cost_usd numeric(14,2),
orders int
);
create table demo.factorlens_demo_sales_150 (like demo.factorlens_demo_sales_100);
"
psql "$DATABASE_URL" -c "\copy demo.factorlens_demo_sales_100 from 'data/factorlens_demo_sales_100.csv' with (format csv, header true)"
psql "$DATABASE_URL" -c "\copy demo.factorlens_demo_sales_150 from 'data/factorlens_demo_sales_150.csv' with (format csv, header true)"
Generate a starter profile automatically from a new dataset:
factorlens analyze-suggest \
--input data/factorlens_demo_sales_150.csv \
--out artifacts/demo_suggest.md \
--profile-name demo_exec \
--auto-group-k 4 \
--max-metrics 3
Large file tip:
factorlens analyze-suggest \
--input data/factorlens_demo_sales_150.csv \
--out artifacts/demo_suggest_random.md \
--sample-rows 1000 \
--sample-mode random \
--sample-seed 42
This writes:
- artifacts/demo_suggest.md (human summary)
- artifacts/demo_suggest.json (machine-readable suggestion report)
- artifacts/demo_suggest.toml (ready profile config block)
If you want JSON profile output instead of TOML:
factorlens analyze-suggest \
--input data/factorlens_demo_sales_150.csv \
--out artifacts/demo_suggest_json_profile.md \
--profile-format json
Investigate using config defaults:
cargo run -p factor_cli -- investigate \
--profile default \
--profile-config profiles/investigate.example.toml \
--question "Why did revenue change?" \
--base data/factorlens_demo_sales_100.csv \
--new data/factorlens_demo_sales_150.csv \
--out artifacts/investigate_demo.md
Investigate from Postgres query with period windows:
cargo run -p factor_cli -- investigate \
--profile default \
--profile-config profiles/investigate.example.toml \
--question "Why did revenue change last month?" \
--postgres-url "$DATABASE_URL" \
--query "SELECT order_date, region, channel, product_line, plan_tier, revenue_usd FROM analytics.sales" \
--date-column order_date \
--time-grain month \
--period last \
--anchor-date 2026-04-15 \
--out artifacts/investigate_postgres.md
Investigate from Postgres query file with period windows:
cargo run -p factor_cli -- investigate \
--profile default \
--profile-config profiles/investigate.example.toml \
--question "Why did revenue change last month?" \
--postgres-url "$DATABASE_URL" \
--query-file sql/investigate_sales.sql \
--date-column order_date \
--time-grain month \
--period last \
--anchor-date 2026-04-15 \
--out artifacts/investigate_postgres_file.md
Architecture
flowchart LR
A["CSV/Postgres"] --> B["Factor/Segment Model (Rust)"]
B --> C["Attribution Artifacts (JSON/CSV)"]
C --> D["Explanation Layer (Local LLM or Bedrock)"]
C --> E["Reports (Markdown/HTML/JSON)"]
Math engine first, explanation layer second.
Why This Exists
Many analytics workflows produce dashboards without a clear explanation of why metrics changed. FactorLens prioritizes attribution and residual math first, then translates those computed results into business language.
What This Is Not
- Not a trading bot
- Not a price prediction model
- Not a chat-first analytics toy
FactorLens computes attribution first, then uses LLMs only to explain computed artifacts.
Integrations
- Local LLMs via llama.cpp
- AWS Bedrock
- Claude Desktop / Claude Code via MCP
- CSV and Postgres data sources
MVP Features
- Price ingestion from CSV
- PCA factor model fitting
- Portfolio factor attribution
- Residual outlier detection
- Artifact outputs (json + csv)
- Markdown report generation
- Explain command using a local llama.cpp backend (llama-cli) with a Bedrock-ready backend contract
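The PCA fitting, portfolio attribution, and residual-outlier steps listed above can be pictured in a few lines. This is an illustrative numpy sketch with random data; the actual factor_core implementation is in Rust and its details may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical daily returns: 60 days x 8 assets (stand-in for prices.csv returns).
returns = rng.normal(0.0, 0.01, size=(60, 8))

# PCA via SVD on demeaned returns: the top-k right singular vectors act as
# statistical factor loadings.
k = 3
demeaned = returns - returns.mean(axis=0)
u, s, vt = np.linalg.svd(demeaned, full_matrices=False)
loadings = vt[:k].T                    # (assets x k) factor loadings
factor_returns = demeaned @ loadings   # (days x k) factor return series

# Portfolio attribution: project the weighted portfolio return onto factor space.
weights = np.full(8, 1.0 / 8)
explained = factor_returns @ (loadings.T @ weights)
residual = demeaned @ weights - explained   # per-day unexplained portfolio move

# Residual outlier detection: flag days whose residual exceeds 3 sigma.
outliers = np.flatnonzero(np.abs(residual) > 3 * residual.std())
```

By construction, explained + residual reproduces the demeaned portfolio return exactly; the interesting output is how much of each day's move sits in the residual.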
Workspace Layout
- crates/factor_core: returns, PCA, attribution math
- crates/factor_io: CSV IO and artifact writing
- crates/factor_cli: CLI binary (factorlens)
- crates/llm_local: LLMClient trait + local/bedrock backends
- crates/report: Markdown report generation
Build Instructions
For advanced build/release details, see BUILD_INSTRUCTIONS.md.
Quick local build:
cargo build -p factor_cli
cargo build -p factor_cli --release
Input Formats
prices.csv
- date (YYYY-MM-DD)
- ticker
- close
portfolio.csv (optional)
- ticker
- weight
holdings.csv (optional alternative to portfolio.csv)
- ticker
- either market_value or both shares and price
factors.csv (for known-factor regression mode)
- date (YYYY-MM-DD)
- one or more numeric factor columns (for example: MKT, SMB, HML)
Factors Quick Start
cargo run -p factor_cli -- factors fit \
--prices path/to/prices.csv \
--k 3 \
--out artifacts/ \
--portfolio path/to/portfolio.csv
# safer residual analysis: auto-pick k (< number of assets)
cargo run -p factor_cli -- factors fit \
--prices path/to/prices.csv \
--k-auto \
--out artifacts/ \
--portfolio path/to/portfolio.csv
# alternative: derive weights automatically from holdings
cargo run -p factor_cli -- factors fit \
--prices path/to/prices.csv \
--k 3 \
--out artifacts/ \
--holdings path/to/holdings.csv
cargo run -p factor_cli -- report \
--artifacts artifacts/ \
--format markdown \
--out artifacts/report.md
# known-factor regression mode (MKT/SMB/HML-style)
cargo run -p factor_cli -- factors regress \
--prices path/to/prices.csv \
--factors path/to/factors.csv \
--out artifacts/ \
--portfolio path/to/portfolio.csv
cargo run -p factor_cli -- explain \
--backend local \
--model models/llama.gguf \
--artifacts artifacts/ \
--question "What drove the largest drawdown?"
Notes
- explain --backend local expects llama-cli on your PATH.
- explain --backend bedrock uses AWS Bedrock via the AWS CLI (aws bedrock-runtime converse).
- This project is designed for explainability of computed analytics, not market prediction.
Explainability Notes
- factors fit excludes weekend dates by default.
- Pass --include-weekends if your dataset intentionally includes weekend trading.
- explain supports focused analysis with --focus-factors.
Examples:
cargo run -p factor_cli -- factors fit --prices path/to/prices.csv --k 3 --out artifacts/ --portfolio path/to/portfolio.csv
cargo run -p factor_cli -- factors fit --prices path/to/prices.csv --k 3 --out artifacts/ --portfolio path/to/portfolio.csv --include-weekends
cargo run -p factor_cli -- explain --backend local --model models/llama_instruct.gguf --artifacts artifacts/ --question "What drove the largest drawdown?" --focus-factors factor_1,factor_2
Custom Factor Names
By default, FactorLens auto-generates factor names from your dataset loadings (top positive and negative loading tickers per factor), so it works on any dataset.
You can still override labels with a CSV or TSV file via --factor-labels.
Example factor_labels.csv:
factor,label
factor_1_contrib,Broad Market Beta
factor_2_contrib,Growth vs Value Rotation
factor_3_contrib,Idiosyncratic Spread
Use in explain:
cargo run -p factor_cli -- explain --backend local --model models/llama_instruct.gguf --artifacts artifacts/ --question "What drove the largest drawdown?" --factor-labels path/to/factor_labels.csv
Notes:
- Factor keys may be factor_1, factor_1_contrib, or just 1.
- # comment lines are ignored.
Suggested Questions
- What was the worst modeled drawdown day, and what factors drove it?
- On the worst day, what percentage came from each factor?
- Which factor is my largest average downside contributor over the full sample?
- Which dates had the biggest positive factor-driven gains?
- Which 5 days had the largest residuals (moves not explained by factors)?
- Did my risk concentration increase in the last month?
- Is my portfolio dominated by one factor or diversified across factors?
- How stable are exposures across time windows?
- Which factor changed direction most often?
- Which factor contributed most to volatility, not just returns?
- If I remove factor_1, how much modeled downside is left?
- Compare drawdown drivers with and without weekends included.
- Using only factor_1 and factor_2, what drove the drawdown?
- Which assets are most aligned with factor_1 loadings?
- Which assets increased my exposure to downside factors most?
Analyze
Use analyze when you want to see which groups changed or where concentration lives.
Recommended demo file:
data/factorlens_demo_sales_100.csv
cargo run -p factor_cli -- analyze \
--input data/factorlens_demo_sales_100.csv \
--group-by region,channel,product_line \
--metrics revenue_usd,cost_usd,orders \
--rank-by revenue_usd
Generic patterns:
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--group-by region,product_line,channel \
--metrics revenue_usd
# profile-based quick starts
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile exec
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile segment
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile supplier
# custom profile config (recommended for private/domain fields)
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--profile exec_custom \
--profile-config profiles/profiles.example.toml
# filtered + ranked view
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--where region=US \
--rank-by revenue_usd \
--agg median \
--percentiles p50,p90 \
--alert-top5-share 60 \
--alert-blank-share 10 \
--top 10 \
--min-records 20
# text normalization for name/title grouping + HTML output
cargo run -p factor_cli -- analyze \
--input data/your_file.csv \
--group-by title \
--metrics revenue_usd \
--normalize-text-groups \
--word-freq \
--output-format html
Auto-detect useful grouping columns (if --group-by is omitted):
cargo run -p factor_cli -- analyze \
--input data/your_file.csv
Analyze Compare
Create two analysis snapshots, then compare them:
Recommended demo files:
- data/factorlens_demo_sales_100.csv
- data/factorlens_demo_sales_150.csv
# base snapshot
cargo run -p factor_cli -- analyze \
--input data/factorlens_demo_sales_100.csv \
--group-by region,channel,product_line \
--metrics revenue_usd,cost_usd,orders \
--rank-by revenue_usd
# new snapshot
cargo run -p factor_cli -- analyze \
--input data/factorlens_demo_sales_150.csv \
--group-by region,channel,product_line \
--metrics revenue_usd,cost_usd,orders \
--rank-by revenue_usd
# compare (default: both markdown + json)
cargo run -p factor_cli -- analyze-compare \
--base artifacts/analyze_factorlens_demo_sales_100.json \
--new artifacts/analyze_factorlens_demo_sales_150.json
# compare (html)
cargo run -p factor_cli -- analyze-compare \
--base artifacts/analyze_factorlens_demo_sales_100.json \
--new artifacts/analyze_factorlens_demo_sales_150.json \
--output-format html \
--out artifacts/compare.html
# compare (json)
cargo run -p factor_cli -- analyze-compare \
--base artifacts/analyze_factorlens_demo_sales_100.json \
--new artifacts/analyze_factorlens_demo_sales_150.json \
--output-format json \
--out artifacts/compare.json
# compare (both markdown + json)
cargo run -p factor_cli -- analyze-compare \
--base artifacts/analyze_factorlens_demo_sales_100.json \
--new artifacts/analyze_factorlens_demo_sales_150.json \
--output-format both \
--out artifacts/compare.md
Notes:
- analyze defaults to artifacts/<input_stem>.md + .json (--output-format both).
- analyze now prefixes default outputs as artifacts/analyze_<input_stem>.md + .json.
- analyze-investigate now prefixes default outputs as artifacts/investigate_<input_stem>.md + .json.
- analyze-compare defaults to artifacts/analysis_compare.md + .json (--output-format both).
- analyze-compare supports --output-format md|html|json|both.
- --top-movers controls how many largest movers are shown (default: 10).
Analyze Investigate (Legacy / Specialized)
Use analyze-investigate when you need legacy-compatible, compact “metric change + top drivers” output from a curated numeric driver set.
For most new workflows, prefer investigate.
Recommended demo file:
data/demo_revenue_residual.csv
It works best when your input already contains a small number of meaningful numeric drivers such as:
- net_gmv
- orders
- traffic
- avg_price_usd
- distinct-count style entity columns via explicit --drivers
# numeric driver accounting
cargo run -p factor_cli -- analyze-investigate \
--input data/demo_revenue_residual.csv \
--metric revenue_usd \
--driver-preset amount \
--driver-contrib both \
--date-column date \
--time-grain month \
--period last \
--anchor-date 2026-04-15
# entity-volume drivers
cargo run -p factor_cli -- analyze-investigate \
--input data/your_file.csv \
--metric revenue_usd \
--driver-preset id \
--driver-contrib both \
--date-column date \
--time-grain month \
--period last
# mixed exploratory scan
cargo run -p factor_cli -- analyze-investigate \
--input data/your_file.csv \
--metric revenue_usd \
--driver-preset mixed \
--driver-contrib both \
--date-column date \
--time-grain month \
--period last
# explicit drivers (manual override)
cargo run -p factor_cli -- analyze-investigate \
--input data/your_file.csv \
--metric revenue_usd \
--drivers 'count_distinct(order_id),count_distinct(customer_id),count_distinct(account_id)' \
--driver-contrib both \
--date-column date \
--time-grain month \
--period last
Notes:
- Driver presets: id|amount|category|mixed.
- Driver contribution view: --driver-contrib percent|amount|both.
- Manual driver expressions: sum(col), avg(col), count(col), count(*), count_distinct(col).
- analyze-investigate is best for numeric driver accounting, not first-pass discovery.
- For wide business tables, start with analyze and use analyze-investigate only after curating a smaller set of useful drivers.
- amount is usually the best first preset for spend, GMV, order, or traffic-style measures.
- mixed is exploratory and may be noisier than amount.
- analyze-investigate reports decomposition_mode: regression when numeric drivers support a fitted model, otherwise heuristic.
- Demo commands use --anchor-date 2026-04-15 so --period last --time-grain month resolves to March 2026 vs February 2026 regardless of today’s date.
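The anchor-date behavior can be illustrated with a small window-resolution sketch. The semantics below are assumed from the demo output (compare the last complete month before the anchor against the month before that); this is not the CLI's actual code:

```python
from datetime import date, timedelta

def last_month_windows(anchor: date):
    """Resolve `--period last --time-grain month` style windows relative to an
    anchor date (assumed semantics, inferred from the demo output)."""
    first_of_anchor = anchor.replace(day=1)
    cur_end = first_of_anchor - timedelta(days=1)   # end of the previous month
    cur_start = cur_end.replace(day=1)
    base_end = cur_start - timedelta(days=1)        # end of the month before that
    base_start = base_end.replace(day=1)
    return (cur_start, cur_end), (base_start, base_end)

cur, base = last_month_windows(date(2026, 4, 15))
# Resolves to March 2026 vs February 2026, matching the demo's
# "Window: 2026-03-01..2026-03-31 vs 2026-02-01..2026-02-28" line.
print(cur, base)
```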
Example output:
revenue_usd change: -16.4%
Window: 2026-03-01..2026-03-31 vs 2026-02-01..2026-02-28
Decomposition mode: regression
Driver contributions
- sum(orders): -13.0% | delta=-696,191.18
- sum(traffic): -2.2% | delta=-116,243.57
- avg(avg_price_usd): -1.1% | delta=-61,590.98
Closure check
- explained: -16.3% (99%)
- residual: -0.1% (-6,146.70)
Analyze Drivers
Use analyze-drivers when you want FactorLens to infer the metric identity automatically instead of passing drivers.
Recommended demo files:
- data/demo_revenue.csv for a clean identity example
- data/demo_revenue_residual.csv for residual analysis
This is best for metrics that likely come from a formula, such as:
- revenue ≈ orders * avg_price
- conversion ≈ purchases / visits
- aov ≈ revenue / orders
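Conceptually, identity inference amounts to scoring candidate two-term formulas against the observed rows and keeping the best fit. A toy sketch of that idea (assumed approach with made-up rows and column names; not FactorLens internals):

```python
# Score candidate identities row by row and keep the one with the lowest
# mean absolute percentage error (MAPE).
rows = [  # hypothetical rows: metric plus two candidate driver columns
    {"revenue": 500.0, "orders": 10, "avg_price": 50.0},
    {"revenue": 198.0, "orders": 4,  "avg_price": 50.0},
    {"revenue": 303.0, "orders": 6,  "avg_price": 50.0},
]

def mape(pred, actual):
    return sum(abs(p - a) / abs(a) for p, a in zip(pred, actual)) / len(actual)

actual = [r["revenue"] for r in rows]
candidates = {
    "orders * avg_price": [r["orders"] * r["avg_price"] for r in rows],
    "orders / avg_price": [r["orders"] / r["avg_price"] for r in rows],
}
best = min(candidates, key=lambda name: mape(candidates[name], actual))
print(best, f"MAPE {mape(candidates[best], actual):.2%}")
```

A low MAPE on the winning candidate is what lines like "fit MAPE: 1.18% across 56 rows" report: how closely the inferred formula reproduces the metric row by row.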
# one-file period compare
cargo run -p factor_cli -- analyze-drivers \
--input data/demo_revenue.csv \
--metric revenue_usd \
--date-column date \
--time-grain month \
--period last \
--anchor-date 2026-04-15
# two-file compare
cargo run -p factor_cli -- analyze-drivers \
--input data/day1.csv \
--input-new data/day2.csv \
--metric revenue_usd
Example output:
revenue_usd change: -14.4%
Window: 2026-03-01..2026-03-31 vs 2026-02-01..2026-02-28
Inferred identity
- revenue_usd ≈ orders * avg_price_usd
- fit MAPE: 0.00% across 56 rows
Driver contributions
- orders: -11.3%
- avg_price_usd: -3.2%
Closure check
- explained: -14.5% (100%)
- residual: +0.1% (+5,970.47)
Artifacts written
- artifacts/drivers_demo_revenue.md
- artifacts/drivers_demo_revenue.json
Residual demo:
cargo run -p factor_cli -- analyze-drivers \
--input data/demo_revenue_residual.csv \
--metric revenue_usd \
--date-column date \
--time-grain month \
--period last \
--anchor-date 2026-04-15
revenue_usd change: -16.4%
Window: 2026-03-01..2026-03-31 vs 2026-02-01..2026-02-28
Inferred identity
- revenue_usd ≈ orders * avg_price_usd
- fit MAPE: 1.18% across 56 rows
Driver contributions
- orders: -15.9%
- avg_price_usd: -2.0%
Closure check
- explained: -17.9% (109%)
- residual: +1.5% (+77,765.73)
Residual segments
- campaign = spring_launch: mean residual +5,151.67 (16 rows)
- channel = Marketplace: mean residual +5,151.67 (16 rows)
- device_type = mobile: mean residual +5,151.67 (16 rows)
Artifacts written
- artifacts/drivers_demo_revenue_residual.md
- artifacts/drivers_demo_revenue_residual.json
Notes:
- Current scope infers two-term identities only: metric ~= a * b or metric ~= a / b.
- Residual is computed as observed metric change minus explained identity change.
- Residual segments rank leftover numeric/categorical fields against row-level unexplained error.
- analyze-drivers is always math-first; analyze-investigate may fall back to heuristic mode when only non-numeric/count-distinct drivers are available.
- Demo commands use --anchor-date 2026-04-15 so --period last --time-grain month resolves to March 2026 vs February 2026 regardless of today’s date.
- Period mode uses one input file plus --date-column and period flags.
- Two-file mode uses --input and --input-new.
- Default output path is artifacts/drivers_<input_stem>.md + .json.
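The residual-segment ranking described above can be sketched as grouping row-level residuals by categorical value and sorting by mean magnitude. Column names and data below are made up; this illustrates the idea, not the FactorLens implementation:

```python
from collections import defaultdict

# Hypothetical rows with precomputed row-level residuals
# (observed metric minus identity-predicted metric).
rows = [
    {"channel": "Marketplace", "residual": 5200.0},
    {"channel": "Marketplace", "residual": 5100.0},
    {"channel": "Direct",      "residual": -50.0},
    {"channel": "Direct",      "residual": 30.0},
]

# Group residuals by (column, value) segment key.
groups = defaultdict(list)
for r in rows:
    groups[("channel", r["channel"])].append(r["residual"])

# Rank segments by absolute mean residual, largest first.
ranked = sorted(
    ((seg, sum(v) / len(v), len(v)) for seg, v in groups.items()),
    key=lambda t: abs(t[1]),
    reverse=True,
)
for (col, val), mean_res, n in ranked:
    print(f"{col} = {val}: mean residual {mean_res:+,.2f} ({n} rows)")
```

This is why segments with a consistently large mean residual (like the campaign/channel/device rows in the demo output) surface as candidate explanations for the unexplained part of the change.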
Or analyze directly from Postgres:
# option 1: inline query
factorlens analyze \
--postgres-url "$DATABASE_URL" \
--query "SELECT region, channel, revenue_usd, cost_usd FROM analytics.sales" \
--postgres-ssl-mode require \
--postgres-ca-file /path/to/rds-ca-bundle.pem \
--profile exec_custom \
--profile-config profiles/profiles.example.toml \
--out artifacts/analysis.md
# option 2: query file
factorlens analyze \
--postgres-url "$DATABASE_URL" \
--query-file sql/sales_analysis.sql \
--profile exec_custom \
--profile-config profiles/profiles.example.toml \
--out artifacts/analysis.md
# option 3: AWS RDS/Aurora TLS with explicit CA bundle (recommended in pods)
mkdir -p /path/to/certs
curl -fL "https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem" \
-o /path/to/rds-global-bundle.pem
factorlens analyze \
--query "SELECT * FROM schema.table_a LIMIT 5000" \
--postgres-ssl-mode require \
--postgres-ca-file /path/to/rds-global-bundle.pem \
--profile exec_custom \
--profile-config profiles/profiles.example.toml \
--out artifacts/analysis.md
Notes:
- Outputs both markdown and JSON (<out>.json).
- If --metrics is omitted, numeric metrics are auto-detected from the input file.
- --profile built-ins (exec, segment, supplier) are generic (no hardcoded domain columns).
- Use --profile-config <path.toml> for your own private, file-specific profile mappings.
- Input source is exclusive: use either --input <csv> or --postgres-url + (--query or --query-file).
- --postgres-url can be omitted if the DATABASE_URL env var is set.
- --postgres-ssl-mode supports prefer (default), require, or disable.
- --postgres-ca-file optionally adds PEM CA certificates for DB TLS verification.
- For AWS RDS/Aurora in containers/pods, pass the explicit RDS CA bundle via --postgres-ca-file if the TLS handshake fails with system certs.
- Recommended layout: commit profiles/profiles.example.toml; keep private variants as profiles/*.local.toml or profiles/*.private.toml (gitignored).
- --where accepts comma-separated column=value filters (AND semantics).
- --rank-by ranks groups by a chosen metric (default ranking is by count).
- --agg controls metric aggregation: sum (default), mean, or median.
- --percentiles adds optional metric columns (p50, p90) per metric.
- --count-only disables numeric metric aggregation and reports concentration using records only.
- --exclude-blank-groups drops (blank) segment keys before ranking/reporting.
- --alert-top5-share and --alert-blank-share add threshold-based alerts to report output.
- --alert-rule adds custom rules (for example: top5_record_share_pct>60, blank_share_pct>10, segments<50). Quote rules containing < or > in shell commands, for example: --alert-rule 'segments<50,top5_record_share_pct>60'.
- --top controls how many groups are listed in the report.
- --top-insights adds deterministic Top Risks and Top Opportunities bullets to the report.
- --opportunity-min-records sets the minimum records required for Top Opportunities candidates (default: 2).
- --normalize-text-groups normalizes group values for columns like name/title (lowercase + punctuation cleanup).
- --word-freq adds a Top Words section/counts for name/title-style grouping columns.
- --output-format supports md, json, both (default), or html.
- --min-records drops tiny segments before ranking (useful to avoid one-record outliers).
- analyze-suggest --out-profile <path> writes a ready profile file directly (--profile-format toml|json, default toml).
- explain-analyze --strict-facts true enforces evidence-cited, grounded summaries (default: true).
- explain-analyze --max-bullets <n> limits explanation verbosity (default: 5).
Example --profile-config file:
[profiles.exec_custom]
group_by = ["region", "channel"]
metrics = ["revenue_usd"]
rank_by = "revenue_usd"
top = 12
min_records = 20
auto_group_k = 3
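The --alert-rule syntax described in the notes above can be illustrated with a small parser/evaluator sketch. The semantics here are assumed from the documented examples; this is not the FactorLens parser:

```python
import re

def check_alert_rules(spec: str, stats: dict) -> list:
    """Evaluate comma-separated threshold rules like
    'segments<50,top5_record_share_pct>60' against computed stats."""
    fired = []
    for rule in spec.split(","):
        m = re.fullmatch(r"\s*(\w+)\s*([<>])\s*([\d.]+)\s*", rule)
        if not m:
            raise ValueError(f"bad rule: {rule!r}")
        name, op, threshold = m.group(1), m.group(2), float(m.group(3))
        value = stats[name]
        if (op == ">" and value > threshold) or (op == "<" and value < threshold):
            fired.append(f"{name} {op} {threshold:g} (actual {value:g})")
    return fired

# Hypothetical computed stats for a report.
stats = {"segments": 42, "top5_record_share_pct": 73.5}
print(check_alert_rules("segments<50,top5_record_share_pct>60", stats))
```

Both rules fire here: 42 segments is below the 50 floor, and the top-5 share exceeds 60%.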
pip Package Usage
For packaging/build/publish details, see BUILD_INSTRUCTIONS.md.
Install from PyPI:
pip install factorlens
factorlens --help
Local model:
factorlens explain \
--backend local \
--model /path/to/model.gguf \
--artifacts /path/to/artifacts \
--question "What drove the largest drawdown?"
Bedrock:
export AWS_REGION=us-east-1
factorlens explain \
--backend bedrock \
--model anthropic.claude-3-5-sonnet-20240620-v1:0 \
--artifacts /path/to/artifacts \
--question "What drove the largest drawdown?"
Explain from generic table analysis output (analysis.json):
Local model
factorlens explain-analyze \
--backend local \
--model /path/to/model.gguf \
--analysis-json /path/to/analysis.json \
--question "What are the top concentration risks and 3 actions?" \
--strict-facts true \
--max-bullets 5
Bedrock
factorlens explain-analyze \
--backend bedrock \
--model anthropic.claude-3-haiku-20240307-v1:0 \
--analysis-json /path/to/analysis.json \
--question "What are the top concentration risks and 3 actions?" \
--strict-facts true \
--max-bullets 5
MCP Server (Optional)
If you want to call FactorLens as tools from an MCP client, use:
- scripts/mcp/factorlens_mcp_server.py
- scripts/mcp/README.md
Quick start:
pip install mcp
python scripts/mcp/factorlens_mcp_server.py
What Bedrock Step Is Doing
factorlens explain --backend bedrock does not compute analytics. It only explains
already-computed artifacts.
Step-by-step:
- You run analytics first (factors fit or analyze) to produce artifacts.
- explain loads artifact context (for factor mode: factors.json, attribution.csv, outliers.csv).
- FactorLens builds a constrained prompt from that context.
- FactorLens calls AWS Bedrock through the AWS CLI (aws bedrock-runtime converse).
- Bedrock returns a plain-text explanation grounded in the provided artifact context.
Important:
- analyze command = pure Rust analytics, no LLM used.
- explain command = LLM narrative layer over artifacts.
- For table-analysis markdown (analysis.md), you can optionally call Bedrock directly with the AWS CLI by passing the report text as the prompt.
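The "constrained prompt" step can be pictured as packing the computed facts into the prompt and instructing the model to stay inside them. This is an illustrative sketch of the pattern; FactorLens's actual prompt wording and field names are not shown here:

```python
import json

def build_prompt(analysis: dict, question: str, max_bullets: int = 5) -> str:
    """Assemble a grounded prompt from already-computed artifacts
    (illustrative sketch, not FactorLens's real prompt builder)."""
    facts = json.dumps(analysis, indent=2)
    return (
        "Answer using ONLY the facts below. Cite evidence for each claim.\n"
        f"Limit the answer to {max_bullets} bullets.\n\n"
        f"FACTS:\n{facts}\n\n"
        f"QUESTION: {question}\n"
    )

# Hypothetical computed artifact content.
prompt = build_prompt({"revenue_usd_change_pct": -16.4}, "Why did revenue drop?")
```

Because the model only ever sees computed numbers, the narrative layer cannot introduce figures that the math engine did not produce, which is the point of the math-first design.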