DART electronic disclosures + EDGAR filings in one company map: a Python financial analysis library
DartLab
One stock code. The whole story.
DART + EDGAR filings, structured and comparable — in one line of Python.
Docs · Blog · Live Demo · Open in Colab · Open in Molab · 한국어 · Sponsor
The Problem
A public company files hundreds of pages every quarter. Revenue trends, risk warnings, management strategy, competitive position — the complete truth about a company, written by the company itself.
Nobody reads it.
Not because they don't want to. Because the same information is named differently by every company, structured differently every year, and scattered across formats designed for regulators, not readers. The same "revenue" appears as ifrs-full_Revenue, dart_Revenue, SalesRevenue, or dozens of Korean variations. The same "business overview" is titled differently in every filing.
DartLab is built on one premise: every period must be comparable, and every company must be comparable. It normalizes disclosure sections into a topic-period grid (~95% mapping rate) and standardizes XBRL accounts into canonical names (~97% mapping rate) — so you compare companies, not filing formats.
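As a rough illustration of what account standardization means, the sketch below maps the revenue variants named above onto one canonical key and measures a mapping rate. The alias table and the function names are hypothetical and are not dartlab's actual mapping data or API.

```python
from typing import List, Optional

# Hypothetical alias table for illustration; the library's real mappings
# are much larger and are maintained in its own data files.
CANONICAL_ACCOUNTS = {
    "revenue": {"ifrs-full_Revenue", "dart_Revenue", "SalesRevenue", "매출액"},
}

# Invert the table once so each lookup is a single dict access.
ALIAS_TO_CANONICAL = {
    alias.lower(): canonical
    for canonical, aliases in CANONICAL_ACCOUNTS.items()
    for alias in aliases
}


def normalize_account(raw_name: str) -> Optional[str]:
    """Return the canonical account name, or None when the alias is unmapped."""
    return ALIAS_TO_CANONICAL.get(raw_name.strip().lower())


def mapping_rate(raw_names: List[str]) -> float:
    """Share of raw account names that resolve to a canonical account."""
    if not raw_names:
        return 0.0
    mapped = sum(normalize_account(name) is not None for name in raw_names)
    return mapped / len(raw_names)
```

Once every filing's accounts resolve to the same keys, comparing two companies reduces to comparing values under one name.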
Quick Start
uv add dartlab
pip install dartlab # core + AI (openai, gemini included)
pip install "dartlab[server]" # + web server (FastAPI, MCP)
pip install "dartlab[viz]" # + charts (Plotly)
pip install "dartlab[all]" # everything (quotes keep shells like zsh from expanding the brackets)
import dartlab
c = dartlab.Company("005930") # Samsung Electronics
c.sections # every topic, every period, side by side
# shape: (41, 12) — 41 topics across 12 periods
# 2025Q4 2024Q4 2024Q3 2023Q4 ...
# companyOverview v v v v
# businessOverview v v v v
# riskManagement v v v v
c.show("businessOverview") # what this company actually does
c.diff("businessOverview") # what changed since last year
c.show("BS") # standardized balance sheet
c.show("ratios") # financial ratios, already calculated
# Same interface, different country
us = dartlab.Company("AAPL")
us.show("business")
us.show("ratios")
# Ask in natural language
dartlab.ask("Analyze Samsung Electronics financial health")
No API key needed. Data auto-downloads from HuggingFace on first use, then loads instantly from local cache.
What DartLab Is
Every engine follows one calling convention: dartlab.engine() with no arguments returns the guide, and dartlab.engine("axis") runs that axis.
| Layer | Engine | What it does | Entry point | Notebook |
|---|---|---|---|---|
| Data | Data | Pre-built HuggingFace datasets, auto-download | Company("005930") | - |
| L0/L1 | Company | Filings + financials + structured data unified by ticker | c.show(), c.select() | 01 |
| L1 | Gather | External market data (price, flow, macro, news) | dartlab.gather() | 02 |
| L1 | Scan | Cross-company comparison (governance, ratios, cashflow, ...) | dartlab.scan() | 03 |
| L1 | Quant | Technical & quantitative analysis (momentum/factor/pattern) | c.quant() | 04 |
| L2 | Analysis | Profitability/stability/cashflow causal analysis + valuation + forecast | c.analysis("financial", "수익성") | 05 |
| L2 | Macro | Market-level macro (cycle/rates/liquidity/sentiment/assets) | dartlab.macro("사이클") | 06 |
| L2 | Credit | Independent credit rating (dCR grade, default probability, health) | c.credit("등급") | 07 |
| L2 | Review | Composes analysis engines into a report (rich/html/markdown/json) | c.review("수익성") | 08 |
| L3 | AI | Active analyst: code execution + interpretation | dartlab.ask() | 09 |
| L4 | Channel | External sharing: dartlab channel brings PC dartlab to your phone | dartlab channel | - |
| core | Search | Semantic filing search (alpha) | dartlab.search() | 10 |
| facade | Listing | Catalog API (companies, filings, topics) | dartlab.listing() | 11 |
| viz | Viz | Charts and diagrams (emit_chart) | emit_chart({...}) | - |
| guide | Guide | Concierge: readiness, error handling, education | dartlab.guide.checkReady() | - |
Company
Design: ops/company.md
Three data sources — docs (full-text disclosures), finance (XBRL statements), report (DART API) — merged into one object. Data auto-downloads from HuggingFace, no setup needed.
c = dartlab.Company("005930")
c.index # what's available -- topic list + periods
c.show("BS") # view data -- DataFrame per topic
c.select("IS", ["매출액"]) # extract data -- finance or docs, same pattern
c.trace("BS") # where it came from -- source provenance
c.diff() # what changed -- text changes across periods
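To make "text changes across periods" concrete, here is a minimal sketch of a line-level diff between two versions of a section, using only the standard library's difflib. The function name and the sample text are made up for illustration; this is not how c.diff() is implemented.

```python
import difflib
from typing import List


def diff_periods(previous: str, current: str) -> List[str]:
    """Return only the added (+) and removed (-) lines between two texts."""
    lines = list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
            n=0,  # no context lines, changes only
        )
    )
    # The first two lines are the file headers; hunk markers start with "@@".
    return [line for line in lines[2:] if line.startswith(("+", "-"))]


changes = diff_periods(
    "We make memory chips.\nWe sell phones.",
    "We make memory chips.\nWe sell phones and AI servers.",
)
```

Here `changes` holds one removed line and one added line, which is the kind of signal a reader wants when a business overview is quietly reworded.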
Scan — Cross-Company Comparison
Design: ops/scan.md
Cross-company analysis across all listed firms. Governance, workforce, capital, debt, cashflow, audit, insider, quality, liquidity, network, account/ratio comparison, and more.
dartlab.scan("governance") # governance across all firms
dartlab.scan("ratio", "roe") # ROE across all firms
dartlab.scan("cashflow") # OCF/ICF/FCF + 8-pattern classification
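The source does not define the eight cash flow patterns. One common reading, assumed here, is that they are the 2^3 sign combinations of operating, investing, and financing cash flow. The sketch below encodes that assumption; the function name and the string encoding are illustrative and may differ from what scan("cashflow") actually returns.

```python
from itertools import product


def cashflow_pattern(operating: float, investing: float, financing: float) -> str:
    """Encode the sign of each cash flow as '+' or '-' (zero counts as '+')."""
    flows = (operating, investing, financing)
    return "".join("+" if value >= 0 else "-" for value in flows)


# Assumption: the eight patterns are all sign combinations of the three flows.
ALL_PATTERNS = ["".join(signs) for signs in product("+-", repeat=3)]
```

Under this encoding a firm that generates operating cash, invests it, and repays financing falls in pattern "+--".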
Gather — External Market Data
Design: ops/gather.md
Price, flow, macro, news — all as Polars DataFrames.
dartlab.gather("price", "005930") # KR OHLCV
dartlab.gather("price", "AAPL", market="US") # US stock
dartlab.gather("macro", "FEDFUNDS") # auto-detects US
dartlab.gather("news", "삼성전자") # Google News RSS
Analysis — 14-Axis Financial Analysis
Design: ops/analysis.md
Revenue structure → profitability → growth → stability → cash flow → capital allocation → valuation → forecast. Turns raw statements into a causal narrative that feeds Review, AI, and direct human reading.
c.analysis("financial", "수익성") # profitability analysis
c.analysis("financial", "현금흐름") # cash flow analysis
print(c.credit()) # available-axes guide DataFrame (self-discovery)
c.credit("등급") # dCR-AA, healthScore 93/100
c.credit("등급", detail=True) # grade + narrative + metrics
Credit — Independent Credit Rating
Design: ops/credit.md | Reports: dartlab.pages.dev/blog/credit-reports
Independent credit analysis with 3-Track model (general/financial/holding), Notch Adjustment, CHS market correction, and separate financial statement blending.
79-company validation: large-cap 87% (26/30), mid-cap 82% (41/50), full sample 70% (55/79, re-measurement pending after v5.0 overvaluation fix). Samsung AA+ exact match. See methodology for validation details.
print(c.credit()) # self-discovery — available axes + grade
cr = c.credit("등급") # main grade
print(cr["grade"]) # dCR-AA+
print(cr["healthScore"]) # 96 (0-100, higher is better)
print(cr["pdEstimate"]) # 0.01% default probability
cr = c.credit("등급", detail=True) # grade + narrative + metrics + divergence explanation
print(cr["divergenceExplanation"]) # why it differs from agencies
Publish reports (credit narrative + audit are auto-included in review's 5 acts):
from dartlab.review.publisher import publishReport
publishReport("005930") # 6-act report including credit narrative + audit
Review — Analysis to Report
Design: ops/review.md
Assembles analysis into a structured report. 4 output formats: rich (terminal), html, markdown, json.
c.review() # full report
c.reviewer() # report + AI interpretation
Sample reports: Samsung Electronics · SK Hynix · Kia · HD Hyundai Heavy Industries · SK Telecom · LG Chem · NCSoft · Amorepacific
Search — Find Filings by Meaning (alpha)
Design: ops/search.md
No model, no GPU, no cold start. 95% precision on 4M documents — better than neural embeddings at 1/100th the cost. See methodology for benchmark details.
dartlab.search("유상증자 결정") # find capital raise filings
dartlab.search("대표이사 변경", corp="005930") # filter by company
dartlab.search("회사가 돈을 빌렸다") # natural language works too
AI — Active Analyst
Design: ops/ai.md
The AI writes and executes Python code using dartlab's full API. You see every line of code it runs. 60+ questions validated, 95%+ first-try success. See methodology for validation scope and limits.
dartlab.ask("Analyze Samsung Electronics financial health")
dartlab.ask("Samsung analysis", provider="gemini") # free providers available
Providers: gemini (free), groq (free), cerebras (free), oauth-codex (ChatGPT subscription), openai, ollama (local), and more. Auto-fallback across providers when rate-limited.
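The auto-fallback behaviour can be pictured as trying providers in order and moving on when one reports a rate limit. The sketch below is a generic illustration under that assumption; RateLimited, ask_with_fallback, and the provider callables are hypothetical names, not dartlab's internals.

```python
from typing import Callable, List, Sequence, Tuple


class RateLimited(Exception):
    """Hypothetical error a provider raises when its quota is exhausted."""


def ask_with_fallback(
    question: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Return (provider_name, answer) from the first provider that responds."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(question)
        except RateLimited as exc:
            # Only rate limits trigger a fallback; other errors propagate.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers rate-limited: " + "; ".join(errors))
```

Catching only the rate-limit error keeps genuine failures, such as a malformed request, visible instead of silently hopping to the next provider.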
Channel — Use your PC dartlab from anywhere
Design: ops/channel.md
Run one command on your PC and the dartlab UI works on your phone. Microsoft DevTunnels is set up automatically.
dartlab channel
Flow:
- winget auto-installs the devtunnel CLI (one-time)
- GitHub OAuth (one-time, browser opens automatically)
- Permanent URL + QR code (https://<id>-8400.<region>.devtunnels.ms)
- Open the URL/QR on your phone Chrome → dartlab UI just works
Zero domains, zero token tricks. Same infrastructure as VS Code Remote Tunnels — verified mobile compatibility. Optional messaging bots: --telegram/slack/discord.
Architecture
L0  core/        Protocols, finance utils, docs utils, registry
L1  providers/   Country-specific data (DART, EDGAR, EDINET)
    gather/      External market data (Naver, Yahoo, FRED)
    scan/        Market-wide analysis: scan("group", "axis")
L2  analysis/    Financial + forecast + valuation + quant: analysis("group", "axis")
    credit/      Independent credit rating: c.credit()
    macro/       Market-level macro: dartlab.macro()
    review/      Block composition (analysis + credit)
L3  ai/          Active analyst: dartlab.ask()
L4  vscode/      VSCode extension (dartlab chat --stdio)
Import direction enforced by CI. Adding a new country means one provider package — zero core changes.
Layer consumption flow
Who consumes whom across the stack:
flowchart TB
subgraph L4["L4 · User interface"]
UI["vscode / CLI / web"]
end
subgraph L3["L3 · LLM analyst"]
AI["ai<br/>dartlab.ask()"]
end
subgraph L2["L2 · Analysis"]
ANA["analysis<br/>causal financial + forecast + valuation"]
CRD["credit<br/>independent rating"]
MAC["macro<br/>market reading"]
REV["review<br/>block-composed report"]
end
subgraph L1["L1 · Data ingestion"]
PRV["providers<br/>DART / EDGAR / EDINET"]
GAT["gather<br/>FRED / ECOS / Naver / Yahoo"]
SCN["scan<br/>cross-market"]
QNT["quant<br/>25 technical indicators"]
end
subgraph L0["L0 · Infrastructure"]
CORE["core<br/>protocols + finance + docs + search"]
end
UI --> AI
AI --> REV
AI --> ANA
AI --> MAC
AI --> SCN
REV --> ANA
REV --> CRD
ANA --> PRV
ANA --> GAT
CRD --> PRV
MAC --> GAT
SCN --> PRV
QNT --> GAT
PRV --> CORE
GAT --> CORE
SCN --> CORE
QNT --> CORE
classDef l0 fill:#f5f5f5,stroke:#999
classDef l1 fill:#e8f4ff,stroke:#4a90e2
classDef l2 fill:#fff4e6,stroke:#e67e22
classDef l3 fill:#f0e6ff,stroke:#8e44ad
classDef l4 fill:#e6ffe6,stroke:#27ae60
class CORE l0
class PRV,GAT,SCN,QNT l1
class ANA,CRD,MAC,REV l2
class AI l3
class UI l4
Core rules:
- Arrows always flow top → bottom (L4→L3→L2→L1→L0). Reverse imports forbidden (CI-enforced)
- L2 engines never import each other — analysis ↛ credit, macro ↛ analysis. Composition is review's or ai's job
- When adding a feature, pick the right layer first and let data flow in one direction only
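The rules above can be expressed as a small check of the kind a CI job might run over import edges. This is a sketch, not the project's actual CI script: layer numbers follow the flowchart, and treating review as the only L2 package allowed to import L2 peers is an assumption drawn from the REV arrows.

```python
from typing import Optional

# Layer numbers follow the flowchart; package names are illustrative.
LAYER_OF = {
    "core": 0,
    "providers": 1, "gather": 1, "scan": 1, "quant": 1,
    "analysis": 2, "credit": 2, "macro": 2, "review": 2,
    "ai": 3,
    "vscode": 4,
}

# Assumption: review is the only L2 package allowed to import L2 peers.
L2_COMPOSERS = {"review"}


def check_import(importer: str, imported: str) -> Optional[str]:
    """Return a violation message, or None when the import is allowed."""
    src, dst = LAYER_OF[importer], LAYER_OF[imported]
    if src < dst:
        return f"{importer} (L{src}) must not import {imported} (L{dst})"
    if src == dst == 2 and importer not in L2_COMPOSERS:
        return f"L2 engines must not import each other: {importer} -> {imported}"
    return None
```

For example, check_import("analysis", "credit") returns a violation message, while check_import("review", "credit") returns None.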
EDGAR (US)
Same interface, different data source. Auto-fetched from SEC API — no pre-download needed.
# Korea (DART)
c = dartlab.Company("005930")
c.sections
c.show("businessOverview")
c.show("BS")
c.show("ratios")
c.diff("businessOverview")
# US (EDGAR)
c = dartlab.Company("AAPL")
c.sections
c.show("business")
c.show("BS")
c.show("ratios")
c.diff("10-K::item7Mdna")
Macro — Economy Without a Ticker
Design: ops/macro.md
No Company needed. Read the economy with import dartlab.
dartlab.macro("사이클") # Business cycle — 4 phases
dartlab.macro("금리") # Rates + Nelson-Siegel yield curve
dartlab.macro("예측") # LEI + Cleveland Fed probit + Hamilton RS + GDP Nowcast
dartlab.macro("위기") # Credit-to-GDP gap + Minsky + Koo + Fisher
dartlab.macro("기업집계") # Bottom-up: earnings cycle, Ponzi ratio, leverage
dartlab.macro("종합") # Macro summary + investment strategies + portfolio allocation
# Scenario
dartlab.macro("사이클", overrides={"hy_spread": 600})
# Backtest
dartlab.macro("금리", as_of="2022-01-01")
Cycle, rates, assets, sentiment, liquidity, forecast, crisis, inventory, corporate, trade signals — global macro methods (Hamilton EM, Kalman DFM, Nelson-Siegel, Cleveland Fed probit, Sahm Rule, BIS Credit-to-GDP, GHS, Minsky, Koo, Fisher, Cu/Au, FCI) implemented in numpy only (zero statsmodels/scipy).
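For readers unfamiliar with Nelson-Siegel, the sketch below evaluates the textbook curve for a single maturity using only the standard library. It illustrates the formula, not dartlab's implementation, and the parameter names (level, slope, curvature, decay) are chosen here for readability.

```python
import math


def nelson_siegel_yield(
    maturity: float,
    level: float,
    slope: float,
    curvature: float,
    decay: float,
) -> float:
    """Textbook Nelson-Siegel yield for one maturity (same time unit as decay)."""
    if maturity <= 0 or decay <= 0:
        raise ValueError("maturity and decay must be positive")
    x = maturity / decay
    # (1 - exp(-x)) / x, computed with expm1 for accuracy at small x.
    loading = -math.expm1(-x) / x
    return level + slope * loading + curvature * (loading - math.exp(-x))
```

At very short maturities the yield approaches level + slope, and at very long maturities it approaches level, which is why the three parameters are usually read as long-run level, short-end slope, and mid-curve hump.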
Backtest result (2000-2024, FRED): Cleveland Fed probit detected 3/3 US recessions with 2-16 month lead time, recall 90% at threshold 0.20.
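Recall at a threshold can be computed as the share of true recession observations whose predicted probability reaches the threshold. The source does not say whether its 90% figure is measured per month or per episode, so the per-observation version below is an assumption, and the function name is hypothetical.

```python
from typing import Sequence


def recall_at_threshold(
    probabilities: Sequence[float],
    is_recession: Sequence[bool],
    threshold: float = 0.20,
) -> float:
    """Share of recession observations flagged at or above the threshold."""
    if len(probabilities) != len(is_recession):
        raise ValueError("inputs must have the same length")
    positives = sum(1 for actual in is_recession if actual)
    if positives == 0:
        raise ValueError("no recession observations to measure recall on")
    hits = sum(
        1
        for prob, actual in zip(probabilities, is_recession)
        if actual and prob >= threshold
    )
    return hits / positives
```

Lowering the threshold raises recall at the cost of more false alarms, so a recall figure is only meaningful alongside the threshold it was measured at.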
MCP — AI Assistant Integration
Built-in MCP server for Claude Desktop, Claude Code, Cursor, and any MCP-compatible client.
# Claude Code — one line setup
claude mcp add dartlab -- uv run dartlab mcp
# Codex CLI
codex mcp add dartlab -- uv run dartlab mcp
Claude Desktop / Cursor config
Add to claude_desktop_config.json or .cursor/mcp.json:
{
"mcpServers": {
"dartlab": {
"command": "uv",
"args": ["run", "dartlab", "mcp"]
}
}
}
Or auto-generate: dartlab mcp --config claude-desktop
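If a client config file already lists other MCP servers, the dartlab entry has to be merged in rather than pasted over them. The helper below is a small sketch of that merge using the standard json module; add_dartlab_server is a hypothetical name and is not part of the dartlab CLI.

```python
import json


def add_dartlab_server(config_text: str) -> str:
    """Merge the dartlab MCP entry into a JSON config, keeping other servers."""
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["dartlab"] = {"command": "uv", "args": ["run", "dartlab", "mcp"]}
    return json.dumps(config, indent=2)
```

Passing the current file contents in and writing the returned string back leaves any existing server entries untouched.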
OpenAPI — Raw Public APIs
from dartlab import OpenDart, OpenEdgar
# Korea (requires free API key from opendart.fss.or.kr)
d = OpenDart()
d.filings("삼성전자", "2024")
d.finstate("삼성전자", 2024)
# US (no API key needed)
e = OpenEdgar()
e.filings("AAPL", forms=["10-K", "10-Q"])
Data
All data is pre-built on HuggingFace — auto-downloads on first use. EDGAR data comes directly from the SEC API.
| Dataset | Coverage | Size |
|---|---|---|
| DART docs | 2,500+ companies | ~8 GB |
| DART finance | 2,700+ companies | ~600 MB |
| DART report | 2,700+ companies | ~320 MB |
| EDGAR | On-demand | SEC API |
Pipeline: local cache (instant) → HuggingFace (auto-download) → DART API (with your key). Most users never leave the first two.
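The pipeline above is a cache-first lookup with ordered fallbacks. A minimal sketch of that pattern follows; load_dataset, the fetcher callables, and the use of LookupError to signal "not available here" are all assumptions made for illustration, not dartlab's loader.

```python
from pathlib import Path
from typing import Callable, Sequence


def load_dataset(
    name: str,
    cache_dir: Path,
    fetchers: Sequence[Callable[[str], bytes]],
) -> bytes:
    """Return the dataset from local cache, else from the first fetcher that has it."""
    cached = cache_dir / name
    if cached.exists():
        return cached.read_bytes()  # instant path: no network involved
    for fetch in fetchers:
        try:
            payload = fetch(name)
        except LookupError:
            continue  # this source does not have the dataset, try the next one
        cache_dir.mkdir(parents=True, exist_ok=True)
        cached.write_bytes(payload)  # populate the cache for the next call
        return payload
    raise FileNotFoundError(name)
```

With fetchers ordered as (HuggingFace download, DART API), the second call for the same dataset never reaches either remote source.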
Try It Now
Live Demo — no install, no Python
Notebooks: Company · Scan · Review · Gather · Analysis · Ask (AI)
Documentation
Docs · Quick Start · API Overview · Blog (120+ articles)
Stability
| Tier | Scope |
|---|---|
| Stable | DART Company (sections, show, trace, diff, BS/IS/CF, CIS, index, filings, profile), EDGAR Company core, valuation, forecast, simulation |
| Beta | EDGAR power-user (SCE, notes, freq, coverage), credit, insights, distress, ratios, timeseries, network, governance, workforce, capital, debt, chart/table/text tools, ask/chat, OpenDart, OpenEdgar, Server API, MCP |
| Experimental | AI tool calling, export, viz (charts) |
See docs/stability.md.
Contributing
Contributors are very welcome. Whether it's a bug report, a new analysis axis, a mapping fix, or a documentation improvement — every contribution makes dartlab better for everyone.
The one rule: experiment first, engine second. Validate your idea in experiments/ before changing the engine. This keeps the core stable while making it easy to try bold ideas.
- Experiment folder: experiments/XXX_name/. Each file must be independently runnable, with actual results in its docstring
- Data contributions (e.g. accountMappings.json, sectionMappings.json): accepted when backed by experiment evidence
- Issues and PRs in Korean or English are both welcome
- Not sure where to start? Open an issue — we'll help you find the right place
License
MIT