Async-enabled fork of edgartools with enhanced XBRL support for non-US GAAP financial statements
Async-Enabled Python Library for SEC EDGAR Data Extraction
Async fork of edgartools with enhanced XBRL support for non-US GAAP statements (IFRS, etc.). Extract financial data without blocking your event loop - perfect for high-throughput financial data pipelines.
⭐ ALL CREDIT GOES TO EDGARTOOLS ⭐ This is a temporary fork created solely to add async support for immediate production needs. ALL the heavy lifting, design, and core functionality is from the brilliant work of Dwight Gunning and the edgartools community.
Use the original edgartools for production - it's actively maintained, has extensive documentation, and a strong community. This fork exists only to add async APIs until they're merged upstream.
If you find this useful, please ⭐ star and support the original project: https://github.com/dgunning/edgartools
SEC Filing Data Extraction with Python
| With EdgarTools | Without EdgarTools |
|---|---|
| ✅ Instant access to any filing since 1994 | ❌ Hours spent navigating SEC.gov |
| ✅ Clean Python API with intuitive methods | ❌ Complex web scraping code |
| ✅ Automatic parsing into pandas DataFrames | ❌ Manual extraction of financial data |
| ✅ Specialized data objects for each form type | ❌ Custom code for each filing type |
| ✅ One-line conversion to clean, readable text | ❌ Messy HTML parsing for text extraction |
| ✅ LLM-ready text extraction for AI pipelines | ❌ Extra processing for AI/LLM compatibility |
| ✅ Automatic throttling to avoid blocks | ❌ Rate limiting headaches |
Apple's balance sheet in 1 line of code
balance_sheet = Company("AAPL").get_financials().balance_sheet()
🚀 Quick Start (2-minute tutorial)
# 1. Import the library
from edgar import *
# 2. Tell the SEC who you are (required by SEC regulations)
set_identity("your.name@example.com") # Replace with your email
# 3. Find a company
company = Company("MSFT") # Microsoft
# 4. Get company filings
filings = company.get_filings()
# 5. Filter by form
insider_filings = filings.filter(form="4") # Insider transactions
# 6. Get the latest filing
insider_filing = insider_filings[0]
# 7. Convert to a data object
ownership = insider_filing.obj()
⚡ Async API (New in edgartools-async)
Perfect for high-throughput pipelines and concurrent processing:
import asyncio
from edgar import get_company_async, set_identity

async def main():
    # Set identity BEFORE async operations
    set_identity("your.name@example.com")

    # Load company data without blocking event loop
    company = await get_company_async("AAPL", user_agent="your.name@example.com")

    # Load SGML data asynchronously
    filings = company.get_filings(form="10-K")
    sgml = await filings[0].sgml_async()

    # Batch load multiple filings concurrently
    from edgar._filings import load_sgmls_concurrently
    filings_list = list(company.get_filings(form="10-Q"))[:10]
    sgmls = await load_sgmls_concurrently(filings_list, max_in_flight=32)
    print(f"Loaded {len(sgmls)} filings concurrently!")

asyncio.run(main())
Key Async Features:
- get_company_async(): Non-blocking company instantiation
- filing.sgml_async(): Async SGML file loading
- load_sgmls_concurrently(): Batch concurrent loading with rate limiting
- Thread-safe identity management: No stdin blocking in async contexts
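The idea behind a `max_in_flight` cap can be illustrated with a plain `asyncio.Semaphore`. This is a minimal sketch of the general technique, not the fork's actual implementation: `load_concurrently` and `fake_loader` are hypothetical names, and `fake_loader` stands in for a real network call such as `filing.sgml_async()`.

```python
import asyncio


async def load_concurrently(items, loader, max_in_flight=32):
    """Run loader(item) for every item, with at most max_in_flight running at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def load_one(item):
        # Each task waits here until one of the max_in_flight slots is free
        async with semaphore:
            return await loader(item)

    # gather returns results in the same order as the input items
    return await asyncio.gather(*(load_one(item) for item in items))


async def fake_loader(accession_number):
    # Hypothetical stand-in for a network call (assumption, not a library API)
    await asyncio.sleep(0.01)
    return f"sgml:{accession_number}"


results = asyncio.run(load_concurrently(["a", "b", "c"], fake_loader, max_in_flight=2))
print(results)
```

A semaphore bounds how many requests are open at once; it does not by itself enforce a requests-per-second limit, so a real pipeline would likely combine it with throttling.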
🌍 Enhanced Non-US GAAP Support (New in edgartools-async)
Improved handling of international financial statements (IFRS, etc.):
Key Enhancements:
- IFRS taxonomy support: Better detection and parsing of IFRS statements
- Quarterly vs YTD fallback: Intelligently selects best available periods (prefers 3-month, falls back to YTD for cash flow)
- Sparse period filtering: Removes comparison periods with incomplete data
- Improved concept matching: Better revenue/income detection across taxonomies
- Abstract element inference: Automatically identifies abstract/header rows
- Revenue deduplication: Smarter handling of dimensional breakdowns vs parent totals
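The quarterly vs YTD fallback listed above could work roughly as follows. This is a simplified sketch under stated assumptions, not the library's code: `pick_period` is a hypothetical helper, periods are plain `(start, end)` date tuples, and "roughly three months" is assumed to mean 80 to 100 days.

```python
from datetime import date


def pick_period(periods, quarter_days=(80, 100)):
    """Prefer a ~3-month period; otherwise fall back to the longest (YTD) one.

    periods: list of (start, end) date tuples that share the same end date.
    Returns (period, label), where label is "quarterly", "ytd", or "none".
    """
    shortest, longest = quarter_days
    quarterly = [p for p in periods if shortest <= (p[1] - p[0]).days <= longest]
    if quarterly:
        return quarterly[0], "quarterly"
    if not periods:
        return None, "none"
    # No 3-month period was reported: use the longest duration available
    return max(periods, key=lambda p: (p[1] - p[0]).days), "ytd"


# An income statement often reports both a quarter and a year-to-date period
income_periods = [
    (date(2024, 1, 1), date(2024, 9, 30)),
    (date(2024, 7, 1), date(2024, 9, 30)),
]
# A cash flow statement often reports only the year-to-date period
cash_flow_periods = [(date(2024, 1, 1), date(2024, 9, 30))]

print(pick_period(income_periods)[1])
print(pick_period(cash_flow_periods)[1])
```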
Example: Foreign Filer with IFRS
from edgar import Company
# Works seamlessly with non-US GAAP filers
company = Company("SAP") # German company using IFRS
financials = company.income_statement(periods=4, annual=True)
# Automatically detects and parses IFRS taxonomy
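To give a feel for the sparse period filtering mentioned under Key Enhancements, here is a minimal sketch in plain Python. It assumes a statement is a dict of period label to concept values and that a period is "sparse" when fewer than half of the known concepts have a value. Both the `drop_sparse_periods` helper and the 0.5 threshold are illustrative assumptions, not the library's actual rule.

```python
def drop_sparse_periods(statement, min_fill_ratio=0.5):
    """Keep only periods that report enough of the statement's concepts.

    statement: dict mapping period label -> dict of concept -> value,
    where a missing key or None means the concept was not reported.
    """
    concepts = {concept for values in statement.values() for concept in values}
    kept = {}
    for period, values in statement.items():
        filled = sum(1 for concept in concepts if values.get(concept) is not None)
        if concepts and filled / len(concepts) >= min_fill_ratio:
            kept[period] = values
    return kept


# Made-up numbers, for illustration only
statement = {
    "2024": {"Revenue": 100, "OperatingIncome": 20, "NetIncome": 10},
    "2023": {"Revenue": 90, "OperatingIncome": 18, "NetIncome": 8},
    "2022": {"Revenue": 80, "NetIncome": None},  # comparison period with gaps
}

filtered = drop_sparse_periods(statement)
print(list(filtered))
```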
SEC Filing Analysis: Real-World Solutions
Company Financial Analysis
Problem: Need to analyze a company's financial health across multiple periods.
📚 Documentation
👥 Community & Support
- GitHub Issues - Bug reports and feature requests
- Discussions - Questions and community discussions
🔮 Roadmap
- Coming Soon: Enhanced visualization tools for financial data
- In Development: Machine learning integrations for financial sentiment analysis
- Planned: Interactive dashboard for filing exploration
🤝 Contributing
We welcome contributions from the community! Here's how you can help:
- Code: Fix bugs, add features, improve documentation
- Examples: Share interesting use cases and examples
- Feedback: Report issues or suggest improvements
- Spread the Word: Star the repo, share with colleagues
See our Contributing Guide for details.
❤️ Sponsors & Support
If you find EdgarTools valuable, please consider supporting its development:
Your support helps maintain and improve EdgarTools for the entire community!
Key Features for SEC Data Extraction and Analysis
- Comprehensive Filing Access: Retrieve any SEC filing (10-K, 10-Q, 8-K, 13F, S-1, Form 4, etc.) since 1994.
- Financial Statement Extraction: Easily access Balance Sheets, Income Statements, Cash Flows, and individual line items using XBRL tags or common names.
- SEC EDGAR API: Programmatic access to the complete SEC database.
- Smart Data Objects: Automatic parsing of filings into structured Python objects.
- Fund Holdings Analysis: Extract and analyze 13F holdings data for investment managers.
- Insider Transaction Monitoring: Get structured data from Form 3, 4, 5 filings.
- Clean Text Extraction: One-line conversion from filing HTML to clean, readable text suitable for NLP.
- Targeted Section Extraction: Pull specific sections like Risk Factors (Item 1A) or MD&A (Item 7).
- AI/LLM Ready: Text formatting and chunking optimized for AI pipelines.
- Performance Optimized: Leverages libraries like lxml and potentially PyArrow for efficient data handling.
- XBRL Support: Extract and analyze XBRL-tagged data.
- Intuitive API: Simple, consistent interface for all data types.
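As an illustration of what chunking for AI pipelines can look like, the sketch below groups paragraphs of already-extracted text into size-bounded chunks. It is a generic approach, not the library's chunking logic: `chunk_text` is a hypothetical helper and the character limit is an arbitrary assumption.

```python
def chunk_text(text, max_chars=1000):
    """Group paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would overflow: close the chunk, start a new one
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample = "Item 1A. Risk Factors\n\nFirst risk paragraph.\n\nSecond risk paragraph."
chunks = chunk_text(sample, max_chars=50)
print(len(chunks))
```

Splitting on paragraph boundaries keeps sentences intact, which tends to matter more for retrieval quality than hitting an exact chunk size.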
EdgarTools is distributed under the MIT License.