Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual structure of the document.

sec-parser


Overview

The sec-parser project simplifies the process of extracting meaningful information from SEC EDGAR HTML documents. It organizes the document's source code into a list or tree of elements that correspond to the visual structure of the document. This includes distinct elements for section titles, paragraphs, and tables, making the data easier to analyze and understand.

This is especially useful for artificial intelligence (AI) and large language model (LLM) applications: because the filing is already split into typed elements, titles, paragraphs, and tables can be extracted and analyzed without hand-written HTML cleanup.
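To make the "list or tree of elements" idea concrete, here is a minimal sketch of how a flat, document-ordered list of elements could be nested under its section titles. The `Element` class, its `level` field, and `build_tree` are hypothetical names made up for this illustration; they are not sec-parser's classes or its actual algorithm.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a parsed element (not sec-parser's class)."""

    kind: str  # "title", "text", or "table"
    text: str
    level: int = 0  # heading depth; only meaningful for titles (assumption)
    children: list[Element] = field(default_factory=list)


def build_tree(flat: list[Element]) -> list[Element]:
    """Nest each element under the nearest preceding title of a lower level."""
    roots: list[Element] = []
    open_titles: list[Element] = []  # currently open titles, outermost first
    for el in flat:
        if el.kind == "title":
            # A new title closes any open title at the same or a deeper level.
            while open_titles and open_titles[-1].level >= el.level:
                open_titles.pop()
        target = open_titles[-1].children if open_titles else roots
        target.append(el)
        if el.kind == "title":
            open_titles.append(el)
    return roots


flat = [
    Element("title", "PART I", level=0),
    Element("title", "Item 1. Financial Statements", level=1),
    Element("text", "(In millions)"),
    Element("table", "..."),
    Element("title", "Item 2. Management's Discussion", level=1),
    Element("text", "Overview paragraph"),
]
tree = build_tree(flat)
```

In this sketch the text and table elements end up as children of "Item 1. Financial Statements", and both "Item" titles end up as children of "PART I", which is the kind of visual hierarchy the overview describes.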

Installation

You can install sec-parser using pip:

pip install sec-parser

Usage

import sec_parser as sp

# Parse the latest 10-Q filing for Apple (ticker AAPL) into an element tree
tree = sp.parse_latest("10-Q", ticker="AAPL")

# Show the general structure of the tree
print(tree.render())

Console output:

RootSectionElement: PART I — FINANCIAL INFORMATION
├── TitleElement: Item 1. Financial Statements
│   ├── TitleElement: CONDENSED CONSOLIDATED STATEMENTS OF OPERATIONS (U...
│   │   ├── TextElement: (In millions, except number of shares which are re...
│   │   ├── TableElement: ...
│   ...
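For readers curious how output in this shape can be produced, the following is a small, self-contained sketch of a tree renderer that uses the same box-drawing connectors. It works on plain `(label, children)` tuples and is not the library's `render()` implementation; the function names are hypothetical, and the 50-character truncation is an assumption inferred from the labels in the sample output above.

```python
def shorten(label: str, max_len: int = 50) -> str:
    """Truncate long labels and mark the cut with an ellipsis."""
    return label if len(label) <= max_len else label[:max_len] + "..."


def render_children(children, prefix: str = "", max_len: int = 50) -> list:
    """Return one line per descendant, indented with box-drawing connectors."""
    lines = []
    for i, (label, grandchildren) in enumerate(children):
        is_last = i == len(children) - 1
        connector = "└── " if is_last else "├── "
        lines.append(prefix + connector + shorten(label, max_len))
        # Descendants of a non-last child keep a vertical guide line.
        child_prefix = prefix + ("    " if is_last else "│   ")
        lines.extend(render_children(grandchildren, child_prefix, max_len))
    return lines


def render(root, max_len: int = 50) -> str:
    """Render a (label, children) tuple with the root on its own line."""
    label, children = root
    return "\n".join([shorten(label, max_len)] + render_children(children, "", max_len))


root = (
    "RootSectionElement: PART I",
    [
        (
            "TitleElement: Item 1",
            [("TextElement: intro", []), ("TableElement: ...", [])],
        ),
        ("TitleElement: Item 2", []),
    ],
)
print(render(root))
```

The prefix passed down to each level records whether the ancestors were last among their siblings, which is what decides between drawing a vertical guide and leaving blank space.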

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source distribution: sec_parser-0.11.0.post1.tar.gz (25.2 kB)

Built distribution: sec_parser-0.11.0.post1-py3-none-any.whl (39.8 kB, Python 3)