Useful extensions for sec-edgar-downloader.

sec-downloader

Useful extensions for sec-edgar-downloader. Built with nbdev.

Install

pip install sec_downloader

Features

  • Files are downloaded to a temporary folder, immediately read into memory, and then deleted.
  • Use a glob pattern to select which files are read into memory.
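
The two features above can be illustrated with a minimal sketch that uses only the standard library, not sec_downloader itself: files are written to a temporary folder, those matching a glob pattern are read into memory, and the folder is deleted on exit. The helper name read_matching_files and the sample file names are hypothetical.

```python
import tempfile
from pathlib import Path


def read_matching_files(folder: Path, pattern: str) -> dict[str, str]:
    """Read every file under `folder` that matches the glob `pattern` into memory."""
    return {
        path.relative_to(folder).as_posix(): path.read_text()
        for path in sorted(folder.glob(pattern))
        if path.is_file()
    }


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Stand-ins for downloaded files (hypothetical names and contents).
    (folder / "filing").mkdir()
    (folder / "filing" / "primary-document.html").write_text("<html>10-Q</html>")
    (folder / "filing" / "full-submission.txt").write_text("<SEC-DOCUMENT>")
    contents = read_matching_files(folder, "**/*.htm*")

# The temporary folder is gone; only the matched file remains in memory.
print(contents)
```

Only the .html file is kept because the pattern "**/*.htm*" does not match full-submission.txt.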

How to use

Download the metadata

Find a filing with an Accession Number

from sec_downloader import Downloader

dl = Downloader("MyCompanyName", "email@example.com")
metadata = dl.get_filing_metadatas("AAPL/0000320193-23-000077")
print(metadata[0])
FilingMetadata(accession_number='0000320193-23-000077',
               form_type='10-Q',
               primary_doc_url='https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm',
               items='',
               primary_doc_description='10-Q',
               filing_date='2023-08-04',
               report_date='2023-07-01',
               cik='0000320193',
               company_name='Apple Inc.',
               tickers=[Ticker(symbol='AAPL', exchange='Nasdaq')])

Alternatively, you can also use any of these to get the same answer:

# Assumption: CompanyAndAccessionNumber lives alongside RequestedFilings in sec_downloader.types
from sec_downloader.types import CompanyAndAccessionNumber

metadata = dl.get_filing_metadatas("aapl/000032019323000077")
metadata = dl.get_filing_metadatas("320193/000032019323000077")
metadata = dl.get_filing_metadatas("320193/0000320193-23-000077")
metadata = dl.get_filing_metadatas("0000320193/0000320193-23-000077")
metadata = dl.get_filing_metadatas(CompanyAndAccessionNumber(ticker_or_cik="320193", accession_number="0000320193-23-000077"))
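
The alternatives above suggest that dashes in the Accession Number and leading zeros in the CIK are interchangeable. The following is a rough sketch of how such inputs could be reduced to one canonical form; the helper names are hypothetical and this is not necessarily how sec_downloader does it.

```python
import re


def normalize_accession_number(raw: str) -> str:
    """Return the dashed form (10-2-6 digits) of an accession number."""
    digits = raw.replace("-", "")
    if not re.fullmatch(r"[0-9]{18}", digits):
        raise ValueError(f"not an accession number: {raw!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def normalize_cik(raw: str) -> str:
    """Zero-pad a numeric CIK to 10 digits."""
    if not re.fullmatch(r"[0-9]{1,10}", raw):
        raise ValueError(f"not a CIK: {raw!r}")
    return raw.zfill(10)


print(normalize_accession_number("000032019323000077"))  # 0000320193-23-000077
print(normalize_cik("320193"))  # 0000320193
```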

Find the filing matching a SEC EDGAR Filing URL. Only CIK and Accession Number are used from the URL:

metadatas = dl.get_filing_metadatas(
    "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
print(metadatas[0])
FilingMetadata(accession_number='0001193125-23-272204',
               form_type='8-K',
               primary_doc_url='https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm',
               items='2.02,9.01',
               primary_doc_description='8-K',
               filing_date='2023-11-07',
               report_date='2023-11-04',
               cik='0001067983',
               company_name='BERKSHIRE HATHAWAY INC',
               tickers=[Ticker(symbol='BRK-B', exchange='NYSE'),
                        Ticker(symbol='BRK-A', exchange='NYSE')])

Alternatively, you can also use URLs in other formats and get the same answer:

metadata = dl.get_filing_metadatas("https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm")
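
Since only the CIK and Accession Number are taken from the URL, both URL formats above carry the same information. A minimal sketch of extracting the two values with a regular expression follows; the function name and pattern are assumptions made for illustration, not the library's own parser.

```python
import re

# Expected path layout: /Archives/edgar/data/<CIK>/<18-digit accession number>/<file>
EDGAR_PATH = re.compile(
    r"/Archives/edgar/data/(?P<cik>[0-9]{1,10})/(?P<accession>[0-9]{18})/"
)


def cik_and_accession_from_url(url: str) -> tuple[str, str]:
    """Return (zero-padded CIK, dashed accession number) found in an EDGAR URL."""
    match = EDGAR_PATH.search(url)
    if match is None:
        raise ValueError(f"not an EDGAR filing URL: {url!r}")
    cik = match["cik"].zfill(10)
    acc = match["accession"]
    return cik, f"{acc[:10]}-{acc[10:12]}-{acc[12:]}"


viewer_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
plain_url = "https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm"
print(cik_and_accession_from_url(viewer_url))
print(cik_and_accession_from_url(plain_url))
```

Both calls yield the same pair, which matches the cik and accession_number fields in the output shown earlier.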

Find latest filings by company ticker or CIK:

from sec_downloader.types import RequestedFilings

metadatas = dl.get_filing_metadatas(
    RequestedFilings(ticker_or_cik="MSFT", form_type="10-K", limit=2)
)
print(metadatas)  # a list of up to two FilingMetadata objects for Microsoft's latest 10-K filings

Alternatively, you can also use any of these to get the same answer:

metadata = dl.get_filing_metadatas("2/msft/10-K")
metadata = dl.get_filing_metadatas("2/789019/10-K")
metadata = dl.get_filing_metadatas("2/0000789019/10-K")

The parameters limit and form_type are optional. If omitted, limit defaults to 1, and form_type defaults to ‘10-Q’.

metadatas = dl.get_filing_metadatas("NFLX")
print(metadatas)
[FilingMetadata(accession_number='0001065280-23-000273',
                form_type='10-Q',
                primary_doc_url='https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm',
                items='',
                primary_doc_description='10-Q',
                filing_date='2023-10-20',
                report_date='2023-09-30',
                cik='0001065280',
                company_name='NETFLIX INC',
                tickers=[Ticker(symbol='NFLX', exchange='Nasdaq')])]

Alternatively, you can also use any of these to get the same answer:

metadata = dl.get_filing_metadatas("nflx")
metadata = dl.get_filing_metadatas("1/NFLX")
metadata = dl.get_filing_metadatas("NFLX/10-Q")
metadata = dl.get_filing_metadatas("1/NFLX/10-Q")
metadata = dl.get_filing_metadatas(RequestedFilings(ticker_or_cik="NFLX"))
metadata = dl.get_filing_metadatas(RequestedFilings(limit=1, ticker_or_cik="NFLX", form_type="10-Q"))
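
The shorthand strings above follow a "limit/ticker_or_cik/form_type" shape in which limit and form_type may be omitted. The sketch below shows one way such strings could be parsed with the stated defaults. FilingQuery is a hypothetical stand-in for RequestedFilings, and the two-part heuristic and the upper-casing of tickers are assumptions; inputs such as "1/320193" or the CIK/Accession Number form are deliberately out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilingQuery:
    """Hypothetical stand-in for RequestedFilings, with the documented defaults."""
    ticker_or_cik: str
    form_type: str = "10-Q"
    limit: int = 1


def parse_query(text: str) -> FilingQuery:
    parts = text.split("/")
    if len(parts) == 1:
        return FilingQuery(parts[0].upper())
    if len(parts) == 3:
        limit, company, form_type = parts
        return FilingQuery(company.upper(), form_type, int(limit))
    if len(parts) == 2:
        first, second = parts
        # Assumed heuristic: a number followed by a part without digits is "limit/ticker";
        # anything else is read as "ticker_or_cik/form_type".
        if first.isdigit() and not any(ch.isdigit() for ch in second):
            return FilingQuery(second.upper(), limit=int(first))
        return FilingQuery(first.upper(), second)
    raise ValueError(f"cannot parse query: {text!r}")


for text in ["nflx", "1/NFLX", "NFLX/10-Q", "1/NFLX/10-Q", "2/msft/10-K"]:
    print(text, "->", parse_query(text))
```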

Download the HTML files

Once you have the Primary Document URL, for example from the metadata, pass it to download_filing to fetch the HTML. The result is decoded to a string before use.

for metadata in metadatas:
    html = dl.download_filing(url=metadata.primary_doc_url).decode()
    print(html[:50])
    break  # same for all filings, let's just print the first one
'<?xml version="1.0" ?><!--XBRL Document Created wi'

Advanced usage: Wrapper

If you would rather wrap the output of sec-edgar-downloader than use the forked/modified version, you can use the wrapper class SecDownloaderWrapper.

Let’s demonstrate how to download a single file (latest 10-Q filing details in HTML format) to memory.

dl = Downloader("MyCompanyName", "email@example.com")
html = dl.get_latest_html("10-Q", "AAPL")
# Use dl.get_latest_n_html("10-Q", "AAPL", n=5) to get the latest 5 10-Qs
print(f"{html[:50]}...")
'<?xml version="1.0" ?><!--XBRL Document Created wi...'

Note: The company name and email address are used to form a user-agent string that adheres to SEC EDGAR’s fair access policy for programmatic downloading.

The get_latest_html call is implemented approximately as:

from sec_edgar_downloader import Downloader as SecEdgarDownloader
from sec_downloader import DownloadStorage

ONLY_HTML = "**/*.htm*"

storage = DownloadStorage(filter_pattern=ONLY_HTML)
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-Q", "AAPL", limit=1, download_details=True)
# all files are now deleted and only stored in memory

content = storage.get_file_contents()[0].content
print(f"{content[:50]}...")
'<?xml version="1.0" ?><!--XBRL Document Created wi...'
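
The company name and email address passed to the downloader end up in the User-Agent header of each request. The sketch below builds such a request with the standard library without sending it. The "<company name> <email>" format and the helper name build_edgar_request are assumptions made for illustration.

```python
from urllib.request import Request


def build_edgar_request(url: str, company_name: str, email: str) -> Request:
    """Build (but do not send) a request that identifies the caller to EDGAR."""
    # Assumed format: "<company name> <contact email>".
    user_agent = f"{company_name} {email}"
    return Request(url, headers={"User-Agent": user_agent})


request = build_edgar_request(
    "https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm",
    "MyCompanyName",
    "email@example.com",
)
# urllib stores header names capitalized, hence "User-agent".
print(request.get_header("User-agent"))
```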

Downloading multiple documents:

storage = DownloadStorage()
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-K", "GOOG", limit=2)
# all files are now deleted and only stored in memory

for path, content in storage.get_file_contents():
    print(f"Path: {path}\nContent [len={len(content)}]: {content[:30]}...\n")
('Path: sec-edgar-filings/GOOG/10-K/0001652044-22-000019/full-submission.txt\n'
 'Content [len=15044932]: <SEC-DOCUMENT>0001652044-22-00...\n')
('Path: sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt\n'
 'Content [len=15264470]: <SEC-DOCUMENT>0001652044-23-00...\n')
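
The paths in the output above follow the layout sec-edgar-filings/<ticker or CIK>/<form type>/<accession number>/<file>. A small sketch of splitting such a path into its parts, for example to group in-memory contents by filing, is given below. FilingPath and split_filing_path are hypothetical names, and the layout is inferred from the two sample paths only.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class FilingPath(NamedTuple):
    ticker_or_cik: str
    form_type: str
    accession_number: str
    file_name: str


def split_filing_path(path: str) -> FilingPath:
    """Split a download path into ticker/CIK, form type, accession number and file name."""
    parts = PurePosixPath(path).parts
    if len(parts) != 5 or parts[0] != "sec-edgar-filings":
        raise ValueError(f"unexpected path layout: {path!r}")
    return FilingPath(*parts[1:])


sample_paths = [
    "sec-edgar-filings/GOOG/10-K/0001652044-22-000019/full-submission.txt",
    "sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt",
]
for sample in sample_paths:
    print(split_filing_path(sample))
```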

Contributing

Follow these steps to install the project locally for development:

  1. Install the project with the command pip install -e ".[dev]".

Note: We highly recommend using virtual environments for Python development. To use one, follow these steps instead:

  • Create a virtual environment: python3 -m venv .venv
  • Activate the virtual environment: source .venv/bin/activate
  • Install the project: pip install -e ".[dev]"

Download files

Source distribution: sec-downloader-0.6.2.tar.gz (12.6 kB)

Built distribution: sec_downloader-0.6.2-py3-none-any.whl (10.5 kB, Python 3)
