sec-downloader
Useful extensions for sec-edgar-downloader. Built with nbdev.
Install
pip install sec_downloader
Features
- Files are downloaded to a temporary folder, immediately read into memory, and then deleted.
- Use a glob pattern to select which files are read into memory.
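The two features above can be pictured with the standard library alone. The following is a minimal sketch of the general technique (write to a temporary folder, read glob matches into memory, let the folder be deleted), not the package's actual implementation. The names read_matching_files and fake_download are hypothetical.

```python
import tempfile
from pathlib import Path


def read_matching_files(write_files, pattern="**/*.htm*"):
    """Run write_files(tmp_dir), then return {relative_path: text} for glob matches.

    Hypothetical helper for illustration; not part of sec_downloader.
    """
    contents = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_files(root)  # stand-in for the real download step
        for file in sorted(root.glob(pattern)):
            if file.is_file():
                contents[file.relative_to(root).as_posix()] = file.read_text()
    # The temporary directory is gone at this point; only `contents` survives.
    return contents


def fake_download(root):
    # Simulates a downloader writing two files into the target folder.
    (root / "filing").mkdir()
    (root / "filing" / "primary-document.html").write_text("<html>10-Q</html>")
    (root / "filing" / "full-submission.txt").write_text("<SEC-DOCUMENT>")


files = read_matching_files(fake_download)
print(files)
```

Only the .html file matches the pattern, so the .txt file is never read into memory.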
How to use
Download the metadata
Find a filing with an Accession Number
from sec_downloader import Downloader
dl = Downloader("MyCompanyName", "email@example.com")
metadata = dl.get_filing_metadatas("AAPL/0000320193-23-000077")
print(metadata[0])
FilingMetadata(accession_number='0000320193-23-000077',
form_type='10-Q',
primary_doc_url='https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm',
items='',
primary_doc_description='10-Q',
filing_date='2023-08-04',
report_date='2023-07-01',
cik='0000320193',
company_name='Apple Inc.',
tickers=[Ticker(symbol='AAPL', exchange='Nasdaq')])
Alternatively, you can also use any of these to get the same answer:
metadata = dl.get_filing_metadatas("aapl/000032019323000077")
metadata = dl.get_filing_metadatas("320193/000032019323000077")
metadata = dl.get_filing_metadatas("320193/0000320193-23-000077")
metadata = dl.get_filing_metadatas("0000320193/0000320193-23-000077")
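# assumption: CompanyAndAccessionNumber is importable from sec_downloader.types, like RequestedFilings
from sec_downloader.types import CompanyAndAccessionNumber
metadata = dl.get_filing_metadatas(CompanyAndAccessionNumber(ticker_or_cik="320193", accession_number="0000320193-23-000077"))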
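The equivalent spellings above differ only in dashes and leading zeros. As an illustration of why they can resolve to the same filing, here is a minimal normalization sketch. The helper names are hypothetical and the library's own parsing may work differently.

```python
import re


def normalize_accession_number(raw: str) -> str:
    """Return the dashed 10-2-6 form, e.g. '0000320193-23-000077'."""
    digits = raw.replace("-", "")
    if not re.fullmatch(r"[0-9]{18}", digits):
        raise ValueError(f"not an accession number: {raw!r}")
    return f"{digits[:10]}-{digits[10:12]}-{digits[12:]}"


def normalize_cik(raw: str) -> str:
    """Zero-pad a numeric CIK to 10 digits, e.g. '320193' -> '0000320193'."""
    if not re.fullmatch(r"[0-9]{1,10}", raw):
        raise ValueError(f"not a CIK: {raw!r}")
    return raw.zfill(10)


# Both spellings of the accession number collapse to one canonical form.
print(normalize_accession_number("000032019323000077"))
print(normalize_accession_number("0000320193-23-000077"))
print(normalize_cik("320193"))
```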
Find the filing matching an SEC EDGAR filing URL. Only the CIK and Accession Number are taken from the URL:
metadatas = dl.get_filing_metadatas(
"https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
)
print(metadatas[0])
FilingMetadata(accession_number='0001193125-23-272204',
form_type='8-K',
primary_doc_url='https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm',
items='2.02,9.01',
primary_doc_description='8-K',
filing_date='2023-11-07',
report_date='2023-11-04',
cik='0001067983',
company_name='BERKSHIRE HATHAWAY INC',
tickers=[Ticker(symbol='BRK-B', exchange='NYSE'),
Ticker(symbol='BRK-A', exchange='NYSE')])
Alternatively, you can also use URLs in other formats and get the same answer:
metadata = dl.get_filing_metadatas("https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm")
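Since only the CIK and Accession Number are used from the URL, both URL formats above carry the same information. A minimal sketch of extracting those two values with a regular expression follows. The function name and the pattern are assumptions made for illustration, not the library's implementation.

```python
import re

# Matches ".../Archives/edgar/data/<CIK>/<18-digit accession>/..." anywhere in the URL.
_EDGAR_PATH = re.compile(
    r"/Archives/edgar/data/(?P<cik>[0-9]{1,10})/(?P<acc>[0-9]{18})/"
)


def cik_and_accession_from_url(url: str) -> tuple[str, str]:
    """Hypothetical helper: return (zero-padded CIK, dashed accession number)."""
    match = _EDGAR_PATH.search(url)
    if match is None:
        raise ValueError(f"no CIK/accession number found in {url!r}")
    acc = match["acc"]
    return match["cik"].zfill(10), f"{acc[:10]}-{acc[10:12]}-{acc[12:]}"


viewer_url = "https://www.sec.gov/ix?doc=/Archives/edgar/data/0001067983/000119312523272204/d564412d8k.htm"
archive_url = "https://www.sec.gov/Archives/edgar/data/1067983/000119312523272204/d564412d8k.htm"
print(cik_and_accession_from_url(viewer_url))
print(cik_and_accession_from_url(archive_url))
```

Both calls yield the same pair, which matches the cik and accession_number fields in the metadata printed above.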
Find latest filings by company ticker or CIK:
from sec_downloader.types import RequestedFilings
metadatas = dl.get_filing_metadatas(
    RequestedFilings(ticker_or_cik="MSFT", form_type="10-K", limit=2)
)
print(metadatas)
This prints a list of up to two FilingMetadata objects, one for each of the latest MSFT 10-K filings.
Alternatively, you can also use any of these to get the same answer:
metadata = dl.get_filing_metadatas("2/msft/10-K")
metadata = dl.get_filing_metadatas("2/789019/10-K")
metadata = dl.get_filing_metadatas("2/0000789019/10-K")
The parameters limit and form_type are optional. If omitted, limit defaults to 1 and form_type defaults to ‘10-Q’.
metadatas = dl.get_filing_metadatas("NFLX")
print(metadatas)
[FilingMetadata(accession_number='0001065280-23-000273',
form_type='10-Q',
primary_doc_url='https://www.sec.gov/Archives/edgar/data/1065280/000106528023000273/nflx-20230930.htm',
items='',
primary_doc_description='10-Q',
filing_date='2023-10-20',
report_date='2023-09-30',
cik='0001065280',
company_name='NETFLIX INC',
tickers=[Ticker(symbol='NFLX', exchange='Nasdaq')])]
Alternatively, you can also use any of these to get the same answer:
metadata = dl.get_filing_metadatas("nflx")
metadata = dl.get_filing_metadatas("1/NFLX")
metadata = dl.get_filing_metadatas("NFLX/10-Q")
metadata = dl.get_filing_metadatas("1/NFLX/10-Q")
metadata = dl.get_filing_metadatas(RequestedFilings(ticker_or_cik="NFLX"))
metadata = dl.get_filing_metadatas(RequestedFilings(limit=1, ticker_or_cik="NFLX", form_type="10-Q"))
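To make the shorthand strings and their defaults concrete, here is a minimal sketch of a parser for the "limit/ticker_or_cik/form_type" style shown above. It is an illustration only: FilingQuery and parse_query are hypothetical names, and the two-segment heuristic (a trailing segment that contains a digit is treated as a form type) is an assumption. It does not handle accession-number strings or a "limit/CIK" pair, which the library may treat differently.

```python
from typing import NamedTuple


class FilingQuery(NamedTuple):
    limit: int
    ticker_or_cik: str
    form_type: str


def parse_query(text: str, default_limit: int = 1, default_form: str = "10-Q") -> FilingQuery:
    """Hypothetical parser for strings such as '2/msft/10-K' or 'NFLX'."""
    parts = text.split("/")
    if len(parts) == 1:
        return FilingQuery(default_limit, parts[0], default_form)
    if len(parts) == 3:
        return FilingQuery(int(parts[0]), parts[1], parts[2])
    if len(parts) == 2:
        first, second = parts
        # Assumed heuristic: form types such as "10-Q" or "8-K" contain a digit,
        # while ticker symbols usually do not.
        if any(ch.isdigit() for ch in second):
            return FilingQuery(default_limit, first, second)
        return FilingQuery(int(first), second, default_form)
    raise ValueError(f"unsupported query: {text!r}")


for query in ["NFLX", "1/NFLX", "NFLX/10-Q", "1/NFLX/10-Q", "2/msft/10-K"]:
    print(query, "->", parse_query(query))
```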
Download the HTML files
After obtaining the Primary Document URL, for example from the metadata, you can proceed to download the HTML using this URL.
for metadata in metadatas:
    html = dl.download_filing(url=metadata.primary_doc_url).decode()
    print(html[:50])
    break  # same for all filings, let's just print the first one
'<?xml version="1.0" ?><!--XBRL Document Created wi'
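The primary document URLs printed in the metadata examples above appear to follow a regular pattern: the CIK without leading zeros, then the accession number without dashes, then the document file name. The sketch below reconstructs a URL from those parts. The pattern is inferred from the examples in this document, and build_primary_doc_url is a hypothetical helper. In practice, prefer the primary_doc_url field from the metadata.

```python
def build_primary_doc_url(cik: str, accession_number: str, filename: str) -> str:
    """Hypothetical helper: assemble an EDGAR archive URL from its parts."""
    cik_part = str(int(cik))  # '0000320193' -> '320193'
    accession_part = accession_number.replace("-", "")  # drop the dashes
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{cik_part}/{accession_part}/{filename}"
    )


url = build_primary_doc_url("0000320193", "0000320193-23-000077", "aapl-20230701.htm")
print(url)
```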
Advanced usage: Wrapper
If, instead of using the forked/modified sec-edgar-downloader, you want to wrap its output, you can use the wrapper class SecDownloaderWrapper.
Let’s demonstrate how to download a single file (latest 10-Q filing details in HTML format) to memory.
dl = Downloader("MyCompanyName", "email@example.com")
html = dl.get_latest_html("10-Q", "AAPL")
# Use dl.get_latest_n_html("10-Q", "AAPL", n=5) to get the latest 5 10-Qs
print(f"{html[:50]}...")
'<?xml version="1.0" ?><!--XBRL Document Created wi...'
Note: The company name and email address are used to form a user-agent string that adheres to SEC EDGAR’s fair access policy for programmatic downloading.
The get_latest_html call above is implemented approximately as:
from sec_edgar_downloader import Downloader as SecEdgarDownloader
from sec_downloader import DownloadStorage
ONLY_HTML = "**/*.htm*"
storage = DownloadStorage(filter_pattern=ONLY_HTML)
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-Q", "AAPL", limit=1, download_details=True)
# all files are now deleted and only stored in memory
content = storage.get_file_contents()[0].content
print(f"{content[:50]}...")
'<?xml version="1.0" ?><!--XBRL Document Created wi...'
Downloading multiple documents:
storage = DownloadStorage()
with storage as path:
    dl = SecEdgarDownloader("MyCompanyName", "email@example.com", path)
    dl.get("10-K", "GOOG", limit=2)
# all files are now deleted and only stored in memory
for path, content in storage.get_file_contents():
    print(f"Path: {path}\nContent [len={len(content)}]: {content[:30]}...\n")
('Path: sec-edgar-filings/GOOG/10-K/0001652044-22-000019/full-submission.txt\n'
'Content [len=15044932]: <SEC-DOCUMENT>0001652044-22-00...\n')
('Path: sec-edgar-filings/GOOG/10-K/0001652044-23-000016/full-submission.txt\n'
'Content [len=15264470]: <SEC-DOCUMENT>0001652044-23-00...\n')
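The note above on the fair access policy says that the company name and email address form a user-agent string. As a rough illustration, the sketch below attaches such a header to a request object using only the standard library, without sending anything. The "name email" layout is an assumption about how the two values are combined, and build_edgar_request is a hypothetical helper, not part of sec_downloader.

```python
from urllib.request import Request


def build_edgar_request(url: str, company_name: str, email: str) -> Request:
    """Hypothetical helper: prepare (but do not send) a request with a User-Agent."""
    user_agent = f"{company_name} {email}"  # assumed layout: "<name> <email>"
    return Request(url, headers={"User-Agent": user_agent})


request = build_edgar_request(
    "https://www.sec.gov/Archives/edgar/data/320193/000032019323000077/aapl-20230701.htm",
    "MyCompanyName",
    "email@example.com",
)
# urllib stores header names capitalized, hence "User-agent" here.
print(request.get_header("User-agent"))
```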
Contributing
Follow these steps to install the project locally for development:
- Install the project with the command pip install -e ".[dev]".
Note: We highly recommend using virtual environments for Python development. To use one, follow these steps instead:
- Create a virtual environment
python3 -m venv .venv
- Activate the virtual environment
source .venv/bin/activate
- Install the project with the command
pip install -e ".[dev]"