SEC Web Scraper for the EDGAR API
sec-web-scraper
A Python-based web scraper for the SEC EDGAR database
Overview
This library is for scraping certain financial documents from the EDGAR database, such as the 10-K (and its variants, such as 10-K405 and 10-KSB), 20-F, and 40-F.
The two main features of the library will be:
- A document downloader portion that will fetch documents from the EDGAR database based on parameters such as a text query, time period, company ticker, and file type.
- A scraper that will parse sections and information from the retrieved files.
Installation
Please make sure you have Python 3.7 or higher.
You can check your Python version with
python --version
Then run the command below!
pip install sec-web-scraper
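If you would rather check the version requirement from inside Python (for example in a setup script), a minimal sketch could look like the one below. The helper name `python_is_supported` is an illustrative assumption and is not part of sec-web-scraper.

```python
import sys

# Minimum version taken from the installation note above.
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if python_is_supported():
    print("Python version is new enough for sec-web-scraper")
else:
    print("Please upgrade to Python 3.7 or higher")
```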
Usage
# Downloader
from sec_web_scraper.Downloader import Downloader
# Create a new downloader object
d = Downloader()
# Set the year range for the filing index
d.build_index_sec(2000, 2002)
# Once the index is built, list every form type filed in that period
d.get_forms()
# Look up a company's CIK by name (fuzzy match); returns a list
d.get_company_info('apple')
# Find all 8-Ks filed in the range above; the result is a DataFrame
res = d.find_files_by_type('8-K')
# More features to be added!
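This README does not describe how `build_index_sec` assembles its index. For orientation, EDGAR publishes quarterly index files (for example `master.idx` under `https://www.sec.gov/Archives/edgar/full-index/<year>/QTR<n>/`) whose data rows are pipe-delimited as CIK, company name, form type, filing date, and file name. The sketch below parses rows in that layout and filters them by form type. The function names `parse_master_index` and `filter_by_form` and the sample rows are made up for illustration; they are not part of sec-web-scraper and may not match what the library does internally.

```python
from typing import Dict, Iterable, List

# Column order of the data rows in an EDGAR master.idx file.
FIELDS = ("cik", "company", "form_type", "date_filed", "filename")


def parse_master_index(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Turn master.idx-style lines into a list of row dictionaries."""
    rows = []
    for line in lines:
        parts = line.strip().split("|")
        # Data rows have exactly five fields and a numeric CIK. This skips
        # the free-text preamble, the column header, and the dashed rule.
        if len(parts) == 5 and parts[0].isdigit():
            rows.append(dict(zip(FIELDS, parts)))
    return rows


def filter_by_form(rows: List[Dict[str, str]], form_type: str) -> List[Dict[str, str]]:
    """Keep only the rows whose form type matches exactly."""
    return [row for row in rows if row["form_type"] == form_type]


# Invented sample in the master.idx layout (not real filings).
SAMPLE = """\
Description: Master Index of EDGAR Dissemination Feed
CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------
1234|EXAMPLE CORP|8-K|2000-01-15|edgar/data/1234/0000000000-00-000001.txt
5678|SAMPLE INC|10-K|2000-03-30|edgar/data/5678/0000000000-00-000002.txt
""".splitlines()

rows = parse_master_index(SAMPLE)
eight_ks = filter_by_form(rows, "8-K")
print(eight_ks[0]["company"])
```

An exact match on the form type keeps variants such as 8-K/A out of the result; a prefix match would be the alternative if amendments should be included.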
# Scraper
from sec_web_scraper.Scraper import get_document_given_link, get_document_tags
# Start from the link to a particular filing
sample_10k = "https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt"
# Get the raw text
raw_txt = get_document_given_link(sample_10k)
# Get the sections in the document
doc_tags = get_document_tags(raw_txt)
# More features to be added!
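To give a sense of what the raw text contains: EDGAR full-text submissions (the `.txt` files, like the sample link above) wrap each contained file in `<DOCUMENT>` ... `</DOCUMENT>`, with header lines such as `<TYPE>` and `<SEQUENCE>` ahead of the `<TEXT>` body. The sketch below splits such a submission into its documents. It is not necessarily how `get_document_tags` works; the function name `split_documents` and the sample string are assumptions made for illustration.

```python
import re
from typing import Dict, List

# Each contained file sits between <DOCUMENT> and </DOCUMENT>.
DOCUMENT_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL | re.IGNORECASE)
# Header tags are unclosed, one per line, with the value after the tag.
HEADER_RE = re.compile(
    r"^<(TYPE|SEQUENCE|FILENAME|DESCRIPTION)>(.*)$",
    re.MULTILINE | re.IGNORECASE,
)


def split_documents(raw: str) -> List[Dict[str, str]]:
    """Return one dict per <DOCUMENT> block with header fields and body text."""
    docs = []
    for match in DOCUMENT_RE.finditer(raw):
        block = match.group(1)
        # Only look for header tags before the <TEXT> body starts.
        parts = re.split(r"<TEXT>", block, maxsplit=1, flags=re.IGNORECASE)
        header = parts[0]
        body = parts[1] if len(parts) == 2 else ""
        body = re.sub(r"</TEXT>\s*$", "", body, flags=re.IGNORECASE).strip()
        doc = {tag.lower(): value.strip() for tag, value in HEADER_RE.findall(header)}
        doc["text"] = body
        docs.append(doc)
    return docs


# Invented sample in the submission layout (not a real filing).
SAMPLE = """<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<DESCRIPTION>ANNUAL REPORT
<TEXT>
Item 1. Business
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-27
<SEQUENCE>2
<TEXT>
Financial data schedule
</TEXT>
</DOCUMENT>
"""

docs = split_documents(SAMPLE)
print([doc["type"] for doc in docs])
```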
References
- Python project template from https://github.com/ColumbiaOSS/example-project-python
Download files
Source Distribution
sec-web-scraper-0.1.1.tar.gz (33.7 kB)
File details
Details for the file sec-web-scraper-0.1.1.tar.gz.
File metadata
- Download URL: sec-web-scraper-0.1.1.tar.gz
- Upload date:
- Size: 33.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | f499d608e4c875a42020669ef6e437c785af275443d3c3cd2682eb910236ba15
MD5 | 9426c6bea26daae0e0fa3817a5c46052
BLAKE2b-256 | c67d139c839c99f5b9c3f92632a8b0727e2b18b80a82cd7084144363eaa28f58