SEC Web Scraper for the EDGAR API

sec-web-scraper

A Python based web scraper for the SEC EDGAR database

Overview

This library is for scraping certain financial documents from the EDGAR database, such as the 10-K (and its variants, such as 10-K405 and 10-KSB), 20-F, and 40-F.

The two main features of the library will be:

  • A document downloader portion that will fetch documents from the EDGAR database based on parameters such as a text query, time period, company ticker, and file type.
  • A scraper that will parse sections and information from the retrieved files.
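To make the downloader's parameters concrete, here is a minimal sketch of filtering filing records by form type, company name, and date range. It does not use the library: the `Filing` record and the `filter_filings` helper are hypothetical names invented for illustration, and the prefix match on the form type is only one possible way to catch variants such as 10-K405 and 10-KSB.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Filing:
    """Hypothetical record for one filing (not the library's own type)."""
    company: str
    form_type: str
    date_filed: date


def filter_filings(
    filings: Iterable[Filing],
    form_prefix: Optional[str] = None,
    name_query: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[Filing]:
    """Keep filings that satisfy every criterion that was supplied."""
    matches = []
    for filing in filings:
        # A prefix match lets "10-K" also select 10-K405 and 10-KSB.
        if form_prefix and not filing.form_type.upper().startswith(form_prefix.upper()):
            continue
        # Case-insensitive substring match on the company name.
        if name_query and name_query.lower() not in filing.company.lower():
            continue
        # Both date bounds are inclusive.
        if start and filing.date_filed < start:
            continue
        if end and filing.date_filed > end:
            continue
        matches.append(filing)
    return matches
```

Note that a plain prefix match on "10-K" also selects amended forms such as 10-K/A; a stricter filter would compare against an explicit list of form types.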

Installation

Please make sure you have Python 3.7 or higher.

You can check your Python version with:

python --version

Then install the package:

pip install sec-web-scraper
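The same checks can be done from inside Python. The sketch below uses only the standard library; the helper names `python_is_supported` and `package_is_installed` are hypothetical, and the import name `sec_web_scraper` is taken from the usage example in this README.

```python
import importlib.util
import sys

MINIMUM_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) pair meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def package_is_installed(name="sec_web_scraper"):
    """Return True when the import system can locate the package."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Package found:", package_is_installed())
```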

Usage

# Downloader
from sec_web_scraper.Downloader import Downloader

# Create new downloader object
d = Downloader()

# Build the filing index for a range of years
d.build_index_sec(2000, 2002)

# After building the index, list all form types filed in that period
d.get_forms()

# To find a company's CIK, provide its name (fuzzy match). Returns a list
d.get_company_info('apple')

# Find all 8-Ks filed in the range above. Returns a DataFrame
res = d.find_files_by_type('8-K')

# More features to be added!
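For context on what an index over a year range can contain, here is a sketch based on EDGAR's public quarterly `master.idx` files, which list one filing per line as pipe-separated fields (CIK, company name, form type, date filed, filename). This is not the library's implementation, which is not shown here. The helpers `index_urls`, `parse_master_index`, and `filing_url` are hypothetical, and treating the end year as inclusive is an assumption.

```python
from typing import Dict, Iterator, List

# Quarterly index location on EDGAR.
EDGAR_INDEX_URL = (
    "https://www.sec.gov/Archives/edgar/full-index/{year}/QTR{quarter}/master.idx"
)
ARCHIVES_ROOT = "https://www.sec.gov/Archives/"
FIELDS = ("cik", "company", "form_type", "date_filed", "filename")


def index_urls(start_year: int, end_year: int) -> List[str]:
    """List one index URL per quarter; end_year is included (an assumption)."""
    return [
        EDGAR_INDEX_URL.format(year=year, quarter=quarter)
        for year in range(start_year, end_year + 1)
        for quarter in range(1, 5)
    ]


def parse_master_index(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, skipping the preamble and header."""
    for line in text.splitlines():
        parts = line.strip().split("|")
        # Data rows have five fields and a numeric CIK; the column
        # header also has five fields but fails the isdigit check.
        if len(parts) == 5 and parts[0].isdigit():
            yield dict(zip(FIELDS, parts))


def filing_url(row: Dict[str, str]) -> str:
    """Turn the relative filename from an index row into a full URL."""
    return ARCHIVES_ROOT + row["filename"]
```

The sketch deliberately leaves out the download step. When fetching from sec.gov, note that the SEC asks automated clients to identify themselves with a User-Agent header and to limit their request rate.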
# Scraper
from sec_web_scraper.Scraper import get_document_given_link, get_document_tags

# With a particular filing
sample_10k = "https://www.sec.gov/Archives/edgar/data/20/0000893220-96-000500.txt"

# Get the raw text
raw_txt = get_document_given_link(sample_10k)

# Get the sections in the document
doc_tags = get_document_tags(raw_txt)

# More features to be added!
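As background on the raw text being parsed: an EDGAR full-submission `.txt` file wraps each document in `<DOCUMENT>` ... `</DOCUMENT>`, with a `<TYPE>` line naming the form or exhibit and the body inside `<TEXT>` ... `</TEXT>`. The sketch below splits such a file on those tags. It is independent of `get_document_tags`, whose output format is not documented here, and `split_documents` is a hypothetical helper.

```python
import re
from typing import Dict, List

DOCUMENT_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL | re.IGNORECASE)
TYPE_RE = re.compile(r"<TYPE>([^\n<]+)", re.IGNORECASE)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL | re.IGNORECASE)


def split_documents(raw_txt: str) -> List[Dict[str, str]]:
    """Return the type and body text of each <DOCUMENT> in a submission."""
    documents = []
    for match in DOCUMENT_RE.finditer(raw_txt):
        block = match.group(1)
        type_match = TYPE_RE.search(block)
        text_match = TEXT_RE.search(block)
        documents.append(
            {
                # The type value runs to the end of its line, e.g. "10-K".
                "type": type_match.group(1).strip() if type_match else "",
                "text": text_match.group(1).strip() if text_match else "",
            }
        )
    return documents
```

With `raw_txt` from the example above, `split_documents(raw_txt)` would return one entry per document in the submission, so the main form can be separated from its exhibits before any section parsing.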

