Crunchbase and SEC filing scraper
CB_IPO
This library provides quick web scraping of SEC filings, with a focus on new IPOs and annual reports.
Overview
Researching company trends can be incredibly tedious. This library automates part of the process, making DCF building and IPO research easier. CB_IPO fetches information on recent and historical IPOs by scraping the SEC EDGAR database for S-1 filings. These queries can also be modified to search specific date ranges and additional form types. The results can then be placed in a pandas dataframe for easy viewing. A second process the library automates is finding a company's 10-K filings: given a CIK, it returns a list of 10-K filing links. The library also provides a function that takes a 10-K link and returns a list of items such as assets, liabilities, and income.
Example
Suppose I want to find companies that have filed an S-1, 10-K, or 10-Q between January 2021 and March 2023:
sc = scrape()
sc.set_search_date("2021-01-01", "2023-03-31")
sc.add_forms(['S-1','10-K', '10-Q'])
dataframe = sc.generate_df(10, 1)
This returns the following dataframe:
names filing date
0 Inhibikase Therapeutics, Inc. (IKT) 2023-03-31
1 AMERINST INSURANCE GROUP LTD 2023-03-31
2 SLM Student Loan Trust 2013-5 2023-03-31
3 Games & Esports Experience Acquisition Corp. ... 2023-03-31
4 Bilander Acquisition Corp. (TWCB, TWCBU, TWCBW) 2023-03-31
5 VirTra, Inc (VTSI) 2023-03-31
6 Actinium Pharmaceuticals, Inc. (ATNM) 2023-03-31
7 Genprex, Inc. (GNPX) 2023-03-31
8 Mega Matrix Corp. (MPU) 2023-03-31
9 Digital Media Solutions, Inc. (DMS, DMS-WT) 2023-03-31
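The names column in the output above combines each company name with its ticker symbols in parentheses. As a minimal sketch, assuming you want the two separated, the standard library is enough. The helper split_name is hypothetical and is not part of CB_IPO; it also assumes that a trailing parenthesised group always holds tickers, which may not be true for every filer name.

```python
import re

# Hypothetical helper, not part of CB_IPO: split "Company (TICK1, TICK2)"
# into the company name and a list of ticker symbols.
_TICKER_SUFFIX = re.compile(r"^(?P<name>.*?)\s*\((?P<tickers>[^()]*)\)\s*$")


def split_name(entry):
    """Return (company_name, tickers) for one value of the names column."""
    match = _TICKER_SUFFIX.match(entry)
    if match is None:
        # No trailing parenthesised group, e.g. "AMERINST INSURANCE GROUP LTD"
        return entry.strip(), []
    tickers = [t.strip() for t in match.group("tickers").split(",") if t.strip()]
    return match.group("name").strip(), tickers


if __name__ == "__main__":
    samples = [
        "VirTra, Inc (VTSI)",
        "Bilander Acquisition Corp. (TWCB, TWCBU, TWCBW)",
        "AMERINST INSURANCE GROUP LTD",
    ]
    for sample in samples:
        print(split_name(sample))
```

Rows without a parenthesised suffix, such as trusts with no listed ticker, come back with an empty ticker list rather than raising an error.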
Installation
CB_IPO can be installed from PyPI by running:
pip install CB_IPO
Quick Start
To use CB_IPO, create an instance by calling
instance = scrape()
To adjust the search date range, run (dates in YYYY-MM-DD format)
instance.set_search_date(START, END)
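Because the dates are passed as strings, it can help to check them before starting a search. The sketch below is one way to do that with the standard library. The function validate_date_range is hypothetical and not part of CB_IPO, and how set_search_date itself handles malformed input is not documented here.

```python
from datetime import datetime


def validate_date_range(start, end):
    """Parse two year-month-day strings and check that start is not after end.

    Hypothetical pre-check, not part of CB_IPO.
    """
    # strptime raises ValueError if a string does not parse as year-month-day
    start_date = datetime.strptime(start, "%Y-%m-%d").date()
    end_date = datetime.strptime(end, "%Y-%m-%d").date()
    if start_date > end_date:
        raise ValueError(f"start date {start} is after end date {end}")
    return start_date, end_date


if __name__ == "__main__":
    print(validate_date_range("2021-01-01", "2023-03-31"))
```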
To add form types to the search run
instance.add_forms(['S-1','10-K'])
To get a dataframe of all companies filing within the specified parameters and filing dates run
instance.generate_df(entries_per_page, number_of_pages)
To get a list of links to a company's 10-K filings given its CIK run
instance.create_links(cik, number_of_files_needed)
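A CIK (Central Index Key) is the numeric identifier EDGAR assigns to each filer. It has at most 10 digits and is often written zero-padded to that width. Whether create_links expects a padded or unpadded value is not stated here, so the following is only a sketch of normalising CIKs collected from mixed sources. The helper normalize_cik is hypothetical and not part of CB_IPO.

```python
def normalize_cik(cik):
    """Return a CIK as a 10-character, zero-padded string of digits.

    Hypothetical helper, not part of CB_IPO. Accepts an int or a string.
    """
    digits = str(cik).strip()
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"CIK must contain only digits, got {cik!r}")
    if len(digits) > 10:
        raise ValueError(f"CIK has more than 10 digits: {cik!r}")
    return digits.zfill(10)


if __name__ == "__main__":
    print(normalize_cik(320193))
    print(normalize_cik("0000320193"))
```

If the unpadded form is needed instead, int(normalize_cik(value)) converts back.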
To scrape a 10-K link for elements like assets, liabilities, and Net Income run
instance.scrape_xbrl(link)
To calculate financial ratios from a dictionary of financial elements run
instance.calculate_ratios(link)
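To illustrate the kind of calculation involved, here is a minimal sketch that derives a few common ratios from a dictionary of financial elements. It is not the library's implementation: the function simple_ratios, the key names "Assets", "Liabilities", and "NetIncome", and the choice of ratios are all assumptions made for this example.

```python
def simple_ratios(elements):
    """Compute a few balance-sheet ratios from a dict of financial elements.

    Illustrative only, not CB_IPO's calculate_ratios. The key names used
    here are assumed for the example.
    """
    assets = elements["Assets"]
    liabilities = elements["Liabilities"]
    net_income = elements["NetIncome"]
    if assets == 0:
        raise ValueError("total assets must be non-zero")

    # Equity derived from the accounting identity: assets = liabilities + equity
    equity = assets - liabilities
    return {
        "debt_to_assets": liabilities / assets,
        "return_on_assets": net_income / assets,
        # Undefined when equity is zero, so report None in that case
        "debt_to_equity": liabilities / equity if equity != 0 else None,
    }


if __name__ == "__main__":
    sample = {"Assets": 200.0, "Liabilities": 100.0, "NetIncome": 20.0}
    print(simple_ratios(sample))
```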
To get a dataframe summarizing the 10-K elements run with an optional flag
instance.summarize_10k(link, flag)
Details
This project is a pure python project using modern tooling. It uses a Makefile
as a command registry, with the following commands:
- make: list available commands
- make develop: install and build this library and its dependencies using pip
- make build: build the library using setuptools
- make lint: perform static analysis of this library with flake8 and black
- make format: autoformat this library using black
- make annotate: run type checking using mypy
- make test: run automated tests with pytest
- make coverage: run automated tests with pytest and collect coverage information
- make dist: package library for distribution
Download files
Source Distribution
File details
Details for the file CB_IPO-0.2.0.tar.gz.
File metadata
- Download URL: CB_IPO-0.2.0.tar.gz
- Upload date:
- Size: 24.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | a7dcdd2a0c70e6032045e094cd72d056061767aebf1cd163e0f921f5814b725d
MD5 | 2dea52a14ef1fa1ac0d408e1be270159
BLAKE2b-256 | 5ba5a82769bf2f7e5363b39c91cae5cfe6e01e286af1275bc2280b277e1dc491