Python tool to collect SEC filings for all publicly traded companies. Easily fetch forms like 10-K, 10-Q, and 8-K, along with links and document contents. Ideal for analysts, researchers, and anyone exploring financial reports or SEC data. Simplify your access to essential company information.
secfi Library
secfi is a free Python library that simplifies access to SEC (U.S. Securities and Exchange Commission) filings and performs basic web scraping of the retrieved documents.
- Installation
- Features
  - 1. getCiks: Full and up-to-date ticker/CIK securities DataFrame (+10k tickers)
  - 2. getFils: Get a DataFrame with all SEC filings for a specific ticker
  - 3. scrapLatest: Retrieves plain text content of the latest SEC filing of a specified form type for a given company ticker
  - 4. scrap: Scrapes the raw text content of a given URL
  - 5. secForms: Provides a list of all available SEC form types
  - 6. chunkText: Splits a long text into evenly distributed chunks with overlap
- Notes
- License
You can try this in a free Colab notebook:
## Installation
pip install secfi
## Features
1. getCiks()
Fetches a DataFrame of all company tickers and their corresponding Central Index Keys (CIKs).
import secfi
ciks = secfi.getCiks()
print(ciks.head())
Returns: A DataFrame with columns:
- cik_str – The raw CIK string.
- title – The company name.
- cik – The CIK padded to 10 digits (for SEC queries).
| ticker | cik_str | title | cik |
|--------|----------|-----------------------------|------------|
| NVDA | 1045810 | NVIDIA CORP | 0001045810 |
| AAPL | 320193 | Apple Inc. | 0000320193 |
| MSFT | 789019 | MICROSOFT CORP | 0000789019 |
| AMZN | 1018724 | AMAZON COM INC | 0001018724 |
| GOOGL | 1652044 | Alphabet Inc. | 0001652044 |
| ... | ... | ... | ... |
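The relationship between the `cik_str` and `cik` columns in the table above can be reproduced by hand. The following is a minimal sketch, not secfi's own code; `pad_cik` is a hypothetical helper name introduced only for illustration.

```python
def pad_cik(cik_str) -> str:
    """Left-pad a raw CIK with zeros to the 10-digit form shown in the `cik` column."""
    return str(cik_str).strip().zfill(10)


print(pad_cik(320193))     # 0000320193
print(pad_cik("1045810"))  # 0001045810
```

This is handy when you already hold a raw CIK from another source and need the padded form for an SEC query.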
2. getFils(ticker: str)
Fetches recent filings for a specific company by its ticker.
import secfi
filings = secfi.getFils("AAPL")
print(filings.head())
Parameters:
- ticker (str): The company's ticker symbol.
Returns: A DataFrame like:
| filingDate | reportDate | form | filmNumber | size | isXBRL | url |
|------------|------------|---------|------------|---------|--------|--------------|
| 2024-11-01 | 2024-09-30 | 10-Q | 241416538 | 9185722 | 1 | sec.gov/... |
| 2024-08-02 | 2024-06-30 | 10-Q | 241168331 | 8114974 | 1 | sec.gov/... |
| 2024-05-01 | 2024-03-31 | 10-Q | 24899170 | 7428154 | 1 | sec.gov/... |
| 2024-04-11 | 2024-05-22 | DEF 14A | 24836785 | 8289378 | 1 | sec.gov/... |
| 2024-02-02 | 2023-12-31 | 10-K | 24588330 | 12110804| 1 | sec.gov/... |
| 2023-10-27 | 2023-09-30 | 10-Q | 231351529 | 7894342 | 1 | sec.gov/... |
| ... | ... | ... | ... | ... | ... | ... |
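Once you have the filings, a common next step is to pick the most recent filing of one form type. The sketch below uses plain dictionaries with rows copied from the sample table above, so it runs without network access; `latest_of_form` is a hypothetical helper, and the column names are assumed to match the table. With the real DataFrame, an ordinary pandas filter on the `form` column would likely serve the same purpose.

```python
from datetime import date

# Rows copied from the sample table above (only the columns needed here).
filings = [
    {"filingDate": "2024-11-01", "reportDate": "2024-09-30", "form": "10-Q"},
    {"filingDate": "2024-08-02", "reportDate": "2024-06-30", "form": "10-Q"},
    {"filingDate": "2024-04-11", "reportDate": "2024-05-22", "form": "DEF 14A"},
    {"filingDate": "2024-02-02", "reportDate": "2023-12-31", "form": "10-K"},
]


def latest_of_form(rows, form):
    """Return the row with the newest filingDate for `form`, or {} if none match."""
    matches = [row for row in rows if row["form"] == form]
    if not matches:
        return {}
    return max(matches, key=lambda row: date.fromisoformat(row["filingDate"]))


latest_10q = latest_of_form(filings, "10-Q")
print(latest_10q["filingDate"])  # 2024-11-01
```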
3. scrapLatest(ticker: str, form: str)
Retrieves the textual content of the latest SEC filing of a specific form type for a given ticker.
The SEC provides 165 different types of forms. You can find the complete list in the following CSV file:
10 Most Common Forms for Public Companies:
- 10-K: Annual report that provides a comprehensive overview of the company's business and financial condition.
- 10-Q: Quarterly report that includes unaudited financial statements and provides a continuing view of the company's financial position.
- 8-K: Report used to announce major events that shareholders should know about (e.g., acquisitions, leadership changes).
- S-1: Registration statement for companies planning to go public with an initial public offering (IPO).
- S-3: Registration statement for secondary offerings or resales of securities.
- DEF 14A: Proxy statement used for shareholder meetings, including executive compensation and voting matters.
- 4: Statement of changes in beneficial ownership (insider trading disclosures).
- 3: Initial statement of beneficial ownership of securities (insider ownership).
- 6-K: Report submitted by foreign private issuers to disclose information provided to their home country's regulators.
- 13D: Filing by anyone acquiring more than 5% of a company's shares, detailing their intentions.
10 Most Common Forms for Foreign Companies:
- 6-K: Quarterly or event-specific report submitted by foreign private issuers, serving a similar role to the 10-Q for U.S. companies.
- 20-F: Annual report for foreign private issuers, equivalent to the 10-K for U.S. companies.
- 40-F: Annual report filed by certain Canadian companies under the U.S.-Canada Multijurisdictional Disclosure System.
- F-1: Registration statement for foreign companies planning an initial public offering (IPO) in the U.S.
- F-3: Registration statement for foreign companies conducting secondary offerings in the U.S.
- F-4: Registration statement for mergers, acquisitions, or business combinations involving foreign companies.
- CB: Filing required for tender offers, rights offerings, or business combinations involving foreign private issuers.
- 13F: Quarterly report by institutional investment managers disclosing equity holdings, applicable to some foreign firms.
- 11-K: Annual report for employee stock purchase, savings, and similar plans for foreign issuers.
- SD: Specialized disclosure report, often related to conflict minerals, applicable to foreign private issuers with U.S. reporting obligations.
Example
import secfi
secfi.scrapLatest("NVDA", "10-Q")
Example Output
When calling the scrapLatest("NVDA", "10-Q") function, the returned dictionary might look like this:
{
'filingDate': '2024-11-27',
'reportDate': '2024-11-25',
'form': '4',
'filmNumber': '',
'size': 4872,
'isXBRL': 0,
'url': 'https://www.sec.gov/Archives/edgar/data/0001045810/000104581024000318/xslF345X05/wk-form4_1732744744.xml',
'acceptanceDateTime': '2024-11-27T16:59:12.000Z',
'text': 'STATESSECURITIES AND EXCHANGE COMMISSIONWashington, D.C.\nFor the quarterly period ended October, 2024 OR TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934Commission File Number: 0-23985 NVIDIA CORPORATION(Exact name of registrant as specified in its charter) Delaware94-3177549(State or other jurisdiction of(I.R.S. Employerincorporation or organization)Identification No.)2788 San Tomas Expressway, Santa Clara, California95051(Address\xa0of principal executive offices)(Zip Code)(408) 486-2000 ....'
}
Parameters:
- ticker (str): The company's ticker symbol.
- form (str): The form type to retrieve (e.g., "10-K", "8-K").
Returns:
dict: A dictionary containing details of the filing, including:
- filingDate (str): The date the filing was submitted.
- reportDate (str): The reporting period date.
- form (str): The type of SEC form (e.g., '10-K', '10-Q').
- filmNumber (str): The film number associated with the filing.
- size (int): The size of the filing in bytes.
- isXBRL (int): Whether the filing is in XBRL format (1 for yes, 0 for no).
- url (str): The URL of the filing.
- text (str): The text content of the filing if found and successfully scraped; otherwise, an empty string.

If the specified form is not found for the given ticker, returns an empty dictionary.
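Because the result can be an empty dictionary (form not found) or carry an empty `text` (scrape failed), callers may want to check both cases before using it. The following is a small sketch under those documented assumptions; `describe_filing` is a hypothetical helper, and the sample dictionaries stand in for real `scrapLatest` results.

```python
def describe_filing(result: dict) -> str:
    """Summarize a scrapLatest-style result, covering the two documented empty cases."""
    if not result:
        # Documented behavior: empty dict when the form is not found for the ticker.
        return "no filing found"
    label = f"{result.get('form', '?')} filed {result.get('filingDate', '?')}"
    text = result.get("text", "")
    if not text:
        # Documented behavior: empty string when the text could not be scraped.
        return f"{label}: no text scraped"
    return f"{label}: {len(text)} characters"


print(describe_filing({}))
print(describe_filing({"form": "10-Q", "filingDate": "2024-11-01", "text": ""}))
print(describe_filing({"form": "10-Q", "filingDate": "2024-11-01", "text": "abc"}))
```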
4. scrap(url: str, timeout: int = 15)
Scrapes the textual content of a given URL.
content = secfi.scrap("https://example.com")
print(content[:500]) # Preview the first 500 characters
Parameters:
- url (str): The URL to scrape.
- timeout (int): Timeout for the HTTP request (default is 15 seconds).
Returns: The cleaned text content of the URL or an error message if the request fails.
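To give a feel for what "cleaned text content" can mean, here is a minimal standard-library sketch that strips tags, scripts, and styles from an HTML string and collapses whitespace. It is an illustration only and an assumption about the general technique, not how `scrap` is implemented; `html_to_text` and `_TextExtractor` are hypothetical names.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._parts.append(data)

    def text(self):
        # Join the fragments, then collapse every run of whitespace to one space.
        return " ".join(" ".join(self._parts).split())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Form 10-Q</h1><p>Quarterly   report</p></body></html>"
)
print(html_to_text(sample))  # Form 10-Q Quarterly report
```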
5. secForms()
Fetches a DataFrame of SEC forms and their details from the sec_forms.csv file located in the info directory.
import secfi
sec_forms = secfi.secForms()
print(sec_forms.head())
Returns: A DataFrame with columns:
- Number – The unique identifier for the form.
- Description – A brief description of the form.
- Last Updated – The last updated date of the form.
- SEC Number – The SEC-assigned identifier for the form.
- Topic(s) – Relevant topics associated with the form.
- link – A direct URL to the PDF version of the form.
| Number | Description | Last Updated | SEC Number | Topic(s) | link |
|---|---|---|---|---|---|
| 1 | Application for registration or exemption from... | Feb. 1999 | SEC1935 | Self-Regulatory Organizations | |
| 1-A | Regulation A Offering Statement (PDF) | Sept. 2021 | SEC486 | Securities Act of 1933, Small Businesses | |
| 1-E | Notification under Regulation E (PDF) | Aug. 2001 | SEC1807 | Investment Company Act of 1940, Small Busin... | |
| ... | ... | ... | ... | ... | ... |
6. chunkText
Splits a long text into overlapping chunks of a specified maximum length, distributing the text evenly across the chunks and appending the final leftover chunk to the previous one.
Parameters:
- text (str): The input text to split.
- max_length (int, optional): The maximum length of each chunk. Defaults to 10,000.
- overlap (int, optional): The number of overlapping characters between consecutive chunks. Defaults to 300.
Returns:
dict: A dictionary containing the following keys:
- total_chars (int): The total number of characters in the input text.
- max_length_config (int): The adjusted maximum length for each chunk after recalculation.
- total_chunks (int): The total number of chunks generated.
- chunks (list): A list of text chunks.
Example:
import secfi
text = """
Se cierra Armani, el taco no, hace la personal y ahi se va, se va\n
Se viene Martínez para el gol y va el tercero y va el tercero\n
Y va el tercero y gol de River gol de River goooool
"""
res = secfi.chunkText(text, max_length=120, overlap=20)
chunks = res["chunks"]
print(f"Original text chars: {res['total_chars']}")
print(f"Total chunks: {res['total_chunks']}")
for i, chunk in enumerate(chunks):
    print(f"Chunk number: {i+1}\n{chunk}")
Output
Original text chars: 190
Total chunks: 2
Chunk number: 1
Se cierra Armani, el taco no, hace la personal y ahi se va, se va
Se viene Martínez para el gol y va
Chunk number: 2
nez para el gol y va el tercero y va el tercero
Y va el tercero y gol de River gol de River goooool
ol de River goooool
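For readers unfamiliar with overlapping chunks, the basic idea can be sketched with a plain sliding window. This is deliberately simpler than `chunkText`: it does not rebalance `max_length` or merge the last chunk, and `sliding_chunks` is a hypothetical name used only for this illustration.

```python
def sliding_chunks(text: str, max_length: int = 10_000, overlap: int = 300) -> list:
    """Split text into windows of up to max_length characters that share `overlap` characters."""
    if max_length <= overlap:
        raise ValueError("max_length must be greater than overlap")
    step = max_length - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


parts = sliding_chunks("abcdefghij", max_length=4, overlap=1)
print(parts)  # ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last `overlap` characters of the previous one, which helps preserve context across chunk boundaries when the pieces are processed separately.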
## Notes
- The library uses a custom User-Agent to comply with SEC API requirements.
- Ensure that requests to the SEC website respect their usage policies and rate limits.
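If you make your own requests to the SEC alongside secfi, the two notes above can be honored with a small throttle and an explicit User-Agent. The sketch below is an assumption-laden illustration, not part of secfi: `Throttle` and `build_request` are hypothetical names, the contact string is a placeholder, and the 10-requests-per-second figure reflects the SEC's published fair-access guidance as commonly cited, so check the current policy before relying on it.

```python
import time
import urllib.request


class Throttle:
    """Blocks so that consecutive wait() calls are at least 1/max_per_second apart."""

    def __init__(self, max_per_second: float):
        self.min_interval = 1.0 / max_per_second
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a request object carrying an identifying User-Agent (nothing is sent here)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


throttle = Throttle(max_per_second=10)  # assumed limit; verify against current SEC policy
req = build_request("https://www.sec.gov/", "Example Research admin@example.com")
print(req.get_header("User-agent"))  # Example Research admin@example.com
```

Call `throttle.wait()` immediately before each request you send.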
## License
This project is open source and available under the MIT License.