
Scrape data from SEC's EDGAR

Project description

A small library to access files from SEC’s edgar.

Installation

pip install edgar

Example

To get a company’s latest 5 10-Ks, run

from edgar import Company
company = Company("Oracle Corp", "0001341439")
tree = company.get_all_filings(filing_type="10-K")
docs = Company.get_documents(tree, no_of_documents=5)

or

from edgar import Company, TXTML

company = Company("INTERNATIONAL BUSINESS MACHINES CORP", "0000051143")
doc = company.get_10K()
text = TXTML.parse_full_10K(doc)

To get all companies and find a specific one, run

from edgar import Edgar
edgar = Edgar()
possible_companies = edgar.find_company_name("Cisco System")

To get XBRL data, run

from edgar import Company, XBRL, XBRLElement

company = Company("Oracle Corp", "0001341439")
results = company.get_data_files_from_10K("EX-101.INS", isxml=True)
xbrl = XBRL(results[0])
XBRLElement(xbrl.relevant_children_parsed[15]).to_dict()  # returns a dictionary of name, value, and schemaRef

API

Company

The Company class has three fields:

  • name (company name)
  • cik (company CIK number)
  • timeout (optional) (default: 10)
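
The CIK values in the examples above are written as 10-digit, zero-padded strings. A minimal sketch of producing that form from a shorter number follows. The helper name normalize_cik is hypothetical and is not part of the edgar package; whether the library itself requires padded input is not stated here.

```python
def normalize_cik(cik):
    """Return a CIK as a 10-digit, zero-padded string.

    Hypothetical helper for illustration only; not part of the edgar package.
    """
    digits = str(cik).strip()
    if not digits.isdigit():
        raise ValueError(f"CIK must contain only digits: {cik!r}")
    if len(digits) > 10:
        raise ValueError(f"CIK has more than 10 digits: {cik!r}")
    # Left-pad with zeros to the 10-digit form used in the examples.
    return digits.zfill(10)


oracle_cik = normalize_cik(1341439)
ibm_cik = normalize_cik("51143")
```

The two results match the CIK strings passed to Company in the examples for Oracle and IBM.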

get_filings_url

Returns a URL for fetching filings data

  • Input
    • filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, it’ll return all documents
    • prior_to: Date before which documents are retrieved. If not specified, it’ll return all documents
    • ownership: defaults to include. Options are include, exclude, only.
    • no_of_entries: defaults to 100. The number of entries to return. Maximum is 100.
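
As a rough illustration of how these inputs could map onto a query string, here is a sketch that builds a URL with the standard library. The endpoint and the query parameter names (action, CIK, type, dateb, owner, count) are assumptions based on EDGAR's company browse page, and build_filings_url is a hypothetical function; the library's own get_filings_url may construct its URL differently.

```python
from urllib.parse import urlencode

# Assumed endpoint (EDGAR company browse page); not confirmed by this README.
BASE_URL = "https://www.sec.gov/cgi-bin/browse-edgar"


def build_filings_url(cik, filing_type="", prior_to="",
                      ownership="include", no_of_entries=100):
    """Hypothetical sketch: turn the documented inputs into a query URL."""
    if ownership not in ("include", "exclude", "only"):
        raise ValueError("ownership must be include, exclude, or only")
    params = {
        "action": "getcompany",
        "CIK": cik,
        "type": filing_type,               # empty means all document types
        "dateb": prior_to,                 # empty means no date cutoff
        "owner": ownership,
        "count": min(no_of_entries, 100),  # documented maximum is 100
    }
    return BASE_URL + "?" + urlencode(params)


url = build_filings_url("0001341439", filing_type="10-K")
```

Capping no_of_entries inside the function mirrors the documented maximum of 100 rather than sending a larger value.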

get_all_filings

Returns the HTML in the form of lxml.html

  • Input
    • filing_type: The type of document you want, e.g. 10-K, S-8, 8-K. If not specified, it’ll return all documents
    • prior_to: Date before which documents are retrieved. If not specified, it’ll return all documents
    • ownership: defaults to include. Options are include, exclude, only.
    • no_of_entries: defaults to 100. The number of entries to return. Maximum is 100.

get_10Ks

Returns the concatenation of all the documents in the 10-K, as HTML in the form of lxml.html

  • Input
    • no_of_documents (default: 1): number of documents to be retrieved

get_document_type_from_10K

Returns the HTML in the form of lxml.html of the document within 10-K

  • Input
    • document_type: The type of document you want, e.g. 10-K, EX-3.2
    • no_of_documents (default: 1): number of documents to be retrieved

get_data_files_from_10K

Returns the HTML in the form of lxml.html of the data file within 10-K

  • Input
    • document_type: The type of document you want, e.g. EX-101.INS
    • no_of_documents (default: 1): number of documents to be retrieved
    • isxml (default: False): by default, parsing is not case-sensitive and the file is parsed as HTML with lxml. If this is True, the file is parsed with etree, which is case-sensitive
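
The case-sensitivity difference behind isxml can be seen with a small experiment. The sketch below uses the standard library's html.parser and xml.etree.ElementTree as stand-ins for lxml's HTML and etree parsers, so it illustrates the general HTML-versus-XML behaviour and is not the library's own code path. The element names in SNIPPET are made up.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Made-up snippet with mixed-case element names, as XBRL data often has.
SNIPPET = "<Root><NetIncome>42</NetIncome></Root>"


class TagCollector(HTMLParser):
    """Record the start tags exactly as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
collector.feed(SNIPPET)
collector.close()
html_tags = collector.tags  # HTML parsing lowercases tag names

# XML parsing keeps the original case of every tag name.
xml_tags = [element.tag for element in ET.fromstring(SNIPPET).iter()]
```

Because XBRL element names are mixed-case, looking them up by their original names only works reliably when the file is parsed as XML.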

get_documents (class method)

Returns a list of strings, each containing the body of the specified document from the input

  • Input
    • tree: lxml.html form that is returned from Company.get_all_filings
    • no_of_documents: number of documents returned. If it is 1, the returned result is just one string, instead of a list of strings. Defaults to 1.
    • debug (default: False): if True, displays the URL and form
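
Because the return type depends on no_of_documents (one string for 1, a list of strings otherwise), calling code may want to normalize it before looping. A minimal sketch, assuming a hypothetical helper as_document_list that is not part of the library:

```python
def as_document_list(result):
    """Always return a list of document strings.

    Hypothetical helper: wraps a single string in a list and
    copies any other sequence of strings into a new list.
    """
    if isinstance(result, str):
        return [result]
    return list(result)


single = as_document_list("body of one document")
several = as_document_list(["first body", "second body"])
```

With this wrapper, the same loop can handle the result of get_documents whether one or several documents were requested.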

Edgar

Gets all companies from EDGAR

get_cik_by_company_name

  • Input
    • name: name of the company

get_company_name_by_cik

  • Input
    • cik: cik of the company

find_company_name

  • Input
    • words: input words to search the company
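
One plausible way a word-based search like this could work is to keep every name that contains all of the input words, ignoring case. The sketch below is an assumption about the matching rule and is not taken from the library's source; search_names and the sample list are hypothetical.

```python
def search_names(names, words):
    """Return the names that contain every word in `words`, ignoring case.

    Hypothetical sketch of a word-based company search; the matching
    rule used by Edgar.find_company_name may differ.
    """
    wanted = words.lower().split()
    return [
        name for name in names
        if all(word in name.lower() for word in wanted)
    ]


# Made-up sample of company names for illustration.
sample_names = ["CISCO SYSTEMS, INC.", "SYSCO CORP", "Oracle Corp"]
matches = search_names(sample_names, "Cisco System")
```

Substring matching is what lets the partial word "System" match "SYSTEMS" here.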

XBRL

Parses data from XBRL

  • relevant_children
    • get children that are not context
  • relevant_children_parsed
    • get children that are not context, unit, schemaRef
    • cleans tags
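
The two filters above can be sketched with the standard library's ElementTree. This assumes that "cleans tags" means stripping the namespace from each tag name, which this README does not confirm. The sample document, the us-gaap namespace URI, and the local_name helper are all made up for illustration and are not the library's implementation.

```python
import xml.etree.ElementTree as ET

# Tiny made-up XBRL-like instance; the us-gaap namespace URI is a placeholder.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:link="http://www.xbrl.org/2003/linkbase"
      xmlns:us-gaap="http://example.com/us-gaap">
  <link:schemaRef/>
  <context id="c1"/>
  <unit id="usd"/>
  <us-gaap:Revenues contextRef="c1" unitRef="usd">100</us-gaap:Revenues>
</xbrl>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(SAMPLE)

# Children that are not context elements.
relevant_children = [c for c in root if local_name(c.tag) != "context"]

# Children that are not context, unit, or schemaRef, with cleaned tag names.
skipped = {"context", "unit", "schemaRef"}
relevant_children_parsed = [
    (local_name(c.tag), c.text)
    for c in root
    if local_name(c.tag) not in skipped
]
```

In this sample, only the Revenues fact survives the second filter.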

Download files

Files for edgar, version 5.1.15

  • edgar-5.1.15-py3-none-any.whl (20.0 kB): wheel, Python version py3
  • edgar-5.1.15.tar.gz (7.2 kB): source
