Parse Securities and Exchange Commission Standard Generalized Markup Language (SEC SGML) files

SEC SGML

A Python library for parsing Securities and Exchange Commission Standard Generalized Markup Language (SEC SGML) files. It powers the open-source datamule project.

Currently parses two types of files:

  1. Daily Archives
  2. Submissions

Support for parsing SGML tables is planned.

All Variations

Installation

pip install secsgml

Quickstart

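from secsgml import parse_sgml_submission

# Parse a submission from a file on disk
parse_sgml_submission(filepath='samples/0000891618-94-000021.txt', output_dir='results')

# Parse a submission from content already in memory
# (sgml_content must hold the raw SGML of one submission)
parse_sgml_submission(content=sgml_content, output_dir='results')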
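To process many submissions at once, the file-based call above can be wrapped in a small loop. The following is a minimal sketch, not part of the library: `parse_directory` is a hypothetical helper, the one-folder-per-file layout is an assumption, and the parser is passed in as an argument so the helper relies only on the `filepath` and `output_dir` keywords shown in the Quickstart.

```python
from pathlib import Path
from typing import Callable, List


def parse_directory(input_dir: str, output_root: str,
                    parse: Callable[..., object]) -> List[Path]:
    """Run `parse` on every .txt file in input_dir, one output folder per file."""
    written: List[Path] = []
    for path in sorted(Path(input_dir).glob("*.txt")):
        # One output directory per submission, named after the file stem.
        # This layout is an assumption, not something the library prescribes.
        out_dir = Path(output_root) / path.stem
        out_dir.mkdir(parents=True, exist_ok=True)
        # Only the keyword arguments shown in the Quickstart are used here.
        parse(filepath=str(path), output_dir=str(out_dir))
        written.append(out_dir)
    return written
```

With the package installed, this would be called as `parse_directory('samples', 'results', parse_sgml_submission)`. Passing the parser in keeps the helper easy to test with a stand-in function.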

Future

  • SGML Table parsing
  • Optimization and a refactor using Cython / C bindings.
  • Standardize metadata across file types. Keys and values currently differ between variations: for example, the key 'CIK' vs 'CENTRAL INDEX KEY', and values such as '34' vs '1934'.
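The metadata standardization item could be approached with simple alias tables. The sketch below is one possible shape, not the library's implementation: `normalize_metadata`, `KEY_ALIASES`, and `VALUE_ALIASES` are hypothetical names, the tables contain only the examples mentioned above, the field name 'ACT' is made up because the text does not name the field, and picking 'CIK' and '1934' as the canonical forms is an arbitrary assumption.

```python
from typing import Dict

# Illustrative alias tables seeded only with the examples from the text.
# Choosing 'CIK' and '1934' as the canonical forms is an arbitrary assumption.
KEY_ALIASES: Dict[str, str] = {"CENTRAL INDEX KEY": "CIK"}

# Value aliases are scoped per key so '34' is only rewritten where it means a year.
# The key name 'ACT' is hypothetical; the text does not name the field.
VALUE_ALIASES: Dict[str, Dict[str, str]] = {"ACT": {"34": "1934"}}


def normalize_metadata(metadata: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of metadata with aliased keys and values made canonical."""
    normalized: Dict[str, str] = {}
    for raw_key, raw_value in metadata.items():
        # Compare keys case-insensitively and ignore surrounding whitespace.
        key = raw_key.strip().upper()
        key = KEY_ALIASES.get(key, key)
        value = raw_value.strip()
        # Look up value aliases for this key only; unknown values pass through.
        value = VALUE_ALIASES.get(key, {}).get(value, value)
        normalized[key] = value
    return normalized


# Made-up input for illustration.
print(normalize_metadata({"CENTRAL INDEX KEY": "0000123456", "ACT": "34"}))
# → {'CIK': '0000123456', 'ACT': '1934'}
```

Scoping value aliases by key avoids rewriting an unrelated field that happens to contain '34'.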
