Parse Securities and Exchange Commission Standard Generalized Markup Language (SEC SGML) files

SEC SGML

A Python library to parse Securities and Exchange Commission Standard Generalized Markup Language (SEC SGML) files. It powers the open-source datamule project.

Currently parses two types of files:

  1. Daily Archives
  2. Submissions

Will be expanded to also parse SGML Tables.
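
For orientation, a submission file wraps each attachment in `<DOCUMENT>` ... `</DOCUMENT>`, with unclosed tags such as `<TYPE>` and `<SEQUENCE>` ahead of a `<TEXT>` block. The toy splitter below is only a sketch of that layout: the sample text is made up, `split_documents` is a hypothetical helper, and none of this is how secsgml is implemented.

```python
import re

# Made-up miniature submission; real filings are far larger and carry more fields.
SAMPLE = """<SEC-DOCUMENT>0000000000-00-000000.txt
<SEC-HEADER>
ACCESSION NUMBER:\t\t0000000000-00-000000
CENTRAL INDEX KEY:\t\t0000000001
</SEC-HEADER>
<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<TEXT>
Annual report body.
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-27
<SEQUENCE>2
<TEXT>
Financial data schedule.
</TEXT>
</DOCUMENT>
</SEC-DOCUMENT>
"""

DOCUMENT_RE = re.compile(r"<DOCUMENT>\n(.*?)</DOCUMENT>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>\n(.*?)</TEXT>", re.DOTALL)
TAG_RE = re.compile(r"^<([A-Z-]+)>(.+)$", re.MULTILINE)


def split_documents(sgml):
    """Return one dict per <DOCUMENT> block: lower-cased tag values plus 'text'."""
    documents = []
    for block in DOCUMENT_RE.findall(sgml):
        text_match = TEXT_RE.search(block)
        # Tags such as <TYPE> sit before <TEXT>; only scan that part for them.
        head = block[:text_match.start()] if text_match else block
        info = {tag.lower(): value.strip() for tag, value in TAG_RE.findall(head)}
        info["text"] = text_match.group(1) if text_match else ""
        documents.append(info)
    return documents


docs = split_documents(SAMPLE)
```

Real filings add complications that this sketch ignores, for example encoded binary attachments and tags that vary between formats.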

All Variations

secsgml also attempts to standardize the metadata between formats. For example, 'CENTRAL INDEX KEY' is mapped to 'cik'.
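
A minimal sketch of what this kind of key standardization could look like. Only the 'CENTRAL INDEX KEY' to 'cik' pair comes from this README; `KEY_MAP`, `standardize_key`, `standardize_metadata`, and the lower-case-with-hyphens fallback are assumptions made for illustration, not secsgml's actual mapping.

```python
# Hypothetical mapping table: only this one pair is documented above.
KEY_MAP = {"CENTRAL INDEX KEY": "cik"}


def standardize_key(raw_key):
    """Normalize a header label, falling back to lower-case-with-hyphens."""
    key = raw_key.strip().rstrip(":").strip().upper()
    if key in KEY_MAP:
        return KEY_MAP[key]
    # Assumed fallback for labels without an explicit mapping.
    return key.lower().replace(" ", "-")


def standardize_metadata(raw):
    """Apply standardize_key to every key of a flat metadata dict."""
    return {standardize_key(k): v for k, v in raw.items()}


example = standardize_metadata({"CENTRAL INDEX KEY": "0000000001", "FORM TYPE": "10-K"})
```

An explicit table keeps known labels stable across formats, while the fallback keeps unknown labels usable instead of dropping them.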

Installation

pip install secsgml

Quickstart

Parse into memory

from secsgml import parse_sgml_submission_into_memory

metadata, documents = parse_sgml_submission_into_memory(filepath="000000443897000001.sgml")

Parse to file

from secsgml import parse_sgml_submission

# From a file on disk
parse_sgml_submission(filepath='samples/0000891618-94-000021.txt', output_dir='results')

# From content already in memory (sgml_content holds the raw SGML of one submission)
parse_sgml_submission(content=sgml_content, output_dir='results')
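
When many submissions sit in one folder, the file-based call can be wrapped in a small loop. The sketch below is an assumption-laden illustration: `parse_directory` is a hypothetical helper, the parser is passed in as an argument so the snippet runs without secsgml installed, and it is only assumed that the parser accepts the `filepath` and `output_dir` keywords shown in the quickstart.

```python
from pathlib import Path


def parse_directory(input_dir, output_root, parse):
    """Call parse(filepath=..., output_dir=...) once per file in input_dir.

    Each submission gets its own sub-directory under output_root, named after
    the file stem, so outputs from different filings do not collide.
    """
    output_root = Path(output_root)
    parsed = []
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue
        out_dir = output_root / path.stem
        # Created up front because the parser may not create it on its own.
        out_dir.mkdir(parents=True, exist_ok=True)
        parse(filepath=str(path), output_dir=str(out_dir))
        parsed.append(path.name)
    return parsed
```

With secsgml installed, the intended call would be `parse_directory('samples', 'results', parse_sgml_submission)`.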

Note

Development is currently focused on parse_sgml_submission_into_memory; parse_sgml_submission will need to be refactored afterwards.

Future

  • SGML Table parsing
  • Optimization and refactor using Cython / C bindings.

Download files

Download the file for your platform.

Source Distribution

  • secsgml-0.1.1.tar.gz (177.6 kB): Source

Built Distributions

  • secsgml-0.1.1-cp313-cp313-win_amd64.whl (263.5 kB): CPython 3.13, Windows x86-64
  • secsgml-0.1.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (762.1 kB): CPython 3.13, manylinux (glibc 2.17+), x86-64
  • secsgml-0.1.1-cp313-cp313-macosx_10_13_universal2.whl (356.2 kB): CPython 3.13, macOS 10.13+, universal2 (ARM64, x86-64)
  • secsgml-0.1.1-cp312-cp312-win_amd64.whl (264.1 kB): CPython 3.12, Windows x86-64
  • secsgml-0.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (772.1 kB): CPython 3.12, manylinux (glibc 2.17+), x86-64
  • secsgml-0.1.1-cp312-cp312-macosx_10_13_universal2.whl (359.2 kB): CPython 3.12, macOS 10.13+, universal2 (ARM64, x86-64)
  • secsgml-0.1.1-cp311-cp311-win_amd64.whl (264.5 kB): CPython 3.11, Windows x86-64
  • secsgml-0.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773.3 kB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • secsgml-0.1.1-cp311-cp311-macosx_10_9_universal2.whl (356.6 kB): CPython 3.11, macOS 10.9+, universal2 (ARM64, x86-64)
