Parse Securities and Exchange Commission Standard Generalized Markup Language (SEC SGML) files

Project description

SEC SGML

A Python library to parse Securities and Exchange Commission Standard Generalized Markup Language (SEC SGML) files. It powers the open-source datamule project.

Currently parses two types of files:

  1. Daily Archives
  2. Submissions

Support for parsing SGML tables is planned.
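For orientation, a submission file wraps each document in `<DOCUMENT>` ... `</DOCUMENT>` tags, with header tags such as `<TYPE>` and `<FILENAME>` ahead of a `<TEXT>` body. The following is a minimal sketch of splitting such a file with regular expressions. It is not secsgml's implementation: `split_documents`, the regexes, and the toy sample (including its placeholder accession number) are assumptions made for illustration only.

```python
import re

# Toy submission text with a placeholder accession number; real filings
# are far larger and carry many more header tags.
SAMPLE = """<SEC-DOCUMENT>0000000000-00-000000.txt : 20000101
<DOCUMENT>
<TYPE>10-K
<SEQUENCE>1
<FILENAME>main.txt
<TEXT>
Annual report body.
</TEXT>
</DOCUMENT>
<DOCUMENT>
<TYPE>EX-27
<SEQUENCE>2
<FILENAME>ex27.txt
<TEXT>
Exhibit body.
</TEXT>
</DOCUMENT>
</SEC-DOCUMENT>
"""

DOCUMENT_RE = re.compile(r"<DOCUMENT>(.*?)</DOCUMENT>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>\n?(.*?)</TEXT>", re.DOTALL)
# Header tags are unclosed: "<TAG>value" on a single line.
TAG_RE = re.compile(r"^<([A-Z-]+)>(.+)$", re.MULTILINE)


def split_documents(sgml_text):
    """Return a list of (metadata dict, body text), one per <DOCUMENT> block."""
    docs = []
    for block in DOCUMENT_RE.findall(sgml_text):
        text_match = TEXT_RE.search(block)
        # Everything before <TEXT> is treated as the per-document header.
        header = block[:text_match.start()] if text_match else block
        meta = {tag.lower(): value.strip() for tag, value in TAG_RE.findall(header)}
        body = text_match.group(1) if text_match else ""
        docs.append((meta, body))
    return docs


docs = split_documents(SAMPLE)
for meta, body in docs:
    print(meta["type"], meta["filename"], len(body))
```

A regex split like this is only meant to show the file layout; for real filings, use the library functions described under Quickstart.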

All Variations

secsgml also attempts to standardize metadata across formats. For example, 'CENTRAL INDEX KEY' is mapped to 'cik'.
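The idea of key standardization can be sketched as a lookup table with a fallback. This is an illustration, not secsgml's actual code: `KEY_ALIASES`, `standardize_key`, `standardize_metadata`, and the snake_case fallback are all hypothetical, and only the 'CENTRAL INDEX KEY' to 'cik' pair comes from the text above.

```python
# Hypothetical alias table; only this one pair is taken from the README.
KEY_ALIASES = {
    "CENTRAL INDEX KEY": "cik",
}


def standardize_key(raw_key):
    """Map a raw header key to a standardized name."""
    # Collapse whitespace and upper-case so lookups ignore formatting noise.
    cleaned = " ".join(raw_key.strip().split()).upper()
    if cleaned in KEY_ALIASES:
        return KEY_ALIASES[cleaned]
    # Assumed fallback: lower-case snake_case for keys without an alias.
    return cleaned.lower().replace(" ", "_")


def standardize_metadata(metadata):
    """Return a new dict with every key standardized."""
    return {standardize_key(key): value for key, value in metadata.items()}


print(standardize_metadata({"CENTRAL INDEX KEY": "0000000000", "SOME OTHER KEY": "x"}))
```

Keeping the aliases in one table means a new format only needs new entries, not new parsing logic.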

Installation

pip install secsgml

Quickstart

Parse into memory

from secsgml import parse_sgml_submission_into_memory
metadata, documents = parse_sgml_submission_into_memory(filepath="000000443897000001.sgml")

Parse to file

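from secsgml import parse_sgml_submission

# Parse from a file on disk and write the results to output_dir
parse_sgml_submission(filepath='samples/0000891618-94-000021.txt', output_dir='results')

# Parse from content already loaded in memory (sgml_content holds the raw SGML)
parse_sgml_submission(content=sgml_content, output_dir='results')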

Note

Development is currently focused on parse_sgml_submission_into_memory. parse_sgml_submission will be refactored afterwards.

Future

  • SGML Table parsing
  • Optimization and refactoring using Cython / C bindings

Download files

Source Distribution

  • secsgml-0.1.4.tar.gz (177.6 kB): Source

Built Distributions

  • secsgml-0.1.4-cp313-cp313-win_amd64.whl (263.5 kB): CPython 3.13, Windows x86-64
  • secsgml-0.1.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (762.1 kB): CPython 3.13, manylinux (glibc 2.17+) x86-64
  • secsgml-0.1.4-cp313-cp313-macosx_10_13_universal2.whl (356.2 kB): CPython 3.13, macOS 10.13+ universal2 (ARM64, x86-64)
  • secsgml-0.1.4-cp312-cp312-win_amd64.whl (264.2 kB): CPython 3.12, Windows x86-64
  • secsgml-0.1.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (772.1 kB): CPython 3.12, manylinux (glibc 2.17+) x86-64
  • secsgml-0.1.4-cp312-cp312-macosx_10_13_universal2.whl (359.2 kB): CPython 3.12, macOS 10.13+ universal2 (ARM64, x86-64)
  • secsgml-0.1.4-cp311-cp311-win_amd64.whl (264.6 kB): CPython 3.11, Windows x86-64
  • secsgml-0.1.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (773.3 kB): CPython 3.11, manylinux (glibc 2.17+) x86-64
  • secsgml-0.1.4-cp311-cp311-macosx_10_9_universal2.whl (356.6 kB): CPython 3.11, macOS 10.9+ universal2 (ARM64, x86-64)

