
Parse iXBRL files and present the tagged data in RDF and other formats

ixbrl-parse

Introduction

  • Python library code, parses iXBRL files.
  • Script ixbrl-dump emits iXBRL tagged data in a semi-human-readable dump.
  • Script ixbrl-report emits iXBRL tagged data in a human-readable report. This involves downloading the referenced XBRL schemas to get the human-readable fact labels.
  • Script ixbrl-to-rdf emits iXBRL tagged data in RDF.
  • Script ixbrl-to-csv outputs iXBRL tagged data in a CSV format.
  • Script ixbrl-to-json emits iXBRL tagged data in JSON.
  • Script ixbrl-to-xbrl emits iXBRL tagged data as an XBRL instance.
  • Script ixbrl-to-kv emits iXBRL tagged data in a key-value form easily consumed by scripts.
  • Script ixbrl-markdown emits iXBRL tagged data in a markdown report.

Sample data

The ixbrl directory contains a bunch of sample iXBRL files gathered from various places: US 10-K and 10-Q filings, a few filings from UK Companies House, and a couple of sample ESEF filings. This is the data I've tested with.

Also, accts.html is a sample file created using gnucash-ixbrl.

Installation

From PyPI

pip install ixbrl-parse

From source

pip install git+https://github.com/cybermaggedon/ixbrl-parse

For development

git clone https://github.com/cybermaggedon/ixbrl-parse
cd ixbrl-parse
pip install -e ".[dev]"

The dev extra includes pytest and pytest-cov for running tests.

For markdown report support:

pip install ixbrl-parse[markdown]

Usage

Parse iXBRL and output in RDF (default n3 format):

ixbrl-to-rdf accts.html

Parse iXBRL and output in RDF/XML:

ixbrl-to-rdf accts.html --format xml

Parse iXBRL and output in CSV:

ixbrl-to-csv accts.html

Parse iXBRL and output in JSON:

ixbrl-to-json accts.html
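
If you want the JSON output inside a Python program without using the library API, one option is to run the script as a subprocess and decode its standard output. The following is a minimal sketch, assuming ixbrl-to-json is on your PATH and writes a single JSON document to stdout; the helper name run_json_tool is hypothetical and not part of ixbrl-parse.

```python
import json
import subprocess


def run_json_tool(argv):
    """Run a command that prints JSON on stdout and return the decoded object.

    Hypothetical helper, not part of ixbrl-parse. Raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


# Example call (assumes ixbrl-to-json is installed and accts.html exists):
#   data = run_json_tool(["ixbrl-to-json", "accts.html"])
```

The structure of the decoded data depends on what ixbrl-to-json emits for your file, so inspect it before relying on particular keys.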

Parse iXBRL and output in JSON with schema labels (-b gives the URL of the original report, used to fetch the schema):

ixbrl-to-json ixbrl/10k/lyft-20201231.htm -f labeled \
    -b https://www.sec.gov/Archives/edgar/data/1759509/000175950921000011/lyft-20201231.htm

Dump iXBRL values:

ixbrl-dump accts.html

Human-readable report:

ixbrl-report accts.html

Human-readable report from SEC EDGAR. Note that you need to tell ixbrl-report the URL of the original report with -b, so that it knows where to fetch the custom schema (the report refers to it using relative URLs):

ixbrl-report ixbrl/10k/lyft-20201231.htm \
    -b https://www.sec.gov/Archives/edgar/data/1759509/000175950921000011/lyft-20201231.htm
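
To see why the base URL matters, here is a small sketch of standard relative URL resolution using Python's urllib. It illustrates the general mechanism only; it is not taken from the ixbrl-report source, and the schema file name lyft-20201231.xsd is a hypothetical example of a relative reference.

```python
from urllib.parse import urljoin

# URL of the original report, as passed with -b above.
BASE = ("https://www.sec.gov/Archives/edgar/data/1759509/"
        "000175950921000011/lyft-20201231.htm")


def resolve_schema_ref(base_url, schema_ref):
    """Resolve a schema reference against the URL of the report.

    Relative references are resolved against the report's location;
    absolute references are returned unchanged.
    """
    return urljoin(base_url, schema_ref)


# Hypothetical relative reference, resolved next to the report.
relative = resolve_schema_ref(BASE, "lyft-20201231.xsd")

# Hypothetical absolute reference, which ignores the base URL.
absolute = resolve_schema_ref(BASE, "https://example.org/schema.xsd")
```

Without the base URL, a relative reference read from a local copy of the report would point at the local filesystem rather than at the schema on the original site.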

Dump iXBRL as XBRL:

ixbrl-to-xbrl accts.html

API

You can use the library directly in your Python code:

from lxml import etree as ET
from ixbrl_parse.ixbrl import parse

# Parse an iXBRL file
tree = ET.parse('accts.html')
ixbrl = parse(tree)

# Get data in various formats
data_dict = ixbrl.to_dict()
flat_data = ixbrl.flatten()
rdf_triples = ixbrl.get_triples()

# Access contexts and values
for context in ixbrl.contexts.values():
    print(context.entity, context.period)

for value in ixbrl.values.values():
    print(value.name, value.to_value())

See the CLI implementations in ixbrl_parse/cli.py for more examples.
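
As a further sketch, the loop over values above can be turned into a lookup table keyed by fact name. This assumes only what the example above shows: that the parsed object has a values mapping, and that each value has a name attribute and a to_value() method. The function values_by_name is a hypothetical helper, not part of the library.

```python
from collections import defaultdict


def values_by_name(ixbrl):
    """Group fact values by name.

    Hypothetical helper. Relies only on `ixbrl.values` being a mapping
    whose items have `.name` and `.to_value()`, as in the example above.
    """
    grouped = defaultdict(list)
    for value in ixbrl.values.values():
        # A name may be reported more than once (for example, for
        # different periods), so keep a list per name.
        grouped[str(value.name)].append(value.to_value())
    return dict(grouped)


# Example call (assumes the parse step shown above):
#   facts = values_by_name(parse(ET.parse('accts.html')))
```

What to_value() returns for each fact (number, string, date and so on) is up to the library, so check the types you get back for your own files.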

Development

Running Tests

The project includes a comprehensive pytest test suite:

# Run all tests
pytest

# Run with coverage
pytest --cov=ixbrl_parse --cov-report=term-missing

# Run specific test file
pytest tests/test_ixbrl.py -v

See tests/README.md for more details on the test suite.

Building

The project uses modern Python packaging with pyproject.toml:

# Build wheel and source distribution
python -m build

# Install in development mode
pip install -e .

What next?

The RDF output can be loaded into a Redland RDF sqlite3 store:

ixbrl-to-rdf -i accts.html -f ntriples > accts.ntriples
rdfproc -n -s sqlite accts.db parse accts.ntriples ntriples
rdfproc -s sqlite accts.db print | head
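
Before loading, it can be handy to check how many triples the N-Triples file holds. N-Triples puts one triple on each line, with optional blank lines and # comment lines, so a simple line count is enough. This is a minimal sketch; count_ntriples is a hypothetical helper and does not validate the triples themselves.

```python
def count_ntriples(lines):
    """Count triples in N-Triples text, given an iterable of lines.

    Skips blank lines and comment lines starting with '#'.
    Does not check that each remaining line is a well-formed triple.
    """
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


# Example call (assumes accts.ntriples was written as shown above):
#   with open("accts.ntriples", encoding="utf-8") as f:
#       print(count_ntriples(f))
```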

I run a SPARQL store across the data, and view it with LodLive.

Screenshot of LodLive
