An open-source Python library for ESEF XBRL filings

# Open ESEF
A Python Library for ESEF and XBRL Filings
Project Status: Under Development - 70% Complete
License: GPL v3.0

Open-ESEF is a Python-based, open-source project designed to handle XBRL (eXtensible Business Reporting Language) filings, specifically those adhering to the ESEF (European Single Electronic Format) standard.

ESEF is the mandated digital reporting format for annual financial reports of listed companies in the European Union, established by the European Securities and Markets Authority (ESMA). Open-ESEF provides a robust toolkit for parsing, validating, and analyzing these ESEF XBRL filings.

Funding Acknowledgment (DFG): Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Collaborative Research Center (SFB/TRR) Project-ID 403041268 – TRR 266 Accounting for Transparency.

Open-ESEF is under active development. Stay tuned for updates and new features as the project progresses!

Getting Started

Install the stable release using pip

To install the latest stable version:

pip install openesef

Alternatively: Installing the latest version with git

  1. Clone the Repository:

    git clone https://github.com/reeyarn/openesef.git
    cd openesef
    
  2. Install Dependencies and Build Package:

    # Install Cython first
    pip install cython
    
    # Install the package in development mode with Cython compilation
    pip install -e . 
    

Note: The package will automatically compile the Cython extensions during installation. If you modify any .pyx files, you'll need to reinstall the package using pip install -e . again.

  3. Verify Installation:

    python -c "from openesef import base, taxonomy, instance; import openesef.engines.tax_pres as oetp; print('Open-ESEF installed successfully!')"
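If importing the compiled modules is not wanted (for example in a setup script), the installation can also be checked through package metadata. This is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical and not part of Open-ESEF.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical helper: reports the version without importing the package itself.
version = installed_version("openesef")
print(f"openesef {version}" if version else "openesef is not installed")
```

Unlike the one-liner above, this does not confirm that the Cython extensions import cleanly; it only confirms that pip registered the package.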
    

Usage Examples

Example 1: Loading SEC Filings (US-GAAP iXBRL)

Explore this example and its output in the notebook: examples/apple_2020.ipynb

  • Load XBRL filing using ticker and year

    from openesef.edgar.loader import load_xbrl_filing
    from openesef.engines.tax_pres import TaxonomyPresentation
    
    # Load XBRL filing using ticker and year
    xid, tax = load_xbrl_filing(ticker="AAPL", year=2020)
    
    # OR Load using filing URL:
    # xid, tax = load_xbrl_filing(filing_url="/Archives/edgar/data/320193/0000320193-20-000096.txt") 
    
  • Create presentation object to analyze statements and concepts

    t_pres = TaxonomyPresentation(tax)
    
    # Print statement names
    print("\nFinancial Statements:")
    for statement in t_pres.statement_dimensions.keys():
        print(f"- {statement}")
    
  • Get concepts from Statement of Operations

    print("\nConcepts in Statement of Operations:")
    statement_concepts = t_pres.statement_concepts.get('CONSOLIDATEDSTATEMENTSOFOPERATIONS', [])
    concepts_statement_of_operations = []
    for concept in statement_concepts:
        concepts_statement_of_operations.append(concept['concept_qname'])
        print(f"Statement: {concept['statement_name']}")
        print(f"Concept: {concept['concept_qname']}")
        print(f"Label: {concept['label']}")        
            
    
  • Print fact values for Statement of Operations concepts

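    print("\nFact Values:")
    for key, fact in xid.xbrl.facts.items():
        concept_qname = str(fact.qname)
        # The context carries the entity, period, and dimensions of the fact (not printed here)
        context = xid.xbrl.contexts[fact.context_ref]
        if concept_qname in concepts_statement_of_operations:
            # str() keeps the alignment format valid even if the value is not a string
            print(f"{concept_qname:<90} Value: {str(fact.value):<15}")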
          
    

Example 2: Loading ESEF Filing (IFRS - Volkswagen 2020)

This repository started as a fork of the fractalexperience/xbrl/ package, which I adapted to make it compatible with ESEF.

The original package did not handle ESEF filings out of the box. Unlike SEC EDGAR filings, an ESEF filing is a package with a fixed folder structure, so the schema references in an ESEF instance file are relative to the instance file rather than to the taxonomy folder. Using the SAP SE 2022 ESEF filing as an example, the filing root folder contains the following folders and files:

  📦 sap-2022-12-31-DE
  ├── 📦 META-INF
  │   ├── 📄 catalog.xml
  │   └── 📄 taxonomyPackage.xml
  ├── 📦 reports
  │   └── 📄 sap-2022-12-31-DE.xhtml
  └── 📦 www.sap.com
      ├── 📄 sap-2022-12-31.xsd
      ├── 📄 sap-2022-12-31_cal.xml
      ├── 📄 sap-2022-12-31_def.xml
      ├── 📄 sap-2022-12-31_lab-de.xml
      ├── 📄 sap-2022-12-31_lab-en.xml
      └── 📄 sap-2022-12-31_pre.xml

I have tried to modify the code to handle ESEF by adding the esef_filing_root parameter and passing it around.
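To illustrate the path problem, here is a minimal sketch of resolving a relative schema reference against the instance file's folder. It is not the code used inside Open-ESEF: the function resolve_schema_ref and the example href are assumptions made for illustration, and only the esef_filing_root name comes from the project.

```python
import posixpath


def resolve_schema_ref(esef_filing_root: str, instance_path: str, href: str) -> str:
    """Resolve a schema href as written in the instance document.

    esef_filing_root: folder of the unpacked filing
    instance_path:    instance file path, relative to the filing root
    href:             the reference exactly as it appears in the instance file
    """
    if "://" in href:
        # Absolute URLs (such as references to a core taxonomy) are left unchanged.
        return href
    # Relative hrefs are interpreted from the folder that holds the instance file.
    instance_dir = posixpath.dirname(posixpath.join(esef_filing_root, instance_path))
    return posixpath.normpath(posixpath.join(instance_dir, href))


resolved = resolve_schema_ref(
    "sap-2022-12-31-DE",
    "reports/sap-2022-12-31-DE.xhtml",
    "../www.sap.com/sap-2022-12-31.xsd",  # assumed href, for illustration only
)
print(resolved)  # sap-2022-12-31-DE/www.sap.com/sap-2022-12-31.xsd
```

Resolving from the taxonomy folder instead would send the same href one level above the filing root, which is the kind of mismatch described above.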

Explore the example with code: examples/try_vw2020.py

Attribution

ESEF Standard Acknowledgment

This project supports the European Single Electronic Format (ESEF), established by the European Securities and Markets Authority (ESMA) as the mandated digital reporting standard for annual financial reports of listed companies in the European Union. The ESEF specifications and guidelines are sourced from ESMA’s official publications and are adhered to in this implementation. For more information, visit esma.europa.eu.

IFRS Taxonomy Acknowledgment

This project leverages the IFRS Taxonomy, developed and maintained by the IFRS Foundation, to process XBRL filings based on International Financial Reporting Standards (IFRS). The taxonomy files are sourced from the IFRS Foundation’s official repository and are used in accordance with their terms of use. For more information, visit ifrs.org.

US GAAP Taxonomy Acknowledgment

This project utilizes the US GAAP Financial Reporting Taxonomy, developed and maintained by the Financial Accounting Standards Board (FASB) and XBRL US. The taxonomy files (e.g., us-gaap-YYYY-MM-DD.xsd) are sourced from xbrl.fasb.org and are used in compliance with their terms of use. For more information, visit fasb.org and xbrl.us.

Disclaimer

The use of these standards and taxonomies is intended to support educational and research purposes in alignment with the open-source goals of this project. If any use herein is found to infringe upon the rights of the FASB, XBRL US, ESMA, or the IFRS Foundation, please contact the author at reeyarn+github.openesef@gmail.com, and I will promptly remove or adjust the offending content to address any concerns.

Based on Open Source Projects

Open-ESEF builds upon and extends the excellent work of these open-source projects:

Other Related Projects

Key Features

  • ESEF Compliance: Specifically designed to handle XBRL filings in the ESEF format, addressing the unique folder structure and referencing conventions of ESEF reports.

  • XBRL Taxonomy Management:

    • Resolves XBRL concepts, labels, and relationships.
    • Processes XBRL linkbases (presentation, definition, calculation, label, reference).
    • Supports taxonomy packages and efficient in-memory storage for large taxonomies.
    • Handles references to external taxonomies like US-GAAP, IFRS, etc.
  • XBRL Instance Document Processing:

    • Parses XBRL facts and their associated contexts (entity, period, units, decimals, dimensions).
    • Supports dimensional data (explicit and typed dimensions, segments, scenarios).
    • Extracts Document and Entity Information (DEI).
    • Identifies key reporting contexts (Current/Prior, Instant/Duration).
  • Data Modeling & Storage:

    • Utilizes a Cube class for semantic indexing of facts in a multidimensional space (dimensions: metric, entity, period, unit, custom dimensions).
    • Optimized storage in partitioned JSON datasets within ZIP archives using SHA-1 hashing for efficient content addressing.
  • Inline XBRL (iXBRL) Support: Processes iXBRL documents, extracting embedded XBRL data from XHTML reports.

  • SEC EDGAR Integration:

    • Direct access to SEC EDGAR filings using company tickers
    • Real-time ticker-to-CIK mapping using the SEC's company tickers API (https://www.sec.gov/files/company_tickers.json); edgar.stock.update_symbols_data() updates the local symbols data file.
    • Automatic handling of filing downloads and XBRL extraction.
  • Modular Architecture: Well-structured codebase with clear separation of concerns (base components, taxonomy logic, instance processing, engines).

  • Logging & Debugging: Detailed logging for taxonomy resolution and instance processing.
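The ticker-to-CIK mapping listed under SEC EDGAR Integration can be sketched as follows. This is an offline illustration, not the implementation behind edgar.stock.update_symbols_data(): the sample payload mirrors the shape that company_tickers.json is assumed to have (an object of numbered records with cik_str, ticker, and title fields), and build_ticker_to_cik is a hypothetical helper.

```python
import json
from typing import Dict

# Two sample records in the assumed shape of the SEC company tickers file.
SAMPLE_PAYLOAD = """
{
  "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
  "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"}
}
"""


def build_ticker_to_cik(payload: str) -> Dict[str, str]:
    """Map upper-case tickers to CIKs zero-padded to 10 digits."""
    records = json.loads(payload)
    return {
        record["ticker"].upper(): str(record["cik_str"]).zfill(10)
        for record in records.values()
    }


ticker_to_cik = build_ticker_to_cik(SAMPLE_PAYLOAD)
print(ticker_to_cik["AAPL"])  # 0000320193
```

When fetching the real file, note that the SEC asks automated clients to identify themselves with a descriptive User-Agent header.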

Project Architecture

[Detailed Architecture Overview (Coming Soon)] - This section will be expanded to provide a more in-depth look at the Open-ESEF architecture.

Key Components:

  • base: Core modules providing fundamental classes and utilities (e.g., pool, resolver, ebase, fbase).
  • taxonomy: Modules for handling XBRL taxonomies (taxonomy, schema, linkbase, tpack).
  • instance: Modules for processing XBRL instance documents (instance, fact, context, unit, dei, filing_loader).
  • engines: Modules for reporting and data analysis (functionality to be documented).
  • edgar: Modules for SEC EDGAR filing retrieval (currently being streamlined).
  • filings_xbrl_org: Interacting with https://filings.xbrl.org/ to get the ESEF filings.
  • util: Utility functions such as util_mylogger.setup_logger().

Data Flow (Simplified):

  1. Input: XBRL/ESEF instance documents and taxonomy files.
  2. Resolution: Taxonomies and schemas are resolved and cached.
  3. Parsing: Instance documents are parsed, facts and contexts extracted.
  4. Modeling: Data is modeled using Taxonomy, Instance, and Cube classes.
  5. Output: Processed data can be accessed programmatically or serialized for storage/analysis.
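The parsing step can be pictured with a small standard-library sketch that pulls numeric facts out of an inline XBRL fragment. It is a simplified illustration, not how Open-ESEF parses filings: the sample document and its value are made up, extract_numeric_facts is a hypothetical helper, and the sketch ignores the format attribute, nested elements, and continuations.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal
from typing import Dict, List

IX_NS = "http://www.xbrl.org/2013/inlineXBRL"

# Made-up fragment; real ESEF reports are far larger and use more ix: elements.
SAMPLE_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <p>Revenue:
      <ix:nonFraction name="ifrs-full:Revenue" contextRef="c1"
                      unitRef="EUR" decimals="-6" scale="6">1,234</ix:nonFraction>
    </p>
  </body>
</html>"""


def extract_numeric_facts(xhtml: str) -> List[Dict[str, object]]:
    """Collect ix:nonFraction facts, applying the scale and sign attributes."""
    root = ET.fromstring(xhtml)
    facts = []
    for element in root.iter(f"{{{IX_NS}}}nonFraction"):
        # Assumes comma thousands separators; real filings declare this via "format".
        text = "".join(element.itertext()).strip().replace(",", "")
        value = Decimal(text) * Decimal(10) ** int(element.get("scale", "0"))
        if element.get("sign") == "-":
            value = -value
        facts.append({
            "name": element.get("name"),
            "context_ref": element.get("contextRef"),
            "unit_ref": element.get("unitRef"),
            "value": value,
        })
    return facts


for fact in extract_numeric_facts(SAMPLE_XHTML):
    print(fact["name"], fact["value"], fact["unit_ref"])
```

Each extracted fact still has to be joined with its context (entity, period, dimensions) and unit before it is useful, which is what the modeling step covers.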

Technical Highlights:

  • LXML for XML Processing: Efficient XML parsing and XLink resolution.
  • SHA-1 Hashing: Content addressing for optimized data storage.
  • Memory File System: Uses fs.memory for in-memory file handling and caching.
  • Modular Design: Encapsulated components for maintainability and extensibility.
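As an illustration of SHA-1 content addressing, the sketch below stores JSON records in an in-memory ZIP archive under names derived from their hash. The partition scheme (first two hex digits as a folder) and the helper store_record are assumptions for this example and may differ from the layout Open-ESEF actually uses.

```python
import hashlib
import io
import json
import zipfile


def store_record(archive: zipfile.ZipFile, record: dict) -> str:
    """Write a record under a name derived from the SHA-1 of its canonical JSON."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha1(payload).hexdigest()
    member = f"{digest[:2]}/{digest}.json"  # first two hex digits select the partition
    if member not in archive.namelist():    # identical content is stored only once
        archive.writestr(member, payload)
    return member


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    first = store_record(archive, {"concept": "ifrs-full:Revenue", "period": "2022"})
    # Same content with a different key order maps to the same archive member.
    second = store_record(archive, {"period": "2022", "concept": "ifrs-full:Revenue"})

with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    loaded = json.loads(archive.read(first))

print(first == second, len(names), loaded)
```

Because the member name depends only on the content, repeated facts or taxonomy fragments are deduplicated automatically, and a record can be looked up without scanning the archive.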

Standards Compliance:

  • XBRL 2.1
  • XBRL Dimensions 1.0
  • ESEF Reporting Manual

Recent Updates

  • 0.3.8

    • engines/tax_pres.py
      • Compiled this file with Cython
      • Enhanced taxonomy presentation processing
      • Fixed calculation linkbase processing errors
      • Improved memory management for large filings
      • Optimized fact extraction for disclosures
      • Added better error handling for label links
      • Enhanced logging and memory usage tracking
    • engines/ins_facts.py
      • Moved fact_df = ins_facts(xid, tax) to
    • edgar/loader.py: added get_xbrl_df() to replace get_fact_df()
  • 0.3.7

    • Taxonomy now processes calculation networks
    • Added engines.tax_pres.tax_calc_df() to get the calculation network dataframe
  • 0.3.5

    • Improved engines.tax_pres by avoiding a nested for loop for disclosure-only facts
  • 0.3.1

    • Added util.ram_usage.check_memory_usage() to check the memory usage
  • 0.3.0

    • Enhanced taxonomy presentation processing with new TaxonomyPresentation class:
      • Intelligent statement detection and concept organization
      • Automated extraction of financial statement structures
      • Improved dimension and segment validation
      • Support for both US-GAAP and IFRS taxonomies
    • Integrated SEC EDGAR functionality with memfs for efficient XBRL extraction
    • Added statement-specific concept mapping and validation
    • Improved fact extraction with dimensional context support

Author Information
