
An open-source Python library for ESEF XBRL filings


# Open ESEF
A Python Library for ESEF and XBRL Filings
Project Status: Under Development (70% Complete)
License: GPL v3.0

Open-ESEF is a Python-based, open-source project designed to handle XBRL (eXtensible Business Reporting Language) filings, specifically those adhering to the ESEF (European Single Electronic Format) standard.

ESEF is the mandated digital reporting format for annual financial reports of listed companies in the European Union, established by the European Securities and Markets Authority (ESMA). Open-ESEF provides a robust toolkit for parsing, validating, and analyzing these ESEF XBRL filings.

Funding Acknowledgment (DFG): Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Collaborative Research Center (SFB/TRR) Project-ID 403041268 – TRR 266 Accounting for Transparency.

Open-ESEF is under active development. Stay tuned for updates and new features as the project progresses!

Getting Started

Installation with git

  1. Clone the Repository:

    git clone https://github.com/reeyarn/openesef.git
    cd openesef
    
  2. Install Dependencies and Build Package:

    # Install Cython first
    pip install cython
    
    # Install the package in development mode with Cython compilation
    pip install -e . 
    

Note: The package will automatically compile the Cython extensions during installation. If you modify any .pyx files, you'll need to reinstall the package using pip install -e . again.

  3. Verify Installation:

    python -c "from openesef import base, taxonomy, instance; import openesef.engines.tax_pres as oetp; print('Open-ESEF installed successfully!')"
    

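If the one-line check above fails, it can help to see which submodule cannot be located. The following is a minimal sketch using only the standard library; the helper name `check_modules` is hypothetical and not part of Open-ESEF, and the module names are taken from the verification command above.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises ModuleNotFoundError for "pkg.sub" when "pkg" is missing
            status[name] = False
    return status


if __name__ == "__main__":
    modules = [
        "openesef.base",
        "openesef.taxonomy",
        "openesef.instance",
        "openesef.engines.tax_pres",
    ]
    for name, found in check_modules(modules).items():
        print(f"{name:<30} {'found' if found else 'MISSING'}")
```

A module reported as missing after a successful `pip install -e .` may point to a Cython extension that did not compile.
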
Usage Examples

Example 1: Loading SEC Filings (US-GAAP iXBRL)

See the example and its output in the notebook: examples/apple_2020.ipynb

  • Load XBRL filing using ticker and year

    from openesef.edgar.loader import load_xbrl_filing
    from openesef.engines.tax_pres import TaxonomyPresentation
    
    # Load XBRL filing using ticker and year
    xid, tax = load_xbrl_filing(ticker="AAPL", year=2020)
    
    # OR Load using filing URL:
    # xid, tax = load_xbrl_filing(filing_url="/Archives/edgar/data/320193/0000320193-20-000096.txt") 
    
  • Create presentation object to analyze statements and concepts

    t_pres = TaxonomyPresentation(tax)
    
    # Print statement names
    print("\nFinancial Statements:")
    for statement in t_pres.statement_dimensions.keys():
        print(f"- {statement}")
    
  • Get concepts from Statement of Operations

    print("\nConcepts in Statement of Operations:")
    statement_concepts = t_pres.statement_concepts.get('CONSOLIDATEDSTATEMENTSOFOPERATIONS', [])
    concepts_statement_of_operations = []
    for concept in statement_concepts:
        concepts_statement_of_operations.append(concept['concept_qname'])
        print(f"Statement: {concept['statement_name']}")
        print(f"Concept: {concept['concept_qname']}")
        print(f"Label: {concept['label']}")        
            
    
  • Print fact values for Statement of Operations concepts

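    print("\nFact Values:")
    for key, fact in xid.xbrl.facts.items():
        concept_qname = str(fact.qname)
        # The context carries the entity and period of the fact (not printed here)
        context = xid.xbrl.contexts[fact.context_ref]
        if concept_qname in concepts_statement_of_operations:
            # str() keeps the alignment spec valid even if the value is None or numeric
            print(f"{concept_qname:<90} Value: {str(fact.value):<15}")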
          
    

Example 2: Loading ESEF Filing (IFRS - Volkswagen 2020)

This repository started as a fork of the fractalexperience/xbrl package, which I adapted to work with ESEF filings.

Unlike SEC EDGAR filings, an ESEF filing is a package with a fixed folder structure. Schema references in an ESEF filing are therefore relative to the instance file rather than to the taxonomy folder, and the fractalexperience/xbrl package did not handle this out of the box. Taking the SAP SE 2022 ESEF filing as an example, the filing root folder contains the following folders and files:

  📦 sap-2022-12-31-DE
  ├── 📦 META-INF
  │   ├── 📄 catalog.xml
  │   └── 📄 taxonomyPackage.xml
  ├── 📦 reports
  │   └── 📄 sap-2022-12-31-DE.xhtml
  └── 📦 www.sap.com
      ├── 📄 sap-2022-12-31.xsd
      ├── 📄 sap-2022-12-31_cal.xml
      ├── 📄 sap-2022-12-31_def.xml
      ├── 📄 sap-2022-12-31_lab-de.xml
      ├── 📄 sap-2022-12-31_lab-en.xml
      └── 📄 sap-2022-12-31_pre.xml

To handle this layout, I added an esef_filing_root parameter and pass it through the relevant calls.
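
To illustrate what "relative to the instance file" means, here is a minimal sketch that resolves a schema reference against the folder containing the instance document. It is not Open-ESEF's implementation: the function `resolve_schema_ref` and the `href` value are hypothetical, and only the folder names come from the SAP layout shown above.

```python
import posixpath


def resolve_schema_ref(instance_path, href):
    """Resolve a relative schemaRef href against the folder holding the instance file."""
    if "://" in href:
        # Absolute URLs are returned unchanged; remapping them
        # (for example via catalog.xml) is out of scope for this sketch
        return href
    instance_dir = posixpath.dirname(instance_path)
    return posixpath.normpath(posixpath.join(instance_dir, href))


# Paths follow the SAP folder layout; the href value is a hypothetical example
instance_path = "sap-2022-12-31-DE/reports/sap-2022-12-31-DE.xhtml"
href = "../www.sap.com/sap-2022-12-31.xsd"
print(resolve_schema_ref(instance_path, href))
```

Resolving against the taxonomy folder instead would look for the schema in the wrong place, which is the mismatch described above.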

Explore the example with code: examples/try_vw2020.py

Based on Open Source Projects

Open-ESEF builds upon and extends the excellent work of these open-source projects:

Other Related Projects

Key Features

  • ESEF Compliance: Specifically designed to handle XBRL filings in the ESEF format, addressing the unique folder structure and referencing conventions of ESEF reports.

  • XBRL Taxonomy Management:

    • Resolves XBRL concepts, labels, and relationships.
    • Processes XBRL linkbases (presentation, definition, calculation, label, reference).
    • Supports taxonomy packages and efficient in-memory storage for large taxonomies.
    • Handles references to external taxonomies like US-GAAP, IFRS, etc.
  • XBRL Instance Document Processing:

    • Parses XBRL facts and their associated contexts (entity, period, units, decimals, dimensions).
    • Supports dimensional data (explicit and typed dimensions, segments, scenarios).
    • Extracts Document and Entity Information (DEI).
    • Identifies key reporting contexts (Current/Prior, Instant/Duration).
  • Data Modeling & Storage:

    • Utilizes a Cube class for semantic indexing of facts in a multidimensional space (dimensions: metric, entity, period, unit, custom dimensions).
    • Optimized storage in partitioned JSON datasets within ZIP archives using SHA-1 hashing for efficient content addressing.
  • Inline XBRL (iXBRL) Support: Processes iXBRL documents, extracting embedded XBRL data from XHTML reports.

  • SEC EDGAR Integration:

    • Direct access to SEC EDGAR filings using company tickers.
    • Ticker-to-CIK mapping based on the SEC's company tickers file (https://www.sec.gov/files/company_tickers.json); edgar.stock.update_symbols_data() updates the local symbols data file.
    • Automatic handling of filing downloads and XBRL extraction.
  • Modular Architecture: Well-structured codebase with clear separation of concerns (base components, taxonomy logic, instance processing, engines).

  • Logging & Debugging: Detailed logging for taxonomy resolution and instance processing.
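
The ticker-to-CIK mapping listed under SEC EDGAR Integration can be pictured with a small offline sketch. It assumes the company tickers file is a JSON object whose values carry `cik_str`, `ticker`, and `title` keys; the helper `build_ticker_to_cik` is hypothetical and not the library's API, and no network request is made.

```python
import json

# Offline sample mimicking the assumed layout of company_tickers.json
SAMPLE = json.loads("""
{
  "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
  "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"}
}
""")


def build_ticker_to_cik(payload):
    """Map upper-cased ticker symbols to 10-digit, zero-padded CIK strings."""
    return {
        entry["ticker"].upper(): str(entry["cik_str"]).zfill(10)
        for entry in payload.values()
    }


ticker_to_cik = build_ticker_to_cik(SAMPLE)
print(ticker_to_cik["AAPL"])
```

The CIK for AAPL matches the one in the filing URL of Example 1 (/Archives/edgar/data/320193/...).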

Project Architecture

[Detailed Architecture Overview (Coming Soon)] - This section will be expanded to provide a more in-depth look at the Open-ESEF architecture.

Key Components:

  • base: Core modules providing fundamental classes and utilities (e.g., pool, resolver, ebase, fbase).
  • taxonomy: Modules for handling XBRL taxonomies (taxonomy, schema, linkbase, tpack).
  • instance: Modules for processing XBRL instance documents (instance, fact, context, unit, dei, filing_loader).
  • engines: Modules for reporting and data analysis (functionality to be documented).
  • edgar: Modules for SEC EDGAR filing retrieval (currently being streamlined).
  • filings_xbrl_org: Interacting with https://filings.xbrl.org/ to get the ESEF filings.
  • util: Utility functions such as util_mylogger.setup_logger().

Data Flow (Simplified):

  1. Input: XBRL/ESEF instance documents and taxonomy files.
  2. Resolution: Taxonomies and schemas are resolved and cached.
  3. Parsing: Instance documents are parsed, facts and contexts extracted.
  4. Modeling: Data is modeled using Taxonomy, Instance, and Cube classes.
  5. Output: Processed data can be accessed programmatically or serialized for storage/analysis.
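
The parsing and modeling steps above can be sketched on a toy instance document. This is an illustration under stated assumptions, not Open-ESEF's code: it uses the standard library's `xml.etree.ElementTree` instead of LXML, the functions `parse_instance` and `index_facts` are hypothetical, and the entity, namespace, and value in the sample are made up.

```python
import xml.etree.ElementTree as ET

XBRLI = "http://www.xbrl.org/2003/instance"

# Tiny hand-written instance; entity, concept namespace, and value are invented
DOC = """<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance"
                     xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
                     xmlns:ex="http://example.com/taxonomy">
  <xbrli:context id="c1">
    <xbrli:entity>
      <xbrli:identifier scheme="http://example.com/id">EXAMPLE</xbrli:identifier>
    </xbrli:entity>
    <xbrli:period>
      <xbrli:startDate>2020-01-01</xbrli:startDate>
      <xbrli:endDate>2020-12-31</xbrli:endDate>
    </xbrli:period>
  </xbrli:context>
  <xbrli:unit id="u1"><xbrli:measure>iso4217:EUR</xbrli:measure></xbrli:unit>
  <ex:Revenue contextRef="c1" unitRef="u1" decimals="0">1000</ex:Revenue>
</xbrli:xbrl>"""


def parse_instance(xml_text):
    """Return (contexts, facts): contexts keyed by id, facts as a list of dicts."""
    root = ET.fromstring(xml_text)
    ns = {"xbrli": XBRLI}
    contexts = {}
    for ctx in root.findall("xbrli:context", ns):
        entity = ctx.findtext("xbrli:entity/xbrli:identifier", namespaces=ns)
        instant = ctx.findtext("xbrli:period/xbrli:instant", namespaces=ns)
        start = ctx.findtext("xbrli:period/xbrli:startDate", namespaces=ns)
        end = ctx.findtext("xbrli:period/xbrli:endDate", namespaces=ns)
        period = instant if instant else f"{start}/{end}"
        contexts[ctx.get("id")] = {"entity": entity, "period": period}
    facts = []
    for elem in root:
        # Facts are the top-level elements that carry a contextRef attribute
        if elem.get("contextRef") is not None:
            facts.append({
                "concept": elem.tag,  # Clark notation: {namespace}localname
                "context_ref": elem.get("contextRef"),
                "unit_ref": elem.get("unitRef"),
                "value": elem.text,
            })
    return contexts, facts


def index_facts(contexts, facts):
    """Key each fact value by (concept, entity, period, unit id)."""
    index = {}
    for fact in facts:
        ctx = contexts[fact["context_ref"]]
        key = (fact["concept"], ctx["entity"], ctx["period"], fact["unit_ref"])
        index[key] = fact["value"]
    return index


contexts, facts = parse_instance(DOC)
cube = index_facts(contexts, facts)
for key, value in cube.items():
    print(key, value)
```

Keying facts by a tuple of dimensions is one simple way to picture a multidimensional index; the library's Cube class may be organized differently.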

Technical Highlights:

  • LXML for XML Processing: Efficient XML parsing and XLink resolution.
  • SHA-1 Hashing: Content addressing for optimized data storage.
  • Memory File System: Uses fs.memory for in-memory file handling and caching.
  • Modular Design: Encapsulated components for maintainability and extensibility.
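
As a rough picture of SHA-1 content addressing, the sketch below stores JSON records in an in-memory ZIP archive under names derived from their digests, so identical content is written only once. The function `store_record` and the two-character partition folder are assumptions made for illustration and do not describe Open-ESEF's actual storage layout.

```python
import hashlib
import io
import json
import zipfile


def store_record(archive, record):
    """Write a JSON record under a name derived from its SHA-1 digest; return the name."""
    # sort_keys makes the serialization independent of key order
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(payload).hexdigest()
    # The first two hex characters act as a partition folder (an assumed layout)
    name = f"{digest[:2]}/{digest}.json"
    if name not in archive.namelist():
        archive.writestr(name, payload)
    return name


buffer = io.BytesIO()  # in-memory archive, so nothing touches the disk
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    first = store_record(archive, {"concept": "ex:Revenue", "value": "1000"})
    # Same content with a different key order maps to the same entry
    second = store_record(archive, {"value": "1000", "concept": "ex:Revenue"})
    stored = archive.namelist()

print(first == second, len(stored))
```

Because the entry name is a function of the content, a lookup needs only the digest, and duplicate records cost no extra space.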

Standards Compliance:

  • XBRL 2.1
  • XBRL Dimensions 1.0
  • ESEF Reporting Manual

Recent Updates

  • 0.3.8

    • engines/tax_pres.py: Enhanced taxonomy presentation processing
      • Fixed calculation linkbase processing errors
      • Improved memory management for large filings
      • Optimized fact extraction for disclosures
      • Added better error handling for label links
      • Enhanced logging and memory usage tracking
    • edgar/loader.py: Added get_xbrl_df() to replace get_fact_df()
  • 0.3.7

    • Taxonomy now processes calculation networks
    • Added engines.tax_pres.tax_calc_df() to get the calculation network dataframe
  • 0.3.5

    • Improved engines.tax_pres by avoiding a nested for loop for disclosure-only facts
  • 0.3.1

    • Added util.ram_usage.check_memory_usage() to check the memory usage
  • 0.3.0

    • Enhanced taxonomy presentation processing with new TaxonomyPresentation class:
      • Intelligent statement detection and concept organization
      • Automated extraction of financial statement structures
      • Improved dimension and segment validation
      • Support for both US-GAAP and IFRS taxonomies
    • Integrated SEC EDGAR functionality with memfs for efficient XBRL extraction
    • Added statement-specific concept mapping and validation
    • Improved fact extraction with dimensional context support

Author Information
