
Energyml helper

energyml-utils

Installation

energyml-utils can be installed with pip:

pip install energyml-utils

or with poetry:

poetry add energyml-utils

Features

Supported packages versions

This package can read and write, in xml and json, objects from the following packages:

  • EML (common) : 2.0, 2.1, 2.2, 2.3
  • RESQML : 2.0.1, 2.2dev3, 2.2
  • WITSML : 2.0, 2.1
  • PRODML : 2.0, 2.2

/!\ These packages are published independently and are not installed by default. Install only the versions you need by adding the corresponding lines to your project's .toml file (pyproject.toml when using poetry):

energyml-common2-0 = "^1.12.0"
energyml-common2-1 = "^1.12.0"
energyml-common2-2 = "^1.12.0"
energyml-common2-3 = "^1.12.0"
energyml-resqml2-0-1 = "^1.12.0"
energyml-resqml2-2-dev3 = "^1.12.0"
energyml-resqml2-2 = "^1.12.0"
energyml-witsml2-0 = "^1.12.0"
energyml-witsml2-1 = "^1.12.0"
energyml-prodml2-0 = "^1.12.0"
energyml-prodml2-2 = "^1.12.0"
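
Because each version package is optional, it can help to check which ones are present in the current environment. The sketch below uses only the standard library (importlib.metadata) and the distribution names listed above; the helper name installed_versions is hypothetical and is not part of energyml-utils.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Distribution names copied from the dependency list above.
ENERGYML_DISTRIBUTIONS = [
    "energyml-common2-0",
    "energyml-common2-1",
    "energyml-common2-2",
    "energyml-common2-3",
    "energyml-resqml2-0-1",
    "energyml-resqml2-2-dev3",
    "energyml-resqml2-2",
    "energyml-witsml2-0",
    "energyml-witsml2-1",
    "energyml-prodml2-0",
    "energyml-prodml2-2",
]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


for dist_name, dist_version in installed_versions(ENERGYML_DISTRIBUTIONS).items():
    print(f"{dist_name}: {dist_version or 'not installed'}")
```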

Content of the package:

  • Supports EPC + h5 read and write
    • .rels files are generated automatically, but you can also add custom Relations.
    • You can add "raw files", such as PDFs, to your EPC instance; they are packaged with the other files in the ".epc" file when you call the "export" function.
    • You can work with local files, but also with IO objects (BytesIO). This is useful for cloud applications that need to avoid local storage.
  • Supports xml / json read and write (for energyml objects)
  • Work in progress: supports reading 3D data through the "AbstractMesh" class (and its sub-classes "PointSetMesh", "PolylineSetMesh", "SurfaceMesh"). This gives you an instance containing a list of points and a list of indices, from which you can easily re-create a 3D representation of the data.
    • These "mesh" classes provide .obj, .off, and .geojson export.
  • Introspection: this package includes functions that ease access to specific values inside energyml objects.
    • Functions to access the UUID and the object version, and more generic functions to access any other attribute through a regex-like path such as ".Citation.Title" or "Cit\.*.Title" (regular dots are used as in Python object attribute access; to use a dot in the regex, you must escape it with a '\')
    • Functions to parse the "ContentType" or "QualifiedType", or to generate it from an energyml object
    • Generation of random data: you can generate random values for a specific energyml object. For example, you can generate a WITSML Tubular object with random values in it.
  • Object correctness validation:
    • You can verify that your objects are valid according to the energyml standard (checks cover regex-constrained attributes, maxCount, minCount, mandatory attributes, etc.)
    • DOR validation: checks that the DOR carries correct information (title, ContentType/QualifiedType, object version) and that the referenced object exists in the context of the EPC instance (or of a list of objects).
  • Abstractions to ease use with ETP (Energistics Transfer Protocol):
    • The "EnergymlWorkspace" class abstracts access to numerical data such as "ExternalArrays". This class can thus be extended to interact with ETP "GetDataArray" requests, etc.
  • ETP URI support: the "Uri" class parses and writes ETP URIs.
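
To illustrate the regex-like attribute paths mentioned under Introspection, here is a minimal standalone sketch. It does not use the package's own functions: split_path and find_values are hypothetical names. It assumes that an unescaped dot separates attribute names, that an escaped dot ("\.") becomes a regex dot inside a segment, and that matching is case-insensitive. These are a reading of the description above, not confirmed behavior.

```python
import re
from types import SimpleNamespace
from typing import Any, List

# A dot that is NOT preceded by a backslash separates two path segments.
_UNESCAPED_DOT = re.compile(r"(?<!\\)\.")


def split_path(path: str) -> List[str]:
    """Split a path into regex segments, turning escaped dots into regex dots."""
    segments = [seg for seg in _UNESCAPED_DOT.split(path) if seg]
    return [seg.replace("\\.", ".") for seg in segments]


def find_values(obj: Any, path: str) -> List[Any]:
    """Return every value reached by matching each segment against attribute names."""
    current = [obj]
    for segment in split_path(path):
        pattern = re.compile(segment, re.IGNORECASE)  # assumption: case-insensitive
        next_level = []
        for item in current:
            # Objects without a __dict__ (str, int, ...) have no attributes to walk.
            for name, value in getattr(item, "__dict__", {}).items():
                if pattern.fullmatch(name):
                    next_level.append(value)
        current = next_level
    return current


# Stand-in object; a real energyml object would be a generated dataclass.
demo = SimpleNamespace(
    uuid="12345678-1234-1234-1234-123456789abc",
    citation=SimpleNamespace(title="My Feature", originator="me"),
)
print(find_values(demo, ".Citation.Title"))
print(find_values(demo, r"Cit\.*.Title"))
```

Both calls reach the same title: the first names the attribute exactly, while the second matches any attribute whose name starts with "Cit".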

EPC Stream Reader

The EpcStreamReader provides memory-efficient handling of large EPC files through lazy loading and smart caching. Unlike the standard Epc class, which loads all objects into memory, the stream reader loads objects on demand, making it well suited to very large EPC files with thousands of objects.

Key Features

  • Lazy Loading: Objects are loaded only when accessed, reducing memory footprint
  • Smart Caching: LRU (Least Recently Used) cache with configurable size
  • Automatic EPC Version Detection: Supports both CLASSIC and EXPANDED EPC formats
  • Add/Remove/Update Operations: Full CRUD operations with automatic file structure maintenance
  • Context Management: Automatic resource cleanup through "with" statements
  • Memory Monitoring: Track cache efficiency and memory usage statistics
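
The combination of lazy loading and an LRU cache can be pictured with the small standalone sketch below. It is not the EpcStreamReader implementation: the class LazyLruStore and its loader callback are hypothetical, and only show the general technique (load on first access, keep the most recently used entries, evict the oldest when the cache is full).

```python
from collections import OrderedDict
from typing import Any, Callable, List


class LazyLruStore:
    """Load values on demand and keep at most `cache_size` of them in memory."""

    def __init__(self, loader: Callable[[str], Any], cache_size: int = 2) -> None:
        self._loader = loader
        self._cache_size = cache_size
        self._cache: "OrderedDict[str, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, identifier: str) -> Any:
        if identifier in self._cache:
            # Cache hit: mark the entry as most recently used.
            self._cache.move_to_end(identifier)
            self.hits += 1
            return self._cache[identifier]
        # Cache miss: load lazily, then evict the least recently used entry if full.
        self.misses += 1
        value = self._loader(identifier)
        self._cache[identifier] = value
        if len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)
        return value

    def cached_identifiers(self) -> List[str]:
        """Identifiers currently in memory, least recently used first."""
        return list(self._cache)


loaded: List[str] = []


def fake_loader(identifier: str) -> str:
    """Stand-in for parsing an object out of the archive."""
    loaded.append(identifier)
    return identifier.upper()


store = LazyLruStore(fake_loader, cache_size=2)
for key in ["a", "b", "a", "c", "b"]:
    store.get(key)
print(loaded, store.hits, store.misses, store.cached_identifiers())
```

In this run "b" is loaded twice: it was the least recently used entry when "c" arrived, so it was evicted and had to be reloaded.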

Basic Usage

from typing import Any, List

from energyml.utils.epc_stream import EpcStreamReader

# Open EPC file with context manager (recommended)
with EpcStreamReader('large_file.epc', cache_size=50) as reader:
    # List all objects without loading them
    print(f"Total objects: {reader.stats.total_objects}")
    
    # Get object by identifier
    obj: Any = reader.get_object_by_identifier("uuid.version")
    
    # Get objects by type
    features: List[Any] = reader.get_objects_by_type("BoundaryFeature")
    
    # Get all objects with same UUID
    versions: List[Any] = reader.get_object_by_uuid("12345678-1234-1234-1234-123456789abc")

Adding Objects

from energyml.utils.epc_stream import EpcStreamReader
from energyml.utils.constants import gen_uuid
import energyml.resqml.v2_2.resqmlv2 as resqml
import energyml.eml.v2_3.commonv2 as eml

# Create a new EnergyML object
boundary_feature = resqml.BoundaryFeature()
boundary_feature.uuid = gen_uuid()
boundary_feature.citation = eml.Citation(title="My Feature")

with EpcStreamReader('my_file.epc') as reader:
    # Add object - path is automatically generated based on EPC version
    identifier = reader.add_object(boundary_feature)
    print(f"Added object with identifier: {identifier}")
    
    # Or specify custom path (optional)
    identifier = reader.add_object(boundary_feature, "custom/path/MyFeature.xml")

Removing Objects

from energyml.utils.epc_stream import EpcStreamReader

with EpcStreamReader('my_file.epc') as reader:
    # Remove specific version by full identifier
    success = reader.remove_object("uuid.version")
    
    # Remove ALL versions by UUID only
    success = reader.remove_object("12345678-1234-1234-1234-123456789abc")
    
    if success:
        print("Object(s) removed successfully")
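
The difference between removing one version and removing all versions can be sketched as a selection over identifiers. The code below is an illustration only, assuming identifiers have the "uuid.version" shape used in the examples above; split_identifier and select_identifiers are hypothetical helpers, not functions of the package.

```python
from typing import List, Optional, Tuple


def split_identifier(identifier: str) -> Tuple[str, Optional[str]]:
    """Split "uuid.version" at the first dot; a bare UUID yields version None."""
    uuid, _, version = identifier.partition(".")
    return uuid, (version or None)


def select_identifiers(stored: List[str], target: str) -> List[str]:
    """Pick the stored identifiers that `target` designates.

    A full "uuid.version" target selects that version only;
    a bare UUID selects every stored version of that object.
    """
    target_uuid, target_version = split_identifier(target)
    selected = []
    for identifier in stored:
        uuid, version = split_identifier(identifier)
        if uuid != target_uuid:
            continue
        if target_version is None or version == target_version:
            selected.append(identifier)
    return selected


stored_ids = [
    "12345678-1234-1234-1234-123456789abc.1",
    "12345678-1234-1234-1234-123456789abc.2",
    "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.1",
]
print(select_identifiers(stored_ids, "12345678-1234-1234-1234-123456789abc.2"))
print(select_identifiers(stored_ids, "12345678-1234-1234-1234-123456789abc"))
```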

Updating Objects

...
from energyml.utils.introspection import set_attribute_from_path

with EpcStreamReader('my_file.epc') as reader:
    # Get existing object
    obj = reader.get_object_by_identifier("uuid.version")
    
    # Modify the object
    set_attribute_from_path(obj, "citation.title", "Updated Title")
    
    # Update in EPC file
    new_identifier = reader.update_object(obj)
    print(f"Updated object: {new_identifier}")

Performance Monitoring

from energyml.utils.epc_stream import EpcStreamReader

with EpcStreamReader('large_file.epc', cache_size=100) as reader:
    # Access some objects...
    for i in range(10):
        obj = reader.get_object_by_identifier(f"uuid-{i}.1")
    
    # Check performance statistics
    print(f"Cache hit rate: {reader.stats.cache_hit_rate:.1f}%")
    print(f"Memory efficiency: {reader.stats.memory_efficiency:.1f}%") 
    print(f"Objects in cache: {reader.stats.loaded_objects}/{reader.stats.total_objects}")
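
As a rough guide to reading these numbers, the sketch below shows one plausible way such percentages could be derived. The field names mirror the attributes printed above, but the formulas are assumptions: in particular, "memory efficiency" is taken here to mean the share of objects that are not currently loaded. The CacheStats class itself is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Hypothetical counters; formulas below are assumptions, not the library's."""

    total_objects: int = 0
    loaded_objects: int = 0
    cache_hits: int = 0
    cache_misses: int = 0

    @property
    def cache_hit_rate(self) -> float:
        """Percentage of lookups served from the cache."""
        lookups = self.cache_hits + self.cache_misses
        return 100.0 * self.cache_hits / lookups if lookups else 0.0

    @property
    def memory_efficiency(self) -> float:
        """Assumed meaning: percentage of objects NOT held in memory."""
        if not self.total_objects:
            return 100.0
        return 100.0 * (1 - self.loaded_objects / self.total_objects)


stats = CacheStats(total_objects=200, loaded_objects=50, cache_hits=30, cache_misses=10)
print(f"Cache hit rate: {stats.cache_hit_rate:.1f}%")
print(f"Memory efficiency: {stats.memory_efficiency:.1f}%")
```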

EPC Version Support

The EpcStreamReader automatically detects and handles both EPC packaging formats:

  • CLASSIC Format: Flat file structure (e.g., obj_BoundaryFeature_{uuid}.xml)
  • EXPANDED Format: Namespace structure (e.g., namespace_resqml201/version_{id}/obj_BoundaryFeature_{uuid}.xml or namespace_resqml201/obj_BoundaryFeature_{uuid}.xml)

from energyml.utils.epc_stream import EpcStreamReader

with EpcStreamReader('my_file.epc') as reader:
    print(f"Detected EPC version: {reader.export_version}")
    # Objects added will use the same format as the existing EPC file
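
Since an EPC file is a zip archive, the two layouts can be told apart by looking at the member paths. The sketch below is a simplified heuristic, not the detection logic used by EpcStreamReader: it assumes that any member under a "namespace_" folder indicates the EXPANDED layout. The functions detect_layout and make_archive are hypothetical, and the archive is built in memory with BytesIO so no local file is needed.

```python
import io
import zipfile
from typing import Iterable


def detect_layout(epc_bytes: io.BytesIO) -> str:
    """Guess the packaging layout from the member paths of a zip archive."""
    with zipfile.ZipFile(epc_bytes) as archive:
        names = archive.namelist()
    # Assumption: EXPANDED archives store objects under "namespace_*" folders.
    if any(name.startswith("namespace_") for name in names):
        return "EXPANDED"
    return "CLASSIC"


def make_archive(paths: Iterable[str]) -> io.BytesIO:
    """Build a tiny in-memory zip with placeholder content at each path."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path in paths:
            archive.writestr(path, "<placeholder/>")
    buffer.seek(0)
    return buffer


uuid = "12345678-1234-1234-1234-123456789abc"
classic = make_archive([f"obj_BoundaryFeature_{uuid}.xml"])
expanded = make_archive([f"namespace_resqml201/obj_BoundaryFeature_{uuid}.xml"])
print(detect_layout(classic), detect_layout(expanded))
```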

Advanced Usage

from energyml.utils.epc_stream import EpcStreamReader

# Initialize without preloading metadata for faster startup
reader = EpcStreamReader('huge_file.epc', preload_metadata=False, cache_size=200)

try:
    # Manual metadata loading when needed
    reader._load_metadata()
    
    # Get object dependencies
    deps = reader.get_object_dependencies("uuid.version")
    
    # Batch processing with memory monitoring
    for obj_type in ["BoundaryFeature", "PropertyKind"]:
        objects = reader.get_objects_by_type(obj_type)
        print(f"Processing {len(objects)} {obj_type} objects")
        
finally:
    reader.close()  # Manual cleanup if not using context manager

The EpcStreamReader is well suited to applications that must process large EPC files while keeping memory usage low, such as data processing pipelines, web applications, or analysis tools.

Poetry scripts:

  • extract_3d : extract a representation into a 3D file (obj/off)
  • csv_to_dataset : translate csv data into an h5 dataset
  • generate_data : generate random data from a qualified_type
  • xml_to_json : translate an energyml xml file into a json file
  • json_to_xml : translate an energyml json file into an xml file
  • describe_as_csv : create a csv description of an EPC's content
  • validate : validate an energyml object or an EPC instance (or a folder containing energyml objects)

Installation, to test the poetry scripts:

poetry install

If a script fails to run, you may need to add "src" to your PYTHONPATH environment variable. For example, in PowerShell:

$env:PYTHONPATH="src"

Validation examples (the "*>" redirection is PowerShell syntax that sends all output streams to the given file):

An epc file:

poetry run validate --file "path/to/your/energyml/object.epc" *> output_logs.json

An xml file:

poetry run validate --file "path/to/your/energyml/object.xml" *> output_logs.json

A json file:

poetry run validate --file "path/to/your/energyml/object.json" *> output_logs.json

A folder containing Epc/xml/json files:

poetry run validate --file "path/to/your/folder" *> output_logs.json
