Run and Validate VTL Scripts

VTL Engine

License: AGPL 3.0

Introduction

The VTL Engine is a Python library for validating and running scripts written in VTL (Validation and Transformation Language).

Installation

Requirements

The VTL Engine requires Python 3.10 or higher.

Install with pip

To install the VTL Engine on any Operating System, you can use pip:

pip install vtlengine

Note: it is recommended to install the VTL Engine in a virtual environment.

Usage

The VTL Engine API implements two basic methods:

  • Semantic Analysis: aimed at validating the correctness of a script and computing the data structures of the data sets created in the script.
  • Run: aimed at executing the provided script on the provided input datasets.

Any action with VTL requires the following elements as input:

  • VTL script: The VTL code to be executed, which includes the transformation scheme as well as the User Defined Operators, Hierarchical Rulesets and Datapoint Rulesets. It is provided as a string or as a Path object pointing to a VTL file.
  • Data structures: Provide the structure of the input artifacts of the VTL script, according to the VTL Information Model. Given that the current version doesn't prescribe a standard format for providing this information, the VTL Engine implements a JSON format that can be found here. Data structures can be provided as dictionaries or as Paths to JSON files. It is possible to have
  • External routines: The VTL Engine allows using SQL (SQLite) with the eval operator. They can be provided as a string containing the SQL or as a Path object pointing to an SQL file. The default value is None, which should be used when external routines do not apply to the VTL script.
  • Value domains: Provide the value domains used in the VTL script, normally with an in operator. They can be provided as a dictionary or as a Path to a JSON file. The default value is None, which should be used when value domains do not apply to the VTL script.
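
The file-based variants of the first two inputs could be prepared as in the following minimal sketch. The file names (script.vtl, DS_1.json) and the helpers write_inputs and analyse are illustrative assumptions, not part of the library; only the possibility of passing a Path for the script and for the data structures comes from the description above. The data structure is the one used in the examples of this README.

```python
import json
import tempfile
from pathlib import Path

# Same structure as the one used in the examples of this README.
DATA_STRUCTURES = {
    "datasets": [
        {
            "name": "DS_1",
            "DataStructure": [
                {"name": "Id_1", "type": "Integer", "role": "Identifier", "nullable": False},
                {"name": "Me_1", "type": "Number", "role": "Measure", "nullable": True},
            ],
        }
    ]
}


def write_inputs(folder):
    """Write the VTL script and the data structures to files (hypothetical helper)."""
    script_path = Path(folder) / "script.vtl"
    script_path.write_text("DS_A := DS_1 * 10;\n", encoding="utf-8")
    structures_path = Path(folder) / "DS_1.json"
    structures_path.write_text(json.dumps(DATA_STRUCTURES, indent=2), encoding="utf-8")
    return script_path, structures_path


def analyse(script_path, structures_path):
    """Pass both inputs as Path objects (hypothetical helper; needs vtlengine installed)."""
    # Imported lazily so that the file helper above works without the engine.
    from vtlengine import semantic_analysis

    return semantic_analysis(script=script_path, data_structures=structures_path)


workdir = Path(tempfile.mkdtemp())
script_path, structures_path = write_inputs(workdir)
print(script_path.name, structures_path.name)
```

With the engine installed, analyse(script_path, structures_path) should be equivalent to passing the script as a string and the data structures as a dictionary.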

Semantic Analysis

The semantic_analysis method serves to validate the correctness of a VTL script, as well as to calculate the data structures of the datasets generated by the VTL script itself (that calculation is a pre-requisite for the semantic analysis).

  • If the VTL script is correct, the method returns a dictionary with the data structures of all the datasets generated by the script.
  • If the VTL script is incorrect, the method raises a VTL Engine custom error explaining the problem.

Example 1: Correct VTL

from vtlengine import semantic_analysis

script = """
    DS_A := DS_1 * 10;
"""

data_structures = {
    'datasets': [
        {'name': 'DS_1',
         'DataStructure': [
             {'name': 'Id_1',
              'type': 'Integer',
              'role': 'Identifier',
              'nullable': False},
             {'name': 'Me_1',
              'type': 'Number',
              'role': 'Measure',
              'nullable': True}
         ]
         }
    ]
}

sa_result = semantic_analysis(script=script, data_structures=data_structures)

print(sa_result)

Returns:

{'DS_A': Dataset(name='DS_A', components={'Id_1': Component(name='Id_1', data_type=<class 'vtlengine.DataTypes.Integer'>, role=<Role.IDENTIFIER: 'Identifier'>, nullable=False), 'Me_1': Component(name='Me_1', data_type=<class 'vtlengine.DataTypes.Number'>, role=<Role.MEASURE: 'Measure'>, nullable=True)}, data=None)}

Example 2: Incorrect VTL

Note that, as compared to Example 1, the only change is that Me_1 is of the String data type, instead of Number.

from vtlengine import semantic_analysis

script = """
    DS_A := DS_1 * 10;
"""

data_structures = {
    'datasets': [
        {'name': 'DS_1',
         'DataStructure': [
             {'name': 'Id_1',
              'type': 'Integer',
              'role': 'Identifier',
              'nullable': False},
             {'name': 'Me_1',
              'type': 'String',
              'role': 'Measure',
              'nullable': True}
         ]
         }
    ]
}

sa_result = semantic_analysis(script=script, data_structures=data_structures)

print(sa_result)

Raises the following error:

raise SemanticError(code="1-1-1-2",
vtlengine.Exceptions.SemanticError: ('Invalid implicit cast from String and Integer to Number.', '1-1-1-2')
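
When validation is part of a larger program, the error can be caught instead of letting it propagate. The sketch below is one possible way to do that; build_structures and validate are hypothetical helpers, and the import path of SemanticError is taken from the traceback above.

```python
def build_structures(measure_type):
    """Data structures for DS_1 with a configurable type for Me_1 (hypothetical helper)."""
    return {
        "datasets": [
            {
                "name": "DS_1",
                "DataStructure": [
                    {"name": "Id_1", "type": "Integer", "role": "Identifier", "nullable": False},
                    {"name": "Me_1", "type": measure_type, "role": "Measure", "nullable": True},
                ],
            }
        ]
    }


def validate(script, data_structures):
    """Return (True, structures) on success or (False, message) on a semantic error.

    Hypothetical helper; needs vtlengine installed.
    """
    # Imported lazily so that build_structures can be used without the engine.
    from vtlengine import semantic_analysis
    from vtlengine.Exceptions import SemanticError

    try:
        return True, semantic_analysis(script=script, data_structures=data_structures)
    except SemanticError as error:
        return False, str(error)


script = "DS_A := DS_1 * 10;"
bad_structures = build_structures("String")   # as in Example 2
good_structures = build_structures("Number")  # as in Example 1

# With vtlengine installed:
# ok, detail = validate(script, bad_structures)
```

Returning a flag and a message keeps the calling code free of engine-specific exception handling; whether that trade-off is worthwhile depends on the application.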

Run VTL Scripts

The run method serves to execute a VTL script with input datapoints.

It returns a dictionary with all the generated datasets. When the output parameter is set, the engine writes the result of the computation to the output folder; otherwise, it includes the data in the returned dictionary of computed datasets.

Two validations are performed before running, which can raise errors:

  • Semantic analysis: Equivalent to running the semantic_analysis method
  • Data load analysis: Basic check of the data structure (names and types)
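
To give an idea of what a basic check of names and types involves, here is a plain-Python sketch. It is an illustrative stand-in and not the engine's own data load analysis: check_datapoints is a hypothetical helper, the input is a dictionary of column lists rather than a DataFrame, and the nullability check is an added assumption.

```python
def check_datapoints(columns, dataset_structure):
    """Return a list of problems; an empty list means the basic checks passed.

    `columns` maps column names to lists of values (hypothetical helper).
    """
    problems = []
    components = {c["name"]: c for c in dataset_structure["DataStructure"]}

    # Name check: every component needs a column, and no extra columns are allowed.
    missing = sorted(set(components) - set(columns))
    extra = sorted(set(columns) - set(components))
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")

    for name, component in components.items():
        values = columns.get(name, [])
        # Assumption: non-nullable components must not contain None.
        if not component["nullable"] and any(v is None for v in values):
            problems.append(f"null value in non-nullable component {name}")
        # Type check, shown for Integer only; bool is excluded on purpose.
        if component["type"] == "Integer" and any(
            v is not None and (isinstance(v, bool) or not isinstance(v, int))
            for v in values
        ):
            problems.append(f"non-integer value in Integer component {name}")
    return problems


structure = {
    "name": "DS_1",
    "DataStructure": [
        {"name": "Id_1", "type": "Integer", "role": "Identifier", "nullable": False},
        {"name": "Me_1", "type": "Number", "role": "Measure", "nullable": True},
    ],
}

valid = check_datapoints({"Id_1": [1, 2, 3], "Me_1": [10, 20, 30]}, structure)
invalid = check_datapoints({"Id_1": [1, None, 3]}, structure)
print(valid)
print(invalid)
```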

Example 3: Simple run

from vtlengine import run
import pandas as pd

script = """
    DS_A := DS_1 * 10;
"""

data_structures = {
    'datasets': [
        {'name': 'DS_1',
         'DataStructure': [
             {'name': 'Id_1',
              'type': 'Integer',
              'role': 'Identifier',
              'nullable': False},
             {'name': 'Me_1',
              'type': 'Number',
              'role': 'Measure',
              'nullable': True}
         ]
         }
    ]
}

data_df = pd.DataFrame(
    {"Id_1": [1, 2, 3],
     "Me_1": [10, 20, 30]})

datapoints = {"DS_1": data_df}

run_result = run(script=script, data_structures=data_structures,
                 datapoints=datapoints)

print(run_result)

Returns:

{'DS_A': Dataset(name='DS_A', components={'Id_1': Component(name='Id_1', data_type=<class 'vtlengine.DataTypes.Integer'>, role=<Role.IDENTIFIER: 'Identifier'>, nullable=False), 'Me_1': Component(name='Me_1', data_type=<class 'vtlengine.DataTypes.Number'>, role=<Role.MEASURE: 'Measure'>, nullable=True)}, data=  Id_1   Me_1
0    1  100.0
1    2  200.0
2    3  300.0)}
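
The arithmetic behind this result can be reproduced without the engine. The sketch below mirrors what the output above shows for DS_A := DS_1 * 10: the measure is multiplied and the identifier is left untouched. multiply_measures is a hypothetical helper, rows are plain dictionaries instead of a DataFrame, and keeping a null measure as null is an assumption.

```python
def multiply_measures(rows, dataset_structure, factor):
    """Multiply every measure by `factor`, leaving other components unchanged."""
    measures = {
        c["name"] for c in dataset_structure["DataStructure"] if c["role"] == "Measure"
    }
    result = []
    for row in rows:
        new_row = {}
        for name, value in row.items():
            if name in measures and value is not None:
                # Number measures come back as floats in the engine output above.
                new_row[name] = float(value) * factor
            else:
                # Identifiers are copied; assumption: a null measure stays null.
                new_row[name] = value
        result.append(new_row)
    return result


structure = {
    "name": "DS_1",
    "DataStructure": [
        {"name": "Id_1", "type": "Integer", "role": "Identifier", "nullable": False},
        {"name": "Me_1", "type": "Number", "role": "Measure", "nullable": True},
    ],
}

ds_1 = [{"Id_1": 1, "Me_1": 10}, {"Id_1": 2, "Me_1": 20}, {"Id_1": 3, "Me_1": 30}]
ds_a = multiply_measures(ds_1, structure, 10)
print(ds_a)
```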

For more information on usage, please refer to the API documentation.
