A python module for TM1 Bedrock.
Project description
TM1 Data Integration and Automation Toolkit
This project is a Python-based toolkit by Knowledgeseed, designed to streamline data integration and automation tasks with IBM Planning Analytics (TM1). It leverages the TM1py library to provide a high-level, configurable, and extensible framework for moving data between TM1 cubes, SQL databases, and CSV files. The toolkit is built with a focus on performance, offering features like asynchronous operations and detailed logging for debugging and optimization.
For further information about the project, please refer to https://knowledgeseed.ch/blog/tmbedrockpy_python_library.
The detailed documentation of the project is available at https://tm1-bedrock-py.readthedocs.io/en/latest/.
Features
- Versatile Data Flows:
- Copy data between TM1 cubes (inter-cube and intra-cube).
- Load data from SQL databases into TM1 cubes.
- Export data from TM1 cubes to SQL tables.
- Load data from CSV files into TM1 cubes.
- Export data from TM1 cubes to CSV files.
- Powerful Data Transformation:
- Apply a series of mapping and transformation steps to your data in-flight.
- Support for shared and step-specific mapping rules.
- Methods for replacing, mapping and replacing, and mapping and joining data based on dimension values.
- Ability to redimension data by adding, removing, or renaming columns.
- Transform values using custom functions.
- Performance-Oriented:
- Asynchronous data writing for improved performance on large datasets.
- Asynchronous execution of multiple data copy operations in parallel.
- Detailed performance metrics logging to identify bottlenecks.
- Robust and Debuggable:
- Comprehensive logging framework with configurable verbosity.
- Option to log intermediate DataFrames to CSV files for detailed debugging of transformation steps.
- Validation utilities to ensure data integrity.
- Extensible:
- Use custom functions for data extraction, transformation, and loading to fit your specific needs.
How to Use
This toolkit is designed to be used as a library within your Python scripts for TM1 automation. The core functionality is exposed through a set of high-level functions in the bedrock.py module.
Core Functions
data_copy_intercube: Copies data from a source TM1 cube to a target TM1 cube.data_copy: Copies and transforms data within the same TM1 cube.load_sql_data_to_tm1_cube: Loads data from a SQL query or table into a TM1 cube.load_tm1_cube_to_sql_table: Exports data from a TM1 cube to a SQL table.load_csv_data_to_tm1_cube: Loads data from a CSV file into a TM1 cube.load_tm1_cube_to_csv_file: Exports data from a TM1 cube to a CSV file.async_executor_tm1: Asynchronously executes multipledata_copy,data_copy_intercube,load_tm1_cube_to_sql_tableorload_csv_data_to_tm1_cubeoperations.async_executor_tm1_to_sql: Asynchronously executes multipleload_tm1_cube_to_sql_tableoperations.async_executor_csv_to_tm1: Asynchronously executes multipleload_csv_data_to_tm1_cubeoperations.
Create a connection in TM1 with at least the following parameters set:
- address
- user
- password
- port
- ssl
You can check your connection via running example/check_connectivity.py. You can configure your connection in examples/config_example.ini or if left empty, via user input from the terminal.
Example: Copying Data Between Cubes with Transformations
Here is an example of how to use data_copy_intercube to copy data from a source cube to a target cube while applying some transformations. For further examples, please refer to tests/test_integration.py and tests/test_benchmark.py.
from TM1py import TM1Service
from TM1_bedrock_py.bedrock import data_copy_intercube
# --- 1. Connect to TM1 ---
# Establish your TM1Service connection
tm1 = TM1Service(address='localhost', port=8001, user='admin', password='apple', ssl=False)
# --- 2. Define the Data Copy Parameters ---
# MDX to select the source data
source_mdx = """
SELECT
{ [Version].[Actual] } ON COLUMNS,
{ [Year]. } ON ROWS
FROM [Sales]
WHERE ([Measures].[Amount])
"""
# Define mapping and transformation steps
mapping_steps = [
{
"method": "replace",
"mapping": {
"Region": {"Old Region": "New Region"}
}
},
{
"method": "map_and_replace",
"mapping_mdx": "SELECT {[Product].[All Products].Children} ON COLUMNS FROM [Product]",
"mapping_dimensions": {
"Product": "Product"
}
}
]
# --- 3. Execute the Data Copy ---
data_copy_intercube(tm1_service=tm1, target_cube_name="Sales Target", data_mdx=source_mdx, mapping_steps=mapping_steps,
clear_target=True, logging_level="INFO")
Configuration of Mapping Steps
The mapping_steps parameter is a powerful feature that allows you to define a pipeline of transformations. Each step is a dictionary that specifies the transformation method and its parameters.
replace: Performs a simple find-and-replace on dimension elements.map_and_replace: Uses a mapping DataFrame (from an MDX query, SQL, or CSV) to replace values in specified dimensions.map_and_join: Joins columns from a mapping DataFrame to the main data based on shared dimensions.
Optimizing Performance
This toolkit provides several ways to optimize the performance of your data operations:
- Asynchronous Operations: For large data writes to TM1, set the
async_write=Trueparameter in the core functions. This will write data in parallel, significantly reducing the overall execution time. - Asynchronous Executors: When you need to run multiple independent data copy processes, use the
async_executor_functions. These functions will execute the operations concurrently, making efficient use of available resources.
import asyncio
from TM1py import TM1Service
from TM1_bedrock_py.bedrock import async_executor_tm1, data_copy_intercube
tm1 = TM1Service(address='localhost', port=8001, user='admin', password='apple', ssl=False)
asyncio.run(async_executor_tm1(
data_copy_function=data_copy_intercube,
tm1_service=tm1,
data_mdx_template=data_mdx_template,
skip_zeros=True,
skip_consolidated_cells=True,
target_cube_name=target_cube_name,
shared_mapping=shared_mapping,
mapping_steps=mapping_steps,
clear_target=True,
async_write=True,
logging_level="DEBUG",
use_blob=True,
param_set_mdx_list=param_set_mdx_list,
clear_param_templates=clear_param_templates,
ignore_missing_elements=True,
max_workers=8
))
IMPORTANT NOTES ON PERFORMANCE
- skip_zeros: When extracting data from TM1, setting
skip_zeros=Truecan reduce the amount of data transferred and processed, especially for sparse cubes. - use_blob: Default value is
use_blob=FalseasTrueneedsadministratorprivileges. Setting the value toTrueimproves performance significantly. - slice_size_of_dataframe: Default value is
slice_size_of_dataframe=5000. - Efficient MDX: The performance of the entire process is heavily dependent on the efficiency of your MDX queries. Ensure your MDX is optimized for the source cube structure.
Debugging Options
The toolkit includes a robust logging framework to help you debug your data integration processes.
- Logging Level: You can control the verbosity of the logs by setting the
logging_levelparameter in the core functions. The available levels are DEBUG, INFO, WARNING, ERROR, and CRITICAL. - DataFrame Logger: This is a powerful debugging feature that allows you to inspect the state of your DataFrame at various stages of the transformation pipeline. To enable it, set
df_verbose_logging=Truein the core functions. This will save a CSV file of the DataFrame at each significant step (e.g., after each mapping step) into a dataframe_logs directory. This is invaluable for understanding how your data is being transformed and for troubleshooting mapping issues. - Performance Metrics: The execution time of key functions is automatically logged. By setting the logging level to DEBUG, you can see the time taken for each major operation, which helps in identifying performance bottlenecks. The
logging.jsonfile can be configured to output these metrics to a separate file for analysis.
Requirements
- TM1py >=2.1, <3.0
- pandas >=2.3.3, <3.0.0
- json_logging >=1.3.0, <2.0.0
- sqlalchemy >=2.0.0, <3.0.0
- pyodbc >=5.2.0, <6.0.0
- pyyaml >=6.0, <7.0
- openpyxl >=3.1.0
Installation
Install without airflow
pip install tm1-bedrock-py
Install with airflow
As of now the airflow_executor sub-package of tm1-bedrock-py only supports apache-airflow for versions >=2.4.0, <=2.11.0, with Python versions 3.9 to 3.12.
pip install tm1-bedrock-py[airflow]
Additional options for
airflowdatabase connectors for Microsoft-MSSQL and Postgres
pip install tm1-bedrock-py[airflow, microsoft-mssql]
pip install tm1-bedrock-py[airflow, postgres]
Development
Windows
git clone https://github.com/KnowledgeSeed/tm1_bedrock_py.git
cd tm1_bedrock_py
python -m venv .env
.\.env\Scripts\activate
pip install -r requirements-dev.txt
python -m build
Linux/macOS
git clone https://github.com/KnowledgeSeed/tm1_bedrock_py.git
cd tm1_bedrock_py
virtualenv .env
source .env/bin/activate
pip install -r requirements-dev.txt
python -m build
Manual integration testing
Use tests_integration/docker-compose.yaml as a baseline, which spins up a TM1 provider and a base TM1 database and a PostgreSQL database to test against. Please note that tm1-docker image is properietary IBM product wrapped in Docker by Knowledgeseed and therefore it is only available internally for Knowledgeseed developers.
docker compose up -d tm1 test_db_postgres
To obtain a licensed IBM TM1 database for testing or production purpose, please see https://www.ibm.com/topics/tm1 for further details.
License
See LICENSE
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file tm1_bedrock_py-1.3.4.tar.gz.
File metadata
- Download URL: tm1_bedrock_py-1.3.4.tar.gz
- Upload date:
- Size: 109.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0ab355e1ceca9d8904d8320703abcff9b7713e61bb781c15d96173588ae6cfdc
|
|
| MD5 |
c8b98614f83844bb78d653252b08a1e7
|
|
| BLAKE2b-256 |
3cfd3fcf8a343714af02485f45472598a12daea40d91b4d05d02f604ae7a6089
|
File details
Details for the file tm1_bedrock_py-1.3.4-py3-none-any.whl.
File metadata
- Download URL: tm1_bedrock_py-1.3.4-py3-none-any.whl
- Upload date:
- Size: 102.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
395fc0f1e789ae3e409a94b7dc1dbeb5d320b5dbd8ce3af53bb8a1c110ccb033
|
|
| MD5 |
826db91d51f74c6e7ec907143c57552e
|
|
| BLAKE2b-256 |
b0dbe34072bbdce79c2bf9123d2477742a34f6672ce0fc7817ab9898bf94c600
|