Automatic hierarchical dataset interpolation and disaggregation
Data completion tool
A module that builds a dataset with a parent/children hierarchy (H-MECE) and runs a completion algorithm on it (parent/children disaggregation and time interpolation). It transforms a sparse dataset into a complete dataset ready to be used in a model.
Plot functions visualize the results of the completion algorithm and keep a trace of the aggregations and interpolations made, which makes the dataset easier to review.
📋 Requirements
- Python 3.12 or higher
We recommend using one virtual environment per Python project to manage dependencies and maintain isolation. You can use a package manager like uv to help you with library dependencies and virtual environments.
📦 Install the data-completion-tool Package
Install the data-completion-tool package via pip:
```bash
pip install data-completion-tool
```
⚙️ Complete a dataset
Here is an example of a variable completion using a specified hierarchy:
```python
import pandas as pd
from data_completion_tool import dct

# Create a DataSet instance
ds = dct.DataSet()

# Create a dimension dataframe to set the hierarchy
dimension = pd.DataFrame(
    {"name": ["location"], "value": ["france"], "parents_values": ["europe"]}
)
ds.set_dimension(dimension)

# Create a variable to complete
variable = pd.DataFrame(
    {
        "location": ["france", "france", "france", "europe", "europe", "europe"],
        "time": [1950, 1980, 2000, 1920, 1970, 2020],
        "value": [15, 40, 65, 10, 54, 76],
        "unit": ["random", "random", "random", "random", "random", "random"],
        "source_id": [1, 1, 1, 2, 2, 2],
    }
)

# Create aspect properties
aspect_property = {"location": ["intensive"]}

# Complete the dataset
completed_dataset = ds.completion(variable, aspect_property)
```
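As a rough picture of what time interpolation with a trace can look like, the sketch below linearly interpolates the `france` series from the example and labels each row as observed or interpolated. This is an assumption-laden illustration in plain Python: `interpolate_with_trace` and the `origin` field are hypothetical names, linear interpolation is assumed, and the values `ds.completion` actually returns may differ.

```python
def interpolate_with_trace(points: dict[int, float], years: list[int]) -> list[dict]:
    """Linearly interpolate `points` at `years`, recording where each value came from."""
    known = sorted(points)
    rows = []
    for year in years:
        if year in points:
            rows.append({"time": year, "value": float(points[year]), "origin": "observed"})
            continue
        if year < known[0] or year > known[-1]:
            # This sketch does not extrapolate outside the observed range.
            continue
        left = max(y for y in known if y < year)
        right = min(y for y in known if y > year)
        share = (year - left) / (right - left)
        value = points[left] + share * (points[right] - points[left])
        rows.append({"time": year, "value": value, "origin": "interpolated"})
    return rows


# The france observations from the example above.
france = {1950: 15, 1980: 40, 2000: 65}
rows = interpolate_with_trace(france, [1950, 1965, 1990, 2010])
for row in rows:
    print(row)
# {'time': 1950, 'value': 15.0, 'origin': 'observed'}
# {'time': 1965, 'value': 27.5, 'origin': 'interpolated'}
# {'time': 1990, 'value': 52.5, 'origin': 'interpolated'}
```

Keeping an `origin` label next to each value is one simple way to make later review possible, since a reader can tell filled-in values from source data at a glance.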
📊 Visualize Datasets
You can visualize the completed datasets using the plotting methods of the DataSet class:
```python
ds.plot_with_source(completed_dataset, "variable", "lower right", same_figure=True)
```
🤝 Contributing
We welcome contributions to the Data providing project! To get started, please refer to the CONTRIBUTING file for detailed guidelines.
File details
Details for the file data_completion_tool-0.2.0.tar.gz.
File metadata
- Download URL: data_completion_tool-0.2.0.tar.gz
- Upload date:
- Size: 82.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0a391b7032c0e5a2563d5a80d7cac41be043b530ff77296f330905c219a4dcb1 |
| MD5 | 12d05562a4bdc069af31058e98c1ab54 |
| BLAKE2b-256 | 14a39bbaffe73c02756446a4363b79f7fea477856c1196920ba2cb5a6c6d077e |
File details
Details for the file data_completion_tool-0.2.0-py3-none-any.whl.
File metadata
- Download URL: data_completion_tool-0.2.0-py3-none-any.whl
- Upload date:
- Size: 18.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 357ea6dd1964ea1e234c5d178e4a14561e7c36ea8759fe2372706bab79b0a8e5 |
| MD5 | 2ad1815e3d59a4fe0885e37b2c3cff7e |
| BLAKE2b-256 | b49197f7cd8e70570e98b1feaf0bb9fc32f066226a596e2c56728214bd833ea1 |