TODO SHORT DESCRIPTION
Project description
README (WORK IN PROGRESS)
-
Gereneralized approach to data aggregation and decomposition
-
all variables must have same dimensions
-
variables can be either extensive or intensive
-
all dimensions can have (multiple) hierarchies
-
hierarchies all start at a single scalar root
-
example for alternative hierarchies for a time dimension
graph TD;
* --- Jan;
* --- Feb;
* --- *...[...];
* --- Dec;
Jan --- 01-01;
Jan --- 01-02;
Jan --- 01...[...];
Jan --- 01-31;
01-31 --- 00:00;
01-31 --- 01:00;
01-31 --- 01-31...[...];
01-31 --- 23:00;
graph TD;
* --- 1;
* --- 2;
* --- *...[...];
* --- 8760;
- Disaggregation maps data from one level in a hierarchy to the next one further down, aggregation to the next one up conversely, using level specific factors, which are usually all 1.0 or sum up to 1.0 for each item
Examples
Disaggregation (extensive) from A to B
Example: Distribute population by area (factors are shares of sub-areas to total area)
graph TD;
* --- A1;
* --- A2;
A1 -- 0.5 --> B1;
A1 -- 0.5 --> B2;
A2 -- 0.4 --> B3;
A2 -- 0.4 --> B4;
A2 -- 0.2 --> B5;
Aggregation (extensive) from B to A
Example: Aggregate population (simply add up)
flowchart TD;
B1 -- 1 --> A1;
B2 -- 1 --> A1;
B3 -- 1 --> A2;
B4 -- 1 --> A2;
B5 -- 1 --> A2;
A1 --- *
A2 --- *
Disaggregation (intensive) from A to B
Example: Solar radiation
graph TD;
* --- A1;
* --- A2;
A1 -- 1 --> B1;
A1 -- 1 --> B2;
A2 -- 1 --> B3;
A2 -- 1 --> B4;
A2 -- 1 --> B5;
Aggregation (intensive) from B to A
Example: Solar radiation (weighted average by area)
flowchart TD;
B1 -- 0.5 --> A1;
B2 -- 0.5 --> A1;
B3 -- 0.4 --> A2;
B4 -- 0.4 --> A2;
B5 -- 0.2 --> A2;
A1 --- *
A2 --- *
Calculations
Calculations can only be performed on variables when all dimensions are at the same hierarchy level for all dimensions.
To achieve this, we disaggregate them all to the lowest level first.
At the end, we aggregate to the desired output level.
A change in level (aggregation or disaggregation) can be represented by a matrix multiplication
D := \text{Data Matrix} \\
T := \text{Transformation Matrix} \\
D' = D \cdot T \\
Note: the data matrix will have more than one dimension. In the examples, the first dimension represents all dimensions that are not part of the transformation.
Disaggregation (extensive) from A(n=2) to B(n=5)
D = \begin{pmatrix}
0 & 1 \\
2 & 3 \\
4 & 5 \\
\end{pmatrix} \\
T = \begin{pmatrix}
0.5 & 0.5 & \cdot & \cdot & \cdot \\
\cdot & \cdot & 0.4 & 0.4 & 0.2
\end{pmatrix}
D' = \begin{pmatrix}
0.0 & 0.0 & 0.4 & 0.4 & 0.2 \\
1.0 & 1.0 & 1.2 & 1.2 & 0.6 \\
2.0 & 2.0 & 2.0 & 2.0 & 1.0 \\
\end{pmatrix}
Aggregation (extensive) from B(n=5) to A(n=2)
D = \begin{pmatrix}
0.0 & 0.0 & 0.4 & 0.4 & 0.2 \\
1.0 & 1.0 & 1.2 & 1.2 & 0.6 \\
2.0 & 2.0 & 2.0 & 2.0 & 1.0 \\
\end{pmatrix}
T = \begin{pmatrix}
1 & \cdot \\
1 & \cdot \\
\cdot & 1 \\
\cdot & 1 \\
\cdot & 1
\end{pmatrix}
D' = \begin{pmatrix}
0 & 1 \\
2 & 3 \\
4 & 5 \\
\end{pmatrix} \\
Multiple transformations can be combined into a single transformation Matrix.
T = T_1 \cdot T_2
Implementation with numpy
>>> .dot()
array([[0. , 0. , 0.4, 0.4, 0.2],
[1. , 1. , 1.2, 1.2, 0.6],
[2. , 2. , 2. , 2. , 1. ]])
>>>
D = np.array([
[0, 1],
[2, 3],
[4, 5]]
)
T = np.array([
[0.5, 0.5, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.4, 0.4, 0.2]
])
D2 = D.dot(T)
# more generally, if there is more dimensions:
D2 = np.matmul(D.swapaxes(i_dim, n_dims-1), T).swapaxes(n_dims-1, i_dim)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Hashes for data-disaggregation-0.0.1.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | cfeb9cdee2c7b634b5ca54fa0aa5d8cad44b994a0d652471e0dd6ea246a998fe |
|
MD5 | 1cadb4c8064b814524669339599dae01 |
|
BLAKE2b-256 | 7d8bae20c8a4f81ddc116f0a11fb17480babca81206cca345bbfe41d86dc0310 |