Performs iterative proportional fitting on tabular data
IPFpy
Iterative proportional fitting that can work with larger-than-memory tables.
Input tables can be pandas DataFrames, .csv files, or .parquet files.
input: table
This table lists all the cells or units whose values will be adjusted by iterative proportional fitting, along with the bounds within which each adjusted value must stay.
unit_id : identifier for the decision variables
weight : decision variables. >=0
lb : weight >= lb
ub : weight <= ub
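As an illustration, an input table with the columns described above could be built as a pandas DataFrame. The values below are made up for the example; only the column names come from the description, and the two checks are just a reasonable sanity test, not something the package is known to require.

```python
import pandas as pd

# Hypothetical 2x2 table flattened to four units; values are illustrative only.
input_table = pd.DataFrame({
    "unit_id": [0, 1, 2, 3],
    "weight": [10.0, 20.0, 30.0, 40.0],   # starting values, must be >= 0
    "lb": [0.0, 0.0, 0.0, 0.0],           # lower bound on the adjusted weight
    "ub": [100.0, 100.0, 100.0, 100.0],   # upper bound on the adjusted weight
})

# Sanity checks that mirror the column definitions above.
weights_nonnegative = bool((input_table["weight"] >= 0).all())
bounds_consistent = bool((input_table["lb"] <= input_table["ub"]).all())
```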
constraints : table
This table maps each constraint identifier to the unit_id values that are aggregated into it.
unit_id : identifier for the decision variables
cons_id : identifier for each margin
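To make the mapping concrete, here is a small sketch of a constraints table for a hypothetical 2x2 table with row and column margins. The string identifiers such as "row0" are an assumption made for readability; the description does not say which identifier types are accepted.

```python
import pandas as pd

# Hypothetical 2x2 table: units 0 and 1 form row 0, units 2 and 3 form row 1.
# Units 0 and 2 form column 0, units 1 and 3 form column 1.
constraints = pd.DataFrame({
    "unit_id": [0, 1, 2, 3, 0, 2, 1, 3],
    "cons_id": ["row0", "row0", "row1", "row1", "col0", "col0", "col1", "col1"],
})

# Each unit appears once per margin it contributes to (one row and one column here).
units_per_margin = constraints.groupby("cons_id")["unit_id"].apply(list).to_dict()
```

Each unit appears in two rows of the table because it contributes to two margins.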
targets : table
This table lists all the target values that the margins should add up to once adjusted
cons_id : identifier for each margin
cons_type : whether the margin must be greater than or equal to (ge), less than or equal to (le), or equal to (eq) the target
target : value for the constraint
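The three constraint types can be read as simple comparisons between an aggregated margin and its target. The sketch below shows a hypothetical targets table and a helper, `is_satisfied`, that is not part of IPFpy; its `tol` argument is an assumption added to absorb floating-point error.

```python
import pandas as pd

# Hypothetical targets for four margins; values are illustrative only.
targets = pd.DataFrame({
    "cons_id": ["row0", "row1", "col0", "col1"],
    "cons_type": ["eq", "eq", "le", "ge"],
    "target": [30.0, 70.0, 45.0, 50.0],
})

def is_satisfied(total, cons_type, target, tol=1e-9):
    """Return True when an aggregated margin meets its target."""
    if cons_type == "eq":
        return abs(total - target) <= tol
    if cons_type == "ge":
        return total >= target - tol
    if cons_type == "le":
        return total <= target + tol
    raise ValueError(f"unknown cons_type: {cons_type!r}")
```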
unit_id : name of the column that identifies each value to be adjusted (default "unit_id")
var : name of the column that contains the value to be adjusted (default "weight")
cons_id : name of the column that identifies each constraint (default "cons_id")
db_file (optional): name of the database file on disk that holds the temporary tables. By default, the tables are kept in memory
out_parquet (optional): path of the parquet output file
out_csv (optional): path of the csv output file
silent (optional, default False): controls whether progress is printed to the screen
output : table
The output table lists all the initial cells/units along with their adjusted values.
unit_id : identifier for the decision variables
weight : adjusted weight. Will fit in the interval lb <= weight <= ub
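To show how the three tables fit together, here is a toy in-memory version of the iteration for equality targets only. It is a sketch of the general technique, not IPFpy's implementation: the function name `ipf_sketch`, the clipping to `lb`/`ub` after each scaling step, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def ipf_sketch(units, constraints, targets, tol=1e-6, max_iter=1000):
    """Toy IPF for 'eq' targets on small tables; not the package's algorithm."""
    by_unit = units.set_index("unit_id")
    weights = by_unit["weight"].astype(float).copy()
    lb, ub = by_unit["lb"], by_unit["ub"]
    target_by_cons = targets.set_index("cons_id")["target"]
    members = {c: g["unit_id"].to_numpy() for c, g in constraints.groupby("cons_id")}

    for _ in range(max_iter):
        worst = 0.0
        for cons_id, ids in members.items():
            total = weights.loc[ids].sum()
            target = target_by_cons.loc[cons_id]
            worst = max(worst, abs(total - target))
            if total > 0:
                # Scale all units of this margin by one factor, then clip to bounds.
                scaled = weights.loc[ids].to_numpy() * (target / total)
                weights.loc[ids] = np.clip(scaled, lb.loc[ids].to_numpy(), ub.loc[ids].to_numpy())
        if worst <= tol:  # every margin was already close to its target
            break
    return weights.rename("weight").reset_index()

# Hypothetical 2x2 example with row and column margins.
units = pd.DataFrame({"unit_id": [0, 1, 2, 3],
                      "weight": [10.0, 20.0, 30.0, 40.0],
                      "lb": 0.0, "ub": 100.0})
constraints = pd.DataFrame({
    "unit_id": [0, 1, 2, 3, 0, 2, 1, 3],
    "cons_id": ["row0", "row0", "row1", "row1", "col0", "col0", "col1", "col1"]})
targets = pd.DataFrame({"cons_id": ["row0", "row1", "col0", "col1"],
                        "target": [40.0, 60.0, 45.0, 55.0]})
fitted = ipf_sketch(units, constraints, targets)
```

The result has one row per unit_id with its adjusted weight, which matches the shape of the output table described above.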
Example
from IPFpy import ipf, generate_random_table, aggregate_table
import numpy as np
# test IPF
# step1 - create a table, then generate the margins and the table that maps the cells of the inner table to the margins
raw_table = generate_random_table(4,8,scale=2)
input_table, margins, constraints = aggregate_table(raw_table, by=[0,1,2,3], var="value")
margins = margins.rename(columns={"value":"target"}) #rename margin column
# step2 - modify the margins by adding noise to the inner cells
new_table = input_table.copy().drop("unit_id",axis=1)
new_table["value"] = input_table["value"] * np.random.uniform(0, 2, input_table.shape[0])
modified_table, modified_margins, constraints = aggregate_table(new_table, by=[0,1,2,3], var="value")
modified_margins = modified_margins.rename(columns={"value":"target"})
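# write the tables as csv files
input_table.to_csv('input_table.csv', index=False)
constraints.to_csv('constraints.csv', index=False)
modified_margins.to_csv('modified_margins.csv', index=False)
# optionally write the input table as parquet as well (requires pyarrow);
# column names are converted to strings because some pandas versions reject non-string names
input_table.rename(columns=str).to_parquet('my_data.parquet', engine='pyarrow')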
# adjust the table from step1 to the margins obtained in step2
adjusted_table = ipf(input=input_table,
                     constraints=constraints,
                     targets=modified_margins,
                     unit_id="unit_id",
                     var="value",
                     cons_id="cons_id",
                     db_file=None,
                     tol=0.1,
                     maxIter=1000)
# output to a file
ipf(input=input_table,
    constraints=constraints,
    targets=modified_margins,
    unit_id="unit_id",
    var="value",
    cons_id="cons_id",
    tol=0.1,
    maxIter=1000,
    out_csv="adjusted_table.csv",
    silent=True)
# input directly from files
# adjust the paths below so that they point to the location of your input files
ipf(input="/home/Desktop/Programming/IPF/IPF/input_table.csv",
    constraints="/home/Desktop/Programming/IPF/IPF/constraints.csv",
    targets="/home/Desktop/Programming/IPF/IPF/modified_margins.csv",
    unit_id="unit_id",
    var="value",
    cons_id="cons_id",
    tol=0.1,
    maxIter=1000,
    out_csv="adjusted_table.csv",
    silent=True)
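After a run, it can be useful to check how close the adjusted margins are to their targets. The helper below is a hypothetical sketch that uses only pandas; `margin_report` is not part of IPFpy, and the `var` argument is there because this page does not state whether the adjusted column in the output is named "weight" or follows the `var` argument passed to `ipf`.

```python
import pandas as pd

def margin_report(adjusted, constraints, targets, var="weight"):
    """Aggregate adjusted values per cons_id and compare them with the targets."""
    merged = constraints.merge(adjusted[["unit_id", var]], on="unit_id", how="left")
    totals = (merged.groupby("cons_id", as_index=False)[var].sum()
                    .rename(columns={var: "adjusted_total"}))
    report = targets.merge(totals, on="cons_id", how="left")
    # Positive gap: the adjusted margin exceeds its target.
    report["gap"] = report["adjusted_total"] - report["target"]
    return report
```

With the tables from the example, a call might look like `margin_report(adjusted_table, constraints, modified_margins, var="value")`, after checking `adjusted_table.columns` for the actual column name.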