Phil: multiverse imputation powered by topology-based representation learning.
Phil
Phil is a representation-guided imputation library for missing tabular data.
It generates multiple imputations using a configurable strategy grid, computes Euler Characteristic Transform (ECT) descriptors over each imputed dataset, and selects the most representative imputation from the candidate set.
Installation
pip install philler
The distribution is published as philler; the import package is phil.
Phil requires the trailed backend for ECT computation. Install it from the
KRV research index or provide a compatible local build.
What Phil Does
- Impute: runs a grid of imputation strategies (sklearn estimators or custom) over the input dataframe, producing a set of candidate datasets
- Describe: computes an ECT descriptor for each candidate via the trailed backend
- Select: picks the candidate closest to the mean descriptor (the most representative imputation)
- Transform: exposes the fitted pipeline for inference on new data
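The Select step above can be sketched in a few lines of NumPy. This is an illustration only, not Phil's internal code: the function name `select_representative` is hypothetical, and Euclidean distance between flattened descriptors is an assumption, since the distance Phil actually uses is not stated here.

```python
import numpy as np


def select_representative(descriptors):
    """Return the index of the descriptor closest to the mean descriptor.

    Hypothetical helper: assumes Euclidean distance on flattened descriptors.
    """
    # Flatten each descriptor and stack them into one (n_candidates, d) array.
    stacked = np.stack([np.asarray(d, dtype=float).ravel() for d in descriptors])
    mean_descriptor = stacked.mean(axis=0)
    # Distance of every candidate descriptor to the mean descriptor.
    distances = np.linalg.norm(stacked - mean_descriptor, axis=1)
    return int(np.argmin(distances))


# Three toy descriptors (made-up values); the middle one coincides with the mean.
toy = [np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([2.0, 2.0])]
best = select_representative(toy)
```

Here `best` indexes the candidate whose descriptor lies nearest the average, which is the sense in which an imputation is "most representative" of the candidate set.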
Quick Start
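import pandas as pd
from phil import Phil

# Load a table that contains missing values
df = pd.read_csv("data_with_missing.csv")

# Fit Phil and get back the selected (most representative) imputation
phil = Phil(samples=30, random_state=42)
imputed_df = phil.fit(df)

# Apply the same fitted pipeline to new data
# (new_data is a DataFrame with the same columns as df)
new_df = phil.transform(new_data)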
scikit-learn Pipeline Integration
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from phil import PhilTransformer

pipe = Pipeline([
    ("imputer", PhilTransformer(samples=20, random_state=0)),
    ("model", RandomForestClassifier()),
])
pipe.fit(X_train, y_train)
Configuration
Imputation grids
Phil ships with named grids accessible via GridGallery:
| Name | Methods |
|---|---|
| default | BayesianRidge, DecisionTree, RandomForest, GradientBoosting |
| sampling | DistributionImputer (empirical sampling) |
| finance | IterativeImputer, KNNImputer, SimpleImputer |
| healthcare | KNNImputer, SimpleImputer, IterativeImputer |
| marketing | SimpleImputer, KNNImputer, IterativeImputer |
| engineering | SimpleImputer, KNNImputer, IterativeImputer |
Pass a grid name or an ImputationConfig directly:
from phil import Phil, ImputationConfig
from sklearn.model_selection import ParameterGrid

config = ImputationConfig(
    methods=["KNNImputer"],
    modules=["sklearn.impute"],
    grids=[ParameterGrid({"n_neighbors": [3, 5, 7]})],
)
phil = Phil(param_grid=config)
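To make the idea of a grid concrete, the sketch below expands a `ParameterGrid` into one fitted `KNNImputer` per parameter combination, using scikit-learn alone. It only illustrates how a grid yields several candidate datasets; it is not how Phil builds its candidates internally. The toy matrix is made up, and the neighbor counts are reduced to `[1, 2, 3]` to suit its four rows.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import ParameterGrid

# Toy matrix with missing entries (made-up data for illustration).
X = np.array([
    [1.0, 2.0, np.nan],
    [3.0, np.nan, 6.0],
    [5.0, 6.0, 9.0],
    [7.0, 8.0, 12.0],
])

grid = ParameterGrid({"n_neighbors": [1, 2, 3]})

# One candidate dataset per parameter combination in the grid.
candidates = [KNNImputer(**params).fit_transform(X) for params in grid]
```

Each entry of `candidates` is a complete copy of `X` in which only the missing cells differ between candidates, which is the variation a descriptor-based selection step can then compare.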
ECT descriptor
ECT is configured via ECTConfig:
from phil import Phil, ECTConfig

ect_config = ECTConfig(
    num_thetas=64,
    radius=1.0,
    resolution=100,
    scale=500,
    normalize=True,
    seed=42,
)
phil = Phil(config=ect_config)
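For intuition about what these parameters might control, here is a rough NumPy sketch of a smoothed Euler Characteristic Transform for a bare point cloud. It is not the trailed backend and makes several assumptions: `num_thetas` is read as the number of random unit directions, `radius` as the range of height thresholds, `resolution` as the number of thresholds, `scale` as the sharpness of a sigmoid that approximates the step function, and `seed` as the direction seed. The `normalize` option is not modeled, and the function name `ect_point_cloud` is hypothetical.

```python
import numpy as np


def ect_point_cloud(points, num_thetas=8, radius=1.0, resolution=16,
                    scale=500.0, seed=42):
    """Smoothed ECT of a point cloud (vertices only).

    Returns an array of shape (resolution, num_thetas). With no edges or
    faces, the Euler characteristic of a sublevel set is just the number
    of points whose height along the direction is below the threshold.
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Random unit directions, one per column (assumed sampling scheme).
    directions = rng.normal(size=(points.shape[1], num_thetas))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    heights = points @ directions                          # (n_points, num_thetas)
    thresholds = np.linspace(-radius, radius, resolution)  # (resolution,)
    # Sigmoid-smoothed indicator of "height <= threshold", written with tanh
    # so that large values of scale do not overflow.
    z = scale * (thresholds[:, None, None] - heights[None, :, :])
    indicator = 0.5 * (1.0 + np.tanh(0.5 * z))
    return indicator.sum(axis=1)


# Four made-up points inside the unit ball.
pts = np.array([
    [0.5, 0.0, 0.0],
    [0.0, -0.5, 0.0],
    [0.0, 0.0, 0.3],
    [-0.2, 0.2, 0.1],
])
ect = ect_point_cloud(pts)
```

Each column of `ect` is a curve that climbs from 0 to the number of points as the threshold sweeps from `-radius` to `radius`; a larger `scale` makes the steps in that curve sharper.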
Development
uv sync --all-extras
uv run pytest -v
uv run black phil/ tests/
Documentation
Project documentation lives under docs/source with unified API and guide pages.
Build locally with uv run sphinx-build -M html docs/source docs/build.