Phil: multiverse imputation powered by topology-based representation learning.

Phil

Phil is a representation-guided imputation library for missing tabular data.

It generates multiple imputations using a configurable strategy grid, computes Euler Characteristic Transform (ECT) descriptors over each imputed dataset, and selects the most representative imputation from the candidate set.

Installation

pip install philler

The distribution is published as philler; the import package is phil. Phil requires the trailed backend for ECT computation. Install trailed from the KRV research index, or provide a compatible local build.

What Phil Does

  1. Impute — runs a grid of imputation strategies (sklearn estimators or custom) over the input dataframe, producing a set of candidate datasets
  2. Describe — computes an ECT descriptor for each candidate via the trailed backend
  3. Select — picks the candidate closest to the mean descriptor (most representative imputation)
  4. Transform — exposes the fitted pipeline for inference on new data
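
The Select step can be illustrated with plain NumPy. This is a minimal sketch, not Phil's implementation: it assumes each candidate's descriptor has been flattened into a vector and that "closest" means Euclidean distance. The helper name select_representative is hypothetical.

```python
import numpy as np


def select_representative(descriptors: np.ndarray) -> int:
    """Return the index of the row closest to the mean row.

    `descriptors` has shape (n_candidates, n_features): one flattened
    descriptor per imputed candidate. Euclidean distance is an assumption.
    """
    mean_descriptor = descriptors.mean(axis=0)
    distances = np.linalg.norm(descriptors - mean_descriptor, axis=1)
    return int(np.argmin(distances))


# Three toy descriptors; the third is an outlier that pulls the mean upward.
toy = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]])
best = select_representative(toy)  # index of the selected candidate
```

Selecting the candidate nearest the mean keeps an actual imputed dataset rather than an average of datasets, so the result is always one of the candidates the grid produced.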

Quick Start

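import pandas as pd
from phil import Phil

# Load a table that contains missing values
df = pd.read_csv("data_with_missing.csv")

# Fit on the incomplete table and keep the selected imputation
phil = Phil(samples=30, random_state=42)
imputed_df = phil.fit(df)

# Apply the same fitted pipeline to new data
# (new_data: a DataFrame with the same columns as df)
new_df = phil.transform(new_data)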

scikit-learn Pipeline Integration

from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from phil import PhilTransformer

pipe = Pipeline([
    ("imputer", PhilTransformer(samples=20, random_state=0)),
    ("model", RandomForestClassifier()),
])
# X_train (features, may contain NaNs) and y_train (labels) are assumed to be defined
pipe.fit(X_train, y_train)

Configuration

Imputation grids

Phil ships with named grids accessible via GridGallery:

Name          Methods
default       BayesianRidge, DecisionTree, RandomForest, GradientBoosting
sampling      DistributionImputer (empirical sampling)
finance       IterativeImputer, KNNImputer, SimpleImputer
healthcare    KNNImputer, SimpleImputer, IterativeImputer
marketing     SimpleImputer, KNNImputer, IterativeImputer
engineering   SimpleImputer, KNNImputer, IterativeImputer

Pass a grid name or an ImputationConfig directly:

from phil import Phil, ImputationConfig
from sklearn.model_selection import ParameterGrid

config = ImputationConfig(
    methods=["KNNImputer"],
    modules=["sklearn.impute"],
    grids=[ParameterGrid({"n_neighbors": [3, 5, 7]})],
)
phil = Phil(param_grid=config)
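
To make the grid concrete, the sketch below uses only scikit-learn to show how a ParameterGrid like the one above expands into one imputed candidate per parameter combination. How Phil instantiates and runs the estimators internally is not shown here; the loop and the toy matrix X are illustrative assumptions.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.model_selection import ParameterGrid

# Toy matrix with missing entries marked as np.nan.
X = np.array([
    [1.0, 2.0],
    [np.nan, 3.0],
    [4.0, np.nan],
    [5.0, 6.0],
    [7.0, 8.0],
])

# Same grid as in the ImputationConfig example: three settings.
grid = ParameterGrid({"n_neighbors": [3, 5, 7]})

# One imputed copy of X per parameter combination.
candidates = [KNNImputer(**params).fit_transform(X) for params in grid]
```

Each entry in candidates is a complete array with the same shape as X, which is the kind of candidate set the Describe and Select steps operate on.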

ECT descriptor

ECT is configured via ECTConfig:

from phil import Phil, ECTConfig

ect_config = ECTConfig(
    num_thetas=64,
    radius=1.0,
    resolution=100,
    scale=500,
    normalize=True,
    seed=42,
)
phil = Phil(config=ect_config)
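
For intuition about what an ECT descriptor measures, here is a toy NumPy sketch for a point cloud. It is not the trailed backend. It assumes each row of the data is treated as a point, that directions are sampled at random on the unit sphere, and that thresholds span [-radius, radius]. The parameter names are borrowed from ECTConfig, but their exact meaning in trailed is an assumption; scale and normalize are left out because the text above does not define them.

```python
import numpy as np


def point_cloud_ect(points, num_thetas=8, radius=1.0, resolution=5, seed=0):
    """Toy Euler Characteristic Transform of a point cloud.

    For isolated points, the Euler characteristic of the sublevel set
    {x : <x, v> <= t} is simply the number of points it contains.
    Returns an integer array of shape (num_thetas, resolution).
    """
    rng = np.random.default_rng(seed)
    # Random unit directions (how directions are sampled is an assumption).
    directions = rng.normal(size=(num_thetas, points.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    thresholds = np.linspace(-radius, radius, resolution)
    heights = points @ directions.T  # shape (n_points, num_thetas)
    # Count the points at or below each threshold, per direction.
    inside = heights[:, :, None] <= thresholds[None, None, :]
    return inside.sum(axis=0)


# Three points, all within distance 0.5 of the origin.
points = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, -0.5]])
ect = point_cloud_ect(points)
```

Each row of ect is a non-decreasing curve that starts at 0 and ends at the number of points. Flattening the array gives a fixed-length vector that can be compared across imputed candidates.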

Development

uv sync --all-extras
uv run pytest -v
uv run black phil/ tests/

Documentation

Project documentation lives under docs/source, with unified API and guide pages. Build it locally with:

uv run sphinx-build -M html docs/source docs/build
