Simple, intelligent imputation analysis for data science

FunPuter - Intelligent Imputation Analysis

Simple, fast, intelligent recommendations for handling missing data.

FunPuter analyzes your data and suggests the best imputation methods based on:

Missing data mechanisms (MCAR, MAR, MNAR detection)
Data types and statistical properties
Business rules and column dependencies
Adaptive thresholds based on your dataset characteristics
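
To make the MCAR/MAR distinction concrete, here is a minimal standard-library sketch. It is not FunPuter's detection logic, only an illustration of the idea: if rows where one column is missing differ noticeably on another observed column, the missingness probably depends on observed data (MAR-like) rather than being completely random (MCAR-like). The function name `missingness_gap` and the sample values are hypothetical.

```python
from statistics import mean

def missingness_gap(target, other):
    """Mean of `other` where `target` is missing, minus its mean where `target` is present."""
    missing = [o for t, o in zip(target, other) if t is None and o is not None]
    present = [o for t, o in zip(target, other) if t is not None and o is not None]
    if not missing or not present:
        return 0.0  # nothing to compare
    return mean(missing) - mean(present)

# Hypothetical data: income tends to be missing for older users
income = [50000, None, 52000, None, 61000, None]
age = [30, 62, 28, 70, 35, 66]

gap = missingness_gap(income, age)
print(f"Age gap between missing and observed income rows: {gap}")
```

A gap near zero is consistent with MCAR; a large gap suggests the missingness is related to `age`. A real analysis would use a statistical test rather than a raw difference.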

Quick Start

Installation

pip install funputer

Basic Usage

Python API (Recommended)

import funputer

# Analyze your dataset
suggestions = funputer.analyze_imputation_requirements(
    metadata_path="metadata.csv",
    data_path="data.csv"
)

# Use the suggestions
for suggestion in suggestions:
    print(f"{suggestion.column_name}: {suggestion.proposed_method}")
    print(f"  Rationale: {suggestion.rationale}")
    print(f"  Confidence: {suggestion.confidence_score:.3f}")

Command Line

# Analyze and save results
funputer -m metadata.csv -d data.csv -o suggestions.csv

# View results
funputer -m metadata.csv -d data.csv --verbose

Metadata Format

Create a CSV with your column information:

column_name,data_type,min_value,max_value,unique_flag,dependent_column,business_rule,description
user_id,integer,1,999999,TRUE,,,User identifier
age,integer,0,120,FALSE,,Must be positive,User age
income,float,0,,FALSE,age,Higher with age,Annual income
category,categorical,,,FALSE,,,User category A/B/C

Required columns:

column_name: Name of your data column
data_type: One of integer, float, string, categorical, datetime, boolean

Optional columns:

min_value, max_value: Valid ranges for numeric data
unique_flag: Set to TRUE for ID columns
dependent_column: Related column for dependency analysis
business_rule: Business constraints or relationships
description: Human-readable description
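
Before running an analysis, it can help to sanity-check the metadata file against the rules above. The following is a rough sketch using only the standard library; `check_metadata` is a hypothetical helper, not part of FunPuter, and it only checks the required columns and the list of allowed data types.

```python
import csv
import io

REQUIRED = {"column_name", "data_type"}
ALLOWED_TYPES = {"integer", "float", "string", "categorical", "datetime", "boolean"}

def check_metadata(csv_text):
    """Return a list of human-readable problems found in a metadata CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        return [f"missing required column(s): {sorted(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not row["column_name"]:
            problems.append(f"line {line_no}: empty column_name")
        if row["data_type"] not in ALLOWED_TYPES:
            problems.append(f"line {line_no}: unknown data_type {row['data_type']!r}")
    return problems

# Hypothetical metadata with one invalid data_type
sample = """column_name,data_type,min_value,max_value
age,integer,0,120
income,money,0,
"""
problems = check_metadata(sample)
for p in problems:
    print(p)
```

To check a file on disk, pass `open("metadata.csv").read()` to the same function.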

Client Application Integration

Direct DataFrame Analysis

import pandas as pd
import funputer
from funputer import ColumnMetadata

# Your data
data = pd.DataFrame({
    'age': [25, None, 35, 42, None],
    'income': [50000, 60000, None, 80000, 45000],
    'category': ['A', 'B', None, 'A', 'C']
})

# Define metadata programmatically
metadata = [
    ColumnMetadata('age', 'integer', min_value=0, max_value=120),
    ColumnMetadata('income', 'float', dependent_column='age', business_rule='Higher with age'),
    ColumnMetadata('category', 'categorical')
]

# Get suggestions
suggestions = funputer.analyze_dataframe(data, metadata)

# Apply suggestions (Phase 2 - your implementation)
for s in suggestions:
    # Assign the result back: chained fillna(..., inplace=True) is unreliable in recent pandas
    if s.proposed_method == "Median":
        data[s.column_name] = data[s.column_name].fillna(data[s.column_name].median())
    elif s.proposed_method == "Mode":
        data[s.column_name] = data[s.column_name].fillna(data[s.column_name].mode().iloc[0])
    # ... implement other methods as needed

Configuration

from funputer import AnalysisConfig

# Custom analysis settings
config = AnalysisConfig(
    iqr_multiplier=2.0,           # Outlier detection sensitivity
    correlation_threshold=0.4,    # Relationship detection threshold
    skewness_threshold=1.5        # Mean vs median decision point
)

suggestions = funputer.analyze_imputation_requirements(
    "metadata.csv", "data.csv", config=config
)
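
For intuition about what these settings control, here is a standard-library sketch of the conventional meaning of an IQR multiplier and a skewness threshold. How FunPuter applies them internally is not shown in this README, so treat the functions `iqr_outliers` and `pick_center` as hypothetical illustrations rather than the library's algorithm.

```python
from statistics import mean, pstdev, quantiles

def iqr_outliers(values, multiplier=1.5):
    """Values outside [Q1 - m*IQR, Q3 + m*IQR]; a larger multiplier flags fewer points."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - multiplier * spread, q3 + multiplier * spread
    return [v for v in values if v < low or v > high]

def pick_center(values, skewness_threshold=1.5):
    """Suggest 'Median' for strongly skewed data, 'Mean' otherwise."""
    m, s = mean(values), pstdev(values)
    # Population skewness: third central moment divided by the cubed standard deviation
    skew = 0.0 if s == 0 else sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)
    return "Median" if abs(skew) > skewness_threshold else "Mean"

values = [40, 42, 45, 47, 50, 52, 55, 58, 60, 84]
print(iqr_outliers(values, multiplier=1.5))  # 84 is flagged
print(iqr_outliers(values, multiplier=2.0))  # nothing is flagged

skewed = [40, 42, 45, 47, 50, 52, 55, 58, 60, 400]
print(pick_center(skewed))
```

Raising `iqr_multiplier` makes outlier detection less sensitive, and raising `skewness_threshold` makes the mean the preferred choice for a wider range of distributions.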

What You Get

Each suggestion includes:

suggestion.column_name         # 'age'
suggestion.proposed_method     # 'Median'
suggestion.rationale           # 'Numeric data with MCAR mechanism...'
suggestion.confidence_score    # 0.847
suggestion.missing_count       # 15
suggestion.missing_percentage  # 0.075 (7.5%)

Available Methods:

Mean, Median, Mode - Statistical imputation
Regression, kNN - Predictive imputation
Business Rule - Domain-specific logic
Forward Fill, Backward Fill - Temporal imputation
Manual Backfill - Requires human intervention
No action needed - No missing values
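
Since applying a suggestion is left to your own code, one option is a small dispatch table keyed by method name. The sketch below uses plain Python lists (with `None` for missing values) so it runs without pandas. The names `FILLERS` and `apply_method` are hypothetical, and it assumes `proposed_method` uses the exact strings listed above. Methods that need a model or a human (Regression, kNN, Business Rule, Manual Backfill) are left untouched.

```python
from statistics import mean, median, mode

def forward_fill(values):
    """Replace each missing value with the last observed value before it."""
    filled, last = [], None
    for v in values:
        last = v if v is not None else last
        filled.append(last)
    return filled

def fill_constant(stat):
    """Build a filler that replaces missing values with stat(observed values)."""
    def apply(values):
        fill = stat([v for v in values if v is not None])
        return [fill if v is None else v for v in values]
    return apply

FILLERS = {
    "Mean": fill_constant(mean),
    "Median": fill_constant(median),
    "Mode": fill_constant(mode),
    "Forward Fill": forward_fill,
    # Backward fill is a forward fill over the reversed sequence
    "Backward Fill": lambda values: forward_fill(values[::-1])[::-1],
}

def apply_method(method, values):
    """Apply a simple fill method; return the values unchanged if the method is not handled."""
    filler = FILLERS.get(method)
    return values if filler is None else filler(values)

print(apply_method("Median", [25, None, 35, 42, None]))
print(apply_method("Mode", ["A", "B", None, "A", "C"]))
```

Leading gaps stay missing under a forward fill (and trailing gaps under a backward fill), because there is no earlier or later observation to copy.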

Key Features

✅ Intelligent Analysis - Detects missing data mechanisms automatically
✅ Business Rule Integration - Uses your domain knowledge
✅ Adaptive Thresholds - Adjusts based on your data characteristics
✅ High Performance - Analyzes 100+ columns in seconds
✅ Simple API - Easy integration with existing workflows
✅ Type Safe - Full type hints and validation

Real-World Example

# Your existing data pipeline
import pandas as pd
import funputer
from funputer import ColumnMetadata

def process_customer_data(df):
    # 1. Define your metadata once
    metadata = [
        ColumnMetadata('customer_id', 'integer', unique_flag=True),
        ColumnMetadata('age', 'integer', min_value=0, max_value=120),
        ColumnMetadata('income', 'float', dependent_column='age'),
        ColumnMetadata('segment', 'categorical'),
    ]
    
    # 2. Get intelligent suggestions
    suggestions = funputer.analyze_dataframe(df, metadata)
    
    # 3. Apply high-confidence suggestions automatically
    for s in suggestions:
        if s.confidence_score > 0.8:
            if s.proposed_method == "Median":
                df[s.column_name] = df[s.column_name].fillna(df[s.column_name].median())
            elif s.proposed_method == "Mode":
                df[s.column_name] = df[s.column_name].fillna(df[s.column_name].mode().iloc[0])
        else:
            print(f"Manual review needed for {s.column_name}: {s.rationale}")
    
    return df
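
One gap in a pipeline like the one above: a high-confidence suggestion whose method is not handled (for example Regression) is neither applied nor reported. A possible way to avoid that is to split suggestions into an auto-apply list and a review queue first. This sketch uses a hypothetical `Suggestion` dataclass as a stand-in for FunPuter's suggestion objects, with only the attributes it needs; `triage` is likewise a hypothetical helper.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:  # hypothetical stand-in, not FunPuter's own class
    column_name: str
    proposed_method: str
    confidence_score: float
    missing_percentage: float

def triage(suggestions, threshold=0.8, auto_methods=("Median", "Mode")):
    """Split suggestions into those safe to apply automatically and those needing review."""
    auto, review = [], []
    for s in suggestions:
        ok = s.confidence_score > threshold and s.proposed_method in auto_methods
        (auto if ok else review).append(s)
    # Review the columns with the most missing data first
    review.sort(key=lambda s: s.missing_percentage, reverse=True)
    return auto, review

items = [
    Suggestion("age", "Median", 0.85, 0.07),
    Suggestion("income", "Regression", 0.90, 0.20),
    Suggestion("segment", "Mode", 0.60, 0.30),
]
auto, review = triage(items)
print("Auto-apply:", [s.column_name for s in auto])
print("Review:", [s.column_name for s in review])
```

With this split, every suggestion ends up in exactly one of the two lists, so nothing is dropped silently.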

Distribution

PyPI Package: pip install funputer
Source Code: Available on GitHub
Requirements: Python 3.9+, pandas, numpy, scipy

License

MIT License - Use freely in commercial and open-source projects.

Focus: Get intelligent imputation recommendations, not complex infrastructure.
Philosophy: Simple tools that scale with your needs.

Release history

1.7.1 - Dec 16, 2025
1.7.0 - Nov 4, 2025
1.6.0 - Nov 3, 2025
1.5.2 - Aug 19, 2025
1.5.1 - Aug 13, 2025
1.4.0 - Aug 9, 2025
1.3.7 - Aug 9, 2025
1.3.6 - Aug 9, 2025
1.3.5 - Aug 8, 2025
1.3.4 - Aug 8, 2025
1.3.3 - Aug 7, 2025
1.3.2 - Aug 7, 2025
1.3.1 - Aug 6, 2025
1.2.1 - Aug 6, 2025
1.1.0 - Aug 5, 2025
1.0.4 - Aug 5, 2025
1.0.3 - Aug 1, 2025 (this version)

Download files

Source Distribution

funputer-1.0.3.tar.gz (48.6 kB) - uploaded Aug 1, 2025, Source

Built Distribution

funputer-1.0.3-py3-none-any.whl (44.3 kB) - uploaded Aug 1, 2025, Python 3

File details

Details for the file funputer-1.0.3.tar.gz.

File metadata

Download URL: funputer-1.0.3.tar.gz
Upload date: Aug 1, 2025
Size: 48.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for funputer-1.0.3.tar.gz
Algorithm	Hash digest
SHA256	`7d64af8433e88df610d989880882593743dcad63cea62f6a35a7bc697514661f`
MD5	`2995ba5af477707eefe6fb8f94512063`
BLAKE2b-256	`23e8a52814e1b6c025f129e3485d9c01c3697c4d8955069df547cfb3fdc34fec`

File details

Details for the file funputer-1.0.3-py3-none-any.whl.

File metadata

Download URL: funputer-1.0.3-py3-none-any.whl
Upload date: Aug 1, 2025
Size: 44.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for funputer-1.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4925a094677ac955a2eb0375a7a431b67db37ce5e109ce6862cbac504bd6c08a`
MD5	`7f19cacd7d5e68418041e301fdee1827`
BLAKE2b-256	`7d8148365f389cd62e8e99589fb17efaaf4d1a8562ddf7ff790c8ede36a0e10d`
