🔬 Professional data inspection and visualization toolkit for Polars - featuring advanced X-ray analysis, comprehensive plotting, and data quality assessment.

🔬 Polarscope

Professional data inspection and visualization toolkit for Polars 🐻‍❄️

Polarscope is a data inspection and visualization library built exclusively for Polars. It provides X-ray analysis (per-column statistics and data quality flags), visualizations on three plotting backends, and data quality assessment, all running on native Polars operations.


✨ Key Features

🔬 X-ray Analysis

  • xray(df) → Comprehensive data quality assessment with beautiful Great Tables output
  • Advanced statistics: skewness, kurtosis, outliers, normality tests, data quality flags
  • Performance metrics: execution timing and memory usage tracking
  • Customizable output: minimal or expanded views with professional formatting
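
To make the statistics listed above concrete, here is a minimal, dependency-free sketch of two of them: IQR-based outlier detection and skewness. The helper names `iqr_outliers` and `skewness` are hypothetical and are not part of Polarscope; the library's own implementation, quantile method, and defaults may differ.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # method="inclusive" treats the data as the full population (assumption).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Population skewness: third central moment divided by sigma cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5  # undefined (division by zero) for constant data


prices = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(prices))   # the single extreme value, 95
print(skewness(prices) > 0)   # a long right tail gives positive skew
```

A column whose outlier share or skew exceeds some limit is the kind of thing a data quality flag would report.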

📊 Multi-Backend Visualization

  • corr_heatmap(df) → Correlation analysis with advanced filtering and target analysis
  • dist_plot(df) → Distribution plotting with statistical overlays
  • missingval_plot(df) → Missing value pattern analysis
  • cat_plot(df) → Categorical data frequency analysis
  • corr_plot(df) → Interactive correlation exploration
  • 3 backends supported: Plotly (default), Seaborn, Altair

🧹 Data Processing

  • clean_column_names(df) → Normalize and deduplicate column names
  • data_cleaning(df) → Comprehensive automated data cleaning pipeline
  • convert_datatypes(df) → Intelligent dtype optimization
  • drop_missing(df) → Advanced missing value handling
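
As an illustration of what "normalize and deduplicate" can mean for column names, the sketch below converts names to snake_case and appends a counter to repeats. `normalize_names` is a hypothetical helper; the exact rules Polarscope's clean_column_names applies are not documented here and may differ.

```python
import re


def normalize_names(names):
    """Convert names to snake_case and make repeated names unique."""
    seen = {}
    cleaned = []
    for raw in names:
        # Collapse every run of non-alphanumeric characters into one underscore.
        name = re.sub(r"[^0-9a-zA-Z]+", "_", raw.strip()).strip("_").lower()
        name = name or "column"  # fallback for names with no usable characters
        count = seen.get(name, 0)
        seen[name] = count + 1
        # First occurrence keeps the plain name; repeats get a numeric suffix.
        # (A pre-existing "qty_1" column could still collide; not handled here.)
        cleaned.append(name if count == 0 else f"{name}_{count}")
    return cleaned


print(normalize_names(["Unit Price ($)", "unit_price", " Qty ", "qty"]))
# ['unit_price', 'unit_price_1', 'qty', 'qty_1']
```

With a Polars DataFrame, such a mapping could be applied with `df.rename(dict(zip(df.columns, normalize_names(df.columns))))`.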

🛠️ Utilities

  • save_fig() → Universal figure saving (PNG, HTML, SVG, etc.)
  • Native Polars performance - no Pandas dependency
  • Type-safe with comprehensive docstrings

🚀 Quick Start

Installation

# Basic installation
pip install polarscope

# With all optional dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "polarscope[all]"

# Individual extras
pip install "polarscope[plotly,seaborn,great_tables]"

Basic Usage

import polars as pl
import polarscope as plc

# Load your data
df = pl.read_csv("data.csv")

# 🔬 Get comprehensive X-ray analysis
plc.xray(df)

# 📊 Create beautiful visualizations
plc.corr_heatmap(df, backend="plotly")
plc.dist_plot(df, column="price", backend="seaborn")
plc.missingval_plot(df, backend="altair")

# 🧹 Clean and optimize your data
df_clean = plc.data_cleaning(df)
df_optimized = plc.convert_datatypes(df_clean)

Advanced X-ray Analysis

# Expanded analysis with custom settings
plc.xray(
    df,
    expanded=True,                # Show all statistics
    corr_target="target_column",  # Correlation to a specific column
    outlier_method="iqr",         # Outlier detection method
    decimals=3,                   # Formatting precision
    compact=True,                 # Compact number formatting
)

# Custom percentiles and quality thresholds
plc.xray(
    df,
    percentiles=[0.1, 0.25, 0.5, 0.75, 0.9],
    missing_threshold=0.3,           # Flag high missingness
    outlier_threshold=0.05,          # Flag outlier-heavy columns
    normality_test=True,             # Include normality tests
    uniformity_test=True             # Include uniformity tests
)
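
A threshold such as missing_threshold=0.3 is easiest to read as "flag any column where more than 30% of the values are missing". The sketch below shows that reading in plain Python; `flag_high_missing` is a hypothetical helper, and whether Polarscope compares with `>` or `>=` is an assumption.

```python
def flag_high_missing(columns, missing_threshold=0.3):
    """Return {column: missing_fraction} for columns above the threshold."""
    flagged = {}
    for name, values in columns.items():
        if not values:
            continue  # an empty column has no defined missing fraction
        fraction = sum(v is None for v in values) / len(values)
        if fraction > missing_threshold:  # strict comparison is an assumption
            flagged[name] = fraction
    return flagged


data = {
    "a": [1, None, None, 4],            # 50% missing
    "b": [1, 2, 3, None],               # 25% missing
    "c": [None, None, None, None],      # 100% missing
}
print(flag_high_missing(data))
# {'a': 0.5, 'c': 1.0}
```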

📈 Performance

Polarscope is built for speed and efficiency:

  • Fast: Native Polars operations throughout, with no conversion to Pandas
  • Memory efficient: Optimized for large datasets
  • Scalable: Designed for datasets with millions of rows
  • Tested: Comprehensive test suite with a 100% pass rate

🎯 Why Polarscope?

Polars-Native

  • Built exclusively for Polars - no Pandas dependencies
  • Native performance and memory efficiency
  • Type-safe operations with Polars' query engine

Professional Quality

  • Production-ready with comprehensive testing
  • Beautiful, customizable output with Great Tables
  • Scientific precision in statistical analysis

Multi-Backend Flexibility

  • Choose your preferred visualization backend
  • Consistent API across Plotly, Seaborn, and Altair
  • Easy switching between backends
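
One common way to offer a consistent API across several plotting backends is a small registry that maps the `backend` string to a drawing function. The sketch below shows that pattern only; it is not Polarscope's actual implementation, every name in it is hypothetical, and the stand-in functions return strings instead of real figures so the example runs without any plotting library.

```python
from typing import Callable, Dict, List

# Registry mapping a backend name to the function that draws with it.
_BACKENDS: Dict[str, Callable[[List[float]], str]] = {}


def register_backend(name):
    """Decorator that records a drawing function under `name`."""
    def decorator(func):
        _BACKENDS[name] = func
        return func
    return decorator


@register_backend("plotly")
def _draw_plotly(values):
    return f"plotly figure with {len(values)} points"  # stand-in for a real figure


@register_backend("seaborn")
def _draw_seaborn(values):
    return f"seaborn figure with {len(values)} points"  # stand-in for a real figure


def dist_plot_sketch(values, backend="plotly"):
    """Same call signature for every backend; only the `backend` string changes."""
    try:
        draw = _BACKENDS[backend]
    except KeyError:
        raise ValueError(
            f"Unknown backend {backend!r}; choose from {sorted(_BACKENDS)}"
        ) from None
    return draw(values)


print(dist_plot_sketch([1.0, 2.0, 3.0]))
print(dist_plot_sketch([1.0, 2.0, 3.0], backend="seaborn"))
```

With this design, switching backends is a one-argument change for the caller, and adding a backend means registering one more function.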

Comprehensive Analysis

  • Goes beyond basic .describe() functionality
  • Advanced statistics and quality assessment
  • Data quality flags and recommendations

📚 Documentation

Core Functions

Function           Description                  Key Features
xray()             Comprehensive data analysis  Statistics, quality flags, performance metrics
corr_heatmap()     Correlation visualization    Multiple backends, filtering, target analysis
dist_plot()        Distribution analysis        Statistical overlays, multiple backends
missingval_plot()  Missing value patterns       Percentage/absolute counts, pattern analysis
data_cleaning()    Automated cleaning           Duplicates, missing values, optimization

Backends

  • Plotly (default): Interactive, web-ready visualizations
  • Seaborn: Statistical plotting with beautiful aesthetics
  • Altair: Grammar of graphics approach

🤝 Contributing

Contributions are welcome! Please feel free to submit issues, feature requests, or pull requests.


📄 License

MIT License - see LICENSE file for details.


🙏 Acknowledgments

Inspired by klib and ydata-profiling, but built from the ground up for Polars performance and modern data science workflows.


🔬 Ready to X-ray your data? Install Polarscope today!

Source Distribution

polarscope-1.2.0.tar.gz (53.2 kB view details)

Uploaded Source

Built Distribution

polarscope-1.2.0-py3-none-any.whl (52.9 kB view details)

Uploaded Python 3

File details

Details for the file polarscope-1.2.0.tar.gz.

File metadata

  • Download URL: polarscope-1.2.0.tar.gz
  • Upload date:
  • Size: 53.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for polarscope-1.2.0.tar.gz
Algorithm    Hash digest
SHA256       c7513ee0f506ef948998649bafc226b719770442eb4a2ed14ed10a6e94ae42fc
MD5          cee1e8ebe9545bb1dbcda4e24d8a2641
BLAKE2b-256  761b1c9ac22091edbed6e558b68095e8107b854e5545e32fa227f67d72c1c7d7

File details

Details for the file polarscope-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: polarscope-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 52.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for polarscope-1.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       30a9e8ec4aec3f2b5493eaddae899d4173066c1d2cb2e2aef9c2b7a60288cc6b
MD5          47ffc6489ea9e53daea706eecdfc88c0
BLAKE2b-256  05055e1127622a40405b635b6151568e6561a146b418dd1fbb1dc3a6780ffb6c
