A Python package for quick processing and transforming SPSS files

TidySPSS

A Python package for quickly processing, transforming, and managing SPSS (.sav) files, with support for Excel and CSV inputs. It is built on top of pyreadstat and pandas and gives you a flexible, production-ready template for converting data files into SPSS format with full control over metadata.

Philosophy

"Make simple things simple, and complex things possible"

🔄 Processing Flow

LOAD → TRANSFORM → CONFIGURE → SAVE
  1. LOAD: Read file with metadata preservation
  2. TRANSFORM: Apply any pandas operations directly
  3. CONFIGURE: Set SPSS-specific options
  4. SAVE: Output with all configurations applied

Features

  • 📁 Multi-format support: Read from SPSS (.sav/.zsav), Excel (.xlsx/.xls), and CSV files
  • 🔄 Comprehensive transformations: Reorder, rename, drop, and keep columns with ease
  • 🏷️ Metadata management: Full support for SPSS labels, formats, measures, and display widths
  • 🔧 Value replacement: Replace specific values across columns
  • 📊 Column positioning: Advanced column reordering with range specifications
  • 🌍 Encoding support: Automatic handling of multiple character encodings
  • 🔧 Production-ready: Comprehensive logging and error handling

Installation

Install using pip:

pip install tidyspss

Or using uv:

uv add tidyspss

Quick Start

Basic Usage

from tidyspss import read_input_file, process_and_save

# Read a file (automatically detects format)
df, meta = read_input_file("data.sav")  # or .xlsx, .csv

# Process and save with transformations
df, meta = process_and_save(
    df=df,
    meta=meta,
    output_path="output.sav",
    user_variable_rename={"old_name": "new_name"},
    user_variable_drop=["unwanted_col1", "unwanted_col2"],
    user_column_labels={"Q1": "Question 1", "Q2": "Question 2"}
)
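
To make the effect of the rename and drop options above (and of user_variable_keep from the API reference) concrete, here is a minimal pure-Python sketch that works on column names only. The helper plan_columns is hypothetical and not part of tidyspss, and the order used here (rename, then keep, then drop) is an assumption; the package may apply these steps differently.

```python
def plan_columns(columns, rename=None, drop=None, keep=None):
    """Return the column names left after rename, keep, and drop.

    Hypothetical helper for illustration only. The order of operations
    (rename -> keep -> drop) is an assumption, not documented behavior.
    """
    rename = rename or {}
    # Rename first, so keep/drop refer to the new names.
    result = [rename.get(name, name) for name in columns]
    if keep is not None:
        wanted = set(keep)
        # "keep" drops every column that is not listed.
        result = [name for name in result if name in wanted]
    unwanted = set(drop or [])
    return [name for name in result if name not in unwanted]


columns = ["old_name", "Q1", "unwanted_col1", "unwanted_col2", "Q2"]
remaining = plan_columns(
    columns,
    rename={"old_name": "new_name"},
    drop=["unwanted_col1", "unwanted_col2"],
)
print(remaining)
```

On a real DataFrame the same steps correspond to the pandas calls df.rename(columns=...), df[keep], and df.drop(columns=...).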

API Reference

Main Functions

read_input_file(file_path)

Reads a file into a pandas DataFrame with metadata.

  • Supports: .sav, .zsav, .xlsx, .xls, .csv
  • Returns: (DataFrame, metadata) tuple
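
Format detection of this kind is usually a lookup on the file extension. The sketch below shows one way it could work, using only the suffixes listed above; detect_format and the reader family names are hypothetical and say nothing about how tidyspss implements it internally.

```python
from pathlib import Path

# Suffixes come from the "Supports" list above; the family names on the
# right are illustrative labels, not tidyspss identifiers.
READERS = {
    ".sav": "spss",
    ".zsav": "spss",
    ".xlsx": "excel",
    ".xls": "excel",
    ".csv": "csv",
}


def detect_format(file_path):
    """Return the reader family for a path, based on its lowercased suffix."""
    suffix = Path(file_path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return READERS[suffix]


print(detect_format("data.sav"))
print(detect_format("Survey.XLSX"))
```

With the underlying libraries, the "spss" family would typically be read with pyreadstat.read_sav, which returns a (DataFrame, metadata) pair, while Excel and CSV files go through pandas.read_excel and pandas.read_csv and carry no SPSS metadata of their own.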

process_and_save(df, meta, output_path, **kwargs)

Processes DataFrame with configurations and saves to SPSS format.

Parameters:

  • df: Input DataFrame
  • meta: Metadata from SPSS file (or None)
  • output_path: Path for output .sav file
  • user_column_position: Dict for column reordering
  • user_variable_drop: List of columns to drop
  • user_variable_keep: List of columns to keep (drops all others)
  • user_variable_rename: Dict for renaming columns
  • user_value_replacement: Dict for replacing values
  • user_column_labels: Dict of column labels
  • user_variable_value_labels: Dict of value labels
  • user_variable_format: Dict of variable formats
  • user_variable_measure: Dict of variable measures
  • user_variable_display_width: Dict of display widths
  • user_missing_ranges: Dict of missing value ranges
  • user_note: File note string
  • user_file_label: File label string
  • user_compress: Boolean for file compression
  • user_row_compress: Boolean for row compression
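
Most of the metadata options above share their names with keyword arguments of pyreadstat.write_sav. The sketch below spells out that correspondence as a plain dictionary translation. The pairing is an assumption based on the matching names, build_write_kwargs is a hypothetical helper, and the example values only illustrate the shapes that pyreadstat accepts.

```python
# Assumed pairing of tidyspss option names with pyreadstat.write_sav keywords.
USER_TO_WRITE_SAV = {
    "user_column_labels": "column_labels",
    "user_variable_value_labels": "variable_value_labels",
    "user_variable_format": "variable_format",
    "user_variable_measure": "variable_measure",
    "user_variable_display_width": "variable_display_width",
    "user_missing_ranges": "missing_ranges",
    "user_note": "note",
    "user_file_label": "file_label",
    "user_compress": "compress",
    "user_row_compress": "row_compress",
}


def build_write_kwargs(**user_options):
    """Translate user_* options into write_sav keywords, skipping None values."""
    unknown = set(user_options) - set(USER_TO_WRITE_SAV)
    if unknown:
        raise TypeError(f"Unexpected options: {sorted(unknown)}")
    return {
        USER_TO_WRITE_SAV[name]: value
        for name, value in user_options.items()
        if value is not None
    }


kwargs = build_write_kwargs(
    user_column_labels={"Q1": "Question 1", "Q2": "Question 2"},
    user_variable_value_labels={"Q1": {1: "Yes", 2: "No"}},
    user_variable_measure={"Q1": "nominal"},
    user_note=None,  # unset options are left out of the result
    user_compress=True,
)
print(kwargs)
# With pyreadstat installed, these could be forwarded as:
#   pyreadstat.write_sav(df, "output.sav", **kwargs)
```

Note that pyreadstat does not allow compress and row_compress to both be true, so at most one of user_compress and user_row_compress should presumably be enabled.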

Requirements

  • Python ≥ 3.12
  • pandas ≥ 2.3.0
  • pyreadstat ≥ 1.3.0
  • openpyxl ≥ 3.0.0

License

MIT License - see LICENSE file for details.
