A Python package for quick processing and transforming SPSS files
TidySPSS
A Python package for quickly processing, transforming, and managing SPSS (.sav) files, with support for Excel and CSV inputs. It is built on top of pyreadstat and pandas and provides a flexible, production-ready template for converting data files to SPSS format with full control over metadata.
Philosophy
"Make simple things simple, and complex things possible"
🔄 Processing Flow
LOAD → TRANSFORM → CONFIGURE → SAVE
- LOAD: Read file with metadata preservation
- TRANSFORM: Apply any pandas operations directly
- CONFIGURE: Set SPSS-specific options
- SAVE: Output with all configurations applied
Features
- 📁 Multi-format support: Read from SPSS (.sav/.zsav), Excel (.xlsx/.xls), and CSV files
- 🔄 Comprehensive transformations: Reorder, rename, drop, and keep columns with ease
- 🏷️ Metadata management: Full support for SPSS labels, formats, measures, and display widths
- 🔧 Value replacement: Replace specific values across columns
- 📊 Column positioning: Advanced column reordering with range specifications
- 🌍 Encoding support: Automatic handling of multiple character encodings
- 🔧 Production-ready: Comprehensive logging and error handling
Installation
Install using pip:
pip install tidyspss
Or using uv:
uv add tidyspss
Quick Start
Basic Usage
from tidyspss import read_input_file, process_and_save

# Read a file (automatically detects format)
df, meta = read_input_file("data.sav")  # or .xlsx, .csv

# Process and save with transformations
df, meta = process_and_save(
    df=df,
    meta=meta,
    output_path="output.sav",
    user_variable_rename={"old_name": "new_name"},
    user_variable_drop=["unwanted_col1", "unwanted_col2"],
    user_column_labels={"Q1": "Question 1", "Q2": "Question 2"},
)
API Reference
Main Functions
read_input_file(file_path)
Reads a file into a pandas DataFrame with metadata.
- Supports: .sav, .zsav, .xlsx, .xls, .csv
- Returns: (DataFrame, metadata) tuple
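The README does not describe how read_input_file detects the format internally. As a minimal sketch of one plausible approach, the snippet below dispatches on the file extension. The names `READERS` and `detect_format` are hypothetical and are not part of the tidyspss API; only the extension list comes from the "Supports" line above.

```python
from pathlib import Path

# Hypothetical mapping for illustration only. The extensions are the ones
# listed under "Supports"; the reader labels are made up.
READERS = {
    ".sav": "spss",
    ".zsav": "spss",
    ".xlsx": "excel",
    ".xls": "excel",
    ".csv": "csv",
}


def detect_format(file_path: str) -> str:
    """Return a reader label for file_path based on its (case-insensitive) extension."""
    suffix = Path(file_path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or '(no extension)'}")
    return READERS[suffix]
```

Checking the extension up front like this gives a clear error for unsupported inputs before any file is opened.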
process_and_save(df, meta, output_path, **kwargs)
Processes DataFrame with configurations and saves to SPSS format.
Parameters:
- df: Input DataFrame
- meta: Metadata from SPSS file (or None)
- output_path: Path for output .sav file
- user_column_position: Dict for column reordering
- user_variable_drop: List of columns to drop
- user_variable_keep: List of columns to keep (drops all others)
- user_variable_rename: Dict for renaming columns
- user_value_replacement: Dict for replacing values
- user_column_labels: Dict of column labels
- user_variable_value_labels: Dict of value labels
- user_variable_format: Dict of variable formats
- user_variable_measure: Dict of variable measures
- user_variable_display_width: Dict of display widths
- user_missing_ranges: Dict of missing value ranges
- user_note: File note string
- user_file_label: File label string
- user_compress: Boolean for file compression
- user_row_compress: Boolean for row compression
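Because the options are passed as keyword arguments, a misspelled name may be easy to miss. The sketch below collects a few options in a dict and checks the keys against the parameter names listed above before the call. `DOCUMENTED_OPTIONS` and `unknown_options` are hypothetical helpers, not part of tidyspss, and the option values are made-up examples.

```python
# Option names copied from the parameter list above (excluding df, meta,
# and output_path, which are passed separately).
DOCUMENTED_OPTIONS = {
    "user_column_position", "user_variable_drop", "user_variable_keep",
    "user_variable_rename", "user_value_replacement", "user_column_labels",
    "user_variable_value_labels", "user_variable_format",
    "user_variable_measure", "user_variable_display_width",
    "user_missing_ranges", "user_note", "user_file_label",
    "user_compress", "user_row_compress",
}


def unknown_options(options: dict) -> list[str]:
    """Return, in sorted order, any keys that are not documented option names."""
    return sorted(set(options) - DOCUMENTED_OPTIONS)


# Example values are illustrative only.
options = {
    "user_variable_keep": ["id", "age", "Q1"],   # all other columns are dropped
    "user_column_labels": {"Q1": "Question 1"},
    "user_file_label": "Customer survey",
    "user_compress": True,
}

typos = unknown_options(options)
if typos:
    raise ValueError(f"Unknown options: {typos}")

# With tidyspss installed, the checked dict would then be unpacked into the call:
# df, meta = process_and_save(df=df, meta=meta, output_path="output.sav", **options)
```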
Requirements
- Python ≥ 3.12
- pandas ≥ 2.3.0
- pyreadstat ≥ 1.3.0
- openpyxl ≥ 3.0.0
License
MIT License - see LICENSE file for details.
File details
Details for the file tidyspss-0.1.0.tar.gz.
File metadata
- Download URL: tidyspss-0.1.0.tar.gz
- Upload date:
- Size: 7.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e1a1e5fdcfed5402491af1c30ad3fd2bcad8285c9b385049745440a8a8c30b2a |
| MD5 | 5db5c31bb87e6fffee6fda8bb130fd5b |
| BLAKE2b-256 | 76213e8216eb433271862e197dec0b16dbb6b5c5e700007243c592bf696952d3 |
File details
Details for the file tidyspss-0.1.0-py3-none-any.whl.
File metadata
- Download URL: tidyspss-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a07ef6f927c465830cf2ffd52eded5eb93c2ca70376785e9905387e5cf5cdc7 |
| MD5 | 1e561243f90c4dc0096d8e9119d167bd |
| BLAKE2b-256 | 47bee28665e376d8fa30aeefe96cbf6d8b27abbdb8975f6e2882eaa784e0c0c9 |