Prepup is a free, open-source package for data preprocessing in the terminal.

💻 Prepup: Interactive Data Preprocessing Toolkit

⚠️ PACKAGE RENAMED: prepup-linux → ride-cli

IMPORTANT: This package has been renamed to ride-cli. Please use the new package for all future installations and updates.

Migration Instructions

To migrate to the new package:

# Uninstall the old package
pip uninstall prepup-linux

# Install the new package
pip install ride-cli

All functionality remains the same. Only the package name and the command have changed:

  • Old command: prepup
  • New command: ride or ride-cli
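To confirm which of these commands is actually available after migrating, a small check like the following may help. It is only a sketch: it looks up the three command names listed above on your PATH using Python's standard library, and the function name `find_installed_commands` is a hypothetical helper, not part of either package.

```python
import shutil

# Command names taken from the migration notes above.
OLD_COMMAND = "prepup"
NEW_COMMANDS = ("ride", "ride-cli")


def find_installed_commands():
    """Return a mapping of command name -> resolved path (None if not on PATH)."""
    names = (OLD_COMMAND, *NEW_COMMANDS)
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_installed_commands().items():
        status = path if path else "not found on PATH"
        print(f"{name}: {status}")
```

Run it inside the same virtual environment you installed into; if `prepup` still resolves after uninstalling, another environment on your PATH probably still has the old package.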

Why the Change?

Prepup began in summer 2023 as the Preprocessing Utility Package (PrePUP), a learning project with just 5 terminal flags that evolved into a comprehensive data tool. We created prepup-linux to address cross-platform compatibility issues and used it to test our first menu-driven approach, but the name incorrectly suggested the tool was Linux-only, when our goal has always been platform independence.

We're now transitioning to RIDE-CLI (Rapid Insights Data Engine), a name that better reflects the tool's capabilities: rapid data preprocessing, insight generation, and cross-platform functionality. The rebranding marks our growth from a simple utility to a more capable data engine, and we remain committed to continuous improvement and new features on all platforms.


🚀 Quick Overview

Prepup is a terminal-based data preprocessing tool for exploring, cleaning, and preparing datasets. It is aimed at data scientists, analysts, and researchers who want an interactive, menu-driven workflow without leaving the command line.

✨ Features

Interactive Mode

  • 📊 Load datasets from various formats (CSV, Excel, Parquet)
  • 🔍 Comprehensive data inspection
  • 📈 Advanced data exploration
  • 🧹 Missing value handling
  • 📊 Feature visualization
  • 🤖 Automatic Machine Learning (AutoML) model selection
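For readers curious what format-aware loading can look like, here is a minimal sketch that picks a pandas reader from the file extension. It is an illustration under assumptions, not Prepup's actual implementation: the `READERS` mapping and the `load_dataset` function are hypothetical names, and the Excel and Parquet readers need optional engines installed.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file extension to a pandas reader function.
READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,       # needs an Excel engine such as openpyxl
    ".parquet": pd.read_parquet,  # needs pyarrow or fastparquet
}


def load_dataset(path):
    """Load a dataset into a DataFrame, choosing the reader by file extension."""
    suffix = Path(path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None
    return reader(path)
```

Dispatching on the extension keeps the loader easy to extend: supporting another format only means adding one entry to the mapping.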

Key Functionalities

  • Data Loading
  • Feature Inspection
  • Correlation Analysis
  • Distribution Checking
  • Outlier Detection
  • Missing Value Imputation
  • Feature Standardization
  • Automatic Model Training
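Two of these steps, missing value imputation and feature standardization, can be sketched in a few lines of pandas. This is an illustration of the general techniques (mean imputation and z-score scaling), not necessarily the strategies Prepup applies; `impute_mean` and `standardize` are hypothetical helper names.

```python
import pandas as pd


def impute_mean(df):
    """Fill missing values in numeric columns with each column's mean."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    out[numeric] = out[numeric].fillna(out[numeric].mean())
    return out


def standardize(df):
    """Rescale numeric columns to zero mean and unit (population) std."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    out[numeric] = (out[numeric] - out[numeric].mean()) / out[numeric].std(ddof=0)
    return out


if __name__ == "__main__":
    frame = pd.DataFrame({"height": [1.0, None, 3.0], "label": ["a", "b", "c"]})
    print(standardize(impute_mean(frame)))
```

Both helpers return a copy and leave non-numeric columns untouched. A constant column has zero standard deviation, so `standardize` yields NaN for it; a real pipeline would need to handle that case.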

🛠 Installation

⚠️ Important: Creating a virtual environment is highly recommended when installing prepup-linux. As a data processing library, it has various dependencies that may conflict with your existing packages.

Setting Up a Virtual Environment

Windows

# Create virtual environment
python -m venv prepup-env

# Activate virtual environment
prepup-env\Scripts\activate

# Deactivate when done
deactivate

Linux/macOS

# Create virtual environment
python3 -m venv prepup-env

# Activate virtual environment
source prepup-env/bin/activate

# Deactivate when done
deactivate

Using pip

# Inside your activated virtual environment
pip install prepup-linux

From Source

# Inside your activated virtual environment
git clone https://github.com/sudhanshumukherjeexx/prepup-linux.git
cd prepup-linux
pip install .

💻 Usage

Interactive Mode

prepup

Loading a Specific Dataset

prepup path/to/your/dataset.csv

Main Menu Options

  1. Load Dataset
  2. Inspect Data
  3. Explore Data
  4. Visualize Data
  5. Impute Missing Values
  6. Standardize Features
  7. Export Data
  8. AutoML (Train & Evaluate Models)

🎮 Interactive Workflow Example

  1. Launch Prepup: run prepup

  2. Load Your Dataset: Choose option 1 and enter your dataset path

  3. Inspect Data: Use option 2 to explore features, data types, and missing values

  4. Preprocess: Impute missing values and standardize features

  5. Analyze: Visualize data distributions, perform correlation analysis, and run AutoML for model selection
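The inspection and analysis steps in this workflow correspond to familiar pandas operations. The sketch below shows one way to summarize data types and missing values, compute correlations, and flag outliers with the 1.5 × IQR rule. It assumes nothing about Prepup's internals: `inspect`, `iqr_outliers`, and the sample data are hypothetical.

```python
import pandas as pd


def inspect(df):
    """Summarize each column: data type and number of missing values."""
    return pd.DataFrame({"dtype": df.dtypes.astype(str), "missing": df.isna().sum()})


def iqr_outliers(series, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


if __name__ == "__main__":
    data = pd.DataFrame({"x": [1, 2, 3, 4, 100], "y": [2.0, 4.0, None, 8.0, 10.0]})
    print(inspect(data))
    print(data.corr(numeric_only=True))  # pairwise Pearson correlation
    print(data[iqr_outliers(data["x"])])  # rows whose x value is flagged
```

In the sample data only the row with `x = 100` is flagged, since the quartiles of `x` are 2 and 4 and the accepted range is therefore -1 to 7.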

🤖 AutoML Capabilities

  • Supports both Classification and Regression tasks
  • Evaluates multiple machine learning algorithms
  • Provides performance metrics
  • Saves results to CSV
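As a rough picture of what evaluating multiple algorithms involves, the following sketch cross-validates a few scikit-learn classifiers and writes the ranked scores to a CSV file. The choice of models, the accuracy metric, the `compare_models` function, and the output file name are all assumptions made for illustration; Prepup's own model list and metrics may differ.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, cv=5):
    """Cross-validate a few classifiers and return mean accuracy per model."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    }
    rows = []
    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        rows.append({"model": name, "mean_accuracy": scores.mean()})
    # Best-scoring model first.
    return pd.DataFrame(rows).sort_values(
        "mean_accuracy", ascending=False, ignore_index=True
    )


if __name__ == "__main__":
    # Synthetic data stands in for a real dataset.
    X, y = make_classification(n_samples=200, n_features=8, random_state=0)
    results = compare_models(X, y)
    results.to_csv("model_comparison.csv", index=False)  # hypothetical file name
    print(results)
```

A regression task would follow the same pattern with regressors and a metric such as R² in place of accuracy.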

📦 Dependencies

  • NumPy
  • Pandas
  • Scikit-learn
  • Matplotlib
  • and more (see requirements.txt)

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📋 License

Distributed under the MIT License. See LICENSE for more information.

🔄 Migration Notice

This package is deprecated and will no longer receive updates. Please migrate to ride-cli for the latest features and support.
