Data processing and quality control tools for research workflows

ScriptCraft Python Package

A Python package of data processing and quality control tools for research workflows, built primarily for Huntington's Disease research.

🚀 Features

  • Data Processing Tools: Automated data cleaning, validation, and transformation
  • Quality Control: Comprehensive validation frameworks with plugin support
  • Research Workflows: Specialized tools for clinical and biomarker data
  • Release Management: Automated PyPI and Git release workflows
  • Pipeline Orchestration: Multi-step workflow automation
  • Extensible Architecture: Plugin-based system for custom validations
  • Cross-Platform: Works on Windows, macOS, and Linux
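
To make the plugin idea concrete, here is a minimal sketch of a decorator-based validator registry. It is an illustration only: `register_validator`, `run_validators`, and the `age_range` check are hypothetical names invented for this example, and ScriptCraft's actual plugin API may look quite different.

```python
from typing import Callable, Dict, List

# Hypothetical registry for illustration; not ScriptCraft's real plugin API.
Validator = Callable[[dict], List[str]]
_VALIDATORS: Dict[str, Validator] = {}


def register_validator(name: str) -> Callable[[Validator], Validator]:
    """Decorator that registers a validation function under a name."""
    def decorator(func: Validator) -> Validator:
        _VALIDATORS[name] = func
        return func
    return decorator


@register_validator("age_range")
def check_age_range(row: dict) -> List[str]:
    """Flag ages that are missing or outside an illustrative 0-120 range."""
    age = row.get("age")
    if age is None or not 0 <= age <= 120:
        return [f"age out of range: {age!r}"]
    return []


def run_validators(row: dict) -> Dict[str, List[str]]:
    """Run every registered validator and keep only the non-empty results."""
    results = {}
    for name, func in _VALIDATORS.items():
        messages = func(row)
        if messages:
            results[name] = messages
    return results
```

With a registry like this, adding a new validation means writing one decorated function; the code that runs the checks does not change.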

📦 Installation

pip install scriptcraft-python

🛠️ Quick Start

Basic Usage

import scriptcraft
import scriptcraft.common as cu

# Use common utilities
data = cu.load_data("your_data.csv")
cu.log_and_print("✅ Data loaded successfully")

Using Tools

# Import tools directly
from scriptcraft.tools.automated_labeler import AutomatedLabeler
from scriptcraft.tools.data_content_comparer import DataContentComparer

# Create and use tools
labeler = AutomatedLabeler()
comparer = DataContentComparer()

# Run tools with arguments
labeler.run(
    input_paths=["data.csv"],
    output_dir="output",
    mode="labeling"
)

CLI Usage

# List available tools and pipelines
scriptcraft list

# Run specific tools
scriptcraft rhq_form_autofiller
scriptcraft data_content_comparer

# Run pipelines
scriptcraft data_quality
scriptcraft dictionary_pipeline

# Use release management CLI
scriptcraft-release pypi-test
scriptcraft-release git-sync
scriptcraft-release full-release

# Use release manager directly (RECOMMENDED for version bumps)
python -c "from scriptcraft.tools.release_manager import ReleaseManager; ReleaseManager().run(mode='python_package', version_type='patch', auto_push=True)"

# Run specific tools via console scripts
rhq-autofiller --help
data-comparer --help
auto-labeler --help
function-auditor --help

# Or run tool modules directly with python -m
python -m scriptcraft.tools.rhq_form_autofiller --help
python -m scriptcraft.tools.data_content_comparer --help
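
If you want to drive the `python -m` form from a script, a small wrapper along these lines may help. This is a sketch, not part of ScriptCraft: `build_tool_command` and `run_tool` are hypothetical helpers, and only the module names shown above are taken from this README.

```python
import subprocess
import sys
from typing import List, Sequence


def build_tool_command(tool_module: str, args: Sequence[str] = ()) -> List[str]:
    """Build a `python -m scriptcraft.tools.<tool>` command for the current interpreter."""
    return [sys.executable, "-m", f"scriptcraft.tools.{tool_module}", *args]


def run_tool(tool_module: str, args: Sequence[str] = ()) -> int:
    """Run a tool module in a subprocess and return its exit code."""
    completed = subprocess.run(build_tool_command(tool_module, args), check=False)
    return completed.returncode
```

For example, `run_tool("rhq_form_autofiller", ["--help"])` would run the same command as the first line above, using whichever interpreter is running the script.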

🧰 Available Tools

Data Processing

  • AutomatedLabeler: Automated data labeling and classification
  • DataContentComparer: Compare datasets for consistency
  • SchemaDetector: Automatic schema detection and validation
  • DateFormatStandardizer: Standardize date formats across datasets
  • DictionaryCleaner: Clean and validate dictionary files
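
As a rough illustration of what date format standardization involves, the sketch below normalizes a few common formats to ISO 8601 using only the standard library. It is not the DateFormatStandardizer implementation; the list of accepted formats and the function name are assumptions made for this example.

```python
from datetime import datetime
from typing import Optional

# Candidate input formats, tried in order; an assumption for this sketch.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def standardize_date(value: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    text = value.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

Returning `None` for unrecognized values, rather than raising, lets a caller collect every problem row in one pass.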

Quality Control

  • DictionaryDrivenChecker: Validation using predefined dictionaries
  • DictionaryValidator: Validate dictionary structures
  • MedVisitIntegrityValidator: Validate medical visit data integrity
  • ScoreTotalsChecker: Validate score calculations
  • FeatureChangeChecker: Detect changes in data features
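
To show the kind of check a score totals validation performs, here is a minimal standalone sketch that compares a recorded total against the sum of its item scores. It does not use ScoreTotalsChecker itself; the function name, column names, and tolerance are hypothetical.

```python
from typing import Dict, List, Sequence


def find_total_mismatches(
    rows: Sequence[Dict[str, float]],
    item_columns: Sequence[str],
    total_column: str,
    tolerance: float = 1e-9,
) -> List[int]:
    """Return indices of rows whose recorded total differs from the item sum."""
    mismatches = []
    for index, row in enumerate(rows):
        expected = sum(row[col] for col in item_columns)
        if abs(expected - row[total_column]) > tolerance:
            mismatches.append(index)
    return mismatches
```

Calling `find_total_mismatches(rows, ["q1", "q2"], "total")` returns the positions of rows that need review.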

Automation

  • RHQFormAutofiller: Automated form filling for research questionnaires
  • DictionaryWorkflow: Complete dictionary processing workflows

Release Management

  • PyPIReleaseTool: Automated PyPI package testing and release
  • GitWorkspaceTool: Git repository management and operations
  • GitSubmoduleTool: Git submodule synchronization and management
  • GenericReleaseTool: Flexible release workflow orchestration

🔧 Development

Installation for Development

# Clone the repository
git clone https://github.com/yourusername/scriptcraft-python.git
cd scriptcraft-python

# Install in development mode
pip install -e .

Running Tests

# Run all tests
python -m pytest

# Run specific test categories
python -m pytest tests/unit/
python -m pytest tests/integration/

📚 Documentation

For comprehensive documentation, examples, and advanced usage:

  • Main Documentation: ScriptCraft Workspace
  • Tool Documentation: See individual tool README files
  • API Reference: Available in the main workspace documentation

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built for the Huntington's Disease research community
  • Developed with support from research institutions
  • Thanks to all contributors and users

ScriptCraft Python Package - Making research data processing easier, one tool at a time.

Download files

Source Distribution

scriptcraft_python-1.6.3.tar.gz (280.5 kB view details)

Uploaded Source

Built Distribution

scriptcraft_python-1.6.3-py3-none-any.whl (379.7 kB view details)

Uploaded Python 3

File details

Details for the file scriptcraft_python-1.6.3.tar.gz.

File metadata

  • Download URL: scriptcraft_python-1.6.3.tar.gz
  • Upload date:
  • Size: 280.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.9

File hashes

Hashes for scriptcraft_python-1.6.3.tar.gz
Algorithm Hash digest
SHA256 6918c5c0b2bea378985114753f15e5bee473f889e00d9048ac3dd8467119ed43
MD5 5b148c957bf9c0466acf292f06c7b78a
BLAKE2b-256 ad5b015af3d42dd49591dbe50625d4ec23caa8f559615bab620be4aa4f6adfe3

File details

Details for the file scriptcraft_python-1.6.3-py3-none-any.whl.

File hashes

Hashes for scriptcraft_python-1.6.3-py3-none-any.whl
Algorithm Hash digest
SHA256 71697ce43ccde4de185ca6bec2400e1859ebb941d6885fa1301662f844a991e8
MD5 1683e42376d8cfd816f914de1a9d1f96
BLAKE2b-256 baefb371257bee3da9b7f650e1bacf8104b0c4dcd9bf433df4c594bd7783c1d3
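
One way to check a downloaded file against the SHA256 values listed above is sketched below, using only the Python standard library. The helper names are made up for this example, and the file path you pass in is whatever location you downloaded the archive to.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest to a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution, you would call `matches_published_hash(Path("scriptcraft_python-1.6.3.tar.gz"), "6918c5c0b2bea378985114753f15e5bee473f889e00d9048ac3dd8467119ed43")` and expect `True` for an intact download.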
