Skip to main content

Automated processing and analysis of MicroED data

Project description

pyautoprocess

A Python package for automated processing and analysis of MicroED (Micro-Electron Diffraction) data.

Overview

This package provides a comprehensive suite of tools for automated MicroED data processing:

  • autoprocess: Core script for automated MicroED data processing
  • image_process: Preconverted image processing and quality analysis tool
  • batch_reprocess: Batch reprocessing tool with specific space group and unit cell parameters
  • mrc2tif: Utility for converting MRC files to TIF format
  • ser2tif: Utility for converting SER files to TIF format

Installation

Install from PyPI:

pip install pyautoprocess

Or install from source:

git clone https://github.com/theNelsonLab/autoprocess.git
cd autoprocess
pip install .

## Requirements

### Python Dependencies
- Python >= 3.8
- numpy ~= 2.0
- mrcfile ~= 1.5
- tifffile >= 2025.1.10
- seremi == 0.0.1
- scipy >= 1.7.0
- scikit-image >= 0.19.0

### External Software Dependencies
1. **XDS Software Suite**: Required for crystallographic data processing
   - Must be accessible via the `xds` command
   - Available from [xds.mr.mpg.de](https://xds.mr.mpg.de/)

2. **Pointless**: Required for space group analysis (pointless)
   - Part of the CCP4 software suite

## Script Details

### 1. autoprocess

#### Description
Primary tool for automated MicroED data processing using the XDS suite. Handles image conversion, indexing, integration, space group analysis, and scaling with intelligent parameter optimization.

#### Key Features
- **File Processing**: Processes .mrc/.ser files from specified paths or current directory
- **Automatic Image Conversion**: Converts source files to XDS-compatible TIF format with validation
- **Diffraction Quality Analysis**: Optional frame quality assessment and intelligent frame selection (--dqa)
- **Dynamic XDS Optimization**: Automatic parameter adjustment for indexing, integration, and scaling
- **Advanced Space Group Analysis**: Intelligent space group optimization using lattice analysis
- **Pointless Integration**: Optional CCP4 pointless analysis for space group validation (--pointless)
- **Parallel Processing**: Support for parallel XDS execution (--parallel)
- **Comprehensive Logging**: Detailed processing logs and progress tracking
- **Microscope Configurations**: Built-in support for multiple instrument configurations
- **Reprocessing Control**: Skip already processed files or force reprocessing (--reprocess)

#### Usage
```bash
autoprocess [paths] [options]

Positional Arguments:
  paths                    Path(s) to process: single .mrc/.ser file, folder containing files,
                          or multiple files/folders. If not specified, processes all files
                          in current directory.

Microscope Configuration:
  --microscope-config CONFIG  Choose instrument configuration (default: default)
  --config-file FILE          Path to microscope configuration file

Processing Control:
  --reprocess                 Reprocess files even if they have been processed before
  --pointless                 Run pointless for space group analysis
  --parallel                  Use parallel XDS (xds_par) instead of serial XDS
  --dqa                       Enable diffraction quality analysis and frame selection
  --verbose                   Enable verbose logging for detailed conversion validation

XDS Parameters Override:
  --rotation-axis AXIS        Override rotation axis
  --frame-size SIZE          Override frame size
  --signal-pixel VALUE       Override signal pixel value (XDS parameter)
  --min-pixel VALUE         Override minimum pixel value (XDS parameter)
  --background-pixel VALUE   Override background pixel value (XDS parameter)
  --pixel-size VALUE        Override pixel size value
  --wavelength VALUE        Override wavelength value
  --beam-center-x VALUE     Override beam center X coordinate
  --beam-center-y VALUE     Override beam center Y coordinate
  --file-extension EXT      Override input file extension

Experimental Parameters Override:
  --detector-distance VALUE  Override detector distance (in mm)
  --exposure VALUE           Override exposure time
  --rotation VALUE          Override rotation value

2. image_process

Description

Processing tool for pre-converted crystallography images with backup management, quality analysis, and flexible format support.

Key Features

  • Pre-Converted Processing: Process existing TIF/IMG image files without conversion
  • Format Flexibility: Support for both TIF and SMV (.img) formats (--smv)
  • Quality Analysis Integration: Optional diffraction quality analysis with frame selection (--dqa)
  • Frame Trimming: Precise frame range control with --trim-front and --trim-end
  • Backup Management: Automatic backup system with organized archive folders
  • Multi-Path Processing: Process multiple directories or specific paths simultaneously
  • Microscope Configurations: Support for all standard microscope configurations
  • Processing Isolation: Separate output folders (auto_process_direct) to avoid conflicts

Usage

image_process [paths] [options]

Positional Arguments:
  paths                    Path(s) to process: folders containing pre-converted images.
                          If not specified, processes all suitable folders in current directory.

File Format Options:
  --smv                    Process SMV (.img) files instead of TIF files

Frame Selection:
  --trim-front N           Number of frames to trim from the start of the range (default: 0)
  --trim-end N             Number of frames to trim from the end of the range (default: 0)
  --dqa                    Enable diffraction quality analysis and frame selection

Microscope Configuration:
  --microscope-config CONFIG  Choose instrument configuration (default: default)
  --config-file FILE          Path to microscope configuration file

Processing Control:
  --pointless                 Run pointless for space group analysis
  --parallel                  Use parallel XDS (xds_par) instead of serial XDS
  --verbose                   Enable verbose logging

XDS Parameters Override:
  --rotation-axis AXIS        Override rotation axis
  --frame-size SIZE          Override frame size
  --signal-pixel VALUE       Override signal pixel value
  --min-pixel VALUE         Override minimum pixel value
  --background-pixel VALUE   Override background pixel value
  --pixel-size VALUE        Override pixel size value
  --wavelength VALUE        Override wavelength value
  --beam-center-x VALUE     Override beam center X coordinate
  --beam-center-y VALUE     Override beam center Y coordinate

Experimental Parameters Override:
  --detector-distance VALUE  Override detector distance (in mm)
  --exposure VALUE           Override exposure time
  --rotation VALUE          Override rotation value

Examples:
  # Process TIF images in current directory
  image_process

  # Process SMV files with frame trimming
  image_process --smv --trim-front 5 --trim-end 10 /path/to/images

  # Quality analysis with multiple paths
  image_process --dqa /data/sample1 /data/sample2 /data/sample3

  # Advanced processing with parameter overrides
  image_process --parallel --pointless --signal-pixel 6 --trim-front 3 /path/to/data

3. batch_reprocess

Description

Advanced tool for batch reprocessing of crystallography data with specific space group and unit cell parameters. Supports both manual parameter specification and smart automatic detection mode.

Key Features

  • Smart Detection Mode: Automatically detect optimal parameters from existing processing results (--smart)
  • Manual Parameter Mode: Specify exact space group and unit cell parameters (--manual, default)
  • Interactive Input: Prompts for missing parameters when not provided via command line
  • Flexible Reindexing: Support for custom reindexing matrices
  • Batch Processing: Process multiple datasets with consistent or adaptive parameters
  • Archive Management: Organized backup and folder management for reprocessed data

Usage

batch_reprocess [paths] [options]

Positional Arguments:
  paths                    Path(s) to reprocess: folders containing processed data

Processing Mode:
  --manual                 Manual parameter specification mode (default)
  --smart                  Smart detection mode - automatically detect parameters

Crystallographic Parameters (Manual Mode):
  --space-gr NUMBER        Space group number
  --a VALUE               Unit cell parameter a (Å)
  --b VALUE               Unit cell parameter b (Å)
  --c VALUE               Unit cell parameter c (Å)
  --alpha VALUE           Unit cell angle alpha (degrees)
  --beta VALUE            Unit cell angle beta (degrees)
  --gamma VALUE           Unit cell angle gamma (degrees)
  --folder NAME           Subfolder name for reprocessed data
  --reidx "MATRIX"        Reindexing matrix as space-separated string
                          (e.g., "1 0 0 0 0 1 0 0 0 0 1 0")

Examples:
  # Smart mode - auto-detect parameters
  batch_reprocess --smart /path/to/data/folders

  # Manual mode with specific parameters
  batch_reprocess --space-gr 19 --a 50.1 --b 60.2 --c 70.3 \
                  --alpha 90 --beta 90 --gamma 90 --folder reprocess_P212121

  # Interactive mode (prompts for missing parameters)
  batch_reprocess /path/to/data

4. mrc2tif

Description

Precision utility for converting MRC movie files to TIF format with comprehensive verification, statistics, and data integrity checking.

Key Features

  • Multi-Format Support: Single and multi-frame MRC file processing
  • Data Type Preservation: Intelligent data type handling and conversion
  • Pedestal Addition: Optional pedestal value addition with range validation
  • Raw Conversion Mode: Preserve original data without type conversion (--raw)
  • Comprehensive Verification: Frame-by-frame conversion validation with statistical analysis
  • Detailed Logging: Split logging (file and console) with conversion statistics
  • Recursive Processing: Process files in subdirectories (--recursive)
  • Custom Naming: Flexible output file naming with --tif-name option
  • Data Range Validation: Automatic checking for uint16 overflow/underflow
  • Directory Management: Automatic 'images' subdirectory creation

Usage

mrc2tif [options]

Options:
  --folder PATH           Path to folder containing MRC files (default: current directory)
  --ped VALUE            Pedestal value to add to each pixel (default: 0)
  --tif-name NAME        Base name for output TIF files (default: same as MRC filename)
  --recursive            Search for MRC files recursively in subdirectories
  --raw                  Convert data without any type conversion or modifications

Examples:
  # Convert all MRC files in current directory
  mrc2tif

  # Convert with pedestal addition
  mrc2tif --ped 100 --folder /path/to/mrc/files

  # Recursive conversion with custom naming
  mrc2tif --recursive --tif-name sample_data --folder /data/root

  # Raw conversion preserving original data types
  mrc2tif --raw --folder /path/to/data

5. ser2tif

Description

Specialized utility for converting SER (Serial Electron Microscopy) movie files to TIF format using the seremi library for precise SER file handling and frame extraction.

Key Features

  • SER Format Expertise: Native SER file format support using seremi library
  • TIA Compatibility: Full compatibility with TIA and other SER-generating software
  • Frame Extraction: Automatic individual frame extraction and processing
  • Data Integrity: Comprehensive conversion verification and validation
  • Pedestal Support: Optional pedestal value addition with validation
  • Raw Conversion: Raw data conversion mode preserving original formats (--raw)
  • Detailed Logging: Split logging system with comprehensive statistics
  • Recursive Processing: Search subdirectories for SER files (--recursive)
  • Custom Naming: Flexible output file naming control
  • Directory Management: Automatic 'images' subdirectory organization

Usage

ser2tif [options]

Options:
  --folder PATH           Path to folder containing SER files (default: current directory)
  --ped VALUE            Pedestal value to add to each pixel (default: 0)
  --tif-name NAME        Base name for output TIF files (default: same as SER filename)
  --recursive            Search for SER files recursively in subdirectories
  --raw                  Convert data without any type conversion or modifications

Examples:
  # Convert all SER files in current directory
  ser2tif

  # Convert with pedestal and custom naming
  ser2tif --ped 50 --tif-name converted_frames --folder /data/ser_files

  # Recursive search with raw conversion
  ser2tif --recursive --raw --folder /microscopy/data

File Naming Convention

For .ser Files

sample-name_distance_rotation_exposure_additional-notes.ser

Example: sample-mov1_960_0.3_3_n60top10_g8sp10_cryo.ser

  • distance: Detector distance in mm
  • rotation: Rotation speed in degrees/second
  • exposure: Exposure time in seconds

Directory Structure

working_directory/
├── sample_name/
│   ├── images/
│   │   └── (converted image files)
│   ├── auto_process/
│   │   └── (XDS processing files)
│   └── batch_reprocess/
│       └── (reprocessed data files)
├── autoprocess_logs/
│   └── (processing log files)
└── logs/
    └── (conversion log files)

Error Handling

  • All scripts include comprehensive error handling and logging
  • Detailed logs are generated in the respective log directories
  • Processing statistics and verification results are recorded
  • Failed processes are clearly identified in the logs

Contributing

Contributions are welcome! Please submit issues and pull requests to the project repository.

License

This project is licensed under the GPL-3.0-or-later License.

Authors and Contributors

  • Sam Foxman (sfoxman@caltech.edu) - Package maintainer and development lead
  • Dmitry Eremin (eremin@caltech.edu) - Core development and enhancements
  • Jessica Burch - Original autoprocess.py implementation

Acknowledgments

  • Nelson Lab at Caltech for ongoing support and development
  • XDS developers for data processing capabilities
  • CCP4 Software Suite for crystallographic tools

Version History

  • v0.2.0: Major refactoring and DQA implementation
    • Implemented quality analysis and diffraction assessment
    • Added modular architecture with separate core, config, and UI modules
  • v0.1.x: Enhanced autoprocess functionality
    • Restructured as installable Python package (pyautoprocess)
    • Added image_process tool for pre-converted image processing
    • Enhanced error handling and logging capabilities
    • Added support for frame trimming and SMV format
    • Added pointless and parallel processing flags
    • Improved parameter handling and microscope configurations
  • v0.0.x: Initial development releases
    • Core autoprocess.py functionality
    • Basic batch reprocessing capabilities
    • MRC to TIF conversion utilities

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pyautoprocess-0.3.0-py3-none-any.whl (82.2 kB view details)

Uploaded Python 3

File details

Details for the file pyautoprocess-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: pyautoprocess-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 82.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for pyautoprocess-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 08dbf755e93b52365515e9b7c6d953b93e8df4c0773dfe7bb1db853f7de549ee
MD5 d8c3d38e2047479447a25517218efd6a
BLAKE2b-256 3c6ac54225c6eff2f92e26d5dd9a95969826cb6fdd0cf28db7b7940c12189857

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page