Sample Handling and Analysis Kit for Experiments
FAIRshake
FAIRshake (Sample Handling and Analysis Kit for Experiments) is a data processing pipeline for benchmarking and processing datasets, with a focus on diffraction data analysis. It includes modules for benchmarking, data loading, preprocessing, integration, and exporting.
Features
- Benchmarking Modules: Assess the performance of data processing workflows.
- Data Loading: Efficient handling of large-scale datasets.
- Preprocessing: Data cleaning, normalization, and noise reduction.
- Integration: Combine data from various formats and sources seamlessly.
- Exporting: Output processed data in multiple formats for further analysis.
Installation
Requirements
- Python 3.11 or higher
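A quick way to confirm that the interpreter meets this requirement before installing is a version check like the one below. It is a minimal sketch; the helper name `meets_requirement` is made up for illustration and is not part of FAIRshake.

```python
import sys

# Minimum version taken from the requirement above.
MINIMUM = (3, 11)


def meets_requirement(version=None, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    status = "OK" if meets_requirement() else "too old for FAIRshake"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```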
From PyPI
pip install FAIRshake
From Source
Clone the repository and install FAIRshake locally:
git clone https://github.com/cwru-sdle/FAIRshake.git
cd FAIRshake
pip install .
Usage
FAIRshake provides command-line tools and modules for data processing, benchmarking, and integration of diffraction data.
Command-Line Interface
After installation, you can use the fairshake command. Use fairshake --help to see available commands:
fairshake --help
Data Processing Pipeline
To run the data processing pipeline on your dataset:
fairshake process --config <config-file> --data-dir <data-directory> --output-dir <output-directory>
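When the pipeline is launched from a script rather than a shell, the same command can be assembled as an argument list. The sketch below uses only the three flags shown above. The function name `build_process_command` and the example paths are placeholders, and the `subprocess.run` call is left commented out because it requires FAIRshake to be installed.

```python
import os
import shlex


def build_process_command(config_file, data_dir, output_dir):
    """Assemble the `fairshake process` call as an argument list."""
    return [
        "fairshake", "process",
        "--config", os.fspath(config_file),
        "--data-dir", os.fspath(data_dir),
        "--output-dir", os.fspath(output_dir),
    ]


# Placeholder paths, for illustration only.
cmd = build_process_command("config.json", "data/raw", "data/processed")
print(shlex.join(cmd))

# To actually run it (requires the fairshake command on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list instead of a single string avoids shell quoting problems when paths contain spaces.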
Example Configuration File
Create a configuration file (e.g., config.json) specifying parameters for preprocessing, integration, and exporting:
{
  "preprocessing": {
    "dark_field_path": "path/to/dark_field.ge2",
    "mask_file_path": "path/to/mask.edf",
    "invert_mask": true,
    "min_intensity": 0.0,
    "max_intensity": null
  },
  "integration": {
    "poni_file_path": "calibration_files/det0.poni",
    "npt_radial": 500,
    "unit": "2th_deg",
    "do_solid_angle": false,
    "error_model": "poisson",
    "radial_range": [3, 13],
    "azimuth_range": [-180, 180],
    "polarization_factor": 0.99,
    "method": ["full", "histogram", "cython"]
  },
  "exporting": {
    "output_directory": "path/to/output",
    "naming_convention": "{GE_filenumber}_{iter}",
    "options": {
      "do_remove_nan": true,
      "unit": "2th_deg"
    },
    "file_format": "fxye"
  }
}
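Before handing a configuration file to the pipeline, it can help to check that it parses and contains the expected sections. The sketch below is not FAIRshake's own validation, which may check more or work differently. It only tests for the three top-level sections used in the example and for a well-ordered `radial_range`. The name `load_config` is hypothetical.

```python
import json

# Top-level sections that appear in the example configuration.
REQUIRED_SECTIONS = ("preprocessing", "integration", "exporting")


def load_config(text):
    """Parse a JSON config string and run basic sanity checks on it."""
    config = json.loads(text)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError(f"missing config sections: {', '.join(missing)}")

    # If a radial range is given, expect [low, high] with low < high.
    radial_range = config["integration"].get("radial_range")
    if radial_range is not None:
        if len(radial_range) != 2 or not radial_range[0] < radial_range[1]:
            raise ValueError(f"invalid radial_range: {radial_range}")
    return config


example = """
{
  "preprocessing": {"min_intensity": 0.0, "max_intensity": null},
  "integration": {"npt_radial": 500, "unit": "2th_deg", "radial_range": [3, 13]},
  "exporting": {"file_format": "fxye"}
}
"""
config = load_config(example)
print(sorted(config))
```

To check a file on disk, its contents can be passed in with `pathlib.Path("config.json").read_text()`.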
Benchmarking
To benchmark the performance of the data processing pipeline:
fairshake benchmark --data-dir <data-directory> \
    --iterations <iterations> \
    --batch-size <batch-size> \
    --files-per-dataset <files-per-dataset>
Example:
fairshake benchmark --data-dir data/benchmark_files \
    --iterations 1 \
    --batch-size 5 \
    --files-per-dataset 10
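To compare settings, the benchmark can be repeated over several batch sizes. The sketch below only generates the command lines, using the flags from the example above. The function name `benchmark_commands` and the chosen batch sizes are illustrative, not part of FAIRshake.

```python
import shlex


def benchmark_commands(data_dir, batch_sizes, iterations=1, files_per_dataset=10):
    """Return one `fairshake benchmark` argument list per batch size."""
    commands = []
    for batch_size in batch_sizes:
        commands.append([
            "fairshake", "benchmark",
            "--data-dir", data_dir,
            "--iterations", str(iterations),
            "--batch-size", str(batch_size),
            "--files-per-dataset", str(files_per_dataset),
        ])
    return commands


# Batch sizes here are arbitrary example values.
commands = benchmark_commands("data/benchmark_files", batch_sizes=[1, 5, 10])
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once FAIRshake is installed.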
Programmatic Usage
You can use FAIRshake modules directly in your Python scripts:
from FAIRshake.execution_pipeline.pipeline import ExecutionPipeline

# Configuration Parameters
input_base_dir = 'path/to/input'
output_base_dir = 'path/to/output'

# Preprocessing configuration
preprocessing_config = {
    "dark_field_path": "path/to/dark_field.ge2",
    "mask_file_path": "path/to/mask.edf",
    "invert_mask": True,
    "min_intensity": 0.0,
    "max_intensity": None,
}

# Integration configuration
integration_config = {
    "poni_file_path": "calibration_files/det0.poni",
    "npt_radial": 500,
    "unit": "2th_deg",
    "do_solid_angle": False,
    "error_model": "poisson",
    "radial_range": (3, 13),
    "azimuth_range": [-180, 180],
    "polarization_factor": 0.99,
    "method": ["full", "histogram", "cython"]
}

# Exporting configuration
exporting_config = {
    "output_directory": output_base_dir,
    "naming_convention": "{GE_filenumber}_{iter}",
    "options": {
        "do_remove_nan": True,
        "unit": "2th_deg"
    },
    "file_format": "fxye"
}

# Pipeline parameters
pipeline_params = {
    "input_base_dir": input_base_dir,
    "output_base_dir": output_base_dir,
    "batch_size": 10,
    "data_file_types": ['.ge2', '.tif', '.edf', '.cbf', '.mar3450', '.h5', '.png'],
    "metadata_file_types": ['.json', '.poni', '.instprm', '.geom', '.spline'],
    "require_metadata": True,
    "load_metadata_files": True,
    "load_detector_metadata": False,
    "require_all_formats": False,
    "average_frames": False,
    "enable_profiling": True,
    "tf_data_debug_mode": False,
    "pattern": '*/*/*',
    "preprocessing_config": preprocessing_config,
    "enable_preprocessing": True,
    "enable_integration": True,
    "integration_config": integration_config,
    "enable_exporting": True,
    "exporting_config": exporting_config,
    "log_level": "ERROR"
}

# Initialize the Execution Pipeline
pipeline = ExecutionPipeline(**pipeline_params)

# Run the Pipeline
pipeline.run()
Ensure that you define preprocessing_config, integration_config, and exporting_config according to your requirements.
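If the same settings already live in a JSON file like the one shown for the command-line interface, they can be reused for the programmatic call. The sketch below maps the three JSON sections onto the keyword names used in `pipeline_params` above. It assumes the JSON keys match the Python dictionaries one-to-one, which the two examples suggest but the package may not guarantee. `params_from_config` is a hypothetical helper, and only a subset of the pipeline parameters is filled in.

```python
def params_from_config(config, input_base_dir, output_base_dir, batch_size=10):
    """Build keyword arguments for ExecutionPipeline from a parsed config."""
    integration = dict(config["integration"])
    # JSON has no tuples; the Python example above uses one for radial_range.
    if "radial_range" in integration:
        integration["radial_range"] = tuple(integration["radial_range"])

    exporting = dict(config["exporting"])
    # Assumption: the explicit output directory overrides the value in the file.
    exporting["output_directory"] = output_base_dir

    return {
        "input_base_dir": input_base_dir,
        "output_base_dir": output_base_dir,
        "batch_size": batch_size,
        "enable_preprocessing": True,
        "preprocessing_config": dict(config["preprocessing"]),
        "enable_integration": True,
        "integration_config": integration,
        "enable_exporting": True,
        "exporting_config": exporting,
    }


# A trimmed-down config, as it would look after json.load().
config = {
    "preprocessing": {"invert_mask": True, "min_intensity": 0.0},
    "integration": {"npt_radial": 500, "radial_range": [3, 13]},
    "exporting": {"output_directory": "path/to/output", "file_format": "fxye"},
}
params = params_from_config(config, "path/to/input", "results")
print(params["integration_config"]["radial_range"])

# With FAIRshake installed, the result would be passed on as in the example above:
# from FAIRshake.execution_pipeline.pipeline import ExecutionPipeline
# ExecutionPipeline(**params).run()
```

Whether `ExecutionPipeline` supplies defaults for the omitted parameters is not stated here, so the remaining keys from the full example may need to be added before running.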
Help and Support
For detailed usage and options, use the help command:
fairshake process --help
fairshake benchmark --help
Contributing
Contributions are welcome. Please fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.
Steps to Contribute
- Fork the repository.
- Create a new branch (git checkout -b feature-branch).
- Make your changes.
- Commit your changes (git commit -m 'Add some feature').
- Push to the branch (git push origin feature-branch).
- Open a pull request.
License
This project is licensed under the BSD 3-Clause License. See the LICENSE.txt file for details.
Contact Information
For support or inquiries:
- Author: Finley Holt
- Email: finley0454@gmail.com
- GitHub: FinleyHolt