Pipeline framework to dump analysis-ready data from LHCb grid-based files
digout is a Python library purpose-built to execute the multi-stage workflow
of converting raw LHCb DIGI files into analysis-ready parquet dataframes
of particles and hits.
To manage this process in a scalable and reproducible manner,
it implements a workflow framework organized around configurable steps
(e.g., digi2root, root2df).
The framework operates on a two-phase execution model:
a stream phase runs once to prepare the dataset from a bookkeeping path,
and a chunk phase processes each input file in parallel.
This parallel execution is managed by swappable schedulers
(such as local for local processing or htcondor for cluster submission).
The entire workflow is defined through YAML configuration files
to ensure complete reproducibility.
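The two-phase model described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not digout's actual API: the names `stream_phase`, `chunk_phase`, `LocalScheduler`, and `run_workflow` are hypothetical, and the placeholder logic only mimics the shape of the workflow (one preparation pass, then one task per input file handed to a swappable scheduler).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def stream_phase(bookkeeping_path: str) -> List[str]:
    """Runs once per workflow: turn a bookkeeping path into a list of input files.

    Hypothetical placeholder: a real implementation would query the
    bookkeeping system instead of fabricating file names.
    """
    return [f"{bookkeeping_path}/file_{i}.digi" for i in range(4)]


def chunk_phase(input_file: str) -> str:
    """Runs once per input file: stand-in for the per-file conversion steps."""
    return input_file.replace(".digi", ".parquet")


class LocalScheduler:
    """Hypothetical scheduler that runs chunk tasks on local worker threads."""

    def __init__(self, max_workers: int = 2) -> None:
        self.max_workers = max_workers

    def run(self, task: Callable[[str], str], inputs: Iterable[str]) -> List[str]:
        # pool.map preserves the order of the inputs in its results.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(task, inputs))


def run_workflow(bookkeeping_path: str, scheduler: LocalScheduler) -> List[str]:
    files = stream_phase(bookkeeping_path)      # stream phase: executed once
    return scheduler.run(chunk_phase, files)    # chunk phase: one task per file
```

Because the scheduler is only asked to expose a `run(task, inputs)` method in this sketch, a cluster-backed variant could be substituted without touching the two phase functions.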
Resources
| Link | Description |
|---|---|
| 📖 Full Documentation | The complete guide to installation, configuration, and concepts. |
| 🚀 Quickstart Guide | The fastest way to get a working example running. |
| 💡 Contributing Guide | Learn how to set up a development environment and contribute to the project. |
| 🐛 Report a Bug | Found an issue? Let us know by creating a bug report. |
| 📜 Changelog | See the latest changes on the release page. |
Core Features
- Automated Metadata Discovery: Automatically queries the LHCb bookkeeping system to retrieve necessary metadata (dddb_tag, conddb_tag, etc.), eliminating manual lookup.
- Scalable Parallel Processing: Built-in support for processing large datasets in parallel on a local machine or on a distributed cluster like HTCondor.
- Configuration-Driven and Reproducible: Define your entire workflow in YAML files. digout saves the final, resolved configuration for every run, ensuring any result can be reproduced.
- Idempotent Execution: Automatically detects and skips steps that have already been completed.
- Extensible Architecture: Easily define new steps or schedulers.
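As an illustration of the idempotent-execution and saved-configuration features listed above, the sketch below skips a step when a completion marker already exists and writes the resolved configuration next to the output. It is an assumption about how such behavior could be built, not a description of digout's internals: `run_step_once`, the `.done` marker file, and the JSON config dump are all hypothetical (JSON is used here only to stay within the standard library, whereas digout's configuration is YAML).

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def run_step_once(
    name: str,
    output_dir: Path,
    config: Dict[str, Any],
    action: Callable[[Path], None],
) -> bool:
    """Run `action` unless the step already completed; return True if it ran."""
    output_dir.mkdir(parents=True, exist_ok=True)
    marker = output_dir / f"{name}.done"  # hypothetical completion marker
    if marker.exists():
        # The step finished in an earlier run, so it is skipped.
        return False

    action(output_dir)

    # Persist the resolved configuration alongside the output for reproducibility.
    config_path = output_dir / f"{name}.config.json"
    config_path.write_text(json.dumps(config, sort_keys=True, indent=2))

    # The marker is written last, so an interrupted step is retried next time.
    marker.touch()
    return True
```

Writing the marker only after the action and the configuration dump succeed means a crash midway leaves no marker behind, so the step is simply repeated on the next run.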
Main Workflows
- DIGI to DataFrame Conversion: Produce analysis-ready parquet dataframes from LHCb DIGI files. The available output dataframes are detailed on the DataFrames Page.
- DIGI to MDF Conversion: Convert LHCb DIGI files into the .mdf format required as input for the Allen framework.
File details
Details for the file digout-0.1.3-py3-none-any.whl.
File metadata
- Download URL: digout-0.1.3-py3-none-any.whl
- Upload date:
- Size: 154.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0244767ea64a51bce6b763f34cf30eb94b05374abc751516dca574e01eb26207 |
| MD5 | 7d25552729a1ceb922ddb53292e9fe49 |
| BLAKE2b-256 | c56d1a8ccdcaf0c8541a8156a32a31e4fa3d9fdea78673ccdc086bb114b5a66d |
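A downloaded wheel can be checked against the SHA256 digest listed above using only the Python standard library. The sketch below is one way to do it; the helper names `sha256_of` and `matches_expected` are illustrative, and the local path to the wheel is whatever location you downloaded it to.

```python
import hashlib
from pathlib import Path
from typing import Union

# SHA256 digest listed above for digout-0.1.3-py3-none-any.whl.
EXPECTED_SHA256 = "0244767ea64a51bce6b763f34cf30eb94b05374abc751516dca574e01eb26207"


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path: Union[str, Path], expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("digout-0.1.3-py3-none-any.whl")` should return `True` only if the local file is byte-for-byte identical to the published wheel.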