Pipeline framework to dump analysis-ready data from LHCb grid-based files

Project description

digout is a Python library purpose-built to execute the multi-stage workflow of converting raw LHCb DIGI files into analysis-ready parquet dataframes of particles and hits.

To manage this process in a scalable and reproducible manner, digout implements a workflow framework organized around configurable steps (e.g., digi2root, root2df). The framework uses a two-phase execution model: a stream phase runs once to prepare the dataset from a bookkeeping path, and a chunk phase then processes each input file in parallel. Parallel execution is handled by swappable schedulers (such as local for processing on a single machine, or htcondor for cluster submission). The entire workflow is defined in YAML configuration files so that every run can be reproduced.
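
The two-phase model can be pictured with a small, self-contained sketch. This is not digout's API: every name below (stream_phase, chunk_phase, local_scheduler, run_workflow) is hypothetical, the file list is faked, and a thread pool stands in for a "local" scheduler. It only illustrates the shape of "prepare once, then process each file in parallel".

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def stream_phase(bookkeeping_path: str, n_files: int = 4) -> List[str]:
    """Runs once: turn a bookkeeping path into a list of input files.

    Hypothetical stand-in; a real implementation would query the bookkeeping system.
    """
    return [f"{bookkeeping_path}/file_{i}.digi" for i in range(n_files)]


def chunk_phase(input_file: str) -> str:
    """Runs once per input file: pretend to apply the conversion steps."""
    return input_file.replace(".digi", ".parquet")


def local_scheduler(
    task: Callable[[str], str], inputs: Iterable[str], workers: int = 2
) -> List[str]:
    """Hypothetical 'local' scheduler: run the task over all inputs in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(task, inputs))


def run_workflow(bookkeeping_path: str, scheduler=local_scheduler) -> List[str]:
    """Stream phase once, then hand the per-file chunk phase to a scheduler."""
    files = stream_phase(bookkeeping_path)
    return scheduler(chunk_phase, files)


if __name__ == "__main__":
    for output in run_workflow("/bk/path"):
        print(output)
```

Because the scheduler is passed in as an argument, a cluster-backed scheduler could replace the thread pool without touching the stream or chunk logic, which is the point of keeping schedulers swappable.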

Resources

  • 📖 Full Documentation: The complete guide to installation, configuration, and concepts.
  • 🚀 Quickstart Guide: The fastest way to get a working example running.
  • 💡 Contributing Guide: Learn how to set up a development environment and contribute to the project.
  • 🐛 Report a Bug: Found an issue? Let us know by creating a bug report.
  • 📜 Changelog: See the latest changes on the release page.

Core Features

  • Automated Metadata Discovery: Automatically queries the LHCb bookkeeping system to retrieve necessary metadata (dddb_tag, conddb_tag, etc.), eliminating manual lookup.
  • Scalable Parallel Processing: Built-in support for processing large datasets in parallel, either on a local machine or on a cluster through HTCondor.
  • Configuration-Driven and Reproducible: Define your entire workflow in YAML files. digout saves the final, resolved configuration for every run, ensuring any result can be reproduced.
  • Idempotent Execution: Automatically detects and skips steps that have already been completed.
  • Extensible Architecture: Easily define new steps or schedulers.
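
As an illustration of the idempotent-execution idea, the sketch below skips a step when its output already exists. It is an assumption about how such a check could work, not a description of digout's internals; run_step_once and the ".tmp" naming are hypothetical.

```python
from pathlib import Path
from typing import Callable


def run_step_once(output: Path, action: Callable[[Path], None]) -> bool:
    """Run `action` only if `output` does not exist yet.

    Returns True if the step ran and False if it was skipped.
    The action writes to a temporary path that is renamed at the end,
    so an interrupted run does not leave a file that looks complete.
    """
    if output.exists():
        return False  # already done: skip
    tmp = output.with_name(output.name + ".tmp")
    action(tmp)
    tmp.replace(output)  # rename within the same directory
    return True
```

Calling run_step_once twice with the same output path runs the action the first time and skips it the second, so re-launching a partially completed workflow only redoes the missing steps.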

Main Workflows

  • DIGI to DataFrame Conversion: Produce analysis-ready parquet dataframes from LHCb DIGI files. The available output dataframes are detailed on the DataFrames Page.

  • DIGI to MDF Conversion: Convert LHCb DIGI files into the .mdf format required as input for the Allen framework.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

digout-0.1.3-py3-none-any.whl (154.8 kB view details)

Uploaded Python 3

File details

Details for the file digout-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: digout-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 154.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.5

File hashes

Hashes for digout-0.1.3-py3-none-any.whl
Algorithm Hash digest
SHA256 0244767ea64a51bce6b763f34cf30eb94b05374abc751516dca574e01eb26207
MD5 7d25552729a1ceb922ddb53292e9fe49
BLAKE2b-256 c56d1a8ccdcaf0c8541a8156a32a31e4fa3d9fdea78673ccdc086bb114b5a66d
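
A downloaded wheel can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper names sha256_of and verify_wheel are illustrative, and the path to the wheel is whatever location the file was saved to.

```python
import hashlib

# SHA256 digest listed above for digout-0.1.3-py3-none-any.whl
EXPECTED_SHA256 = "0244767ea64a51bce6b763f34cf30eb94b05374abc751516dca574e01eb26207"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_wheel(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file at `path` has the expected SHA256 digest."""
    return sha256_of(path) == expected.lower()
```

For example, verify_wheel("digout-0.1.3-py3-none-any.whl") returns True only if the local file matches the published digest.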
