
columnflow


Backend for columnar, fully orchestrated HEP analyses with pure Python, law and order.

Original source hosted at GitHub.

Note on current development

This project is currently in a beta phase. The project setup, suggested workflows, definitions of particular tasks, and the signatures of various helper classes and functions are mostly frozen but may still change in the near future. At this point (July 2024), various large-scale analyses based on columnflow are being developed and, in the process, help to test and verify various aspects of its core. The first major release with a largely frozen API is expected in the fall of 2024. If you would like to join early on, contribute, or just give it a spin, feel free to get in touch!


Quickstart

To create an analysis using columnflow, it is recommended to start from a predefined template (located in analysis_templates). The following command (no previous git clone required) interactively asks for a handful of names and settings, and creates a minimal, yet fully functioning project structure for you!

bash -c "$(curl -Ls https://raw.githubusercontent.com/columnflow/columnflow/master/create_analysis.sh)"
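If you prefer to read the setup script before executing it, the one-liner above can be split into a download step and a run step. The following is a minimal sketch, not part of columnflow itself: the helper name `fetch_create_script` and the local file name are assumptions made for illustration, while the URL is the one used in the command above.

```shell
#!/usr/bin/env bash
# URL taken from the one-line setup command above.
CF_CREATE_URL="https://raw.githubusercontent.com/columnflow/columnflow/master/create_analysis.sh"

# Hypothetical helper: download the script to a local file instead of
# piping it straight into bash. The default file name is an arbitrary choice.
fetch_create_script() {
    local target="${1:-create_analysis.sh}"
    # -f: fail on HTTP errors, -L: follow redirects, -s: silent, -o: output file
    curl -fLs "${CF_CREATE_URL}" -o "${target}"
}

# Typical use (uncomment to run):
# fetch_create_script create_analysis.sh
# less create_analysis.sh    # review what will be executed
# bash create_analysis.sh    # start the interactive setup
```

The interactive questions and the resulting project structure should be the same either way; the only difference is that the script sits on disk for inspection first.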

At the end of the setup, you will see further instructions and suggestions to run your first analysis tasks (example below).

Setup successfull! The next steps are:

1. Setup the repository and install the environment.
   > cd
   > source setup.sh [recommended_yet_optional_setup_name]

2. Run local tests & linting checks to verify that the analysis is setup correctly.
   > ./tests/run_all

3. Create a GRID proxy if you intend to run tasks that need one
   > voms-proxy-init -rfc -valid 196:00

4. Checkout the 'Getting started' guide to run your first tasks.
   https://columnflow.readthedocs.io/en/stable/start.html

   Suggestions for tasks to run:

   a) Run the 'calibration -> selection -> reduction' pipeline for the first file of the
      default dataset using the default calibrator and default selector
      (enter the command below and 'tab-tab' to see all arguments or add --help for help)
      > law run cf.ReduceEvents --version dev1 --branch 0

      Verify what you just run by adding '--print-status -1' (-1 = fully recursive)
      > law run cf.ReduceEvents --version dev1 --branch 0 --print-status -1

   b) Create the jet1_pt distribution for the single top datasets
      (if you have an image/pdf viewer installed, add it via '--view-cmd <binary>')
      > law run cf.PlotVariables1D --version dev1 --datasets 'st*' --variables jet1_pt

      Again, verify what you just ran, now with recursion depth 4
      > law run cf.PlotVariables1D --version dev1 --datasets 'st*' --variables jet1_pt --print-status 4

   c) Include the ttbar dataset and also plot jet1_eta
      > law run cf.PlotVariables1D --version dev1 --datasets 'tt*,st*' --variables jet1_pt,jet1_eta
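The plotting suggestions in the output above all share one shape: a task name, a `--version`, a dataset pattern, and a comma-separated list of variables. As a rough sketch of that pattern, the hypothetical helper `plot_cmd` below (not part of columnflow or law) only prints the corresponding command so that it can be checked before being run.

```shell
# Hypothetical helper: compose (but do not run) a cf.PlotVariables1D command
# from a version, a dataset pattern, and a comma-separated variable list.
plot_cmd() {
    local version="$1" datasets="$2" variables="$3"
    printf "law run cf.PlotVariables1D --version %s --datasets '%s' --variables %s\n" \
        "${version}" "${datasets}" "${variables}"
}

# Reproduces suggestions b) and c) from the setup output.
plot_cmd dev1 'st*' jet1_pt
plot_cmd dev1 'tt*,st*' jet1_pt,jet1_eta
```

The dataset pattern stays in single quotes so that the shell passes `st*` on unchanged instead of expanding it against file names in the current directory.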

For a better overview of the tasks that are triggered by the commands above, check out the current (yet stylized) task graph.

Projects using columnflow

  • hh2bbtautau: HH → bb𝜏𝜏 analysis with CMS.
  • hh2bbww: HH → bbWW analysis with CMS.
  • topmass: Top quark mass measurement with CMS.
  • mttbar: Search for heavy resonances in ttbar events with CMS.
  • analysis playground: A testing playground for HEP analyses.

Contributors

  • Marcel Rieger: 💻 👀 📖 ⚠️
  • Mathis Frahm: 💻 👀
  • Daniel Savoiu: 💻 👀
  • pkausw: 💻 👀
  • nprouvost: 💻 ⚠️
  • Bogdan-Wiederspan: 💻 ⚠️
  • Tobias Kramer: 💻 👀
  • Matthias Schroeder: 💻
  • Johannes Lange: 💻
  • BalduinLetzer: 💻
  • JanekMoels: 🤔
  • haddadanas: 💻
  • jomatthi: 💻

This project follows the all-contributors specification.

Development
