Bamboo: A high-level HEP analysis library for ROOT::RDataFrame

The ROOT::RDataFrame class provides an efficient and flexible way to process per-event information (stored in a TTree) and, for example, aggregate it into histograms.

Object arrays are typically stored as a structure of arrays (variable-sized branches that share a name prefix and a length). With this layout, the expressions needed for a complete analysis quickly become cumbersome to write, with indices to match, repeated sub-expressions, and so on.

As an example, imagine the expression needed to calculate the invariant mass of the two leading muons from a CMS NanoAOD file (which stores 4-momenta with pt, eta and phi branches): one way is to construct LorentzVector objects, sum them, and evaluate the invariant mass of the sum. Next, imagine doing the same thing with the two highest-pt jets that have a b-tag and are not within some cone of the two leptons you already selected in another way, while keeping the code maintainable enough to allow for passing jet momenta with a systematic variation applied.
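To make the bookkeeping concrete, here is a minimal pure-Python sketch of the first case (the dimuon invariant mass) written by hand against a structure-of-arrays event. It does not use ROOT or bamboo. The event dictionary, its values, the use of a `Muon_mass` branch, and the assumption that muons are sorted by decreasing pt are all illustrative assumptions, not taken from a real file.

```python
import math

# Hypothetical event in a structure-of-arrays layout: one list per quantity,
# all with the same length, where index i always refers to the same muon.
# Branch names follow the NanoAOD naming pattern; the values are made up.
event = {
    "nMuon": 2,
    "Muon_pt": [45.0, 30.0],
    "Muon_eta": [0.0, 0.0],
    "Muon_phi": [0.0, math.pi],
    "Muon_mass": [0.105658, 0.105658],  # assumed branch, in GeV
}


def four_vector(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to Cartesian (px, py, pz, E)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (px, py, pz, energy)


def invariant_mass(*vectors):
    """Sum the four-vectors component-wise and return the mass of the sum."""
    px, py, pz, energy = (sum(component) for component in zip(*vectors))
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def leading_dimuon_mass(ev):
    """Invariant mass of the first two muons, or None if there are fewer.

    Assumes the muon arrays are sorted by decreasing pt.
    """
    if ev["nMuon"] < 2:
        return None
    muons = [
        four_vector(ev["Muon_pt"][i], ev["Muon_eta"][i],
                    ev["Muon_phi"][i], ev["Muon_mass"][i])
        for i in range(2)
    ]
    return invariant_mass(*muons)


print(leading_dimuon_mass(event))
```

Even in this simple case every branch has to be indexed consistently by hand. The b-tagged jet variant adds a selection, a cone-based cleaning against the leptons, and a sort on top of the same pattern.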

Bamboo attempts to solve this problem by automatically constructing lightweight Python wrappers based on the structure of the TTree. These wrappers make it possible to construct such expressions with high-level code, similar to the language that is commonly used to discuss and describe them. Because an object representation of the expression is constructed, a few powerful operations are enough to compose complex expressions. This also makes it possible to automate the construction of derived expressions, e.g. for shape systematic variation histograms.
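The general idea of building an object representation through lightweight wrappers can be sketched with operator overloading. The example below is not bamboo's implementation or API: `Expr`, `CollectionItem` and `with_variation` are hypothetical names invented for this illustration, and the varied branch name `Muon_pt_up` is made up as well.

```python
class Expr:
    """Hypothetical expression node that carries a C++-like expression string."""

    def __init__(self, code):
        self.code = code

    def _binary(self, op, other):
        # Plain Python numbers are embedded via repr(); Expr operands by code.
        rhs = other.code if isinstance(other, Expr) else repr(other)
        return Expr(f"({self.code} {op} {rhs})")

    def __add__(self, other):
        return self._binary("+", other)

    def __mul__(self, other):
        return self._binary("*", other)

    def __gt__(self, other):
        return self._binary(">", other)


class CollectionItem:
    """Hypothetical proxy for element `index` of branches sharing a prefix."""

    def __init__(self, prefix, index):
        self._prefix = prefix
        self._index = index

    def __getattr__(self, name):
        # Only called for attributes not found normally, e.g. `.pt`.
        if name.startswith("_"):
            raise AttributeError(name)
        return Expr(f"{self._prefix}_{name}[{self._index}]")


def with_variation(expr, branch, varied_branch):
    """Derive a new expression with one branch swapped for a varied copy.

    Plain string replacement is a crude stand-in for walking a real
    expression tree; it is used here only to keep the sketch short.
    """
    return Expr(expr.code.replace(branch, varied_branch))


mu0 = CollectionItem("Muon", 0)
mu1 = CollectionItem("Muon", 1)

cut = (mu0.pt + mu1.pt) > 50
print(cut.code)  # ((Muon_pt[0] + Muon_pt[1]) > 50)

varied = with_variation(cut, "Muon_pt", "Muon_pt_up")
print(varied.code)  # ((Muon_pt_up[0] + Muon_pt_up[1]) > 50)
```

Because the Python code only builds a description of the computation, the same description can be inspected, rewritten for a systematic variation, or handed to a backend for evaluation.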

Building selections, plots etc. with such expressions is analysis-specific, but the mechanics of loading data samples, processing them locally or on a batch system, combining the outputs for different samples in an overview etc. are very similar over a broad range of use cases. Therefore a common implementation of these is provided, such that the analyst only needs to provide a subclass with their selection and plot definitions, and a configuration file with a list of samples and instructions on how to display them.
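The division of labour described above can be sketched as a base class that owns the sample loop and a user subclass that only declares plots. This is a hypothetical illustration of the pattern, not bamboo's actual classes or configuration format: `AnalysisSketch`, `define_plots`, the `config` layout and the file names are all invented for this example.

```python
class AnalysisSketch:
    """Hypothetical base class holding the analysis-independent mechanics."""

    def __init__(self, config):
        self.config = config

    def define_plots(self, sample_name):
        """Analysis-specific part, to be provided by the subclass."""
        raise NotImplementedError

    def run(self):
        # Stand-in for loading files, running the event loop and collecting
        # outputs: here it only records what would be processed per sample.
        results = {}
        for name, sample in self.config["samples"].items():
            results[name] = {
                "files": list(sample["files"]),
                "plots": self.define_plots(name),
            }
        return results


class DimuonAnalysis(AnalysisSketch):
    """What the analyst writes: only the plot definitions."""

    def define_plots(self, sample_name):
        return ["dimuon_mass", "leading_muon_pt"]


# Stand-in for the configuration file with the list of samples.
config = {
    "samples": {
        "data": {"files": ["data.root"]},
        "ttbar": {"files": ["ttbar.root"]},
    }
}

results = DimuonAnalysis(config).run()
for name, info in results.items():
    print(name, info["files"], info["plots"])
```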

Documentation

The HTML documentation (with a longer introduction, installation instructions, recipes for common tasks and an API reference of the classes and methods) is available here.

Development

Bamboo has been in development since early 2019, and is actively used by several analyses. With the experience from daily use and the addition of new features in the underlying ROOT::RDataFrame package, ideas for improvements and further development continue to come up. Please have a look at the guidelines if you would like to start contributing.
