AlphaTwirl + uproot for the Z inv. width analysis

Z invisible analysis

This code processes CMS event-based data and simulation stored in a flat ROOT TTree format, i.e. branches hold simple data types such as bool, int and float, or a std::vector of these types. Typically the input is nanoAOD. The output is one or more dataframes of similar data types (excluding vectors), with columns either taken directly from the nanoAOD files or derived from those variables, to form an analysis-level dataframe.

This is achieved by reading nanoAOD files with uproot, applying a sequence of modules to generate derived variables, and storing these in a dataframe saved to disk. YAML config files define the input data, the modules and the output.

Usage

Install with pip:

pip install zinv-analysis

or in editable mode to alter the code:

git clone git@github.com:shane-breeze/zinv-analysis.git
cd zinv-analysis
pip install -e .

Either run with the CLI:

zinv_analysis.py --help

or the Python API:

import zinv
help(zinv.modules.analyse)

Layout

Interfaces

Interfaces to the underlying code are located in analyse.py and resume.py.

Scripts using these functions are found in zinv/scripts/.

Modules

A set of modules that create derived variables is found in zinv/modules/readers. These modules are applied to the data with the alphatwirl package (https://github.com/alphatwirl/alphatwirl). Each module contains a class that may define begin, event and end methods.

The begin method runs at the start of processing to initialise any required parameters. The EventTools module adds a register_function method to the event, which allows functions to be cached for lazy evaluation (e.g. the JEC variations function is not run if the JEC variations are not saved in the output).
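
The lazy evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the Event class, the evaluate method and the signature given to register_function here are assumptions made for the example, as is the jet_pt_scaled function.

```python
class Event:
    """Hypothetical stand-in for the event object passed between modules."""

    def __init__(self):
        self._functions = {}
        self._cache = {}

    def register_function(self, name, function):
        # Assumed signature: store the function without calling it.
        self._functions[name] = function

    def evaluate(self, name):
        # Run the function on first request, then reuse the cached result.
        if name not in self._cache:
            self._cache[name] = self._functions[name](self)
        return self._cache[name]


calls = []  # records each time the function body actually runs


def jet_pt_scaled(event):
    # Hypothetical derived variable: scale every jet pT by 2%.
    calls.append("jet_pt_scaled")
    return [pt * 1.02 for pt in event.jet_pt]


event = Event()
event.jet_pt = [50.0, 30.0]
event.register_function("jet_pt_scaled", jet_pt_scaled)
```

Registering the function costs nothing. The body runs only when something, such as an output module, asks for the value, and repeated requests reuse the cached result.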

The event method is applied on each iteration over the input data. Each iteration corresponds to a chunk of events loaded into numpy arrays with uproot. This is where derived variables would be evaluated; however, because of the lazy evaluation, this method is typically empty for most modules.

The end method is applied at the end of processing to clean up anything that needs clearing. When running in multiprocessing or batch processing mode, modules are serialised. Lambda functions are not serialisable, and hence must be created in the begin method and cleared in the end method.
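
The serialisation constraint can be demonstrated with the standard pickle module. The sketch below is illustrative only: the ScaleJets class, its attributes and the can_pickle helper are hypothetical, and a plain dict stands in for the chunk of events.

```python
import pickle


def can_pickle(obj):
    """Return True if the object's attributes survive pickle.dumps."""
    try:
        pickle.dumps(vars(obj))
        return True
    except Exception:
        return False


class ScaleJets:
    """Hypothetical module; names and attributes are illustrative only."""

    def __init__(self, scale):
        self.scale = scale     # plain data, safe to serialise
        self.function = None   # filled in by begin()

    def begin(self, event):
        # Create the lambda only once processing has started.
        scale = self.scale
        self.function = lambda pts: [pt * scale for pt in pts]

    def event(self, event):
        event["jet_pt_scaled"] = self.function(event["jet_pt"])

    def end(self):
        # Drop the lambda so the module can be serialised again.
        self.function = None


module = ScaleJets(scale=2.0)
before_begin = can_pickle(module)   # no lambda attached yet

module.begin(event=None)
after_begin = can_pickle(module)    # the lambda attribute blocks pickling

chunk = {"jet_pt": [50.0, 30.0]}
module.event(chunk)
module.end()
after_end = can_pickle(module)      # lambda cleared
```

Holding only plain data between end and the next begin is what lets a module be shipped to another process or batch job.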

Output

A special module defines the output; currently this is HDF5.py. Instead of creating derived variables, this module evaluates the previously defined functions and stores the results in a .h5 file using pandas. The actual output is defined by the YAML config.
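
As a rough sketch of this step, the snippet below evaluates only the requested functions for a chunk and collects them into a pandas DataFrame, with a helper that appends the frame to an HDF5 store. The function names, the "events" key and the dict-based chunk are assumptions for illustration and are not taken from HDF5.py. Writing with to_hdf needs the optional PyTables dependency.

```python
import pandas as pd


def build_frame(functions, chunk, columns):
    """Evaluate only the requested functions and collect them as columns."""
    return pd.DataFrame({name: functions[name](chunk) for name in columns})


def append_to_store(frame, path, key="events"):
    # Append rows to a table in an .h5 file (requires PyTables).
    frame.to_hdf(path, key=key, format="table", append=True)


# Hypothetical registered functions; only "met" is requested below,
# so "met_scaled" is never evaluated.
functions = {
    "met": lambda chunk: chunk["met"],
    "met_scaled": lambda chunk: [value * 1.1 for value in chunk["met"]],
}
chunk = {"met": [210.0, 305.0]}
frame = build_frame(functions, chunk, columns=["met"])
```

Calling append_to_store(frame, "output.h5") once per chunk would accumulate all chunks in a single table.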

Config

The YAML config is written by the user, outside the package, and controls where the datasets are found, which modules are applied and what is written to the dataframes. This flexibility requires extra care: modules that depend on each other must all be included, and in the correct order. For example, if the JEC variations are saved by the HDF5 module, then the JECVariation module must be included in the sequence before the output module.
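
An ordering mistake of this kind can be caught before processing starts. The sketch below is one possible check, not part of the package: it assumes the module sequence has already been read from the config into a plain list of names, and the requires mapping is a hypothetical dependency table written by hand.

```python
def check_sequence(sequence, requires):
    """Return (module, dependency) pairs where the dependency is not
    listed earlier in the sequence."""
    seen = set()
    problems = []
    for name in sequence:
        for dependency in requires.get(name, ()):
            if dependency not in seen:
                problems.append((name, dependency))
        seen.add(name)
    return problems


# Hypothetical dependency table: the output module needs JECVariation first.
requires = {"HDF5": ["JECVariation"]}

good = ["EventTools", "JECVariation", "HDF5"]
bad = ["EventTools", "HDF5", "JECVariation"]

good_problems = check_sequence(good, requires)
bad_problems = check_sequence(bad, requires)
```

An empty result means every dependency appears before the module that needs it. Here the bad sequence is flagged because HDF5 comes before JECVariation.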


