Interface ServiceX into TRExFitter to provide an alternative method of reading input ntuples

ServiceX for TRExFitter

For GitHub and PyPI release v1.1.0

Overview

ServiceX, a component of the IRIS-HEP DOMA group's iDDS, is an experiment-agnostic service to enable on-demand data delivery from data lakes in different data formats, including Apache Arrow and ROOT ntuple.

TRExFitter is a popular framework for performing profile likelihood fits in the ATLAS experiment. It takes ROOT histograms or ntuples as input. When the ntuples are large and/or remote, the long turnaround time of reading them slows down the whole analysis.

servicex-for-trexfitter is a Python package that integrates ServiceX into the TRExFitter framework. It provides an alternative workflow if you use ROOT ntuples as inputs. The package analyzes your TRExFitter configuration file and delivers only the branches and entries needed to produce the histograms defined in that file.

The default workflow requires you to download all ROOT ntuples from the grid to your local machine or CERN EOS area. The workflow using servicex-for-trexfitter, on the other hand, delivers only a subset of the ROOT ntuples. In practice, it replaces the step of downloading ROOT ntuples from the grid using Rucio.

Workflow comparison

The main advantages are:

  • Disk space: No need to store all ROOT ntuples locally. The size reduction of the ROOT ntuples delivered by servicex-for-trexfitter varies according to the settings in your TRExFitter configuration file.
  • Faster turnaround: Downloading ROOT ntuples can be faster with servicex-for-trexfitter because it parallelizes your job down to each TTree of each file and delivers smaller ROOT ntuples over WAN. The processing time of TRExFitter option n is also shorter in the workflow using servicex-for-trexfitter because it runs over smaller ROOT ntuples.
  • Simplicity: A single TRExFitter configuration file is used both to get ROOT ntuples from the grid using servicex-for-trexfitter and to run all other TRExFitter steps. No separate script is needed to download ROOT ntuples from the grid.

Prerequisites

  • Python 3.6, 3.7, or 3.8
  • Access to an Uproot ServiceX endpoint. More information about ServiceX can be found in the ServiceX documentation.

Usage

Installation

The library is published on PyPI: servicex-for-trexfitter

Prepare TRExFitter configuration

The following settings are needed for the workflow using servicex-for-trexfitter:

Job block settings

  • NtuplePaths: <PATH>
    • The path where input ROOT files are stored.
    • Write permission is required because ServiceX delivers ROOT ntuples to the subdirectory servicex of this path.
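
Because delivery fails late if the NtuplePaths directory is missing or read-only, it can help to check it before starting. The following is a minimal sketch, not part of servicex-for-trexfitter: the helper name servicex_output_dir is hypothetical, and it only assumes what is stated above, namely that ntuples land in the servicex subdirectory of NtuplePaths.

```python
import os
from pathlib import Path


def servicex_output_dir(ntuple_paths):
    """Return the 'servicex' subdirectory of NtuplePaths.

    Hypothetical helper: raises if the base path is not an existing
    directory or if the current user cannot write to it.
    """
    base = Path(ntuple_paths)
    if not base.is_dir():
        raise NotADirectoryError(f"NtuplePaths is not an existing directory: {base}")
    # Creating the 'servicex' subdirectory needs write and search permission.
    if not os.access(base, os.W_OK | os.X_OK):
        raise PermissionError(f"No write permission on NtuplePaths: {base}")
    return base / "servicex"
```

Calling servicex_output_dir with the value of NtuplePaths from your configuration returns the directory where the delivered files are expected to appear.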

Sample block settings

  • GridDID: <Rucio DID>
    • Add the GridDID option to each Sample that uses ServiceX for delivery.
    • GridDID needs both the scope and the name, e.g., user.kchoi:user.kchoi.WZana_WZ.
    • A Sample can have multiple DIDs, e.g., user.kchoi:user.kchoi.WZana_WZ_mc16a, user.kchoi:user.kchoi.WZana_WZ_mc16d, user.kchoi:user.kchoi.WZana_WZ_mc16e
    • A Sample without the GridDID option is treated as a typical Sample, which reads ntuple files from a local path.
  • NtupleFile: servicex/<SAMPLE NAME>
    • servicex-for-trexfitter delivers one ROOT file per Sample, named after the Sample.
    • This option is required only for Samples with the GridDID option. Other Samples can use any option valid for option NTUP.
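
The GridDID and NtupleFile conventions above can be checked with a few lines of Python before submitting anything. This is an illustrative sketch only: parse_grid_did and expected_ntuple_file are hypothetical helpers, not functions of servicex-for-trexfitter, and the comma-separated form is taken from the multi-DID example above.

```python
def parse_grid_did(value):
    """Split a GridDID value into a list of (scope, name) pairs.

    Assumes multiple DIDs are separated by commas, as in the example above.
    """
    dids = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        scope, sep, name = item.partition(":")
        if not sep or not scope or not name:
            raise ValueError(f"GridDID must be given as scope:name, got {item!r}")
        dids.append((scope, name))
    return dids


def expected_ntuple_file(sample_name):
    """Return the NtupleFile value expected for a Sample that uses GridDID."""
    return f"servicex/{sample_name}"


print(parse_grid_did("user.kchoi:user.kchoi.WZana_WZ"))
print(expected_ntuple_file("WZ"))
```

A DID given without its scope, such as user.kchoi.WZana_WZ, raises ValueError in this sketch, which mirrors the requirement that both scope and name are present.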

Here is a side-by-side comparison of example configuration files:

servicex-for-trexfitter Default

Caveat

  • Only scalar-type TBranches are supported.
  • Most standard TCut expressions are supported for Selection, but special functions like Sum$(formula) are not supported yet. Please find more about the supported TCut expressions here.
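
A quick way to spot Selection strings that may hit the second caveat is to look for ROOT-style special functions, which end in a dollar sign. The sketch below is a heuristic written for illustration, not the package's own parser; the function name and the example branch names (jet_pt, lep_pt, lep_eta) are assumptions, and the authoritative list of supported expressions is the documentation referred to above.

```python
import re

# ROOT special functions such as Sum$ are written as a word followed by '$'.
_SPECIAL_FUNCTION = re.compile(r"\b([A-Za-z]+)\$")


def find_special_functions(selection):
    """Return the sorted, de-duplicated special-function names found in a Selection string."""
    return sorted({name + "$" for name in _SPECIAL_FUNCTION.findall(selection)})


print(find_special_functions("Sum$(jet_pt > 25) >= 2 && lep_pt > 20"))
print(find_special_functions("lep_pt > 20 && abs(lep_eta) < 2.5"))
```

An empty result does not guarantee that an expression is supported; it only means no dollar-suffixed function was found.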

Deliver ROOT ntuples

from servicex_for_trexfitter import ServiceXTRExFitter
sx_trex = ServiceXTRExFitter("<TRExFitter configuration file>")
sx_trex.get_ntuples()

Once you have imported the package, create an instance by passing your TRExFitter configuration file as the argument. You can then request delivery of the ROOT ntuples. This initiates one or more ServiceX transformations based on your TRExFitter configuration and delivers the ROOT ntuples to the servicex subdirectory of the path you specified in NtuplePaths.
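
The delivery call above can be wrapped in a small function that fails early when the configuration file is missing. This is a sketch under stated assumptions: deliver_ntuples is a hypothetical wrapper, it uses only the ServiceXTRExFitter constructor and get_ntuples() shown above, and it makes no assumption about what get_ntuples() returns.

```python
from pathlib import Path


def deliver_ntuples(config_path):
    """Check that the TRExFitter configuration file exists, then request delivery."""
    config = Path(config_path)
    if not config.is_file():
        raise FileNotFoundError(f"TRExFitter configuration file not found: {config}")

    # Imported here so the path check runs before the package is loaded.
    from servicex_for_trexfitter import ServiceXTRExFitter

    sx_trex = ServiceXTRExFitter(str(config))
    # Pass through whatever get_ntuples() returns; its return value is not documented here.
    return sx_trex.get_ntuples()
```

With a valid configuration file and ServiceX endpoint, deliver_ntuples("<TRExFitter configuration file>") behaves like the three lines above.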

Local data cache

ServiceX caches your queries and data in a local temporary directory. As a result, when you change the TRExFitter configuration file, servicex-for-trexfitter creates data delivery requests only for the updated parts.
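
The idea of re-requesting only what changed can be pictured as a cache keyed by the content of each request. The following is a conceptual toy, not how ServiceX actually implements its cache: the request fields (did, columns, selection), the function names, and the branch names are all assumptions chosen for illustration.

```python
import hashlib
import json


def request_key(did, columns, selection):
    """Build a stable key from the content of a request (toy model)."""
    payload = json.dumps(
        {"did": did, "columns": sorted(columns), "selection": selection},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def requests_to_submit(requests, cached_keys):
    """Keep only the requests whose key is not already in the cache."""
    return [r for r in requests if request_key(**r) not in cached_keys]


old = {"did": "user.kchoi:user.kchoi.WZana_WZ", "columns": ["lep_pt", "jet_pt"], "selection": "lep_pt > 20"}
new = dict(old, selection="lep_pt > 25")  # same Sample with an edited Selection
cached = {request_key(**old)}
print(len(requests_to_submit([old, new], cached)))
```

In this toy model, editing the Selection changes the key, so only the edited request is submitted again while the unchanged one is served from the cache.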

Compatible TRExFitter framework

To run the subsequent steps of TRExFitter with the ROOT ntuples that servicex-for-trexfitter delivered, you need to check out the branch feat/servicex-integration of the TRExFitter framework. Otherwise, TRExFitter will complain about unknown options. The feature branch will be merged into master in the near future.

Acknowledgements

Support for this work was provided by the U.S. Department of Energy, Office of High Energy Physics under Grant No. DE-SC0007890.
