Interface ServiceX into TRExFitter to provide an alternative method of reading input ntuples
ServiceX for TRExFitter
For GitHub and PyPI release v1.1.0
Overview
ServiceX, a component of the IRIS-HEP DOMA group's iDDS, is an experiment-agnostic service to enable on-demand data delivery from data lakes in different data formats, including Apache Arrow and ROOT ntuple.
TRExFitter is a popular framework for performing profile likelihood fits in the ATLAS experiment. It takes ROOT histograms or ntuples as input. Long turnaround times for reading large and/or remote ntuples can slow down the whole analysis.
servicex-for-trexfitter is a Python package to integrate ServiceX into the TRExFitter framework.
It provides an alternative workflow if you use ROOT ntuples as inputs.
The package analyzes your TRExFitter configuration file and delivers only the branches and entries needed to produce all histograms defined in that file.
The default workflow requires downloading all ROOT ntuples from the grid to your local machine or CERN EOS area.
The workflow using servicex-for-trexfitter, on the other hand, delivers only a subset of the ROOT ntuples.
In practice, it replaces the step of downloading ROOT ntuples from the grid using Rucio.
The main advantages are:
- Disk space: No need to store all ROOT ntuples locally. The reduction in the size of the ROOT ntuples delivered by servicex-for-trexfitter varies according to the settings in your TRExFitter configuration file.
- Faster turnaround: Downloading ROOT ntuples can be faster with servicex-for-trexfitter, as it parallelizes your job up to each TTree of each file and delivers smaller ROOT ntuples over WAN. The processing time of TRExFitter option n is also shorter in the workflow using servicex-for-trexfitter, as it runs over smaller ROOT ntuples.
- Simplicity: A single TRExFitter configuration file covers both getting ROOT ntuples from the grid using servicex-for-trexfitter and all other TRExFitter steps. No separate script is needed to download ROOT ntuples from the grid.
Prerequisites
- Python 3.6, 3.7, or 3.8
- Access to an Uproot ServiceX endpoint. More information about ServiceX can be found in the ServiceX documentation.
Usage
Installation
The library is published on PyPI: servicex-for-trexfitter
Prepare TRExFitter configuration
The following settings are needed for the workflow using servicex-for-trexfitter:
Job block settings
- NtuplePaths: <PATH>
  - The path where input ROOT files are stored.
  - Write permission is required as ServiceX delivers ROOT ntuples to the subdirectory servicex of this path.
Sample block settings
- GridDID: <Rucio DID>
  - Add option GridDID for the Sample using ServiceX for delivery.
  - Both scope and name are required for GridDID, e.g., user.kchoi:user.kchoi.WZana_WZ.
  - A Sample can have multiple DIDs, e.g., user.kchoi:user.kchoi.WZana_WZ_mc16a, user.kchoi:user.kchoi.WZana_WZ_mc16d, user.kchoi:user.kchoi.WZana_WZ_mc16e
  - A Sample without the option GridDID is treated as a typical Sample, which reads ntuple files from a local path.
- Add option NtupleFile: servicex/<SAMPLE NAME>
  - servicex-for-trexfitter delivers one ROOT file per Sample with the same name as the Sample name.
  - This option is required only for the Samples with option GridDID. Other Samples can use any option valid for option NTUP.
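The GridDID format described above (scope and name joined by a colon, multiple DIDs separated by commas) can be sketched with a small parser. This is a minimal illustration only: split_grid_dids is a hypothetical helper, not part of servicex-for-trexfitter, and the package may validate the option differently.

```python
from typing import List, Tuple


def split_grid_dids(value: str) -> List[Tuple[str, str]]:
    """Split a GridDID option value into (scope, name) pairs.

    Hypothetical helper for illustration; not the package's own parser.
    """
    pairs = []
    for did in value.split(","):
        did = did.strip()
        if not did:
            continue  # tolerate a trailing comma
        scope, sep, name = did.partition(":")
        if not sep or not scope or not name:
            raise ValueError(f"GridDID needs both scope and name: {did!r}")
        pairs.append((scope, name))
    return pairs


dids = split_grid_dids(
    "user.kchoi:user.kchoi.WZana_WZ_mc16a, user.kchoi:user.kchoi.WZana_WZ_mc16d"
)
for scope, name in dids:
    print(scope, name)
```

A value without a colon, such as a bare dataset name, raises ValueError here, which mirrors the requirement that both scope and name be given.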
Here is a side-by-side comparison of example configuration files:
| servicex-for-trexfitter | Default |
|---|---|
Caveat
- Only scalar-type TBranches are supported.
- Most standard TCut expressions are supported for Selection, but special functions like Sum$(formula) are not supported yet. Please find more about the supported TCut expressions here.
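Since special functions such as Sum$(formula) are not supported, it may help to scan a Selection string before requesting delivery. The sketch below is a hypothetical pre-check, not something the package provides; it simply flags any identifier followed by a dollar sign, and the branch names in the example are made up.

```python
import re

# Matches ROOT-style special functions, i.e. an identifier followed by '$'
# (for example Sum$ or Length$).
SPECIAL_FUNCTION = re.compile(r"\b([A-Za-z_]\w*)\$")


def find_special_functions(selection: str) -> list:
    """Return the sorted, unique '$'-suffixed function names in a selection.

    Hypothetical helper for illustration; not part of servicex-for-trexfitter.
    """
    names = {match.group(1) + "$" for match in SPECIAL_FUNCTION.finditer(selection)}
    return sorted(names)


# Branch names below are invented for the example.
selection = "Sum$(jet_pt > 25) >= 2 && met > 50"
unsupported = find_special_functions(selection)
if unsupported:
    print("Selection uses special functions:", ", ".join(unsupported))
```

An empty result does not guarantee that an expression is supported; it only rules out this one class of special functions.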
Deliver ROOT ntuples
from servicex_for_trexfitter import ServiceXTRExFitter
sx_trex = ServiceXTRExFitter("<TRExFitter configuration file>")
sx_trex.get_ntuples()
Once you have imported the package, create an instance by passing the path to your TRExFitter configuration file.
You can then request delivery of the ROOT ntuples.
This initiates ServiceX transformation(s) based on your TRExFitter configuration and delivers the ROOT ntuples to the servicex subdirectory of the path you specified in NtuplePaths.
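After delivery, one file per Sample is expected under the servicex subdirectory of NtuplePaths. The following sketch checks for those files. It is only an illustration: both function names are hypothetical, and the .root file extension is an assumption that is not stated above.

```python
from pathlib import Path
from typing import Iterable, List


def expected_ntuple_files(ntuple_paths: str, sample_names: Iterable[str]) -> List[Path]:
    """Build the expected output path for each Sample.

    Assumption: delivered files are named <SAMPLE NAME>.root.
    """
    base = Path(ntuple_paths) / "servicex"
    return [base / f"{name}.root" for name in sample_names]


def missing_ntuple_files(ntuple_paths: str, sample_names: Iterable[str]) -> List[Path]:
    """Return the expected files that do not exist on disk."""
    return [
        path
        for path in expected_ntuple_files(ntuple_paths, sample_names)
        if not path.is_file()
    ]
```

Calling missing_ntuple_files with your NtuplePaths value and the names of the Samples that use GridDID returns an empty list when every expected file is present.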
Local data cache
ServiceX caches your queries and data in a local temporary directory.
Therefore, whenever you make further changes to the TRExFitter configuration file, servicex-for-trexfitter creates data delivery requests only for the updated parts.
Compatible TRExFitter framework
To run the subsequent steps of TRExFitter with the ROOT ntuples that servicex-for-trexfitter delivered, you need to check out the branch feat/servicex-integration of the TRExFitter framework.
Otherwise, TRExFitter will complain about the unknown options.
The feature branch will be merged into master in the near future.
Acknowledgements
Support for this work was provided by the U.S. Department of Energy, Office of High Energy Physics under Grant No. DE-SC0007890.