Interface ServiceX into TRExFitter to provide an alternative method of reading input ntuples
Project description
ServiceX for TRExFitter
Overview
ServiceX, a component of the IRIS-HEP DOMA group's iDDS, is an experiment-agnostic service that enables on-demand data delivery from data lakes in different data formats, including Apache Arrow and ROOT ntuple.
TRExFitter is a popular framework for performing profile likelihood fits in the ATLAS experiment. It takes ROOT histograms or ntuples as input. The long turnaround time of ntuple reading for large and/or remote data can slow down the whole analysis.
ServiceXforTRExFitter is a Python library that interfaces ServiceX with the TRExFitter framework. It provides an alternative method to produce histograms out of ROOT ntuples.
Expected improvements:
- Faster turnaround: this library analyzes your TRExFitter configuration and performs on-the-fly transformation to apply preselection (or filtering) on the ntuples and delivers only necessary branches.
- Disk space: ServiceX reads files on the grid, so direct access to the ROOT ntuples is not needed.
- Scalability: ServiceX is running on a K8s cluster which can easily scale the job.
Possible bottlenecks:
- Slow network connection between the grid site and the K8s cluster where ServiceX is deployed.
- Parquet to ROOT ntuple conversion is currently being done on your PC or laptop.
Prerequisites
- Python 3.6, 3.7, or 3.8
- Access to an Uproot ServiceX endpoint. More information about ServiceX can be found in the ServiceX documentation.
- PyROOT
Usage
Prepare TRExFitter configuration
The following items of TRExFitter configuration need to be modified to be compatible with the library:
- In the Job block: specify NtuplePath in the form <Output Path>/Data
- In the Sample block: specify the new option GridDID for each Sample, where GridDID is a Rucio data identifier which includes scope and name.
- In the Sample block: NtupleFile has to be the same as the Sample name
An example TRExFitter configuration can be found here.
N.B. Ternary operations are not supported yet.
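The three configuration requirements above can be checked before a delivery request is made. The following is a minimal sketch, not part of the library: the parser is a simplified stand-in for TRExFitter's own configuration reader (it only understands unindented block headers and indented options), and the helper names parse_blocks and check_config, the sample names, and the GridDID value are all hypothetical.

```python
def parse_blocks(text):
    """Split a simplified TRExFitter-style config into blocks.

    Assumption: an unindented 'Type: "name"' line opens a block and every
    indented 'Key: value' line below it is an option of that block.
    """
    blocks = []
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        # Split on the first colon only, since a Rucio DID contains one too.
        key, _, value = raw.partition(":")
        key, value = key.strip(), value.strip().strip('"')
        if raw[0].isspace():
            if not blocks:
                raise ValueError("option found before any block header")
            blocks[-1]["options"][key] = value
        else:
            blocks.append({"type": key, "name": value, "options": {}})
    return blocks


def check_config(text):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for block in parse_blocks(text):
        opts = block["options"]
        if block["type"] == "Job":
            if not opts.get("NtuplePath", "").endswith("/Data"):
                problems.append("Job: NtuplePath should have the form <Output Path>/Data")
        elif block["type"] == "Sample":
            name = block["name"]
            scope, sep, dataset = opts.get("GridDID", "").partition(":")
            if not (scope and sep and dataset):
                problems.append(f"Sample {name}: GridDID should look like scope:name")
            if opts.get("NtupleFile") != name:
                problems.append(f"Sample {name}: NtupleFile should equal the Sample name")
    return problems


# Hypothetical configuration: 'ttbar' follows the rules, 'wjets' does not.
EXAMPLE = '''
Job: "example_fit"
  NtuplePath: "output/Data"

Sample: "ttbar"
  GridDID: "user.someone:user.someone.ttbar.v1"
  NtupleFile: "ttbar"

Sample: "wjets"
  NtupleFile: "wjets_file"
'''

for problem in check_config(EXAMPLE):
    print(problem)
```

Here the wjets sample is reported twice, once for the missing GridDID and once for the NtupleFile name that differs from the Sample name.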
Delivery of slimmed/skimmed ROOT ntuples
The library can be loaded by the following command
from servicex_for_trexfitter import ServiceXTRExFitter
and then an instance can be created with an argument of TRExFitter configuration file.
sx_trex = ServiceXTRExFitter('<TRExFitter configuration file>')
Now you are ready to make a delivery request.
sx_trex.get_ntuples()
It will initiate ServiceX transformation(s) based on your TRExFitter configuration, and deliver ROOT ntuples that are effectively slimmed and skimmed.
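Put together, the steps above can be wrapped in a small helper. This is a sketch under stated assumptions: deliver_ntuples is a hypothetical name, the existence check on the configuration file is an addition for illustration, and only the import, the constructor call, and get_ntuples() come from the usage shown above.

```python
from pathlib import Path


def deliver_ntuples(config_path):
    """Request slimmed and skimmed ROOT ntuples for one TRExFitter configuration."""
    config = Path(config_path)
    if not config.is_file():
        # Fail early, before any ServiceX request is made.
        raise FileNotFoundError(f"TRExFitter configuration not found: {config}")

    # Imported here so the path check above works even where the library
    # is not installed.
    from servicex_for_trexfitter import ServiceXTRExFitter

    sx_trex = ServiceXTRExFitter(str(config))
    sx_trex.get_ntuples()
```

Calling deliver_ntuples('<TRExFitter configuration file>') requires access to a ServiceX endpoint, as listed under Prerequisites.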
Prepare histograms from delivered ROOT ntuples
Now you can run TRExFitter with the action n to read input ntuples from the delivered ROOT ntuples.
Given that the current TRExFitter framework doesn't support ServiceXforTRExFitter yet, the option GridDID in the Sample block has to be deleted before you run TRExFitter.
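Deleting the GridDID option by hand can be avoided by writing a cleaned copy of the configuration for TRExFitter. The sketch below is not part of the library; the function names and file names are hypothetical, and it assumes that each GridDID option sits on its own line in the form GridDID: <value>.

```python
from pathlib import Path


def strip_grid_did(config_text):
    """Return the configuration text without any GridDID option lines."""
    kept = [
        line
        for line in config_text.splitlines()
        # The option name is everything before the first colon.
        if line.partition(":")[0].strip() != "GridDID"
    ]
    return "\n".join(kept) + "\n"


def write_trex_config(src, dst):
    """Write a copy of the configuration at src to dst, minus GridDID options."""
    Path(dst).write_text(strip_grid_did(Path(src).read_text()))


# Hypothetical Sample block, before and after cleaning.
sample = 'Sample: "ttbar"\n  GridDID: "user.someone:user.someone.ttbar.v1"\n  NtupleFile: "ttbar"\n'
print(strip_grid_did(sample))
```

Keeping the original file untouched means the same configuration can still be passed to ServiceXTRExFitter for later delivery requests.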
Acknowledgements
Support for this work was provided by the U.S. Department of Energy, Office of High Energy Physics under Grant No. DE-SC0007890.