ServiceX data management using a configuration file
ServiceX DataBinder
ServiceX DataBinder is a Python package for making multiple ServiceX requests and managing ServiceX-delivered data from a configuration file.
Configuration file
The configuration file is a YAML file containing all the information needed to make the ServiceX requests. An example configuration file is shown below:
General:
  ServiceXBackendName: uproot
  OutputDirectory: /path/to/output
  OutputFormat: parquet
Sample:
  - Name: ttH
    GridDID: user.kchoi:user.kchoi.sampleA,
             user.kchoi:user.kchoi.sampleB
    Tree: nominal
    FuncADL: "Select(lambda event: {'jet_e': event.jet_e, 'jet_pt': event.jet_pt})"
  - Name: ttW
    GridDID: user.kchoi:user.kchoi.sampleC
    Tree: nominal
    Filter: n_jet > 5
    Columns: jet_e, jet_pt
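The `GridDID` and `Columns` values in the example hold several comma-separated entries. As a minimal sketch of how such a value could be turned into a list, assuming it arrives from the YAML parser as a single string: `split_commas` below is a hypothetical helper written for illustration, not a function of the package, and the package's own parsing may differ.

```python
from typing import List


def split_commas(value: str) -> List[str]:
    """Split a comma-separated option value into a list of clean entries.

    Hypothetical helper for illustration; not part of servicex_databinder.
    Surrounding whitespace and line breaks are stripped, and empty entries
    (for example from a trailing comma) are dropped.
    """
    return [item.strip() for item in value.split(",") if item.strip()]


# Values taken from the example configuration above.
dids = split_commas("user.kchoi:user.kchoi.sampleA,\n user.kchoi:user.kchoi.sampleB")
columns = split_commas("jet_e, jet_pt")
```

With the example values, `dids` holds the two Rucio DIDs and `columns` holds `jet_e` and `jet_pt`.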
The following options are available:
| Option for General | Description |
|---|---|
| ServiceXBackendName | ServiceX backend name (only uproot is supported at the moment) |
| OutputDirectory | Path to the directory for ServiceX-delivered files |
| OutputFormat | Output file format of ServiceX-delivered data (only parquet is supported at the moment) |
| IgnoreServiceXCache | Ignore the existing ServiceX cache and force new ServiceX requests |
| Option for Sample | Description |
|---|---|
| Name | Sample name defined by the user |
| GridDID | Rucio Dataset ID (DID) for a given sample; can be multiple DIDs separated by commas |
| Tree | Name of the input ROOT TTree |
| Filter <sup>1</sup> | Selection in the TCut syntax, e.g. `jet_pt > 10e3 && jet_eta < 2.0` |
| Columns <sup>1</sup> | List of columns (or branches) to be delivered; multiple columns separated by commas |
| FuncADL <sup>2</sup> | func-adl expression for a given sample |
<sup>1</sup> Options for the TCut syntax (CANNOT be combined with the option FuncADL)
<sup>2</sup> Option for a func-adl expression (CANNOT be combined with the options Filter and Columns)
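The restriction in the footnotes can be checked before any request is made. The following is a minimal sketch, assuming each `Sample` entry has already been loaded into a plain dictionary whose keys are the option names; `check_sample` is a hypothetical helper for illustration and is not part of the package.

```python
from typing import Any, Dict, List


def check_sample(sample: Dict[str, Any]) -> List[str]:
    """Return a list of problems for one Sample entry (empty if none).

    Hypothetical helper; it only checks the rule stated in the footnotes:
    FuncADL cannot be combined with Filter or Columns.
    """
    problems = []
    name = sample.get("Name", "<unnamed>")
    # Options that belong to the TCut style and are present in this sample.
    tcut_options = [key for key in ("Filter", "Columns") if key in sample]
    if "FuncADL" in sample and tcut_options:
        problems.append(
            f"Sample {name!r}: FuncADL cannot be combined with "
            + ", ".join(tcut_options)
        )
    return problems


# The two samples from the example configuration pass the check.
ttH = {"Name": "ttH", "Tree": "nominal", "FuncADL": "Select(lambda event: {'jet_e': event.jet_e})"}
ttW = {"Name": "ttW", "Tree": "nominal", "Filter": "n_jet > 5", "Columns": "jet_e, jet_pt"}
# A sample mixing both styles is reported.
mixed = {"Name": "bad", "FuncADL": "Select(lambda event: event.jet_e)", "Filter": "n_jet > 5"}
```

Returning a list of messages rather than raising on the first problem lets a caller report every misconfigured sample in one pass.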
Deliver data
from servicex_databinder import DataBinder
sx_db = DataBinder('<CONFIG>.yml')
out = sx_db.deliver()
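A small wrapper can catch a mistyped configuration path before the package is invoked. This is only a sketch under stated assumptions: `deliver_from` is a hypothetical name, the only package calls used are the `DataBinder(...)` constructor and `deliver()` shown above, and the structure of the returned value is not documented here, so it is passed through unchanged.

```python
from pathlib import Path


def deliver_from(config_path: str):
    """Check that the configuration file exists, then run the delivery.

    Hypothetical wrapper for illustration; not part of servicex_databinder.
    """
    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(f"Configuration file not found: {path}")
    # Imported here so that the path check works even when the package
    # is not installed in the current environment.
    from servicex_databinder import DataBinder

    return DataBinder(str(path)).deliver()
```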
Acknowledgements
Support for this work was provided by the U.S. Department of Energy, Office of High Energy Physics under Grant No. DE-SC0007890.