
Project with code needed to filter, trim and slim ntuples produced by AP


[TOC]

TLDR

To submit the jobs do:

# A virtual environment as shown below
micromamba activate post_ap

# This opens a shell with access to Ganga
# .ganga.py has to have the site-packages path added
# to the python path, as shown below
post_shell

# Create proxy with 100 hours validity
lhcb-proxy-init -v 100:00

# Submit jobs for the "data_2024" sample of the "rx_2024" production, using the
# rk/v1.yaml configuration. The job name in DIRAC will be filter_001.
job_filter_ganga -n filter_001 -p rx_2024 -s data_2024 -c rk/v1.yaml -b Dirac -v 037

Description

This project is used to:

  • Filter, slim, trim the trees from a given AP production
  • Rename branches

This is done using configurations in a YAML file and through Ganga jobs.
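As an illustration only, such a configuration might look like the sketch below; all keys, tree names, and cuts here are hypothetical, and the example config shipped with the project is the authoritative reference:

```yaml
# Hypothetical sketch of a post_ap configuration; the real schema is
# defined by the project's example configs
trees:
  - Hlt2RD_BuToKpEE/DecayTree   # trees to pick from the AP ntuples
selection:
  mass : B_M > 4500             # cut applied when filtering
branches:
  B_M : B_mass                  # old name -> new name when renaming
```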

Installation

You will need to install this project in a virtual environment provided by micromamba; for that, check this. Once micromamba is installed in your system:

  • Make sure that the ${HOME}/.local directory does not exist. If a dependency of post_ap is installed there, Ganga would have to be pointed both to that location and to the location of the virtual environment; this is too complicated and should be avoided.

  • Create a new environment:

# DIRAC uses python 3.11; this environment uses 3.12, which has to match
# the python3.12 site-packages path added to .ganga.py below
micromamba create -n post_ap python=3.12
micromamba activate post_ap
  • In the $HOME/.bashrc export POSTAP_PATH, pointing to the bin directory of the environment you just created, e.g.:
export POSTAP_PATH=/home/acampove/micromamba/envs/post_ap/bin

which is needed to find the executables.

  • Install XROOTD using:
micromamba install xrootd

which is needed to download the ntuples. XROOTD is not a Python project, therefore it cannot be installed with pip.

  • Install this project.
# can also be installed in editable mode
pip install post_ap
  • In order to make Ganga aware of the post_ap package, in $HOME/.ganga.py add:
import sys

# Or the proper place where the environment is installed in your system
sys.path.append('/home/acampove/micromamba/envs/post_ap/lib/python3.12/site-packages')
  • This project is used from inside Ganga. To have access to Ganga do:
# Setup LHCb environment
. /cvmfs/lhcb.cern.ch/lib/LbEnv

# Make a proxy that lasts 100 hours
lhcb-proxy-init -v 100:00
  • To check that this is working, open ganga and run:
from post_ap.pfn_reader        import PFNReader

Submitting jobs

Environment

The script in the next section needs a special environment to run. To create it:

  • Activate the virtual environment where this project is installed, e.g. post_ap
  • Make a grid token with:
lhcb-proxy-init -v 100:00
  • Run:
post_shell

which will leave you in a new shell with the required environment variables

  • Run the commands shown below

Submission script

To submit jobs, run a line like:

job_filter_ganga -n job_name -p PRODUCTION -s SAMPLE -c rx/v13.yaml -b BACKEND -v VERSION_OF_ENV 

# For example
job_filter_ganga -n flt_002 -p rd_ap_2024 -s w37_39_v1r3788 -c rx/v13.yaml -b Dirac -v 037
  • The number of jobs will be equal to the number of PFNs, up to 500 jobs.
  • The code used to filter resides on the grid; the only thing the user has to do is provide the latest version of the environment.
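The PFN-to-job mapping described above can be sketched as follows; this is an illustration of the splitting rule, not the project's actual implementation:

```python
def split_pfns(pfns : list[str], max_jobs : int = 500) -> list[list[str]]:
    '''
    Distribute PFNs over at most max_jobs groups, one group per subjob.
    With fewer PFNs than max_jobs, each PFN gets its own job.
    '''
    njobs  = min(len(pfns), max_jobs)
    groups = [ [] for _ in range(njobs) ]
    # Round-robin assignment keeps the groups balanced
    for index, pfn in enumerate(pfns):
        groups[index % njobs].append(pfn)

    return groups
```

e.g. 300 PFNs give 300 single-PFN jobs, while 1000 PFNs are spread over 500 jobs of two PFNs each.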

The options that can be used are:

usage: job_filter_ganga [-h] -n NAME -p PROD -s SAMP -c CONF [-b {Interactive,Local,Dirac}] [-t] -v VENV [-d]

Script used to send ntuple filtering jobs to the Grid, through ganga

options:
  -h, --help            show this help message and exit
  -n NAME, --name NAME  Job name
  -p PROD, --prod PROD  Production
  -s SAMP, --samp SAMP  Sample
  -c CONF, --conf CONF  Relative path to config file 
  -b {Interactive,Local,Dirac}, --back {Interactive,Local,Dirac}
                        Backend
  -t, --test            Will run one job only if used
  -v VENV, --venv VENV  Version of virtual environment used to run filtering
  -d, --dry_run         If used, will not create and send job, only initialize

RX jobs: See this.
LbpKmumu jobs: See this

Check latest version of virtual environment

The jobs below will run with code from a virtual environment that is already on the grid. One should use the latest version of this environment; to find it, run:

# In a separate terminal open a shell with access to dirac
post_shell

# Run this command for a list of environments
list_venvs

The post_shell terminal won't be used to send jobs.

Config file

Here is where all the configuration goes; an example of a config can be found here. One of the sections contains the list of MC samples, which can be updated with:

dump_samples -p rd_ap_2024 -g rd -v v1r2437 -a RK RKst

which will dump a YAML file with the samples for the rd_ap_2024 production, in the rd group and version v1r2437, used by the RK and RKst analyses.
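The dumped file might look roughly like the sketch below; the layout and sample names are assumptions for illustration, not the actual output format:

```yaml
# Hypothetical sketch of the dumped samples file
rd_ap_2024:
  RK:
    - mc_bu_jpsik_ee
    - mc_bu_kee
  RKst:
    - mc_bd_kstee
```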

Validation

In order to validate the slimmed ntuples, run:

validate_slimming -P rk -v v1 -p /path/to/directory/with/slimmed/ntuples 

to validate ntuples produced with the v1.yaml config of the rk project. This checks that:

  • Every ntuple that was meant to be slimmed, was slimmed.
  • Each ntuple that was meant to be slimmed produced the same number of output ntuples.
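The first check can be sketched like the function below; validate_slimming's real logic may differ, and the function and names here are assumptions used only to illustrate the idea:

```python
from pathlib import Path

def find_missing(expected : list[str], out_dir : str) -> list[str]:
    '''
    Return the names of ntuples that were meant to be slimmed
    but have no counterpart in out_dir
    '''
    # Names of the slimmed files actually present on disk
    produced = { path.name for path in Path(out_dir).glob('*.root') }

    return [ name for name in expected if name not in produced ]
```

An empty return value means every expected ntuple was slimmed.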

Optional

  • In order to improve the ganga experience use:
# Minimizes messages when opening ganga
# Does not start monitoring of jobs by default
alias ganga='ganga --quiet --no-mon'

in the $HOME/.bashrc file. Monitoring can be turned on by hand as explained here

Make your own virtual environment

You can also:

  • Modify this project
  • Make a virtual environment and put it in a tarball
  • Upload it to the grid and make your jobs use it.

For this, export:

  • LXNAME: Your username on LXPLUS, which should also be your grid username; it is used to determine where on the grid the environment will be placed.
  • VENVS: Path to the directory where the code will place the tarballs holding the environments.
  • POSTAP_PATH: Path to the bin directory of the micromamba environment where you are developing, e.g. /home/acampove/micromamba/envs/run3/bin. Here the name of the environment is run3.

Then do:

# This leaves you in a shell with the right environment
post_shell

# Create and upload the environment with version 030
update_tarball -v 030

Utilities

For brevity, these utilities are documented separately:

  • dump_samples: Used to search for existing and missing samples in a production

Things that can go wrong

For brevity, each of these issues is documented in a separate file.

Missing Ganga jobs: If Ganga has problems, the jobs might not appear, even though they did run.
