GangaCK

Improving Ganga for better productivity.

License: GPL v3

Features:

  • Jobtree: improved visualization of the jobtree, for better organization of jobs. It can be called from inside or outside a Ganga interactive session.

  • IOUtils: miscellaneous operations to convert to and from (collections of) PFN, LFN, Bookkeeping URI (evt+std://, sim+std://), PPL, xml, lfns, eos, ... Results are cached where caching is useful. One particular application is LHCbDataset.new, which accepts arbitrary arguments drawn from the supported inputs listed above. For example:

    LHCbDataset.new(
        'some/local/file.dst',  # LOCAL
        'root://some-remote-file.dst',  # REMOTE
        'file:///another-remote-file.dst',  # REMOTE
        '/lhcb/MC/Dev/LDST/00041927/0000/00041927_00000002_1.ldst',  # LFN
        'evt+std://MC/2012/42100000/Beam4000GeV-2012-MagDown-Nu2.5-Pythia8/...',  # BKQ
        'sim+std://LHCb/Collision12/Beam4000GeV-VeloClosed-MagDown/...',  # BKQ
        '$EOS_HOME/ganga/4083/000.dst',  # EOS
        '/cvmfs/lhcb.cern.ch/.../pool_xml_catalog_Reco14_Run125113.xml',  # XML
        open('text_file_with_url_per_line.txt'),  # local list
        jobs(123),  # output from another Ganga job
        LHCbDataset(['foo', 'bar']),  # another dataset
    )  # heterogeneous inputs are each handled appropriately
    
  • Magics: because Ganga is embedded inside IPython, why not more magics?

    • jv: shows the status of subjobs from all running jobs. Extremely useful for monitoring.
    • jt: for improved jobtree operations.
    • peek: based on Job.peek, but looks deeper when possible.
    • jsh: provides shell-like syntax for operating on a Job with less (no-shift) typing. For example, jsh 197.12 remove True instead of jobs("197.12").remove(True). Less typing saves time.
    • grun: similar to the built-in magic ganga, but it picks the local ganga*.py immediately, or asks in case of ambiguity.
    • resubmit: handles resubmission or backend.reset of failed Dirac jobs according to their failure status (e.g., "Pending Requests", "Job has reached the CPU limit of the queue", "Stalling for more than ...", etc.)
  • Additional instance methods:

    • Job: lfn_list, lfn_size, lfn_purge, pfn_size, ppl_list, eos_list, humansize, is_final.
    • Gauss: nickname, to retrieve nickname from $DECFILESROOT.
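
To make the heterogeneous-input idea behind LHCbDataset.new more concrete, here is a minimal sketch of how string inputs could be sorted into the categories used in the example above. It is not GangaCK's actual implementation: the function names classify_source and group_sources, the category labels, and the matching rules (prefixes and suffixes only) are all assumptions made for illustration. It handles plain strings only, not file objects, jobs, or datasets.

```python
import re

# Hypothetical helper: NOT part of GangaCK. It guesses the kind of a string
# input from its shape alone, using the labels from the README example.
def classify_source(item):
    if item.startswith(("evt+std://", "sim+std://")):
        return "BKQ"      # bookkeeping query URI
    if item.endswith(".xml"):
        return "XML"      # XML catalog
    if item.startswith("$EOS_HOME"):
        return "EOS"      # path under the user's EOS area
    if item.startswith("/lhcb/"):
        return "LFN"      # logical file name
    if re.match(r"^[a-z][a-z0-9+.-]*://", item):
        return "REMOTE"   # any other scheme://... URL
    return "LOCAL"        # fall back to a local path


# Hypothetical helper: bucket a mixed list of inputs by their guessed kind.
def group_sources(items):
    groups = {}
    for item in items:
        groups.setdefault(classify_source(item), []).append(item)
    return groups
```

The bookkeeping check comes first because evt+std:// and sim+std:// would otherwise match the generic URL pattern. A real implementation would then resolve each group differently (for example, querying the bookkeeping for BKQ entries), which is where caching would pay off.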
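
The jsh magic described above maps a shell-like line such as jsh 197.12 remove True onto jobs("197.12").remove(True). The translation could be sketched roughly as follows. This is an illustration only, not the package's code: parse_jsh and run_jsh are hypothetical names, and the jobs argument stands in for whatever callable looks up a job by its id string.

```python
import ast

# Hypothetical helper: NOT part of GangaCK. Splits "JOBID METHOD [ARGS...]"
# into its parts and converts each argument to a Python literal if possible.
def parse_jsh(line):
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("expected: <job id> <method> [args...]")
    job_id, method, raw_args = parts[0], parts[1], parts[2:]
    args = []
    for raw in raw_args:
        try:
            args.append(ast.literal_eval(raw))  # e.g. True, 3, 'text'
        except (ValueError, SyntaxError):
            args.append(raw)                    # keep bare words as strings
    return job_id, method, tuple(args)


# Hypothetical helper: look up the job, then call the named method on it.
# `jobs` is assumed to be a callable taking the job id as a string.
def run_jsh(line, jobs):
    job_id, method, args = parse_jsh(line)
    return getattr(jobs(job_id), method)(*args)
```

The job id is kept as a string rather than evaluated, so "197.12" stays a job.subjob identifier instead of becoming a float.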

Scripts:

  • ganga_cache_viewer: displays the list of caches created by this package.

  • ganga_cleaner: all-in-one script for tidying your Ganga environment.

  • offline_ganga_reader: quick script to read the content of Ganga's JobTree offline.

  • xmlgensum: reports a summary of GeneratorLog.xml from all subjobs of a Ganga Gauss job.

  • xmlmerge: Merge summary.xml files from Ganga's subjobs and neatly archive the dir.

Installation

It is available via pip: pip install gangack

Disclaimer

This package was written and used during my PhD in 2013-2017 at EPFL (Lausanne) and in the LHCb collaboration (CERN), for work on the Z->tau tau cross-section measurement and H->mu tau searches at LHCb (8 TeV).

As such, it was developed against Ganga 5.34 through 6.0.44. Because Ganga develops quickly and does not maintain backward compatibility, this package may be obsolete with newer versions of Ganga.
