
ServiceX Data Transformer for HEP Data


ServiceX_transformer Library


Library of common classes for building ServiceX transformers.

Minimum Requirements

Works with Python version 2.7 and above

Download from PyPI

To use this library:

pip install servicex-transformer

Standard Command Line Arguments

This library provides a subclass of ArgParse that standardizes command line arguments across all transformer implementations.
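
As a rough illustration of the idea, the sketch below subclasses the standard library's `argparse.ArgumentParser` and pre-registers a few of the options listed in the command line reference. The class name `SketchTransformerParser`, the `--columns` option, and the `int` type on `--limit` are assumptions made for this example; this is not the library's actual class or import path.

```python
import argparse


class SketchTransformerParser(argparse.ArgumentParser):
    """Hypothetical stand-in for a shared transformer argument parser."""

    def __init__(self, **kwargs):
        super(SketchTransformerParser, self).__init__(**kwargs)
        # Option names and defaults are taken from the command line reference.
        self.add_argument("--topic", default="servicex",
                          help="Kafka topic to publish arrays to")
        self.add_argument("--tree", default="Events",
                          help="Root Tree to extract data from")
        self.add_argument("--path",
                          help="Path to single Root file to transform")
        # Parsing --limit as an int is an assumption of this sketch.
        self.add_argument("--limit", type=int,
                          help="Max number of events to process")
        self.add_argument("--result-destination", dest="result_destination",
                          choices=["kafka", "object-store", "output-dir"],
                          default="kafka")
        self.add_argument("--result-format", dest="result_format",
                          choices=["arrow", "parquet", "root-file"],
                          default="arrow")


parser = SketchTransformerParser(description="Example transformer")
# A concrete transformer can still add its own options (hypothetical here).
parser.add_argument("--columns", help="Transformer-specific option")

args = parser.parse_args(["--path", "/data/example.root", "--limit", "100"])
print(args.tree, args.limit, args.result_destination)  # Events 100 kafka
```

Putting the shared options in the parser's constructor means every transformer built on it accepts the same flags with the same defaults, while remaining free to register extra ones.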

Available arguments are:

Transformed Result Output

Command line arguments determine a destination for the results as well as an output format.

  • Kafka - Streaming system. Messages are written as Arrow tables. The chunks parameter determines how many events are included in each message.
  • Object Store - Each transformed file is written as an object to an S3-compatible object store. Parquet is currently the only supported output file format. The objects are stored in a bucket named after the transformation request ID.
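
The two destinations can be pictured with a small sketch. The helper names `split_into_chunks` and `object_location` are hypothetical, and the object naming scheme (input file name plus `.parquet`) is only an illustration; the source states only that the bucket is named after the request ID.

```python
import posixpath


def split_into_chunks(events, chunks):
    """Yield successive lists holding at most `chunks` events each."""
    if chunks < 1:
        raise ValueError("chunks must be a positive integer")
    for start in range(0, len(events), chunks):
        yield events[start:start + chunks]


def object_location(request_id, input_path):
    """Return (bucket, object_name) for one transformed file.

    The bucket is the transformation request ID. The object name used
    here is an assumption made for this sketch.
    """
    object_name = posixpath.basename(input_path) + ".parquet"
    return request_id, object_name


# Ten events with chunks=4 produce messages of 4, 4 and 2 events.
messages = list(split_into_chunks(list(range(10)), 4))
print([len(m) for m in messages])  # [4, 4, 2]

print(object_location("abc-123", "root://host//data/file.root"))
```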

Command Line Reference

| Option | Description | Default |
| --- | --- | --- |
| --brokerlist BROKERLIST | List of Kafka brokers to connect to if streaming is selected | servicex-kafka-0.slateci.net:19092, servicex-kafka-1.slateci.net:19092, servicex-kafka-2.slateci.net:19092 |
| --topic TOPIC | Kafka topic to publish arrays to | servicex |
| --chunks CHUNKS | Number of events to include in each message. If omitted, a best guess is computed from heuristics and the max message size | None |
| --tree TREE | Root Tree to extract data from. Only valid for uproot transformer | Events |
| --path PATH | Path to single Root file to transform. Any file path readable by xrootd | |
| --limit LIMIT | Max number of events to process | |
| --result-destination DEST | Where to send the results: kafka, object-store, or output-dir | kafka |
| --output-dir | Local directory where the result will be written. Use this to run standalone without other ServiceX infrastructure | None |
| --result-format | Binary format for the results: arrow, parquet, or root-file | arrow |
| --max-message-size | Maximum size for any message in megabytes | 14.5 MB |
| --rabbit-uri URI | RabbitMQ Connection URI | host.docker.internal |
| --request-id GUID | ID associated with this transformation request. Used as RabbitMQ Topic Name as well as object-store bucket | servicex |
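
The library's actual heuristic for `--chunks` is not described here. Purely as an illustration of how a default could be derived from `--max-message-size`, the sketch below divides the message budget by an estimated per-event size. The function name `estimate_chunks`, the `bytes_per_event` input, and the 1 MB = 1,000,000 bytes conversion are all assumptions.

```python
def estimate_chunks(max_message_size_mb, bytes_per_event, chunks=None):
    """Pick the number of events per message.

    An explicit `chunks` value wins. Otherwise fit as many events as
    possible under the message size limit. This is an illustrative
    heuristic, not the one the library implements.
    """
    if chunks is not None:
        return chunks
    if bytes_per_event <= 0:
        raise ValueError("bytes_per_event must be positive")
    max_bytes = int(max_message_size_mb * 1000 * 1000)  # assumed MB -> bytes
    # Always send at least one event per message.
    return max(1, max_bytes // bytes_per_event)


print(estimate_chunks(14.5, 2000))              # 7250
print(estimate_chunks(14.5, 2000, chunks=500))  # 500
```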

Running Tests

Validation of the code logic is performed using pytest and pytest-mock. Unit test fixtures are in test directories inside each package.
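
A test in this style might look like the sketch below. The function under test, `publish_chunks`, is hypothetical and defined inline so the example is self-contained. The sketch uses `unittest.mock` from the standard library directly; with pytest-mock the same mock object could be created through the `mocker` fixture instead.

```python
from unittest import mock


def publish_chunks(producer, topic, chunks):
    """Hypothetical unit under test: send each chunk to a topic."""
    for chunk in chunks:
        producer.send(topic, chunk)
    return len(chunks)


def test_publish_chunks_sends_every_chunk():
    # A Mock stands in for a real message producer, so no broker is needed.
    producer = mock.Mock()

    sent = publish_chunks(producer, "servicex", [b"a", b"b"])

    assert sent == 2
    assert producer.send.call_count == 2
    producer.send.assert_has_calls(
        [mock.call("servicex", b"a"), mock.call("servicex", b"b")]
    )
```

pytest collects any function whose name starts with `test_`, so placing this in a file such as `test_publish.py` (a hypothetical name) is enough for it to be picked up.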

The tests are instrumented with code coverage reporting via codecov. The Travis job has the codecov upload token set as an environment variable, which is passed into the Docker container so the report can be uploaded once the tests complete successfully.

Coding Standards

To make it easier for multiple people to work on the codebase, we enforce PEP8 standards, verified by flake8. The community has found that the 80 character limit is a bit awkward, so we have a local config setting the max_line_length to 99.

