jlab_archiver_client

A Python client library for querying the Jefferson Lab EPICS archiver (MYA) via the myquery web service.

It is intended for non-mission critical applications such as data analysis and uses the CEBAF read-only archiver deployment by default. CEBAF mission critical applications should use internal libraries that provide direct access to the operations-oriented deployment.

Overview

This package provides a convenient Python interface to the myquery web service, making archived EPICS Process Variable (PV) data easily accessible for analysis. Data is returned in familiar pandas data structures (Series and DataFrames) with datetime indices for time-series analysis.

The package supports multiple myquery endpoints:

  • interval: Retrieve all archived events for a PV over a time range
  • mysampler: Get regularly-spaced samples across multiple PVs
  • mystats: Compute statistical aggregations over time bins
  • point: Retrieve a single event at a specific time
  • channel: Search and discover available channel names

Key Features

  • Pandas Integration: All data returned as pandas Series, DataFrames, and simple dictionaries
  • Datetime Indexing: Time-series data with proper datetime indices
  • Disconnect Handling: Non-update events tracked separately
  • Parallel Queries: Limited support for multi-channel queries with concurrent execution
  • Type Safety: Query builder classes with parameter validation
  • Enum Support: Option to convert enum values to strings
  • Thread-Safe Config: Runtime configuration changes supported
  • History Deployment: Defaults to Jefferson Lab's read-only history deployment
  • Command Line Interface: Command-line tools for quick queries

API Documentation

Documentation is published to the project's GitHub Pages site.

Installation

pip install jlab_archiver_client

Developer Quick Start Guide

Download the repo, create a virtual environment using Python 3.11+, and install the package in editable mode with the development dependencies. Then develop using your preferred IDE.

Linux (bash)

git clone https://github.com/JeffersonLab/jlab_archiver_client
cd jlab_archiver_client
python3.11 -m venv venv
# bash
source venv/bin/activate
pip install -e .[dev]

Linux (tcsh / csh)

git clone https://github.com/JeffersonLab/jlab_archiver_client
cd jlab_archiver_client
python3.11 -m venv venv
# tcsh / csh
source venv/bin/activate.csh
pip install -e '.[dev]'

Windows (PowerShell)

git clone https://github.com/JeffersonLab/jlab_archiver_client
cd jlab_archiver_client
\path\to\python3 -m venv venv
venv\Scripts\activate.ps1
pip install -e .[dev]

To start the provided database:

docker compose up

Testing

This application supports testing with pytest and code coverage with coverage; configuration lives in pyproject.toml. Integration tests require that the provided Docker container(s) are running. Tests are run automatically on the appropriate CI triggers.

Test Type              Command
Unit                   pytest test/unit
Integration            pytest test/integration
Unit & Integration     pytest
Code Coverage Report   pytest --cov-report=html
Linting                ruff check [--fix]

Documentation

Documentation is written in Sphinx and is automatically built and published to GitHub Pages when a new release is triggered. To build the documentation locally, run this command from the project root:

sphinx-build -b html docsrc/source build/docs

Release

Releases are generated automatically when the VERSION file receives a commit on the main branch. Artifacts (packages) are deployed to PyPI automatically, as this package is intended for a broader audience. Build artifacts are attached to each release when generated, along with the Python dependency information for the build (requirements.txt).

Configuration (Optional)

The package comes pre-configured for use with CEBAF's production myquery service. This service requires authentication when used offsite, which this package does not currently support.

If you need to access a non-standard myquery instance or the development container bundled in this repo, configure the myquery server first.

from jlab_archiver_client.config import config

# For production
config.set(myquery_server="epicsweb.jlab.org", protocol="https")

# For local development/testing
config.set(myquery_server="localhost:8080", protocol="http")

Usage Examples

MySampler - Regularly Sampled Data

Query multiple PVs at regularly spaced time intervals. Useful for synchronized sampling across channels.

from jlab_archiver_client import MySampler, MySamplerQuery
from datetime import datetime

# Query two channels with 30-minute intervals
query = MySamplerQuery(
    start=datetime.strptime("2019-08-12 00:00:00", "%Y-%m-%d %H:%M:%S"),
    interval=1_800_000,  # 30 minutes in milliseconds
    num_samples=15,
    pvlist=["R12XGMES", "R13XGMES"],
)

mysampler = MySampler(query)
mysampler.run()

# Access the data as a DataFrame with datetime index
print(mysampler.data)
#                      R12XGMES  R13XGMES
# Date
# 2019-08-12 00:00:00    57.265    44.813
# 2019-08-12 00:30:00    57.265    44.811
# 2019-08-12 01:00:00    57.265    44.811
# 2019-08-12 01:30:00    57.265    44.811
# ...

# Access disconnect events - dictionary of channel_names: pd.Series
print(mysampler.disconnects)
# {}

# Access channel metadata
print(mysampler.metadata)
# {'R12XGMES': {'metadata': {'name': 'R12XGMES', 'datatype': 'DBR_DOUBLE', 'datasize': 1, 'datahost': 'hstmya3', 'ioc': None, 'active': True}, 'returnCount': 15}, 'R13XGMES': {'metadata': {'name': 'R13XGMES', 'datatype': 'DBR_DOUBLE', 'datasize': 1, 'datahost': 'hstmya0', 'ioc': None, 'active': True}, 'returnCount': 15}}
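Since the `interval` parameter is expressed in milliseconds, a `datetime.timedelta` can be used to compute it from a human-friendly duration. A small sketch in plain Python (not part of the library):

```python
from datetime import timedelta

# 30 minutes expressed in milliseconds, as used for interval= above
interval_ms = int(timedelta(minutes=30).total_seconds() * 1000)
print(interval_ms)
# 1800000
```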

Interval - All Events in Time Range

Retrieve all archived events for a single PV. Best for detailed event history. It also includes an option to run multiple interval queries in parallel and return combined results, producing a single DataFrame with a row for each timestamp at which any channel updated.

Note: Example assumes you are running the provided docker container.

from jlab_archiver_client import Interval, IntervalQuery
from datetime import datetime

# Query a single channel for all events
query = IntervalQuery(
    channel="channel100",
    begin=datetime(2018, 4, 24),
    end=datetime(2018, 5, 1),
    deployment="docker"
)

interval = Interval(query)
interval.run()

# Access data as a pandas Series
print(interval.data)
# 2018-04-24 06:25:01    0.000
# 2018-04-24 06:25:05    5.911
# 2018-04-24 11:18:19    5.660
# ...

# Access disconnect events separately
print(interval.disconnects)

# For multiple channels, use parallel queries
data, disconnects, metadata = Interval.run_parallel(
    pvlist=["channel2", "channel3"],
    begin=datetime(2019, 8, 12, 0, 0, 0),
    end=datetime(2019, 8, 12, 1, 20, 45),
    deployment="docker",
    prior_point=True
)
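The combined parallel result described above behaves like a pandas outer join on timestamps. A sketch with synthetic data (the channel names here are made up) showing a row for every timestamp at which any channel updated, with NaN where a channel had no event:

```python
import pandas as pd

# Two channels that updated at partially overlapping times
a = pd.Series([1.0, 2.0],
              index=pd.to_datetime(["2019-08-12 00:00", "2019-08-12 00:10"]),
              name="chanA")
b = pd.Series([5.0],
              index=pd.to_datetime(["2019-08-12 00:05"]),
              name="chanB")

# Outer join: the union of timestamps, NaN where a channel did not update
combined = pd.concat([a, b], axis=1)
print(combined)
```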

MyStats - Statistical Aggregations

Compute statistics (min, max, mean, etc.) over time bins. Efficient for analyzing trends.

Note: Statistical computations are performed on the myquery server, which saves on outbound traffic, but all of the archived data must still be streamed to the myquery server.

Note: Example assumes you are running the provided docker container.

from jlab_archiver_client import MyStats, MyStatsQuery
from datetime import datetime
import pandas as pd

# Query statistics with 1-hour bins
query = MyStatsQuery(
    start=datetime.strptime("2019-08-12 00:00:00", "%Y-%m-%d %H:%M:%S"),
    end=datetime.strptime("2019-08-13 00:00:00", "%Y-%m-%d %H:%M:%S"),
    num_bins=24,  # 24 bins (one hour per bin)
    pvlist=["channel1", "channel100"],
    deployment="docker"
)

mystats = MyStats(query)
mystats.run()

# Access data as MultiIndex DataFrame (timestamp, stat)
print(mystats.data)
#                                   channel1    channel100
# timestamp           stat
# 2019-08-12 00:00:00 duration    3594.421033   3600.000000
#                     eventCount  1716.000000      2.000000
#                     max           96.952400      5.658000
#                     mean          94.964400      5.658000
# ...

# Query specific statistics at a time
print(mystats.data.loc['2019-08-12 00:00:00'])

# Query specific stat and time
print(mystats.data.loc[(pd.Timestamp('2019-08-12 00:00:00'), 'mean'), 'channel1'])
# 94.9644

# Query a range of times and stats using IndexSlice
idx = pd.IndexSlice
print(mystats.data.loc[idx['2019-08-12 00:00:00':'2019-08-12 12:00:00', ['mean', 'max']], :])
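`DataFrame.xs` is another convenient way to pull one statistic across all timestamps. A sketch using a synthetic DataFrame shaped like `mystats.data` (values here are illustrative, not real query results):

```python
import pandas as pd

# Synthetic stand-in for mystats.data: MultiIndex of (timestamp, stat)
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2019-08-12 00:00:00", "2019-08-12 01:00:00"]),
     ["mean", "max"]],
    names=["timestamp", "stat"],
)
df = pd.DataFrame({"channel1": [94.9, 96.9, 95.1, 97.0]}, index=idx)

# Select only the 'mean' rows across all timestamps
means = df.xs("mean", level="stat")["channel1"]
print(means)
```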

Point - Single Event Query

Retrieve a single event at or near a specific timestamp.

Note: Example assumes you are running the provided docker container.

from jlab_archiver_client import Point, PointQuery
from datetime import datetime

# Get the event at or before a specific time
query = PointQuery(
    channel="channel1",
    time=datetime.strptime("2019-08-12 12:00:00", "%Y-%m-%d %H:%M:%S"),
    deployment="docker"
)

point = Point(query)
point.run()

# Access event data
print(point.event)
# {'datatype': 'DBR_DOUBLE', 'datasize': 1, 'datahost': 'mya',
#  'data': {'d': '2019-08-12 11:55:22', 'v': 6.20794}}
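The event dict can be unpacked with the standard library, assuming the timestamp string keeps the format shown above. A sketch (the dict below just mirrors the example output):

```python
from datetime import datetime

# Shape mirrors the point.event dict shown above (values are illustrative)
event = {"datatype": "DBR_DOUBLE", "datasize": 1, "datahost": "mya",
         "data": {"d": "2019-08-12 11:55:22", "v": 6.20794}}

# Pull out the timestamp and value for further analysis
ts = datetime.strptime(event["data"]["d"], "%Y-%m-%d %H:%M:%S")
value = event["data"]["v"]
print(ts, value)
```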

Channel - Search for Channels

Discover available channels and their metadata using SQL-style pattern matching.

Note: Example assumes you are running the provided docker container.

from jlab_archiver_client import Channel, ChannelQuery

# Search for all channels starting with "channel10"
query = ChannelQuery(pattern="channel10%", deployment="docker")

channel = Channel(query)
channel.run()

# Access matching channels
print(channel.matches)
# [{'name': 'channel100', 'datatype': 'DBR_DOUBLE', 'datasize': 1, ...},
#  {'name': 'channel101', 'datatype': 'DBR_DOUBLE', 'datasize': 1, ...}]
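Since `matches` is a plain list of dicts, it can be filtered with ordinary Python. A sketch using a synthetic stand-in whose keys mirror the output shown above:

```python
# Synthetic stand-in for channel.matches (keys mirror the example output)
matches = [
    {"name": "channel100", "datatype": "DBR_DOUBLE", "datasize": 1},
    {"name": "channel101", "datatype": "DBR_ENUM", "datasize": 1},
]

# Keep only the double-typed channels
doubles = [m["name"] for m in matches if m["datatype"] == "DBR_DOUBLE"]
print(doubles)
# ['channel100']
```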

Command Line Tools

This package includes command-line tools for quick queries. After installation, use the --help or -h flag for usage information.

Command         Description
jac-interval    Query all events for a single PV over a time range
jac-mysampler   Regularly sample multiple PVs
jac-mystats     Compute statistical aggregations over time bins
jac-point       Retrieve a single event at or near a specific time
jac-channel     Search and discover available channel names and metadata
