
Python Client for European XFEL Metadata Catalogue Web App available at https://in.xfel.eu/metadata

Project description

myMdC is the web application designed for data management at European XFEL.

This library (metadata_client) is a client for the RESTful APIs exposed by the European XFEL Metadata Catalogue Web Application - myMdC (https://in.xfel.eu/metadata).

Repository:

Project policy:

  • Security reporting: see SECURITY.md

  • Contribution guidelines: see CONTRIBUTING.md

Dependencies:

Public Usage Notes

  • The package itself is public and can be installed/imported from public environments.

  • The primary production service URL (https://in.xfel.eu/metadata) is an internal European XFEL service.

  • Integration tests in this repository need either:

    • access to an internal myMdC deployment, or

    • a locally deployed metadata_catalog instance with valid OAuth client credentials.

Installation

Python project

  1. Install the requirements, if you have not done so before

1.1. For OS X distributions:

1.1.1. Homebrew

      brew install python3

1.1.2. MacPorts

      sudo port install python310

      sudo port select --set python3 python310

      sudo port install py310-pip
      sudo port select --set pip pip310

1.2. For Linux distributions (Debian/Ubuntu, using apt-get):

sudo apt-get update
sudo apt-get install python3.10 python3.10-venv
  2. Make the metadata_client library available in your Python environment

2.1. Install it via pip:

# Use venv
python3.10 -m venv .venv
source .venv/bin/activate

# Install the package from the local source tree (dependencies come from PyPI)
pip install .

# Force re-installation of packages
pip install . --ignore-installed

# Installing the dependencies from local wheel files is also possible,
# if they were generated manually beforehand with:
#   pip wheel --wheel-dir=./external_dependencies .
pip install . --no-index --find-links ./external_dependencies/

2.2. Offline/public smoke test (no service calls):

python -c "import metadata_client as m; print(m.__version__)"

Installation places two folders in the site-packages folder of the current Python environment:

  • metadata_client with the sources;

  • metadata_client-4.2.1.dist-info/ with the wheel metadata files.

To identify your Python site-packages folder run:

python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])"
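The same check can be done from Python without importing the package itself. The sketch below uses only the standard library; the helper name installed_version is illustrative and not part of metadata_client.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "metadata_client" is the distribution name used in this README.
print(installed_version("metadata_client"))
```

A None result means the distribution is not visible in the active environment, which usually points at a virtual environment that was not activated.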

Usage

To use this project you need to import it:

from metadata_client import MetadataClient
  1. Connecting to myMdC (Metadata Catalogue):

    from metadata_client import MetadataClient
    
    # Configuration variables needed to establish a connection.
    # Go to https://in.xfel.eu/metadata/oauth/applications to obtain the
    # client key and secret for the metadata catalogue.
    user_id = 'PUT_HERE_YOUR_CLIENT_KEY'
    user_secret = 'PUT_HERE_YOUR_SECRET_KEY'
    user_email = 'PUT_HERE_YOUR_CLIENT_CONTACT_EMAIL'
    #
    metadata_web_app_url = 'https://in.xfel.eu/metadata'
    token_url = 'https://in.xfel.eu/metadata/oauth/token'
    refresh_url = 'https://in.xfel.eu/metadata/oauth/token'
    auth_url = 'https://in.xfel.eu/metadata/oauth/authorize'
    scope = ''
    base_api_url = 'https://in.xfel.eu/metadata/api/'
    
    # Generate the connection (example with the minimum set of parameters)
    client_conn = MetadataClient(client_id=user_id,
                                 client_secret=user_secret,
                                 user_email=user_email,
                                 token_url=token_url,
                                 refresh_url=refresh_url,
                                 auth_url=auth_url,
                                 scope=scope,
                                 base_api_url=base_api_url)
    
    # Generate the connection (example with all available parameters)
    client_conn = MetadataClient(client_id=user_id,
                                 client_secret=user_secret,
                                 user_email=user_email,
                                 token_url=token_url,
                                 refresh_url=refresh_url,
                                 auth_url=auth_url,
                                 scope=scope,
                                 base_api_url=base_api_url,
                                 session_token=None,
                                 max_retries=3,
                                 timeout=12,
                                 ssl_verify=True)
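To keep the client key and secret out of source files, the keyword arguments shown above can be assembled from environment variables. This is a minimal sketch: the variable names MDC_CLIENT_ID, MDC_CLIENT_SECRET and MDC_USER_EMAIL and the helper connection_kwargs are hypothetical, and the URL layout is copied from the example above.

```python
import os

# Hypothetical variable names; choose whatever fits your deployment.
ENV_TO_KWARG = {
    "MDC_CLIENT_ID": "client_id",
    "MDC_CLIENT_SECRET": "client_secret",
    "MDC_USER_EMAIL": "user_email",
}


def connection_kwargs(base_url, environ=None):
    """Build MetadataClient keyword arguments from a base URL and the environment."""
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_TO_KWARG if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(sorted(missing)))
    base_url = base_url.rstrip("/")
    kwargs = {kwarg: environ[name] for name, kwarg in ENV_TO_KWARG.items()}
    # URL layout as used in the connection example of this README.
    kwargs.update(
        token_url=base_url + "/oauth/token",
        refresh_url=base_url + "/oauth/token",
        auth_url=base_url + "/oauth/authorize",
        scope="",
        base_api_url=base_url + "/api/",
    )
    return kwargs
```

With the variables exported, the connection could then be created with MetadataClient(**connection_kwargs('https://in.xfel.eu/metadata')).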
  2. Interacting with myMdC (Metadata Catalogue):

2.1 Example data_group_types:

all_group_types = client_conn.get_all_data_group_types()

all_group_types
# >>> {'success': True,
#      'pagination': {'Date': 'Tue, 10 May 2022 22:48:14 GMT', 'X-Total-Pages': '1', 'X-Count-Per-Page': '100', 'X-Current-Page': '1', 'X-Total-Count': '6'},
#      'data': [{'description': '', 'identifier': 'RAW', 'name': 'Raw', 'flg_available': True, 'id': 1},
#               {'description': '', 'identifier': 'CAL', 'name': 'Calibration', 'flg_available': True, 'id': 2},
#               {'description': '', 'identifier': 'PROC', 'name': 'Processed', 'flg_available': True, 'id': 3},
#               {'description': '', 'identifier': 'REDU', 'name': 'Reduced', 'flg_available': True, 'id': 4},
#               {'description': '', 'identifier': 'SIM', 'name': 'Simulation', 'flg_available': True, 'id': 5},
#               {'description': '', 'identifier': 'UNK', 'name': 'Unknown', 'flg_available': True, 'id': 6}],
#      'app_info': {},
#      'info': 'Got data_group_type successfully'}

all_group_types['success']
# >>> True

all_group_types['pagination']
# >>> {'Date': 'Wed, 11 May 2022 09:55:34 GMT', 'X-Total-Pages': '1', 'X-Count-Per-Page': '100', 'X-Current-Page': '1', 'X-Total-Count': '6'}

all_group_types['data'][0]
# >>> {'description': '', 'identifier': 'RAW', 'name': 'Raw', 'flg_available': True, 'id': 1}

all_group_types['data'][0]['name']
# >>> 'Raw'
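Since every response shown here is a dictionary with 'success', 'info' and 'data' keys, a small helper can turn an unsuccessful call into an exception. This is a sketch under that assumption; MdcRequestError and unwrap are hypothetical names, not part of metadata_client.

```python
class MdcRequestError(RuntimeError):
    """Hypothetical error type for unsuccessful myMdC responses."""


def unwrap(resp):
    """Return resp['data'] if the call succeeded, otherwise raise with resp['info']."""
    if not resp.get("success"):
        raise MdcRequestError(resp.get("info", "request failed"))
    return resp["data"]


# A response shaped like the example above, shortened to one record.
example = {
    "success": True,
    "info": "Got data_group_type successfully",
    "data": [{"identifier": "RAW", "name": "Raw", "id": 1}],
}
print([item["identifier"] for item in unwrap(example)])  # prints ['RAW']
```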

2.2 Example instruments:

all_xfel_instruments = client_conn.get_all_xfel_instruments()

for instrument in all_xfel_instruments['data']:
    print('id = {0} | name = {1}'.format(instrument['id'], instrument['name']))
# id = -1 | name = test-instrument
# id = 1 | name = SPB/SFX SASE1
# id = 2 | name = FXE SASE1
# id = 3 | name = SQS SASE3
# id = 4 | name = SCS SASE3
# id = 5 | name = MID SASE2
# id = 6 | name = HED SASE2
# id = 7 | name = Hera South Detector Test Stand
# id = 8 | name = SASE1 Test Stand
# id = 9 | name = SASE2 Test Stand
# id = 10 | name = SASE3 Test Stand

all_xfel_instruments = client_conn.get_all_xfel_instruments(page=1, page_size=1)
all_xfel_instruments

# >>> {'success': True,
#      'info': 'Got instrument successfully',
#      'app_info': {},
#      'pagination': {'Date': 'Wed, 11 May 2022 09:57:45 GMT', 'X-Total-Pages': '21', 'X-Count-Per-Page': '1', 'X-Current-Page': '1', 'X-Total-Count': '21'},
#      'data': [{'id': 1, 'name': 'SPB/SFX SASE1', 'identifier': 'SPB',
#                'url': 'https://www.xfel.eu/facility/instruments/spb_sfx',
#                'instrument_leader_id': 230, 'deputy_instrument_leader_id': 1018,
#                'facility_id': 1, 'instrument_type_id': 2, 'repository_id': 103, 'topic_id': 1,
#                'dsg_host': None, 'system_user': None, 'flg_online_resource': True,
#                'online_script': 'make_online', 'flg_available': True,
#                'description': 'The Single Particles, Clusters, and Biomolecules & Serial Femtosecond Crystallography (SPB/SFX) instrument of the European XFEL is primarily concerned with three-dimensional diffractive imaging, and three-dimensional structure determination, of micrometre-scale and smaller objects, at atomic or near-atomic resolution.',
#                'doi': None,
#                'techniques': [{'id': 250, 'identifier': 'PaNET01168', 'name': 'serial femtosecond crystallography', 'url': 'http://purl.org/pan-science/PaNET/PaNET01168', 'flg_available': True, 'description': None},
#                               {'id': 259, 'identifier': 'PaNET01188', 'name': 'small angle x-ray scattering', 'url': 'http://purl.org/pan-science/PaNET/PaNET01188', 'flg_available': True, 'description': None},
#                               {'id': 364, 'identifier': 'PaNET01101', 'name': 'x-ray powder diffraction', 'url': 'http://purl.org/pan-science/PaNET/PaNET01101', 'flg_available': True, 'description': None},
#                               {'id': 28, 'identifier': 'PaNET01174', 'name': 'coherent diffraction imaging', 'url': 'http://purl.org/pan-science/PaNET/PaNET01174', 'flg_available': True, 'description': None}]}]}
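The 'X-Total-Pages' value in 'pagination' can drive a loop that collects all pages. The generator below is a sketch, assuming the called method accepts page and page_size keywords and returns 'data' as a list, as get_all_xfel_instruments does above; iter_all_pages and fake_fetch are illustrative names.

```python
def iter_all_pages(fetch, page_size=100):
    """Yield every record by calling fetch(page=..., page_size=...) page by page."""
    page = 1
    while True:
        resp = fetch(page=page, page_size=page_size)
        if not resp.get("success"):
            raise RuntimeError(resp.get("info", "request failed"))
        yield from resp["data"]
        # Pagination values arrive as strings in the examples above.
        total_pages = int(resp["pagination"]["X-Total-Pages"])
        if page >= total_pages:
            break
        page += 1


def fake_fetch(page=1, page_size=100):
    """Stand-in for a client method, serving five records without any service call."""
    records = [{"id": i} for i in range(1, 6)]
    start = (page - 1) * page_size
    total_pages = -(-len(records) // page_size)  # ceiling division
    return {
        "success": True,
        "data": records[start:start + page_size],
        "pagination": {"X-Total-Pages": str(total_pages)},
    }


print([r["id"] for r in iter_all_pages(fake_fetch, page_size=2)])  # prints [1, 2, 3, 4, 5]
```

Against a real connection, the call would look like iter_all_pages(client_conn.get_all_xfel_instruments, page_size=50).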

2.3 Get instrument active proposal:

active_proposal = client_conn.get_active_proposal_by_instrument(1)

2.4 Register Run replica:

# (e.g. proposal_number == 1234)
# (e.g. run_number == 12)
# (e.g. repository_identifier == 'XFEL_GPFS_OFFLINE_RAW_CC')

resp = client_conn.register_run_replica(
    proposal_number, run_number, repository_identifier
)
# resp = {'success': True,
#         'info': 'Run replica registered successfully',
#         'pagination': {'Date': 'Tue, 10 May 2022 22:48:14 GMT', 'X-Total-Pages': '1', 'X-Count-Per-Page': '100', 'X-Current-Page': '1', 'X-Total-Count': '6'},
#         'data': {'experiment_id': '-1',
#                  'sample_id': '-1',
#                  'run_id': '1588',
#                  'data_group_id': '777'},
#         'app_info': {}}

2.5 Unregister Run replica:

# (e.g. proposal_number == 1234)
# (e.g. run_number == 12)
# (e.g. repository_identifier == 'XFEL_GPFS_OFFLINE_RAW_CC')

resp = client_conn.unregister_run_replica(
    proposal_number, run_number, repository_identifier
)
# resp = {'success': True,
#         'info': 'Run replica unregistered successfully',
#         'pagination': {'Date': 'Tue, 10 May 2022 22:48:14 GMT', 'X-Total-Pages': '1', 'X-Count-Per-Page': '100', 'X-Current-Page': '1', 'X-Total-Count': '6'},
#         'data': {'data_group_id': '-1',
#                  'repository_id': '1',
#                  'flg_available': 'false'},
#         'app_info': {}}
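In the register and unregister examples the values under 'data' are strings ('1588', 'false'). If the server really returns them this way, they can be converted before further use. The helper below is a sketch based only on the key naming seen in those examples; normalise_replica_data is a hypothetical name.

```python
def normalise_replica_data(data):
    """Convert '*_id' values to int and 'flg_*' values to bool; keep the rest."""
    converted = {}
    for key, value in data.items():
        if key.endswith("_id"):
            converted[key] = int(value)
        elif key.startswith("flg_"):
            converted[key] = str(value).lower() == "true"
        else:
            converted[key] = value
    return converted


# 'data' as shown in the unregister example above.
print(normalise_replica_data(
    {"data_group_id": "-1", "repository_id": "1", "flg_available": "false"}
))
```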

2.6 Get proposal’s runs:

# (e.g. proposal_number == 1234)
# (e.g. page == 1 | Default == 1)
# (e.g. page_size == 5 | Default == 100 | Limit: 500)

resp = client_conn.get_proposal_runs(proposal_number, page=1, page_size=5)
# RESPONSE example
#
# resp = {'info': 'Got proposal successfully',
#         'success': True,
#         'pagination': {'Date': 'Tue, 10 May 2022 22:48:14 GMT',
#                        'X-Total-Pages': '1',
#                        'X-Count-Per-Page': '100',
#                        'X-Current-Page': '1',
#                        'X-Total-Count': '6'},
#         'data': {
#           'proposal': {
#               'id': -1,
#               'number': 0,
#               'title': 'Proposal Title 001'
#                  },
#           'runs': [
#               {
#               'id': -1,
#               'run_number': 1,
#               'flg_status': 1,
#               'flg_run_quality': -1,
#               'size': None,
#               'num_files': 0,
#               'repositories': {
#                   'XFEL_TESTS_REPO': {
#                       'name': 'XFEL Tests Repository',
#                       'mount_point': '/webstorage/XFEL',
#                       'data_groups': 1
#                       }
#                   }
#               }
#            ]
#          },
#         'app_info': {}}
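The nested 'runs' and 'repositories' structure can be flattened into a lookup from repository identifier to run numbers. This sketch assumes the response has exactly the shape of the example above; runs_by_repository is an illustrative helper, not a library function.

```python
def runs_by_repository(resp):
    """Map each repository identifier to the sorted run numbers stored in it."""
    mapping = {}
    for run in resp["data"]["runs"]:
        for repo_identifier in run.get("repositories", {}):
            mapping.setdefault(repo_identifier, []).append(run["run_number"])
    return {repo: sorted(numbers) for repo, numbers in mapping.items()}


# Response shaped like the example above, reduced to the fields used here.
example_resp = {
    "success": True,
    "data": {
        "proposal": {"id": -1, "number": 0, "title": "Proposal Title 001"},
        "runs": [
            {
                "id": -1,
                "run_number": 1,
                "repositories": {
                    "XFEL_TESTS_REPO": {
                        "name": "XFEL Tests Repository",
                        "mount_point": "/webstorage/XFEL",
                        "data_groups": 1,
                    }
                },
            }
        ],
    },
}
print(runs_by_repository(example_resp))  # prints {'XFEL_TESTS_REPO': [1]}
```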

2.7 Get proposal’s samples:

# (e.g. proposal_number == 1234)
# (e.g. page == 1 | Default == 1)
# (e.g. page_size == 50 | Default == 100 | Limit: 500)

resp = client_conn.get_proposal_samples(proposal_number, page=1, page_size=50)
#
# RESPONSE example
#
# resp = {'info': 'Got sample successfully',
#         'success': True,
#         'pagination': {'Date': 'Tue, 10 May 2022 22:48:14 GMT',
#                        'X-Total-Pages': '1',
#                        'X-Count-Per-Page': '100',
#                        'X-Current-Page': '1',
#                        'X-Total-Count': '6'},
#         'data': [{'id': -1,
#                   'name': 'TestSample DO NOT DELETE!',
#                   'proposal_id': -1,
#                   'sample_type_id': 1,
#                   'flg_available': True,
#                   'url': '',
#                   'description': ''}],
#         'app_info': {}}

For additional examples, see metadata_client/tests. Most tests are integration-oriented and require reachable myMdC endpoints plus valid OAuth credentials.

Development & Testing

When developing, and before committing changes, please validate that:

  1. All tests still pass (to check this, run pytest):

    # Go to the source code directory
    cd metadata_client
    
    # Use venv
    python3.14 -m venv .venv
    
    source .venv/bin/activate
    
    # Upgrade package and all its required packages
    pip install . -U --upgrade-strategy eager
    
    # Install test dependencies
    pip install '.[test]' -U --upgrade-strategy eager
    
    # Create local test secrets from the template
    cp metadata_client/tests/common/secrets_example.py metadata_client/tests/common/secrets.py
    
    # Run all tests using pytest (requires reachable myMdC endpoint)
    pytest metadata_client/tests
    
    # When running the tests against a plain-HTTP (non-TLS) deployment
    OAUTHLIB_INSECURE_TRANSPORT=1 pytest metadata_client/tests
    
    # Run all tests and get information about coverage for all files inside metadata_client package
    pytest --cov metadata_client --cov-report term-missing
  2. The code still follows the pycodestyle conventions (to check this, run pycodestyle):

    # Check the whole tree
    pycodestyle .
    # Or skip the virtual environment folder created above
    pycodestyle . --exclude .venv
  3. To generate the wheel files for the package and all its dependencies, execute:

    # Generate wheels for the package itself and its dependencies
    pip wheel --wheel-dir=./external_dependencies .
    pip wheel --wheel-dir=./external_dependencies --find-links=./external_dependencies .
  4. Check that the external_dependencies folder contains the dependency versions you want, since not all dependency versions are pinned in setup.py.

  5. (Optional) Rewrite git history to remove previously committed values:

    # Replace each old value with a neutral marker
    scripts/rewrite_history_remove_values.sh \
      "OLD_CLIENT_ID_VALUE" "REDACTED_CLIENT_ID" \
      "OLD_CLIENT_SECRET_VALUE" "REDACTED_CLIENT_SECRET"
    
    # Push rewritten history
    git push --force-with-lease origin <branch>

Dependency Policy

  • Runtime dependencies are declared in setup.py.

  • requests is constrained to >=2.30,<3 to avoid unplanned major-version breakage.

  • Other runtime dependencies currently keep looser constraints and are validated through CI and release checks.

  • Offline wheel bundles in external_dependencies/ are operational artifacts and should not be committed.

Registering library on https://pypi.org

This project uses tag-driven publishing from GitLab CI with PyPI Trusted Publishing (OIDC), so no long-lived PyPI API token is required in CI.

To prepare a release locally (before creating a tag):

# Install build and twine
python -m pip install --upgrade build twine

# Generate the source distribution (.tar.gz) and wheel (.whl) files in the dist/ folder
# using the PEP 517 build system defined in pyproject.toml
# (instead of the legacy `python setup.py sdist` and `python setup.py bdist_wheel`)
python -m build

# (Optional/manual) Upload the new version's .tar.gz and .whl files
twine upload dist/*

# If a trial run is needed, upload to test.pypi.org instead
twine upload --repository-url https://test.pypi.org/legacy/ dist/* --verbose

Release Validation (Tags)

On GitLab CI, tag pipelines run an additional release-validation job that:

  • verifies the git tag matches metadata_client.__version__;

  • verifies HISTORY.rst has an entry for that exact version;

  • builds source/wheel artifacts and runs twine check;

  • installs the generated wheel in a clean virtual environment and validates that the installed version matches the tagged version;

  • uploads artifacts to PyPI using OIDC Trusted Publishing.

All pipelines also run an automated secret scan (gitleaks) to detect accidental credential leaks before merge/release.

Recommended tag format: X.Y.Z.
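The first two checks of the release-validation job can be approximated locally before pushing a tag. The sketch below is not the CI job's actual implementation; validate_release is a hypothetical helper, and it assumes that each HISTORY.rst entry begins a line with the version number.

```python
import re

TAG_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def validate_release(tag, package_version, history_text):
    """Return a list of problems; an empty list means the tag looks releasable."""
    problems = []
    if not TAG_PATTERN.fullmatch(tag):
        problems.append("tag %r is not in X.Y.Z format" % tag)
    if tag != package_version:
        problems.append("tag %r != package version %r" % (tag, package_version))
    # Assumption: a HISTORY.rst entry starts a line with the exact version,
    # not followed by further digits or dots (so 4.2.1 does not match 4.2.10).
    entry = re.compile(r"^%s(?![\w.])" % re.escape(tag), re.MULTILINE)
    if not entry.search(history_text):
        problems.append("no HISTORY.rst entry found for %s" % tag)
    return problems


history = "History\n=======\n\n4.2.1 (unreleased)\n------------------\n"
print(validate_release("4.2.1", "4.2.1", history))  # prints []
```

In practice package_version would come from metadata_client.__version__ and history_text from reading HISTORY.rst.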

Trusted Publishing Setup

Before first release, configure publishing as follows:

  • For projects hosted on https://gitlab.com, configure a PyPI Trusted Publisher using environment name pypi and audience pypi.

  • For self-managed GitLab instances (for example https://git.xfel.eu), set a masked CI variable PYPI_API_TOKEN (or TWINE_PASSWORD) with a PyPI API token and use token-based upload in CI.

  • If the token variable is marked Protected, ensure the release tag is also protected and the protected tag rule matches your release tags (for example * for all tags, or 4.2.* for tags like 4.2.1).

If you want to publish tags to TestPyPI instead, set PYPI_REPOSITORY_URL=https://test.pypi.org/legacy/. For gitlab.com OIDC Trusted Publishing, use audience testpypi.
