Top-level package for xedocs.

xedocs is meant to replace cmt and bodega, and to help track all shared documents, especially those that need to be versioned.

What does xedocs give you?

Data reading

  • Read data from multiple formats (e.g. mongodb, pandas) and locations with a simple unified interface.

  • Custom logic implemented on the document class, e.g. creating a tensorflow model from the data.

  • Multiple APIs for reading data: functional, ODM style, pandas and xarray.

  • Read data as objects, dataframes, dicts, json.

Writing data

  • Write data to multiple storage backends with the same interface

  • Custom per-collection rules for data insertion, deletion and updating.

  • Schema validation and type coercion so storage has uniform and consistent data.
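
To illustrate what schema validation and type coercion mean in practice, here is a minimal sketch using only the standard library. It is not the xedocs implementation: the `GainRecord` class and the `_to_datetime` helper are hypothetical, the field names are borrowed from the pmt_gains schema shown under Basic Usage, and the values are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime


def _to_datetime(raw):
    # Hypothetical helper: keep datetime objects, parse ISO 8601 strings.
    return raw if isinstance(raw, datetime) else datetime.fromisoformat(raw)


@dataclass
class GainRecord:
    """Hypothetical stand-in for a schema class, not part of xedocs."""
    version: str
    time: datetime
    detector: str
    pmt: int
    value: float

    def __post_init__(self):
        # Coerce every field to its declared type. A value that cannot be
        # converted raises ValueError, so inconsistent data never gets stored.
        self.version = str(self.version)
        self.time = _to_datetime(self.time)
        self.detector = str(self.detector)
        self.pmt = int(self.pmt)
        self.value = float(self.value)


# Placeholder values: strings are coerced to int, float and datetime.
record = GainRecord(version="v1", time="2021-01-01T00:00:00",
                    detector="tpc", pmt="1", value="1.5")
```

Coercing at construction time means every record that reaches storage has the same types, whichever client produced it.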

Other

  • Custom panel widgets for graphical representation of data, web client

  • Auto-generated API server and client, plus OpenAPI documentation

  • CLI for viewing and downloading data

Basic Usage

Explore the available schemas

>>> import xedocs

>>> xedocs.list_schemas()
['detector_numbers',
 'fax_configs',
 'plugin_lineages',
 'context_lineages',
 'pmt_gains',
 'global_versions',
 'electron_drift_velocities',
 ...]

>>> xedocs.help('pmt_gains')

        Schema name: pmt_gains
        Index fields: ['version', 'time', 'detector', 'pmt']
        Column fields: ['created_date', 'comments', 'value']

Read and write data from the shared analyst database. This database is writable with the default analysis username/password.

import xedocs

db = xedocs.analyst_db()

docs = db.pmt_gains.find_docs(version='v1', pmt=[1,2,3,5], time='2021-01-01T00:00:00', detector='tpc')
gains = [doc.value for doc in docs]

doc = db.pmt_gains.find_one(version='v1', pmt=1, time='2021-01-01T00:00:00', detector='tpc')
pmt1_gain = doc.value
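
When several PMTs are queried at once, a plain list of values loses track of which gain belongs to which PMT. The sketch below keeps that association in a dictionary. It assumes each returned document exposes its index field as a `pmt` attribute, which seems likely from the schema description but is not confirmed here. `gains_by_pmt` is a hypothetical helper, and the stand-in documents carry made-up values.

```python
from types import SimpleNamespace


def gains_by_pmt(docs):
    """Map PMT number to gain value for an iterable of pmt_gains documents."""
    # Assumes every document has .pmt and .value attributes.
    return {doc.pmt: doc.value for doc in docs}


# Stand-in documents with made-up values, used only to exercise the helper.
# With a live database these would come from db.pmt_gains.find_docs(...).
fake_docs = [
    SimpleNamespace(pmt=1, value=1.0),
    SimpleNamespace(pmt=2, value=2.0),
]
lookup = gains_by_pmt(fake_docs)
```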

Read from the straxen processing database. This database is read-only with the default analysis username/password.

import xedocs

db = xedocs.straxen_db()

...

You can also query documents directly from the schema class. If no explicit datasource is given, schemas query the MongoDB analyst database by default.

from xedocs.schemas import DetectorNumber

drift_velocity = DetectorNumber.straxen_db.find_one(field='drift_velocity', version='v1')

# Returns a Bodega object with attributes value, description etc.
drift_velocity.value

all_v1_documents = DetectorNumber.straxen_db.find(version='v1')

Read data from alternative data sources specified by path, e.g. CSV files, which are loaded with pandas.

from xedocs.schemas import DetectorNumber

g1_doc = DetectorNumber.find_one(datasource='/path/to/file.csv', version='v1', field='g1')
g1_value = g1_doc.value
g1_error = g1_doc.uncertainty
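
For orientation, a CSV file that answers the query above would presumably need one column per queried label and per attribute read. The sketch below shows one plausible layout and mimics the lookup with the standard library. The column names are an assumption derived from the query, not a documented file format, `find_one_row` is a hypothetical helper rather than an xedocs function, and the numbers are placeholders.

```python
import csv
import io

# Hypothetical file contents; column names mirror the fields used in the query.
CSV_TEXT = """field,version,value,uncertainty
g1,v1,0.1,0.01
g2,v1,10.0,1.0
"""


def find_one_row(text, **labels):
    """Return the first row whose columns match all given labels, else None."""
    for row in csv.DictReader(io.StringIO(text)):
        if all(row[key] == str(wanted) for key, wanted in labels.items()):
            return row
    return None


row = find_one_row(CSV_TEXT, version="v1", field="g1")
g1_value_from_csv = float(row["value"])
```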

The path can also be a GitHub URL or any other URL supported by fsspec.

from xedocs.schemas import DetectorNumber

g1_doc = DetectorNumber.find_one(
                         datasource='github://org:repo@/path/to/file.csv',
                         version='v1',
                         field='g1')

Supported data sources

  • MongoDB collections

  • TinyDB tables

  • JSON files

  • REST API clients

Please open an issue on rframe if you want support for an additional data format.

Documentation

Full documentation is hosted on Read the Docs.

Credits

This package was created with Cookiecutter and the briggySmalls/cookiecutter-pypackage project template.
