Sport Data Valley client library for Python

Introduction

sdvclient is a Python client library for the Sport Data Valley platform. It is essentially a wrapper around the platform's REST API (documented here).

Installation

If you are working in the Sport Data Valley JupyterHub environment, this library is automatically installed.

If you are working in a different Python environment, this library can be installed from PyPI:

pip install sdvclient

If you have previously installed the library and want to upgrade to a newer version:

pip install --upgrade sdvclient

Usage

To retrieve your own datasets:

import sdvclient as sdv


for dataset in sdv.my_datasets():
    # Do something
    pass

The dataset summaries returned by my_datasets() have attributes such as title, event_start, event_end, owner, sport and tags, among others.

dataset.sport
>>> "sports.riding"
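
As a minimal sketch of working with these summaries, the helper below counts datasets per sport. It relies only on the sport attribute mentioned above. The helper name count_by_sport, the stand-in objects and the "sports.running" value are illustrative assumptions, not part of the library; with the real library you would pass sdv.my_datasets() instead of the stand-in list.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_sport(datasets):
    """Count dataset summaries per value of their `sport` attribute."""
    return Counter(dataset.sport for dataset in datasets)


# Stand-in objects so the sketch runs without network access or a token.
# The titles and the "sports.running" value are made up for illustration.
example_datasets = [
    SimpleNamespace(title="Morning ride", sport="sports.riding"),
    SimpleNamespace(title="Evening ride", sport="sports.riding"),
    SimpleNamespace(title="Track session", sport="sports.running"),
]

sport_counts = count_by_sport(example_datasets)
print(sport_counts.most_common())
```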

To retrieve data from your network:

import sdvclient as sdv


for dataset in sdv.network_datasets():
    # Do something
    pass

To retrieve data from a specific group in your network (see below for how to retrieve these groups):

import sdvclient as sdv


for dataset in sdv.group_datasets():
    # Do something
    pass

Limit the number of results

sdv.my_datasets(), sdv.network_datasets() and sdv.group_datasets() all accept an optional limit argument that limits the number of dataset summaries that are returned.

import sdvclient as sdv

for dataset in sdv.my_datasets(limit=10):
    # Process at most 10 datasets
    pass

Please note that if fewer datasets are available than the limit you specify, the number of returned dataset summaries is lower than limit.
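
The "at most limit" behaviour can be illustrated with a small stand-alone model. This is not the library's implementation; take_at_most is a hypothetical helper that only mimics the documented semantics on a plain list.

```python
from itertools import islice


def take_at_most(datasets, limit):
    """Return up to `limit` items; fewer if the source runs out first."""
    return list(islice(datasets, limit))


# Stand-in for the datasets that happen to be available.
available = ["dataset-1", "dataset-2", "dataset-3"]

capped = take_at_most(available, 2)       # limit is reached
exhausted = take_at_most(available, 10)   # fewer items than the limit
print(len(capped), len(exhausted))
```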

Filter network data

sdv.network_datasets() accepts an optional query argument that can be used to filter the returned datasets:

import sdvclient as sdv

for dataset in sdv.network_datasets(query="strava"):
    # Process datasets that are matched by the "strava" query
    pass

Please note that the query argument filters on all fields of a dataset. Filtering on the name of a user therefore does not necessarily return only that user's data, because the same name may also occur elsewhere in a different dataset.

N.B. The query argument is not available for sdv.my_datasets().
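
Because the query matches on every field, a stricter client-side check can be applied to the results afterwards. The sketch below keeps only datasets whose tags contain an exact value. It assumes that tags is an iterable of strings, which is not confirmed here; with_tag and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def with_tag(datasets, tag):
    """Yield only the datasets whose `tags` contain `tag` exactly."""
    for dataset in datasets:
        # Assumption: `tags` is an iterable of strings (or missing/None).
        if tag in (getattr(dataset, "tags", None) or ()):
            yield dataset


# Stand-in results, as a broad query on "strava" might return them.
query_results = [
    SimpleNamespace(title="Notes about my strava export", tags=["notes"]),
    SimpleNamespace(title="Sunday ride", tags=["strava", "cycling"]),
]

tagged_titles = [dataset.title for dataset in with_tag(query_results, "strava")]
print(tagged_titles)
```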

Retrieve groups and connections

To retrieve the groups in your network:

import sdvclient as sdv

for group in sdv.groups():
    # Do something
    pass

To retrieve the connections in your network:

import sdvclient as sdv

for connection in sdv.connections():
    # Do something
    pass

Once you have found a connection whose data you want to retrieve, you can do so like this:

import sdvclient as sdv

for dataset in sdv.connection_datasets(user=connection):
    # Do something
    pass

The user argument is a User object that can come from sdv.connections() or from the owner attribute of a dataset from a previous request.

Please be aware that this method uses sdv.network_datasets() under the hood and can therefore be slow when you have a lot of datasets from other connections.
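
If you need data for several connections, one pass over the network datasets grouped by owner may be cheaper than repeated per-connection calls. The following is a sketch under assumptions: group_by_owner is a hypothetical helper, and because the fields of User objects are not described here, the caller supplies owner_key. The id attribute on the stand-in owners is made up.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_owner(datasets, owner_key):
    """Group datasets by owner in a single pass.

    `owner_key` is a caller-supplied function that maps an owner object
    to a hashable key.
    """
    grouped = defaultdict(list)
    for dataset in datasets:
        grouped[owner_key(dataset.owner)].append(dataset)
    return dict(grouped)


# Stand-in data: the `id` attribute on the owners is hypothetical.
alice = SimpleNamespace(id=1)
bob = SimpleNamespace(id=2)
network = [
    SimpleNamespace(title="Ride A", owner=alice),
    SimpleNamespace(title="Run B", owner=bob),
    SimpleNamespace(title="Ride C", owner=alice),
]

datasets_per_owner = group_by_owner(network, owner_key=lambda owner: owner.id)
print({key: [d.title for d in items] for key, items in datasets_per_owner.items()})
```

With the real library, network would be the result of sdv.network_datasets().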

Retrieving raw/full data

After you have retrieved a dataset summary, you can download the raw/full data for that dataset by calling the get_data() method on the summary object:

import sdvclient as sdv

for dataset in sdv.my_datasets():
    full_data = dataset.get_data()

Or you can retrieve the raw/full data directly if you know the dataset id:

import sdvclient as sdv

full_data = sdv.get_data(id=1337)

Every object that is returned from get_data() has attributes like title, event_start, event_end, owner, sport, type and tags, plus additional fields that depend on the data_type. For example, a dataset whose data_type is strava_type has a dataframe attribute that contains a pandas.DataFrame with the data from this dataset.

dataset.data_type
>>> "strava_type"
dataset.dataframe
>>> <pandas.DataFrame>

Strava data type

As mentioned above, datasets of type strava_type have a dataframe attribute with the corresponding data in a pandas.DataFrame:

dataset.data_type
>>> "strava_type"
dataset.dataframe
>>> <pandas.DataFrame>

Questionnaire data type

Datasets of type questionnaire have a questions attribute which contains all the questions and answers in the questionnaire. For each question+answer pair, the question and the answer are available on the question and answer attributes, respectively.

dataset.questions[2].question
>>> "this is a question"

dataset.questions[2].answer
>>> "this is an answer"
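
A questionnaire can be flattened into plain (question, answer) pairs for further processing. This sketch uses only the questions, question and answer attributes described above; questionnaire_to_pairs and the stand-in dataset are hypothetical.

```python
from types import SimpleNamespace


def questionnaire_to_pairs(dataset):
    """Return a list of (question, answer) tuples in questionnaire order."""
    return [(item.question, item.answer) for item in dataset.questions]


# Stand-in for an object returned by get_data() for a questionnaire.
example_questionnaire = SimpleNamespace(
    data_type="questionnaire",
    questions=[
        SimpleNamespace(question="How did you sleep?", answer="Well"),
        SimpleNamespace(question="How tired are you?", answer="A little"),
    ],
)

pairs = questionnaire_to_pairs(example_questionnaire)
for question, answer in pairs:
    print(f"{question} -> {answer}")
```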

Generic CSV data type

For generic tabular data such as CSV files, the returned dataset has a dataframe attribute with the corresponding data in a dataframe:

dataset.data_type
>>> "generic_csv_type"
dataset.dataframe
>>> <pandas.DataFrame>

Daily activity data type

For daily activity data that is coming from e.g. Fitbit or Polar, the returned dataset has a range of attributes:

  • steps
  • distance
  • calories
  • floors
  • sleep_start
  • sleep_end
  • sleep_duration
  • resting_heart_rate
  • minutes_sedentary
  • minutes_lightly_active
  • minutes_fairly_active
  • minutes_very_active

dataset.data_type
>>> "fitbit_type"
dataset.resting_heart_rate
>>> 58

Please note that not all attributes are always available; this depends on the platform and device.
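
Since attributes can be unavailable, reading them defensively avoids errors. The sketch below collects the attributes listed above into a dictionary. Whether an unavailable attribute is absent or set to None is not stated here, so the code treats both cases the same; daily_activity_summary and the stand-in dataset are hypothetical.

```python
from types import SimpleNamespace

# Attribute names as listed in this section.
DAILY_ACTIVITY_ATTRIBUTES = (
    "steps", "distance", "calories", "floors",
    "sleep_start", "sleep_end", "sleep_duration", "resting_heart_rate",
    "minutes_sedentary", "minutes_lightly_active",
    "minutes_fairly_active", "minutes_very_active",
)


def daily_activity_summary(dataset):
    """Return the attributes that are present and not None as a dict."""
    values = {name: getattr(dataset, name, None) for name in DAILY_ACTIVITY_ATTRIBUTES}
    return {name: value for name, value in values.items() if value is not None}


# Stand-in dataset from a device that reports only a few fields.
example_day = SimpleNamespace(
    data_type="fitbit_type", steps=8500, resting_heart_rate=58, floors=None
)

summary = daily_activity_summary(example_day)
print(summary)
```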

Unstructured data

Unstructured data is data (files) that Sport Data Valley does not know how to process. These files are stored "as is" in the platform and can be downloaded via this client library as well. Unstructured data has a file_response attribute that contains a requests.Response object.

dataset.data_type
>>> "unstructured"
dataset.file_response
>>> <Response [200]>

Read more about processing files downloaded with the Python requests library here. For example, to process binary response content, see here.
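
One common way to handle such a response is to write its raw bytes to disk. The sketch below uses the content attribute of requests.Response, which holds the body as bytes. save_file_response is a hypothetical helper, and a stand-in object with the same attribute is used so the example runs without a download.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_file_response(response, destination):
    """Write the raw bytes of a requests.Response-like object to `destination`."""
    path = Path(destination)
    path.write_bytes(response.content)
    return path


# Stand-in with the same `content` attribute (bytes) as requests.Response.
fake_response = SimpleNamespace(status_code=200, content=b"raw file bytes")

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_file_response(fake_response, Path(tmp_dir) / "download.bin")
    saved_bytes = saved_path.read_bytes()

print(len(saved_bytes))
```

With the real library you would pass dataset.file_response instead of fake_response.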

Other data types

Although this library will be updated when new data types are added, it can happen that a specific data type is not fully supported yet. In that case the returned dataset is handled the same way as unstructured data: it has a file_response attribute that contains a requests.Response object.

dataset.data_type
>>> "some new data type"
dataset.file_response
>>> <Response [200]>
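
Code that handles several data types can check which attribute a dataset carries instead of enumerating every data_type value. This is a sketch under the assumption that a dataset exposes only the attributes relevant to its type; describe_payload and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def describe_payload(dataset):
    """Return the name of the payload attribute this dataset exposes."""
    # Checked in order: tabular data, questionnaires, raw files.
    for attribute in ("dataframe", "questions", "file_response"):
        if hasattr(dataset, attribute):
            return attribute
    return "unknown"


# Stand-ins for datasets of different types.
tabular = SimpleNamespace(data_type="generic_csv_type", dataframe=object())
survey = SimpleNamespace(data_type="questionnaire", questions=[])
new_type = SimpleNamespace(data_type="some new data type", file_response=object())

payload_kinds = [describe_payload(d) for d in (tabular, survey, new_type)]
print(payload_kinds)
```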

Authentication

The library retrieves your API token from the SDV_AUTH_TOKEN environment variable. If you are working in the Sport Data Valley JupyterHub, this is automatically set. If you are working in a different environment, you can retrieve an API token from the "Advanced" page here and set it like this:

sdv.set_token("your API token here")
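
Outside the JupyterHub it can help to check up front whether the environment variable is set. The sketch below only inspects SDV_AUTH_TOKEN; token_is_configured is a hypothetical helper, and the moment at which the library itself reads the variable is not stated here.

```python
import os

TOKEN_VARIABLE = "SDV_AUTH_TOKEN"


def token_is_configured(environ=None):
    """Return True if the token variable is set to a non-blank value."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(TOKEN_VARIABLE, "").strip())


if not token_is_configured():
    print(f"{TOKEN_VARIABLE} is not set; call sdv.set_token() with your API token.")
```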

Development

Adding Python versions

The supported Python versions are specified in pyproject.toml[tool.poetry.dependencies]#python. The Python versions that are tested are specified in pyproject.toml[tool.tox]#envlist and in Dockerfile.test. If you want to add a new supported Python version, or want to test against a newer release of an already supported Python version, the versions at all of these locations need to be updated.

Contributors

License

See LICENSE file.
