Sport Data Valley client library for Python

Introduction

sdvclient is a Python client library for the Sport Data Valley platform. It is a thin wrapper around the platform's REST API (documented here).

Installation

If you are working in the Sport Data Valley JupyterHub environment, this library is automatically installed.

If you are working in a different Python environment, this library can be installed from PyPI:

pip install sdvclient

If you have previously installed the library and want to upgrade to a newer version:

pip install --upgrade sdvclient

Usage

To retrieve your own datasets:

import sdvclient as sdv


for dataset in sdv.my_datasets():
    # Do something
    pass

The dataset summaries returned by my_datasets() have attributes such as title, event_start, event_end, owner, sport and tags, among others.

dataset.sport
>>> "sports.riding"
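
As a small illustration of working with these summary attributes, the sketch below tallies dataset summaries per sport. The helper name count_by_sport is hypothetical and not part of sdvclient; it only assumes that each summary exposes the sport attribute shown above.

```python
from collections import Counter


def count_by_sport(datasets):
    """Count dataset summaries per value of their `sport` attribute.

    Hypothetical helper, not part of sdvclient. Summaries without a
    `sport` attribute are counted under None.
    """
    return Counter(getattr(dataset, "sport", None) for dataset in datasets)


# Intended use (requires a configured sdvclient):
#   import sdvclient as sdv
#   print(count_by_sport(sdv.my_datasets()))
```

The helper accepts any iterable, so it can be fed directly from sdv.my_datasets() or sdv.network_datasets().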

To retrieve data from your network:

import sdvclient as sdv


for dataset in sdv.network_datasets():
    # Do something
    pass

To retrieve data from a specific group in your network (see below for how to retrieve these groups):

import sdvclient as sdv


for dataset in sdv.group_datasets():
    # Do something
    pass

Limit the number of results

sdv.my_datasets(), sdv.network_datasets() and sdv.group_datasets() all accept an optional limit argument that limits the number of dataset summaries that are returned.

import sdvclient as sdv

for dataset in sdv.my_datasets(limit=10):
    # Process at most 10 datasets
    pass

Please note that if fewer datasets are available than the limit you specify, the number of returned dataset summaries is lower than limit.

Filter network data

sdv.network_datasets() accepts an optional query argument that can be used to filter the returned datasets:

import sdvclient as sdv

for dataset in sdv.network_datasets(query="strava"):
    # Process datasets that are matched by the "strava" query
    pass

Please note that the query argument filters on all the fields of a dataset. This means that filtering on the name of a user does not necessarily retrieve only that user's data, because the name may also occur elsewhere in datasets of other users.

N.B. The query argument is not available for sdv.my_datasets().
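
Because query matches on every field, you may want to narrow the results further on the client side. The sketch below keeps only the datasets whose owner passes a check. Both the helper name only_owned_by and the is_owner predicate are hypothetical: how User objects should be compared is not specified here, so that decision is left to the caller.

```python
def only_owned_by(datasets, is_owner):
    """Yield only the datasets whose `owner` satisfies `is_owner`.

    Hypothetical helper, not part of sdvclient. `is_owner` is a
    caller-supplied predicate that receives `dataset.owner` and
    returns True for the owner you are interested in.
    """
    for dataset in datasets:
        if is_owner(dataset.owner):
            yield dataset


# Intended use (requires a configured sdvclient; `looks_like_my_athlete`
# is a placeholder for your own check):
#   import sdvclient as sdv
#   matches = only_owned_by(sdv.network_datasets(query="strava"), looks_like_my_athlete)
```

The helper is a generator, so datasets are filtered lazily while the results are being iterated.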

Retrieve groups and connections

To retrieve the groups in your network:

import sdvclient as sdv

for group in sdv.groups():
    # Do something
    pass

To retrieve the connections in your network:

import sdvclient as sdv

for connection in sdv.connections():
    # Do something
    pass

Once you have found a connection whose data you want to retrieve, you can retrieve its datasets like this:

import sdvclient as sdv

for dataset in sdv.connection_datasets(user=connection):
    # Do something
    pass

The connection passed as the user argument is a User object that can come from sdv.connections() or from the dataset.owner attribute of a previously retrieved dataset.

Please be aware that this method uses sdv.network_datasets() under the hood and can therefore be slow when you have a lot of datasets from other connections.

Retrieving raw/full data

After you have retrieved a dataset summary, you can then continue to download the raw/full data from this dataset by calling the get_data() method on this object:

import sdvclient as sdv

for dataset in sdv.my_datasets():
    full_data = dataset.get_data()

Or you can retrieve the raw/full data directly if you know the dataset id:

import sdvclient as sdv

full_data = sdv.get_data(id=1337)

Every object that is returned from get_data() has attributes like title, event_start, event_end, owner, sport, type and tags, plus more fields depending on the data_type. For example, a dataset of type strava_type has a dataframe attribute that contains a pandas.DataFrame with the data from this dataset.

dataset.data_type
>>> "strava_type"
dataset.dataframe
>>> <pandas.DataFrame>

Strava data type

As mentioned above, datasets of type strava_type have a dataframe attribute with the corresponding data in a dataframe:

dataset.data_type
>>> "strava_type"
dataset.dataframe
>>> <pandas.DataFrame>

Questionnaire data type

Datasets of type questionnaire have a questions attribute which contains all the questions and answers in the questionnaire. For each entry, the question and the answer are available on the question and answer attributes, respectively.

dataset.questions[2].question
>>> "this is a question"

dataset.questions[2].answer
>>> "this is an answer"
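
To work with a whole questionnaire at once, the entries can be flattened into plain tuples. This is a minimal sketch: questions_to_pairs is a hypothetical helper, not part of sdvclient, and it only relies on the question and answer attributes described above.

```python
def questions_to_pairs(questions):
    """Return a list of (question, answer) tuples in questionnaire order.

    Hypothetical helper, not part of sdvclient. Each item is expected
    to expose `question` and `answer` attributes.
    """
    return [(item.question, item.answer) for item in questions]


# Intended use with a questionnaire dataset returned by get_data():
#   for question, answer in questions_to_pairs(dataset.questions):
#       print(question, "->", answer)
```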

Generic CSV data type

For generic tabular data such as CSV files, the returned dataset has a dataframe attribute with the corresponding data in a dataframe:

dataset.data_type
>>> "generic_csv_type"
dataset.dataframe
>>> <pandas.DataFrame>

Daily activity data type

For daily activity data that is coming from e.g. Fitbit or Polar, the returned dataset has a range of attributes:

  • steps
  • distance
  • calories
  • floors
  • sleep_start
  • sleep_end
  • sleep_duration
  • resting_heart_rate
  • minutes_sedentary
  • minutes_lightly_active
  • minutes_fairly_active
  • minutes_very_active

dataset.data_type
>>> "fitbit_type"
dataset.resting_heart_rate
>>> 58

Please note that not all attributes are always available; this depends on the platform and the device.
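
Since any of these attributes may be unavailable, it can help to read them defensively. The sketch below collects whatever is present into a dictionary. It is an assumption, not documented behaviour, that an unavailable attribute is either absent from the object or set to None; the helper name available_daily_activity is hypothetical.

```python
DAILY_ACTIVITY_FIELDS = (
    "steps",
    "distance",
    "calories",
    "floors",
    "sleep_start",
    "sleep_end",
    "sleep_duration",
    "resting_heart_rate",
    "minutes_sedentary",
    "minutes_lightly_active",
    "minutes_fairly_active",
    "minutes_very_active",
)


def available_daily_activity(dataset):
    """Return a dict with the daily activity attributes that have a value.

    Hypothetical helper, not part of sdvclient. Attributes that are
    missing or None (an assumption about how unavailable values are
    represented) are left out of the result.
    """
    values = {}
    for name in DAILY_ACTIVITY_FIELDS:
        value = getattr(dataset, name, None)
        if value is not None:
            values[name] = value
    return values
```

Calling available_daily_activity(dataset) on a daily activity dataset then yields only the fields the device actually reported.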

Unstructured data

Unstructured data is data (files) that Sport Data Valley does not know how to process. These files are stored "as is" in the platform and can be downloaded via this client library as well. Unstructured data has a file_response attribute that contains a requests.Response object.

dataset.data_type
>>> "unstructured"
dataset.file_response
>>> <Response [200]>

Read more about processing files downloaded with the Python requests library here. E.g. to process binary response content, see here.
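
As one way of handling binary response content, the file can simply be written to disk. This is a minimal sketch: save_file_response is a hypothetical helper, not part of sdvclient, and the target path is your own choice. It relies only on the content attribute of requests.Response, which holds the response body as bytes.

```python
from pathlib import Path


def save_file_response(response, path):
    """Write the raw bytes of a response to `path` and return the path.

    Hypothetical helper, not part of sdvclient. `response` is expected
    to be a requests.Response; its `content` attribute holds the body
    as bytes.
    """
    target = Path(path)
    target.write_bytes(response.content)
    return target


# Intended use with an unstructured dataset returned by get_data():
#   save_file_response(dataset.file_response, "downloaded_file.bin")
```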

Other data types

Although this library will be updated when new data types are added, it can happen that a specific data type is not fully supported yet. In that case the returned dataset is handled the same way as unstructured data, with a file_response attribute that contains a requests.Response object.

dataset.data_type
>>> "some new data type"
dataset.file_response
>>> <Response [200]>
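
Code that processes datasets of mixed types can check which payload attribute is present instead of enumerating every data_type value. The sketch below is one way to do that. It assumes that the attributes described in this document (dataframe, questions, file_response) are absent when they do not apply; payload_kind is a hypothetical helper, not part of sdvclient.

```python
def payload_kind(dataset):
    """Return the name of the payload attribute a full dataset carries.

    Hypothetical helper, not part of sdvclient. It assumes payload
    attributes are absent (rather than None) when they do not apply.
    Datasets without any of the three attributes, for example daily
    activity data, are reported as "other".
    """
    for name in ("dataframe", "questions", "file_response"):
        if hasattr(dataset, name):
            return name
    return "other"


# Intended use:
#   full_data = dataset.get_data()
#   if payload_kind(full_data) == "file_response":
#       raw_bytes = full_data.file_response.content
```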

Authentication

The library retrieves your API token from the SDV_AUTH_TOKEN environment variable. If you are working in the Sport Data Valley JupyterHub, this is automatically set. If you are working in a different environment, you can retrieve an API token from the "Advanced" page here and set it like this:

sdv.set_token("your API token here")
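
Outside JupyterHub it can be useful to check for the token up front, so that a missing token produces a clear error before any request is made. The sketch below only reads the SDV_AUTH_TOKEN environment variable named above; find_token is a hypothetical helper, not part of sdvclient.

```python
import os


def find_token(environ=os.environ):
    """Return the API token from SDV_AUTH_TOKEN, or raise if it is missing.

    Hypothetical helper, not part of sdvclient. `environ` defaults to
    the process environment and can be replaced by any mapping.
    """
    token = environ.get("SDV_AUTH_TOKEN")
    if not token:
        raise RuntimeError(
            "SDV_AUTH_TOKEN is not set; set it or call sdv.set_token()."
        )
    return token
```

If the variable is not set, the token from the "Advanced" page can still be passed to sdv.set_token() as shown above.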

Development

Adding Python versions

The supported Python versions are specified in pyproject.toml[tool.poetry.dependencies]#python. The Python versions that are tested are specified in pyproject.toml[tool.tox]#envlist and in Dockerfile.test. If you want to add a new supported Python version, or want to test against a newer version of an existing Python version, the versions at these locations need to be updated.

Contributors

License

See LICENSE file.
