Sport Data Valley client library for Python

Introduction

sdvclient is a Python client library for the Sport Data Valley platform. It is essentially a wrapper around the platform's REST API (documented here).

Installation

If you are working in the Sport Data Valley JupyterHub environment, this library is automatically installed.

If you are working in a different Python environment, this library can be installed from PyPI:

pip install sdvclient

If you have previously installed the library and want to upgrade to a newer version:

pip install --upgrade sdvclient

Usage

import sdvclient as sdv


for dataset in sdv.my_datasets():
    # Do something
    pass

The dataset summaries that are returned from my_datasets() have attributes like title, event_start, event_end, owner, sport, tags and more...

dataset.sport
>>> "sports.riding"
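
As an illustration of working with these attributes, the sketch below counts dataset summaries per sport. It only relies on the sport attribute described above. The helper name count_by_sport, the stand-in objects and the value "sports.running" are hypothetical and only there so the example runs without access to the platform.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_sport(datasets):
    """Count dataset summaries per value of their `sport` attribute."""
    return Counter(dataset.sport for dataset in datasets)


# Stand-in summaries (hypothetical values) so the sketch runs offline.
example_summaries = [
    SimpleNamespace(title="Morning ride", sport="sports.riding"),
    SimpleNamespace(title="Evening ride", sport="sports.riding"),
    SimpleNamespace(title="Track session", sport="sports.running"),
]
sport_counts = count_by_sport(example_summaries)
```

With the library installed, the same helper should accept the result of sdv.my_datasets() directly, since it only iterates over the summaries and reads their sport attribute.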

To retrieve data from your network:

import sdvclient as sdv


for dataset in sdv.network_datasets():
    # Do something
    pass

Limit the number of results

Both sdv.my_datasets() and sdv.network_datasets() accept an optional limit argument that can be used to limit the number of dataset summaries that are returned.

import sdvclient as sdv

for dataset in sdv.my_datasets(limit=10):
    # Process maximum 10 datasets
    pass

Please note that if fewer datasets are available than the limit you specify, the number of returned dataset summaries is lower than limit.

Filter network data

sdv.network_datasets() accepts an optional query argument that can be used to filter the returned datasets:

import sdvclient as sdv

for dataset in sdv.network_datasets(query="strava"):
    # Process datasets that are matched by the "strava" query
    pass

Please note that the query argument is matched against all fields of a dataset. Filtering on the name of a user therefore does not necessarily return only that user's data, because the name may also occur in any field of another user's dataset.

N.B. The query argument is not available for sdv.my_datasets().
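
Because the query matches every field, you may want to narrow the results down on the client side. The sketch below is one possible way to do that. It assumes that converting the chosen attribute to text with str() yields something searchable; how owner is actually represented by the library is not specified here, so treat the plain-string owner values and the helper name filter_by_attribute as hypothetical.

```python
from types import SimpleNamespace


def filter_by_attribute(datasets, attribute, needle):
    """Keep datasets whose `attribute`, as text, contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [
        dataset
        for dataset in datasets
        if needle in str(getattr(dataset, attribute, "")).lower()
    ]


# Stand-ins for what a query such as "alice" might return: the second
# dataset only mentions the name in its title, not in its owner.
query_results = [
    SimpleNamespace(title="Intervals", owner="Alice"),
    SimpleNamespace(title="Ride with Alice", owner="Bob"),
]
owned_by_alice = filter_by_attribute(query_results, "owner", "alice")
```

In real use, query_results would be replaced by list(sdv.network_datasets(query="alice")).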

Retrieving raw/full data

After you have retrieved a dataset summary, you can download the raw/full data of that dataset by calling the get_data() method on the summary object:

import sdvclient as sdv

for dataset in sdv.my_datasets():
    full_data = dataset.get_data()

Or you can retrieve the raw/full data directly if you know the dataset id:

import sdvclient as sdv

full_data = sdv.get_data(id=1337)

Every object that is returned from get_data() has attributes like title, event_start, event_end, owner, sport, type, tags, and further fields that depend on the data_type. For example, a dataset with data_type strava_type has a dataframe attribute that contains a pandas.DataFrame with the data from this dataset.

dataset.data_type
>>> "strava_type"
dataset.dataframe
>>> <pandas.DataFrame>

Strava data type

As mentioned above, datasets of type strava_type have a dataframe attribute that holds the corresponding data as a pandas.DataFrame:

dataset.data_type
>>> "strava_type"
dataset.dataframe
>>> <pandas.DataFrame>

Questionnaire data type

Datasets of type questionnaire have a questions attribute which contains all the questions and answers in the questionnaire. For each entry, the question and the answer are available on the question and answer attributes, respectively.

dataset.questions[2].question
>>> "this is a question"

dataset.questions[2].answer
>>> "this is an answer"
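
To work with a whole questionnaire at once, the entries can be flattened into (question, answer) pairs. This is a minimal sketch that only uses the questions, question and answer attributes described above; the helper name questionnaire_to_pairs and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def questionnaire_to_pairs(dataset):
    """Return a list of (question, answer) tuples, in questionnaire order."""
    return [(item.question, item.answer) for item in dataset.questions]


# Stand-in questionnaire dataset so the sketch runs without platform access.
example_questionnaire = SimpleNamespace(
    data_type="questionnaire",
    questions=[
        SimpleNamespace(question="this is a question", answer="this is an answer"),
        SimpleNamespace(question="another question", answer="another answer"),
    ],
)
pairs = questionnaire_to_pairs(example_questionnaire)
```

With the real library, the argument would be an object returned from get_data() for a questionnaire dataset.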

Generic CSV data type

For generic tabular data such as CSV files, the returned dataset has a dataframe attribute that holds the corresponding data as a pandas.DataFrame:

dataset.data_type
>>> "generic_csv_type"
dataset.dataframe
>>> <pandas.DataFrame>

Daily activity data type

For daily activity data that is coming from e.g. Fitbit or Polar, the returned dataset has a range of attributes:

  • steps
  • distance
  • calories
  • floors
  • sleep_start
  • sleep_end
  • sleep_duration
  • resting_heart_rate
  • minutes_sedentary
  • minutes_lightly_active
  • minutes_fairly_active
  • minutes_very_active

dataset.data_type
>>> "fitbit_type"
dataset.resting_heart_rate
>>> 58

Please note that not all attributes are always available; this depends on the platform and the device.
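
Since attributes can be missing, it can help to read them defensively. The sketch below collects whichever of the attributes listed above are present. It assumes that an unavailable attribute is either absent or set to None; which of the two the library does is not stated here. The helper name available_daily_activity and the stand-in object are hypothetical.

```python
from types import SimpleNamespace

# Attribute names taken from the list above.
DAILY_ACTIVITY_ATTRIBUTES = (
    "steps", "distance", "calories", "floors",
    "sleep_start", "sleep_end", "sleep_duration", "resting_heart_rate",
    "minutes_sedentary", "minutes_lightly_active",
    "minutes_fairly_active", "minutes_very_active",
)


def available_daily_activity(dataset):
    """Return a dict with only the daily activity values that are present."""
    values = {}
    for name in DAILY_ACTIVITY_ATTRIBUTES:
        value = getattr(dataset, name, None)  # absent attributes become None
        if value is not None:
            values[name] = value
    return values


# Stand-in dataset: `floors` is set to None and most attributes are absent.
example_day = SimpleNamespace(steps=8042, resting_heart_rate=58, floors=None)
present = available_daily_activity(example_day)
```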

Unstructured data

Unstructured data is data (files) that Sport Data Valley does not know how to process. These files are stored "as is" in the platform and can be downloaded via this client library as well. Unstructured data has a file_response attribute that contains a requests.Response object.

dataset.data_type
>>> "unstructured"
dataset.file_response
>>> <Response [200]>

Read more about processing files downloaded with the Python requests library here. E.g. to process binary response content, see here.
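
As a small example of handling such a response, the sketch below writes the raw bytes to disk. It relies only on the content attribute of requests.Response, which holds the response body as bytes. The helper name save_file_response and the file name are hypothetical, and a stand-in object is used so the example runs without network access.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_file_response(response, destination):
    """Write the raw bytes of a response to `destination` and return the path."""
    path = Path(destination)
    path.write_bytes(response.content)
    return path


# Stand-in with the same `content` attribute as requests.Response.
fake_response = SimpleNamespace(content=b"raw file bytes")
with tempfile.TemporaryDirectory() as folder:
    saved_path = save_file_response(fake_response, Path(folder) / "dataset.bin")
    saved_bytes = saved_path.read_bytes()
```

In real use, dataset.file_response would be passed in place of fake_response.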

Other data types

Although this library will be updated when new data types are added, a specific data type may not be fully supported yet. In that case the returned dataset is identical to unstructured data, with a file_response attribute that contains a requests.Response object.

dataset.data_type
>>> "some new data type"
dataset.file_response
>>> <Response [200]>
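
Because the available attributes differ per data type, code that handles mixed datasets can check which payload attribute is present instead of listing every data_type string. This is a minimal sketch of that idea using the dataframe, questions and file_response attributes described above; the helper name payload_kind and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def payload_kind(dataset):
    """Return the name of the payload attribute this dataset carries."""
    for attribute in ("dataframe", "questions", "file_response"):
        if hasattr(dataset, attribute):
            return attribute
    return "unknown"


# Stand-ins for a tabular dataset and a not yet supported data type.
tabular = SimpleNamespace(data_type="generic_csv_type", dataframe=object())
unsupported = SimpleNamespace(data_type="some new data type", file_response=object())
kinds = [payload_kind(tabular), payload_kind(unsupported)]
```

Daily activity datasets carry none of these three attributes, so they would fall through to "unknown" in this sketch and need their own check.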

Authentication

The library retrieves your API token from the SDV_AUTH_TOKEN environment variable. If you are working in the Sport Data Valley JupyterHub, this is automatically set. If you are working in a different environment, you can retrieve an API token from the "Advanced" page here and set it like this:

sdv.set_token("your API token here")
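
When running outside the JupyterHub, a script can check up front that a token is configured, so that a missing token produces a clear message. The sketch below only inspects the SDV_AUTH_TOKEN environment variable named above. The helper name require_token is hypothetical, and when exactly the library itself reads the variable is not specified here.

```python
import os


def require_token(environ=None):
    """Return the API token from SDV_AUTH_TOKEN or raise a descriptive error."""
    if environ is None:
        environ = os.environ
    token = environ.get("SDV_AUTH_TOKEN")
    if not token:
        raise RuntimeError(
            "SDV_AUTH_TOKEN is not set; set it or call sdv.set_token() first."
        )
    return token


# Demonstration with an explicit mapping instead of the real environment.
token = require_token({"SDV_AUTH_TOKEN": "example-token"})
```

Calling require_token() without arguments checks the real environment of the running process.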

Development

Adding Python versions

The supported Python versions are specified in pyproject.toml[tool.poetry.dependencies]#python. The Python versions that are tested are specified in pyproject.toml[tool.tox]#envlist and in Dockerfile.test. If you want to add a new supported Python version, or want to test against a newer version of an existing Python version, the versions at these locations need to be updated.

License

See LICENSE file.
