OpenSportTaxonomy

An open taxonomy for classifying sports and physical activities.

Every platform has invented its own list of sports. Apple HealthKit calls it Cycling, Strava calls it Ride, Garmin calls it ROAD_CYCLING. None of them are hierarchical, none map to each other, and none are open standards.

OpenSportTaxonomy provides a single canonical set of sport codes that any application can reference.

Warning: This taxonomy is young and only covers a few sports at the moment. If yours is missing, open an issue. We'd love to expand it together.

How it works

An activity is identified by a sport string: dots (.) separate the sport from its disciplines in the hierarchy, and plus signs (+) attach modifiers.

Example: cycling.road+stationary+virtual

cycling . road + stationary + virtual
\-----/   \--/   \--------/   \-----/
 sport discipline modifier    modifier
\___________/   \____________________/
  sport code          modifiers

More examples:

| Sport string | Meaning |
| --- | --- |
| cycling.road | road cycling |
| cycling.road+race | road cycling race |
| cycling.road+stationary+virtual | road cycling, for example on Zwift |
| cycling.gravel+assisted+commute | e-bike gravel commute |
| running.trail+race | trail running race |
| xc_skiing.classic+roller | classic roller skiing |

Sport codes form a tree using dot notation. cycling contains cycling.road, cycling.gravel, cycling.track, and so on. The hierarchy is encoded in the code itself: the parent of cycling.road is cycling. Querying for cycling should naturally include all its children.
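
Because the hierarchy lives in the code itself, parent lookup and "include all children" queries need nothing more than string operations. The following is a minimal sketch of that idea, not the library's implementation; the helper names `parent` and `is_within` are made up for illustration.

```python
from typing import Optional


def parent(code: str) -> Optional[str]:
    """Return the parent sport code, or None for a top-level sport."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


def is_within(code: str, ancestor: str) -> bool:
    """True if `code` equals `ancestor` or sits anywhere below it."""
    # The dot is appended so that "cycling" does not match "cyclingx.road".
    return code == ancestor or code.startswith(ancestor + ".")


codes = ["cycling", "cycling.road", "cycling.gravel", "running.trail"]

# Querying for cycling picks up the sport itself and all its disciplines.
cycling_family = [c for c in codes if is_within(c, "cycling")]
print(cycling_family)
```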

Modifiers describe circumstances, not disciplines. Road cycling on a trainer is still road cycling, performed on a stationary machine. Modifiers are appended with + and sorted alphabetically. They are independent: a Zwift ride is both stationary and virtual, set separately.
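
As a rough illustration of the string format, the sketch below splits a sport string into its code and modifiers and rebuilds it with the modifiers sorted alphabetically. It is an assumption-laden sketch rather than the reference parser: the function names are hypothetical, duplicates are collapsed on the assumption that modifiers behave like a set, and nothing is validated against the taxonomy.

```python
from typing import List, Tuple


def split_sport_string(raw: str) -> Tuple[str, List[str]]:
    """Split on '+': the first part is the sport code, the rest are modifiers."""
    code, *modifiers = raw.split("+")
    # Sorted and de-duplicated; no check that the names exist in the taxonomy.
    return code, sorted(set(modifiers))


def canonical(raw: str) -> str:
    """Rebuild the sport string with modifiers in alphabetical order."""
    code, modifiers = split_sport_string(raw)
    return "+".join([code, *modifiers])


print(split_sport_string("cycling.road+virtual+stationary"))
print(canonical("cycling.road+virtual+stationary"))
```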

See the full reference for all sport codes and modifiers.

Structured format

When your context needs separate fields (API payloads, database columns), the same information can be represented as:

{ "sport": "cycling.road", "modifiers": ["stationary", "virtual"] }

The sport string is the canonical form. The structured format is derived from it.
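
Deriving one form from the other is mechanical. A minimal sketch, assuming the two field names shown above and using hypothetical helper names (`to_structured`, `to_sport_string`); it performs no validation:

```python
import json


def to_structured(sport_string: str) -> dict:
    """Derive the structured format from a sport string."""
    code, *modifiers = sport_string.split("+")
    return {"sport": code, "modifiers": sorted(modifiers)}


def to_sport_string(structured: dict) -> str:
    """Rebuild the canonical sport string from the structured format."""
    return "+".join([structured["sport"], *sorted(structured["modifiers"])])


payload = to_structured("cycling.road+stationary+virtual")
print(json.dumps(payload))
print(to_sport_string(payload))
```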

Design principles

Sport code or modifier? If you removed it, would an athlete still recognize the activity as the same sport? If yes, it's a modifier. If no, it's a sport code.

One activity, one sport. Multi-sport events like triathlons are composed of separate single-sport activities.

Venues are not modifiers. Track cycling happens in a velodrome. That's its natural setting, not a "modified" version of outdoor cycling.

Modifiers are explicit. No modifier implies another. A Zwift ride is stationary+virtual — both set separately, because a trainer without a screen is stationary but not virtual. Absence means unspecified, not "the opposite."

Schema format

The canonical schema is schema.yaml, a single YAML file with two flat lists: sports (sorted alphabetically, hierarchy in the dot notation) and modifiers (with optional group for mutual exclusivity).
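
To show how a group could express mutual exclusivity, here is a sketch that checks a set of modifiers for group conflicts. Everything about the data is an assumption: the field names `name` and `group`, the group called `purpose`, and the assignment of `race` and `commute` to it are invented for illustration and are not taken from schema.yaml.

```python
from collections import defaultdict

# Hypothetical in-memory form of the modifiers list. Field names and group
# assignments are illustrative assumptions, not copied from schema.yaml.
MODIFIERS = [
    {"name": "commute", "group": "purpose"},
    {"name": "race", "group": "purpose"},
    {"name": "stationary"},
    {"name": "virtual"},
]


def group_conflicts(modifiers):
    """Return {group: [names]} for every group that appears more than once."""
    group_of = {m["name"]: m["group"] for m in MODIFIERS if "group" in m}
    used = defaultdict(list)
    for name in sorted(modifiers):
        if name in group_of:
            used[group_of[name]].append(name)
    return {group: names for group, names in used.items() if len(names) > 1}


print(group_conflicts({"race", "commute", "virtual"}))
print(group_conflicts({"stationary", "virtual"}))
```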

Platform mappings

Mapping files in mappings/ translate OST codes to platform-specific identifiers, with one file per platform.

Translations are lossy by design. Some platforms are less granular than the taxonomy: all cycling disciplines map to a single HealthKit value (13). This is the platform's limitation, not an error.

# The same OST code on three platforms:
- ost: cycling.road
  target: 13                            # Apple HealthKit
- ost: cycling.road
  target: { sport: 2, sub_sport: 7 }   # Garmin FIT
- ost: cycling.road
  target: Ride                          # Strava
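
One practical consequence of lossy translation is that the reverse direction is ambiguous. The sketch below uses a plain dictionary as a stand-in for a mapping file (not the library's loader) and the HealthKit value quoted above; the `reverse_lookup` helper is hypothetical.

```python
from collections import defaultdict

# Stand-in for a mapping file. Per the text above, every cycling discipline
# maps to the same HealthKit value.
HEALTHKIT = {
    "cycling.road": 13,
    "cycling.gravel": 13,
    "cycling.track": 13,
}


def reverse_lookup(mapping):
    """Group OST codes by platform value to expose ambiguous reverse mappings."""
    candidates = defaultdict(list)
    for ost_code, target in mapping.items():
        candidates[target].append(ost_code)
    return dict(candidates)


# A single platform value comes back as several possible OST codes.
print(reverse_lookup(HEALTHKIT)[13])
```

An importer that only receives the platform value cannot tell which discipline was meant, so it has to fall back to something less specific or keep extra context from the source.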

Python library

Install the reference implementation:

pip install open-sport-taxonomy

Working with sport strings

The library has two entry points for creating Sport objects:

| Method | Use when |
| --- | --- |
| Sport(raw) | Application code, constants, prescriptions. Enforces the standard vocabulary. |
| Sport.parse(raw) | Receiving external input. Accepts any structurally valid sport string. |

A standard sport is one where the code and all modifiers are defined in the current taxonomy version. A non-standard sport is structurally valid but contains codes or modifiers not yet in the taxonomy, typically from a newer version. Non-standard does not mean invalid; it means unrecognized.

from open_sport_taxonomy import Sport, Modifier

# Strict constructor for application code
sport = Sport("cycling.road+race+virtual")
sport.code          # "cycling.road"
sport.label         # "road cycling"
sport.modifiers     # frozenset({Modifier.RACE, Modifier.VIRTUAL})
sport.is_standard   # True
str(sport)          # "cycling.road+race+virtual"

# Unknown codes and modifiers are rejected
Sport("cycling.road.criterium")  # ValueError: Unknown sport code
Sport("cycling.road+rainy")     # ValueError (unknown modifier)

# Parse: for external input, preserves everything
sport = Sport.parse("cycling.road.criterium+race+rainy")
sport.code          # "cycling.road.criterium" (preserved)
sport.modifiers     # frozenset({Modifier.RACE, "rainy"})
sport.is_standard   # False
str(sport)          # "cycling.road.criterium+race+rainy" (round-trips)

# Resolve: map a non-standard sport to the nearest standard equivalent
resolved = sport.resolve()
resolved.code       # "cycling.road"
resolved.modifiers  # frozenset({Modifier.RACE})
resolved.is_standard  # True
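
The resolve example above suggests one plausible strategy: walk up the dot hierarchy until a known code is found, and drop modifiers that are not recognized. The sketch below implements that strategy on plain strings. It is inferred from the example only and is not the library's code; the `KNOWN_*` sets are hypothetical stand-ins for the taxonomy, and raising `ValueError` when no ancestor is known is an assumption.

```python
# Hypothetical stand-ins for the vocabulary of the installed taxonomy version.
KNOWN_CODES = {"cycling", "cycling.road", "running", "running.trail"}
KNOWN_MODIFIERS = {"race", "stationary", "virtual"}


def resolve(code, modifiers):
    """Map (code, modifiers) to the nearest known code and known modifiers."""
    original = code
    while code not in KNOWN_CODES:
        # Drop the last dot-separated segment and try the parent.
        code, sep, _ = code.rpartition(".")
        if not sep:
            raise ValueError(f"no standard ancestor for {original!r}")
    return code, frozenset(modifiers) & KNOWN_MODIFIERS


print(resolve("cycling.road.criterium", {"race", "rainy"}))
```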

Storage pattern

Always store str(sport) in your database. It preserves the original sport string with full fidelity. Use Sport.parse() when loading, then .resolve() for application logic. When you upgrade the library, previously non-standard sports become standard automatically. No data migration needed.

# On ingest
sport = Sport.parse(api_response["sport"])
db.activity.sport = str(sport)    # store faithfully

# On load
sport = Sport.parse(db.activity.sport)
resolved = sport.resolve()         # for application logic
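
Storing the sport string as plain text also keeps hierarchy queries simple. The sketch below uses the standard-library `sqlite3` module with a hypothetical `activity` table; the table layout and the `activities_within` helper are assumptions for illustration. It compares prefixes with `substr` rather than `LIKE`, because `_` (as in xc_skiing) is a wildcard in `LIKE` patterns.

```python
import sqlite3


def activities_within(conn, code):
    """Return rows whose sport is `code` or any discipline below it."""
    query = """
        SELECT id, sport FROM activity
        WHERE sport = :code
           OR substr(sport, 1, length(:code) + 1) IN (:code || '.', :code || '+')
        ORDER BY id
    """
    return conn.execute(query, {"code": code}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (id INTEGER PRIMARY KEY, sport TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO activity (sport) VALUES (?)",
    [
        ("cycling.road+race",),
        ("cycling+stationary",),
        ("running.trail",),
        ("cycling",),
        ("xc_skiing.classic+roller",),
    ],
)

for row in activities_within(conn, "cycling"):
    print(row)
```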

Class constants

For known sports in application code, use class constants:

Sport.CYCLING_ROAD
Sport.RUNNING_TRAIL
Sport.SWIMMING_OPEN_WATER

Taxonomy navigation

Sport.CYCLING.disciplines   # (Sport('cycling.cyclocross'), Sport('cycling.gravel'), ...)
Sport.CYCLING_ROAD.parent   # Sport('cycling')
Sport.all()                 # all standard sports

# Parent preserves modifiers
Sport("cycling.road+stationary").parent  # Sport('cycling+stationary')

Sport matching

Check if a sport is a more specific version of another:

# Prescription matching: does the execution satisfy the prescription?
executed = Sport("cycling.road+stationary")
prescribed = Sport("cycling+stationary")
executed.is_subsport_of(prescribed)   # True

# Extra modifiers are fine
Sport("cycling.road+stationary+race").is_subsport_of(Sport("cycling+stationary"))  # True

# Missing modifiers or wrong hierarchy: no match
Sport("cycling.road").is_subsport_of(Sport("cycling+stationary"))  # False
Sport("running").is_subsport_of(Sport("cycling"))                  # False
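
The four results above are consistent with a simple rule: the executed code must equal the prescribed code or sit below it, and every prescribed modifier must be present on the executed sport. The sketch below encodes that rule on plain strings. It is inferred from the examples and may not match the library's behavior in other cases; the standalone function is hypothetical.

```python
def split(raw):
    """Split a sport string into (code, frozenset of modifiers)."""
    code, *modifiers = raw.split("+")
    return code, frozenset(modifiers)


def is_subsport_of(executed, prescribed):
    """True if `executed` is the same as, or more specific than, `prescribed`."""
    exec_code, exec_mods = split(executed)
    pres_code, pres_mods = split(prescribed)
    same_branch = exec_code == pres_code or exec_code.startswith(pres_code + ".")
    # Extra modifiers on the executed side are allowed; missing ones are not.
    return same_branch and pres_mods <= exec_mods


print(is_subsport_of("cycling.road+stationary", "cycling+stationary"))
print(is_subsport_of("cycling.road", "cycling+stationary"))
```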

Platform translation

from open_sport_taxonomy.platforms import strava, apple_healthkit, garmin_fit

strava.translate(Sport("cycling.road+virtual"))  # "VirtualRide"
apple_healthkit.translate(Sport.CYCLING_ROAD)     # 13
garmin_fit.translate(Sport.CYCLING_ROAD)           # GarminFitCode(sport=2, sub_sport=7)

Pydantic integration

Install with the pydantic extra:

pip install open-sport-taxonomy[pydantic]

Use SportField in Pydantic models for permissive parsing, or StrictSportField to enforce the standard vocabulary:

from pydantic import BaseModel
from open_sport_taxonomy.pydantic import SportField, StrictSportField

class Workout(BaseModel):
    sport: SportField       # accepts any structurally valid sport string

class Prescription(BaseModel):
    sport: StrictSportField  # rejects unknown codes and modifiers

w = Workout(sport="cycling.road+stationary")
w.sport.code      # "cycling.road"
w.model_dump()    # {"sport": "cycling.road+stationary"}

What the taxonomy does not cover

  • Venue properties like pool length (25m vs 50m) or track size. These matter for records and performance but are not distinct disciplines. Planned for a future version.

Versioning

The taxonomy follows Semantic Versioning. Each release is a git tag and a GitHub Release. Sport codes are stable: once published, never removed, only deprecated.

# Latest
https://raw.githubusercontent.com/sweatstack/open-sport-taxonomy/main/schema.yaml

# Pinned to a version
https://raw.githubusercontent.com/sweatstack/open-sport-taxonomy/v0.1.0/schema.yaml
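
For tooling that consumes the schema directly, pinning to a tag keeps results reproducible. A small sketch, assuming the URL pattern shown above; the helper names are hypothetical, and the fetch function is defined but not called here because it needs network access.

```python
from urllib.request import urlopen

BASE = "https://raw.githubusercontent.com/sweatstack/open-sport-taxonomy"


def schema_url(ref="main"):
    """Build the raw schema URL for a branch or a version tag such as 'v0.1.0'."""
    return f"{BASE}/{ref}/schema.yaml"


def fetch_schema_text(ref="main", timeout=10.0):
    """Download schema.yaml as text; parsing the YAML is left to the caller."""
    with urlopen(schema_url(ref), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(schema_url("v0.1.0"))
```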

Contributing

See CONTRIBUTING.md.

License

MIT. Maintained by SweatStack.
