easily stream StatsBomb data into Python

statsbombpy

By: StatsBomb

Support: support@statsbombservices.com

Updated February 23, 2021.

This repository is a Python package for streaming StatsBomb data into Python, using either your login credentials for the API or the free data from our GitHub page. API access is for paying customers only.

Installation Instructions

git clone https://github.com/statsbomb/statsbombpy.git
cd statsbombpy
pip install .

Running the tests

nose2 -v --pretty-assert

Authentication

Environment Variables

Authentication can be done by setting environment variables named SB_USERNAME and SB_PASSWORD to your login credentials.
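As a minimal sketch, the variables can also be set from inside Python when exporting them in the shell is inconvenient. The helper name set_statsbomb_env and the placeholder values are illustrative only, and setting the variables before importing statsbombpy is an assumption about the safe order, since a package may read its configuration at import time.

```python
import os


def set_statsbomb_env(username, password, environ=os.environ):
    """Hypothetical helper: store credentials under the documented names."""
    environ["SB_USERNAME"] = username
    environ["SB_PASSWORD"] = password


# Placeholder values for illustration; substitute your own login.
set_statsbomb_env("you@example.com", "your-password")

# Import the package only after the variables are in place (assumed-safe order).
# from statsbombpy import sb
```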

Manual Calls

Alternatively, if you don't want to use environment variables, every function accepts a creds argument for passing your login credentials in the format {"user": "", "passwd": ""}.
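A minimal sketch of building that dictionary once and reusing it across calls. The helper make_creds is hypothetical and not part of statsbombpy; only the {"user": ..., "passwd": ...} shape and the creds argument come from the description above.

```python
def make_creds(user, passwd):
    """Hypothetical helper: return credentials in the documented creds format."""
    if not user or not passwd:
        raise ValueError("both user and passwd are required")
    return {"user": user, "passwd": passwd}


# Placeholder values for illustration; substitute your own login.
creds = make_creds("you@example.com", "your-password")

# The same dict can then be passed to any call, for example:
# from statsbombpy import sb
# sb.competitions(creds=creds)
```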

Open Data

StatsBomb's open data can be accessed without authentication.

StatsBomb are committed to sharing new data and research publicly to enhance understanding of the game of football. We want to actively encourage new research and analysis at all levels. Therefore we have made certain leagues of StatsBomb data freely available for public use in research projects and out of genuine interest in football analytics.

StatsBomb are hoping that by making data freely available, we will extend the wider football analytics community and attract new talent to the industry. We would like to collect some basic personal information about users of our data. If you give us your email address, we will let you know when we make more data, tutorials and research available. We will store the information in accordance with our Privacy Policy and the GDPR.

Terms & Conditions

Whilst we are keen to share data and facilitate research, we also urge you to be responsible with the data. Please register your details on https://www.statsbomb.com/resource-centre and read our User Agreement carefully. By using this repository, you are agreeing to the user agreement. If you publish, share or distribute any research, analysis or insights based on this data, please state the data source as StatsBomb and use our logo.

Usage

from statsbombpy import sb

Competitions

sb.competitions()
   competition_id  season_id  country_name  competition_name  competition_gender  season_name  match_updated               match_available
0  9               42         Germany       1. Bundesliga     male                2019/2020    2019-12-29T07:47:45.981     2019-12-29T07:47:45.981
1  9               4          Germany       1. Bundesliga     male                2018/2019    2019-12-16T23:09:16.168756  2019-12-16T23:09:16.168756
2  9               1          Germany       1. Bundesliga     male                2017/2018    2019-12-16T23:09:16.168756  2019-12-16T23:09:16.168756
3  78              42         Croatia       1. HNL            male                2019/2020    2020-01-02T10:35:49.065     2020-01-02T10:35:49.065
4  10              42         Germany       2. Bundesliga     male                2019/2020    2019-12-27T00:36:37.498     2019-12-27T00:36:37.498
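The competition_id and season_id columns are what sb.matches expects, so a common first step is looking them up by name. The sketch below works on plain dictionaries copied from the sample rows above, shaped the way pandas' to_dict("records") would return them; the helper find_ids is hypothetical. Matching on name alone is a simplification, since a competition name may not be unique across countries or genders.

```python
# Sample rows copied from the competitions output shown above.
competitions = [
    {"competition_id": 9, "season_id": 42, "competition_name": "1. Bundesliga", "season_name": "2019/2020"},
    {"competition_id": 9, "season_id": 4, "competition_name": "1. Bundesliga", "season_name": "2018/2019"},
    {"competition_id": 9, "season_id": 1, "competition_name": "1. Bundesliga", "season_name": "2017/2018"},
    {"competition_id": 78, "season_id": 42, "competition_name": "1. HNL", "season_name": "2019/2020"},
    {"competition_id": 10, "season_id": 42, "competition_name": "2. Bundesliga", "season_name": "2019/2020"},
]


def find_ids(rows, competition_name, season_name):
    """Hypothetical helper: return (competition_id, season_id) or None."""
    for row in rows:
        if (row["competition_name"] == competition_name
                and row["season_name"] == season_name):
            return row["competition_id"], row["season_id"]
    return None


print(find_ids(competitions, "1. Bundesliga", "2019/2020"))  # (9, 42)
```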

Matches

sb.matches(competition_id=9, season_id=42)
   match_id  match_date  kick_off      competition              season     home_team            away_team                 home_score  away_score  match_status  last_updated                match_week  competition_stage  stadium                referee        data_version  shot_fidelity_version  xy_fidelity_version
0  303299    2019-12-15  18:00:00.000  Germany - 1. Bundesliga  2019/2020  Schalke 04           Eintracht Frankfurt       1           0           available     2019-12-17T09:50:17.558     15          Regular Season     VELTINS-Arena          F. Zwayer      1.1.0         2                      2
1  303223    2019-09-01  18:00:00.000  Germany - 1. Bundesliga  2019/2020  Eintracht Frankfurt  Fortuna Düsseldorf        2           1           available     2019-12-16T23:09:16.168756  3           Regular Season     Commerzbank-Arena      F. Willenborg  1.1.0         2                      2
2  303083    2019-12-15  15:30:00.000  Germany - 1. Bundesliga  2019/2020  Wolfsburg            Borussia Mönchengladbach  2           1           available     2019-12-17T15:52:17.843     15          Regular Season     VOLKSWAGEN ARENA       F. Brych       1.1.0         2                      2
3  303266    2019-12-14  15:30:00.000  Germany - 1. Bundesliga  2019/2020  Hertha Berlin        Freiburg                  1           0           available     2019-12-17T17:43:18.285     15          Regular Season     Olympiastadion Berlin  F. Willenborg  1.1.0         2                      2
4  303073    2019-12-21  15:30:00.000  Germany - 1. Bundesliga  2019/2020  Bayern Munich        Wolfsburg                 2           0           available     2019-12-23T18:02:36.454     17          Regular Season     Allianz Arena          C. Dingert     1.1.0         2                      2
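The match_id column is the key used by the lineups and events calls, so it is often useful to collect the IDs for one team first. A minimal sketch on plain dictionaries copied from the sample rows above (only three of the columns are kept); the helper match_ids_for_team is hypothetical.

```python
# Sample rows copied from the matches output shown above.
matches = [
    {"match_id": 303299, "home_team": "Schalke 04", "away_team": "Eintracht Frankfurt"},
    {"match_id": 303223, "home_team": "Eintracht Frankfurt", "away_team": "Fortuna Düsseldorf"},
    {"match_id": 303083, "home_team": "Wolfsburg", "away_team": "Borussia Mönchengladbach"},
    {"match_id": 303266, "home_team": "Hertha Berlin", "away_team": "Freiburg"},
    {"match_id": 303073, "home_team": "Bayern Munich", "away_team": "Wolfsburg"},
]


def match_ids_for_team(rows, team):
    """Hypothetical helper: IDs of matches where the team played home or away."""
    return [
        row["match_id"]
        for row in rows
        if team in (row["home_team"], row["away_team"])
    ]


print(match_ids_for_team(matches, "Wolfsburg"))  # [303083, 303073]
```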

Lineups

sb.lineups(match_id=303299)["Eintracht Frankfurt"]
player_id player_name player_nickname birth_date player_gender player_height player_weight jersey_number country
0 3204 Almamy Touré None 1996-04-28 male 182.0 72.0 18 Mali
1 5591 Filip Kostić None 1992-11-01 male 184.0 82.0 10 Serbia
2 7713 Obite Evan N"Dicka Evan N'Dicka 1999-08-20 male 190.0 NaN 2 France
3 8307 Martin Hinteregger None 1992-09-07 male 184.0 83.0 13 Austria
4 8669 Mijat Gaćinović None 1995-02-08 male 175.0 66.0 11 Serbia

Events

The default settings for querying events return a single dataframe with all event types and event attributes.

events = sb.events(match_id=303299)
ball_receipt_outcome ball_recovery_offensive ball_recovery_recovery_failure block_deflection carry_end_location clearance_aerial_won clearance_body_part clearance_head clearance_left_foot clearance_right_foot counterpress dribble_no_touch dribble_outcome dribble_overrun duel_outcome duel_type duration foul_committed_advantage foul_committed_card foul_won_advantage foul_won_defensive goalkeeper_body_part goalkeeper_end_location goalkeeper_outcome goalkeeper_position goalkeeper_technique goalkeeper_type id index injury_stoppage_in_chain interception_outcome location match_id minute off_camera out pass_aerial_won pass_angle pass_assisted_shot_id pass_body_part pass_cross pass_cut_back pass_deflected pass_end_location pass_goal_assist pass_height pass_length pass_outcome pass_outswinging pass_recipient pass_shot_assist pass_straight pass_switch pass_technique pass_through_ball pass_type pass_xclaim period play_pattern player position possession possession_team related_events second shot_aerial_won shot_body_part shot_end_location shot_first_time shot_freeze_frame shot_key_pass_id shot_one_on_one shot_outcome shot_statsbomb_xg shot_statsbomb_xg2 shot_technique shot_type substitution_outcome substitution_replacement team timestamp type under_pressure
500 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 3.498736 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 84828c06-41b5-44eb-aa92-1710bdb818ac 1838 NaN NaN [50.1, 16.6] 303299 47 NaN NaN NaN 2.720095 NaN Left Foot NaN NaN NaN [13.3, 33.1] NaN Ground Pass 40.329765 NaN NaN Frederik Rønnow NaN NaN NaN NaN NaN NaN NaN 2 Regular Play Obite Evan N"Dicka Left Center Back 103 Eintracht Frankfurt [ae3094e3-faa3-4608-8284-d9b8cca77711, c1202f1c-0831-4e88-83b2-597f56f0c858] 52 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Eintracht Frankfurt 00:02:52.438 Pass True
501 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 3.604236 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 9061cd20-513b-499f-b925-f1de5f241281 1840 NaN NaN [13.3, 33.1] 303299 47 NaN NaN NaN -0.153945 NaN Right Foot NaN NaN NaN [77.1, 23.2] NaN High Pass 64.563540 Incomplete NaN Mijat Gaćinović NaN NaN NaN NaN NaN NaN NaN 2 Regular Play Frederik Rønnow Goalkeeper 103 Eintracht Frankfurt [8e6495a7-782a-4f1a-845f-3ec50d761a1e, ff758a12-1ba6-4dd4-8b2c-7d39aa7aed97] 55 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Eintracht Frankfurt 00:02:55.937 Pass NaN
502 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 2.101999 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 8e6495a7-782a-4f1a-845f-3ec50d761a1e 1842 NaN NaN [43.0, 56.9] 303299 47 NaN NaN NaN -0.703110 NaN Head NaN NaN NaN [64.0, 39.1] NaN High Pass 27.528894 NaN NaN Amine Harit NaN NaN NaN NaN NaN Recovery NaN 2 Regular Play Ozan Muhammed Kabak Right Center Back 104 Schalke 04 [9061cd20-513b-499f-b925-f1de5f241281, be6dfe7d-7596-4cc2-8cd9-8c17d064317e] 59 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Schalke 04 00:02:59.541 Pass NaN
503 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 1.187459 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 86431bc7-210a-4868-8e18-26ff38becefc 1854 NaN NaN [65.9, 12.6] 303299 48 NaN NaN NaN -0.730239 NaN Right Foot NaN NaN NaN [74.5, 4.9] NaN Ground Pass 11.543396 NaN NaN Amine Harit NaN NaN NaN NaN NaN NaN NaN 2 Regular Play Suat Serdar Left Defensive Midfield 104 Schalke 04 [761b4e65-8f64-464c-8153-6a98465208ba] 7 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Schalke 04 00:03:07.689 Pass NaN
504 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.766628 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 6e58c713-622c-4246-8243-e4162e487a1c 1858 NaN NaN [79.1, 10.5] 303299 48 NaN NaN NaN 1.254940 NaN Right Foot NaN NaN NaN [84.1, 25.8] NaN Ground Pass 16.096273 NaN NaN Rabbi Matondo NaN NaN NaN NaN NaN NaN NaN 2 Regular Play Amine Harit Center Attacking Midfield 104 Schalke 04 [b1960a76-d3ae-4ef3-a2cd-47eca8c25e0a, dd1575c0-a408-4177-944d-7e86d2f79181] 11 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Schalke 04 00:03:11.719 Pass True
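The sample rows suggest, although this README does not state it, that pass_angle is given in radians and pass_length in the same pitch units as location, so that pass_end_location follows from the other three values. The sketch below checks that reading against row 500; the helper pass_end_point is hypothetical and the interpretation is an assumption drawn only from the numbers shown.

```python
import math


def pass_end_point(location, pass_angle, pass_length):
    """Hypothetical helper: project a pass from its start point.

    Assumes pass_angle is in radians and pass_length is in pitch units.
    """
    x, y = location
    end_x = x + pass_length * math.cos(pass_angle)
    end_y = y + pass_length * math.sin(pass_angle)
    return end_x, end_y


# Values copied from row 500 of the events output shown above.
end_x, end_y = pass_end_point([50.1, 16.6], 2.720095, 40.329765)

# Rounded to one decimal, this agrees with the reported
# pass_end_location of [13.3, 33.1] for that row.
print(round(end_x, 1), round(end_y, 1))
```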

It's also possible to get a separate dataframe for each event type (split=True) and/or to keep each event type's attributes nested in a single column instead of flattened into their own columns (flatten_attrs=False).

sb.events(match_id=303299, split=True, flatten_attrs=False)["dribbles"]
id index period timestamp minute second type possession possession_team play_pattern team player position location duration under_pressure related_events dribble match_id
0 b190c01f-ad24-468c-8241-f955b91d996c 131 1 00:02:08.032 2 8 Dribble 4 Schalke 04 Regular Play Schalke 04 Daniel Caligiuri Right Wing [110.2, 62.9] 0.000000 True [60f822df-5747-4787-b0f9-45bf5217eb8a] {'outcome': {'id': 8, 'name': 'Complete'}} 303299
1 4d773c92-f89f-491e-b3e0-3a1d2e863148 399 1 00:08:48.623 8 48 Dribble 18 Schalke 04 Regular Play Schalke 04 Amine Harit Center Attacking Midfield [88.9, 22.7] 0.000000 True [93d829df-eea7-416b-95aa-7593828cfade] {'outcome': {'id': 8, 'name': 'Complete'}} 303299
2 8a78dce4-998a-4e81-902c-9f3957cebc9d 460 1 00:13:30.202 13 30 Dribble 23 Schalke 04 Regular Play Schalke 04 Daniel Caligiuri Right Wing [99.5, 68.1] 0.007309 True [772c5aae-e34e-4364-8a98-7caf7636c90b] {'outcome': {'id': 9, 'name': 'Incomplete'}} 303299
3 e44d0122-2f2e-4771-820d-cc326a8b0379 496 1 00:14:10.135 14 10 Dribble 24 Schalke 04 From Throw In Schalke 04 Suat Serdar Left Defensive Midfield [41.2, 31.7] 0.000000 True [4de4039f-7efc-461b-b7d6-27c32ec2cd2a] {'outcome': {'id': 8, 'name': 'Complete'}} 303299
4 9555afbd-d838-42c9-8f80-be3cd09e4c4a 793 1 00:20:18.409 20 18 Dribble 33 Eintracht Frankfurt Regular Play Eintracht Frankfurt Timothy Chandler Right Wing Back [81.8, 75.7] 0.000000 True [a5c88cee-6319-4c25-91cd-8a028d8dbfbf] {'outcome': {'id': 9, 'name': 'Incomplete'}} 303299
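With flatten_attrs=False the dribble column holds a nested dictionary, as the rows above show, so reading an attribute means walking into it. A minimal sketch on plain dictionaries copied from those five sample rows; the helper outcome_name is hypothetical.

```python
from collections import Counter

# Player and nested dribble attributes copied from the sample rows above.
dribbles = [
    {"player": "Daniel Caligiuri", "dribble": {"outcome": {"id": 8, "name": "Complete"}}},
    {"player": "Amine Harit", "dribble": {"outcome": {"id": 8, "name": "Complete"}}},
    {"player": "Daniel Caligiuri", "dribble": {"outcome": {"id": 9, "name": "Incomplete"}}},
    {"player": "Suat Serdar", "dribble": {"outcome": {"id": 8, "name": "Complete"}}},
    {"player": "Timothy Chandler", "dribble": {"outcome": {"id": 9, "name": "Incomplete"}}},
]


def outcome_name(attrs):
    """Hypothetical helper: pull the outcome name out of the nested dict."""
    return attrs.get("outcome", {}).get("name")


# Tally outcomes across the sample rows.
counts = Counter(outcome_name(row["dribble"]) for row in dribbles)
print(counts["Complete"], counts["Incomplete"])  # 3 2
```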

Competition Events

All events from a given competition can be queried and stored in a single dataframe.

events = sb.competition_events(
    country="Germany",
    division="1. Bundesliga",
    season="2019/2020",
    gender="male"
)

grouped_events = sb.competition_events(
    country="Germany",
    division="1. Bundesliga",
    season="2019/2020",
    split=True
)
grouped_events["dribbles"]
id index period timestamp minute second type possession possession_team play_pattern team player position location duration under_pressure related_events dribble match_id
0 b190c01f-ad24-468c-8241-f955b91d996c 131 1 00:02:08.032 2 8 Dribble 4 Schalke 04 Regular Play Schalke 04 Daniel Caligiuri Right Wing [110.2, 62.9] 0.000000 True [60f822df-5747-4787-b0f9-45bf5217eb8a] {'outcome': {'id': 8, 'name': 'Complete'}} 303299
1 4d773c92-f89f-491e-b3e0-3a1d2e863148 399 1 00:08:48.623 8 48 Dribble 18 Schalke 04 Regular Play Schalke 04 Amine Harit Center Attacking Midfield [88.9, 22.7] 0.000000 True [93d829df-eea7-416b-95aa-7593828cfade] {'outcome': {'id': 8, 'name': 'Complete'}} 303299
2 8a78dce4-998a-4e81-902c-9f3957cebc9d 460 1 00:13:30.202 13 30 Dribble 23 Schalke 04 Regular Play Schalke 04 Daniel Caligiuri Right Wing [99.5, 68.1] 0.007309 True [772c5aae-e34e-4364-8a98-7caf7636c90b] {'outcome': {'id': 9, 'name': 'Incomplete'}} 303299
3 e44d0122-2f2e-4771-820d-cc326a8b0379 496 1 00:14:10.135 14 10 Dribble 24 Schalke 04 From Throw In Schalke 04 Suat Serdar Left Defensive Midfield [41.2, 31.7] 0.000000 True [4de4039f-7efc-461b-b7d6-27c32ec2cd2a] {'outcome': {'id': 8, 'name': 'Complete'}} 303299
4 9555afbd-d838-42c9-8f80-be3cd09e4c4a 793 1 00:20:18.409 20 18 Dribble 33 Eintracht Frankfurt Regular Play Eintracht Frankfurt Timothy Chandler Right Wing Back [81.8, 75.7] 0.000000 True [a5c88cee-6319-4c25-91cd-8a028d8dbfbf] {'outcome': {'id': 9, 'name': 'Incomplete'}} 303299

Raw Files

Alternatively, entities can be accessed as Python dictionaries that serve as an interface to the raw JSON files, without any preprocessing.


sb.competitions(fmt="dict")

sb.matches(competition_id=9, season_id=42, fmt="dict")

sb.lineups(match_id=303299, fmt="dict")

sb.events(303299, fmt="dict")

sb.competition_events(
    country="Germany",
    division="1. Bundesliga",
    season="2019/2020",
    gender="male",
    fmt="dict"
)

