Scraping data package for www.understat.com
⚽️ Understat
A clean, fast Python client for football analytics data from Understat, with no browser or web-driver dependencies.
Underdata provides a simple, object-oriented interface to detailed football statistics from Understat, including Expected Goals (xG), match results, and individual player performance. It returns the data as clean, ready-to-use Pandas DataFrames, suitable for data analysis, visualization, and modeling. Six European leagues are available from the 2014/2015 season onward: Premier League, La Liga, Bundesliga, Serie A, Ligue 1, and Russian Premier League.
🎯 Key Features
- League-Level Data: Get full season standings, match schedules, and player stats for major leagues.
- Team-Specific Analysis: Access a team's complete match history and seasonal roster.
- Detailed Player Statistics: Retrieve game-by-game logs and individual shot data for any player.
- Granular Match Data: Pull shot maps, lineups, and key events from a single match.
- Pandas Integration: All data is delivered in structured and intuitive Pandas DataFrames.
- Lightweight: No browser or external driver dependencies required.
🔖 Note
This package is still under development, so its API may change.
🚀 Installation
To install the package:
pip install underdata
or, to install from source:
git clone git@github.com:osvaldomx/UnderData.git
cd UnderData
pip install .
🛫 Quick Start & Usage
The library is designed to be intuitive. Each object corresponds to one type of Understat page, as listed in the table, and the numbered examples show each object in use.
| Object | URL |
|---|---|
| underdata.league() | https://www.understat.com/league/<league_name>/<year> |
| underdata.team() | https://www.understat.com/team/<team_name>/<year> |
| underdata.player() | https://www.understat.com/player/<player_id> |
| underdata.match() | https://www.understat.com/match/<match_id> |
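To make the mapping concrete, here is a minimal sketch of how a page URL could be assembled from an object's parameters. The helper `understat_url` and the `_PATHS` table are hypothetical and are not part of underdata. The sketch assumes that match pages live under `/match/<match_id>` and that spaces in names are written as underscores.

```python
from urllib.parse import quote

BASE_URL = "https://www.understat.com"

# Hypothetical lookup table, not part of underdata: one path layout per page type.
_PATHS = {
    "league": "league/{name}/{year}",
    "team": "team/{name}/{year}",
    "player": "player/{id}",
    "match": "match/{id}",
}


def understat_url(kind, **parts):
    """Build the Understat page URL for one page type (illustrative only)."""
    if kind not in _PATHS:
        raise ValueError(f"unknown page type: {kind!r}")
    # Assumption: spaces in names become underscores, then the value is URL-quoted.
    cleaned = {key: quote(str(value).replace(" ", "_")) for key, value in parts.items()}
    return f"{BASE_URL}/{_PATHS[kind].format(**cleaned)}"


print(understat_url("league", name="La_liga", year=2023))
print(understat_url("team", name="Real Madrid", year=2023))
print(understat_url("player", id=8369))
```

Keeping the layouts in one table means a new page type only needs one new entry, and an unsupported type fails early with a clear error.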
1. Get League Standings
Analyze an entire league's performance. The get_teams() method can provide a basic or advanced statistical table.
from underdata.league import League
# Initialize the league for a specific season
la_liga = League(league_name="La_liga", season=2023)
# Get the advanced standings table
teams_df = la_liga.get_teams(advanced=True)
print("La Liga 2023/2024 Final Standings (Advanced Stats)")
print(teams_df.head())
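The example passes season=2023 and labels its output 2023/2024, which suggests that the season argument is the year in which the season starts. The following sketch shows that convention with a hypothetical helper, `season_label`, which is not part of underdata. The lower bound of 2014 comes from the project description. Rejecting earlier seasons is an assumption of this sketch, not documented library behavior.

```python
FIRST_SEASON = 2014  # earliest season covered, per the project description


def season_label(season):
    """Turn a start year such as 2023 into the label '2023/2024' (illustrative only)."""
    if season < FIRST_SEASON:
        raise ValueError(f"seasons start at {FIRST_SEASON}, got {season}")
    return f"{season}/{season + 1}"


print(season_label(2023))  # prints 2023/2024
```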
2. Analyze a Specific Team
Drill down into a single team's performance over a season.
from underdata.team import Team
# Initialize a specific team
real_madrid = Team(team_name="Real Madrid", season=2023)
# Get the team's complete match history
match_history_df = real_madrid.get_match_history()
print("Real Madrid's Last 5 Matches of the Season:")
print(match_history_df.tail())
3. Get Detailed Player Data
Analyze a single player's performance, including shot data suitable for building shot maps.
from underdata.player import Player
# Initialize a player using their Understat ID (e.g., Jude Bellingham's ID is 8369)
bellingham = Player(player_id=8369)
# Get all shots from the 2023 season
bellingham_shots = bellingham.get_shot_data(season=2023)
print(f"Jude Bellingham took {len(bellingham_shots)} shots in the 2023/2024 season.")
print(bellingham_shots[['date', 'result', 'xG', 'shotType']].head())
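Once the shot data is loaded, a common next step is to compare goals scored with total xG. The sketch below works on plain dictionaries, the shape produced by the pandas call `bellingham_shots.to_dict("records")`, so it runs without any network access. The rows are made up for illustration and are not real Understat data. The sketch also assumes that a scored shot has the `result` value `"Goal"`. The function `summarize_shots` is hypothetical and is not part of underdata.

```python
# Made-up rows for illustration only; real rows would come from
# bellingham_shots.to_dict("records").
shots = [
    {"result": "Goal", "xG": 0.45},
    {"result": "SavedShot", "xG": 0.08},
    {"result": "MissedShots", "xG": 0.12},
    {"result": "Goal", "xG": 0.31},
]


def summarize_shots(rows):
    """Count shots and goals and total the xG of the given shot rows."""
    # Assumption: a scored shot is marked with result == "Goal".
    goals = sum(1 for row in rows if row["result"] == "Goal")
    total_xg = sum(float(row["xG"]) for row in rows)
    return {
        "shots": len(rows),
        "goals": goals,
        "xG": round(total_xg, 2),
        "goals_minus_xG": round(goals - total_xg, 2),
    }


print(summarize_shots(shots))
```

A positive `goals_minus_xG` means the player scored more than the chances were worth on average, and a negative value means the opposite.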
4. Analyze a Single Match
Get all the shot data and lineups from a specific match using its ID.
from underdata.match import Match
# Initialize a match using its Understat ID (e.g., a Real Madrid vs Barcelona match)
el_clasico = Match(match_id=21817)
# Get all shots from the match
shot_data = el_clasico.get_shot_data()
print(f"There were a total of {len(shot_data)} shots in the match.")
✅ Contributing
Contributions are welcome! If you'd like to help improve Underdata, please follow these steps:
- Open an Issue: Before starting any work, please open an issue on GitHub to discuss the proposed change or feature. This helps ensure that your work aligns with the project's goals.
- Fork the Repository: Fork the project to your own GitHub account.
- Create a Feature Branch: Create a new branch for your changes (git checkout -b feat/YourAmazingFeature).
- Develop: Make your changes and add tests to cover them.
- Submit a Pull Request: Push your branch to your fork and open a pull request back to the main repository.
License
This project is licensed under the MIT License - see the LICENSE file for details.