Volley Stats

Command-line tool to scrape volleyball statistics from Data Project Web Competition websites.

Volley Stats exports data from volleyball matches and competitions to CSV, for events organized by entities that use Data Project WCM. It collects statistics for individual matches, lists the matches in a competition, and can automatically retrieve the statistics for every match in a competition's match list.

It also documents the URL structure of Web Competition websites, which simplifies finding identifiers (mID, ID, PID), and lists the acronyms of the main entities that use Data Project Management.

This tool is not affiliated with Genius Sports Italy.

Installation

Requirement

  • Python 3.8+

pip install volleystats

Documentation

Extracted Data

  • Competition

    • Competition ID
    • Home Team
    • Guest Team
    • Home Points
    • Guest Points
    • Date
    • Stadium
  • Match

    • Match ID
    • Match date
    • Home Team
    • Guest Team
    • Coach
    • Stadium
    • Total Points
    • Break Points
    • Win-Lost
    • Total Serves
    • Serve Errors
    • Serve Points
    • Total Receptions
    • Reception Errors
    • Positive Pass Percentage (Pos%)
    • Excellent/Perfect Pass Percentage (Exc.%)
    • Total Attacks
    • Attack Errors
    • Blocked Attack
    • Attack Points (Exc.)
    • Attack Points Percentage (Exc.%)
    • Block Points
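
The exported files are plain CSV, so they can be inspected with Python's standard library. The following is a minimal sketch, not part of volleystats: the helper name load_stats is hypothetical, and it assumes each file is UTF-8 encoded with a header row. The exact column names are not documented here, so the code reads them from the file instead of hard-coding them.

```python
import csv
from pathlib import Path


def load_stats(path):
    """Return (column_names, rows) for one exported CSV file.

    Hypothetical helper. Assumes UTF-8 text with a header row.
    """
    # newline="" is the setting the csv module recommends for file objects.
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for a completely empty file.
        return list(reader.fieldnames or []), rows
```

Calling load_stats on one of the generated files and printing the first element shows which columns that file actually contains.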

Usage

volleystats [--help] --fed FED (--match MATCH | --comp COMP | --batch CSV_FILE_PATH) [--pid PID] [--log]
  • --fed, -f: Federation Acronym (required)
  • --match, -m: Statistics of a single match (required, unless --comp or --batch is provided)
  • --comp, -c: List of matches in a competition (required, unless --match or --batch is provided)
  • --pid, -p: PID of the competition (optional, only used when --comp is provided)
  • --batch, -b: Path to a CSV file with Match IDs, i.e. the Competition Matches output (required, unless --match or --comp is provided)
  • --log, -l: Show log output while scraping
  • --help, -h: Show help message
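
When the tool is driven from a script, the rules above (exactly one of --match, --comp or --batch, and --pid only together with --comp) can be checked before the command is launched. The sketch below is an illustration only: build_command is a hypothetical helper, not part of volleystats, and it uses only the long flags listed above.

```python
def build_command(fed, match=None, comp=None, batch=None, pid=None, log=False):
    """Build an argument list for the volleystats CLI (hypothetical helper)."""
    modes = {"--match": match, "--comp": comp, "--batch": batch}
    chosen = [(flag, value) for flag, value in modes.items() if value is not None]
    # The synopsis requires exactly one of the three modes.
    if len(chosen) != 1:
        raise ValueError("provide exactly one of match, comp or batch")
    # --pid is documented as valid only together with --comp.
    if pid is not None and comp is None:
        raise ValueError("pid is only valid together with comp")

    flag, value = chosen[0]
    argv = ["volleystats", "--fed", str(fed), flag, str(value)]
    if pid is not None:
        argv += ["--pid", str(pid)]
    if log:
        argv.append("--log")
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True), which avoids shell quoting problems with file paths.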

Match

volleystats --fed FED --match MATCH

Examples

  • Brazilian Volleyball Confederation

  • Lithuanian Volleyball Federation

Competition Matches

volleystats --fed FED --comp COMP

Example

Competition Matches with PID

In some competitions, the PID distinguishes phases of a season, such as the regular season and the playoffs. In those cases, pass this value to obtain the statistics for each phase separately.

volleystats --fed FED --comp COMP --pid PID

Examples

Matches via Competition Matches file

volleystats --fed FED --batch CSV_FILE_PATH

Example

  • Brazilian Volleyball Confederation
    • Data Project website: https://cbv-web.dataproject.com/MatchStatistics.aspx?mID=ID
    • Federation Acronym: CBV
    • CSV file path (output of the Competition Matches): data/cbv-18-2022-2023-competition-matches.csv
    • Command: $ volleystats --fed cbv --batch data/cbv-18-2022-2023-competition-matches.csv
    • Output files:
      data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv
      data/cbv-1623-22-10-28-home-fluminense.csv
      data/cbv-1618-2022-11-01-guest-energis8sãocaetano.csv
      data/cbv-1618-2022-11-01-home-esporteclubepinheiros.csv
      data/cbv-1619-2022-11-01-guest-abelmodavolei.csv
      data/cbv-1619-2022-11-01-home-gerdauminas.csv
      ...
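
The output file names above appear to follow the pattern <fed>-<match_id>-<date>-<home|guest>-<team>.csv. That pattern is inferred from the listed examples and is an assumption, not a documented format. Under that assumption, a hypothetical helper such as parse_output_name can split a name into its parts, for example to group the home and guest files of the same match. The date is kept as text because the examples show both two-digit and four-digit years.

```python
import re

# Pattern inferred from the example file names; treat it as an assumption.
FILENAME_RE = re.compile(
    r"^(?P<fed>[^-]+)-(?P<match_id>\d+)-(?P<date>\d{2,4}-\d{2}-\d{2})"
    r"-(?P<side>home|guest)-(?P<team>.+)\.csv$"
)


def parse_output_name(path):
    """Return the parts of an output file name as a dict, or None if no match."""
    # Keep only the file name, accepting both "/" and "\" separators.
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    match = FILENAME_RE.match(name)
    return match.groupdict() if match else None
```

Names that do not fit the pattern return None rather than raising, so unexpected files can be skipped.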
      

Help

volleystats --help

Log

volleystats --fed FED (--match MATCH | --comp COMP | --batch CSV_FILE_PATH) --log

Output messages

                    .
                    |`.
                    |  `.
                    |-_  `.
                    |  -_  `._
____________________|____-_ _|_______________,
',                         -_|                ',
  ',                         |                  ',
    ',                       |                    ',
      ',_____________________|______________________',

volleystats: started
volleystats: data/cbv-1623-22-10-28-home-fluminense.csv file was created
volleystats: data/cbv-1623-22-10-28-guest-baruerivolleyballclub.csv file was created
volleystats: finished
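
If the log output is captured by a script, the created file paths can be recovered from the "file was created" lines shown above. This is a minimal sketch that assumes the message format stays exactly as in the sample; created_files is a hypothetical helper, and whether the messages go to standard output or standard error is not stated here, so the function simply takes the captured text.

```python
import re

# Matches lines such as "volleystats: data/x.csv file was created".
CREATED_RE = re.compile(r"^volleystats: (?P<path>.+) file was created$")


def created_files(log_text):
    """Return the file paths reported as created, in log order."""
    paths = []
    for line in log_text.splitlines():
        match = CREATED_RE.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths
```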

Data Project Web Competition URLs structure

  • Hostname: <Fed_Acronym>-web.dataproject.com

  • Pathnames and search parameters:

    • /MainHome

    • /History?ID=<Fed_ID>

    • /CompetitionHome?ID=<Category_ID> (the category can be, for example, Women, Men, Pro or Youth)

    • /CompetitionMatches?ID=<Competition_ID>&PID=<PID> (the PID can identify, for example, the regular season or the playoffs)

    • /MatchStatistics?mID=<Match_ID>&ID=<Competition_ID>
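
Given the structure above, the values to pass to --fed, --match, --comp and --pid can be read directly from a page URL. The following sketch uses only Python's standard library; extract_identifiers is a hypothetical helper, and the ID values in the usage note are made up for illustration. It relies only on the host name and the query string, so it works whether or not the path carries an .aspx suffix.

```python
from urllib.parse import parse_qs, urlparse

HOST_SUFFIX = "-web.dataproject.com"


def extract_identifiers(url):
    """Return the federation acronym and any mID, ID and PID found in a URL."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if not host.endswith(HOST_SUFFIX):
        raise ValueError("not a Data Project Web Competition URL: " + url)

    # Everything before "-web.dataproject.com" is the federation acronym.
    ids = {"fed": host[: -len(HOST_SUFFIX)]}
    query = parse_qs(parts.query)
    # Parameter names are case-sensitive: mID and ID are different keys.
    for key in ("mID", "ID", "PID"):
        if key in query:
            ids[key] = query[key][0]
    return ids
```

For example, extract_identifiers("https://cbv-web.dataproject.com/CompetitionMatches?ID=18&PID=25") yields the acronym cbv, the competition ID 18 and the PID 25.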

Federations, Confederations and Leagues Acronyms

European Volleyball

South American Volleyball

Troubleshooting

Match files collected from batch file

In some cases the tool returns empty files, usually named <fed_acronym>-<match_id>-guest_stats.csv and <fed_acronym>-<match_id>-home_stats.csv. This happens when a match is hidden in the competition listing, either because it was canceled or because it was entered incorrectly. The match is hidden from view but still present in the HTML, so the tool requests it and writes an empty file. Such files can be ignored and deleted.

The data may also be available only as a PDF, in which case it cannot be scraped.
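
After a batch run, the empty files described above can be located with a short script before deciding whether to delete them. This is a sketch under stated assumptions: find_empty_csvs is a hypothetical helper, and "empty" is taken to mean a file with no lines at all or with a header line only, which may not match every case.

```python
from pathlib import Path


def find_empty_csvs(directory):
    """Return the CSV files in a directory that contain no data rows."""
    empty = []
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            lines = [line for line in handle if line.strip()]
        # Zero non-blank lines, or one line assumed to be the header.
        if len(lines) <= 1:
            empty.append(path)
    return empty
```

The function only reports the files. Reviewing the list before calling path.unlink() on each entry keeps a legitimate file from being removed by mistake.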

Development

$ git clone git@github.com:claromes/volleystats.git

$ cd volleystats

$ pip install -r requirements.txt

$ pip install --editable .

Author

Claromes
