A universal scraping tool to acquire CS:GO demofiles from professional esports events provided by hltv.org
GoScrape 🐙: Universal hltv.org demofile scraper
GoScrape is a small open-source project I created to make it easy to bulk download demofiles for the FPS CS:GO from the popular CS:GO fan site hltv.org.
Installation in Python - PyPI release
GoScrape is on PyPI, so you can use pip to install it.
pip install goscrape
TL;DR
GoScrape consists of two main commands.
command | description |
---|---|
events | used in the first step to create a JSON lookup file containing structured information about CS:GO esports events in a given timeframe and, if specified, links to the associated matches and demofiles. |
fetch | builds on top of the events command and can be used to bulk download the demofiles listed in the JSON output of the events command; alternatively, a single event id can be specified to download only the demofiles for that event. |
Getting Started
Events 🎮
argument | datatype | description | notes | |
---|---|---|---|---|
STARTDATE | string | the date from which event data should be gathered | formatted as string 'YYYY-MM-DD' | required |
ENDDATE | string | the date up to which event data should be gathered | formatted as string 'YYYY-MM-DD' | required |
STORAGEPATH | string | the directory or filepath where the resulting JSON should be stored | | optional (default is cwd) |
MATCHES | boolean | whether match information and demofile URLs should be scraped as well | this flag is required if the resulting JSON file should be used for the fetch command | optional (True if present) |
EVENT TYPE | enum | which type of event data should be pulled (Online, Lan ...) | | optional (default is online) |
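As an illustration of the 'YYYY-MM-DD' date arguments, here is a minimal sketch of how such a date range could be parsed and sanity-checked before scraping. The helper name `parse_date_range` is hypothetical and is not part of GoScrape; it only assumes the date format stated in the table.

```python
from datetime import datetime


def parse_date_range(start, end):
    """Parse two 'YYYY-MM-DD' strings and check that start is not after end.

    Hypothetical helper for illustration only, not part of GoScrape.
    """
    start_date = datetime.strptime(start, "%Y-%m-%d").date()
    end_date = datetime.strptime(end, "%Y-%m-%d").date()
    if start_date > end_date:
        raise ValueError("STARTDATE must not be later than ENDDATE")
    return start_date, end_date


start_date, end_date = parse_date_range("2022-04-20", "2022-04-21")
span_in_days = (end_date - start_date).days
print(start_date, end_date, span_in_days)
```

A string that does not match the format, or a reversed range, raises a `ValueError`.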
The objects in the resulting JSON are keyed by their event id and look something like this:
{
  "6475": {
    "event_data": {
      "entity": "event",
      "event_id": "6475",
      "event_url": "https://www.hltv.org/events/6475/iem-dallas-2022-oceania-open-qualifier-2",
      "event_name_encoded": "iem-dallas-2022-oceania-open-qualifier-2",
      "event_name_full": "IEM Dallas 2022 Oceania Open Qualifier 2",
      "nr_of_teams": "8+",
      "prize": "Other",
      "event_type": "Online",
      "location": "Oceania (Online)",
      "event_start": "2022-04-20",
      "event_end": "2022-04-21"
    },
    "matches": [
      {
        "entity": "match",
        "teams": ["Paradox", "Aftershock"],
        "date_time": "2022-04-21 10:00:00",
        "match_url": "https://www.hltv.org//matches/2355881/paradox-vs-aftershock-iem-dallas-2022-oceania-open-qualifier-2",
        "demo_id": "71497",
        "demo_url": "https://www.hltv.org/download/demo/71497"
      }
    ]
  }
}
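If you want to inspect the lookup file yourself, the structure above can be read with Python's standard library. The following is a minimal sketch, assuming the layout shown in the example; the function names `load_lookup` and `collect_demo_urls` are hypothetical and not part of GoScrape.

```python
import json
from pathlib import Path


def load_lookup(path):
    """Read a lookup file written by the events command (path is up to you)."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def collect_demo_urls(lookup):
    """Map each event id to the demo URLs listed under its matches."""
    urls_by_event = {}
    for event_id, entry in lookup.items():
        # "matches" is assumed to be missing when match data was not scraped.
        matches = entry.get("matches", [])
        urls_by_event[event_id] = [
            match["demo_url"] for match in matches if match.get("demo_url")
        ]
    return urls_by_event


# Inline sample mirroring the example object, trimmed to the fields used here.
sample = {
    "6475": {
        "event_data": {"entity": "event", "event_id": "6475"},
        "matches": [
            {
                "entity": "match",
                "teams": ["Paradox", "Aftershock"],
                "demo_id": "71497",
                "demo_url": "https://www.hltv.org/download/demo/71497",
            }
        ],
    }
}

demo_urls = collect_demo_urls(sample)
print(demo_urls)
```

For a real file, pass the result of `load_lookup` to `collect_demo_urls` instead of the inline sample.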
Fetch 💾
argument | datatype | description | notes | |
---|---|---|---|---|
EVENT ID | string or int | the id of the event whose demofiles should be downloaded | LOOKUP FILE & EVENT ID are mutually exclusive; only one can be used | required |
LOOKUP FILE | string | the filepath of the lookup file generated by the events command that should be used for demo downloading | LOOKUP FILE & EVENT ID are mutually exclusive; only one can be used | required |
STORAGEPATH | string | the directory to which the demofiles should be written | | optional (default is cwd) |
MULTIPROCESSING | boolean | whether multiprocessing should be utilized to speed up downloading | | optional (True if present) |
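The "mutually exclusive, but one is required" rule for EVENT ID and LOOKUP FILE can be illustrated with Python's `argparse`. This is only a sketch of the constraint: the flag names (`--event-id`, `--lookup-file`, `--storage-path`, `--multiprocessing`) and the program name are hypothetical and do not describe GoScrape's actual command-line interface.

```python
import argparse


def build_parser():
    """Build an illustrative parser; all flag names here are hypothetical."""
    parser = argparse.ArgumentParser(prog="fetch-sketch")
    # Exactly one of the two sources must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--event-id")
    source.add_argument("--lookup-file")
    # Optional arguments, defaulting to the current working directory
    # and to single-process downloading.
    parser.add_argument("--storage-path", default=".")
    parser.add_argument("--multiprocessing", action="store_true")
    return parser


parser = build_parser()
args = parser.parse_args(["--event-id", "6475"])
print(args.event_id, args.lookup_file, args.storage_path, args.multiprocessing)
```

Passing both sources, or neither, makes `argparse` exit with an error message.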
Changelog
Version 0.1.1 (2022.04.29)
- Bug fixes for multiprocessed downloading
Version 0.1.0 (2022.04.24)
- Initial release
Contributing
Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request