Python bindings to the Chadwick library
pychadwick
A Python package to interface with the Chadwick library.
Chadwick is a set of tools for parsing Retrosheet data and is available at:
http://chadwick.sourceforge.net/doc/index.html
https://github.com/chadwickbureau/chadwick
Features
As of now this package supports retrosheet event data only.
Installation
$ pip install pychadwick
Example use
Python replacement for cwevent
When you install pychadwick, it also installs a Python executable, pycwevent, that mimics the cwevent executable from the Chadwick project. It reads a set of event files and prints them in CSV format to stdout.
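Since the output is plain CSV on stdout, it can be post-processed with Python's standard csv module. The following is a minimal sketch that counts events per game. It assumes the first row of the output is a header row containing a GAME_ID column (check this against your own output); the helper name count_events_per_game and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_events_per_game(stream, game_id_column="GAME_ID"):
    """Count CSV rows per game ID.

    Assumes the first row of `stream` is a header row naming the columns.
    """
    reader = csv.DictReader(stream)
    return Counter(row[game_id_column] for row in reader)


# Tiny hand-written sample standing in for real pycwevent output.
sample = io.StringIO(
    "GAME_ID,BAT_ID\n"
    "OAK198204060,burlr001\n"
    "OAK198204060,lynnf001\n"
    "OAK198204070,burlr001\n"
)
counts = count_events_per_game(sample)
print(counts.most_common())
```

With a real file, pass an open file handle instead of the in-memory sample, for example open("/tmp/events1.csv", newline="").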
This downloads a fresh copy of the Retrosheet event files and parses them with 7 CPUs:
$ time pycwevent -n 7 > /tmp/events1.csv
stderr: data_root not given as argument, downloading fresh copy of retrosheet events...
stderr: found 2254 files
Warning: Invalid integer value 'b'
real 3m14.517s
user 12m18.104s
sys 0m25.264s
$ wc -l /tmp/events1.csv
13976191 /tmp/events1.csv
This uses a pre-downloaded copy of the Retrosheet event files, again with 7 CPUs:
$ time pycwevent -n 7 --data-root /tmp/retrosheet-master/event/regular/ > /tmp/events2.csv
stderr: found 2254 files
Warning: Invalid integer value 'b'
real 1m57.499s
user 9m52.236s
sys 0m17.672s
$ wc -l /tmp/events2.csv
13976184 /tmp/events2.csv
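The time output from the two runs gives a rough view of how well the work is spread over the 7 workers: dividing CPU time (user + sys) by wall-clock time (real) estimates the average number of busy cores. Below is a small sketch of that arithmetic using the figures reported above; the helper names are illustrative only.

```python
import re


def to_seconds(duration):
    """Convert a `time`-style duration such as '3m14.517s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def average_busy_cores(real, user, sys_time):
    """Estimate the average number of busy cores as (user + sys) / real."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


# Figures copied from the two runs above.
with_download = average_busy_cores("3m14.517s", "12m18.104s", "0m25.264s")
pre_downloaded = average_busy_cores("1m57.499s", "9m52.236s", "0m17.672s")
print(f"with download:  {with_download:.2f}")
print(f"pre-downloaded: {pre_downloaded:.2f}")
```

This works out to roughly 3.9 busy cores for the run that downloads the data and roughly 5.2 for the pre-downloaded run. The download itself is counted in the first run's wall-clock time, which would lower its ratio.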
Python interface to cwevent
Load events
Load events for a game from a file stored on the web
>>> from pychadwick.chadwick import Chadwick
>>> chadwick = Chadwick()
>>> file_path = "https://raw.githubusercontent.com/chadwickbureau/retrosheet/master/event/regular/1982OAK.EVA"
>>> games = chadwick.games(file_path)
>>> game = next(games)
>>> df = chadwick.game_to_dataframe(game)
>>> df
GAME_ID AWAY_TEAM_ID INN_CT BAT_HOME_ID ... ASS9_FLD_CD ASS10_FLD_CD UNKNOWN_OUT_EXC_FL UNCERTAIN_PLAY_EXC_FL
0 OAK198204060 CAL 1 0 ... 0 0 F F
1 OAK198204060 CAL 1 0 ... 0 0 F F
2 OAK198204060 CAL 1 0 ... 0 0 F F
3 OAK198204060 CAL 1 1 ... 0 0 F F
4 OAK198204060 CAL 1 1 ... 0 0 F F
.. ... ... ... ... ... ... ... ... ...
81 OAK198204060 CAL 11 1 ... 0 0 F F
82 OAK198204060 CAL 11 1 ... 0 0 F F
83 OAK198204060 CAL 11 1 ... 0 0 F F
84 OAK198204060 CAL 11 1 ... 0 0 F F
85 OAK198204060 CAL 11 1 ... 0 0 F F
[86 rows x 159 columns]
Load events for a game from a local file
>>> file_path = "/tmp/retrosheet-master/event/regular/1982OAK.EVA"
>>> games = chadwick.games(file_path)
>>> game = next(games)
>>> df = chadwick.game_to_dataframe(game)
>>> df
GAME_ID AWAY_TEAM_ID INN_CT BAT_HOME_ID ... ASS9_FLD_CD ASS10_FLD_CD UNKNOWN_OUT_EXC_FL UNCERTAIN_PLAY_EXC_FL
0 OAK198204060 CAL 1 0 ... 0 0 F F
1 OAK198204060 CAL 1 0 ... 0 0 F F
2 OAK198204060 CAL 1 0 ... 0 0 F F
3 OAK198204060 CAL 1 1 ... 0 0 F F
4 OAK198204060 CAL 1 1 ... 0 0 F F
.. ... ... ... ... ... ... ... ... ...
81 OAK198204060 CAL 11 1 ... 0 0 F F
82 OAK198204060 CAL 11 1 ... 0 0 F F
83 OAK198204060 CAL 11 1 ... 0 0 F F
84 OAK198204060 CAL 11 1 ... 0 0 F F
85 OAK198204060 CAL 11 1 ... 0 0 F F
[86 rows x 159 columns]
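The value returned by chadwick.games is consumed with next above, so it can presumably be treated as an ordinary iterator. Under that assumption, here is a minimal sketch of converting the first few games of a file rather than only the first one. first_n_frames is a hypothetical helper, not part of pychadwick; it relies only on the games and game_to_dataframe calls shown above.

```python
from itertools import islice


def first_n_frames(chadwick, file_path, n):
    """Return DataFrames for the first `n` games in an event file.

    `chadwick` is expected to be a pychadwick `Chadwick` instance.
    """
    games = chadwick.games(file_path)
    # islice stops after n games, or earlier if the file has fewer games.
    return [chadwick.game_to_dataframe(game) for game in islice(games, n)]
```

For example, first_n_frames(chadwick, file_path, 3) would give a list of up to three DataFrames, which can then be combined with pandas.concat(frames, ignore_index=True).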
Check which columns are defined
>>> chadwick.all_headers
Check which columns are enabled
>>> chadwick.active_headers
Disable all columns, then enable only GAME_ID and BAT_ID
>>> _ = [chadwick.unset_event_field(e) for e in chadwick.all_headers]
>>> chadwick.active_headers
[]
>>> chadwick.set_event_field("GAME_ID")
>>> chadwick.set_event_field("BAT_ID")
>>> games = chadwick.games(file_path)
>>> game = next(games)
>>> df = chadwick.game_to_dataframe(game)
>>> df
GAME_ID BAT_ID
0 OAK198204060 burlr001
1 OAK198204060 lynnf001
2 OAK198204060 carer001
3 OAK198204060 hendr001
4 OAK198204060 murpd002
.. ... ...
81 OAK198204060 meyed001
82 OAK198204060 armat001
83 OAK198204060 grosw001
84 OAK198204060 spenj101
85 OAK198204060 loped001
[86 rows x 2 columns]
Activate all the columns again
>>> chadwick.set_all_headers()
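The disable, select, and re-enable steps above can be bundled so that the columns are always reactivated, even if parsing raises an error. The following is a sketch only: only_fields is a hypothetical helper, not part of pychadwick, and it assumes all_headers is an iterable of column names, as the list comprehension above suggests.

```python
from contextlib import contextmanager


@contextmanager
def only_fields(chadwick, fields):
    """Temporarily enable only `fields`, then re-enable every column."""
    unknown = [f for f in fields if f not in chadwick.all_headers]
    if unknown:
        raise ValueError(f"unknown event fields: {unknown}")
    for header in chadwick.all_headers:
        chadwick.unset_event_field(header)
    for field in fields:
        chadwick.set_event_field(field)
    try:
        yield chadwick
    finally:
        # Restores the "all columns" state, not whatever was active before.
        chadwick.set_all_headers()
```

It could be used as: with only_fields(chadwick, ["GAME_ID", "BAT_ID"]): followed by the usual games and game_to_dataframe calls inside the block.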
Download files
Source Distribution
File details
Details for the file pychadwick-0.6.1.tar.gz.
File metadata
- Download URL: pychadwick-0.6.1.tar.gz
- Upload date:
- Size: 122.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | 82b89781cd4bc62eba224aa8ec5811047afa2c3f246320d0cb9a84e026e3707d
MD5 | a23f35bbc1624d5e528587e7dd7d4eed
BLAKE2b-256 | 357efeca51ce50cce7211282495427cba7b57b364af2e1f9ab547a052288a459