PokerDF

Converts poker hand history files into structured Pandas DataFrames, making it easier to analyze your games.

Fast and reliable, PokerDF can convert 3,000 hand history files to .parquet per minute on a MacBook Air M2 with an 8-core CPU.

Currently supports PokerStars. Make sure hand histories are saved in English.

Introduction

Converting raw hand histories into structured data is the first step toward building a solid poker strategy and maximizing ROI. What are the optimal VPIP, PFR, and C-Bet frequencies for No Limit Hold'em 6-Max? In which specific situations is a 3-Bet most profitable? When is bluffing a clear mistake? Once your data is organized in a pandas DataFrame, you can explore questions like these directly and use the answers to fine-tune your decision-making.

In the processed DataFrame, each row corresponds to a specific player in a specific hand and contains all relevant information about that instance of the game. Below, you’ll find an example of a hand history before and after processing.

Before

PokerStars Hand #219372022626: Tournament #3026510091, $1.84+$0.16 USD Hold'em No Limit - Level I (10/20) - 2020/10/14 10:33:59 BRT [2020/10/14 9:33:59 ET]
Table '3026510091 1' 3-max Seat #1 is the button
Seat 1: VillainA (500 in chips) 
Seat 2: garciamurilo (500 in chips) 
Seat 3: VillainB (500 in chips) 
garciamurilo: posts small blind 10
VillainB: posts big blind 20
*** HOLE CARDS ***
Dealt to garciamurilo [6h Ks]
VillainB is disconnected 
VillainA: folds 
garciamurilo: calls 10
VillainB: checks 
*** FLOP *** [4d Qs Qd]
garciamurilo: checks 
VillainB: checks 
*** TURN *** [4d Qs Qd] [3s]
garciamurilo: checks 
VillainB: bets 20
garciamurilo: folds 
Uncalled bet (20) returned to VillainB
VillainB collected 40 from pot
VillainB: doesn't show hand 
*** SUMMARY ***
Total pot 40 | Rake 0 
Board [4d Qs Qd 3s]
Seat 1: VillainA (button) folded before Flop (didn't bet)
Seat 2: garciamurilo (small blind) folded on the Turn
Seat 3: VillainB (big blind) collected (40)

After

|   | Modality | TableSize | BuyIn | TournID | TableID | HandID | LocalTime | Level | Ante | Blinds | Owner | OwnersHand | Playing | Player | Seat | PostedAnte | Position | PostedBlind | Stack | PreflopAction | FlopAction | TurnAction | RiverAction | AnteAllIn | PreflopAllIn | FlopAllIn | TurnAllIn | RiverAllIn | BoardFlop | BoardTurn | BoardRiver | ShowDown | CardCombination | Result | Balance | FinalRank | Prize |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | USD Hold'em No Limit | 3 | $1.84+$0.16 | 3026510091 | 1 | 219372022626 | 2020-10-14 10:33:59 | I | None | [10.0, 20.0] | garciamurilo | ['6h', 'Ks'] | 3 | VillainA | 1 | None | button | nan | 500 | ['folds', ''] | ['', ''] | ['', ''] | ['', ''] | False | False | False | False | False | ['4d', 'Qs', 'Qd'] | ['4d', 'Qs', 'Qd', '3s'] | [] | [None, None] | None | folded | nan | -1 | None |
| 1 | USD Hold'em No Limit | 3 | $1.84+$0.16 | 3026510091 | 1 | 219372022626 | 2020-10-14 10:33:59 | I | None | [10.0, 20.0] | garciamurilo | ['6h', 'Ks'] | 3 | garciamurilo | 2 | None | small blind | 10 | 500 | ['calls', '10'] | ['checks', ''] | ['checks', ''], ['folds', ''] | ['', ''] | False | False | False | False | False | ['4d', 'Qs', 'Qd'] | ['4d', 'Qs', 'Qd', '3s'] | [] | [None, None] | None | folded | nan | -1 | None |
| 2 | USD Hold'em No Limit | 3 | $1.84+$0.16 | 3026510091 | 1 | 219372022626 | 2020-10-14 10:33:59 | I | None | [10.0, 20.0] | garciamurilo | ['6h', 'Ks'] | 3 | VillainB | 3 | None | big blind | 20 | 500 | ['checks', ''] | ['checks', ''] | ['bets', '20'] | ['', ''] | False | False | False | False | False | ['4d', 'Qs', 'Qd'] | ['4d', 'Qs', 'Qd', '3s'] | [] | [None, None] | None | non-sd win | 40 | -1 | None |
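
Because each row is one player in one hand, a common first step is to pick out the owner's own rows and to check that no player appears twice in the same hand. The snippet below is a minimal sketch, not part of PokerDF: the helper names owner_rows and has_unique_player_rows are hypothetical, and the small DataFrame only mirrors a few columns of the example.

```python
import pandas as pd

# Three rows mirroring the example hand, reduced to a few columns.
hand = pd.DataFrame(
    {
        "TournID": ["3026510091"] * 3,
        "HandID": ["219372022626"] * 3,
        "Owner": ["garciamurilo"] * 3,
        "Player": ["VillainA", "garciamurilo", "VillainB"],
        "Position": ["button", "small blind", "big blind"],
        "Result": ["folded", "folded", "non-sd win"],
    }
)


def owner_rows(df):
    """Keep only the rows where the player is the hand history owner."""
    return df[df["Player"] == df["Owner"]]


def has_unique_player_rows(df):
    """Check that each (TournID, HandID, Player) combination appears once."""
    return not df.duplicated(subset=["TournID", "HandID", "Player"]).any()


hero = owner_rows(hand)
print(hero[["Player", "Position", "Result"]])
print(has_unique_player_rows(hand))
```

Filtering on Player == Owner is handy whenever an analysis should cover only your own decisions rather than those of every player at the table.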

Data Modeling

For advanced analytics, you will need to transform the data and explore different data models. The final structure of your data may vary depending on the specific goals of your project.
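
As one illustration of such a transformation, the sketch below derives per-player VPIP and PFR frequencies from the PreflopAction column. It is not part of PokerDF. It assumes that an action cell is either a single [verb, amount] pair or a list of such pairs, and that the verbs are the ones seen in the example (calls, bets, raises, checks, folds). The names action_verbs and preflop_stats are hypothetical.

```python
import pandas as pd

# Preflop verbs that count as voluntarily putting money in the pot.
VOLUNTARY = {"calls", "bets", "raises"}


def action_verbs(actions):
    """Return the verbs from one action cell.

    Accepts a flat [verb, amount] pair or a list of such pairs, because
    the exact cell layout is an assumption in this sketch.
    """
    if actions is None or isinstance(actions, float):
        return []  # missing value (None or NaN)
    items = list(actions)
    if items and isinstance(items[0], str):
        items = [items]  # flat pair -> list holding a single pair
    return [pair[0] for pair in items if len(pair) > 0 and pair[0]]


def preflop_stats(df):
    """Share of hands per player with a voluntary bet (VPIP) or a raise (PFR)."""
    verbs = df["PreflopAction"].apply(action_verbs)
    flags = pd.DataFrame(
        {
            "Player": df["Player"],
            "vpip": verbs.apply(lambda v: any(x in VOLUNTARY for x in v)),
            "pfr": verbs.apply(lambda v: "raises" in v),
        }
    )
    return flags.groupby("Player")[["vpip", "pfr"]].mean()


# Made-up hands for two players, two hands each.
hands = pd.DataFrame(
    {
        "Player": ["A", "A", "B", "B"],
        "PreflopAction": [
            [["raises", "60"]],
            [["folds", ""]],
            [["calls", "20"]],
            [["checks", ""]],
        ],
    }
)
stats = preflop_stats(hands)
print(stats)
```

A big blind that only checks is not counted as voluntary, which matches the usual definition of VPIP.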

Installation

pip install pokerdf

Usage

First, navigate to the directory where you want to save the output:

cd output_directory

Then, run the package to convert all your hand history files:

pokerdf convert /path/to/handhistory/folder

After the process completes, you’ll see an output similar to the following:

output_directory/
└── output/
   └── 20250510-105423/
      ├── 20200607-T2928873630.parquet
      ├── 20200607-T2928880893.parquet
      ├── 20200607-T2928925240.parquet
      ├── 20200607-T2928950825.parquet
      ├── 20200607-T2928996127.parquet
      ├── 20200607-T2929005994.parquet
      ├── ...
      ├── fail.txt
      └── success.txt

Details

  1. Inside output you’ll find a subfolder named with the session ID, in this case, 20250510-105423, containing all .parquet files.
  2. Each hand history file is converted into a .parquet file with the exact same structure, allowing you to concatenate them seamlessly.
  3. Each .parquet file follows the naming convention {DATE_OF_TOURNAMENT}-T{TOURNAMENT_ID}.parquet.
  4. The file fail.txt provides detailed information about any files that failed to process. This file is only generated if there are failures.
  5. The file success.txt lists all successfully converted files.
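
Since every .parquet file shares the same structure, a whole session folder can be loaded into one DataFrame. The sketch below is not part of PokerDF: parse_output_name and load_session are hypothetical helpers, the date prefix is assumed to be in YYYYMMDD form (as the example file names suggest), and reading parquet with pandas requires an engine such as pyarrow to be installed.

```python
import re
from datetime import datetime
from pathlib import Path

import pandas as pd

# Matches names such as 20200607-T2928873630.parquet
NAME_PATTERN = re.compile(r"^(\d{8})-T(\d+)\.parquet$")


def parse_output_name(filename):
    """Split an output file name into (tournament date, tournament ID)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename}")
    # Assumption: the date prefix is formatted as YYYYMMDD.
    tournament_date = datetime.strptime(match.group(1), "%Y%m%d").date()
    return tournament_date, match.group(2)


def load_session(session_dir):
    """Concatenate every .parquet file of one session into one DataFrame."""
    files = sorted(Path(session_dir).glob("*.parquet"))
    if not files:
        return pd.DataFrame()  # nothing converted in this session
    frames = [pd.read_parquet(path) for path in files]
    return pd.concat(frames, ignore_index=True)


print(parse_output_name("20200607-T2928873630.parquet"))
# (datetime.date(2020, 6, 7), '2928873630')
```

For the example above, load_session("output/20250510-105423") would return all hands of that session in a single DataFrame.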

Incremental pipeline

You may want to build a pipeline to incrementally feed your table with new hand history data. In that case, you can import the convert_txt_to_tabular_data function and use it in your workflows. Refer to the docstrings and explore its usage within the package to better understand how it works.
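
One building block of such a pipeline is remembering which files have already been converted. The sketch below covers only that bookkeeping and deliberately does not call convert_txt_to_tabular_data, whose signature should be taken from its docstring. The helper names, the JSON manifest file, and the assumption that hand histories are .txt files are all hypothetical.

```python
import json
from pathlib import Path


def load_manifest(manifest_path):
    """Return the set of file names recorded as already processed."""
    path = Path(manifest_path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def find_unprocessed(input_dir, manifest_path):
    """List hand history files that are not yet in the manifest."""
    seen = load_manifest(manifest_path)
    # Assumption: hand histories are stored as .txt files.
    return sorted(p for p in Path(input_dir).glob("*.txt") if p.name not in seen)


def mark_processed(paths, manifest_path):
    """Add the given files to the manifest after a successful conversion."""
    seen = load_manifest(manifest_path)
    seen.update(Path(p).name for p in paths)
    Path(manifest_path).write_text(json.dumps(sorted(seen)), encoding="utf-8")
```

A scheduled job could call find_unprocessed, convert only the returned files, and then call mark_processed, so that each run handles new hand histories without reconverting old ones.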

Metadata

| Column | Description | Example | Data Type |
| --- | --- | --- | --- |
| Modality | The type of game being played | Hold'em No Limit | string |
| TableSize | Maximum number of players | 6 | int |
| BuyIn | The buy-in amount for the tournament | $4.60+$0.40 | string |
| TournID | Unique identifier for the tournament | 2928882649 | string |
| TableID | Unique identifier for the table inside a tournament | 10 | int |
| HandID | Unique identifier for the hand inside a tournament | 215024616736 | string |
| LocalTime | Local time when the hand was played | 2020-06-07 07:44:35 | datetime |
| Level | Level of the tournament | IV | string |
| Ante | Ante amount posted in the hand | 10.00 | float |
| Blinds | Small blind and big blind amounts | [10.0, 20.0] | list[float] |
| Owner | Owner of the hand history files | ownername | string |
| OwnersHand | Cards held by the owner in a specific hand | [9d, Js] | list[string] |
| Playing | Number of players active during the hand | 5 | int |
| Player | Player involved in the hand | playername | string |
| Seat | Seat number of the player | 3 | int |
| PostedAnte | Amount the player paid for the ante | 5.00 | float |
| PostedBlind | Amount the player paid for the blinds | 50.00 | float |
| Position | Player's position at the table | big blind | string |
| Stack | Current stack size of the player | 2500.00 | float |
| PreflopAction | Actions taken during the preflop stage | [[checks, ]] | list[list[string]] |
| FlopAction | Actions taken during the flop stage | [[bets, 840], [calls, 220]] | list[list[string]] |
| TurnAction | Actions taken during the turn stage | [[raises, 400], [calls, 500]] | list[list[string]] |
| RiverAction | Actions taken during the river stage | [[folds, ]] | list[list[string]] |
| AnteAllIn | Whether the player went all-in during the ante | True | bool |
| PreflopAllIn | Whether the player went all-in during preflop | False | bool |
| FlopAllIn | Whether the player went all-in during the flop | False | bool |
| TurnAllIn | Whether the player went all-in during the turn | False | bool |
| RiverAllIn | Whether the player went all-in during the river | False | bool |
| BoardFlop | Cards dealt on the flop | [4d, Qs, Ad] | list[string] |
| BoardTurn | Board after the turn (flop cards plus the turn card) | [4d, Qs, Ad, 7d] | list[string] |
| BoardRiver | Board after the river (flop, turn, and river cards) | [4d, Qs, Ad, 7d, 2d] | list[string] |
| ShowDown | Player's cards, if the player went to showdown | [Ah, Ac] | list[string] |
| CardCombination | Card combination held by the player | three of a kind, Aces | string |
| Result | Result of the hand (folded, lost, mucked, non-sd win, won) | won | string |
| Balance | Total value won in a hand | 9150.25 | float |
| FinalRank | Player's final ranking in the tournament | 1 | int |
| Prize | Prize won by the player, if any | 30000.00 | float |
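
Columns that hold lists, such as Blinds, usually need to be unpacked before they can be used in calculations. The sketch below is not part of PokerDF. It expresses each stack in big blinds, assuming that Blinds is ordered as [small blind, big blind], as the 10/20 level in the example hand suggests. The function name and the BigBlind and StackBB columns are hypothetical.

```python
import pandas as pd


def add_stack_in_big_blinds(df):
    """Return a copy of df with BigBlind and StackBB columns added."""
    out = df.copy()
    # Assumption: Blinds is ordered as [small blind, big blind].
    out["BigBlind"] = out["Blinds"].apply(lambda blinds: float(blinds[1]))
    out["StackBB"] = out["Stack"].astype(float) / out["BigBlind"]
    return out


# Values taken from the example hand: 500 chips at the 10/20 level.
rows = pd.DataFrame(
    {
        "Player": ["VillainA", "garciamurilo", "VillainB"],
        "Stack": [500, 500, 500],
        "Blinds": [[10.0, 20.0]] * 3,
    }
)
print(add_stack_in_big_blinds(rows)[["Player", "StackBB"]])
```

Normalizing stacks this way makes hands from different blind levels comparable.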

License

MIT License
