PILA — PolyTrack Imitation Learning Agent
PILA (PolyTrack Imitation Learning Agent) is an imitation learning agent trained to play PolyTrack by learning directly from recorded human gameplay rather than from hard-coded rules.
Instead of manually programming behavior, PILA uses supervised learning to map game states to player actions, allowing it to imitate realistic human driving.
What Is PILA?
PILA learns how to play PolyTrack by:
- Observing gameplay data (states + actions)
- Training a neural network on this data
- Reproducing player behavior in real time inside the game
This approach eliminates hand-written logic and relies entirely on learning by example.
How It Works
Data Collection
- Gameplay is recorded as observations (the model's inputs) and actions (its target outputs)
- Outputs represent the player controls:
  - Steering
  - Throttle
  - Brake
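For illustration, each captured frame becomes one row pairing the frame image with the recorded control values. The column names below come from the customization guide later in this document; the filename format and the numbers themselves are made up:

```csv
frame,w_s,a_d,q_e,space,shift_ctrl,mouse_dx,mouse_dy,left_click,right_click
frame_000123.png,1.0,-0.5,0.0,0.0,1.0,0.02,-0.01,0.0,0.0
```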
Training
- A neural network is trained using imitation learning
- The model minimizes the loss between:
  - Predicted actions
  - Recorded player actions
- Trained models are saved as checkpoints (every 2 epochs by default) and as a complete final save for later use
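As a minimal sketch of that objective, assuming a plain mean-squared-error loss (the actual train.py also applies the per-frame attention weights described under the training settings below):

```python
import torch.nn.functional as F

def training_step(model, optimizer, frames, actions):
    """One imitation step: predict actions from frames, regress to the recording."""
    optimizer.zero_grad()
    pred = model(frames)              # predicted actions
    loss = F.mse_loss(pred, actions)  # gap to the recorded player actions
    loss.backward()
    optimizer.step()
    return loss.item()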
Inference / Playing
- The trained model reads live game frames
- It predicts the next actions using the current frame
- Actions are sent to the game as keyboard inputs
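A hedged sketch of that loop, using the dxcam capture mentioned later in this document and the keyboard library; here model is your loaded network, while preprocess() and apply_actions() are hypothetical stand-ins for play.py's frame preparation and key-mapping code:

```python
import dxcam
import keyboard
import torch

camera = dxcam.create()
camera.start()
model.eval()
with torch.no_grad():
    while not keyboard.is_pressed("esc"):  # "esc" is the emergency stop
        frame = camera.get_latest_frame()  # latest live game frame
        out = model(preprocess(frame))[0]  # one vector of predicted actions
        apply_actions(out)                 # translate into key presses
camera.stop()
```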
Requirements
- Modern CPU or GPU
- Python 3.11.9
How to use PILA
For macOS, visit the macOS branch.
For Windows and Linux, follow this guide⬇
Start by cloning the repo:

```bash
git clone https://github.com/tryfonaskam/pila.git
cd pila
```
After you have successfully cloned the repo, install the requirements so the code works correctly:

```bash
pip install -r requirements.txt
```

or install them manually. Note that all requirements must be installed. The installation is now complete.
Dataset creation and training
Start with this command ⬇for automatic recapture⬇

```bash
python3 loop_datamaker.py
```
After running the script, switch to your game window. The game window should be in windowed fullscreen for the capture region to be detected automatically; otherwise you NEED to set REGION yourself (None = full monitor, or an explicit (left, top, right, bottom) region).
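For example, near the top of loop_datamaker.py (exact placement is illustrative):

```python
REGION = None                  # None = capture the full monitor
# REGION = (0, 0, 1920, 1080)  # or an explicit (left, top, right, bottom)
```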
Press F1 to start capturing, then play the game and press F10 to stop and save.
NOTE 1. You need to play the game smoothly and correctly for the model to be good.
NOTE 2. You can simply press F1 to start capturing again.
The inputs that get captured are just Shift, Ctrl, w, a, s, d, q, e, and mouse movement.
> [!TIP]
> Increase the FPS for better capturing.

> [!CAUTION]
> This will greatly increase the size of each run.
Training
For this step you will need a modern GPU or CPU. To start training, run this command⬇

```bash
python3 train.py
```

This starts training the model on the datasets/ directory. A checkpoint is saved to the checkpoints/ directory every SAVE_EVERY = N epochs (the default is 2). You can stop training at any point and re-run the script to automatically resume.
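Resuming works by reloading the newest checkpoint before training continues. A hedged sketch of that pattern (the checkpoint file names and contents are assumptions, not the script's exact format):

```python
import glob
import os
import torch

SAVE_EVERY = 2  # epochs between checkpoint saves (the default)

# load the most recent checkpoint, if any, to resume training
ckpts = sorted(glob.glob("checkpoints/*.pt"), key=os.path.getmtime)
if ckpts:
    state = torch.load(ckpts[-1])          # newest checkpoint
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    start_epoch = state["epoch"] + 1       # continue where we stopped
else:
    start_epoch = 0                        # fresh run
```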
To see a graphical representation of the loss, mag_key, and mag_mouse, run this command⬇
For Linux⬇

```bash
tensorboard --bind_all --logdir runs/
```

For Windows, run the tensorboard.bat file or use the same command⬆
Training settings

```python
BASE_WEIGHT = 0.4     # minimum weight for zero frames
DYNAMIC_WEIGHT = 0.1  # how much nonzero frames add (0–1)
ATTN_SCALE = 0.1      # sensitivity
```
NOTE: a nonzero frame is a frame that has a nonzero value inside actions.csv.
- BASE_WEIGHT: the minimum importance of every frame. Even if a frame is all zeros, it still contributes 40% of a normal loss.
- DYNAMIC_WEIGHT: extra weight added when something happens. The more action, the more weight (up to a limit).
- ATTN_SCALE: controls how fast attention grows.
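A hedged sketch of how these three settings could combine into a per-frame loss weight; the exact formula lives in train.py and may differ:

```python
import numpy as np

def frame_weight(actions, base=0.4, dynamic=0.1, attn_scale=0.1):
    """Per-frame loss weight: zero frames keep `base`, busy frames earn extra."""
    mag = float(np.abs(np.asarray(actions, dtype=np.float32)).sum())
    attn = np.tanh(attn_scale * mag)  # grows with action magnitude, capped at 1
    return base + dynamic * attn
```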
Playing
To get the model to play the game, run this command⬇

```bash
python3 play.py
```

After that you have ~5 seconds to switch to your game window.
NOTE: the window needs to be the same size and resolution as in training.
Now the model will play the game.
> [!IMPORTANT]
> If anything goes wrong, press "esc" to stop the model.
Advantages
One big advantage is the user-friendly and robust workflow:
1. Automatic Data Organization: The data collection scripts automatically create and manage datasets in clearly labeled folders (run01, run02, etc.), keeping your experiments organized.
2. Resumable Training: You can stop and resume the training process at any time without losing progress, which is perfect for long training sessions and serious projects.
3. Integrated Visualization: With TensorBoard support, you can easily graph the model's learning progress to see how its performance improves over time.
4. Flexible and Configurable: Key parameters for data collection, training, and inference are clearly defined, making it easy to experiment and tune the agent's behavior.
5. Clear and Modular Codebase: The project is well-structured and serves as an excellent learning resource for anyone interested in imitation learning, computer vision, or game AI.
KEY ADVANTAGES
1. Reusable, Modular Design: the code's modular, flexible design allows it to be reused across different games and applications. This adaptability makes it easy to integrate into new projects or expand its functionality without rewriting the core logic.
2. End-to-End Imitation Learning: the project provides a complete, self-contained pipeline for building a game-playing AI. It handles everything from data collection and training to real-time inference, making it a comprehensive solution.
3. Learns by Example, Not by Rules: PILA learns directly from observing human gameplay. This "show, don't tell" approach is powerful because it requires no hard-coded logic or manual programming of behaviors, allowing it to learn complex and nuanced driving styles.
4. Real-Time Performance: The agent operates in real-time, using efficient screen capturing (dxcam) and a streamlined model to react to live gameplay without significant lag.
5. Intelligent Training with Attention: The training process uses a custom attention mechanism that gives more weight to frames with significant player actions. This helps the model focus on the most important moments of gameplay, leading to more efficient and effective learning.
Extra stuff
If you would like more or fewer inputs and outputs, you can change that.
loop_datamaker
Step 1. Add or remove keys inside loop_datamaker.py:
```python
# Control keys
KEY_W = 'w'
KEY_S = 's'
KEY_A = 'a'
KEY_D = 'd'
KEY_Q = 'q'
KEY_E = 'e'
KEY_SPACE = 'space'
KEY_SHIFT = 'shift'
KEY_CTRL = 'ctrl'

# Mouse buttons
MOUSE_LEFT = "left"
MOUSE_RIGHT = "right"
```
Step 2. Then update the code to use the new keys:

```python
# Keyboard controls
w_s = axis(KEY_S, KEY_W)
a_d = axis(KEY_A, KEY_D)
q_e = axis(KEY_Q, KEY_E)
space = 1.0 if keyboard.is_pressed(KEY_SPACE) else 0.0
shift_ctrl_val = shift_ctrl()

# Mouse controls
left_click = 1.0 if mouse.is_pressed(MOUSE_LEFT) else 0.0
right_click = 1.0 if mouse.is_pressed(MOUSE_RIGHT) else 0.0
```
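The snippet assumes two helpers defined elsewhere in loop_datamaker.py: axis() and shift_ctrl() (which presumably does the same for the Shift/Ctrl pair). A rough sketch of what axis() likely does, collapsing an opposing key pair into one value in [-1, 1] (the real implementation may differ):

```python
import keyboard  # same library the capture script uses

def axis(neg_key, pos_key):
    """Map an opposing key pair to a single value in [-1, 1]."""
    value = 0.0
    if keyboard.is_pressed(pos_key):
        value += 1.0
    if keyboard.is_pressed(neg_key):
        value -= 1.0
    return value
```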
Step 3. Update this part:

```python
# Save record
records.append([
    frame_name,
    w_s,
    a_d,
    q_e,
    space,
    shift_ctrl_val,
    mouse_dx_scaled,
    mouse_dy_scaled,
    left_click,
    right_click
])
```
Final step 4. Update this:

```python
# Save data after stopping
df = pd.DataFrame(records, columns=[
    "frame", "w_s", "a_d", "q_e", "space", "shift_ctrl",
    "mouse_dx", "mouse_dy", "left_click", "right_click"
])
```
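The column list here must match the record layout from step 3 exactly. For reference, a hedged sketch of how such a frame is typically written out per run (run_dir is an illustrative name; the script numbers folders run01, run02, ...):

```python
import os

# each capture run gets its own numbered dataset folder
df.to_csv(os.path.join(run_dir, "actions.csv"), index=False)
```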
train
Step 1. Update this part by adding or removing keys (it must match loop_datamaker.py):
```python
y = np.array([
    float(rows[i]["w_s"]),
    float(rows[i]["a_d"]),
    float(rows[i]["q_e"]),
    float(rows[i]["space"]),
    float(rows[i]["shift_ctrl"]),
    float(rows[i]["mouse_dx"]),
    float(rows[i]["mouse_dy"]),
    float(rows[i]["left_click"]),
    float(rows[i]["right_click"])
], dtype=np.float32)
```
Final step 2. Change the last number so it equals the number of outputs from step 1⬆:
```python
import torch.nn as nn

class ControlNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(12, 32, 5, stride=2),
            nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1))
        )
        self.fc = nn.Linear(128, 9)  # the last number is the number of outputs

    def forward(self, x):
        # pool to a 128-dim feature vector, then map to the action outputs
        return self.fc(self.cnn(x).flatten(1))
```
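A quick, illustrative sanity check that the output width matches your action vector (the input size here is arbitrary; any reasonably sized 12-channel frame works):

```python
import torch

net = ControlNet()
dummy = torch.randn(1, 12, 120, 160)  # batch of one 12-channel input
print(net(dummy).shape)               # expect torch.Size([1, 9])
```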
play
Step 1. Copy and paste the class from step 2⬆ into play.py:
```python
import torch.nn as nn

class ControlNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(12, 32, 5, stride=2),
            nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, 1))
        )
        self.fc = nn.Linear(128, 9)

    def forward(self, x):
        return self.fc(self.cnn(x).flatten(1))
```
Step 2. Update this line with your new inputs/outputs:

```python
w_s, a_d, q_e, space, shift_ctrl, mouse_dx, mouse_dy, left_click, right_click = out
```
step 3. update these accordingly
# W and S
if w_s > WASD_THRESH:
keyboard.press("w")
keyboard.release("s")
elif w_s < -WASD_THRESH:
keyboard.press("s")
keyboard.release("w")
else:
keyboard.release("w")
keyboard.release("s")
# A and D
if a_d > WASD_THRESH:
keyboard.press("d")
keyboard.release("a")
elif a_d < -WASD_THRESH:
keyboard.press("a")
keyboard.release("d")
else:
keyboard.release("a")
keyboard.release("d")
# Q and E
if q_e > QE_THRESH:
keyboard.press("e")
keyboard.release("q")
elif q_e < -QE_THRESH:
keyboard.press("q")
keyboard.release("e")
else:
keyboard.release("q")
keyboard.release("e")
# space and mouse clicks
if space > SPACE_THRESH:
keyboard.press("space")
else:
keyboard.release("space")
if left_click > CLICK_THRESH:
mouse.press(button='left')
mouse.release(button='right')
elif right_click < -CLICK_THRESH:
mouse.press(button='right')
mouse.release(button='left')
else:
mouse.release(button='left')
mouse.release(button='right')
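Step 3 assumes threshold constants defined near the top of play.py. Illustrative values only; check the script for the real defaults:

```python
WASD_THRESH = 0.5   # deadzone for the W/S and A/D axes
QE_THRESH = 0.5     # deadzone for the Q/E axis
SPACE_THRESH = 0.5  # minimum activation to hold space
CLICK_THRESH = 0.5  # minimum activation to hold a mouse button
```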
Final step. Update this:

```python
# Stop: release all keys
camera.stop()
keyboard.release("shift")
keyboard.release("ctrl")
keyboard.release("q")
keyboard.release("e")
keyboard.release("w")
keyboard.release("a")
keyboard.release("s")
keyboard.release("d")
keyboard.release("space")
mouse.release(button='left')
mouse.release(button='right')
```
Extra
You can run:

```bash
python3 cluster.py
```

to visualize patterns in the training data.
Credits
tryfonaskam - Project author and lead developer.
Designed and implemented the full imitation learning pipeline, including data capture, dataset organization, neural network architecture, training workflow, checkpointing, and real-time inference; also a contributor to PILA macOS. Responsible for model training, experimentation, documentation, and overall project execution.
sahusaurya - Project ideation and environment contribution.
Helped shape the initial project direction, identified PolyTrack as an appropriate environment for imitation learning, and contributed the custom-designed test track used for evaluation.
polytrack - the game used for training and playing the model.