An OpenAI gym environment for ad serving algorithms.

gym-adserver

gym-adserver is an OpenAI Gym environment for reinforcement learning-based online advertising algorithms. gym-adserver is now one of the official OpenAI environments.

The AdServer environment implements a typical multi-armed bandit scenario where an ad server agent must select the best advertisement (ad) to be displayed in a web page.

Each time an ad is selected, it is counted as one impression. A displayed ad can be clicked (reward = 1) or not (reward = 0), depending on the interest of the user. The agent must maximize the overall click-through rate (CTR), that is, the ratio of clicks to impressions.

OpenAI Environment Attributes

Attribute	Value	Notes
Action Space	Discrete(n)	n is the number of ads to choose from
Observation Space	Box(0, +inf, (2, n))	Number of impressions and clicks for each ad
Actions	[0...n-1]	Index of the selected ad
Rewards	0, 1	1 = clicked, 0 = not clicked
Render Modes	'human'	Displays the agent's performance graphically
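
To make the table concrete, the sketch below mimics the observation layout in plain Python. It is not the package's implementation: the `ToyAdServer` class, its `step` method, and the row order (impressions first, clicks second) are assumptions made for illustration only.

```python
import random


class ToyAdServer:
    """Hypothetical stand-in for the AdServer environment (not the real class)."""

    def __init__(self, click_probabilities, seed=0):
        self.click_probabilities = list(click_probabilities)
        self.rng = random.Random(seed)
        n = len(self.click_probabilities)
        # Observation with shape (2, n).
        # Assumed layout: row 0 = impressions, row 1 = clicks.
        self.observation = [[0] * n, [0] * n]

    def step(self, action):
        """Show ad `action` once and return (observation, reward)."""
        if not 0 <= action < len(self.click_probabilities):
            raise ValueError("action must be in [0, n-1]")
        # The reward is 1 if the simulated user clicks, 0 otherwise.
        reward = 1 if self.rng.random() < self.click_probabilities[action] else 0
        self.observation[0][action] += 1
        self.observation[1][action] += reward
        return self.observation, reward


env = ToyAdServer([0.1, 0.5, 0.9])
for t in range(300):
    observation, reward = env.step(t % 3)  # round-robin over the three ads

impressions, clicks = observation
print("impressions:", impressions)
print("clicks:", clicks)
```

Each column of the observation describes one ad, so an agent can derive a per-ad CTR estimate by dividing the clicks row by the impressions row.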

Installation

You can download the source code and install the dependencies with:

git clone https://github.com/falox/gym-adserver
cd gym-adserver
pip install -e .

Alternatively, you can install gym-adserver as a pip package:

pip install gym-adserver

Basic Usage

You can test the environment by running one of the built-in agents:

python gym_adserver/agents/ucb1_agent.py --num_ads 10 --impressions 10000

Or compare multiple agents (defined in compare_agents.py):

python gym_adserver/wrappers/compare_agents.py --num_ads 10 --impressions 10000

The environment will generate 10 (num_ads) ads with different performance rates, and the agent, without prior knowledge, will learn to select the best-performing ones. The simulation will last 10000 iterations (impressions).

A window will open and show the agent's performance and the environment's state:

(Figure: Performance Dashboard)

The overall CTR increases over time as the agent learns what the best actions are.

During initialization, the environment assigns each ad a "Probability" of being clicked. This probability is known only to the environment and is used to draw the rewards during the simulation. The "Actual CTR" is the CTR observed during the simulation; as the number of impressions grows, it approaches the probability.
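
The relationship between the hidden probability and the observed CTR can be illustrated with a short simulation. This is a sketch under assumptions, not code from the package: the `actual_ctr` helper, the probability 0.3, and the fixed seed are all made up for the example.

```python
import random


def actual_ctr(probability, impressions, seed=42):
    """Simulate `impressions` displays of one ad and return the observed CTR."""
    rng = random.Random(seed)
    # Each impression is an independent draw: a click happens with `probability`.
    clicks = sum(1 for _ in range(impressions) if rng.random() < probability)
    return clicks / impressions


# The observed CTR fluctuates with few impressions and settles as they grow.
for n in (100, 10_000, 100_000):
    print(n, round(actual_ctr(0.3, n), 4))
```

With only a few impressions the observed CTR can be far from the underlying probability, which is why an agent has to keep exploring ads it has rarely shown.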

An effective agent gives most of the impressions to the best-performing ads.

Built-in Agents

The gym_adserver/agents directory contains a collection of agents implementing different strategies, such as UCB1 (ucb1_agent.py).

Each agent has different parameters to adjust and optimize its performance.

You can use the built-in agents as a starting point to implement your own algorithm.
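
As a rough idea of what such an algorithm looks like, here is a minimal UCB1 sketch written against a simulated bandit rather than the real environment. It is not the code in ucb1_agent.py, which may differ: the function names `ucb1_select` and `run_ucb1` and the click probabilities are hypothetical.

```python
import math
import random


def ucb1_select(impressions, clicks):
    """Return the index of the ad to show next according to UCB1."""
    # Show every ad once before applying the confidence bound.
    for ad, shown in enumerate(impressions):
        if shown == 0:
            return ad
    total = sum(impressions)
    # Score = observed CTR + exploration bonus that shrinks with more impressions.
    scores = [
        clicks[ad] / impressions[ad]
        + math.sqrt(2 * math.log(total) / impressions[ad])
        for ad in range(len(impressions))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(click_probabilities, num_impressions, seed=0):
    """Run UCB1 on a simulated bandit with fixed click probabilities."""
    rng = random.Random(seed)
    n = len(click_probabilities)
    impressions, clicks = [0] * n, [0] * n
    for _ in range(num_impressions):
        ad = ucb1_select(impressions, clicks)
        reward = 1 if rng.random() < click_probabilities[ad] else 0
        impressions[ad] += 1
        clicks[ad] += reward
    return impressions, clicks


impressions, clicks = run_ucb1([0.05, 0.10, 0.30], num_impressions=10_000)
print("impressions per ad:", impressions)
print("overall CTR:", sum(clicks) / sum(impressions))
```

The exploration bonus shrinks as an ad accumulates impressions, so the agent gradually shifts traffic toward the ad with the highest observed CTR while still occasionally re-checking the others.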

Unit Tests

You can run the unit tests for the environment with:

pytest -v

Next Steps

Extend AdServer with the concepts of budget and bid
Extend AdServer to change the ad performance over time (currently the CTR is constant)
Implement Q-learning agents
Implement a meta-agent that exploits multiple sub-agents with different algorithms
Implement epsilon-Greedy variants

Project details

Release history

1.0.2 (Jan 30, 2021)
0.1.1 (Sep 13, 2020)

Download files

Source Distribution

gym_adserver-1.0.2.tar.gz (5.9 kB)

Uploaded Jan 30, 2021 Source

Built Distribution

gym_adserver-1.0.2-py3-none-any.whl (8.0 kB)

Uploaded Jan 30, 2021 Python 3

Hashes for gym_adserver-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`ef174548a7cc30e4f49c513c7db63e2f865294aa0e43fa92cb81b587804a06e7`
MD5	`759d3b2556ef2bb9446c494267c3fbdb`
BLAKE2b-256	`2cf987441d9ac3dc24933c6f4145a789e41dad4de46c6c19f649d3894c998bc6`

Hashes for gym_adserver-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c16ffc6240fb86b663a4ea436ebf95910cdc8710ab05868864e9445a01a4f8f2`
MD5	`bdf9630f921e63de743f6f94d50e9aab`
BLAKE2b-256	`64446fcf2382f5bb1f6e10b9cfbd0ab431a0460ec63f42646ad318487cecf422`