An OpenAI gym environment for ad serving algorithms.

gym-adserver

gym-adserver is an OpenAI Gym environment for reinforcement learning-based online advertising algorithms. gym-adserver is now one of the official OpenAI environments.

The AdServer environment implements a typical multi-armed bandit scenario where an ad server agent must select the best advertisement (ad) to be displayed in a web page.

Each time an ad is selected, it is counted as one impression. A displayed ad can be clicked (reward = 1) or not (reward = 0), depending on the interest of the user. The agent must maximize the overall click-through rate (CTR), that is, the ratio of clicks to impressions.

OpenAI Environment Attributes

Attribute | Value | Notes
Action Space | Discrete(n) | n is the number of ads to choose from
Observation Space | Box(0, +inf, (2, n)) | Number of impressions and clicks for each ad
Actions | [0...n-1] | Index of the selected ad
Rewards | 0, 1 | 1 = clicked, 0 = not clicked
Render Modes | 'human' | Displays the agent's performance graphically
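As a rough illustration of the attributes above, the following sketch models the observation as a 2 x n structure (impressions and clicks per ad) and an action as an ad index. ToyAdServer, its constructor arguments, and its simplified step return value are hypothetical stand-ins written for this example; they are not the classes or signatures exposed by gym-adserver.

```python
import random


class ToyAdServer:
    """Hypothetical stand-in for the AdServer environment (not the package's class)."""

    def __init__(self, click_probabilities, seed=0):
        # One click probability per ad; hidden from the agent.
        self.click_probabilities = list(click_probabilities)
        self.num_ads = len(self.click_probabilities)
        self.rng = random.Random(seed)
        # Observation: row 0 = impressions per ad, row 1 = clicks per ad.
        self.observation = [[0] * self.num_ads, [0] * self.num_ads]

    def step(self, action):
        # Valid actions are the ad indices 0 .. num_ads - 1.
        if not 0 <= action < self.num_ads:
            raise ValueError(f"action must be in [0, {self.num_ads - 1}]")
        # Reward is 1 if the displayed ad is clicked, 0 otherwise.
        reward = 1 if self.rng.random() < self.click_probabilities[action] else 0
        self.observation[0][action] += 1
        self.observation[1][action] += reward
        return self.observation, reward


# A random agent: pick an ad index uniformly for 100 impressions.
env = ToyAdServer([0.1, 0.5, 0.9])
for _ in range(100):
    observation, reward = env.step(env.rng.randrange(env.num_ads))
impressions, clicks = observation
print("impressions:", impressions)
print("clicks:     ", clicks)
```

A real Gym environment returns more from step than this toy version does; the sketch keeps only the observation and the reward to mirror the table.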

Installation

You can download the source code and install the dependencies with:

git clone https://github.com/falox/gym-adserver
cd gym-adserver
pip install -e .

Alternatively, you can install gym-adserver as a pip package:

pip install gym-adserver

Basic Usage

You can test the environment by running one of the built-in agents:

python gym_adserver/agents/ucb1_agent.py --num_ads 10 --impressions 10000

Or compare multiple agents (defined in compare_agents.py):

python gym_adserver/wrappers/compare_agents.py --num_ads 10 --impressions 10000

The environment will generate 10 (num_ads) ads with different performance rates, and the agent, without prior knowledge, will learn to select the most performant ones. The simulation will last 10000 iterations (impressions).

A window will open and show the agent's performance and the environment's state:

Performance Dashboard

The overall CTR increases over time as the agent learns what the best actions are.

During initialization, the environment assigns each ad a "Probability" of being clicked. This probability is known only to the environment and is used to draw the rewards during the simulation. The "Actual CTR" is the CTR observed during the simulation: as the number of impressions grows, it approaches the probability.
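The relationship between the hidden probability and the observed CTR can be sketched with a few lines of standard-library Python. The function name simulate_actual_ctr, the probability of 0.3, and the sample sizes are assumptions chosen for illustration; the environment's own sampling code may differ.

```python
import random


def simulate_actual_ctr(probability, impressions, seed=0):
    """Draw one Bernoulli click per impression and return clicks / impressions."""
    rng = random.Random(seed)
    clicks = sum(1 for _ in range(impressions) if rng.random() < probability)
    return clicks / impressions


probability = 0.3  # hypothetical click probability of a single ad
for n in (100, 10_000, 100_000):
    actual_ctr = simulate_actual_ctr(probability, n)
    print(f"{n:>7} impressions: actual CTR = {actual_ctr:.4f}")
```

With few impressions the observed CTR can be far from the probability, which is why an agent has to keep exploring ads it has rarely shown.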

An effective agent will give most impressions to the most performant ads.

Built-in Agents

The gym_adserver/agents directory contains a collection of agents implementing different bandit strategies, such as UCB1 (ucb1_agent.py).

Each agent has different parameters to adjust and optimize its performance.

You can use the built-in agents as a starting point to implement your own algorithm.
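For readers writing their own agent, here is a minimal, self-contained sketch of the textbook UCB1 rule applied to simulated ads. It is not the code in ucb1_agent.py: the function names ucb1_select and run, the click probabilities, and the in-line Bernoulli simulation are all assumptions made for this example, and the environment is replaced by a plain loop so the block runs without gym-adserver installed.

```python
import math
import random


def ucb1_select(impressions, clicks):
    """Return the index of the ad to show next according to UCB1."""
    # Show every ad once before applying the confidence-bound formula.
    for ad, shown in enumerate(impressions):
        if shown == 0:
            return ad
    total = sum(impressions)
    # Score = observed CTR + exploration bonus that shrinks as an ad is shown more.
    scores = [
        clicks[ad] / impressions[ad]
        + math.sqrt(2 * math.log(total) / impressions[ad])
        for ad in range(len(impressions))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def run(click_probabilities, num_impressions, seed=0):
    """Simulate num_impressions rounds; rewards are Bernoulli draws per ad."""
    rng = random.Random(seed)
    num_ads = len(click_probabilities)
    impressions = [0] * num_ads
    clicks = [0] * num_ads
    for _ in range(num_impressions):
        ad = ucb1_select(impressions, clicks)
        reward = 1 if rng.random() < click_probabilities[ad] else 0
        impressions[ad] += 1
        clicks[ad] += reward
    return impressions, clicks


# Hypothetical click probabilities for three ads.
impressions, clicks = run([0.05, 0.1, 0.6], 10_000)
print("impressions per ad:", impressions)
print("overall CTR:", sum(clicks) / sum(impressions))
```

The exploration bonus is what lets the agent try rarely shown ads without prior knowledge, while the observed CTR term steers most impressions toward the ads that perform best.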

Unit Tests

You can run the unit tests for the environment with:

pytest -v

Next Steps

  • Extend AdServer with the concepts of budget and bid
  • Extend AdServer to change the ad performance over time (currently the CTR is constant)
  • Implement Q-learning agents
  • Implement a meta-agent that exploits multiple sub-agents with different algorithms
  • Implement epsilon-Greedy variants
