Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface and encourages reproducible results.

Persine, the Persona Engine

Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface and encourages reproducible results. You tell Persine to drive around YouTube and it gives back a spreadsheet of what else YouTube suggests you watch!

Persine => Pers[ona Eng]ine

For example!

People have suggested that if you watch a few lightly political videos, YouTube starts suggesting more and more extreme content – but does it really?

The theory is difficult to test since it involves a lot of boring clicking and YouTube already knows what you usually watch. Persine to the rescue!

  1. Persine starts a new fresh-as-snow Chrome
  2. You provide a list of videos to watch and buttons to click (like, dislike, "next up", etc.)
  3. As it watches and clicks more and more, YouTube customizes its recommendations further and further
  4. When you're all done, Persine will save your winding path and the video/playlist/channel recommendations to nice neat CSV files.

Beyond analysis, these files can be used to repeat the experiment later, to see whether recommendations change with time, location, user history, etc.
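
As a rough sketch of that kind of comparison: assuming two runs each saved their recommendations to CSV, and that both files share a column identifying the recommended item, pandas can show which recommendations turned up in only one of the runs. The column name `url` and the helper `compare_runs` are guesses made up for illustration, not something Persine documents, so check the real CSV headers first.

```python
import pandas as pd


def compare_runs(first: pd.DataFrame, second: pd.DataFrame, key: str) -> pd.DataFrame:
    """Return one row per distinct `key` value, flagged by which run(s) contained it."""
    a = first[[key]].drop_duplicates()
    b = second[[key]].drop_duplicates()
    # An outer merge keeps items from both runs; indicator=True records their origin
    merged = a.merge(b, on=key, how="outer", indicator=True)
    labels = {"left_only": "first_only", "right_only": "second_only", "both": "both"}
    merged["seen_in"] = merged["_merge"].astype(str).map(labels)
    return merged.drop(columns="_merge")


# Made-up data standing in for pd.read_csv("recs.csv") from two separate runs.
# The "url" column is an assumption about the file layout.
run_1 = pd.DataFrame({"url": ["video-a", "video-b", "video-b"]})
run_2 = pd.DataFrame({"url": ["video-b", "video-c"]})

comparison = compare_runs(run_1, run_2, key="url")
print(comparison)
```

Passing the column name in as `key` keeps the sketch usable whatever the files actually call that column.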

If you didn't quite get enough data, don't worry – you can resume your exploration later, picking up right where you left off. Since each "persona" is based on Chrome profiles, all your cookies and history will be safely stored until your next run.
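
To give a feel for why Chrome profiles make resuming possible, here is a minimal sketch of the general Chrome mechanism, not of Persine's internals (which this page doesn't describe). Chrome keeps cookies and history in whatever folder its `--user-data-dir` switch points at, so reusing the same folder brings the same "person" back. The helper `profile_flag` and the persona name are hypothetical.

```python
from pathlib import Path
import tempfile


def profile_flag(base_dir: Path, persona_name: str) -> str:
    """Create (or reuse) a per-persona profile folder and return the Chrome flag for it."""
    profile_dir = base_dir / persona_name
    profile_dir.mkdir(parents=True, exist_ok=True)
    return f"--user-data-dir={profile_dir}"


# A temporary folder is used here only so the sketch is safe to run anywhere;
# a real experiment would want a folder that sticks around between runs.
base = Path(tempfile.mkdtemp())

flag_first_run = profile_flag(base, "politics-curious")
flag_second_run = profile_flag(base, "politics-curious")

# Same persona name, same folder, same stored cookies and history.
print(flag_first_run == flag_second_run)

# With Selenium, a flag like this is handed to Chrome through
# ChromeOptions().add_argument(flag_first_run).
```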

An actual example

See Persine in action on Google Colab. It includes a few analysis examples, too.

Installation

pip install persine

Persine will automatically install Selenium and BeautifulSoup for browsing/scraping, pandas for data analysis, and pillow for processing screenshots.

You will need to install chromedriver to allow Selenium to control Chrome. Persine won't work without it!

  • Installing chromedriver on OS X: I hear you can install it using homebrew, but I've never done it! You can also follow the link above and click the "latest stable release" link, then download chromedriver_mac64.zip. Unzip it, then move the chromedriver file into your PATH. I typically put it in /usr/local/bin.
  • Installing chromedriver on Windows: Follow the link above, click the "latest stable release" link. Download chromedriver_win32.zip, unzip it, and move chromedriver.exe into your PATH (in the spirit of anarchy I just put it in C:\Windows).
  • Installing chromedriver on Debian/Ubuntu: Just run apt install chromium-chromedriver and it'll work.
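
Whichever route you take, a quick way to confirm that chromedriver really ended up on your PATH is to ask Python's standard library to find it. This is just a sanity check, not part of Persine; the helper name `find_chromedriver` is made up for the example.

```python
import shutil


def find_chromedriver(name: str = "chromedriver"):
    """Return the full path of the executable if it is on PATH, otherwise None."""
    return shutil.which(name)


path = find_chromedriver()
if path is None:
    print("chromedriver was not found on your PATH")
else:
    print(f"chromedriver found at {path}")
```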

Quickstart

In this example, we start a new session by visiting a YouTube video and clicking the "next up" video three times to see where it leads us. We then save the results for later analysis.

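from persine import PersonaEngine

# headless=False opens a visible Chrome window
engine = PersonaEngine(headless=False)

with engine.persona() as persona:
    # Visit a starting video
    persona.run("https://www.youtube.com/watch?v=hZw23sWlyG0")
    # Click the "next up" video three times
    persona.run("youtube:next_up#3")
    # Save the path we took and what YouTube recommended along the way
    persona.history.to_csv("history.csv")
    persona.recommendations.to_csv("recs.csv")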

We turn off headless mode because it's fun to watch!
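
For longer experiments it can help to build the list of steps ahead of time, so the exact same script can be replayed later. The sketch below only produces strings in the two forms used in the quickstart (a video URL and `youtube:next_up#N`); the helper `build_steps` is hypothetical and not part of Persine.

```python
def build_steps(video_urls, next_up_clicks=3):
    """Return a flat list of steps: each video URL followed by a "next up" command."""
    steps = []
    for url in video_urls:
        steps.append(url)
        if next_up_clicks > 0:
            # Same command format as the quickstart, e.g. "youtube:next_up#3"
            steps.append(f"youtube:next_up#{next_up_clicks}")
    return steps


steps = build_steps(
    ["https://www.youtube.com/watch?v=hZw23sWlyG0"],
    next_up_clicks=3,
)
print(steps)

# Inside a session, each step would be passed to persona.run, as in the quickstart:
#
# with engine.persona() as persona:
#     for step in steps:
#         persona.run(step)
```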

Download files

Source Distribution

persine-0.1.2.tar.gz (2.8 MB)

Built Distribution

persine-0.1.2-py3-none-any.whl (2.8 MB)
