Persine, the Persona Engine

Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface and encourages reproducible results. You tell Persine to drive around YouTube and it gives back a spreadsheet of what else YouTube suggests you watch!

Persine => Pers[ona Eng]ine

For example!

People have suggested that if you watch a few lightly political videos, YouTube starts suggesting more and more extreme content – but does it really?

The theory is difficult to test since it involves a lot of boring clicking and YouTube already knows what you usually watch. Persine to the rescue!

  1. Persine starts a new, fresh-as-snow Chrome session
  2. You provide a list of videos to watch and buttons to click (like, dislike, "next up", etc.)
  3. As it watches and clicks more and more, YouTube customizes its recommendations further and further
  4. When you're all done, Persine saves your winding path and the video/playlist/channel recommendations to nice neat CSV files.

Beyond analysis, these files can be used to repeat the experiment later and see whether recommendations change by time, location, user history, etc.
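One way to check whether recommendations changed between two runs is to compare the saved CSV files directly. The following is a minimal sketch using only the standard library, not part of Persine itself. The function names and the column name `url` are assumptions for illustration; check the header of your own files and pass the column that identifies a recommended video.

```python
import csv


def load_column(path, column):
    """Return the set of non-empty values found in one column of a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[column] for row in csv.DictReader(f) if row.get(column)}


def compare_runs(earlier_path, later_path, column="url"):
    """Compare one column across two recommendation CSVs from separate runs.

    The default column name "url" is a hypothetical placeholder; the actual
    column names in Persine's output are not documented here.
    """
    earlier = load_column(earlier_path, column)
    later = load_column(later_path, column)
    return {
        "dropped": sorted(earlier - later),  # only in the earlier run
        "added": sorted(later - earlier),    # only in the later run
        "kept": sorted(earlier & later),     # present in both runs
    }
```

Calling `compare_runs("recs_january.csv", "recs_june.csv")` (hypothetical file names) would return three sorted lists showing which recommendations disappeared, appeared, or stayed the same.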

If you didn't quite get enough data, don't worry – you can resume your exploration later, picking up right where you left off. Since each "persona" is based on Chrome profiles, all your cookies and history will be safely stored until your next run.

An actual example

See Persine in action on Google Colab.

Includes a few examples for analysis, too.

Installation

pip install persine

Persine will automatically install Selenium and BeautifulSoup for browsing/scraping, pandas for data analysis, and pillow for processing screenshots.

You will need to install chromedriver manually so that Selenium can control Chrome. See details here.
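Before a first run, it can help to confirm that chromedriver is actually reachable. This is a small sketch using the standard library's `shutil.which`; it assumes chromedriver is expected on your `PATH`, which is a common convention but is not stated above. The helper name `find_chromedriver` is made up for this example.

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_chromedriver()
    if path:
        print(f"chromedriver found at {path}")
    else:
        print("chromedriver not found on PATH; install it before running Persine")
```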

Quickstart

In this example, we start a new session by visiting a YouTube video and clicking the "next up" video three times to see where it leads us. We then save the results for later analysis.

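from persine import PersonaEngine

# headless=False opens a visible Chrome window
engine = PersonaEngine(headless=False)

with engine.persona() as persona:
    # Visit the starting video
    persona.run("https://www.youtube.com/watch?v=hZw23sWlyG0")
    # Click the "next up" video three times
    persona.run("youtube:next_up#3")
    # Save the visited path and the collected recommendations
    persona.history.to_csv("history.csv")
    persona.recommendations.to_csv("recs.csv")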

We turn off headless mode because it's fun to watch!
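Once the run has finished, a quick look at the saved files shows what was collected. The sketch below is not part of Persine; it reads a CSV with the standard library and reports the column names and row count without assuming what the columns are called. The helper name `summarize_csv` is hypothetical.

```python
import csv
import os


def summarize_csv(path):
    """Return the column names and number of data rows in a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list
        return {"columns": reader.fieldnames or [], "row_count": len(rows)}


if __name__ == "__main__":
    # File names match the quickstart above; skip any that do not exist yet
    for name in ("history.csv", "recs.csv"):
        if os.path.exists(name):
            print(name, summarize_csv(name))
```

From there, the files can be loaded into pandas (which Persine installs) for deeper analysis.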

More examples, more features, more everything

Find the complete documentation here
