Persine, the Persona Engine
Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface and encourages reproducible results. You tell Persine to drive around YouTube and it gives back a spreadsheet of what else YouTube suggests you watch!
Persine => Pers[ona Eng]ine
For example!
People have suggested that if you watch a few lightly political videos, YouTube starts suggesting more and more extreme content – but does it really?
The theory is difficult to test since it involves a lot of boring clicking and YouTube already knows what you usually watch. Persine to the rescue!
- Persine starts a new fresh-as-snow Chrome
- You provide a list of videos to watch and buttons to click (like, dislike, "next up", etc.)
- As it watches and clicks more and more, YouTube customizes and customizes
- When you're all done, Persine will save your winding path and the video/playlist/channel recommendations to nice neat CSV files.
Beyond analysis, these files can be used to repeat the experiment later and see whether recommendations change with time, location, user history, etc.
If you didn't quite get enough data, don't worry – you can resume your exploration later, picking up right where you left off. Since each "persona" is backed by a Chrome profile, all your cookies and history will be safely stored until your next run.
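To make the "repeat the experiment and compare" idea concrete, here is a minimal sketch that diffs two saved recommendation files. It uses only the standard library, and it assumes the CSV has a `url` column: that column name is a guess for illustration, not something Persine's output is confirmed to use, so adjust it to match your actual files. The two sample strings are made-up stand-ins for real `recs.csv` contents.

```python
import csv
import io


def load_values(csv_text, column="url"):
    """Return the set of non-empty values found in `column` of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader if row.get(column)}


def compare_runs(earlier_csv, later_csv, column="url"):
    """Report which recommendations were dropped, added, or kept between runs."""
    earlier = load_values(earlier_csv, column)
    later = load_values(later_csv, column)
    return {
        "dropped": sorted(earlier - later),
        "added": sorted(later - earlier),
        "kept": sorted(earlier & later),
    }


# Made-up samples standing in for two saved recommendation files.
# The "url" and "title" columns are hypothetical.
run_a = "url,title\nhttps://example.com/a,A\nhttps://example.com/b,B\n"
run_b = "url,title\nhttps://example.com/b,B\nhttps://example.com/c,C\n"

diff = compare_runs(run_a, run_b)
print(diff)
```

With real data you would read the two files from disk (for example with `open("recs.csv").read()`) and pass their contents to `compare_runs`.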
An actual example
See Persine in action on Google Colab. The notebook includes a few examples for analysis, too.
Installation
pip install persine
Persine will automatically install Selenium and BeautifulSoup for browsing/scraping, pandas for data analysis, and pillow for processing screenshots.
You will need to install chromedriver manually so that Selenium can control Chrome. See the chromedriver installation instructions for details.
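If you are not sure whether chromedriver is already installed, a quick check along these lines may help. This is a small sketch using Python's standard `shutil.which`; the helper name `find_chromedriver` is made up for this example and is not part of Persine. It only checks that an executable called `chromedriver` is on your `PATH`, not that its version matches your Chrome.

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path to the executable if it is on PATH, else None."""
    return shutil.which(name)


path = find_chromedriver()
if path is None:
    print("chromedriver not found on PATH; install it before running Persine.")
else:
    print(f"chromedriver found at {path}")
```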
Quickstart
In this example, we start a new session by visiting a YouTube video and clicking the "next up" video three times to see where it leads us. We then save the results for later analysis.
from persine import PersonaEngine

engine = PersonaEngine(headless=False)

with engine.persona() as persona:
    persona.run("https://www.youtube.com/watch?v=hZw23sWlyG0")
    persona.run("youtube:next_up#3")
    persona.history.to_csv("history.csv")
    persona.recommendations.to_csv("recs.csv")
We turn off headless mode because it's fun to watch!
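For longer experiments it can be handy to keep the videos and clicks in a plain list and feed them to `persona.run()` one at a time. The sketch below shows one way to do that. `run_steps` and `DryRunPersona` are hypothetical helpers written for this example, not part of Persine; the stand-in class just records commands so you can check your step list without opening Chrome. The only Persine behavior assumed is that a persona exposes `run(command)`, as in the quickstart.

```python
class DryRunPersona:
    """Hypothetical stand-in that records commands instead of driving Chrome."""

    def __init__(self):
        self.commands = []

    def run(self, command):
        self.commands.append(command)


def run_steps(persona, steps):
    """Send each step to persona.run() in the order given."""
    for step in steps:
        persona.run(step)


steps = [
    "https://www.youtube.com/watch?v=hZw23sWlyG0",  # starting video from the quickstart
    "youtube:next_up#3",  # click "next up" three times, as in the quickstart
]

dry_run = DryRunPersona()
run_steps(dry_run, steps)
print(dry_run.commands)
```

To run the same steps for real, call `run_steps(persona, steps)` inside the `with engine.persona() as persona:` block and save the history and recommendations afterwards.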
More examples, more features, more everything