reddit-detective: Play detective on Reddit

pip install reddit_detective

reddit-detective represents Reddit as a graph stored in Neo4j.

It was created to help researchers, developers, and anyone curious about how Redditors behave.

It helps you to:

  • Detect political disinformation campaigns
  • Find trolls manipulating the discussion
  • Find secret influencers and idea spreaders (it might be you!)
  • Detect "cyborg-like" activities
    • "What's that?" Check reddit_detective/analytics/metrics.py for detailed information

Installation and Usage

  • Install Neo4j 4.1.0
  • Neo4j uses Cypher as its query language. Knowing Cypher greatly expands what you can do with reddit-detective, so it is worth learning the basics first
  • Install reddit-detective with pip install reddit_detective
    • Note: Version 0.1.2 is broken; any other version is fine

Code Samples

Creating a Reddit network graph

import praw
from neo4j import GraphDatabase

from reddit_detective import RedditNetwork, Comments
from reddit_detective.data_models import Redditor

# Create PRAW client instance
api = praw.Reddit(
    client_id="yourclientid",
    client_secret="yourclientsecret",
    user_agent="reddit-detective"
)

# Create driver instance
driver = GraphDatabase.driver(
    "url_of_database",
    auth=("your_username", "your_password")
)

# Create network graph
net = RedditNetwork(
    driver=driver,
    components=[
        # Other relationship types are Submissions and CommentsReplies
        # Other data models available as components are Subreddit and Submission
        Comments(Redditor(api, "BloodMooseSquirrel", limit=5)),
        Comments(Redditor(api, "Anub_Rekhan", limit=5)),
    ],
)
net.create_constraints()  # Optional; running it once is enough
net.run_cypher_code()
net.add_karma(api)  # Optional; adds karma as a property of nodes

Output (in Neo4j): [image: Result]

Finding interaction score

# Assuming a network graph is created and database is started

# Interaction score = A / (A + B)
# Where A is the number of comments received in user's submissions
# And B is the number of comments made by the user
from reddit_detective.analytics import metrics

score = metrics.interaction_score(driver, "Anub_Rekhan")
score_norm = metrics.interaction_score_normalized(driver, "Anub_Rekhan")
print("Interaction score for Anub_Rekhan:", score)
print("Normalized interaction score for Anub_Rekhan:", score_norm)

Output:

Interaction score for Anub_Rekhan: 0.375
Normalized interaction score for Anub_Rekhan: 0.057324840764331204
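
The ratio described in the comments above can be checked by hand. The following is a minimal sketch of the arithmetic only, not the library's implementation: interaction_ratio and its parameters are hypothetical names, the counts are made up for illustration (they are not the real counts for any user), and returning 0.0 when both counts are zero is an assumption. The normalized variant is left out because its formula is not stated here.

```python
def interaction_ratio(comments_received: int, comments_made: int) -> float:
    """Return A / (A + B).

    A = comments received on the user's submissions
    B = comments made by the user
    """
    total = comments_received + comments_made
    if total == 0:
        # Assumption: a user with no activity gets a score of 0.0
        return 0.0
    return comments_received / total


# Illustrative counts: 3 comments received, 5 comments made
example = interaction_ratio(comments_received=3, comments_made=5)
print(example)  # 0.375
```

A score near 1 means the user mostly receives comments on their submissions, while a score near 0 means the user mostly comments on other people's submissions.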

Finding cyborg score

# Assuming a network graph is created and database is started

# For a user, submission or subreddit, return the ratio of cyborg-like comments to all comments
# A cyborg-like comment is basically a comment posted within 6 seconds of the submission's creation
# Why 6? Can't the user be a fast typer? 
#   See reddit_detective/analytics/metrics.py for detailed information

from reddit_detective.analytics import metrics

score, comms = metrics.cyborg_score_user(driver, "Anub_Rekhan")
print("Cyborg score for Anub_Rekhan:", score)
print("List of Cyborg-like comments of Anub_Rekhan:", comms)

Output:

Cyborg score for Anub_Rekhan: 0.2
List of Cyborg-like comments of Anub_Rekhan: ['q3qm5mo']
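
To make the definition above concrete, here is a small sketch that computes the same kind of ratio from plain timestamps. It is not the code in reddit_detective/analytics/metrics.py: cyborg_ratio, the tuple layout, the sample IDs and timestamps are all hypothetical, and treating the 6-second boundary as inclusive is an assumption.

```python
from typing import List, Tuple

CYBORG_WINDOW_SECONDS = 6  # threshold described in the comments above

# Each item: (comment_id, comment_created_utc, submission_created_utc)
CommentTiming = Tuple[str, float, float]


def cyborg_ratio(comments: List[CommentTiming]) -> Tuple[float, List[str]]:
    """Return (ratio of cyborg-like comments, IDs of those comments)."""
    if not comments:
        return 0.0, []  # Assumption: no comments means a score of 0.0
    flagged = [
        comment_id
        for comment_id, comment_ts, submission_ts in comments
        # Assumption: the boundary (exactly 6 seconds) counts as cyborg-like
        if 0 <= comment_ts - submission_ts <= CYBORG_WINDOW_SECONDS
    ]
    return len(flagged) / len(comments), flagged


# Made-up sample: only the first comment is within 6 seconds
sample = [
    ("c1", 1_000_004.0, 1_000_000.0),  # 4 s after the submission
    ("c2", 1_000_090.0, 1_000_000.0),  # 90 s
    ("c3", 1_003_600.0, 1_000_000.0),  # 1 hour
    ("c4", 2_000_030.0, 2_000_000.0),  # 30 s
    ("c5", 2_000_500.0, 2_000_000.0),  # 500 s
]
score, ids = cyborg_ratio(sample)
print(score, ids)  # 0.2 ['c1']
```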

Running a Cypher statement

# Assuming a network graph is created and database is started

session = driver.session()
result = session.run("Some cypher code")
session.close()
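
As a concrete variant of the snippet above, the sketch below wraps a schema-independent query in a helper. count_nodes is a hypothetical name, not part of reddit-detective. The query only counts nodes, so it makes no assumption about the labels the library creates. It uses the session as a context manager, which the Neo4j Python driver supports, so the session is closed even if the query fails.

```python
def count_nodes(driver) -> int:
    """Return the total number of nodes in the database.

    `driver` is expected to be a neo4j driver such as the one
    created with GraphDatabase.driver(...) earlier.
    """
    query = "MATCH (n) RETURN count(n) AS node_count"
    # The with-block closes the session automatically
    with driver.session() as session:
        record = session.run(query).single()
        return record["node_count"]
```

With the driver from the first code sample, print(count_nodes(driver)) shows how many nodes the network graph contains.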

Upcoming features

  • UserToUser relationships
    • A relationship that links two users; its only property is the number of encounters between them
    • An encounter is defined as two users having ties with the same submission
  • A wrapper for the centrality metrics of Neo4j GDS (Graph Data Science library)
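
The encounter definition for the planned UserToUser relationship can be illustrated in plain Python. This feature is not released, so the sketch below is only one possible reading of the definition: count_encounters, the input layout (submission ID mapped to the users tied to it) and the sample names are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set


def count_encounters(ties: Dict[str, Set[str]]) -> Counter:
    """Count, for every pair of users, how many submissions they share."""
    encounters: Counter = Counter()
    for users in ties.values():
        # Sort so each pair always appears in the same (a, b) order
        for pair in combinations(sorted(users), 2):
            encounters[pair] += 1
    return encounters


# Made-up data: which users have ties with which submission
ties = {
    "submission_1": {"alice", "bob", "carol"},
    "submission_2": {"alice", "bob"},
}
encounters = count_encounters(ties)
print(encounters[("alice", "bob")])  # 2
```

Each count would then be the single property stored on the relationship between the two users.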

Inspirations

List of works/papers that inspired reddit-detective:

authors: [Sachin Thukral (TCS Research), Hardik Meisheri (TCS Research),
Arnab Chatterjee (TCS Research), Tushar Kataria (TCS Research),
Aman Agarwal (TCS Research), Lipika Dey (TCS Research),
Ishan Verma (TCS Research)]

title: Analyzing behavioral trends in community driven
discussion platforms like Reddit

published_in: 2018 IEEE/ACM International Conference on Advances in 
Social Networks Analysis and Mining (ASONAM)

DOI: 10.1109/ASONAM.2018.8508687
