A Python library for detecting astroturfing (coordinated inauthentic behavior) in social media posts.
Astrodetection
Astrodetection is a Python library for detecting signs of astroturfing in lists of posts. So far it has mainly been used on posts from X, but it is not limited to that platform.
Installation
Pip
pip install "astrodetection[standard]"
or
pip install "astrodetection[light]"
Conda
Use the YAML file to configure the environment with conda:
conda create -n astrodetection_env
conda activate astrodetection_env
conda env update -f environment_standard.yml
Note: the environment_standard.yml configuration file uses the FAISS and fastText libraries for the VIGINUM D3LTA implementation.
If you have compatibility issues, prefer environment_light.yml and use the astrodetection_light module.
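One way to choose between the two variants is to check whether the optional dependencies named in the note can be imported in the current environment. The sketch below is not part of astrodetection: `pick_variant` and `OPTIONAL_DEPS` are hypothetical names, and the import names `faiss` and `fasttext` are assumptions about how those libraries are installed.

```python
from importlib.util import find_spec

# Assumed import names of the optional dependencies mentioned in the note.
OPTIONAL_DEPS = ("faiss", "fasttext")


def pick_variant(is_available=lambda name: find_spec(name) is not None):
    """Return ("standard", []) if every optional dependency is importable,
    otherwise ("light", [names of the missing dependencies])."""
    missing = [name for name in OPTIONAL_DEPS if not is_available(name)]
    return ("standard" if not missing else "light"), missing


variant, missing = pick_variant()
print(f"suggested variant: {variant}; missing: {missing or 'none'}")
```

If the result is "light", the environment_light.yml file and the astrodetection_light module are the likely fit.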
Usage
You can import the main functions directly:
from astrodetection import semantic_faiss, prepare_input_data, compute_bot_likelihood_metrics, create_network
Or call them through the package namespace:
import pandas as pd
import astrodetection

# Load a single JSON file into a DataFrame
file = "file_path"  # replace with the path to your JSON file of posts
df = pd.read_json(file)

# Preprocess the DataFrame: keep posts longer than 100 characters
# and drop posts from the 'grok' account
df = df[df['tweet'].str.len() > 100]
df = df[df['username'] != 'grok']
df.index = df.index.astype(str)  # string index for compatibility with d3lta

# Compute matches and scores
df_filtered, df_emb = astrodetection.prepare_input_data(df, embeddings=df['emb'])
matches, df_cluster = astrodetection.semantic_faiss(  # function taken from D3LTA
    df_filtered.rename(columns={'tweet': 'original'}),
    min_size_txt=0,
    df_embeddings_use=df_emb,
    threshold_grapheme=0.8,
    threshold_language=0.715,
    threshold_semantic=0.9,
)
scores = astrodetection.compute_bot_likelihood_metrics(df, matches=matches)

# Create a network
network = astrodetection.create_network(matches, df)
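To make the expected input shape concrete, here is a small pandas-only sketch that builds a toy DataFrame with the three columns the usage example reads (`tweet`, `username`, `emb`) and applies the same preprocessing filters. The records and the two-dimensional embeddings are invented for illustration; the embedding format the library actually expects is an assumption here.

```python
import pandas as pd

# Toy records; column names mirror the usage example, values are made up.
records = [
    {"username": "alice", "tweet": "Short post.", "emb": [0.1, 0.2]},
    {"username": "bob", "tweet": "x" * 120, "emb": [0.3, 0.4]},
    {"username": "grok", "tweet": "y" * 150, "emb": [0.5, 0.6]},
    {"username": "carol", "tweet": "z" * 101, "emb": [0.7, 0.8]},
]
df = pd.DataFrame(records)

# Same filters as the usage example: long posts only, 'grok' excluded.
long_enough = df["tweet"].str.len() > 100
not_excluded = df["username"] != "grok"
df = df[long_enough & not_excluded]

# Filtering keeps the original row labels; cast them to strings afterwards.
df.index = df.index.astype(str)

print(df[["username"]])
```

Only the rows for bob and carol survive, and they keep their original labels as strings.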
New changes
- The semantic_faiss function can now detect copypastas using Levenshtein distance only, ignoring embeddings, if "skip" is passed as the df_embeddings_use argument.
- The compute_bot_likelihood_metrics function can now take column names as arguments for more customization.
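For readers unfamiliar with Levenshtein-based matching, the idea behind an embedding-free copypasta check can be sketched in plain Python. This is not the library's or D3LTA's implementation: the function names are hypothetical, and normalizing by the longer string and reusing 0.8 (the threshold_grapheme value from the usage example) as the cutoff are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def is_copypasta(a: str, b: str, threshold: float = 0.8) -> bool:
    """Flag two posts as near-duplicates when their similarity reaches the threshold."""
    return similarity(a, b) >= threshold


print(is_copypasta("Vote for candidate X today!", "Vote for candidate Y today!"))
```

Two posts that differ by a single character score well above the cutoff, while unrelated texts fall far below it.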
File details
Details for the file astrodetection-0.1.6.tar.gz.
File metadata
- Download URL: astrodetection-0.1.6.tar.gz
- Upload date:
- Size: 36.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba36c5f2f9426d11cf04ea1c22a6908acde96b931ac2cabb87f4c5d1ec922272 |
| MD5 | 72b1ba94471a86ffc99a64278ac4bccd |
| BLAKE2b-256 | d3fb7006001c6533e366d5660f4983274c5dcf97d33dcfb353339011e8911918 |
File details
Details for the file astrodetection-0.1.6-py3-none-any.whl.
File metadata
- Download URL: astrodetection-0.1.6-py3-none-any.whl
- Upload date:
- Size: 38.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 21ff53e857f908c5587ba8460d12437766a15e4fe442d80f42bd713e77f6adfa |
| MD5 | f07324f161b931b3fa68a0d24fd59829 |
| BLAKE2b-256 | d4f98d5176256cf0b866bdc7aee2a467be3d272058a9ff9037173e8fb0abb46a |