A Persian Twitter policy agenda tracking framework

A Framework for Tracking Legislators' Policy Agendas

This repository contains the implementation for the following paper:

Tracking Legislators’ Expressed Policy Agendas in Real Time

TODO
A Brief Summary of The Papers
Implementation details
Reproducing Results

1) Tracking Legislators’ Expressed Policy Agendas in Real Time

TO-DO:

- Summarizing the paper
- Outlining the details of implementations
- Implement Word2Vec
- Training Word2Vec
- Seed words
- Classification heads
- Results & Analysis
- Tests & Coverage
- Documentation
- CI/CD
- Smooth Installation

A Brief Summary of The Papers

Introduction:

This work analyses the positions of legislators on salient policy issues through their temporally granular tweets. It uses word embeddings for feature extraction and a classifier to label all of the legislators’ past and current relevant tweets according to whether they express a particular issue position over time.
Main Problem:

Is it possible to accurately analyse the temporal evolution of political orientation on salient issues by applying natural language processing techniques to users' tweets?

The issues of concern in this paper are immigration and climate change.
Illustrative Example:

Given a tweet about immigration policy, the method first encodes it using the Word2Vec-enhanced dictionary; a classifier then detects whether the tweet expresses an exclusive or an inclusive position. The results can further be disaggregated by whether the tweet was posted by a Republican or a Democrat.
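
The encoding step in this example can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's or this package's code: the dictionary terms, the tokenizer, and the function name `encode_tweet` are hypothetical, features are assumed to be simple counts of dictionary terms, and prefix matching stands in for stemming.

```python
import re

# Hypothetical dictionary of stemmed immigration terms (illustrative only).
DICTIONARY = ["border", "deport", "immigr", "refuge", "wall"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def encode_tweet(text, dictionary=DICTIONARY):
    """Count, for each dictionary term, the tokens that start with it.

    Prefix matching is a crude stand-in for proper stemming.
    """
    tokens = tokenize(text)
    return [sum(1 for tok in tokens if tok.startswith(term)) for term in dictionary]


features = encode_tweet("Deporting immigrants will not secure the border.")
print(features)
```

The resulting count vector is what a downstream classifier would consume in this sketch.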
I/O:
- Input: Tweets (textual modality)
- Output: Predicted stance on the salient political issue
Motivation:
1. Using tweets to track shifts in legislators’ rhetoric is highly scalable. It can be used on any topic of interest, by any political actor with a Twitter account, in any country around the world, from the past decade or into the future.
2. Twitter data has high temporal granularity.
Related (Previous) Works:

Previous work can be divided into 8 categories according to the communication channel legislators use:
1. Stump speeches: Fenno 1978
2. Campaign mail: Golbeck, Grimes and Rogers 2010
3. Television advertising: Lau, Sigelman and Rovner 2007
4. Floor speeches: Martin and Vanberg 2008; Martin 2011; Quinn et al. 2010
5. Press releases: Grimmer 2010; Grimmer, Westwood and Messing 2014; Klüver and Sagarzazu 2016
6. Websites: Adler, Gent and Overmeyer 1998; Anstead and Chadwick 2008; Druckman, Kifer and Parkin 2009
7. RSS feeds: Cormack 2013
8. Social media posts: Gulati and Williams 2010; Barbera et al. 2018; Radford and Sinclair 2016; Shapiro et al. 2014; Lilleker and Koc-Michalska 2013
Contributions of this paper:
1. A simple, transparent, and interpretable approach to tweet classification that can achieve satisfactory levels of accuracy across diverse issues.
2. Automation of the process of updating and maintaining the model.
3. A dynamic, scalable, real-time method for tracking elected officials’ expressed policy positions through their tweets.
Proposed Method:
- Stage I: (Feature Extraction)
  
  The texts are encoded with a Word2Vec-enhanced dictionary. First, a set of stemmed seed words relevant to the concept of interest is identified. Word embeddings are then used to find other words in the data that are semantically related to these seed words.
- Stage II: Classification of political stance on salient issues.
  
  Choice of classifier: XGBoost, Naive Bayes, Elastic Net, and Lasso are compared using five-fold cross-validation on precision, recall, accuracy, balanced accuracy, and F1 score, and the best-performing classifier is selected.
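
The seed-word expansion in Stage I can be sketched in plain Python. The embeddings below are made-up 3-dimensional toy vectors and `expand_seeds` is a hypothetical helper; in practice the vectors would come from a Word2Vec model trained on the tweet corpus, and the repository's actual code may differ.

```python
import math

# Toy embeddings; every word and value here is invented for illustration.
EMBEDDINGS = {
    "immigr": [0.9, 0.1, 0.0],
    "border": [0.8, 0.2, 0.1],
    "asylum": [0.7, 0.3, 0.0],
    "climat": [0.0, 0.9, 0.2],
    "carbon": [0.1, 0.8, 0.3],
    "today": [0.3, 0.3, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_seeds(seeds, embeddings, top_n):
    """Return the seeds plus the top_n non-seed words closest to any seed."""
    scores = {}
    for word, vec in embeddings.items():
        if word in seeds:
            continue
        scores[word] = max(cosine(vec, embeddings[s]) for s in seeds)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return list(seeds) + ranked[:top_n]


dictionary = expand_seeds(["immigr"], EMBEDDINGS, top_n=2)
print(dictionary)
```

With a trained gensim model, `KeyedVectors.most_similar` would play the role of the ranking step. Any manual pruning of overly general terms would happen after this expansion.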

Experiments:

Datasets:

The authors built their own dataset. They crawled the tweets of all senators and the vast majority of House members through the Twitter API, covering any period of interest up to 2020 and excluding legislators who left office or were elected for the first time.
Results:

The authors trained word embeddings on the entire corpus of legislators’ tweets. The Word2Vec dictionaries are limited to the 100 words most similar to the seed words, with overly general or irrelevant terms omitted. The detailed results provided in the appendix are summarised in the table below:

Dataset	Issue	Classification Method	F1-score	Recall	Precision	Accuracy	Balanced Accuracy
Crawled Legislators' Tweets	Immigration (Exclusive or Not)	Naive Bayes	0.885	0.853	0.921	0.813	0.738
		XGBoost	0.871	0.909	0.836	0.795	0.668
		Elastic Net	0.881	0.967	0.809	0.801	0.615
		Lasso	0.871	0.962	0.797	0.784	0.586
	Immigration (Inclusive or Not)	Naive Bayes	0.892	0.865	0.920	0.830	0.781
		XGBoost	0.888	0.916	0.861	0.828	0.746
		Elastic Net	0.890	0.978	0.817	0.821	0.674
		Lasso	0.894	0.974	0.826	0.828	0.691
	Climate Change (No Action or Not)	Naive Bayes	0.889	0.874	0.904	0.827	0.742
		XGBoost	0.888	0.896	0.880	0.818	0.698
		Elastic Net	0.891	0.963	0.830	0.811	0.575
		Lasso	0.892	0.965	0.830	0.813	0.576
	Climate Change (Take Action or Not)	Naive Bayes	0.687	0.742	0.640	0.758	0.746
		XGBoost	0.678	0.694	0.662	0.736	0.729
		Elastic Net	0.706	0.764	0.655	0.745	0.748
		Lasso	0.700	0.764	0.646	0.738	0.742
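
All five reported metrics follow from the binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are made up for illustration and are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Made-up counts for an imbalanced test set (80 positives, 20 negatives).
metrics = binary_metrics(tp=76, fp=12, fn=4, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When recall is high but specificity is low, accuracy stays well above balanced accuracy, which is consistent with the pattern in the Elastic Net and Lasso rows.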

Implementation details:

[Figure: Mermaid diagram of the implementation]

Reproducing Results for XGB

Dataset	Issue	Classification Method	F1-score	Recall	Precision	Accuracy	Balanced Accuracy
Crawled Persian Tweets	JCPOA (Relevant or Not)	Naive Bayes	0.845	0.901	0.792	0.843	0.839
		XGBoost	0.999	0.999	0.999	0.999	0.999
		Passive Aggressive	0.991	0.983	0.994	0.992	0.991
		Lasso	0.988	0.985	0.983	0.984	0.987
	Stock Market (Relevant or Not)	Naive Bayes	0.892	0.865	0.920	0.830	0.781
		XGBoost	0.999	0.999	1.000	0.999	0.999
		Elastic Net	0.890	0.978	0.817	0.821	0.674
		Lasso	0.894	0.974	0.826	0.828	0.691
	Vaccination (Relevant or Not)	Naive Bayes	0.870	0.92	0.82	0.855	0.883
		XGBoost	1.000	1.000	1.000	1.000	1.000
		Passive Aggressive	0.975	0.945	0.965	0.97	0.95
		Lasso	0.971	0.955	0.973	0.970	0.959
	Filtering (Relevant or Not)	Naive Bayes	0.687	0.742	0.640	0.758	0.746
		XGBoost	0.950	0.951	0.958	0.954	0.950
		Elastic Net	0.706	0.764	0.655	0.745	0.748
		Lasso	0.700	0.764	0.646	0.738	0.742
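
Reproducing numbers like these with the five-fold cross-validation named under Stage II requires a fold assignment. Below is a minimal sketch of stratified fold assignment in plain Python; the function name, the seed, and the labels are hypothetical, and the repository may do this differently (for example through a library).

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each example index to one of k folds, balancing labels."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a similar label mix.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


labels = [1] * 20 + [0] * 10  # made-up labels: 20 relevant, 10 not
folds = stratified_folds(labels)
for i, test_idx in enumerate(folds):
    # Train on every fold except the one held out for testing.
    train_idx = [j for f in folds if f is not test_idx for j in f]
    print(i, len(train_idx), len(test_idx))
```

Each classifier would then be fitted on `train_idx` and scored on `test_idx` for every fold, with the metrics averaged across folds.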

Release history

Version 1.0.0, released Mar 14, 2022

Download files

Source Distribution

tracking_policy_agendas-1.0.0.tar.gz (16.9 kB), uploaded Mar 14, 2022, Source

Built Distribution

tracking_policy_agendas-1.0.0-py3-none-any.whl (15.0 kB), uploaded Mar 14, 2022, Python 3

Hashes for tracking_policy_agendas-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`b1df183d94c998c6b9223e4f02fe846e65e3871b59ca35a9688a123899570788`
MD5	`19269ece39adea2ae5888599b1d0258c`
BLAKE2b-256	`9969aaa8d69e76e7a10b25b1f51ceb7662c8670bfbd0de2987a7b7a8fd5df74e`

Hashes for tracking_policy_agendas-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ae3e58eb0e2aa3250dfa3517c0c47dfe1be1b4451cc604fc8080b036c43d3b6e`
MD5	`f81cdf58c88d51b69e84cc0d2d4fb5fa`
BLAKE2b-256	`baed289c9b48387f261fce5146f4e734d739577882dec6ac0f139286f6fbcb71`
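
A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the standard library `hashlib`; the helper names and the local file path in the commented example are assumptions.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex value."""
    return sha256_of(path) == expected_hex.lower()


# Example (assumes the archive was downloaded to the current directory):
# matches("tracking_policy_agendas-1.0.0.tar.gz",
#         "b1df183d94c998c6b9223e4f02fe846e65e3871b59ca35a9688a123899570788")
```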
