A package to compute text features for news veracity.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

nela_features

NOTE: This code is for research purposes only!

NELA (News Landscape) Features are groups of hand-crafted, text-based features for news veracity detection. These features have been used in multiple news veracity studies, although they can also be applied to other text classification tasks.

Features

The features can be broken down into 6 groups:

Style - This feature group captures the style and structure of the article. It includes POS (part of speech) tags and simple linguistic features such as the number of quotes, punctuation marks, and all-capitalized words.
Complexity - This feature group captures how complex the writing in the article is. It includes lexical diversity (type-token ratio), multiple reading difficulty metrics, word length, and sentence length.
Bias - This feature group captures the overall bias and subjectivity in the writing. It is strongly based on the work of Recasens et al. [1] on detecting biased language. It includes the number of hedges, factives, assertives, implicatives, and opinion words.
Affect - This feature group captures sentiment and emotion used in the text. It includes positive and negative sentiment measures using VADER sentiment [3].
Moral - This feature group is based on Moral Foundation Theory [4] and lexicons used in [5].
Event - This feature group captures two concepts: time and location. It contains 3 features: the number of locations in the article, the number of dates or times in the article, and the number of time-related words in the article.
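
To make one of the Complexity features concrete, here is a minimal sketch of a type-token ratio (unique words divided by total words). The regex tokenizer and the function name type_token_ratio are assumptions made for illustration; the package's own tokenization and normalization may differ.

```python
import re


def type_token_ratio(text):
    """Return unique tokens / total tokens for a piece of text.

    Illustrative only: uses a simple lowercase regex tokenizer,
    which is not necessarily what nela_features does internally.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        # Avoid dividing by zero on empty or punctuation-only input.
        return 0.0
    return len(set(tokens)) / len(tokens)


ratio = type_token_ratio("The cat saw the dog")
```

A ratio close to 1 means almost every word is distinct, while a low ratio indicates heavy repetition.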

All features are normalized by the amount of text in a given news article. However, they may not all be on the same scale.
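
Because the features may sit on different scales, some models benefit from standardizing each feature column before training. The sketch below z-scores the columns of a small matrix of feature vectors using only the standard library. The helper name standardize_columns and the sample numbers are hypothetical and are not part of the package.

```python
from statistics import mean, pstdev


def standardize_columns(rows):
    """Z-score each column of a list of equal-length feature vectors.

    Columns with zero variance are mapped to 0.0 instead of dividing by zero.
    """
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    stds = [pstdev(column) for column in columns]
    return [
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical feature vectors for two articles (values are made up).
scaled = standardize_columns([[1.0, 10.0], [3.0, 10.0]])
```

In practice the column means and standard deviations should be computed on training data only and then reused for any held-out articles.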

Installation

The easiest way to install is using pip. This will install all Python dependencies and NLTK downloads needed.

pip install nela_features

You can also download the nela_features folder and manually import the package and install dependencies.
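
After installing, it can be useful to confirm which version ended up in the environment. The following sketch uses the standard library's importlib.metadata (Python 3.8+). The helper name installed_version is an assumption for illustration, and it only reports versions for pip-installed distributions, not for a manually copied folder.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("nela_features")
if version is None:
    print("nela_features is not installed; try: pip install nela_features")
else:
    print("nela_features version:", version)
```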

Example package use

Input: text as a string

Output: feature vector, names of features in vector, both as Python lists

from nela_features.nela_features import NELAFeatureExtractor

newsarticle = "Breaking News: Ireland Expected To Become World's First Country To Divest From Fossil Fuels ..." 

nela = NELAFeatureExtractor()

# Extract all feature groups at once
feature_vector, feature_names = nela.extract_all(newsarticle)

# Extract each feature group independently
feature_vector, feature_names = nela.extract_style(newsarticle) 
feature_vector, feature_names = nela.extract_complexity(newsarticle) 
feature_vector, feature_names = nela.extract_bias(newsarticle)
feature_vector, feature_names = nela.extract_affect(newsarticle) 
feature_vector, feature_names = nela.extract_moral(newsarticle) 
feature_vector, feature_names = nela.extract_event(newsarticle)
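
Since each extract call returns two parallel lists, it is often convenient to pair names with values. The sketch below shows one way to do that for a batch of articles. The helpers features_as_dict and extract_many are hypothetical and not part of the package; they accept any callable that returns a (feature_vector, feature_names) pair, such as the nela.extract_all method shown above.

```python
def features_as_dict(feature_vector, feature_names):
    """Pair each feature name with its value."""
    if len(feature_vector) != len(feature_names):
        raise ValueError("feature vector and feature names differ in length")
    return dict(zip(feature_names, feature_vector))


def extract_many(extract_fn, articles):
    """Apply an extractor to each article and return one dict per article.

    extract_fn must return a (feature_vector, feature_names) pair.
    """
    return [features_as_dict(*extract_fn(text)) for text in articles]


# Intended usage with the package (requires nela_features to be installed):
#   nela = NELAFeatureExtractor()
#   rows = extract_many(nela.extract_all, [article_one, article_two])
```

A list of dicts like this can be passed directly to tools that accept records, for example a CSV writer or a data frame constructor.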

What's different between old and new NELA features?

If you have used the old version of these features (https://github.com/BenjaminDHorne/Language-Features-for-News), you will notice a few changes:

1. The subjectivity classifier features (previously called NBsubj and NBobj) have been removed.
2. The event group of features has been added. The feature names have also been better normalized and grouped.
3. Previously these features were paired with LIWC 2007 Dictionary features. In this version they are not. If you are interested in including LIWC features, please contact Dr. James Pennebaker (pennebaker@utexas.edu) for a LIWC dictionary or purchase the latest version of LIWC: https://liwc.wpengine.com/.

Papers to cite when using

The updated features are described in:

@article{horne2019robust,
  title={Robust Fake News Detection Over Time and Attack},
  author={Horne, Benjamin D and N{\o}rregaard, Jeppe and Adali, Sibel},
  journal={ACM Transactions on Intelligent Systems and Technology (TIST)},
  volume={11},
  number={1},
  pages={1--23},
  year={2019},
  publisher={ACM New York, NY, USA}
}

The original features were released in:

@inproceedings{horne2018assessing,
  title={Assessing the news landscape: A multi-module toolkit for evaluating the credibility of news},
  author={Horne, Benjamin D and Dron, William and Khedr, Sara and Adali, Sibel},
  booktitle={Companion Proceedings of the The Web Conference 2018},
  pages={235--238},
  year={2018}
}

Please cite one of the papers if the features are used in a publication.

References

[1] Marta Recasens, Cristian Danescu-Niculescu-Mizil, and Dan Jurafsky. 2013. Linguistic models for analyzing and detecting biased language. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vol. 1. 1650–1659.

[3] Clayton J. Hutto and Eric Gilbert. 2014. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the 8th International AAAI Conference on Weblogs and Social Media.

[4] Jesse Graham, Jonathan Haidt, Sena Koleva, Matt Motyl, Ravi Iyer, Sean P. Wojcik, and Peter H. Ditto. 2013. Moral foundations theory: The pragmatic validity of moral pluralism. In Advances in Experimental Social Psychology. Vol. 47. Elsevier, 55–130.

[5] Ying Lin, Joe Hoover, Gwenyth Portillo-Wightman, Christina Park, Morteza Dehghani, and Heng Ji. 2018. Acquiring background knowledge to improve moral value prediction. In Proceedings of the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM’18). IEEE, 552–559.

Release history

3.0.1 (this version)

Oct 14, 2020

2.0.6

Jun 24, 2020

2.0.4

May 1, 2020

2.0.3

May 1, 2020

Download files

Source Distribution

nela_features-3.0.1.tar.gz (73.0 kB)

Uploaded Oct 14, 2020 Source

Built Distribution

nela_features-3.0.1-py3-none-any.whl (73.5 kB)

Uploaded Oct 14, 2020 Python 3

File details

Details for the file nela_features-3.0.1.tar.gz.

File metadata

Download URL: nela_features-3.0.1.tar.gz
Upload date: Oct 14, 2020
Size: 73.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.8.6

File hashes

Hashes for nela_features-3.0.1.tar.gz
Algorithm	Hash digest
SHA256	`0cca0700c588e0c4fd4a644297a3380f25a6d1dbf307135deff6723bb357d850`
MD5	`730ac6ed5b025c698e1fdf1b1e71e259`
BLAKE2b-256	`9d46c3ae8cdb900368c6d46f7529c1db4e561563b22d0528450710f44a92e951`

File details

Details for the file nela_features-3.0.1-py3-none-any.whl.

File metadata

Download URL: nela_features-3.0.1-py3-none-any.whl
Upload date: Oct 14, 2020
Size: 73.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.8.6

File hashes

Hashes for nela_features-3.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1dcc58c927c23dcedafdd04d047b4f4f469f1bde1c27252586cf5ec02298479f`
MD5	`babb27ca3570a7f58c77f7920d7bd551`
BLAKE2b-256	`7d86a7dc2a8370f87aa099c00c7f2898fe82c53fba04efe17764a1f0b99e2ed7`
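
The published digests can be used to check a downloaded file before installing it. Below is a minimal sketch that computes a file's SHA256 with the standard library's hashlib and compares it with the values listed above. The helper names and the assumption that the file sits in the current directory under its original name are illustrative only.

```python
import hashlib

# SHA256 digests copied from the file hash tables above.
PUBLISHED_SHA256 = {
    "nela_features-3.0.1.tar.gz":
        "0cca0700c588e0c4fd4a644297a3380f25a6d1dbf307135deff6723bb357d850",
    "nela_features-3.0.1-py3-none-any.whl":
        "1dcc58c927c23dcedafdd04d047b4f4f469f1bde1c27252586cf5ec02298479f",
}


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Intended usage, assuming the file was downloaded to the current directory:
#   name = "nela_features-3.0.1.tar.gz"
#   print(matches_published_hash(name, PUBLISHED_SHA256[name]))
```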
