faKy

We introduce faKy, a feature extraction library for fake news research. It includes functions for computing readability scores, information complexity, VADER sentiment scores, named entity counts, and part-of-speech tags. With these functions, relevant features for fake news detection can be computed. Furthermore, we provide a Dunn's test helper that can be used to test for significant differences between multiple independent groups.

With faKy, we hope to contribute to more sophisticated and interpretable ML models and to a better understanding of fake news through the linguistic features that underlie such texts.

Installation

faKy can be installed with pip install faKy; NLTK and spaCy's en_core_web_md model are installed automatically as dependencies of the faKy library.
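A minimal post-install check, as a sketch: it only assumes the dependencies named above (the bundled spaCy model en_core_web_md) and the process_text_readability import shown in the example further down; the sample sentence is illustrative.

    # Sketch of a post-install check; assumes the dependencies listed above.
    import spacy
    from faKy.faKy import process_text_readability  # import path as shown in faKy 101 below

    nlp = spacy.load("en_core_web_md")  # model pulled in automatically by faKy
    print(process_text_readability("A short sentence to verify the pipeline runs."))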

faKy 101

The primary use case of faKy is computing features from text objects; the library can be applied to every text object in a DataFrame. See the example below.

Import the faKy library and the corresponding function: from faKy.faKy import process_text_readability

Apply process_text_readability to the DataFrame: dummy_df['readability'] = dummy_df['text-object'].apply(process_text_readability)
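Putting the two steps together, a minimal end-to-end sketch could look as follows. Only process_text_readability comes from faKy; the DataFrame, its contents, and the 'text-object' column name are illustrative and follow the example above.

    import pandas as pd
    from faKy.faKy import process_text_readability

    # Toy DataFrame standing in for your own data; the column name 'text-object'
    # follows the example above and is otherwise arbitrary.
    dummy_df = pd.DataFrame({
        "text-object": [
            "Scientists discover water on a distant exoplanet.",
            "You will not believe what this celebrity did next!",
        ]
    })

    # Compute the Flesch-Kincaid Reading Ease score for every text object.
    dummy_df["readability"] = dummy_df["text-object"].apply(process_text_readability)
    print(dummy_df[["text-object", "readability"]])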

faKy functionality

Function name: usage

readability_computation: Computes the Flesch-Kincaid Reading Ease score for a spaCy document using the Readability class. Returns the original document object.
process_text_readability: Takes a text string as input, processes it with spaCy's NLP pipeline, and computes the Flesch-Kincaid Reading Ease score. Returns the score.
compress_doc: Compresses the serialized form of a spaCy Doc object using gzip, calculates the compressed size, and sets it on the custom "compressed_size" attribute of the Doc object. Returns the Doc object.
process_text_complexity: Takes a text string as input, processes it with spaCy's custom NLP pipeline, and computes the compressed size. Returns the compressed size of the string in bits.
VADER_score: Takes a text input and calculates the sentiment scores using the VADER sentiment analysis tool. Returns a dictionary of sentiment scores.
process_text_vader: Takes a text input, applies the VADER sentiment analysis model, and returns the negative, neutral, positive, and compound sentiment scores as separate variables.
count_named_entities: Takes a text input, identifies named entities using spaCy, and returns the count of named entities in the text.
count_ner_labels: Takes a text input, identifies named entities using spaCy, and returns a dictionary of named entity label counts.
create_input_vector_NER: Takes a dictionary of named entity recognition (NER) label counts and creates an input vector with the count for each NER label. Returns the input vector.
count_pos: Counts the number of parts of speech (POS) in a given text. Returns a dictionary with the count of each POS.
create_input_vector_pos: Takes a dictionary of POS tag counts and creates an input vector of zeros. Returns the input vector.
values_by_label: Takes a DataFrame, a feature, a list of labels, and a label column name. Returns a list of lists containing the values of the feature for each label.
dunn_table: Takes a DataFrame of Dunn's test results and creates a new DataFrame with pairwise comparisons between groups. Returns the new DataFrame.
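As an illustration of a few of these functions, here is a hedged sketch; the return shapes follow the descriptions above, and the sample text is illustrative.

    from faKy.faKy import process_text_vader, count_named_entities

    text = "Apple reportedly plans to open a new office in Berlin next year."

    # Per the table above, process_text_vader returns the negative, neutral,
    # positive, and compound VADER scores as separate values.
    neg, neu, pos, compound = process_text_vader(text)

    # count_named_entities returns the number of named entities spaCy finds.
    n_entities = count_named_entities(text)

    print(f"VADER compound: {compound:.3f}, named entities: {n_entities}")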
