
A unified toolkit for advanced text processing and linguistic analysis. This package offers a comprehensive set of functions to clean, preprocess, and analyze text data, applying sophisticated linguistic techniques to enhance your text analytics workflow. Whether you're looking to clean raw text or extract detailed linguistic features, Linguistify provides the tools to transform and enrich your text data efficiently.


linguistify


linguistify provides tools that simplify text cleaning and analysis. It aims to:

  • Clean Text Data: Remove unwanted elements from text, such as punctuation, URLs, and special characters, and normalize text for consistent processing.
  • Extract Features: Extract meaningful features from text data, such as text length, stop-word counts, and part-of-speech counts, to support various text analysis tasks.
  • Preprocess Text: Prepare text data for further analysis or modeling by transforming it into a suitable format.
  • Analyze Text: Apply linguistic techniques to derive insights from text, such as key-term identification and sentiment analysis.
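
To make the cleaning goal concrete, here is a minimal standard-library sketch of that kind of normalization. It is not linguistify's implementation: the function name `sketch_clean_text`, the regular expressions, and the `url`, `mention`, and `hashtag` placeholder tokens are assumptions for illustration. It also skips stemming and emoticon handling, which the package's example output appears to apply.

```python
import re
import string

# Hypothetical patterns for this sketch; the package may use different rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")

# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def sketch_clean_text(text: str) -> str:
    """Replace URLs, mentions, and hashtags with tokens, then normalize."""
    text = URL_RE.sub(" url ", text)
    text = MENTION_RE.sub(r" mention \1 ", text)
    text = HASHTAG_RE.sub(r" hashtag \1 ", text)
    text = text.lower()
    text = text.translate(PUNCT_TABLE)
    # Collapse runs of whitespace into single spaces.
    return " ".join(text.split())


sample = (
    "Check out our new product launch at https://example.com! "
    "We are excited to share it with you. "
    "Follow us @CompanyName #ProductLaunch :)"
)
cleaned = sketch_clean_text(sample)
print(cleaned)
```

URLs are replaced before punctuation is stripped so that the link is not split into fragments such as `https` and `example`.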

Getting Started

Installation

linguistify is available on PyPI and can be installed with pip. Open your terminal and run the following command:

pip install linguistify

Usage

Here’s a basic example to get you started:

import pandas as pd
from linguistify.cleaning import clean_text
from linguistify.feature_extraction import add_features_to_dataframe

# Sample text data
data = {
    'text': [
        "Check out our new product launch at https://example.com! We are excited to share it with you. Follow us @CompanyName #ProductLaunch :)"
    ]
}

df = pd.DataFrame(data)

# Clean the text
df['cleaned_text'] = df['text'].apply(clean_text)

# Extract features
df_with_features = add_features_to_dataframe(df, 'cleaned_text')

Examples

Here is an example of a DataFrame processed by the linguistify package. It has a single row, and the feature columns are computed from the cleaned_text column, as in the usage example. The columns and their values are:

  • text: Check out our new product launch at https://example.com! We are excited to share it with you. Follow us @CompanyName #ProductLaunch :)
  • cleaned_text: check out our new product launch at url we are excit to share it with you follow us mention companynam hashtag productlaunch happi
  • length_of_text: 130
  • num_stop_words: 9
  • num_digits: 0
  • num_spaces: 22
  • num_exclamations: 0
  • num_questions: 0
  • num_periods: 0
  • num_adjectives: 4
  • num_nouns: 6
  • num_pronouns: 5
  • num_verbs: 4
  • num_adverbs: 0
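
The character-level columns in this example can be reproduced with plain string operations. The sketch below is an illustration under assumptions, not the package's code: `sketch_surface_features` is a hypothetical helper, and it covers only the simple counts. The stop-word and part-of-speech columns need a linguistic resource (a stop-word list and a tagger) and are left out.

```python
def sketch_surface_features(text: str) -> dict:
    """Count simple character-level features of a string."""
    return {
        "length_of_text": len(text),
        "num_digits": sum(ch.isdigit() for ch in text),
        "num_spaces": text.count(" "),
        "num_exclamations": text.count("!"),
        "num_questions": text.count("?"),
        "num_periods": text.count("."),
    }


# The cleaned_text value from the example row above.
cleaned_text = (
    "check out our new product launch at url we are excit to share it with you "
    "follow us mention companynam hashtag productlaunch happi"
)

features = sketch_surface_features(cleaned_text)
for name, value in features.items():
    print(f"{name}: {value}")
```

For this input the sketch gives a length of 130 and 22 spaces, matching the example row, which suggests those columns are measured on the cleaned text rather than on the raw text.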

API

Contribution

Contributions are welcome! If you notice a bug or have suggestions for improvements, please let us know.

Author

  • Main Maintainer: Jawaher Alghamdi
