Linguistic Pattern Lab using spaCy

Project description

LingPatLab: Linguistic Pattern Laboratory

Overview

LingPatLab is an API for Natural Language Processing (NLP) tasks, built on the spaCy library. It converts raw text into structured, analyzable forms and is aimed at developers, researchers, and linguists who need processing steps ranging from tokenization to text summarization.

Features

  • Tokenization: Splits raw text into individual tokens.
  • Parsing: Analyzes tokens to construct sentences with detailed linguistic annotations.
  • Phrase Extraction: Identifies and extracts significant phrases from sentences.
  • Text Summarization: Produces concise summaries of input text, optionally leveraging extracted phrases.

Usage

To get started with LingPatLab, you can set up the API as follows:

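# Assumed import path: the original snippet imported SpacyCoreAPI from spacy_core.api
# but instantiated LingPatLab, so confirm the exact path against the installed package.
from lingpatlab import LingPatLab

api = LingPatLab()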

Tokenization and Parsing

To tokenize and parse input text into structured sentences:

parsed_sentence: Sentence = api.parse_input_text("Your input text here.")
print(parsed_sentence.to_string())

Phrase Extraction

To extract phrases from a structured Sentences object:

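from typing import List

# parsed_sentences is assumed to be a Sentences object produced by an earlier parsing step.
phrases: List[str] = api.extract_topics(parsed_sentences)
for phrase in phrases:
    print(phrase)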

Data Classes

LingPatLab uses several custom data classes to structure data at each stage of NLP processing:

  • Sentence: Represents a single sentence, containing a list of tokens (SpacyResult objects).
  • Sentences: Represents a collection of sentences, useful for processing paragraphs or multiple lines of text.
  • SpacyResult: Encapsulates the detailed analysis of a single token, including part of speech, dependency relations, and additional linguistic features.
  • OtherInfo: Contains additional information about a token, particularly in relation to its syntactic head.
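
The relationships among these classes can be sketched as follows. This is a minimal illustration, not the library's actual definitions: every field name (text, pos, dep, other, head_text, head_pos, tokens, sentences) and the body of to_string() are assumptions chosen to mirror the descriptions above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OtherInfo:
    """Hypothetical: information about a token's syntactic head."""
    head_text: str
    head_pos: str


@dataclass
class SpacyResult:
    """Hypothetical: the analysis of a single token."""
    text: str
    pos: str  # part of speech, e.g. "NOUN"
    dep: str  # dependency relation, e.g. "nsubj"
    other: Optional[OtherInfo] = None


@dataclass
class Sentence:
    """Hypothetical: one sentence as a list of analyzed tokens."""
    tokens: List[SpacyResult] = field(default_factory=list)

    def to_string(self) -> str:
        # Assumed behavior: join token texts with single spaces.
        return " ".join(token.text for token in self.tokens)


@dataclass
class Sentences:
    """Hypothetical: a collection of sentences, e.g. a paragraph."""
    sentences: List[Sentence] = field(default_factory=list)


# Build "Dogs bark" by hand to show how the pieces nest.
dogs = SpacyResult("Dogs", "NOUN", "nsubj", OtherInfo("bark", "VERB"))
bark = SpacyResult("bark", "VERB", "ROOT")
sentence = Sentence([dogs, bark])
paragraph = Sentences([sentence])
print(sentence.to_string())
```

In this sketch, a Sentences object owns Sentence objects, each Sentence owns SpacyResult tokens, and each token may carry an OtherInfo describing its head. The real classes may expose different attributes.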

Download files

Source Distribution

lingpatlab-0.2.11.tar.gz (347.0 kB)

Uploaded Source

Built Distribution

lingpatlab-0.2.11-py3-none-any.whl (377.5 kB)

Uploaded Python 3

File details

Details for the file lingpatlab-0.2.11.tar.gz.

File metadata

  • Download URL: lingpatlab-0.2.11.tar.gz
  • Upload date:
  • Size: 347.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for lingpatlab-0.2.11.tar.gz
Algorithm    Hash digest
SHA256       dec187251c0d525b1b76d4ac95de2a31ec492a68fa43c086f7825f98bc93f924
MD5          25df450f7e71961cb07de627ad247c11
BLAKE2b-256  771de08bae74779b07a07dca8439f7f52638b8aea694159e340f368c2c8b5d26
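
A downloaded file can be checked against the SHA256 digest listed above. The sketch below uses only Python's standard hashlib module. The helper names sha256_of and matches_published_hash are illustrative, not part of LingPatLab or PyPI tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives do not load into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution into the current directory, call matches_published_hash(Path("lingpatlab-0.2.11.tar.gz"), "<SHA256 from the table>"). It should return True for an intact file.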

File details

Details for the file lingpatlab-0.2.11-py3-none-any.whl.

File metadata

  • Download URL: lingpatlab-0.2.11-py3-none-any.whl
  • Upload date:
  • Size: 377.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for lingpatlab-0.2.11-py3-none-any.whl
Algorithm    Hash digest
SHA256       aa37bc6c642cec68b3c59c3b7f8facadbdf0500cd5d76fd1608f0d1980559e08
MD5          dee0290be5098893ffd4582e86aefb3b
BLAKE2b-256  1dfe744dce3bdb15f4029c91bc9f82718ab587498148b5f80eb214268d0b72fc
