Linguistic Pattern Lab using spaCy

LingPatLab: Linguistic Pattern Laboratory

Overview

LingPatLab is an API for Natural Language Processing (NLP) tasks, built on the spaCy library. It converts raw text into structured, analyzable forms, covering steps from tokenization and parsing through phrase extraction to text summarization. It is aimed at developers, researchers, and linguists who need these steps behind a single interface.

Features

  • Tokenization: Splits raw text into individual tokens.
  • Parsing: Analyzes tokens to construct sentences with detailed linguistic annotations.
  • Phrase Extraction: Identifies and extracts significant phrases from sentences.
  • Text Summarization: Produces concise summaries of input text, optionally leveraging extracted phrases.

Usage

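To get started with LingPatLab, create an API instance:

# NOTE: the original snippet imported SpacyCoreAPI from spacy_core.api but
# instantiated LingPatLab, so it could not run as written. The import path
# below is an assumption; check the installed package for the actual layout.
from lingpatlab import LingPatLab

api = LingPatLab()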

Tokenization and Parsing

To tokenize and parse input text into structured sentences:

parsed_sentence: Sentence = api.parse_input_text("Your input text here.")
print(parsed_sentence.to_string())

Phrase Extraction

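To extract phrases from a structured Sentences object:

from typing import List

# parsed_sentences must be a Sentences object produced by an earlier parsing step.
phrases: List[str] = api.extract_topics(parsed_sentences)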
for phrase in phrases:
    print(phrase)
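
As a rough illustration of what phrase extraction over tagged tokens can look like, the sketch below groups consecutive adjective and noun tokens into candidate phrases. It is not LingPatLab's implementation: the function name extract_phrases, the NOMINAL_TAGS set, and the (text, tag) tuple input are all hypothetical, and only the Python standard library is used.

```python
from typing import List, Tuple

# Hypothetical tag set: coarse part-of-speech tags that may appear inside a phrase.
NOMINAL_TAGS = {"ADJ", "NOUN", "PROPN"}


def extract_phrases(tagged: List[Tuple[str, str]]) -> List[str]:
    """Group consecutive ADJ/NOUN/PROPN tokens into candidate phrases."""
    phrases: List[str] = []
    run: List[Tuple[str, str]] = []
    # A sentinel non-nominal token flushes the final run.
    for text, pos in list(tagged) + [("", "EOS")]:
        if pos in NOMINAL_TAGS:
            run.append((text, pos))
            continue
        # Drop trailing adjectives so every phrase ends in a noun.
        while run and run[-1][1] == "ADJ":
            run.pop()
        if run:
            phrases.append(" ".join(token for token, _ in run))
        run = []
    return phrases


tagged_tokens = [
    ("The", "DET"), ("quick", "ADJ"), ("brown", "ADJ"), ("fox", "NOUN"),
    ("jumps", "VERB"), ("over", "ADP"), ("the", "DET"),
    ("lazy", "ADJ"), ("dog", "NOUN"), (".", "PUNCT"),
]
result = extract_phrases(tagged_tokens)
print(result)
```

A real pipeline would take the tags from a parser rather than a hand-written list; the grouping rule here is deliberately simple and is shown only to make the idea concrete.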

Data Classes

LingPatLab utilizes several custom data classes to structure the data throughout the NLP process:

  • Sentence: Represents a single sentence, containing a list of tokens (SpacyResult objects).
  • Sentences: Represents a collection of sentences, useful for processing paragraphs or multiple lines of text.
  • SpacyResult: Encapsulates the detailed analysis of a single token, including part of speech, dependency relations, and additional linguistic features.
  • OtherInfo: Contains additional information about a token, particularly in relation to its syntactic head.
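
The relationships between these classes can be sketched with standard dataclasses. This is a minimal illustration under stated assumptions: every field name (text, pos, dep, other, head_text, head_pos, tokens, sentences) is hypothetical, and the body of to_string is a guess at its behavior. The actual LingPatLab classes may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OtherInfo:
    """Hypothetical fields describing a token's syntactic head."""
    head_text: str
    head_pos: str


@dataclass
class SpacyResult:
    """Hypothetical per-token analysis: surface form, POS tag, dependency label."""
    text: str
    pos: str
    dep: str
    other: Optional[OtherInfo] = None


@dataclass
class Sentence:
    """A single sentence holding its tokens in order."""
    tokens: List[SpacyResult] = field(default_factory=list)

    def to_string(self) -> str:
        # Assumed behavior: join token texts with single spaces.
        return " ".join(token.text for token in self.tokens)


@dataclass
class Sentences:
    """A collection of sentences, e.g. one paragraph."""
    sentences: List[Sentence] = field(default_factory=list)


sentence = Sentence(tokens=[
    SpacyResult("Dogs", "NOUN", "nsubj", OtherInfo("bark", "VERB")),
    SpacyResult("bark", "VERB", "ROOT"),
])
paragraph = Sentences(sentences=[sentence])
print(sentence.to_string())
```

Nesting OtherInfo inside SpacyResult mirrors the description above, where head-related details are kept separate from the token's own attributes.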

Download files

Source Distribution

lingpatlab-0.2.13.tar.gz (346.3 kB)

Uploaded Source

Built Distribution

lingpatlab-0.2.13-py3-none-any.whl (376.2 kB)

Uploaded Python 3

File details

Details for the file lingpatlab-0.2.13.tar.gz.

File metadata

  • Download URL: lingpatlab-0.2.13.tar.gz
  • Size: 346.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for lingpatlab-0.2.13.tar.gz
Algorithm    Hash digest
SHA256       3aa56eacaba3e119096700679430a20bfabe01639e3e8edcd48c6b1c90ad47b8
MD5          109d632de7bbb362ba0c6609ce598f0a
BLAKE2b-256  b40c00068f040bcfa9f09ff1547756d8ac676cdb281879375fe8e7110823529c

File details

Details for the file lingpatlab-0.2.13-py3-none-any.whl.

File metadata

  • Download URL: lingpatlab-0.2.13-py3-none-any.whl
  • Size: 376.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for lingpatlab-0.2.13-py3-none-any.whl
Algorithm    Hash digest
SHA256       5acac9ebaa0be461720451c98fa7da172c9081b810b1d7d31c6fdea24b5c00a2
MD5          4c09623ac777891b2a64bdec31b170b7
BLAKE2b-256  f5fa463f6ee8cd3dd5584a2b4c35db9d0501badca5785b7a0bf43622b269741e