Wowool NLP Toolkit

Install

Install the main SDK:

pip install wowool-sdk

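Install a language module, replacing [language] with the language you need (for example, wowool-english or wowool-dutch):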

pip install wowool-[language]
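
To confirm which of these packages are present in the current environment, a small standard-library check can help. This is a minimal sketch: it relies only on `importlib.metadata` and the distribution names from the commands above, and the helper name `installed_version` is hypothetical, not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names taken from the pip commands above.
for name in ("wowool-sdk", "wowool-english"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

A missing language module shows up as "not installed", which is a quick hint to run the matching pip command before building a pipeline.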

Quick Start

Create a document and a pipeline, pass the document through the pipeline, and you're done.

from wowool.sdk import Pipeline
from wowool.document import Document

document = Document("Mark Van Den Berg works at Omega Pharma.")
# Create an analyzer for a given language and options
process = Pipeline("english,entity")
# Process the data
document = process(document)
print(document)

API

Examples

To run the samples, first install the English language module:

pip install wowool-english

Create a pipeline.

This script demonstrates how to use the UUID component to create a pipeline.

from wowool.sdk import Pipeline
from wowool.common.pipeline import UUID

process = Pipeline(
    [
        UUID("english", options={"anaphora": False}),
        UUID("entity"),
        UUID("topics.app", {"count": 3}),
    ]
)
document = process("Mark Janssens works at Omega Pharma.")
print(document)

Custom domain

The script identifies the word "car" as a Vehicle entity in the sentence "I have a car." using custom domain rules and language processing.

For more info on how to write rules see: https://www.wowool.com/docs/nlp/matching-&-capturing

from wowool.sdk import Language, Domain
from wowool.document import Document

english = Language("english")
vehicle = Domain(source="rule:{ 'car'} = Vehicle;")
doc = vehicle(english(Document("I have a car.")))
for entity in doc.entities:
    print(entity)
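
When several literals need the same treatment, the rule source can be generated instead of typed by hand. The sketch below only reproduces the shape of the rule string shown above. The helper name `make_rule` and its validation are assumptions for illustration; quoting and escaping in the rule language are not covered, so check the rule documentation linked above before relying on it.

```python
def make_rule(literal: str, concept: str) -> str:
    """Format a one-literal rule in the same shape as the example above."""
    # Assumption: escaping rules are not handled here, so literals that
    # contain a single quote are rejected instead of guessed at.
    if "'" in literal:
        raise ValueError("literal must not contain a single quote")
    # Assumption: concept names are plain identifiers such as 'Vehicle'.
    if not concept.isidentifier():
        raise ValueError("concept must be a plain identifier")
    return f"rule:{{ '{literal}'}} = {concept};"


print(make_rule("car", "Vehicle"))  # prints: rule:{ 'car'} = Vehicle;
```

The returned string can be passed as the `source` argument of `Domain`, as in the example above.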

Using the language identifier

This script demonstrates how to use the LanguageIdentifier to detect the language of a document.

from wowool.sdk import LanguageIdentifier

document = """
Un été de tous les records de chaleur en France.
Record de chaleur battu dans une cinquantaine de villes en France

"""
# Initialize a language identification engine
lid = LanguageIdentifier()
# Process the data
doc = lid(document)
print(doc.language)

Extract Dutch entities

This script demonstrates how to perform basic entity analysis on a Dutch sentence using the Wowool SDK.

First install the Dutch language module:

pip install wowool-dutch

from wowool.sdk import Pipeline
from wowool.document import Document

entities = Pipeline("dutch,entity")
document = entities(Document("Mark Van Den Berg werkte als hoofdarts bij Omega Pharma."))
for sentence in document.sentences:
    for entity in sentence.entities:
        print(entity)
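
If a flat list is more convenient than nested loops, the same traversal can be wrapped in a helper. This sketch assumes only the two attributes used above (`document.sentences` and `sentence.entities`). The helper name `collect_entities` is hypothetical, and the stand-in objects exist solely so the snippet runs without the SDK or a license.

```python
from types import SimpleNamespace


def collect_entities(document) -> list:
    """Flatten the entities of every sentence into one list, in document order."""
    return [entity for sentence in document.sentences for entity in sentence.entities]


# Stand-in object with placeholder values, for illustration only;
# a real document is the result of calling a Pipeline.
fake_document = SimpleNamespace(
    sentences=[
        SimpleNamespace(entities=["entity-1", "entity-2"]),
        SimpleNamespace(entities=[]),
        SimpleNamespace(entities=["entity-3"]),
    ]
)
print(collect_entities(fake_document))
```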

Using the language identifier on a multi-language document

This script demonstrates how to use the LanguageIdentifier to detect the different language sections in a multi-language text document.

from wowool.sdk import LanguageIdentifier

document = """
La juventud no es más que un estado de ánimo.

Record de chaleur battu dans une cinquantaine de villes en France

"""
# Initialize a language identification engine
lid = LanguageIdentifier(sections=True, section_data=True)
# Process the data
doc = lid(document)
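if lid_results := doc.lid:
    for section in lid_results.sections:
        assert section.text
        # Build the preview outside the f-string: a backslash inside an
        # f-string expression is a syntax error before Python 3.12.
        preview = section.text[:20].strip("\n")
        print(f"({section.begin_offset},{section.end_offset}): language= {section.language} text={preview}...")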
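
Once the sections are available, a common next step is to gather the text per language. The sketch below uses a stand-in `Section` dataclass with the same attribute names the example reads (`begin_offset`, `end_offset`, `language`, `text`), so it runs without the SDK. The offsets, the language labels, and the helper name `group_by_language` are assumptions for illustration, not SDK output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Section:
    """Stand-in with the same attribute names the example above reads."""

    begin_offset: int
    end_offset: int
    language: str
    text: str


def group_by_language(sections) -> dict:
    """Map each language to the list of section texts identified as that language."""
    grouped = defaultdict(list)
    for section in sections:
        grouped[section.language].append(section.text.strip())
    return dict(grouped)


# Hypothetical values; real ones would come from doc.lid.sections.
sample = [
    Section(0, 46, "spanish", "La juventud no es más que un estado de ánimo.\n"),
    Section(47, 113, "french", "Record de chaleur battu dans une cinquantaine de villes en France\n"),
]
print(group_by_language(sample))
```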

License

For both non-commercial and commercial use, you need to acquire a license file at https://www.wowool.com

Non-Commercial

This library is licensed under the GNU AGPLv3 for non-commercial use.  
For commercial use, a separate license must be purchased.  

Commercial license terms

1. Grants the right to use this library in proprietary software.
2. Requires a valid license key.
3. Redistribution in SaaS requires a commercial license.
