Skip to main content

Wowool NLP Toolkit

Project description

The Wowool NLP Toolkit

install

Install the main sdk.

pip install wowool-sdk

Installing languages.

pip install wowool-[language]

Quick Start

Just create a document and pipeline, pass your document trough the Pipeline, and your done.

from wowool.sdk import Pipeline
from wowool.document import Document

document = Document("Mark Van Den Berg works at Omega Pharma.")
# Create an analyzer for a given language and options
process = Pipeline("english,entity")
# Process the data
document = process(document)
print(document)

API

Examples

You will need to install the english language module to run the sample. pip install wowool-english

Using the language identifier

This script demonstrates how to use the LanguageIdentifier to detect the language of a document.

from wowool.sdk import LanguageIdentifier

document = """
Un été de tous les records de chaleur en France.
Record de chaleur battu dans une cinquantaine de villes en France

"""
# Initialize a language identification engine
lid = LanguageIdentifier()
# Process the data
doc = lid(document)
print(doc.language)

Custom domain

The script identifies the word "car" as a Vehicle entity in the sentence "I have a car." using custom domain rules and language processing.

For more info on how to write rules see: https://www.wowool.com/docs/nlp/matching-&-capturing

from wowool.sdk import Language, Domain
from wowool.document import Document

english = Language("english")
vehicle = Domain(source="rule:{ 'car'} = Vehicle;")
doc = vehicle(english(Document("I have a car.")))
for entity in doc.entities:
    print(entity)

Extract dutch entities

This script demonstrates how to perform basic entity analysis on a Dutch sentence using the Wowool SDK.

Install first the dutch language model pip install wowool-dutch

from wowool.sdk import Pipeline
from wowool.document import Document

entities = Pipeline("dutch,entity")
document = entities(Document("Mark Van Den Berg werkte als hoofdarts bij Omega Pharma."))
for sentence in document.sentences:
    for entity in sentence.entities:
        print(entity)

Using the language identifier

This script demonstrates how to use the LanguageIdentifier to detect the different language sections in a text multi-language document.

from wowool.sdk import LanguageIdentifier

document = """
La juventud no es más que un estado de ánimo.

Record de chaleur battu dans une cinquantaine de villes en France

"""
# Initialize a language identification engine
lid = LanguageIdentifier(sections=True, section_data=True)
# Process the data
doc = lid(document)
if lid_results := doc.lid:
    for section in doc.lid.sections:
        assert section.text
        print(f"({section.begin_offset},{section.end_offset}): language= {section.language} text={section.text[:20].strip('\n')}...")

License

In both cases you will need to acquirer a license file at https://www.wowool.com

Non-Commercial

This library is licensed under the GNU AGPLv3 for non-commercial use.  
For commercial use, a separate license must be purchased.  

Commercial license Terms

1. Grants the right to use this library in proprietary software.  
2. Requires a valid license key  
3. Redistribution in SaaS requires a commercial license.  

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wowool_sdk-3.5.2.dev3.tar.gz (70.7 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

wowool_sdk-3.5.2.dev3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.1 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

wowool_sdk-3.5.2.dev3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.1 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

wowool_sdk-3.5.2.dev3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (42.0 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

File details

Details for the file wowool_sdk-3.5.2.dev3.tar.gz.

File metadata

  • Download URL: wowool_sdk-3.5.2.dev3.tar.gz
  • Upload date:
  • Size: 70.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for wowool_sdk-3.5.2.dev3.tar.gz
Algorithm Hash digest
SHA256 9c3a43b54790d5f48173830ada2ba7b2516eba0752850f890569535d771183f7
MD5 aedfbdf669ae816c7c96477ab2493c31
BLAKE2b-256 25350d6574cd05b6d450d107884a524c03e927e00ce876a2fc72db9d8065af0d

See more details on using hashes here.

File details

Details for the file wowool_sdk-3.5.2.dev3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for wowool_sdk-3.5.2.dev3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4d25d38d24e3d8c9e1c5b437bcb8f4996c983830c6adcf9515402486de84bdd9
MD5 e291a5567a3756ac492ca9ad486c03fe
BLAKE2b-256 f263803c28839c6a20504fca0064ef6521ff5619a67259808b52640c739251d8

See more details on using hashes here.

File details

Details for the file wowool_sdk-3.5.2.dev3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for wowool_sdk-3.5.2.dev3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bf7190733faaa8b06cd9632c3a8e667a92e9cd34f6b9dea76f19b8dbae163826
MD5 2d1575be1486680c69f663c97774ecb3
BLAKE2b-256 447d77f070d3acb09c44ec7762a17caacb7aa8ec33dd0713ac12ffa0decdded3

See more details on using hashes here.

File details

Details for the file wowool_sdk-3.5.2.dev3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for wowool_sdk-3.5.2.dev3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 1bcfc0630eed9d5dafe5d78366a760f04a84b7a72ccd786fe4c7d4fe82958358
MD5 f65fee3fa581efc87ec1f9026716fd28
BLAKE2b-256 c8a6728ab0e19301cc2c00b7aaaed91406880686594963f8b682b00e17e18387

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page