Wowool NLP Toolkit Topic Identifier

Identifying topics in your documents

The topics app identifies the topics in your documents and reports the relevancy of each one.

Themes are pre-defined categories into which you want to sort your documents, whereas topics are extracted from the processed documents themselves.

Prerequisites

The topics.app uses the TopicCandidate entity to identify potential topics. It automatically runs the built-in topics domain to extract these annotations and then performs the topic calculation.

Options

TopicsOptions

interface TopicsOptions {
    count?: number;
    threshold?: number;
    ignore_entities?: boolean;
}

with:

| Property | Description |
| --- | --- |
| count | Maximum number of topics in the results |
| threshold | Minimal probability, expressed as a percentage, for a topic candidate to be considered a topic |
| ignore_entities | If enabled, entities such as Person, Country and Company won't be considered topic candidates |
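
To make the interplay of count and threshold concrete, here is a minimal sketch in plain Python. It does not call the wowool SDK: `Candidate`, `select_topics` and the sample scores are hypothetical, and it assumes that relevancy is a percentage between 0 and 100 and that the threshold is applied before the count cut-off. The package may well behave differently internally.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical stand-in for a scored topic candidate."""
    name: str
    relevancy: float  # assumed to be a percentage in the range 0-100


def select_topics(candidates, count=None, threshold=0.0):
    """Keep candidates scoring at least `threshold`, best first, at most `count`."""
    kept = [c for c in candidates if c.relevancy >= threshold]
    kept.sort(key=lambda c: c.relevancy, reverse=True)
    return kept if count is None else kept[:count]


# Invented sample data, for illustration only.
candidates = [
    Candidate("gas supply", 72.0),
    Candidate("steel mill", 41.5),
    Candidate("soldier", 12.0),
]
selected = select_topics(candidates, count=2, threshold=20.0)
print([c.name for c in selected])
```

With these invented scores, the candidate below the threshold is dropped first, and the count limit then caps whatever remains.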

Results

TopicsResults

type TopicsResults = TopicsResult[];

TopicsResult

interface TopicsResult {
    name: string;
    relevancy: number;
}

with:

| Property | Description |
| --- | --- |
| name | Name of the topic |
| relevancy | Relevancy of the topic within the document |
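
As an illustration of the result shape, the sketch below mirrors the two interfaces above as Python type hints and formats a result list. It assumes the results are available as dict-like objects with `name` and `relevancy` keys; `format_topics` and the sample values are hypothetical and not part of the wowool SDK.

```python
from typing import List, TypedDict


class TopicsResult(TypedDict):
    """Python counterpart of the TopicsResult interface above."""
    name: str
    relevancy: float


TopicsResults = List[TopicsResult]


def format_topics(results: TopicsResults) -> List[str]:
    """Return one 'name: relevancy' line per topic, most relevant first."""
    ordered = sorted(results, key=lambda r: r["relevancy"], reverse=True)
    return [f"{r['name']}: {r['relevancy']:.1f}" for r in ordered]


# Invented sample data, for illustration only.
sample: TopicsResults = [
    {"name": "steel mill", "relevancy": 41.5},
    {"name": "gas supply", "relevancy": 72.0},
]
for line in format_topics(sample):
    print(line)
```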

API

Examples

Using the pipeline

This script demonstrates how to use the wowool.sdk's Pipeline class to identify topics in an English text.

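from wowool.sdk import Pipeline

# Build a pipeline that runs the English language analysis followed by the topics app.
pipeline = Pipeline("english,topics.app")
doc = pipeline("Gas supplies to Europe wounded soldiers inside Azovstal steel mill")
# Print the topics identified in the document.
print(doc.topics)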

Using the Topics Identifier object

This script uses the wowool SDK to identify topics in an English sentence, specifying the number of topics to return.

from wowool.sdk import Pipeline
from wowool.topic_identifier import TopicIdentifier

english = Pipeline("english")
number_of_topics = 5
topic_it = TopicIdentifier(language="english", count=number_of_topics)
# Analyze a single sentence, then print every topic identified in it.
document = topic_it(english("This is the effect of the green house gases"))
for topic in document.topics:
    print(f" - {topic}")

License

In both cases you will need to acquire a license file from https://www.wowool.com

Non-Commercial

This library is licensed under the GNU AGPLv3 for non-commercial use.  
For commercial use, a separate license must be purchased.  

Commercial License Terms

1. Grants the right to use this library in proprietary software.  
2. Requires a valid license key.  
3. Redistribution in SaaS requires a commercial license.  
