Wowool NLP Toolkit Topic Identifier
Identifying topics in your documents
The topics app identifies topics in your documents and their relevancy.
Themes are predefined categories into which you want to sort your documents, whereas topics are extracted from the processed documents themselves.
Prerequisites
The topics.app uses the TopicCandidate entity to identify potential topics. It automatically runs the built-in topics domain to extract these annotations and then computes the topics from them.
Options
TopicsOptions
interface TopicsOptions {
  count?: number;
  threshold?: number;
  ignore_entities?: boolean;
}
with:
| Property | Description |
|---|---|
| count | Maximum number of topics in the results |
| threshold | Minimal probability, expressed as a percentage, for a topic candidate to be considered a topic |
| ignore_entities | If enabled, entities such as Person, Country and Company won't be considered topic candidates |
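To make the interplay of the three options concrete, here is a minimal sketch in plain Python. It is not the topics app's implementation: the `Candidate` class, the `select_topics` function, the sample values, and the order of operations (filter, rank, truncate) are all assumptions made for illustration, as is treating the threshold as inclusive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical stand-in for a topic candidate; not a Wowool type."""
    name: str
    probability: float       # assumed to be a percentage in [0, 100]
    is_entity: bool = False  # e.g. a Person, Country or Company


def select_topics(
    candidates: List[Candidate],
    count: Optional[int] = None,
    threshold: Optional[float] = None,
    ignore_entities: bool = False,
) -> List[Candidate]:
    """Filter by entity flag and threshold, rank by probability, keep `count`."""
    kept = [
        c for c in candidates
        if not (ignore_entities and c.is_entity)
        # Assumption: a candidate exactly at the threshold is kept.
        and (threshold is None or c.probability >= threshold)
    ]
    kept.sort(key=lambda c: c.probability, reverse=True)
    return kept if count is None else kept[:count]


# Made-up candidates for illustration only.
candidates = [
    Candidate("gas supply", 82.0),
    Candidate("Europe", 64.0, is_entity=True),
    Candidate("steel mill", 41.0),
    Candidate("soldier", 12.0),
]

selected = select_topics(candidates, count=2, threshold=30, ignore_entities=True)
print([c.name for c in selected])  # ['gas supply', 'steel mill']
```

With `ignore_entities` enabled, "Europe" is dropped despite its high score, and "soldier" falls below the threshold, so only two candidates remain for `count` to act on.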
Results
TopicsResults
type TopicsResults = TopicsResult[];
TopicsResult
interface TopicsResult {
  name: string;
  relevancy: number;
}
with:
| Property | Description |
|---|---|
| name | Name of the topic |
| relevancy | Relevancy of the topic within the document |
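As an illustration of working with this shape, the sketch below mirrors `TopicsResult` as a Python `TypedDict` and prints the topics ordered by relevancy. It assumes the results arrive as JSON-like dictionaries; the `format_topics` helper and the sample values are hypothetical, and the scale of `relevancy` is not specified here, so the numbers are placeholders.

```python
from typing import List, TypedDict


class TopicsResult(TypedDict):
    """Python mirror of the TopicsResult interface above."""
    name: str
    relevancy: float


def format_topics(results: List[TopicsResult]) -> List[str]:
    """Return one line per topic, most relevant first."""
    ranked = sorted(results, key=lambda r: r["relevancy"], reverse=True)
    return [f"{r['name']}: {r['relevancy']:.2f}" for r in ranked]


# Made-up values for illustration only; not real output from the topics app.
sample: List[TopicsResult] = [
    {"name": "steel mill", "relevancy": 0.41},
    {"name": "gas supply", "relevancy": 0.82},
]

lines = format_topics(sample)
print("\n".join(lines))
```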
API
Examples
Using the pipeline
This script demonstrates how to use the wowool.sdk's Pipeline class to identify topics in an English text.
from wowool.sdk import Pipeline
pipeline = Pipeline("english,topics.app")
doc = pipeline("Gas supplies to Europe wounded soldiers inside Azovstal steel mill")
print(doc.topics)
Using the Topics Identifier object
This script uses the wowool SDK to identify topics in an English sentence, specifying the number of topics to return.
from wowool.sdk import Pipeline
from wowool.topic_identifier import TopicIdentifier
english = Pipeline("english")
number_of_topics = 5
topic_it = TopicIdentifier(language="english", count=number_of_topics)
# Run the English pipeline on the text, identify its topics, and print each one.
document = topic_it(english("This is the effect of the green house gases"))
for topic in document.topics:
    print(f" - {topic}")
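When several texts need to be processed, the same call pattern can be wrapped in a small helper. The sketch below is an assumption-laden illustration rather than part of the Wowool SDK: `topics_per_text` is a hypothetical helper that accepts any callable returning an object with a `.topics` attribute, and `fake_analyze` is a stand-in so the block runs without the SDK or a license installed.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Iterable, List


def topics_per_text(
    analyze: Callable[[str], Any], texts: Iterable[str]
) -> Dict[str, List[Any]]:
    """Run `analyze` on each text and collect whatever its `.topics` holds."""
    return {text: list(analyze(text).topics) for text in texts}


def fake_analyze(text: str) -> SimpleNamespace:
    """Stand-in analyzer for demonstration: treats words longer than five
    characters as 'topics'. It does no real topic identification."""
    words = text.lower().split()
    return SimpleNamespace(topics=[w for w in words if len(w) > 5])


report = topics_per_text(
    fake_analyze,
    ["Gas supplies to Europe", "Green house gases"],
)
for text, topics in report.items():
    print(f"{text} -> {topics}")
```

With the SDK installed, the `Pipeline("english,topics.app")` object from the first example could be passed as `analyze` in place of `fake_analyze`, since it is called with a string and its result exposes `topics`.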
License
Both non-commercial and commercial use require a license file, which you can acquire at https://www.wowool.com
Non-Commercial
This library is licensed under the GNU AGPLv3 for non-commercial use.
For commercial use, a separate license must be purchased.
Commercial License Terms
1. Grants the right to use this library in proprietary software.
2. Requires a valid license key.
3. Redistribution in SaaS requires a commercial license.