A Python library for extracting and matching keywords with semantic and entity-based boosting.

KeywordX

KeywordX is a lightweight Python library for extracting and matching keywords from text using semantic similarity and entity-based boosting.
It is suited to NLP pipelines, chatbots, search systems, and event extraction.


Features

  • Extract keywords with semantic similarity scoring
  • Boost keyword matches using entities (dates, times, places, etc.)
  • Supports custom IDF weighting for better relevance
  • Easy-to-use API for integration into NLP pipelines
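
To make the idea of entity-based boosting concrete, here is a toy sketch. It is not KeywordX's implementation: the names KEYWORD_TO_ENTITY_TYPE and boost_matches are hypothetical, the input scores are made-up numbers, and the rule (a keyword that maps to a detected entity type takes that entity with score 1.0) is only an assumption modelled on the example output further down.

```python
# Toy illustration only: this is NOT how KeywordX is implemented.
# KEYWORD_TO_ENTITY_TYPE and boost_matches are hypothetical names.
KEYWORD_TO_ENTITY_TYPE = {"date": "DATE", "time": "TIME", "place": "GPE"}


def boost_matches(semantic_scores, entities):
    """Prefer an entity of the mapped type over a plain semantic match.

    semantic_scores: dict of keyword -> (matched_text, score), from any model.
    entities: list of {"text": str, "type": str} dicts.
    """
    # Remember the first entity seen for each entity type.
    first_by_type = {}
    for ent in entities:
        first_by_type.setdefault(ent["type"], ent["text"])

    boosted = {}
    for keyword, (match, score) in semantic_scores.items():
        wanted_type = KEYWORD_TO_ENTITY_TYPE.get(keyword)
        if wanted_type in first_by_type:
            # An entity of the expected type exists: use it with full score.
            boosted[keyword] = (first_by_type[wanted_type], 1.0)
        else:
            # No matching entity: keep the semantic match unchanged.
            boosted[keyword] = (match, score)
    return boosted


# Made-up scores for illustration.
scores = {"meeting": ("meeting", 0.99), "time": ("meeting", 0.31)}
ents = [{"text": "5pm", "type": "TIME"}]
boosted = boost_matches(scores, ents)
print(boosted)
```

In this sketch the keyword "time" is redirected to the TIME entity, while "meeting" keeps its semantic match because no entity type is mapped to it.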

Installation

Install from PyPI:

pip install keywordx

Or install from source:

git clone https://github.com/keikurono7/keywordx.git
cd keywordx
pip install -e .
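
To confirm that the install worked, one option is to ask Python for the installed distribution's version. This is a minimal sketch using only the standard library; installed_version is a hypothetical helper name, not part of KeywordX.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if keywordx is installed, otherwise None.
print(installed_version("keywordx"))
```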

Quick Start

Here is a quick example to get you started:

from keywordx import KeywordExtractor

ke = KeywordExtractor()
text = "Tomorrow I have a work meeting at 5pm in Bangalore."
keywords = ["meeting", "time", "place", "date"]

result = ke.extract(text, keywords)
print(result)

Example Output

The result is a dict containing the extracted entities (with character spans) and the semantic matches with their scores:

{
  "entities": [
    {"span": [0, 8], "text": "Tomorrow", "type": "DATE"},
    {"span": [34, 37], "text": "5pm", "type": "TIME"},
    {"span": [41, 50], "text": "Bangalore", "type": "GPE"}
  ],
  "semantic_matches": [
    {"keyword": "meeting", "match": "meeting", "score": 0.99},
    {"keyword": "time", "match": "5pm", "score": 1.0},
    {"keyword": "place", "match": "Bangalore", "score": 1.0},
    {"keyword": "date", "match": "Tomorrow", "score": 1.0}
  ]
}
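
A result of this shape can be reduced to a simple keyword lookup. The sketch below works on a copy of the sample output rather than calling the library; best_matches and the min_score threshold are hypothetical, chosen only for illustration.

```python
# Copy of the sample output shown above (entities omitted for brevity).
result = {
    "semantic_matches": [
        {"keyword": "meeting", "match": "meeting", "score": 0.99},
        {"keyword": "time", "match": "5pm", "score": 1.0},
        {"keyword": "place", "match": "Bangalore", "score": 1.0},
        {"keyword": "date", "match": "Tomorrow", "score": 1.0},
    ]
}


def best_matches(result, min_score=0.5):
    """Map each keyword to its highest-scoring match at or above min_score."""
    best = {}
    for m in result["semantic_matches"]:
        if m["score"] < min_score:
            continue  # drop weak matches
        current = best.get(m["keyword"])
        if current is None or m["score"] > current["score"]:
            best[m["keyword"]] = m
    return {keyword: m["match"] for keyword, m in best.items()}


matches = best_matches(result)
print(matches)
```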

API Reference

  • KeywordExtractor()
    Initializes the keyword extractor.

  • .extract(text, keywords) → dict
    Extracts keywords and entities from text.

    Parameters:

    • text: input string
    • keywords: list of keywords to match

    Returns a dict with two keys:

    • entities: named entities (DATE, TIME, GPE, etc.), each with span, text, and type
    • semantic_matches: list of matched keywords, each with keyword, match, and score
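
In the sample output, each span reads as a [start, end) pair of character offsets into the input text. That interpretation is inferred from the example rather than stated by the library, so a quick check like the following can confirm it for your own inputs. check_spans is a hypothetical helper.

```python
text = "Tomorrow I have a work meeting at 5pm in Bangalore."

# Entities copied from the sample output.
entities = [
    {"span": [0, 8], "text": "Tomorrow", "type": "DATE"},
    {"span": [34, 37], "text": "5pm", "type": "TIME"},
    {"span": [41, 50], "text": "Bangalore", "type": "GPE"},
]


def check_spans(text, entities):
    """Return the entities whose span does not slice back to their text."""
    mismatched = []
    for ent in entities:
        start, end = ent["span"]
        if text[start:end] != ent["text"]:
            mismatched.append(ent)
    return mismatched


mismatched = check_spans(text, entities)
print(mismatched)
```

An empty list means every span slices back to its entity text under the half-open interpretation.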

Use Cases

  • Event and meeting extraction for calendar assistants
  • Chatbot intent detection
  • Automatic tagging of documents and notes
  • Context-aware search and indexing
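
As an illustration of the calendar-assistant use case, the entities in a result can be folded into a simple event record. This is a sketch under assumptions: to_event and the mapping from entity types to field names are hypothetical, and the input is a copy of the sample output rather than a live call to the library.

```python
# Entities copied from the sample output in Example Output.
result = {
    "entities": [
        {"span": [0, 8], "text": "Tomorrow", "type": "DATE"},
        {"span": [34, 37], "text": "5pm", "type": "TIME"},
        {"span": [41, 50], "text": "Bangalore", "type": "GPE"},
    ]
}

# Hypothetical mapping from entity type to event field name.
FIELD_FOR_TYPE = {"DATE": "date", "TIME": "time", "GPE": "place"}


def to_event(result):
    """Build an event dict from the first entity of each mapped type."""
    event = {}
    for ent in result["entities"]:
        field = FIELD_FOR_TYPE.get(ent["type"])
        if field is not None and field not in event:
            event[field] = ent["text"]
    return event


event = to_event(result)
print(event)
```

The values stay as raw text ("Tomorrow", "5pm"); resolving them to real dates and times would need a separate date parser.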

Contributing

Contributions are welcome. For significant changes, please open an issue first to discuss the proposal.

License

This project is licensed under the MIT License. See the LICENSE file for details.
