A library for technology entity recognition and recommendation
Entity-Recognition
The Entity-Recognition library uses spaCy, BERTopic, and Transformers to identify technology entities in text and to suggest relevant technologies based on the surrounding context.
The library automatically downloads the required spaCy model if not installed, making it easy to get started.
Features
- Technology Entity Extraction: Automatically extract technology-related terms and tools from texts.
- Recommendation System: Provides context-based technology recommendations.
- BERTopic Integration: Leverages topic modeling to enhance the relevance of recommendations.
- spaCy Matchers: Utilizes custom NLP patterns for precise entity recognition.
Installation
Prerequisites
- Python 3.11+
- pip
Getting Started
Install the library directly from PyPI:
pip install entity-recognition-lib
The required spaCy model (en_core_web_sm) is downloaded and installed automatically if it is not already present on your system.
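For readers who want to check for the model themselves, the following is a minimal sketch of how such a check-then-download step could look. It is not the library's actual implementation: the helper names `is_model_installed` and `ensure_model` are hypothetical. It relies only on the fact that spaCy models are installed as regular Python packages and that `python -m spacy download` is spaCy's documented download command.

```python
import importlib.util
import subprocess
import sys

MODEL_NAME = "en_core_web_sm"


def is_model_installed(name: str) -> bool:
    """Return True if a top-level package with this name can be imported.

    Hypothetical helper: spaCy models are installed as Python packages,
    so looking up the package spec is enough to detect them.
    """
    return importlib.util.find_spec(name) is not None


def ensure_model(name: str = MODEL_NAME) -> bool:
    """Download the model with spaCy's CLI if it is missing.

    Hypothetical helper. Returns True if a download was triggered,
    False if the model was already present.
    """
    if is_model_installed(name):
        return False
    # Equivalent to running: python -m spacy download en_core_web_sm
    subprocess.run([sys.executable, "-m", "spacy", "download", name], check=True)
    return True
```

Calling `ensure_model()` before creating the recognizer would make the download step explicit, which can be useful in environments where network access is only available at build time.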
Usage
Here's how to use the Entity Recognition library in your Python scripts:
from entity_recognition_lib import EntityRecognizer
# Create an instance of the recognizer
recognizer = EntityRecognizer()
# Example texts
texts = ["I need an Express.js Mongo database backend"]
# Process texts
results = recognizer.process_texts(texts)
print(results)
Expected output:
[
    {
        "input_text": "I need an Express.js Mongo database backend",
        "predicted_topic_name": "575_databases_database_tables_schema",
        "extracted_entities": [
            {
                "entity_name": "Express.js",
                "score": 1.0,
                "category": "Backend Web Frameworks"
            },
            {
                "entity_name": "MongoDB",
                "score": 1.0,
                "category": "Databases"
            }
        ],
        "recommendations": [
            {
                "category": "Backend Web Frameworks",
                "recommendation": "Express.js"
            },
            {
                "category": "Databases",
                "recommendation": "MongoDB"
            }
        ]
    }
]
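The output is a list with one dictionary per input text, so it can be post-processed with plain Python. As an illustration, the sketch below groups the extracted entity names by category. The sample data is copied from the expected output above so the snippet runs without the library installed; the helper `entities_by_category` and its `min_score` parameter are hypothetical and not part of the library.

```python
from collections import defaultdict

# Sample copied from the expected output above; in practice this would be
# the value returned by recognizer.process_texts(texts).
results = [
    {
        "input_text": "I need an Express.js Mongo database backend",
        "predicted_topic_name": "575_databases_database_tables_schema",
        "extracted_entities": [
            {"entity_name": "Express.js", "score": 1.0, "category": "Backend Web Frameworks"},
            {"entity_name": "MongoDB", "score": 1.0, "category": "Databases"},
        ],
        "recommendations": [
            {"category": "Backend Web Frameworks", "recommendation": "Express.js"},
            {"category": "Databases", "recommendation": "MongoDB"},
        ],
    }
]


def entities_by_category(result, min_score=0.0):
    """Group entity names by category, keeping entities with score >= min_score.

    Hypothetical helper, not part of the library.
    """
    grouped = defaultdict(list)
    for entity in result["extracted_entities"]:
        if entity["score"] >= min_score:
            grouped[entity["category"]].append(entity["entity_name"])
    return dict(grouped)


for result in results:
    print(entities_by_category(result))
# prints {'Backend Web Frameworks': ['Express.js'], 'Databases': ['MongoDB']}
```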
Development
Setting Up a Development Environment
- Clone the repository:
git clone https://github.com/cgoncalves94/entity-recognition.git
cd entity-recognition
- Create and Activate a Virtual Environment:
python -m venv .venv
source .venv/bin/activate # On Windows use `.venv\Scripts\activate`
- Install Dependencies:
pip install -r requirements.txt
Testing
Run tests to ensure the setup is correct:
pytest
Contributing
Contributions are welcome! Please follow these steps:
- Fork the repository on GitHub.
- Clone the forked repository to your machine.
- Create a new branch for your changes.
- Make changes and test.
- Submit a pull request with a comprehensive description of changes.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Hashes for entity_recognition_lib-0.1.2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 7237254a26f706febb260210c3e66a844c6ef9deeb6eed1ef2d0cf6fc8904bd6
MD5 | 87af3cd8cd43115248d5f4bead8823d4
BLAKE2b-256 | e71e3141ba15abfe8a5fb3cd95d14a4190046ae11465810900430e5c97f50df8
Hashes for entity_recognition_lib-0.1.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 38baaa905519bb76fbd99ba6866ddbce1b1d4fae426b8e7a943cbbbc27e6e82a
MD5 | 3c0572d5370030d9522eb836bd1bfcf4
BLAKE2b-256 | 4c986a15042026f2d1d621f2504ae2bd25d2be17e79d2063d9f5adde4e47f5ae