KeyCARE is a Python library for unsupervised keyword extraction from biomedical documents using several algorithms, classification of the extracted keywords according to their semantic nature, and extraction of is-a relations among those keywords and with other terminologies.

Project description

A framework for biomedical Keyword Extraction, term Categorization, and semantic Relation.


Table of Contents

  1. About the Project
  2. Getting Started
    2.1. Installation
    2.2. Usage
  3. Contributing
  4. License
  5. References

1. About The Project

KeyCARE provides a common interface for extracting, categorizing, and relating terms from a text:

  1. Keyword extraction: KeyCARE implements several unsupervised term extraction techniques, such as YAKE, RAKE, TextRank, and KeyBERT, to automatically extract key terms from a text.
  2. Term categorization: KeyCARE supports term clustering to group similar terms, as well as training and applying supervised classifiers, including SetFit, to assign keywords to predefined categories.
  3. Semantic relation classification: Beyond identifying and categorizing terms, the library supports neural classification models, such as those loaded through AutoModelForSequenceClassification from the Hugging Face Transformers library, to classify the semantic relation between two terms as EXACT, BROAD, NARROW, or NO_RELATION. These relations interconnect the extracted terms and can be used for terminological enrichment, among other tasks.
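
To make the first step more concrete, here is a minimal pure-Python sketch of a RAKE-style scorer. It is a simplified illustration of the general technique, not KeyCARE's implementation: the tiny stopword list and the names `candidate_phrases` and `rake_scores` are assumptions made for this example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption); real RAKE uses a much larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "was", "is", "for", "to", "by", "on"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[\w-]+", fragment):
            if word in STOPWORDS:
                # A stopword closes the phrase built so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Score each phrase as the sum of degree/frequency over its words."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting the word itself
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


text = "The patient was diagnosed with acute myocardial infarction and treated with aspirin."
print(rake_scores(text)[0])  # ('acute myocardial infarction', 9.0)
```

Longer phrases whose words rarely occur elsewhere score highest, which is why the three-word term outranks the single words in this toy sentence.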

2. Getting Started

2.1. Installation

Install the library from PyPI with pip:

   pip install keycare

You might also need to install spaCy's es_core_news_sm model:

   python3 -m spacy download es_core_news_sm
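
Because spaCy models are installed as regular Python packages, one way to check whether the model is already present is to look it up with the standard library. The sketch below is only a convenience check under that assumption; the helper name `is_package_installed` is invented for this example and is not part of KeyCARE.

```python
import importlib.util


def is_package_installed(name):
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(name) is not None


if not is_package_installed("es_core_news_sm"):
    print("Model not found; run: python3 -m spacy download es_core_news_sm")
```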

2.2. Usage

The library is built on three main processes: keyword extraction, term categorization, and relation extraction. The first two are implemented in a single pipeline in the class TermExtractor, which automatically extracts classified keywords from pieces of text. Relation extraction among term pairs or groups of terms is implemented in the other main class, RelExtractor.

TermExtractor

To use TermExtractor with default parameters:

   from keycare.TermExtractor import TermExtractor
   extractor = TermExtractor()  # default parameters
   extractor("...")  # replace "..." with your text
   extractor.keywords  # extracted keywords with their assigned class

This code calls TermExtractor with default parameters on a piece of text; the extracted keywords, with their assigned class, are then available in extractor.keywords.
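
To process several texts, the same call pattern can be wrapped in a small helper. This is a sketch only: `extract_keywords` and `StubExtractor` are hypothetical names introduced here, and creating a fresh extractor per text is a conservative assumption (it avoids relying on how state is handled between calls, at the cost of re-initializing the extractor each time).

```python
def extract_keywords(texts, make_extractor):
    """Run a fresh extractor on each text and collect its `keywords` attribute."""
    results = []
    for text in texts:
        extractor = make_extractor()  # new instance, so no state is shared between texts
        extractor(text)
        results.append(extractor.keywords)
    return results


class StubExtractor:
    """Hypothetical stand-in so the sketch runs without KeyCARE installed."""

    def __call__(self, text):
        # Whitespace splitting only mimics the call pattern, not real extraction.
        self.keywords = text.split()


print(extract_keywords(["acute pain", "fever"], StubExtractor))
# [['acute', 'pain'], ['fever']]
```

With KeyCARE installed, TermExtractor would be passed in place of StubExtractor.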

RelExtractor

To use RelExtractor with default parameters:

   from keycare.RelExtractor import RelExtractor
   relextractor = RelExtractor()  # default parameters
   relextractor("...", "...")  # replace "..." with your term pairs
   relextractor.relations  # relation found for each pair of terms

This code calls RelExtractor with default parameters on pairs of terms; the relation found for each pair is then available in relextractor.relations.
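
Downstream code may want to treat the four relation labels as a closed set. The sketch below models them with an Enum and filters out unrelated pairs. The (source, target, label) triple layout and the example terms are assumptions made for illustration; they are not the documented format of relextractor.relations.

```python
from enum import Enum


class Relation(Enum):
    """The four relation labels named in this document."""
    EXACT = "EXACT"
    BROAD = "BROAD"
    NARROW = "NARROW"
    NO_RELATION = "NO_RELATION"


def related_pairs(labelled_pairs):
    """Keep only the pairs whose label is not NO_RELATION.

    `labelled_pairs` is assumed to be an iterable of
    (source_term, target_term, label_string) triples.
    """
    kept = []
    for source, target, label in labelled_pairs:
        relation = Relation(label)  # raises ValueError on an unknown label
        if relation is not Relation.NO_RELATION:
            kept.append((source, target, relation))
    return kept


# Toy input, invented for this example.
pairs = [
    ("heart attack", "myocardial infarction", "EXACT"),
    ("aspirin", "fever", "NO_RELATION"),
]
print(related_pairs(pairs))
```

Validating labels through the Enum means a misspelled or unexpected label fails loudly instead of silently passing through.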

For further information on how the library works and on the available parameters, refer to the tutorials in the nbs folder.

3. Contributing

This library has been developed with Python 3.10.12.

Any contributions you make are greatly appreciated. To contribute:

  1. Fork or clone the project on your system

    git clone https://github.com/nlp4bia-bsc/keycare.git
    
  2. Create a new virtual environment

    python3 -m venv .env_keycare
    
  3. Activate the new environment

    source .env_keycare/bin/activate
    
  4. Install the requirements

    pip install -r requirements.txt
    
  5. Create your Feature Branch (git checkout -b feature/AmazingFeature)

  6. Update requirements file (pip freeze > requirements.txt)

  7. Commit your Changes (git commit -m 'Add some AmazingFeature')

  8. Push to the Branch (git push origin feature/AmazingFeature)

  9. Open a Pull Request on GitHub.

Follow this tutorial to create a branch.

4. License

Apache License, Version 2.0

5. References

A paper on the library will soon be published. Please cite it if you use the library in scientific work.

Download files

Source Distribution

keycare-0.1.0.tar.gz (22.4 kB view details)

Uploaded Source

Built Distribution

keycare-0.1.0-py3-none-any.whl (27.8 kB view details)

Uploaded Python 3

File details

Details for the file keycare-0.1.0.tar.gz.

File metadata

  • Download URL: keycare-0.1.0.tar.gz
  • Upload date:
  • Size: 22.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.6

File hashes

Hashes for keycare-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2de7278b4b18a5dc716aec5180e5ffa41b51575fcc8af98970447447b6376381
MD5 caceaac784cc44f068305a943dc2de51
BLAKE2b-256 4dc29bf7f41086b71858fbf43b1af1312822e62a0e4675a0b6228c8e80cedd91

File details

Details for the file keycare-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: keycare-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 27.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.6

File hashes

Hashes for keycare-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a305bbec64cb800489c33091e0cf55fdd145cb4a289409c8521e99f890c52e77
MD5 e13ba51cbfdb64b2fe9ba7059aabf96a
BLAKE2b-256 54ab8c559e513d4f493670b548d8a1414b356ca5a10318c813195a3b2ea96ab1
