Natural Language Framework, for NER and RE
Project description
☝️ We moved
This library is no longer maintained and only occasionally receives bug fixes.
The functionality for training NER and relation models has moved to the text annotation tool tagtog.
nalaf - (Na)tural (La)nguage (F)ramework
nalaf is an NLP framework written in Python. The goal is to be a general-purpose, module-based, and easy-to-use framework for common text mining tasks. At the moment two tasks are covered: named-entity recognition (NER) and relationship extraction. Both modules support training and annotating. Helper components are also provided, such as cross-validation training and reading and converting different corpus formats. At the moment, NER is implemented with Conditional Random Fields (CRFs), and relationship extraction with Support Vector Machines (SVMs) using either linear or tree kernels.
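To make the CRF-based NER setup more concrete: a CRF labels a sentence token by token, so entity spans are usually converted to per-token labels before training. The sketch below shows one common encoding (BIO) using only the standard library. It is an illustration under stated assumptions, not nalaf's API: the function names, the whitespace tokenizer, and the `mutation` label are all hypothetical, and nalaf's own tokenization and label scheme may differ.

```python
import re
from typing import List, Tuple

# Hypothetical helpers for illustration only; they are not part of nalaf.


def tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace and keep character offsets (token, start, end)."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def bio_labels(text: str, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Return one BIO label per token.

    `entities` holds (start, end, type) character spans with an exclusive end.
    A token fully inside a span gets B-<type> if it opens the span,
    I-<type> otherwise; every other token gets O.
    """
    labels = []
    for _, start, end in tokenize(text):
        label = "O"
        for ent_start, ent_end, ent_type in entities:
            if start >= ent_start and end <= ent_end:
                prefix = "B-" if start == ent_start else "I-"
                label = prefix + ent_type
                break
        labels.append(label)
    return labels


# Assumed example: the mention "c.A1003G" occupies characters 8..16.
sentence = "This is c.A1003G an example"
print(list(zip([t for t, _, _ in tokenize(sentence)],
               bio_labels(sentence, [(8, 16, "mutation")]))))
```

A sequence model such as a CRF is then trained on token features paired with these labels, and predicted labels are merged back into spans at annotation time.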
Historically, the framework started from two joint theses at Rostlab at Technische Universität München with a focus on bioinformatics / BioNLP. Concretely, the first goal was to extract natural-language (NL) mutation mentions. Soon after, another master's thesis used and generalized the framework to do relationship extraction of transcription factors (TFs) interacting with genes or gene products. The nalaf framework is planned to be used in other BioNLP tasks at Rostlab.
As a result of the original BioNLP focus, some parts of the code are tailored to the biomedical domain. Efforts to generalize all parts are almost complete. However, development is not active and code maintenance is not guaranteed.
Current maintainer: Juan Miguel Cejuela (@juanmirocks).
(editable version on Lucidchart of the pipeline diagram; requires log in)
Install
Requires Python ^3.6
From PyPI
pip3 install nalaf
python3 -m nalaf.download_data
From source
git clone https://github.com/Rostlab/nalaf.git
cd nalaf
poetry shell
poetry update
python3 -m nalaf.download_data
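After either install route, a quick sanity check can confirm that the interpreter satisfies the `^3.6` constraint (read here as 3.6 or newer, below 4.0) and that the package is visible to it. This is a minimal sketch using only the standard library; the helper names are hypothetical and nothing here calls into nalaf itself.

```python
import importlib.util
import sys

# Hypothetical helpers for illustration; they only inspect the environment.


def python_is_supported(version_info=sys.version_info) -> bool:
    """True if the version falls in the caret range ^3.6, i.e. >=3.6,<4.0."""
    return (3, 6) <= tuple(version_info[:2]) < (4, 0)


def package_is_installed(name: str = "nalaf") -> bool:
    """True if a top-level package with this name can be located."""
    return importlib.util.find_spec(name) is not None


print("Python supported:", python_is_supported())
print("nalaf installed:", package_is_installed())
```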
Developing
Test
nosetests
Run Examples
Run example_annotate.py for a simple example of annotation with a pre-trained NER model for protein name extraction:
python3 example_annotate.py -p 15878741 12625412
python3 example_annotate.py -s "This is c.A1003G an example"  # see issue https://github.com/Rostlab/nalaf/issues/159
python3 example_annotate.py -d resources/example.txt  # see issue https://github.com/Rostlab/nalaf/issues/159
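For scripted runs, the three invocations above can be assembled programmatically. The sketch below only builds the argument list and does not execute anything. The helper name is hypothetical, and two things are assumptions read off the examples rather than documented behavior: that `-p` takes one or more identifiers, `-s` a string, and `-d` a path, and that exactly one of them is given per call.

```python
import shlex
from typing import List, Optional, Sequence

# Hypothetical helper for illustration; only the flags -p, -s, -d come from the README.


def build_annotate_command(
    pmids: Optional[Sequence[int]] = None,
    text: Optional[str] = None,
    path: Optional[str] = None,
    script: str = "example_annotate.py",
) -> List[str]:
    """Build the argv list for one example_annotate.py run.

    Assumes exactly one input mode per call (a simplification for this sketch).
    """
    chosen = [value for value in (pmids, text, path) if value]
    if len(chosen) != 1:
        raise ValueError("give exactly one of pmids, text, or path")

    command = ["python3", script]
    if pmids:
        command += ["-p"] + [str(pmid) for pmid in pmids]
    elif text:
        command += ["-s", text]
    else:
        command += ["-d", str(path)]
    return command


# Print shell-safe command lines; shlex.quote protects the sentence with spaces.
for cmd in (
    build_annotate_command(pmids=[15878741, 12625412]),
    build_annotate_command(text="This is c.A1003G an example"),
    build_annotate_command(path="resources/example.txt"),
):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` from the repository root would execute it, assuming the install steps and data download have completed.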
Download files
Source Distribution
Built Distribution
File details
Details for the file nalaf-0.6.0.tar.gz.
File metadata
- Download URL: nalaf-0.6.0.tar.gz
- Upload date:
- Size: 94.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.4 CPython/3.6.12 Darwin/20.2.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9b66ccb948fe6daae0c7839ecaa04954c50aef4a33ba3ef59c8f4b9dff0ac42e
MD5 | a0e2eee741e0952917e9ee4a57b97cf6
BLAKE2b-256 | 0009c45a81fdf8934340c6276a54a7247a4824fffb700132894f2d7684e1f318
File details
Details for the file nalaf-0.6.0-py3-none-any.whl.
File metadata
- Download URL: nalaf-0.6.0-py3-none-any.whl
- Upload date:
- Size: 111.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.4 CPython/3.6.12 Darwin/20.2.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | bb3b4185297582e17a478cef7880bf3c8493e35f053a9d52c1c4676ea5cc8711
MD5 | ba3ddae45b1d104c9cde829a4ac7afa2
BLAKE2b-256 | 7a168a21e29a8a43832f9e1a2390d88453749d4d0d168318b6b32856a38e2b6f