Information Extraction framework in Python
Project description
IEPY is an open source tool for Information Extraction focused on Relation Extraction.
To give an example of Relation Extraction, if we are trying to find a birth date in:
“John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian and American pure and applied mathematician, physicist, inventor and polymath.”
then IEPY’s task is to identify “John von Neumann” and “December 28, 1903” as the subject and object entities of the “was born in” relation.
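As a toy illustration of the task (not IEPY's API, and not how IEPY solves it), the sketch below recovers the same subject and object from the example sentence with a hand-written regular expression from Python's standard library. The pattern, the function name `extract_birth`, and the assumption that the birth date is the first date inside the parentheses are all hypothetical and only fit sentences shaped like this one.

```python
import re

# Hypothetical pattern: "<Name> (<Month D, YYYY> ..." at the start of a sentence.
BIRTH_PATTERN = re.compile(
    r"^(?P<person>[A-Z][\w.]*(?: (?:von|van|de|[A-Z][\w.]*))*)"  # subject entity
    r" \((?P<date>[A-Z][a-z]+ \d{1,2}, \d{4})"                    # object entity
)


def extract_birth(sentence):
    """Return (person, date) if the sentence matches the toy pattern, else None."""
    match = BIRTH_PATTERN.match(sentence)
    if match is None:
        return None
    return match.group("person"), match.group("date")


sentence = (
    "John von Neumann (December 28, 1903 – February 8, 1957) was a Hungarian "
    "and American pure and applied mathematician, physicist, inventor and polymath."
)
print(extract_birth(sentence))  # ('John von Neumann', 'December 28, 1903')
```

A fixed pattern like this breaks as soon as the phrasing changes, which is the gap a trained relation extractor is meant to close.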
It’s aimed at:
- users needing to perform Information Extraction on a large dataset.
- scientists wanting to experiment with new IE algorithms.
Features
- An active learning relation extraction tool pre-configured with convenient defaults.
- A rule-based relation extraction tool for cases where the documents are semi-structured or high precision is required.
- A web-based user interface that:
  - Allows non-expert users to control some aspects of IEPY.
  - Allows decentralization of human input.
- A shallow entity ontology with coreference resolution via Stanford CoreNLP.
- An easily hackable active learning core, ideal for scientists wanting to experiment with new algorithms.
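For readers new to the term, active learning generally means the system asks a human to label the examples it is least sure about. The sketch below shows one common strategy, uncertainty sampling, in plain Python. It is an assumption for illustration only: the function `pick_questions`, the candidate names, and the scores are hypothetical, and nothing here is taken from IEPY's core.

```python
def pick_questions(candidates, predict_proba, batch_size=2):
    """Return the candidates whose predicted probability is closest to 0.5."""
    ranked = sorted(candidates, key=lambda c: abs(predict_proba(c) - 0.5))
    return ranked[:batch_size]


# Hypothetical scores standing in for a trained relation classifier:
# the probability that each piece of evidence expresses the relation.
scores = {
    "evidence A": 0.95,
    "evidence B": 0.52,
    "evidence C": 0.10,
    "evidence D": 0.40,
}

to_label = pick_questions(list(scores), scores.get)
print(to_label)  # ['evidence B', 'evidence D']
```

The human answers for the selected items would then be added to the training set and the classifier retrained before the next round.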
Installation
Install the required packages:
sudo apt-get install build-essential python3-dev liblapack-dev libatlas-dev gfortran openjdk-7-jre
Then simply install with pip:
pip install iepy
Full details about the installation are available on the Read the Docs page.
Running the tests
If you are contributing to the project and want to run the tests, all you have to do is:
- Make sure your JAVA_HOME environment variable is correctly set.
- In the root of the project, run nosetests.
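Before running the tests, it can help to confirm that the Java environment variable points at a usable installation. The following is a minimal sketch using only the standard library; the function name `check_java_home` and the assumption that the executable lives at `bin/java` under that directory are illustrative, not something the IEPY documentation prescribes.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return the path of the java executable under JAVA_HOME, or raise RuntimeError."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        raise RuntimeError("JAVA_HOME is not set")
    # Assumed layout: <JAVA_HOME>/bin/java (java.exe on Windows).
    java_bin = Path(java_home) / "bin" / ("java.exe" if os.name == "nt" else "java")
    if not java_bin.is_file():
        raise RuntimeError(f"No java executable found at {java_bin}")
    return java_bin


try:
    print("Using", check_java_home())
except RuntimeError as error:
    print("Problem:", error)
```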
Learn more
The full documentation is available on Read the Docs.
Source Distribution
File details
Details for the file iepy-0.9.6.tar.gz.
File metadata
- Download URL: iepy-0.9.6.tar.gz
- Upload date:
- Size: 554.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2fb5ce4da5ed6e222ae8794663b83b6fe0c47d741d49bc264c2cf3eed5f05c6e
MD5 | 2c996e4601dc9907512a27e7ace83bd7
BLAKE2b-256 | 7c25b0d83c79908d4a74f4456b65b898972f41d71cacddffe47ca7c63d63c9ff
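The SHA256 digest listed above can be checked against a downloaded archive with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_of` and `verify` are made up for illustration, and it assumes the archive has already been saved locally.

```python
import hashlib

# SHA256 digest published above for iepy-0.9.6.tar.gz.
EXPECTED_SHA256 = "2fb5ce4da5ed6e222ae8794663b83b6fe0c47d741d49bc264c2cf3eed5f05c6e"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected
```

With the archive in the current directory, `verify("iepy-0.9.6.tar.gz")` should return `True` for an intact download.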