Relationships Extraction from NARrative Documents
Renard
Renard (Relationships Extraction from NARrative Documents) is a library for creating and using custom character networks extraction pipelines. Renard can extract dynamic as well as static character networks.
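To make the static/dynamic distinction concrete, here is a minimal sketch in plain Python. It assumes a common reading of the terms: a dynamic network is a sequence of graphs, one per slice of the narrative, and a static network is a single graph aggregated over the whole text. The chapter data and the function names are hypothetical and are not part of Renard's API.

```python
from collections import Counter

# Hypothetical toy data: for each chapter, the pairs of characters that interact.
chapters = [
    [("Renart", "Ysengrin"), ("Renart", "Hersent")],
    [("Ysengrin", "Renart")],
    [("Noble", "Renart")],
]


def edge(a, b):
    """Normalize an undirected edge so (a, b) and (b, a) are the same key."""
    return tuple(sorted((a, b)))


def dynamic_network(chapters):
    """One weighted edge list per chapter: the network as it evolves."""
    return [Counter(edge(a, b) for a, b in pairs) for pairs in chapters]


def static_network(chapters):
    """A single weighted edge list aggregated over the whole document."""
    total = Counter()
    for snapshot in dynamic_network(chapters):
        total.update(snapshot)
    return total


dynamic = dynamic_network(chapters)
static = static_network(chapters)
```

Under these assumptions, the static network loses the information about when an interaction happens, while the dynamic one keeps one snapshot per chapter.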
Installation
You can install the latest version using pip:
pip install renard-pipeline
Currently, Renard supports Python 3.8, 3.9 and 3.10.
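If you want to check your interpreter against this range before installing, a small sketch like the following may help. The version bounds come from the sentence above; the helper name is hypothetical and not something Renard provides.

```python
import sys

# Supported range as stated above; adjust if a newer release changes it.
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 10)


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) version falls in the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if not is_supported_python():
    print("Warning: this Python version is outside the range listed for Renard.")
```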
Documentation
Documentation, including installation instructions, can be found at https://compnet.github.io/Renard/
If you need local documentation, it can be generated using Sphinx. From the docs directory, make html should create documentation under docs/_build/html.
Tutorial
Renard's central concept is the Pipeline. A Pipeline is a list of PipelineStep objects that are run sequentially in order to extract a character graph from a document. Here is a simple example:
from renard.pipeline import Pipeline
from renard.pipeline.tokenization import NLTKTokenizer
from renard.pipeline.ner import NLTKNamedEntityRecognizer
from renard.pipeline.character_unification import GraphRulesCharacterUnifier
from renard.pipeline.graph_extraction import CoOccurrencesGraphExtractor

# Read the document to analyze
with open("./my_doc.txt") as f:
    text = f.read()

# Steps run in the order in which they are listed
pipeline = Pipeline(
    [
        NLTKTokenizer(),
        NLTKNamedEntityRecognizer(),
        GraphRulesCharacterUnifier(min_appearance=10),
        CoOccurrencesGraphExtractor(co_occurrences_dist=25),
    ]
)

# Run every step on the text
out = pipeline(text)
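To illustrate the idea of steps run in sequence, here is a toy pipeline in plain Python that mirrors the four stages of the example: tokenization, name detection, filtering of rare characters, and co-occurrence counting. It is a sketch only. Every function name is hypothetical, name detection is a simple lookup rather than real NER, and the reading of the two parameters (a minimum number of mentions, a maximum distance in tokens) is an assumption based on their names, not a description of Renard's implementation.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(state):
    """Step 1: split the text into word tokens."""
    state["tokens"] = re.findall(r"\w+", state["text"])
    return state


def make_name_finder(known_names):
    """Step 2 (stand-in for NER): record (token_index, name) for known names."""
    def step(state):
        state["mentions"] = [
            (i, tok) for i, tok in enumerate(state["tokens"]) if tok in known_names
        ]
        return state
    return step


def make_min_appearance_filter(min_appearance):
    """Step 3: drop characters mentioned fewer than min_appearance times."""
    def step(state):
        counts = Counter(name for _, name in state["mentions"])
        state["mentions"] = [
            (i, name) for i, name in state["mentions"] if counts[name] >= min_appearance
        ]
        return state
    return step


def make_co_occurrence_extractor(max_dist):
    """Step 4: count pairs of distinct characters at most max_dist tokens apart."""
    def step(state):
        edges = Counter()
        for (i, a), (j, b) in combinations(state["mentions"], 2):
            if a != b and abs(i - j) <= max_dist:
                edges[tuple(sorted((a, b)))] += 1
        state["edges"] = edges
        return state
    return step


def run_pipeline(steps, text):
    """Run each step in order, passing the accumulated state along."""
    state = {"text": text}
    for step in steps:
        state = step(state)
    return state


toy_text = (
    "Renart met Ysengrin at dawn. "
    "Later Renart tricked Ysengrin again while Noble slept."
)
toy_steps = [
    tokenize,
    make_name_finder({"Renart", "Ysengrin", "Noble"}),
    make_min_appearance_filter(2),
    make_co_occurrence_extractor(5),
]
toy_out = run_pipeline(toy_steps, toy_text)
print(dict(toy_out["edges"]))
```

In this toy run, Noble is mentioned only once and is filtered out before the graph is built, which shows why the order of the steps matters: each step works on what the previous ones produced.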
For more information, see renard_tutorial.py, which is a tutorial in the jupytext format. You can open it as a notebook in Jupyter Notebook (or export it as a notebook with jupytext --to ipynb renard-tutorial.py).
Running tests
Renard uses pytest for testing. To launch tests, use the following command:
poetry run python -m pytest tests
Expensive tests are disabled by default. These can be run by setting the environment variable RENARD_TEST_ALL to 1.
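As an illustration of how an environment variable can gate expensive tests, here is a minimal sketch using only the standard library. It is not Renard's actual test code: the helper and the test class are hypothetical, and it uses unittest's skipUnless decorator (pytest can also collect such tests) rather than whatever mechanism the project itself relies on.

```python
import os
import unittest


def expensive_tests_enabled(environ=os.environ):
    """Return True when RENARD_TEST_ALL is set to "1"."""
    return environ.get("RENARD_TEST_ALL") == "1"


class ExampleTests(unittest.TestCase):
    def test_cheap(self):
        # Always runs.
        self.assertEqual(1 + 1, 2)

    @unittest.skipUnless(expensive_tests_enabled(), "set RENARD_TEST_ALL=1 to run")
    def test_expensive(self):
        # Skipped unless the variable is set when the test module is loaded.
        self.assertEqual(sum(range(1_000_000)), 499_999_500_000)
```

With a gate like this, the variable has to be set in the environment of the process that launches the tests, since the check happens when the test module is imported.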
Contributing