Relationships Extraction from NARrative Documents
Renard
Renard (Relationships Extraction from NARrative Documents) is a library for creating and using custom character network extraction pipelines. It can extract both dynamic and static character networks.
Installation
You can install the latest version using pip:
pip install renard-pipeline
Currently, Renard supports Python 3.8, 3.9 and 3.10.
Documentation
Documentation, including installation instructions, can be found at https://compnet.github.io/Renard/
If you need local documentation, it can be generated using Sphinx. From the docs directory, make html should create documentation under docs/_build/html.
Tutorial
Renard's central concept is the Pipeline. A Pipeline is a list of PipelineStep objects that are run sequentially to extract a character graph from a document. Here is a simple example:
from renard.pipeline import Pipeline
from renard.pipeline.tokenization import NLTKTokenizer
from renard.pipeline.ner import NLTKNamedEntityRecognizer
from renard.pipeline.character_unification import GraphRulesCharacterUnifier
from renard.pipeline.graph_extraction import CoOccurrencesGraphExtractor

with open("./my_doc.txt") as f:
    text = f.read()

pipeline = Pipeline(
    [
        NLTKTokenizer(),
        NLTKNamedEntityRecognizer(),
        GraphRulesCharacterUnifier(min_appearance=10),
        CoOccurrencesGraphExtractor(co_occurrences_dist=25),
    ]
)

out = pipeline(text)
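To make the idea of sequential steps more concrete, here is a minimal, standard-library-only sketch of the general pattern. It is not Renard's implementation: the names run_pipeline, tokenize, make_mention_finder and make_co_occurrence_counter are hypothetical, the "NER" step is a plain name lookup, and it assumes that a co-occurrence distance is measured in tokens.

```python
from collections import Counter


def tokenize(state):
    """Toy tokenization step: split the text on whitespace."""
    return {"tokens": state["text"].split()}


def make_mention_finder(names):
    """Toy NER step: record (token index, name) for every known name."""

    def find_mentions(state):
        mentions = []
        for i, token in enumerate(state["tokens"]):
            word = token.strip(".,;!?")
            if word in names:
                mentions.append((i, word))
        return {"mentions": mentions}

    return find_mentions


def make_co_occurrence_counter(max_dist):
    """Toy graph step: link two different names whose mentions are
    at most max_dist tokens apart (assumed meaning of the distance)."""

    def count_co_occurrences(state):
        edges = Counter()
        mentions = state["mentions"]
        for a, (i, name_a) in enumerate(mentions):
            for j, name_b in mentions[a + 1:]:
                if j - i > max_dist:
                    break  # mentions are sorted by position
                if name_a != name_b:
                    edges[tuple(sorted((name_a, name_b)))] += 1
        return {"edges": dict(edges)}

    return count_co_occurrences


def run_pipeline(steps, text):
    """Run each step in order, merging its output into a shared state."""
    state = {"text": text}
    for step in steps:
        state.update(step(state))
    return state


toy_steps = [
    tokenize,
    make_mention_finder({"Alice", "Bob", "Carol"}),
    make_co_occurrence_counter(max_dist=3),
]
toy_out = run_pipeline(
    toy_steps,
    "Alice met Bob. Later, far away from home, Carol wrote to Bob.",
)
print(toy_out["edges"])
```

Each step only reads what earlier steps produced, which is why the order of the list matters: the graph step cannot run before mentions have been detected.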
For more information, see renard_tutorial.py, which is a tutorial in the jupytext format. You can open it as a notebook in Jupyter Notebook (or export it as a notebook with jupytext --to ipynb renard_tutorial.py).
Running tests
Renard uses pytest for testing. To launch tests, use the following command:
poetry run python -m pytest tests
Expensive tests are disabled by default. They can be run by setting the environment variable RENARD_TEST_ALL to 1.
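As an illustration, the test command and the RENARD_TEST_ALL variable can be combined from a small Python helper. This is only a sketch: build_test_invocation is a hypothetical name that is not part of Renard, and it assumes poetry is available on your PATH.

```python
import os


def build_test_invocation(run_expensive=False, base_env=None):
    """Return the (command, environment) pair for launching the tests."""
    # Copy the environment so the current process is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    if run_expensive:
        # Enable the expensive tests, which are disabled by default.
        env["RENARD_TEST_ALL"] = "1"
    command = ["poetry", "run", "python", "-m", "pytest", "tests"]
    return command, env


command, env = build_test_invocation(run_expensive=True)
# To actually launch the tests from the repository root:
#   import subprocess
#   subprocess.run(command, env=env, check=True)
```

In a POSIX shell, the same effect is usually obtained by prefixing the test command with RENARD_TEST_ALL=1.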
Contributing
Download files
Built Distribution
Hashes for renard_pipeline-0.4.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a54bd8b4161b9a3f918c3549c5b18f10e5437e5db0c387e4d202b36f992487a8
MD5 | e0301c99ba135b04a5ec274d03e91dca
BLAKE2b-256 | 136af34716d89dc726e5371878144324a2c83de2bcc5719a7d711a718d89ec45