Relationships Extraction from NARrative Documents
Renard
Renard (Relationships Extraction from NARrative Documents) is a library for creating and using custom character network extraction pipelines. Renard can extract dynamic as well as static character networks.
Installation
You can install the latest version using pip:
pip install renard-pipeline
Currently, Renard supports Python 3.8, 3.9 and 3.10.
Documentation
Documentation, including installation instructions, can be found at https://compnet.github.io/Renard/
If you need local documentation, it can be generated using Sphinx. From the docs directory, make html should create documentation under docs/_build/html.
Tutorial
Renard's central concept is the Pipeline. A Pipeline is a list of PipelineStep objects that are run sequentially in order to extract a character graph from a document. Here is a simple example:
from renard.pipeline import Pipeline
from renard.pipeline.tokenization import NLTKTokenizer
from renard.pipeline.ner import NLTKNamedEntityRecognizer
from renard.pipeline.character_unification import GraphRulesCharacterUnifier
from renard.pipeline.graph_extraction import CoOccurrencesGraphExtractor

with open("./my_doc.txt") as f:
    text = f.read()

pipeline = Pipeline(
    [
        NLTKTokenizer(),
        NLTKNamedEntityRecognizer(),
        GraphRulesCharacterUnifier(min_appearance=10),
        CoOccurrencesGraphExtractor(co_occurrences_dist=25)
    ]
)

out = pipeline(text)
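To make the idea of sequential steps more concrete, here is a toy sketch in plain Python. It is not Renard's implementation: every name in it (run_pipeline, tokenize, make_name_tagger, make_co_occurrence_counter, max_dist) is hypothetical, the "NER" step is a simple lookup in a list of known names, and it assumes that the co-occurrence distance is measured in tokens. It only illustrates how each step can consume the state produced by the previous one, and how a distance threshold turns character mentions into weighted edges.

```python
from collections import Counter
from itertools import combinations
from typing import Any, Callable, Dict, List

# Hypothetical names throughout: a toy model of the idea,
# not Renard's actual classes.
State = Dict[str, Any]
Step = Callable[[State], State]


def tokenize(state: State) -> State:
    # Naive whitespace tokenization, stripping surrounding punctuation.
    tokens = [t.strip(".,;:!?\"'") for t in state["text"].split()]
    return {**state, "tokens": [t for t in tokens if t]}


def make_name_tagger(names: List[str]) -> Step:
    # Stand-in for NER: record the token index of each known name.
    known = set(names)

    def tag(state: State) -> State:
        mentions = [(i, t) for i, t in enumerate(state["tokens"]) if t in known]
        return {**state, "mentions": mentions}

    return tag


def make_co_occurrence_counter(max_dist: int) -> Step:
    # Link two different characters when their mentions are at most
    # `max_dist` tokens apart; the edge weight counts such pairs.
    def count(state: State) -> State:
        edges: Counter = Counter()
        for (i, a), (j, b) in combinations(state["mentions"], 2):
            if a != b and abs(i - j) <= max_dist:
                edges[tuple(sorted((a, b)))] += 1
        return {**state, "edges": dict(edges)}

    return count


def run_pipeline(steps: List[Step], text: str) -> State:
    # Each step receives the state produced by the previous one.
    state: State = {"text": text}
    for step in steps:
        state = step(state)
    return state


toy_steps = [
    tokenize,
    make_name_tagger(["Alice", "Bob", "Carol"]),
    make_co_occurrence_counter(max_dist=5),
]
toy_text = (
    "Alice met Bob. Much later that very long day, "
    "Carol arrived. Bob waved at Carol."
)
toy_out = run_pipeline(toy_steps, toy_text)
print(toy_out["edges"])
```

In this toy text, Alice and Bob are linked once, and Bob and Carol twice; Alice and Carol are never within five tokens of each other, so no edge is created between them. A larger distance threshold would produce a denser graph.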
For more information, see renard_tutorial.py, which is a tutorial in the jupytext format. You can open it as a notebook in Jupyter Notebook (or export it as a notebook with jupytext --to ipynb renard_tutorial.py).
Running tests
Renard uses pytest for testing. To launch tests, use the following command:
poetry run python -m pytest tests
Expensive tests are disabled by default. These can be run by setting the environment variable RENARD_TEST_ALL to 1.
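As an illustration of how a test can be gated on an environment variable, here is a minimal sketch. It is an assumption about the general technique, not a copy of Renard's test suite, which may do this differently (for instance with pytest markers). The helper expensive_tests_enabled and the ExampleTests class are hypothetical; the sketch uses the standard library's unittest, whose test cases pytest can also collect.

```python
import os
import unittest


def expensive_tests_enabled(environ=os.environ) -> bool:
    # Only the exact value "1" enables the expensive tests in this sketch.
    return environ.get("RENARD_TEST_ALL") == "1"


class ExampleTests(unittest.TestCase):
    def test_cheap(self):
        # Always runs.
        self.assertEqual(1 + 1, 2)

    @unittest.skipUnless(
        expensive_tests_enabled(), "set RENARD_TEST_ALL=1 to run"
    )
    def test_expensive(self):
        # Hypothetical placeholder for a slow test,
        # e.g. one that loads a large model.
        self.assertTrue(True)
```

With this pattern, the expensive test is reported as skipped unless the variable is set to 1 in the environment of the test run.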
Contributing
See the "Contributing" section of the documentation.
How to cite
If you use Renard in your research project, please cite it as follows:
@Article{Amalvy2024,
  doi = {10.21105/joss.06574},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {98},
  pages = {6574},
  author = {Amalvy, A. and Labatut, V. and Dufour, R.},
  title = {Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts},
  journal = {Journal of Open Source Software},
}
We would be happy to hear about your usage of Renard, so don't hesitate to reach out!