A general framework for creating and annotating ArangoDB graphs.

Project description

Welcome to the Corpus Annotation Graph Builder (CAG)

cag is a Python library that offers an architectural framework for applying the build-and-annotate pattern when building graphs.


Paper video.

Corpus Annotation Graph Builder (CAG) is an architectural framework that uses the build-and-annotate pattern to create a graph. CAG is built on top of ArangoDB and its Python driver (PyArango). The build-and-annotate pattern consists of two phases (see Figure below):

(1) Build: data is collected from different sources (e.g., publication databases, online encyclopedias, news feeds, web portals, electronic libraries, repositories, media platforms) and preprocessed to build the core nodes, which we call Objects of Interest (OOIs). The component responsible for this phase is the Graph-Creator.

(2) Annotate: annotations are extracted from the OOIs, and the corresponding annotation nodes are created and linked to the core nodes. The component responsible for this phase is the Graph-Annotator.

Figure: the two phases of the build-and-annotate pattern in CAG.

This framework aims to offer researchers a flexible but unified and reproducible way of organizing and maintaining their interlinked document collections in a Corpus Annotation Graph.
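The two phases of the build-and-annotate pattern can be illustrated with a small, library-independent sketch. It does not use the cag API or ArangoDB: the Graph class and the build_phase and annotate_phase functions are hypothetical names chosen for illustration, and the keyword matching stands in for whatever annotation a real project would extract.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Tiny in-memory stand-in for a graph database (not ArangoDB)."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: list = field(default_factory=list)  # (source id, relation, target id)

    def upsert_node(self, node_id, **attrs):
        # Create the node if it is missing, otherwise update its attributes.
        self.nodes.setdefault(node_id, {}).update(attrs)

    def add_edge(self, src, relation, dst):
        edge = (src, relation, dst)
        if edge not in self.edges:
            self.edges.append(edge)


def build_phase(graph, documents):
    """Phase 1: create one core node (Object of Interest) per document."""
    for doc_id, text in documents.items():
        graph.upsert_node(f"doc/{doc_id}", kind="ooi", text=text.strip())


def annotate_phase(graph, keywords):
    """Phase 2: derive annotation nodes from the OOIs and link them."""
    # Collect the OOI ids first so the node dict is not mutated while iterating.
    ooi_ids = [nid for nid, attrs in graph.nodes.items() if attrs["kind"] == "ooi"]
    for node_id in ooi_ids:
        words = set(graph.nodes[node_id]["text"].lower().split())
        for kw in keywords:
            if kw in words:
                graph.upsert_node(f"keyword/{kw}", kind="annotation")
                graph.add_edge(node_id, "mentions", f"keyword/{kw}")


graph = Graph()
build_phase(graph, {"a": "Graph databases store nodes", "b": "Annotation of a corpus"})
annotate_phase(graph, ["graph", "corpus"])
```

Keeping the phases separate means the annotation step can be rerun or extended without rebuilding the core nodes, which is the property the pattern is meant to provide.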

Installation

Direct install via pip

The package can be installed directly via pip:

pip install cag

After installation, the cag module can be imported from any Python script on your machine. The two main packages are cag.framework and cag.view_wrapper.
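One way to check that the installation worked is to ask Python's import system whether it can locate the packages named above. This sketch uses only the standard library; is_importable is a hypothetical helper name, and nothing here calls into cag itself.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the import system can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Looking up a submodule imports its parent package first, so a
        # missing or broken parent raises instead of returning None.
        return False


for name in ("cag", "cag.framework", "cag.view_wrapper"):
    print(f"{name}: {'found' if is_importable(name) else 'missing'}")
```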

Manual cloning

Clone the repository, go to the root folder and then run:

pip install -e .

Usage

Download files

Source Distribution

cag-1.1.5.tar.gz (3.0 MB)

Built Distribution

cag-1.1.5-py3-none-any.whl (37.8 kB, Python 3)
