Use this library to transform raw text into different graph representations.
Project description
text2graph-API
Use this library for text-to-graph transformations. To use the API, install its modules and dependencies in your application, then load the corpus of text documents that you want to transform into graphs.
text2graphapi is a text-to-graph transformation pipeline that consists of four main modules:
- Text Preprocessing and Normalization. This module cleans and preprocesses the text, applying NLP methods such as POS tagging, lemmatization, and stemming.
- Graph Model. This module defines the entities (nodes) and their relationships (edges) according to the problem specification.
- Graph Extraction. This module builds the graph according to the selected model, using third-party libraries such as NetworkX.
- Graph Transformation and Analysis. This module applies vector transformations to the graph to produce the final output, such as an adjacency matrix or a dense matrix.
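The four stages above can be illustrated with a small standard-library sketch. This is not the library's implementation: the function names `preprocess`, `extract_graph`, and `to_adjacency_matrix` are hypothetical, the preprocessing is reduced to lowercasing and tokenizing, and the graph model is assumed to be "each word points to the word that follows it".

```python
import re

def preprocess(text):
    # Stage 1 (simplified): lowercase the text and split it into word tokens.
    # A real pipeline would also apply POS tagging, lemmatization, or stemming.
    return re.findall(r"[a-z0-9']+", text.lower())

def extract_graph(tokens):
    # Stages 2 and 3 (assumed model): nodes are the distinct words, and a
    # directed edge links each word to the next one, weighted by frequency.
    nodes = sorted(set(tokens))
    edges = {}
    for src, dst in zip(tokens, tokens[1:]):
        edges[(src, dst)] = edges.get((src, dst), 0) + 1
    return nodes, edges

def to_adjacency_matrix(nodes, edges):
    # Stage 4: turn the edge dictionary into a square matrix of edge weights,
    # with rows as source nodes and columns as target nodes.
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for (src, dst), weight in edges.items():
        matrix[index[src]][index[dst]] = weight
    return matrix

tokens = preprocess("The cat sat. The cat ran.")
nodes, edges = extract_graph(tokens)
matrix = to_adjacency_matrix(nodes, edges)

print(nodes)
for row in matrix:
    print(row)
```

Keeping each stage as a separate function mirrors the modular split described above, so one stage (for example, the graph model) could be swapped without touching the others.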
Where to get it
# from PyPI
pip install text2graphapi
Example input data
# Must be a list of dicts, where each dict contains an 'id' and a 'doc' key holding the text data
input_text_docs = [{"id": 1, "doc": "text_data_1"},
                   {"id": 2, "doc": "text_data_2"}]
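If the corpus starts out as a plain list of strings, a small helper along these lines could build the expected list of dicts. The name `build_corpus` and its `start_id` parameter are hypothetical and not part of text2graphapi; the sketch only assumes the `id`/`doc` layout shown above.

```python
def build_corpus(texts, start_id=1):
    # Wrap each raw string in a dict with a sequential 'id' and the text under 'doc'.
    return [{"id": doc_id, "doc": text}
            for doc_id, text in enumerate(texts, start=start_id)]

input_text_docs = build_corpus(["text_data_1", "text_data_2"])
print(input_text_docs)
```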
How to use it
from text2graphapi.src.Cooccurrence import Cooccurrence

# Configure the co-occurrence transformation
to_cooccurrence = Cooccurrence(
    graph_type = 'DiGraph',
    apply_prep = True,
    parallel_exec = False,
    window_size = 1,
    language = 'en',
    output_format = 'adj_matrix')

# Transform the corpus defined in the example input data above
output_text_graphs = to_cooccurrence.transform(input_text_docs)
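To give an intuition for the `window_size` setting, the sketch below shows one common way a co-occurrence window is interpreted: each word is linked to the words that follow it within the window. Whether text2graphapi uses exactly this definition is an assumption, and `window_edges` is a hypothetical helper, not a library function.

```python
def window_edges(tokens, window_size=1):
    # Link every token to the next `window_size` tokens (assumed semantics).
    edges = set()
    for i, src in enumerate(tokens):
        for dst in tokens[i + 1 : i + 1 + window_size]:
            edges.add((src, dst))
    return edges

tokens = ["graphs", "encode", "word", "order"]
narrow = window_edges(tokens, window_size=1)  # only adjacent words are linked
wide = window_edges(tokens, window_size=2)    # also links words two positions apart

print(sorted(narrow))
print(sorted(wide - narrow))
```

Under this reading, a larger window produces a denser graph, because words that are further apart in the text also become connected.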