
Library designed to compress graphs


graphcompress

graphcompress is a Python library that compresses graphs stored in the N-Triples format.

Formats

Compressed Graph

The output format writes one node per line: first the NODE ID, then space-separated groups of connections. Each group holds an EDGE ID followed by the NODES CONNECTED USING THE EDGE, all separated by dots. A negative EDGE ID marks a connection in the opposite direction, that is, from the listed nodes to this node.

Example:

14 5.3.7 -3.2

The line says:

  • Node 14 connects to Node 3 using Edge 5.
  • Node 14 connects to Node 7 using Edge 5.
  • Node 2 connects to Node 14 using Edge 3.
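The reading rules above can be sketched as a small decoder. This is only an illustration of the line format, not part of graphcompress; the function name `parse_compressed_line` and the `(source, edge, target)` tuple layout are assumptions made for this example.

```python
def parse_compressed_line(line):
    """Decode one Compressed Graph line into (node_id, edges).

    edges is a list of (source, edge_id, target) tuples.
    Hypothetical helper for illustration; not a graphcompress API.
    """
    fields = line.split()
    node = int(fields[0])
    edges = []
    for group in fields[1:]:
        # Each group is "EDGE.NODE.NODE..."; the sign belongs to the edge ID.
        edge_id, *neighbours = (int(part) for part in group.split("."))
        for other in neighbours:
            if edge_id < 0:
                # Negative edge ID: the connection points at this node.
                edges.append((other, -edge_id, node))
            else:
                edges.append((node, edge_id, other))
    return node, edges


node, edges = parse_compressed_line("14 5.3.7 -3.2")
print(node)   # 14
print(edges)  # [(14, 5, 3), (14, 5, 7), (2, 3, 14)]
```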

Indexed Graph

The output format writes one node per line: first the NODE ID, then the space-separated indexes of the edges in which the node participates.

Example:

14 3 5 7

The line says:

  • Node 14 appears in the edge with the index 3.
  • Node 14 appears in the edge with the index 5.
  • Node 14 appears in the edge with the index 7.
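Reading such a line back takes only a few lines of code. The sketch below is an illustration of the format, not a graphcompress function; the name `parse_indexed_line` is hypothetical.

```python
def parse_indexed_line(line):
    """Decode one Indexed Graph line into (node_id, [edge indexes]).

    Hypothetical helper for illustration; not a graphcompress API.
    """
    # The first field is the node ID, the rest are edge indexes.
    node, *edge_indexes = (int(field) for field in line.split())
    return node, edge_indexes


print(parse_indexed_line("14 3 5 7"))  # (14, [3, 5, 7])
```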

Install

pip install graph-compress

Usage

Input file: graph.nt.gz

<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .
<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P3> <http://www.wikidata.org/entity/Q6> .
<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .
<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q2> .

(The lines above show the uncompressed contents of the file.)

Compressed Graph

from graphcompress.readers import Reader_NT, Reader_NTgz, Reader_CGgz
from graphcompress.parsers import WikidataParser, CGParser
from graphcompress.builders.compress_graph import CGBuilder

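# Read the gzipped N-Triples input and parse Wikidata URIs
r = Reader_NTgz('graph.nt.gz')
p = WikidataParser()
# 'tmp' is presumably a working location for the intermediate partitions
b = CGBuilder(r, p, 'tmp', 'output.gz')

# Build the partitions, then merge them into output.gz
b.make_partitions()
b.merge_partitions()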

Output file: output.gz

1 5.5 3.6
2 -5.6
5 -5.1.6
6 5.5.2 -3.1

(The lines above show the uncompressed contents of output.gz.)
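To see how the output relates to the input, the following in-memory sketch reproduces the same four lines from the four sample triples. It is not how graphcompress works internally (the library builds and merges partitions); it only illustrates the mapping. The regular expression, the function name `compress`, and the ordering rule (outgoing groups first, otherwise order of first appearance) are assumptions chosen to match the sample output.

```python
import re
from collections import defaultdict

# Assumption: node and edge IDs are the numbers after Q and P in Wikidata URIs.
TRIPLE = re.compile(r"<[^>]*/Q(\d+)>\s+<[^>]*/P(\d+)>\s+<[^>]*/Q(\d+)>\s*\.")

SAMPLE = [
    "<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .",
    "<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P3> <http://www.wikidata.org/entity/Q6> .",
    "<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .",
    "<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q2> .",
]


def compress(triples):
    """Illustrative in-memory conversion; not the library's algorithm."""
    # node -> signed edge ID -> connected nodes, in order of first appearance
    groups = defaultdict(lambda: defaultdict(list))
    for line in triples:
        match = TRIPLE.match(line)
        if match is None:
            continue  # skip lines that are not Q-P-Q triples
        subject, edge, obj = map(int, match.groups())
        groups[subject][edge].append(obj)    # outgoing connection
        groups[obj][-edge].append(subject)   # incoming connection, negative ID
    lines = []
    for node in sorted(groups):
        parts = [str(node)]
        # Stable sort: positive (outgoing) groups first, then negative ones.
        for edge in sorted(groups[node], key=lambda e: e < 0):
            parts.append(".".join(map(str, [edge, *groups[node][edge]])))
        lines.append(" ".join(parts))
    return lines


for line in compress(SAMPLE):
    print(line)
```

Because everything is held in dictionaries, this sketch only suits graphs that fit in memory.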

Indexed Graph

from graphcompress.readers import Reader_NT, Reader_NTgz, Reader_CGgz
from graphcompress.parsers import WikidataParser, CGParser
from graphcompress.builders.index_graph import IGBuilder

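# Read the gzipped N-Triples input and parse Wikidata URIs
r = Reader_NTgz('graph.nt.gz')
p = WikidataParser()
# 'tmp' is presumably a working location for the intermediate partitions
b = IGBuilder(r, p, 'tmp', 'output.gz')

# Build the partitions, then merge them into output.gz
b.make_partitions()
b.merge_partitions()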

Output file: output.gz

1 0 1
2 3
5 0 2
6 1 2 3

(The lines above show the uncompressed contents of output.gz.)
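In this sample, the edge indexes match the zero-based position of each triple in the input file. The in-memory sketch below reproduces the output under that assumption. It is an illustration only, not the library's partition-based implementation, and the names `ENTITY` and `index_graph` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: node IDs are the numbers after Q in Wikidata entity URIs.
ENTITY = re.compile(r"/Q(\d+)>")

SAMPLE = [
    "<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .",
    "<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P3> <http://www.wikidata.org/entity/Q6> .",
    "<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .",
    "<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q2> .",
]


def index_graph(triples):
    """Illustrative in-memory conversion; not the library's algorithm."""
    index = defaultdict(list)  # node ID -> positions of the triples it appears in
    for position, line in enumerate(triples):
        subject, _predicate, obj = line.split()[:3]
        # A set makes a self-loop list its edge index only once.
        for term in {subject, obj}:
            match = ENTITY.search(term)
            if match:
                index[int(match.group(1))].append(position)
    return [
        " ".join(map(str, [node, *index[node]]))
        for node in sorted(index)
    ]


for line in index_graph(SAMPLE):
    print(line)
```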
