Library designed to compress graphs

graphcompress

graphcompress is a Python library that compresses graphs stored in the N-Triples format.

Formats

Compressed Graph

In the Compressed Graph format, each line describes one node. The line starts with the NODE ID, followed by space-separated groups of connections. Each group holds an EDGE ID and the NODES CONNECTED USING THE EDGE, separated by dots. A negative EDGE ID marks a connection in the opposite direction, that is, from the listed nodes to the node the line describes.

Example:

14 5.3.7 -3.2

The line says:

  • Node 14 connects to Node 3 using Edge 5.
  • Node 14 connects to Node 7 using Edge 5.
  • Node 2 connects to Node 14 using Edge 3.
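
The reading above can be sketched as a small decoder. This is an illustrative helper written for this description, not part of the graphcompress API; the name parse_cg_line and the (source, edge, target) tuple layout are assumptions.

```python
def parse_cg_line(line):
    """Decode one Compressed Graph line into (source, edge, target) tuples."""
    node_field, *groups = line.split()
    node = int(node_field)
    triples = []
    for group in groups:
        # Each group is "EDGE.NODE.NODE...", with a leading "-" for reversed edges.
        edge_field, *neighbours = group.split(".")
        edge = int(edge_field)
        for neighbour in map(int, neighbours):
            if edge < 0:
                # Negative edge ID: the neighbour points at this node.
                triples.append((neighbour, -edge, node))
            else:
                triples.append((node, edge, neighbour))
    return triples


print(parse_cg_line("14 5.3.7 -3.2"))
# [(14, 5, 3), (14, 5, 7), (2, 3, 14)]
```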

Indexed Graph

In the Indexed Graph format, each line also describes one node: first the NODE ID, then the indexes of the edges in which the node participates.

Example:

14 3 5 7

The line says:

  • Node 14 appears in the edge with the index 3.
  • Node 14 appears in the edge with the index 5.
  • Node 14 appears in the edge with the index 7.
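
A line in this format can be decoded with a few lines of Python. The helper below is a hypothetical sketch for illustration and is not provided by graphcompress.

```python
def parse_ig_line(line):
    """Decode one Indexed Graph line into (node_id, [edge_index, ...])."""
    node_field, *index_fields = line.split()
    return int(node_field), [int(field) for field in index_fields]


node, edge_indexes = parse_ig_line("14 3 5 7")
print(node, edge_indexes)
# 14 [3, 5, 7]
```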

Install

pip install graph-compress

Usage

Input file: graph.nt.gz

<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .
<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P3> <http://www.wikidata.org/entity/Q6> .
<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q5> .
<http://www.wikidata.org/entity/Q6> <http://www.wikidata.org/prop/direct/P5> <http://www.wikidata.org/entity/Q2> .

(The lines above show the decompressed contents of graph.nt.gz.)
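
The outputs shown later use plain integers, so the Wikidata URIs are presumably reduced to the numeric part of their Q and P identifiers. The following sketch shows one way to do that with a regular expression. It is an assumption about the mapping, not the code of WikidataParser, and parse_wikidata_triple is a hypothetical name.

```python
import re

# One triple whose subject and object end in /Q<number> and whose predicate
# ends in /P<number>, terminated by the N-Triples full stop.
TRIPLE_RE = re.compile(r"<[^>]*/Q(\d+)>\s+<[^>]*/P(\d+)>\s+<[^>]*/Q(\d+)>\s*\.")


def parse_wikidata_triple(line):
    """Return (subject, predicate, object) as integers, or None if no match."""
    match = TRIPLE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj = map(int, match.groups())
    return subject, predicate, obj


line = ("<http://www.wikidata.org/entity/Q1> "
        "<http://www.wikidata.org/prop/direct/P5> "
        "<http://www.wikidata.org/entity/Q5> .")
print(parse_wikidata_triple(line))
# (1, 5, 5)
```

Lines that do not fit this shape, such as triples with literal objects, yield None in this sketch. How the library treats them is not stated here.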

Compressed Graph

from graphcompress.readers import Reader_NT, Reader_NTgz, Reader_CGgz
from graphcompress.parsers import WikidataParser, CGParser
from graphcompress.builders.compress_graph import CGBuilder

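# Read the gzipped N-Triples input and parse its Wikidata URIs
r = Reader_NTgz('graph.nt.gz')
p = WikidataParser()
# 'tmp' is presumably the working directory for partitions; 'output.gz' is the result file
b = CGBuilder(r, p, 'tmp', 'output.gz')

# Build the output in two steps: partition the input, then merge the partitions
b.make_partitions()
b.merge_partitions()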

Output file: output.gz

1 5.5 3.6
2 -5.6
5 -5.1.6
6 5.5.2 -3.1

(The lines above show the decompressed contents of output.gz.)
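
To make the mapping from input triples to this output concrete, here is a small in-memory sketch that reproduces the four lines above. It is not how CGBuilder works internally (the library builds and merges partitions on disk). The function name and the ordering rules (nodes sorted by ID, outgoing groups before incoming ones, neighbours in input order) are assumptions chosen to match the sample.

```python
from collections import defaultdict


def build_compressed_lines(triples):
    """Turn (subject, edge, object) integer triples into Compressed Graph lines."""
    # node -> signed edge ID -> neighbours, in order of first appearance
    groups = defaultdict(dict)
    for subject, edge, obj in triples:
        groups[subject].setdefault(edge, []).append(obj)
        groups[obj].setdefault(-edge, []).append(subject)

    lines = []
    for node in sorted(groups):
        # Stable sort: outgoing (positive) groups first, then incoming ones.
        ordered = sorted(groups[node].items(), key=lambda item: item[0] < 0)
        fields = [".".join(map(str, [edge, *neighbours]))
                  for edge, neighbours in ordered]
        lines.append(" ".join([str(node), *fields]))
    return lines


# The four input triples, already reduced to integer IDs.
triples = [(1, 5, 5), (1, 3, 6), (6, 5, 5), (6, 5, 2)]
print("\n".join(build_compressed_lines(triples)))
# 1 5.5 3.6
# 2 -5.6
# 5 -5.1.6
# 6 5.5.2 -3.1
```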

Indexed Graph

from graphcompress.readers import Reader_NT, Reader_NTgz, Reader_CGgz
from graphcompress.parsers import WikidataParser, CGParser
from graphcompress.builders.index_graph import IGBuilder

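# Read the gzipped N-Triples input and parse its Wikidata URIs
r = Reader_NTgz('graph.nt.gz')
p = WikidataParser()
# 'tmp' is presumably the working directory for partitions; 'output.gz' is the result file
b = IGBuilder(r, p, 'tmp', 'output.gz')

# Build the output in two steps: partition the input, then merge the partitions
b.make_partitions()
b.merge_partitions()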

Output file: output.gz

1 0 1
2 3
5 0 2
6 1 2 3

(The lines above show the decompressed contents of output.gz.)
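
In this sample the edge index appears to be the 0-based position of the triple in the input file. Under that assumption, the output above can be reproduced in memory as follows. This is an illustrative sketch with a hypothetical function name, not the IGBuilder implementation.

```python
from collections import defaultdict


def build_indexed_lines(triples):
    """Turn (subject, edge, object) integer triples into Indexed Graph lines."""
    edges_of = defaultdict(list)
    for index, (subject, _edge, obj) in enumerate(triples):
        # dict.fromkeys drops the duplicate when subject and object coincide.
        for node in dict.fromkeys((subject, obj)):
            edges_of[node].append(index)
    return [" ".join(map(str, [node, *edges_of[node]]))
            for node in sorted(edges_of)]


# The four input triples, already reduced to integer IDs.
triples = [(1, 5, 5), (1, 3, 6), (6, 5, 5), (6, 5, 2)]
print("\n".join(build_indexed_lines(triples)))
# 1 0 1
# 2 3
# 5 0 2
# 6 1 2 3
```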

