An open-source Python library for generating and loading synthetic and real-world graph datasets

GraphFaker

GraphFaker is a Python library for generating and loading synthetic and real-world graph datasets. It supports three sources: synthetic social graphs (faker), OpenStreetMap (OSM) road networks, and real airline flight networks. Use it for data science, research, teaching, rapid prototyping, and more!

Note: The authors and GraphGeeks Labs accept no responsibility for the correctness of this generator.


Problem Statement

Graph data is essential for solving complex problems in various fields, including social network analysis, transportation modeling, recommendation systems, and fraud detection. However, many professionals, researchers, and students face a common challenge: a lack of easily accessible, realistic graph datasets for testing, learning, and benchmarking. Real-world graph data is often restricted due to privacy concerns, complexity, or large size, making experimentation difficult.

Solution: GraphFaker

GraphFaker is an open-source Python library for generating, loading, and exporting graph datasets in a user-friendly, configurable way. Users can generate graphs tailored to their specific needs and get on with experimenting and learning, without having to work out where the data comes from or how to fetch it.

Features

  • Multiple Graph Sources:
    • faker: Synthetic social graphs with rich node/edge types
    • osm: Real-world road networks from OpenStreetMap
    • flights: Real airline, airport, and flight networks
  • Easy CLI & Python Library

Installation

Install from PyPI:

uv pip install graphfaker

For development:

git clone https://github.com/denironyx/graphfaker.git
cd graphfaker
uv pip install -e .
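
To confirm that either install path worked, one option is to query the installed package metadata from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and is not part of GraphFaker.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("graphfaker")
    print(found if found else "graphfaker is not installed in this environment")
```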

Quick Start


Python Library Usage

from graphfaker import GraphFaker

gf = GraphFaker()
# Synthetic social/knowledge graph
g1 = gf.generate_graph(source="faker", total_nodes=200, total_edges=800)
# OSM road network
g2 = gf.generate_graph(source="osm", place="Berlin, Germany", network_type="drive")
# Flight network
g3 = gf.generate_graph(source="flights", country="United States", year=2024, month=1)
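
A quick sanity check on a generated graph is to count its nodes by type. The sketch below assumes nodes carry a `type` attribute; that key is an assumption, not something this README confirms, so inspect your graph's node data for the actual name. The helper `count_node_types` is hypothetical. It accepts any iterable of `(node, attrs)` pairs, which is what NetworkX's `G.nodes(data=True)` yields, so the stand-in data can be swapped for a real graph.

```python
from collections import Counter
from typing import Any, Hashable, Iterable, Mapping, Tuple


def count_node_types(
    nodes: Iterable[Tuple[Hashable, Mapping[str, Any]]],
    key: str = "type",  # assumed attribute name; adjust to your graph
) -> Counter:
    """Count nodes per value of `key`; nodes lacking the key count as 'unknown'."""
    return Counter(attrs.get(key, "unknown") for _, attrs in nodes)


# Stand-in for G.nodes(data=True); these values are made up for illustration.
sample_nodes = [
    ("n1", {"type": "Person"}),
    ("n2", {"type": "Person"}),
    ("n3", {"type": "Place"}),
    ("n4", {}),
]

counts = count_node_types(sample_nodes)
print(counts.most_common())
```

If the returned object is a NetworkX graph, the equivalent call would be `count_node_types(g1.nodes(data=True))`.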

Advanced: Date Range for Flights

Note: this option is not yet recommended. It is still being tested, and we are working on ways to make it faster.

g = gf.generate_graph(source="flights", country="United States", date_range=("2024-01-01", "2024-01-15"))

CLI Usage (WIP)

Show help:

python -m graphfaker.cli --help

Generate a Synthetic Social Graph

python -m graphfaker.cli gen \
    --source faker \
    --total-nodes 100 \
    --total-edges 500

Generate a Real-World Road Network (OSM)

python -m graphfaker.cli gen \
    --source osm \
    --place "Berlin, Germany" \
    --network-type drive \
    --export berlin.graphml

Generate a Flight Network (Airlines/Airports/Flights)

python -m graphfaker.cli gen \
    --source flights \
    --country "United States" \
    --year 2024 \
    --month 1

You can also use --date-range for custom time spans (e.g., --date-range "2024-01-01,2024-01-15").
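
The CLI takes the range as a single comma-separated string, while the Python example passes a tuple of two date strings. The sketch below converts and validates between the two forms. `parse_date_range` is a hypothetical helper written for illustration; GraphFaker's own parsing may differ.

```python
from datetime import date
from typing import Tuple


def parse_date_range(arg: str) -> Tuple[str, str]:
    """Split 'YYYY-MM-DD,YYYY-MM-DD' into a validated (start, end) tuple of strings."""
    parts = [p.strip() for p in arg.split(",")]
    if len(parts) != 2:
        raise ValueError(f"expected 'START,END', got {arg!r}")
    # date.fromisoformat raises ValueError on malformed or impossible dates.
    start, end = (date.fromisoformat(p) for p in parts)
    if end < start:
        raise ValueError(f"end date {end} is before start date {start}")
    return start.isoformat(), end.isoformat()


date_range = parse_date_range("2024-01-01,2024-01-15")
print(date_range)
```

Validating up front catches reversed or malformed ranges before any flight data is requested.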


Future Plans: Graph Export Formats

  • GraphML: General graph analysis/visualization (--export graph.graphml)
  • JSON/JSON-LD: Knowledge graphs/web apps (--export data.json)
  • CSV: Tabular analysis/database imports (--export edges.csv)
  • RDF: Semantic web/linked data (--export graph.ttl)
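
Until these formats are available, an edge list can be written to CSV with the standard library alone. This is a sketch, not GraphFaker's export implementation: `write_edges_csv` and the `relationship` attribute are illustrative names. The function accepts `(source, target, attrs)` triples, which is what NetworkX's `G.edges(data=True)` yields.

```python
import csv
import io
from typing import Any, Hashable, Iterable, Mapping, TextIO, Tuple

Edge = Tuple[Hashable, Hashable, Mapping[str, Any]]


def write_edges_csv(
    edges: Iterable[Edge], out: TextIO, attr_keys: Tuple[str, ...] = ()
) -> int:
    """Write one row per edge (source, target, selected attributes); return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["source", "target", *attr_keys])
    rows = 0
    for u, v, attrs in edges:
        # Missing attributes become empty cells rather than raising KeyError.
        writer.writerow([u, v, *(attrs.get(k, "") for k in attr_keys)])
        rows += 1
    return rows


# Stand-in for G.edges(data=True); the values are made up for illustration.
sample_edges = [
    ("a", "b", {"relationship": "FRIENDS_WITH"}),
    ("b", "c", {}),
]

buffer = io.StringIO()
written = write_edges_csv(sample_edges, buffer, attr_keys=("relationship",))
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("edges.csv", "w", newline="")` in place of the `StringIO` buffer.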

Future Plans: Integration with Graph Tools

GraphFaker generates NetworkX graph objects that can be easily integrated with:

  • Graph databases: Neo4j, Kuzu, TigerGraph
  • Analysis tools: NetworkX, SNAP, graph-tool
  • ML frameworks: PyTorch Geometric, DGL, TensorFlow GNN
  • Visualization: Gephi, Cytoscape, D3.js
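
As one illustration of such an integration, ML frameworks like PyTorch Geometric describe connectivity as two parallel lists of integer node indices (an `edge_index` of shape `[2, num_edges]`). The sketch below builds that structure from labeled edges using only the standard library. `to_edge_index` is a hypothetical helper, not part of GraphFaker, and the airport labels are invented sample data.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def to_edge_index(
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> Tuple[List[List[int]], Dict[Hashable, int]]:
    """Map node labels to 0-based integers; return ([sources, targets], label_to_index)."""
    index: Dict[Hashable, int] = {}
    sources: List[int] = []
    targets: List[int] = []
    for u, v in edges:
        # setdefault assigns the next free integer the first time a label is seen.
        sources.append(index.setdefault(u, len(index)))
        targets.append(index.setdefault(v, len(index)))
    return [sources, targets], index


# Stand-in for G.edges(); the labels are made up for illustration.
edge_index, node_ids = to_edge_index([("JFK", "LAX"), ("LAX", "SFO"), ("SFO", "JFK")])
print(edge_index)
```

The nested list can then be handed to a tensor constructor in the framework of your choice. For an undirected graph, such frameworks usually expect each edge to appear in both directions, so add the reversed pairs before converting.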

Documentation

Full documentation: https://graphfaker.readthedocs.io


⭐ Star the Repo

If you find this project valuable, star ⭐ this repository to support the work and help others discover it!


License

MIT License

Credits

Created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

