AgeFreighter is a Python package that helps you to create a graph database using Azure Database for PostgreSQL.
AGEFreighter
Apache AGE™ is a graph database extension for PostgreSQL. It is compatible with PostgreSQL's distributed assets and leverages graph data structures to analyze and use relationships and patterns in data.
Azure Database for PostgreSQL is a managed database service that is based on the open-source Postgres database engine.
Blog post: Introducing support for Graph data in Azure Database for PostgreSQL (Preview).
0.5.0 Release
Refactored the code to make it more readable and maintainable, with separate classes and a factory model. Please note that the way to use the new version of the package is totally different from previous versions.
Features
- Asynchronous connection pool support for psycopg PostgreSQL driver
- 'direct_loading' option for loading data directly into the graph. If 'direct_loading' is True, the data is loaded into the graph using the 'INSERT' statement, not Cypher queries.
- 'COPY' protocol support for loading data into the graph. If 'use_copy' is True, the data is loaded into the graph using the 'COPY' protocol.
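To illustrate what loading over the 'COPY' protocol looks like with the psycopg driver, here is a minimal sketch. It is not AGEFreighter's internal code: the helper name `copy_rows` and the table and column names in the usage comment are hypothetical, and the function only assumes it receives an open psycopg 3 cursor.

```python
def copy_rows(cur, table, columns, rows):
    """Stream rows into `table` with COPY ... FROM STDIN; return the row count.

    `cur` is assumed to be an open psycopg 3 cursor. `table` and `columns`
    must come from trusted code, because they are interpolated into the SQL.
    """
    stmt = f"COPY {table} ({', '.join(columns)}) FROM STDIN"
    count = 0
    # psycopg 3 exposes COPY as a context manager; write_row() sends one row.
    with cur.copy(stmt) as copy:
        for row in rows:
            copy.write_row(row)
            count += 1
    return count

# Hypothetical usage with a real connection:
#   import psycopg
#   with psycopg.connect(dsn) as conn, conn.cursor() as cur:
#       copy_rows(cur, "staging_edges", ["start_id", "end_id"], [(1, 2), (3, 4)])
```

COPY sends many rows in one stream instead of one statement per row, which is why it is usually the fastest of the three loading modes for large inputs.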
Classes
- AvroFreighter
- CosmosGremlinFreighter
- CSVFreighter
- MultiCSVFreighter
- Neo4jFreighter
- NetworkXFreighter
- ParquetFreighter
- PGFreighter
Method
All the classes have the same load() method. The method loads data into the graph database.
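Because every class exposes the same load() method, calling code can be written once and reused across sources. The sketch below shows the idea with a stand-in class; `StubFreighter` and `load_all` are hypothetical names for illustration and are not part of the package.

```python
import asyncio

class StubFreighter:
    """Hypothetical stand-in for a freighter class; not part of agefreighter."""

    def __init__(self, name):
        self.name = name

    async def load(self, **kwargs):
        # A real freighter would write to the graph here; the stub only reports.
        return f"{self.name} loaded into {kwargs['graph_name']}"

async def load_all(freighters, **common):
    # The same call works for every class because they share load().
    return [await f.load(**common) for f in freighters]

results = asyncio.run(
    load_all(
        [StubFreighter("CSVFreighter"), StubFreighter("ParquetFreighter")],
        graph_name="demo_graph",
    )
)
print(results)
```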
Arguments for each class
- common arguments
  - graph_name (str): the name of the graph
  - chunk_size (int): the number of rows to be loaded at once
  - direct_loading (bool): if True, the data is loaded into the graph using the 'INSERT' statement, not Cypher queries
  - use_copy (bool): if True, the data is loaded into the graph using the 'COPY' protocol
  - drop_graph (bool): if True, the graph is dropped before loading the data
- AvroFreighter
  - source_avro (str): The path to the Avro file.
  - start_v_label (str): The label of the start vertex.
  - start_id (str): The ID of the start vertex.
  - start_props (list): The properties of the start vertex.
  - edge_type (str): The type of the edge.
  - end_v_label (str): The label of the end vertex.
  - end_id (str): The ID of the end vertex.
  - end_props (list): The properties of the end vertex.
- CosmosGremlinFreighter
  - cosmos_gremlin_endpoint (str): The Cosmos Gremlin endpoint.
  - cosmos_gremlin_key (str): The Cosmos Gremlin key.
  - cosmos_username (str): The Cosmos username.
  - id_map (dict): The ID map.
- CSVFreighter
  - csv (str): The path to the CSV file.
  - start_v_label (str): The label of the start vertex.
  - start_id (str): The ID of the start vertex.
  - start_props (list): The properties of the start vertex.
  - edge_type (str): The type of the edge.
  - end_v_label (str): The label of the end vertex.
  - end_id (str): The ID of the end vertex.
  - end_props (list): The properties of the end vertex.
- MultiCSVFreighter
  - vertex_csvs (list): The paths to the vertex CSV files.
  - vertex_labels (list): The labels of the vertices.
  - edge_csvs (list): The paths to the edge CSV files.
  - edge_types (list): The types of the edges.
- Neo4jFreighter
  - neo4j_uri (str): The URI of the Neo4j database.
  - neo4j_user (str): The username for the Neo4j database.
  - neo4j_password (str): The password for the Neo4j database.
  - neo4j_database (str): The name of the Neo4j database.
  - id_map (dict): The ID map.
- NetworkXFreighter
  - networkx_graph (nx.Graph): The NetworkX graph.
  - id_map (dict): The ID map.
- ParquetFreighter
  - source_parquet (str): The path to the Parquet file.
  - start_v_label (str): The label of the start vertex.
  - start_id (str): The ID of the start vertex.
  - start_props (list): The properties of the start vertex.
  - edge_type (str): The type of the edge.
  - end_v_label (str): The label of the end vertex.
  - end_id (str): The ID of the end vertex.
  - end_props (list): The properties of the end vertex.
- PGFreighter
  - source_pg_con_string (str): The connection string of the source PostgreSQL database.
  - source_schema (str): The source schema.
  - source_tables (list): The source tables.
  - id_map (dict): The ID map.
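As a worked example of the argument lists above, the sketch below assembles the common and CSVFreighter arguments into one dictionary and rejects names that are not documented. The helper `check_args` and all values are made up for illustration. Treating `start_id`, `end_id` and the `*_props` entries as CSV column names, and passing the dictionary to load() as keyword arguments, are assumptions rather than something this page confirms.

```python
# Argument names copied from the lists above.
COMMON_ARGS = {"graph_name", "chunk_size", "direct_loading", "use_copy", "drop_graph"}
CSV_ARGS = {
    "csv", "start_v_label", "start_id", "start_props",
    "edge_type", "end_v_label", "end_id", "end_props",
}

def check_args(kwargs, allowed):
    """Return kwargs unchanged, or raise if a name is not in the documented set."""
    unknown = sorted(set(kwargs) - allowed)
    if unknown:
        raise ValueError(f"unknown argument(s): {', '.join(unknown)}")
    return kwargs

# Hypothetical values; only the keys come from the documentation.
csv_kwargs = check_args(
    {
        "csv": "edges.csv",
        "start_v_label": "Person",
        "start_id": "person_id",      # assumed to be a column name
        "start_props": ["name"],
        "edge_type": "KNOWS",
        "end_v_label": "Person",
        "end_id": "friend_id",        # assumed to be a column name
        "end_props": ["friend_name"],
        "graph_name": "demo_graph",
        "chunk_size": 128,
        "direct_loading": False,
        "use_copy": True,
        "drop_graph": True,
    },
    COMMON_ARGS | CSV_ARGS,
)
```

Checking names up front catches a typo such as `graphname` before any connection is opened.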
Release Notes
- 0.4.0 : Added 'loadFromCosmosGremlin()' function.
- 0.4.1 : Changed base Python version to 3.9 to run on Azure Cloud Shell and Databricks 15.4ML.
- 0.4.2 : Tuning for 'loadFromCosmosGremlin()' function.
- 0.4.3 : Standardized the argument names. Enhanced the tests for each function.
- 0.4.4 : Performance tuning.
- 0.4.5 : Simplified 'loadFromNeo4j'.
- 0.4.6 : Added 'loadFromAvro()' function.
- 0.5.0 : Refactored the code to make it more readable and maintainable, with separate classes and a factory model. Introduced concurrent.futures for better performance.
Install
pip install agefreighter
Prerequisites
- Python 3.9 or later
- This module runs on psycopg and psycopg_pool
- Enable the Apache AGE extension in your Azure Database for PostgreSQL instance. Log in to the Azure Portal, go to the 'server parameters' blade, and check 'AGE' in both the 'azure.extensions' and 'shared_preload_libraries' parameters. See the blog post above for more information.
- Load the AGE extension in your PostgreSQL database.
CREATE EXTENSION IF NOT EXISTS age CASCADE;
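If you prefer to run this step from Python, a minimal sketch follows. The helper name `enable_age` is hypothetical, and the function only assumes it is given an open psycopg 3 connection. The `search_path` statement is the setup commonly shown in the Apache AGE manual; whether AGEFreighter needs it is not stated here, so treat it as optional.

```python
AGE_SETUP_STATEMENTS = (
    "CREATE EXTENSION IF NOT EXISTS age CASCADE",
    # Common AGE session setup (assumption: optional for AGEFreighter).
    'SET search_path = ag_catalog, "$user", public',
)

def enable_age(conn):
    """Run the AGE setup statements on an open psycopg 3 connection."""
    for stmt in AGE_SETUP_STATEMENTS:
        conn.execute(stmt)
    conn.commit()
    return len(AGE_SETUP_STATEMENTS)

# Hypothetical usage:
#   import psycopg
#   with psycopg.connect(dsn) as conn:
#       enable_age(conn)
```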
Usage
import asyncio
from agefreighter import AgeFreighter, Factory

async def main():
    # Pick a freighter class by name and create it through the factory.
    instance = Factory.create_instance('CSVFreighter')
    await instance.connect(dsn="host=your_host port=5432 dbname=postgres user=your_account password=your_password", max_connections=64)
    await instance.load(arguments1, arguments2, ...)

asyncio.run(main())
See tests/agefreightertester.py for more details.
Test & Samples
export PG_CONNECTION_STRING="host=your_host.postgres.database.azure.com port=5432 dbname=postgres user=account password=your_password"
cd tests/
python3.9 agefreightertester.py
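In your own scripts, the same PG_CONNECTION_STRING variable can be read before connecting. This is a small sketch, not taken from the test script; the helper name `get_dsn` is hypothetical.

```python
import os

def get_dsn(env=None):
    """Return the connection string from PG_CONNECTION_STRING, or fail early."""
    env = os.environ if env is None else env
    dsn = env.get("PG_CONNECTION_STRING", "").strip()
    if not dsn:
        raise RuntimeError("PG_CONNECTION_STRING is not set")
    return dsn
```

Failing early with a clear message is easier to diagnose than a connection error raised later from inside the pool.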
For more information about Apache AGE
- Apache AGE : https://age.apache.org/
- GitHub : https://github.com/apache/age
- Document : https://age.apache.org/age-manual/master/index.html
License
MIT License