Graph database in LibSQL

Personal-Graph

Modern Interface
- Some JSON validation and a Context Manager/Class wrapper

# Planned API
with personal_graph.connect(url, token) as graph:
  graph.add([Node(label="sam", attributes={...}), Node(label="ella", attributes={...})])
  # `from` is a reserved word in Python, so this sketch spells the keyword as `from_`
  graph.connect(from_="sam", to="ella", label="brother", attributes={...})

  graph.merge_by_similarity(threshold=0.9)
  
  relatives: list[Node] = graph.find_nodes_like(label="relative", threshold=0.9)

AI Native Features

- Semantic Search (sqlite-vss)
- Natural language interface to KGs (Instructor, Pydantic)

# Planned API
with personal_graph.connect(url, token) as graph:
  graph.insert(text="My brother is actually pretty interested in coral reefs near Sri Lanka.")
  subgraph: KnowledgeGraph = graph.find_subgraph_like(text="Why would I be interested in ocean?")
  subgraph.draw()
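
The threshold-based lookups in the planned API (merge_by_similarity, find_nodes_like, find_subgraph_like) suggest a similarity cutoff over embedding vectors. The following is a minimal sketch of that idea using cosine similarity in plain Python. It is not the package's implementation: the helper names, the toy vectors, and the choice of cosine similarity are all assumptions made for illustration, and a real setup would obtain embeddings from a model and query them through sqlite-vss.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def find_like(query_vec, embeddings, threshold=0.9):
    """Return labels whose similarity to the query meets the threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), label)
              for label, vec in embeddings.items()]
    return [label for score, label in sorted(scored, reverse=True)
            if score >= threshold]


# Hypothetical toy embeddings, hand-written for the example.
embeddings = {
    "brother": [0.9, 0.1, 0.0],
    "sister": [0.8, 0.2, 0.0],
    "coral reef": [0.0, 0.1, 0.9],
}
matches = find_like([1.0, 0.0, 0.0], embeddings, threshold=0.9)
print(matches)
```

Raising the threshold narrows the result set, while lowering it admits more loosely related labels.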

Good Performance on reads (even with complex queries)
- Local replicas are supported by libSQL
Support for Machine Learning Libraries
- Export-to-dict functions for NetworkX, PyG, etc.
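
As an illustration of what an export-to-dict helper might produce, here is a small sketch that turns node labels and edges into a dict-of-dicts adjacency structure. The function name and the input shapes are hypothetical and are not taken from the package.

```python
def to_dict_of_dicts(node_labels, edges):
    """Build {source: {target: edge_attributes}} from labels and edge triples."""
    adjacency = {label: {} for label in node_labels}
    for source, target, attributes in edges:
        # setdefault also covers edges whose source was not listed as a node
        adjacency.setdefault(source, {})[target] = dict(attributes)
    return adjacency


graph_dict = to_dict_of_dicts(
    ["sam", "ella"],
    [("sam", "ella", {"label": "brother"})],
)
print(graph_dict)
```

A dictionary in this shape is the kind of input that NetworkX's from_dict_of_dicts function is designed to read, so it is a plausible interchange format for graph libraries.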

Usage

Basic Functions

The database script provides convenience functions for atomic transactions to add, delete, connect, and search for nodes.
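
The call pattern used throughout this page, db.atomic(..., db.add_node(...)), suggests that each operation returns a function which is then executed inside a transaction. A minimal sketch of that pattern with the standard sqlite3 module follows. It assumes a local file instead of a remote libSQL URL and token, and its atomic helper and table layout are hypothetical stand-ins, not the package's code.

```python
import os
import sqlite3
import tempfile


def atomic(db_path, cursor_fn):
    """Run cursor_fn in one transaction: commit on success, roll back on error."""
    connection = sqlite3.connect(db_path)
    try:
        result = cursor_fn(connection.cursor())
        connection.commit()
        return result
    except Exception:
        connection.rollback()
        raise
    finally:
        connection.close()


db_path = os.path.join(tempfile.mkdtemp(), "graph.db")
atomic(db_path, lambda cur: cur.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, body TEXT)"))
atomic(db_path, lambda cur: cur.execute("INSERT INTO nodes VALUES (1, '{}')"))


def failing(cur):
    cur.execute("INSERT INTO nodes VALUES (2, '{}')")
    raise RuntimeError("simulated failure")  # forces a rollback of the insert


try:
    atomic(db_path, failing)
except RuntimeError:
    pass

count = atomic(db_path, lambda cur: cur.execute(
    "SELECT COUNT(*) FROM nodes").fetchone()[0])
print(count)
```

Because the failing operation is rolled back as a whole, the second node never becomes visible to later reads.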

Any single node or path of nodes can also be depicted graphically by using the visualize function within the database script to generate dot files, which in turn can be converted to images with Graphviz.

Example

Connecting to a remote database requires a database URL (db_url) and an authentication token (auth_token):

>>> from personal_graph import database as db
>>> db.initialize(db_url, auth_token)
>>> db.atomic(db_url, auth_token, db.add_node({'name': 'Apple Computer Company', 'type':['company', 'start-up'], 'founded': 'April 1, 1976'}, 1))
>>> db.atomic(db_url, auth_token, db.add_node({'name': 'Steve Wozniak', 'type':['person','engineer','founder']}, 2))
>>> db.atomic(db_url, auth_token, db.add_node({'name': 'Steve Jobs', 'type':['person','designer','founder']}, 3))
>>> db.atomic(db_url, auth_token, db.add_node({'name': 'Ronald Wayne', 'type':['person','administrator','founder']}, 4))
>>> db.atomic(db_url, auth_token, db.add_node({'name': 'Mike Markkula', 'type':['person','investor']}, 5))
>>> db.atomic(db_url, auth_token, db.connect_nodes(2, 1, {'action': 'founded'}))
>>> db.atomic(db_url, auth_token, db.connect_nodes(3, 1, {'action': 'founded'}))
>>> db.atomic(db_url, auth_token, db.connect_nodes(4, 1, {'action': 'founded'}))
>>> db.atomic(db_url, auth_token, db.connect_nodes(5, 1, {'action': 'invested', 'equity': 80000, 'debt': 170000}))
>>> db.atomic(db_url, auth_token, db.connect_nodes(1, 4, {'action': 'divested', 'amount': 800, 'date': 'April 12, 1976'}))
>>> db.atomic(db_url, auth_token, db.connect_nodes(2, 3))
>>> db.atomic(db_url, auth_token, db.upsert_node(2, {'nickname': 'Woz'}))

There are also bulk operations, to insert and connect lists of nodes in one transaction.
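
The bulk idea can be sketched with the standard sqlite3 module: several rows are written with executemany inside a single transaction. The nodes and edges tables below are an assumed layout for illustration only, and the package's own bulk function names are not shown here because this page does not list them.

```python
import json
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, body TEXT)")
connection.execute(
    "CREATE TABLE edges (source INTEGER, target INTEGER, properties TEXT)")

people = [(2, {"name": "Steve Wozniak"}), (3, {"name": "Steve Jobs"})]
links = [(2, 1, {"action": "founded"}), (3, 1, {"action": "founded"})]

# `with connection` commits if the block succeeds and rolls back if it raises,
# so both inserts land together or not at all.
with connection:
    connection.executemany(
        "INSERT INTO nodes (id, body) VALUES (?, ?)",
        [(node_id, json.dumps(body)) for node_id, body in people],
    )
    connection.executemany(
        "INSERT INTO edges (source, target, properties) VALUES (?, ?, ?)",
        [(s, t, json.dumps(props)) for s, t, props in links],
    )

node_count = connection.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
edge_count = connection.execute("SELECT COUNT(*) FROM edges").fetchone()[0]
print(node_count, edge_count)
```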

The nodes can be searched by their ids:

>>> db.atomic(db_url, auth_token, db.find_node(1))
{'name': 'Apple Computer Company', 'type': ['company', 'start-up'], 'founded': 'April 1, 1976', 'id': 1}

Searches can also use combinations of other attributes, either as strict equality or using LIKE with a trailing % for "starts with" or a % at both ends for "contains":

>>> db.atomic(db_url, auth_token, db.find_nodes([db._generate_clause('name', predicate='LIKE')], ('Steve%',)))
[{'name': 'Steve Wozniak', 'type': ['person', 'engineer', 'founder'], 'id': 2, 'nickname': 'Woz'}, {'name': 'Steve Jobs', 'type': ['person', 'designer', 'founder'], 'id': 3}]
>>> db.atomic(db_url, auth_token, db.find_nodes([db._generate_clause('name', predicate='LIKE'), db._generate_clause('name', predicate='LIKE', joiner='OR')], ('%Woz%', '%Markkula',)))
[{'name': 'Steve Wozniak', 'type': ['person', 'engineer', 'founder'], 'id': 2, 'nickname': 'Woz'}, {'name': 'Mike Markkula', 'type': ['person', 'investor'], 'id': 5}]
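
To show what such a LIKE search can look like at the SQL level, here is a sketch using the standard sqlite3 module and SQLite's json_extract function. It assumes nodes are stored as JSON text in a body column and that the SQLite build includes the JSON functions. The names_like helper is hypothetical, and the SQL that _generate_clause actually emits may differ.

```python
import json
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, body TEXT)")
rows = [
    (2, {"name": "Steve Wozniak"}),
    (3, {"name": "Steve Jobs"}),
    (5, {"name": "Mike Markkula"}),
]
connection.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [(node_id, json.dumps(body)) for node_id, body in rows],
)


def names_like(*patterns):
    """Return names matching any of the LIKE patterns, ordered by node id."""
    where = " OR ".join("json_extract(body, '$.name') LIKE ?" for _ in patterns)
    sql = f"SELECT json_extract(body, '$.name') FROM nodes WHERE {where} ORDER BY id"
    return [row[0] for row in connection.execute(sql, patterns)]


starts_with_steve = names_like("Steve%")
woz_or_markkula = names_like("%Woz%", "%Markkula")
print(starts_with_steve, woz_or_markkula)
```

The patterns are passed as bound parameters, not interpolated into the SQL string, which keeps user-supplied search text from altering the query.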

More complex queries that introspect the JSON body using the SQLite json_tree() function are also possible, such as this query for every node whose type array contains the value founder:

>>> db.atomic(db_url, auth_token, db.find_nodes([db._generate_clause('type', tree=True)], ('founder',), tree_query=True, key='type'))
[{'name': 'Steve Wozniak', 'type': ['person', 'engineer', 'founder'], 'id': 2, 'nickname': 'Woz'}, {'name': 'Steve Jobs', 'type': ['person', 'designer', 'founder'], 'id': 3}, {'name': 'Ronald Wayne', 'type': ['person', 'administrator', 'founder'], 'id': 4}]

See the _generate_clause() and _generate_query() functions in database.py for usage hints.
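
For readers unfamiliar with json_tree(), the sketch below runs a comparable query directly with the standard sqlite3 module. It assumes a nodes table with a JSON body column and a SQLite build that includes the JSON functions. It illustrates the technique only and is not the query that database.py generates.

```python
import json
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, body TEXT)")
rows = [
    (1, {"name": "Apple Computer Company", "type": ["company", "start-up"]}),
    (2, {"name": "Steve Wozniak", "type": ["person", "engineer", "founder"]}),
    (3, {"name": "Steve Jobs", "type": ["person", "designer", "founder"]}),
    (4, {"name": "Ronald Wayne", "type": ["person", "administrator", "founder"]}),
    (5, {"name": "Mike Markkula", "type": ["person", "investor"]}),
]
connection.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [(node_id, json.dumps(body)) for node_id, body in rows],
)

# json_tree walks the array stored under $.type and yields one row per element,
# so the WHERE clause can match an individual array value.
sql = """
    SELECT DISTINCT nodes.id
    FROM nodes, json_tree(nodes.body, '$.type')
    WHERE json_tree.value = ?
    ORDER BY nodes.id
"""
founder_ids = [row[0] for row in connection.execute(sql, ("founder",))]
print(founder_ids)
```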

Paths through the graph can be discovered from a starting node id and an optional ending id. By default, neighbor expansion follows nodes connected in either direction, but that can be changed by specifying either find_outbound_neighbors or find_inbound_neighbors instead:

>>> db.traverse(db_url, auth_token, 2, 3)
['2', '1', '3']
>>> db.traverse(db_url, auth_token, 4, 5)
['4', '1', '2', '3', '5']
>>> db.traverse(db_url, auth_token, 5, neighbors_fn=db.find_inbound_neighbors)
['5']
>>> db.traverse(db_url, auth_token, 5, neighbors_fn=db.find_outbound_neighbors)
['5', '1', '4']
>>> db.traverse(db_url, auth_token, 5, neighbors_fn=db.find_neighbors)
['5', '1', '2', '3', '4']
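
The effect of the three neighbor functions can be illustrated with an in-memory breadth-first traversal over the edges of the Apple example. This is a sketch only: the function names are hypothetical, node ids are plain integers instead of strings, and the package itself may traverse the graph differently (for example, in SQL).

```python
from collections import deque

# Edges from the Apple example, as (source, target) pairs.
EDGES = [(2, 1), (3, 1), (4, 1), (5, 1), (1, 4), (2, 3)]


def outbound(node):
    return sorted({t for s, t in EDGES if s == node})


def inbound(node):
    return sorted({s for s, t in EDGES if t == node})


def both(node):
    return sorted(set(outbound(node)) | set(inbound(node)))


def traverse(start, end=None, neighbors_fn=both):
    """Breadth-first visit order from start, stopping once end is reached."""
    visited = [start]
    if start == end:
        return visited
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in neighbors_fn(current):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            visited.append(neighbor)
            if neighbor == end:
                return visited
            queue.append(neighbor)
    return visited


print(traverse(2, 3))
print(traverse(5, neighbors_fn=outbound))
```

Swapping neighbors_fn changes which edges count as reachable: node 5 has no inbound edges, so an inbound-only traversal from it visits nothing else.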

Any path or list of nodes can be rendered graphically by using the visualize function. This command produces dot files, which are also rendered as images with Graphviz:

>>> from visualizers import graphviz_visualize
>>> graphviz_visualize(db_url, auth_token, 'apple.dot', [4, 1, 5])

The resulting text file also comes with an associated image (the default is png, but that can be changed by supplying a different value to the format parameter).

The default options include every key/value pair (excluding the id) in the node and edge objects, and there are display options to help refine what is produced:

>>> graphviz_visualize(db_url, auth_token, 'apple.dot', [4, 1, 5], exclude_node_keys=['type'], hide_edge_key=True)
>>> from visualizers import graphviz_visualize_bodies
>>> path_with_bodies = db.traverse(db_url, auth_token, source, target, with_bodies=True)
>>> graphviz_visualize_bodies('apple.dot', path_with_bodies)

The resulting dot file can be edited further as needed; the dot guide has more options and examples.
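
For a sense of what a generated dot file contains, the following sketch builds DOT text from node and edge dictionaries, including an option to leave out selected node keys. The to_dot and quote helpers are hypothetical and written for this example. The output of graphviz_visualize may be laid out differently.

```python
def quote(text):
    """Wrap text in double quotes, escaping embedded quotes for DOT."""
    return '"' + text.replace('"', '\\"') + '"'


def to_dot(nodes, edges, exclude_node_keys=()):
    """Render nodes and (source, target, properties) edges as DOT text."""
    lines = ["digraph {"]
    for node in nodes:
        parts = [f"{key}: {value}" for key, value in node.items()
                 if key != "id" and key not in exclude_node_keys]
        lines.append(f"  {node['id']} [label={quote(', '.join(parts))}];")
    for source, target, properties in edges:
        label = ", ".join(f"{key}: {value}" for key, value in properties.items())
        lines.append(f"  {source} -> {target} [label={quote(label)}];")
    lines.append("}")
    return "\n".join(lines)


nodes = [
    {"id": 4, "name": "Ronald Wayne", "type": ["person", "founder"]},
    {"id": 1, "name": "Apple Computer Company", "type": ["company"]},
]
edges = [(4, 1, {"action": "founded"})]
dot_text = to_dot(nodes, edges, exclude_node_keys=["type"])
print(dot_text)
```

Text in this form can be saved to a file and rendered with the Graphviz command line, for example dot -Tpng apple.dot -o apple.png.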

Time Complexity

Scenario	Average Time Complexity	Worst Case Time Complexity
Single Node Insert	O(1)	O(1)
Single Edge Insert	O(1)	O(1)
Single Node Retrieval by ID	O(1)	O(1)
Single Edge Retrieval by ID	O(1)	O(1)
Retrieval of All Nodes	O(n)	O(n)
Retrieval of All Edges	O(m)	O(m)
Retrieval of All Neighbors of a Node	O(avg_degree)	O(n)
Retrieval of All Edges of a Node	O(avg_degree)	O(n)
BFS/DFS Traversal	O(n + m)	O(n + m)
Shortest Path (Unweighted)	O(n + m)	O(n + m)
Shortest Path (Weighted)	O((n + m) log n)	O((n + m) log n)
Connected Components	O(n + m)	O(n + m)
Strongly Connected Components	O(n + m)	O(n + m)
Minimum Spanning Tree	O(m log n)	O(m log n)
Semantic Search (Approximate)	O(log n)	O(n)
Natural Language Query (Approximate)	O(n)	O(n^2)
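
As one concrete instance of the table above, the weighted shortest path bound of O((n + m) log n) is what Dijkstra's algorithm achieves with a binary heap. The sketch below is a generic textbook version over an in-memory adjacency list. It is not taken from the package, and the function name and sample graph are made up for illustration.

```python
import heapq


def shortest_path_cost(adjacency, start, goal):
    """Dijkstra's algorithm with a binary heap; returns the minimum cost or None."""
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry superseded by a cheaper route
        for neighbor, weight in adjacency.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return None  # goal is unreachable from start


# Hypothetical weighted graph: node -> list of (neighbor, non-negative weight).
adjacency = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2)],
    "c": [],
}
print(shortest_path_cost(adjacency, "a", "c"))
```

Each edge causes at most one heap push and each heap operation costs logarithmic time, which is where the log factor in the bound comes from.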

Applications

Note: This repo adds significant functionality on top of simple-graph-pypi, including AI-native features such as similarity search and a natural language interface. The project has also been migrated from SQLite to libSQL in order to interface with TursoDB.

Release history

- 2.2: Jun 8, 2024
- 2.1: Jun 6, 2024
- 2.0: Jun 5, 2024
- 0.1.8: Apr 25, 2024
- 0.1.7: Apr 20, 2024
- 0.1.6: Apr 18, 2024 (this version)
- 0.1.5: Apr 17, 2024
- 0.1.4: Apr 17, 2024
