Graphon Client

A Python client library for the Graphon API - Build unified knowledge graphs from your files.

Features

  • 🚀 Simple API: Upload files, create knowledge bases, and query them in just a few lines
  • 📁 Multi-format Support: Process videos (MP4, MOV), documents (PDF, DOCX), and images (JPG, PNG)
  • 🔄 Async/Await: Built on httpx for high-performance async operations
  • 🎯 Auto Type Detection: Automatically detects file types from extensions
  • 📊 Status Polling: Built-in polling for long-running operations
  • 🔒 Secure: API key authentication with role-based access control

Installation

pip install graphon-client

Quick Start

import asyncio
from graphon_client import GraphonClient

async def main():
    # Initialize client with your API key
    client = GraphonClient(api_key="your_api_key_here")
    
    # Upload and process files (auto-detects file types)
    file_objects = await client.upload_and_process_files(
        ["/path/to/video.mp4", "/path/to/document.pdf"],
        poll_until_complete=True  # Wait for processing to complete
    )
    
    # Create a knowledge graph from processed files
    file_ids = [f.file_id for f in file_objects]
    group_id = await client.create_group(
        file_ids,
        group_name="My Knowledge Base",
        wait_for_ready=True  # Wait for graph building to complete
    )
    
    # Query your knowledge graph
    response = await client.query_group(
        group_id,
        "What are the main topics discussed?"
    )
    print(response.answer)
    
    # Access sources by citation key
    for key, source in response.sources.items():
        print(f"{key}: {source['source']['node_type']}")

asyncio.run(main())

One-Shot Convenience Method

For the simplest workflow, use the all-in-one method:

async def quick_start():
    client = GraphonClient(api_key="your_api_key_here")
    
    # Upload, process, and create group in one call
    group_id = await client.upload_process_and_create_group(
        file_paths=["/path/to/file1.pdf", "/path/to/file2.mp4"],
        group_name="My Knowledge Base"
    )
    
    # Query immediately
    response = await client.query_group(group_id, "Summarize the content")
    print(response.answer)

Get Your API Key

  1. Sign up at https://graphon.ai
  2. Navigate to Settings → API Keys
  3. Create a new API key
  4. Save it securely (it's only shown once!)
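
Since the key is shown only once, one common way to keep it out of source code is to read it from an environment variable. The sketch below is an illustration only: the variable name GRAPHON_API_KEY and the helper load_api_key are assumptions for this example, not something the library defines.

```python
import os


def load_api_key(env_var: str = "GRAPHON_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The default variable name is a hypothetical choice for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key
```

The result can then be passed to the constructor shown in Quick Start, for example GraphonClient(api_key=load_api_key()).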

Core Concepts

Files

Individual files that are uploaded and processed. Each file gets a unique file_id and goes through processing to extract content and build a knowledge graph.

Processing Statuses:

  • UNPROCESSED: File uploaded but not yet processed
  • PROCESSING: File is being processed
  • SUCCESS: Processing completed successfully
  • FAILURE: Processing failed
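
As a rough illustration of working with these statuses, the helper below buckets file objects by their processing_status. It assumes the status compares as a plain string (as the status_filter="SUCCESS" argument elsewhere in this README suggests); group_by_status and the stand-in sample objects are hypothetical and not part of the library.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_status(files):
    """Map each processing status to the file IDs currently in that state."""
    grouped = defaultdict(list)
    for f in files:
        grouped[f.processing_status].append(f.file_id)
    return dict(grouped)


# Stand-in objects for illustration; real ones would come from the client.
sample = [
    SimpleNamespace(file_id="a", processing_status="SUCCESS"),
    SimpleNamespace(file_id="b", processing_status="FAILURE"),
    SimpleNamespace(file_id="c", processing_status="SUCCESS"),
]
print(group_by_status(sample))
```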

Groups

Collections of files with a unified knowledge graph. Query multiple files together as a single knowledge base.

Key Concept: Create Once, Query Many Times

Once you create a group, you can query the same group_id as many times as you want. There is no need to re-create the group for the same set of files. Groups are persistent knowledge bases: creating the group builds the graph (a one-time operation), while querying is lightweight and can be repeated freely.

# Create the group once
group_id = await client.create_group(file_ids, "My Knowledge Base", wait_for_ready=True)

# Query as many times as you want with the same group_id
response1 = await client.query_group(group_id, "What are the main topics?")
response2 = await client.query_group(group_id, "Summarize the key findings")
response3 = await client.query_group(group_id, "What conclusions were drawn?")
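
Because a group only needs to be created once, a script that runs repeatedly may want to remember its group_id between runs. The following is a minimal sketch of one way to do that with a small JSON file. The helper names and the file layout are assumptions made for this example, and it assumes group_id is a string.

```python
import json
from pathlib import Path


def load_group_id(cache_path):
    """Return the cached group_id, or None if nothing has been saved yet."""
    path = Path(cache_path)
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("group_id")


def save_group_id(cache_path, group_id):
    """Store the group_id so that later runs can skip group creation."""
    Path(cache_path).write_text(json.dumps({"group_id": group_id}))
```

A script could call load_group_id() first and fall back to create_group() followed by save_group_id() only when it returns None.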

Graph Statuses:

  • pending: Group created but no files added yet
  • building: Unified graph is being built
  • ready: Graph is ready for querying
  • failed: Graph building failed

API Reference

Client Initialization

client = GraphonClient(
    api_key="your_api_key_here",
    base_url="https://api-frontend-485250924682.us-central1.run.app"  # Optional
)

File Operations

Upload and Process Files

file_objects = await client.upload_and_process_files(
    file_paths=["/path/to/file1.pdf", "/path/to/file2.mp4"],
    poll_until_complete=True,  # Wait for processing (default: True)
    timeout=1800,  # Max wait time in seconds (default: 30 min)
    poll_interval=3,  # Check status every N seconds (default: 3)
    on_progress=lambda step, current, total: print(f"{step}: {current}/{total}")
)

# Returns list of FileObject with file_id, file_name, processing_status

Get File Status

file_detail = await client.get_file_status(file_id)
print(f"Status: {file_detail.processing_status}")

List Files

files = await client.list_files(
    status_filter="SUCCESS",  # Optional: filter by status
    file_type="video"  # Optional: filter by type
)

Poll File Until Complete (Manual)

file_detail = await client.poll_file_until_complete(
    file_id,
    timeout=1800,
    poll_interval=3,
    on_progress=lambda status: print(f"Status: {status}")
)

Group Operations

Create Group

group_id = await client.create_group(
    file_ids=["file-id-1", "file-id-2"],
    group_name="My Knowledge Base",
    wait_for_ready=True,  # Wait for graph building (default: False)
    timeout=3600,  # Max wait time in seconds (default: 1 hour)
    poll_interval=5,  # Check status every N seconds (default: 5)
    on_progress=lambda status: print(f"Graph status: {status}")
)

Query Group

Query a group's unified knowledge graph with sources mapped by citation keys:

response = await client.query_group(
    group_id="group-id-here",
    query="What are the key insights?",
    return_source_data=False,  # Set to True to get content/URLs (default: False)
    web_search=False  # Set to True to augment with web search (default: False)
)

print(response.answer)
# Answer contains inline citations like [1], [2], etc.

# Sources are keyed by citation markers - separate cited from non-cited
cited = {k: v for k, v in response.sources.items() if v.get("is_cited")}
other = {k: v for k, v in response.sources.items() if not v.get("is_cited")}

print("Cited Sources:")
for key, node in cited.items():
    source = node['source']
    print(f"  {key}: {source['node_type']} (score: {node['score']:.3f})")

print("Other Relevant Sources:")
for key, node in other.items():
    source = node['source']
    print(f"  {key}: {source['node_type']} (score: {node['score']:.3f})")

Response Structure:

  • answer: Generated answer with inline citation markers like [1], [2]
  • sources: Dictionary mapping citation keys to source metadata:
    • source: Source metadata (node_type, file_id, and type-specific fields)
    • score: Relevance score (0.0 to 1.0)
    • is_cited: Whether this source was explicitly cited in the answer
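
To make the structure above concrete, here is a small sketch that ranks cited sources by score. It works on a plain dictionary shaped like the description; the sample data is hand-written, the exact format of the citation keys is a guess, and top_cited is a hypothetical helper rather than a library function.

```python
def top_cited(sources: dict, limit: int = 3) -> list:
    """Return (key, node_type, score) for the highest-scoring cited sources."""
    cited = [
        (key, node["source"]["node_type"], node["score"])
        for key, node in sources.items()
        if node.get("is_cited")
    ]
    cited.sort(key=lambda item: item[2], reverse=True)
    return cited[:limit]


# Hand-written sample shaped like the structure described above.
sample_sources = {
    "1": {"source": {"node_type": "document", "file_id": "f1"}, "score": 0.82, "is_cited": True},
    "2": {"source": {"node_type": "video", "file_id": "f2"}, "score": 0.91, "is_cited": True},
    "3": {"source": {"node_type": "image", "file_id": "f3"}, "score": 0.40, "is_cited": False},
}
print(top_cited(sample_sources))
```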

Get Group Status

group_detail = await client.get_group_status(group_id)
print(f"Graph status: {group_detail.graph_status}")
print(f"Files in group: {len(group_detail.file_ids)}")

List Groups

groups = await client.list_groups()
for group in groups:
    print(f"{group.group_name} - {group.graph_status} - {group.file_count} files")

Note: list_groups() returns GroupListItem objects (summary view with file_count), not full GroupDetail objects. Use get_group_status(group_id) for complete details.

Poll Group Until Ready (Manual)

group_detail = await client.poll_group_until_ready(
    group_id,
    timeout=3600,
    poll_interval=5,
    on_progress=lambda status: print(f"Status: {status}")
)
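
For readers curious about what a polling helper with timeout, poll_interval, and on_progress might do, here is a generic sketch. It is not the library's implementation: poll_until and fetch_status are hypothetical names, and the loop simply re-checks a status coroutine until it reports a terminal state or the time budget is exhausted.

```python
import asyncio
import time


async def poll_until(fetch_status, done_states, timeout=3600, poll_interval=5, on_progress=None):
    """Await fetch_status() repeatedly until it returns a state in done_states."""
    deadline = time.monotonic() + timeout
    while True:
        status = await fetch_status()
        if on_progress is not None:
            on_progress(status)
        if status in done_states:
            return status
        # Give up if the next sleep would run past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"Still '{status}' after {timeout} seconds")
        await asyncio.sleep(poll_interval)
```

With the graph statuses listed earlier, done_states would be {"ready", "failed"}.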

Supported File Types

Videos

  • .mp4, .mov, .avi, .mkv, .webm
  • Automatic transcription and scene analysis
  • Maximum size: Check API limits

Documents

  • .pdf, .doc, .docx, .txt
  • Text extraction and semantic analysis
  • Maximum size: Check API limits

Images

  • .jpg, .jpeg, .png, .gif, .webp
  • OCR and visual analysis
  • Maximum size: Check API limits
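
The extension lists above can be turned into a simple lookup, which is roughly what extension-based type detection amounts to. This sketch is an illustration under assumptions: detect_file_type is a hypothetical helper, and the labels "video", "document", and "image" are chosen to match names used elsewhere in this README, not taken from the library's source.

```python
from pathlib import Path

# Extension sets copied from the lists above.
EXTENSIONS = {
    "video": {".mp4", ".mov", ".avi", ".mkv", ".webm"},
    "document": {".pdf", ".doc", ".docx", ".txt"},
    "image": {".jpg", ".jpeg", ".png", ".gif", ".webp"},
}


def detect_file_type(path):
    """Return 'video', 'document', or 'image', or None if the extension is unknown."""
    suffix = Path(path).suffix.lower()
    for file_type, extensions in EXTENSIONS.items():
        if suffix in extensions:
            return file_type
    return None
```

Checking paths this way before uploading makes it easy to skip unsupported files early.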

Error Handling

try:
    file_objects = await client.upload_and_process_files(file_paths)
except Exception as e:
    print(f"Upload failed: {e}")

try:
    response = await client.query_group(group_id, query)
    print(response.answer)
except Exception as e:
    if "not ready" in str(e):
        print("Graph is still building, please wait")
    else:
        print(f"Query failed: {e}")
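
Building on the "not ready" check above, a caller might prefer to retry a few times instead of failing immediately. The wrapper below is a minimal sketch of that idea. query_when_ready is a hypothetical helper, and matching on the error message mirrors the example above rather than any documented exception type.

```python
import asyncio


async def query_when_ready(run_query, attempts=5, delay=10.0):
    """Await run_query(), retrying while the error message contains 'not ready'."""
    for attempt in range(1, attempts + 1):
        try:
            return await run_query()
        except Exception as exc:
            # Re-raise unrelated errors, and the last "not ready" error as well.
            if "not ready" not in str(exc) or attempt == attempts:
                raise
            await asyncio.sleep(delay)
```

It could be called as await query_when_ready(lambda: client.query_group(group_id, query)).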

Advanced Usage

Progress Tracking

def track_progress(step: str, current: int, total: int):
    # Guard against total == 0 to avoid a ZeroDivisionError
    percent = (current / total) * 100 if total else 0.0
    print(f"[{step}] {percent:.1f}% ({current}/{total})")

file_objects = await client.upload_and_process_files(
    file_paths,
    on_progress=track_progress
)

Without Waiting (Non-blocking)

# Start uploads and processing without waiting
file_objects = await client.upload_and_process_files(
    file_paths,
    poll_until_complete=False  # Returns immediately
)

# Poll manually later
for file_obj in file_objects:
    file_detail = await client.poll_file_until_complete(file_obj.file_id)
    print(f"{file_detail.file_name}: {file_detail.processing_status}")
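
The loop above waits for files one at a time. Since the client is async, the same polling could plausibly be run concurrently with asyncio.gather. The sketch below assumes only the poll_file_until_complete(file_id) method shown in this README; poll_all is a hypothetical helper, and how the real method reports failures or timeouts is not specified here.

```python
import asyncio


async def poll_all(client, file_ids):
    """Poll every file concurrently and return the results in input order."""
    tasks = [client.poll_file_until_complete(file_id) for file_id in file_ids]
    # return_exceptions=True collects failures as values instead of
    # raising on the first one, so each result can be inspected.
    return await asyncio.gather(*tasks, return_exceptions=True)
```

Each entry in the returned list is either a file detail object or the exception raised for that file.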

Custom Timeouts

# For large files that take longer to process
file_objects = await client.upload_and_process_files(
    file_paths,
    poll_until_complete=True,
    timeout=3600,  # 1 hour timeout
    poll_interval=10  # Check every 10 seconds
)

# For large groups that take longer to build
group_id = await client.create_group(
    file_ids,
    group_name="Large Knowledge Base",
    wait_for_ready=True,
    timeout=7200,  # 2 hour timeout
    poll_interval=15  # Check every 15 seconds
)

Changelog

v1.0.0 (2024-12-18)

Breaking Changes:

  • query_group() now returns the v2 response format (QueryResponse with sources as a dictionary)
  • query_group_v2() has been removed - use query_group() instead
  • QueryResponse now contains the v2 format (was QueryResponseV2)
  • QueryResponseLegacy contains the old v1 format (was QueryResponse)

Added:

  • web_search parameter in query_group() to augment answers with web search results

Migration from v0.6.x:

# Old (v0.6.x)
response = await client.query_group_v2(group_id, query)
for key, node in response.sources.items():
    print(f"{key}: {node['source']}")

# New (v1.0.0) - query_group() now uses the v2 format
response = await client.query_group(group_id, query)
for key, node in response.sources.items():
    print(f"{key}: {node['source']}")

v0.6.0 (2024-12-18)

Added:

  • query_group_v2() method - New recommended way to query graphs with cleaner response structure
  • QueryResponseV2 response model with sources as a dictionary keyed by citation markers
  • web_search parameter in query_group_v2() to augment answers with web search results

Deprecated:

  • query_group() method - Will be removed in a future version. Use query_group_v2() instead.
  • QueryResponse model - Use QueryResponseV2 for new integrations.

v0.5.0 (2024-11-27)

Added:

  • return_source_data parameter to query_group() method
  • When return_source_data=True:
    • Documents include text field with summary content
    • Images include time_limited_url with a signed URL (60 min expiry)
    • Videos include time_limited_url with a signed URL to the segment (60 min expiry)

Example:

response = await client.query_group(group_id, query, return_source_data=True)
for source in response.sources:
    if source['node_type'] == 'document':
        print(f"Text: {source['text']}")
    else:
        print(f"URL: {source['time_limited_url']}")

v0.4.0 (2024-11-25)

Added:

  • attention_nodes field in QueryResponse - Access all context nodes fed to the LLM with their similarity scores
  • Cleaner nested structure: each attention node has source (metadata) and score (similarity)
  • Each attention node includes:
    • source: Complete source metadata (video/document/image details)
    • score: Similarity score for the query (0.0 to 1.0, higher = more relevant)

Example:

response = await client.query_group(group_id, query)
print(f"Cited: {len(response.sources)}, Total context: {len(response.attention_nodes)}")

# Access scores and sources
for node in response.attention_nodes:
    print(f"Score: {node['score']:.3f}, Type: {node['source']['node_type']}")

Migration from v0.1.x

Version 0.2.0 introduces breaking changes aligned with the new API architecture.

Changed

  • Authentication: Now uses API keys instead of bearer tokens

    # Old (v0.1.x)
    client = GraphonClient(token="xDhMfTDCpfwewocP93d5")
    
    # New (v0.2.0)
    client = GraphonClient(api_key="sk_live_...")
    
  • Upload workflow: Simplified to a single method

    # Old (v0.1.x)
    upload_infos = await client.generate_upload_urls(filenames)
    # ... manual upload logic ...
    await client.upload_files(file_paths)
    
    # New (v0.2.0)
    file_objects = await client.upload_and_process_files(file_paths)
    
  • Group creation: Now uses file_ids instead of uuid_directories

    # Old (v0.1.x)
    group_uuid = await client.create_index(uuid_directories)
    
    # New (v0.2.0)
    file_ids = [f.file_id for f in file_objects]
    group_id = await client.create_group(file_ids, group_name)
    
  • Querying: Updated method signature

    # Old (v0.1.x)
    answer = await client.query(group_uuid, query_text)  # Returns string
    
    # New (v0.2.0)
    response = await client.query_group(group_id, query)  # Returns QueryResponse
    print(response.answer)
    print(response.sources)
    

Removed

  • generate_upload_urls() - Replaced by upload_and_process_files()
  • upload_file_to_signed_url() - Handled internally
  • upload_files() - Replaced by upload_and_process_files()
  • create_index() - Replaced by create_group()

License

MIT License - see LICENSE file for details
