A Python client library for the Graphon API - Build knowledge graphs from your files

Graphon Client

A Python client library for the Graphon API - Build unified knowledge graphs from your files.

Features

  • 🚀 Simple API: Upload files, create knowledge bases, and query them in just a few lines
  • 📁 Multi-format Support: Process videos (MP4, MOV), documents (PDF, DOCX), and images (JPG, PNG)
  • 🔄 Async/Await: Built on httpx for high-performance async operations
  • 🎯 Auto Type Detection: Automatically detects file types from extensions
  • 📊 Status Polling: Built-in polling for long-running operations
  • 🔒 Secure: API key authentication with role-based access control

Installation

pip install graphon-client

Quick Start

import asyncio
from graphon_client import GraphonClient

async def main():
    # Initialize client with your API key
    client = GraphonClient(api_key="your_api_key_here")
    
    # Upload and process files (auto-detects file types)
    file_objects = await client.upload_and_process_files(
        ["/path/to/video.mp4", "/path/to/document.pdf"],
        poll_until_complete=True  # Wait for processing to complete
    )
    
    # Create a knowledge graph from processed files
    file_ids = [f.file_id for f in file_objects]
    group_id = await client.create_group(
        file_ids,
        group_name="My Knowledge Base",
        wait_for_ready=True  # Wait for graph building to complete
    )
    
    # Query your knowledge graph
    response = await client.query_group(
        group_id,
        "What are the main topics discussed?"
    )
    print(response.answer)
    
    # Access sources by citation key
    for key, source in response.sources.items():
        print(f"{key}: {source['source']['node_type']}")

asyncio.run(main())

One-Shot Convenience Method

For the simplest workflow, use the all-in-one method:

import asyncio
from graphon_client import GraphonClient

async def quick_start():
    client = GraphonClient(api_key="your_api_key_here")
    
    # Upload, process, and create group in one call
    group_id = await client.upload_process_and_create_group(
        file_paths=["/path/to/file1.pdf", "/path/to/file2.mp4"],
        group_name="My Knowledge Base"
    )
    
    # Query immediately
    response = await client.query_group(group_id, "Summarize the content")
    print(response.answer)

asyncio.run(quick_start())

Get Your API Key

  1. Sign up at https://graphon.ai
  2. Navigate to Settings → API Keys
  3. Create a new API key
  4. Save it securely (it's only shown once!)
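Because the key is shown only once, it helps to keep it out of source code. The sketch below reads it from an environment variable. The variable name GRAPHON_API_KEY and the helper load_api_key are assumptions made for this example; the library does not document either.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "GRAPHON_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    GRAPHON_API_KEY is a hypothetical name chosen for this sketch.
    """
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the client")
    return key
```

The result can then be passed to the client as GraphonClient(api_key=load_api_key()).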

Core Concepts

Files

Individual files that are uploaded and processed. Each file gets a unique file_id and goes through processing to extract content and build a knowledge graph.

Processing Statuses:

  • UNPROCESSED: File uploaded but not yet processed
  • PROCESSING: File is being processed
  • SUCCESS: Processing completed successfully
  • FAILURE: Processing failed
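As a rough illustration of working with these statuses, the sketch below groups files by status so that only successfully processed files are passed on to group creation. FileStub is a hypothetical stand-in for the library's FileObject, and the sketch assumes processing_status compares equal to the plain strings listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class FileStub:
    """Stand-in for the library's FileObject (hypothetical, for illustration)."""
    file_id: str
    file_name: str
    processing_status: str


def group_by_status(files: Iterable[FileStub]) -> Dict[str, List[str]]:
    """Map each processing status to the file_ids currently in that status."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for f in files:
        grouped[f.processing_status].append(f.file_id)
    return dict(grouped)


# Made-up sample data
files = [
    FileStub("f1", "talk.mp4", "SUCCESS"),
    FileStub("f2", "notes.pdf", "FAILURE"),
    FileStub("f3", "slides.pdf", "SUCCESS"),
]
by_status = group_by_status(files)
ready_ids = by_status.get("SUCCESS", [])  # ids that are safe to pass to create_group
```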

Groups

Collections of files with a unified knowledge graph. Query multiple files together as a single knowledge base.

Key Concept: Create Once, Query Many Times

A group is a persistent knowledge base. Creating it builds the graph once; after that, you can query the same group_id as often as you like without re-creating the group for the same set of files. Queries are lightweight by comparison.

# Create the group once
group_id = await client.create_group(file_ids, "My Knowledge Base", wait_for_ready=True)

# Query as many times as you want with the same group_id
response1 = await client.query_group(group_id, "What are the main topics?")
response2 = await client.query_group(group_id, "Summarize the key findings")
response3 = await client.query_group(group_id, "What conclusions were drawn?")
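To reuse a group across separate runs of a script, the group_id has to be stored somewhere. The sketch below keeps it in a small JSON file keyed by group name. The helpers save_group_id and load_group_id and the file name groups.json are assumptions for this example, not part of the library.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_group_id(cache_file: Path, group_name: str, group_id: str) -> None:
    """Store group_id under group_name, keeping any entries already in the file."""
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    cache[group_name] = group_id
    cache_file.write_text(json.dumps(cache, indent=2))


def load_group_id(cache_file: Path, group_name: str) -> Optional[str]:
    """Return the stored group_id, or None if this group was never saved."""
    if not cache_file.exists():
        return None
    return json.loads(cache_file.read_text()).get(group_name)


# Demo in a temporary directory with a made-up group id
with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "groups.json"
    first_lookup = load_group_id(cache_file, "My Knowledge Base")
    save_group_id(cache_file, "My Knowledge Base", "group-123")
    second_lookup = load_group_id(cache_file, "My Knowledge Base")
```

A script would call create_group only when load_group_id returns None, and query the stored group_id otherwise.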

Graph Statuses:

  • pending: Group created but no files added yet
  • building: Unified graph is being built
  • ready: Graph is ready for querying
  • failed: Graph building failed
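The statuses above imply a simple polling loop: keep checking until the status is ready or failed, or until a timeout expires. The sketch below shows that general pattern with a fake status source. It is not the library's implementation, and wait_until_terminal is a hypothetical helper; the client already offers wait_for_ready and poll_group_until_ready for this purpose.

```python
import asyncio
import time
from typing import Awaitable, Callable

TERMINAL_STATUSES = {"ready", "failed"}


async def wait_until_terminal(
    fetch_status: Callable[[], Awaitable[str]],
    timeout: float = 3600,
    poll_interval: float = 5,
) -> str:
    """Poll fetch_status until it reports 'ready' or 'failed', or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = await fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Stop if the next poll would land after the deadline
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"still '{status}' after {timeout} seconds")
        await asyncio.sleep(poll_interval)


async def _demo() -> str:
    # Fake status source standing in for a real status call
    statuses = iter(["pending", "building", "building", "ready"])

    async def fake_fetch() -> str:
        return next(statuses)

    return await wait_until_terminal(fake_fetch, timeout=1, poll_interval=0.01)


final_status = asyncio.run(_demo())
```

With the real client, fetch_status could be an async function that awaits client.get_group_status(group_id) and returns its graph_status.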

API Reference

Client Initialization

client = GraphonClient(
    api_key="your_api_key_here",
    base_url="https://api-frontend-485250924682.us-central1.run.app"  # Optional
)

File Operations

Upload and Process Files

file_objects = await client.upload_and_process_files(
    file_paths=["/path/to/file1.pdf", "/path/to/file2.mp4"],
    poll_until_complete=True,  # Wait for processing (default: True)
    timeout=1800,  # Max wait time in seconds (default: 30 min)
    poll_interval=3,  # Check status every N seconds (default: 3)
    on_progress=lambda step, current, total: print(f"{step}: {current}/{total}")
)

# Returns list of FileObject with file_id, file_name, processing_status

Get File Status

file_detail = await client.get_file_status(file_id)
print(f"Status: {file_detail.processing_status}")

List Files

files = await client.list_files(
    status_filter="SUCCESS",  # Optional: filter by status
    file_type="video"  # Optional: filter by type
)

Poll File Until Complete (Manual)

file_detail = await client.poll_file_until_complete(
    file_id,
    timeout=1800,
    poll_interval=3,
    on_progress=lambda status: print(f"Status: {status}")
)

Group Operations

Create Group

group_id = await client.create_group(
    file_ids=["file-id-1", "file-id-2"],
    group_name="My Knowledge Base",
    wait_for_ready=True,  # Wait for graph building (default: False)
    timeout=3600,  # Max wait time in seconds (default: 1 hour)
    poll_interval=5,  # Check status every N seconds (default: 5)
    on_progress=lambda status: print(f"Graph status: {status}")
)

Query Group

Query a group's unified knowledge graph with sources mapped by citation keys:

response = await client.query_group(
    group_id="group-id-here",
    query="What are the key insights?",
    return_source_data=False,  # Set to True to get content/URLs (default: False)
    web_search=False  # Set to True to augment with web search (default: False)
)

print(response.answer)
# Answer contains inline citations like [1], [2], etc.

# Sources are keyed by citation markers - separate cited from non-cited
cited = {k: v for k, v in response.sources.items() if v.get("is_cited")}
other = {k: v for k, v in response.sources.items() if not v.get("is_cited")}

print("Cited Sources:")
for key, node in cited.items():
    source = node['source']
    print(f"  {key}: {source['node_type']} (score: {node['score']:.3f})")

print("Other Relevant Sources:")
for key, node in other.items():
    source = node['source']
    print(f"  {key}: {source['node_type']} (score: {node['score']:.3f})")

Response Structure:

  • answer: Generated answer with inline citation markers like [1], [2]
  • sources: Dictionary mapping citation keys to source metadata:
    • source: Source metadata (node_type, file_id, and type-specific fields)
    • score: Relevance score (0.0 to 1.0)
    • is_cited: Whether this source was explicitly cited in the answer
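Given this structure, a common follow-up is to rank sources by relevance. The sketch below works on a hand-written dictionary in the documented shape. The citation keys ("1", "2", "3"), file ids, and scores are made up, and the exact key format returned by the API is an assumption.

```python
from typing import Any, Dict, List, Tuple


def rank_sources(sources: Dict[str, Dict[str, Any]]) -> Tuple[List[str], List[str]]:
    """Return (cited_keys, other_keys), each sorted by descending score."""
    def by_score(keys: List[str]) -> List[str]:
        return sorted(keys, key=lambda k: sources[k]["score"], reverse=True)

    cited = [k for k, v in sources.items() if v.get("is_cited")]
    other = [k for k, v in sources.items() if not v.get("is_cited")]
    return by_score(cited), by_score(other)


# Hand-written sample in the documented shape; all values are illustrative
sample_sources = {
    "1": {"source": {"node_type": "document", "file_id": "f1"}, "score": 0.62, "is_cited": True},
    "2": {"source": {"node_type": "video", "file_id": "f2"}, "score": 0.91, "is_cited": True},
    "3": {"source": {"node_type": "image", "file_id": "f3"}, "score": 0.40, "is_cited": False},
}
cited_keys, other_keys = rank_sources(sample_sources)
```

With a real response, the same function would be called as rank_sources(response.sources).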

Get Group Status

group_detail = await client.get_group_status(group_id)
print(f"Graph status: {group_detail.graph_status}")
print(f"Files in group: {len(group_detail.file_ids)}")

List Groups

groups = await client.list_groups()
for group in groups:
    print(f"{group.group_name} - {group.graph_status} - {group.file_count} files")

Note: list_groups() returns GroupListItem objects (summary view with file_count), not full GroupDetail objects. Use get_group_status(group_id) for complete details.

Poll Group Until Ready (Manual)

group_detail = await client.poll_group_until_ready(
    group_id,
    timeout=3600,
    poll_interval=5,
    on_progress=lambda status: print(f"Status: {status}")
)

Supported File Types

Videos

  • .mp4, .mov, .avi, .mkv, .webm
  • Automatic transcription and scene analysis
  • Maximum size: Check API limits

Documents

  • .pdf, .doc, .docx, .txt
  • Text extraction and semantic analysis
  • Maximum size: Check API limits

Images

  • .jpg, .jpeg, .png, .gif, .webp
  • OCR and visual analysis
  • Maximum size: Check API limits
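Since file types are detected from extensions, a local pre-flight check can catch unsupported files before any upload starts. The sketch below uses the extension lists above. classify_path is a hypothetical helper, not the library's own detection logic, and treating extensions as case-insensitive is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extension lists copied from the sections above
SUPPORTED_EXTENSIONS = {
    "video": {".mp4", ".mov", ".avi", ".mkv", ".webm"},
    "document": {".pdf", ".doc", ".docx", ".txt"},
    "image": {".jpg", ".jpeg", ".png", ".gif", ".webp"},
}


def classify_path(path: str) -> Optional[str]:
    """Return 'video', 'document', or 'image', or None if unsupported."""
    suffix = Path(path).suffix.lower()
    for kind, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return kind
    return None


# Made-up paths; only the extension matters here
paths = ["/data/talk.MP4", "/data/notes.pdf", "/data/archive.zip"]
unsupported = [p for p in paths if classify_path(p) is None]
```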

Error Handling

try:
    file_objects = await client.upload_and_process_files(file_paths)
except Exception as e:
    print(f"Upload failed: {e}")

try:
    response = await client.query_group(group_id, query)
    print(response.answer)
except Exception as e:
    if "not ready" in str(e):
        print("Graph is still building, please wait")
    else:
        print(f"Query failed: {e}")
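If a "not ready" error is expected to clear on its own, the query can be retried after a short delay. The sketch below wraps any async call and reuses the same string check as the example above. retry_while_not_ready is a hypothetical helper, and the fake query only simulates the error message.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_while_not_ready(
    call: Callable[[], Awaitable[T]],
    attempts: int = 5,
    delay: float = 5.0,
) -> T:
    """Retry call() while it fails with a 'not ready' message."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except Exception as exc:
            # Any other error, or the final failed attempt, is re-raised unchanged
            if "not ready" not in str(exc) or attempt == attempts:
                raise
            await asyncio.sleep(delay)
    raise RuntimeError("attempts must be at least 1")


async def _demo() -> str:
    calls = {"count": 0}

    async def fake_query() -> str:
        # Fails twice with a simulated message, then succeeds
        calls["count"] += 1
        if calls["count"] < 3:
            raise RuntimeError("Graph not ready")
        return "answer"

    return await retry_while_not_ready(fake_query, attempts=5, delay=0.01)


result = asyncio.run(_demo())
```

With the real client, call could be lambda: client.query_group(group_id, query).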

Advanced Usage

Progress Tracking

def track_progress(step: str, current: int, total: int):
    # Guard against total == 0 to avoid a ZeroDivisionError
    percent = (current / total) * 100 if total else 0.0
    print(f"[{step}] {percent:.1f}% ({current}/{total})")

file_objects = await client.upload_and_process_files(
    file_paths,
    on_progress=track_progress
)

Without Waiting (Non-blocking)

# Start uploads and processing without waiting
file_objects = await client.upload_and_process_files(
    file_paths,
    poll_until_complete=False  # Returns immediately
)

# Poll manually later
for file_obj in file_objects:
    file_detail = await client.poll_file_until_complete(file_obj.file_id)
    print(f"{file_detail.file_name}: {file_detail.processing_status}")
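The loop above waits for files one at a time. A possible alternative is to poll them concurrently with asyncio.gather, sketched below with a fake poll function. poll_all is a hypothetical helper, and whether the client handles concurrent calls well is an assumption to verify before relying on it.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def poll_all(
    file_ids: Iterable[str],
    poll_one: Callable[[str], Awaitable[object]],
) -> Dict[str, object]:
    """Poll every file concurrently; map each id to its result or its exception."""
    ids = list(file_ids)
    # return_exceptions=True keeps one failure from discarding the other results
    results = await asyncio.gather(*(poll_one(i) for i in ids), return_exceptions=True)
    return dict(zip(ids, results))


async def _demo() -> Dict[str, object]:
    async def fake_poll(file_id: str) -> str:
        # Stand-in for a real poll call; "f2" simulates a timeout
        await asyncio.sleep(0.01)
        if file_id == "f2":
            raise TimeoutError("timed out")
        return "SUCCESS"

    return await poll_all(["f1", "f2", "f3"], fake_poll)


outcomes = asyncio.run(_demo())
```

With the real client, poll_one could be client.poll_file_until_complete, in which case each successful value is the returned file detail.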

Custom Timeouts

# For large files that take longer to process
file_objects = await client.upload_and_process_files(
    file_paths,
    poll_until_complete=True,
    timeout=3600,  # 1 hour timeout
    poll_interval=10  # Check every 10 seconds
)

# For large groups that take longer to build
group_id = await client.create_group(
    file_ids,
    group_name="Large Knowledge Base",
    wait_for_ready=True,
    timeout=7200,  # 2 hour timeout
    poll_interval=15  # Check every 15 seconds
)

Changelog

v1.0.0 (2024-12-18)

Breaking Changes:

  • query_group() now returns the v2 response format (QueryResponse with sources as a dictionary)
  • query_group_v2() has been removed - use query_group() instead
  • QueryResponse now contains the v2 format (was QueryResponseV2)
  • QueryResponseLegacy contains the old v1 format (was QueryResponse)

Added:

  • web_search parameter in query_group() to augment answers with web search results

Migration from v0.6.x:

# Old (v0.6.x)
response = await client.query_group_v2(group_id, query)
for key, node in response.sources.items():
    print(f"{key}: {node['source']}")

# New (v1.0.0) - query_group() now uses the v2 format
response = await client.query_group(group_id, query)
for key, node in response.sources.items():
    print(f"{key}: {node['source']}")

v0.6.0 (2024-12-18)

Added:

  • query_group_v2() method - New recommended way to query graphs with cleaner response structure
  • QueryResponseV2 response model with sources as a dictionary keyed by citation markers
  • web_search parameter in query_group_v2() to augment answers with web search results

Deprecated:

  • query_group() method - Will be removed in a future version. Use query_group_v2() instead.
  • QueryResponse model - Use QueryResponseV2 for new integrations.

v0.5.0 (2024-11-27)

Added:

  • return_source_data parameter to query_group() method
  • When return_source_data=True:
    • Documents include text field with summary content
    • Images include time_limited_url with a signed URL (60 min expiry)
    • Videos include time_limited_url with a signed URL to the segment (60 min expiry)

Example:

# v0.5.0 response format: sources is iterated as a list here (it became a dictionary in v1.0.0)
response = await client.query_group(group_id, query, return_source_data=True)
for source in response.sources:
    if source['node_type'] == 'document':
        print(f"Text: {source['text']}")
    else:
        print(f"URL: {source['time_limited_url']}")

v0.4.0 (2024-11-25)

Added:

  • attention_nodes field in QueryResponse - Access all context nodes fed to the LLM with their similarity scores
  • Cleaner nested structure: each attention node has source (metadata) and score (similarity)
  • Each attention node includes:
    • source: Complete source metadata (video/document/image details)
    • score: Similarity score for the query (0.0 to 1.0, higher = more relevant)

Example:

response = await client.query_group(group_id, query)
print(f"Cited: {len(response.sources)}, Total context: {len(response.attention_nodes)}")

# Access scores and sources
for node in response.attention_nodes:
    print(f"Score: {node['score']:.3f}, Type: {node['source']['node_type']}")

Migration from v0.1.x

Version 0.2.0 introduces breaking changes aligned with the new API architecture.

Changed

  • Authentication: Now uses API keys instead of bearer tokens

    # Old (v0.1.x)
    client = GraphonClient(token="your_token_here")
    
    # New (v0.2.0)
    client = GraphonClient(api_key="sk_live_...")
    
  • Upload workflow: Simplified to a single method

    # Old (v0.1.x)
    upload_infos = await client.generate_upload_urls(filenames)
    # ... manual upload logic ...
    await client.upload_files(file_paths)
    
    # New (v0.2.0)
    file_objects = await client.upload_and_process_files(file_paths)
    
  • Group creation: Now uses file_ids instead of uuid_directories

    # Old (v0.1.x)
    group_uuid = await client.create_index(uuid_directories)
    
    # New (v0.2.0)
    file_ids = [f.file_id for f in file_objects]
    group_id = await client.create_group(file_ids, group_name)
    
  • Querying: Updated method signature

    # Old (v0.1.x)
    answer = await client.query(group_uuid, query_text)  # Returns string
    
    # New (v0.2.0)
    response = await client.query_group(group_id, query)  # Returns QueryResponse
    print(response.answer)
    print(response.sources)
    

Removed

  • generate_upload_urls() - Replaced by upload_and_process_files()
  • upload_file_to_signed_url() - Handled internally
  • upload_files() - Replaced by upload_and_process_files()
  • create_index() - Replaced by create_group()

Support

License

MIT License - see LICENSE file for details

