Python SDK for Infino API with AWS SigV4 authentication
Infino Python SDK
Official Python SDK for Infino - the context engine for your data stack.
Infino provides an intelligent unification layer for your data stack. Query Elasticsearch, OpenSearch, Snowflake, and other data sources in natural language, SQL, QueryDSL, or PromQL. Bring diverse data sources together for deeper analysis—no ETL required. Control access via fine-grained RBAC. All through one unified API.
Built for:
- Connect: Access your data sources without data movement
- Query: Natural language, SQL, Query DSL, and PromQL across all sources
- Correlate: Pull together data from different sources for cross-source correlation
- Govern: Fine-grained RBAC for your entire data stack
Table of Contents
- Quick Start
- API Reference
- Connect – Access Data Sources
- Query – Ask Questions
- Correlate – Cross-Source Operations
- Govern – Security & Access Control
- Error Handling
- Advanced Configuration
- Examples
- Development
- Support
Quick Start
Installation
pip install infino-sdk
Getting Your Credentials
- Sign up at app.infino.ws
- Create a new account (accounts can only be created through the UI)
- Navigate to Settings → API Keys
- Generate your `access_key` and `secret_key`
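Once the keys are generated, one option is to keep them out of source code and read them from the environment. The sketch below is an illustration, not part of the SDK: the helper `load_credentials` and the variable names `INFINO_ACCESS_KEY`, `INFINO_SECRET_KEY`, and `INFINO_ENDPOINT` are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def load_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect SDK constructor arguments from environment-style variables.

    The variable names used here are hypothetical; choose names that fit
    your deployment.
    """
    env = os.environ if env is None else env
    required = ("INFINO_ACCESS_KEY", "INFINO_SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing credentials: {', '.join(missing)}")
    return {
        "access_key": env["INFINO_ACCESS_KEY"],
        "secret_key": env["INFINO_SECRET_KEY"],
        # Fall back to the endpoint used elsewhere in this README.
        "endpoint": env.get("INFINO_ENDPOINT", "https://api.infino.ws"),
    }
```

The returned dictionary uses the same keyword names as the `InfinoSDK` constructor, so it could be passed as `InfinoSDK(**load_credentials())`.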
Basic Usage
from infino_sdk import InfinoSDK
# Create SDK instance with your credentials
sdk = InfinoSDK(
access_key="your_access_key",
secret_key="your_secret_key",
endpoint="https://api.infino.ws"
)
# Check connection
info = sdk.ping()
print(f"Connected: {info}")
# Query a dataset
results = sdk.query_dataset_in_querydsl("my-dataset", '{"query": {"match_all": {}}}')
print(f"Found {len(results.get('hits', {}).get('hits', []))} records")
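The `hits.hits` lookup above suggests the response follows the Elasticsearch/OpenSearch search layout. A minimal sketch of unpacking it, assuming each hit carries its document under `_source` (that key is an assumption based on the Elasticsearch convention, not something this README states):

```python
from typing import Any, Dict, List


def extract_hits(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the list under results["hits"]["hits"], or [] if it is absent."""
    return results.get("hits", {}).get("hits", [])


def extract_sources(results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the document body of every hit.

    Assumes Elasticsearch-style hits that store the document under "_source".
    """
    return [hit.get("_source", {}) for hit in extract_hits(results)]


# Hand-written sample response used only to exercise the helpers.
sample = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"product_id": "A123"}},
            {"_id": "2", "_source": {"product_id": "B456"}},
        ]
    }
}
print(len(extract_hits(sample)))
print(extract_sources(sample))
```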
API Reference
Complete reference of SDK methods, organized by category. Code examples for each category appear in the sections that follow.
Initialization & Utilities (5 methods)
| Method | Description |
|---|---|
| `InfinoSDK(access_key, secret_key, endpoint, retry_config=None)` | Initialize SDK instance |
| `InfinoSDK.new(access_key, secret_key, endpoint)` | Alternative constructor |
| `InfinoSDK.new_with_retry(access_key, secret_key, endpoint, retry_config)` | Constructor with retry config |
| `ping()` | Health check endpoint |
| `close()` | Close HTTP session |
Context manager: the SDK supports the `with` statement for automatic resource cleanup (see Context Manager Usage).
Datasets (11 methods)
| Method | Description |
|---|---|
| `create_dataset(dataset)` | Create empty dataset |
| `delete_dataset(dataset)` | Delete dataset |
| `get_datasets()` | List all datasets |
| `get_dataset_metadata(dataset)` | Get dataset metadata (count, size, etc.) |
| `get_dataset_schema(dataset)` | Get dataset field mappings |
| `upload_json_to_dataset(dataset, payload)` | Upload NDJSON bulk data |
| `upsert_to_dataset(query)` | Upsert via SQL INSERT/UPDATE |
| `upload_metrics_to_dataset(dataset, payload)` | Upload Prometheus metrics |
| `get_record(dataset, record_id)` | Get single record by ID |
| `delete_records(dataset, query)` | Delete records matching query |
| `enrich_dataset(dataset, policy)` | Configure dataset enrichment |
Query Methods (5 methods)
| Method | Description |
|---|---|
| `query_dataset_in_querydsl(dataset, query)` | Query using Elasticsearch/OpenSearch DSL |
| `query_dataset_in_sql(query)` | Query using SQL (supports joins) |
| `query_dataset_in_promql(query, dataset=None)` | Instant PromQL query |
| `query_dataset_in_promql_range(query, start, end, step, dataset=None)` | Range PromQL query |
| `query_source(connection_id, dataset, query)` | Query connected source in native DSL |
Fino AI (9 methods)
| Method | Description |
|---|---|
| `websocket_connect(path, headers=None)` | Connect to WebSocket for streaming |
| `list_threads()` | List all conversation threads |
| `create_thread(config)` | Create new thread |
| `get_thread(thread_id)` | Get thread details |
| `update_thread(thread_id, config)` | Update thread metadata |
| `delete_thread(thread_id)` | Delete thread |
| `add_thread_message(thread_id, message)` | Add message to thread |
| `clear_thread_messages(thread_id)` | Clear all thread messages |
| `send_message(payload)` | Send message (simplified API) |
Connections (12 methods)
| Method | Description |
|---|---|
| `get_sources()` | List available data source types |
| `get_connections()` | List active connections |
| `create_connection(source_type, config)` | Create connection |
| `get_connection(connection_id)` | Get connection status |
| `update_connection(connection_id, config)` | Update connection |
| `delete_connection(connection_id)` | Delete connection |
| `get_source_metadata(connection_id, dataset)` | Get metadata from source |
| `upload_file(dataset, file_path, format, ...)` | Upload file (JSON, JSONL, CSV) |
| `get_connector_job_status(run_id)` | Get async job status |
| `create_import_job(source_type, config)` | Create import job |
| `get_import_jobs()` | List import jobs |
| `delete_import_job(job_id)` | Delete import job |
RBAC & Governance (11 methods)
| Method | Description |
|---|---|
| `create_user(name, config)` | Create user (YAML or JSON) |
| `get_user(name)` | Get user details |
| `update_user(name, config)` | Update user |
| `delete_user(name)` | Delete user |
| `list_users()` | List all users |
| `create_role(name, config)` | Create role (YAML or JSON) |
| `get_role(name)` | Get role details |
| `update_role(name, config)` | Update role permissions |
| `delete_role(name)` | Delete role |
| `list_roles()` | List all roles |
| `rotate_keys(username)` | Rotate API keys for user |
Connect – Access Data Sources
Connect to data sources and query them in place without data movement.
Discover Available Sources
from infino_sdk import InfinoSDK
sdk = InfinoSDK(access_key, secret_key, endpoint)
# Get list of all available data source types
sources = sdk.get_sources()
for source in sources:
    print(f"{source['name']}: {source['description']}")
Manage Connections
# Create connection to Elasticsearch
connection_config = {
"connector_id": "elasticsearch",
"name": "Production ES Cluster",
"config": {
"host": "https://es-cluster.example.com:9200",
"username": "elastic",
"password": "secret"
}
}
connection = sdk.create_connection("elasticsearch", connection_config)
print(f"Created connection: {connection['connection_id']}")
# List all active connections
connections = sdk.get_connections()
# Get connection status
status = sdk.get_connection("conn_abc123")
print(f"Status: {status['status']}")
# Update connection configuration
updated_config = {
"name": "Production ES Cluster - Updated",
"config": {
"host": "https://new-es-cluster.example.com:9200",
"username": "elastic",
"password": "new_secret"
}
}
sdk.update_connection("conn_abc123", updated_config)
# Delete connection
sdk.delete_connection("conn_old_123")
Query Connected Sources
# Query Elasticsearch (via QueryDSL)
results = sdk.query_source(
connection_id="conn_elasticsearch_prod",
dataset="external_logs",
query='{"query": {"match_all": {}}}'
)
# Query Snowflake (via SQL)
results = sdk.query_source(
connection_id="conn_snowflake_prod",
dataset="sales_data",
query="SELECT * FROM sales_data WHERE region='US' LIMIT 10"
)
# Get metadata from a connected source
metadata = sdk.get_source_metadata("conn_elasticsearch_prod", "logs-2024")
print(f"Fields: {metadata['mappings']}")
File Upload
Upload files directly to datasets without setting up connections. Supports JSON, JSONL, and CSV formats.
from infino_sdk import InfinoSDK
sdk = InfinoSDK(access_key, secret_key, endpoint)
# Upload a JSON file (synchronous - waits for completion)
result = sdk.upload_file(
dataset="my-dataset",
file_path="data.json",
format="json" # or "jsonl", "csv", "auto"
)
print(f"Uploaded {result['stats']['documents_processed']} documents")
# Upload a CSV file
result = sdk.upload_file("sales-data", "quarterly_sales.csv", format="csv")
# Upload with async mode (for large files)
result = sdk.upload_file(
dataset="large-dataset",
file_path="big_data.jsonl",
format="jsonl",
async_mode=True # Returns immediately with run_id
)
print(f"Job submitted: {result['run_id']}")
# Poll for completion
import time
while True:
    status = sdk.get_connector_job_status(result['run_id'])
    print(f"Status: {status['status']}")
    if status['status'] in ('completed', 'failed'):
        if status['status'] == 'completed':
            print(f"Processed {status['stats']['documents_processed']} docs")
        break
    time.sleep(2)
Parameters:
- `dataset`: Target dataset name
- `file_path`: Path to the file to upload
- `format`: File format, one of `"json"`, `"jsonl"`, `"csv"`, or `"auto"` (default: auto-detect)
- `batch_size`: Documents per batch (default: 1000)
- `async_mode`: If True, returns immediately with `run_id` for polling (default: False)
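The polling loop shown for `async_mode=True` runs until the job finishes, which can hang indefinitely if a job stalls. A minimal sketch of the same loop with a timeout follows; `wait_for_job` is a hypothetical helper, not an SDK method, and it takes the status lookup as a callable so it can be exercised without a live connection.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    get_status: Callable[[], Dict[str, Any]],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll get_status() until the job is 'completed' or 'failed'.

    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status.get("status") in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still {status.get('status')!r} after {timeout}s")
        sleep(interval)
```

With the SDK, the callable would wrap the documented status call, for example `wait_for_job(lambda: sdk.get_connector_job_status(result["run_id"]))`.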
Import Jobs
Import data from connected sources into datasets for correlation.
# Create import job to bring data from Snowflake into a dataset
import_config = {
"connection_id": "conn_snowflake_prod",
"source_dataset": "sales_data",
"target_dataset": "sales-correlation",
"query": "SELECT * FROM sales_data WHERE date >= '2024-01-01'",
"schedule": "0 2 * * *" # Daily at 2 AM
}
job = sdk.create_import_job("snowflake", import_config)
print(f"Import job created: {job['job_id']}")
# List all import jobs
jobs = sdk.get_import_jobs()
for job in jobs:
    print(f"Job {job['job_id']}: {job['status']}")
# Delete import job
sdk.delete_import_job("job_abc123")
Query – Ask Questions
Query any connected source or dataset with multiple interfaces.
Natural Language (Fino AI)
import asyncio
import json
from infino_sdk import InfinoSDK
async def query_with_fino():
    sdk = InfinoSDK(access_key, secret_key, endpoint)
    # Connect to Fino WebSocket
    ws = await sdk.websocket_connect("/_conversation/ws")
    try:
        # Send your question
        await ws.send(json.dumps({
            "type": "query",
            "content": "What are the top 5 products by revenue?"
        }))
        # Receive AI response, stopping at the "complete" message
        async for message in ws:
            data = json.loads(message)
            print(f"Fino: {data.get('content', '')}")
            if data.get("type") == "complete":
                break
    finally:
        # Always release the WebSocket and the HTTP session
        await ws.close()
        sdk.close()
# Run async
asyncio.run(query_with_fino())
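The receive loop above prints each message as it arrives. To assemble the answer into a single string instead, the same stop-on-`complete` logic can be factored into a pure function. This is a sketch under assumptions: it treats every `content` value as a text fragment, and the `"chunk"` type in the sample data is invented for illustration (only `"query"` and `"complete"` appear in this README).

```python
import json
from typing import Iterable, List


def collect_reply(messages: Iterable[str]) -> str:
    """Concatenate the 'content' of JSON messages up to the 'complete' marker."""
    parts: List[str] = []
    for raw in messages:
        data = json.loads(raw)
        content = data.get("content", "")
        if content:
            parts.append(content)
        if data.get("type") == "complete":
            break  # Ignore anything sent after the completion marker
    return "".join(parts)


# Hand-written sample stream; the "chunk" type is hypothetical.
sample_stream = [
    json.dumps({"type": "chunk", "content": "Top "}),
    json.dumps({"type": "chunk", "content": "products"}),
    json.dumps({"type": "complete"}),
    json.dumps({"type": "chunk", "content": " (ignored)"}),
]
print(collect_reply(sample_stream))
```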
Manage Conversation Threads
# Create and manage Fino conversation threads
sdk = InfinoSDK(access_key, secret_key, endpoint)
# List existing threads
threads = sdk.list_threads()
print(f"Found {len(threads)} threads")
# Create a new thread with optional metadata
thread = sdk.create_thread({
"name": "Sales Analysis",
"workflow_name": "alpha_v1"
})
print(f"Created thread: {thread['id']}")
# Get thread details
thread_info = sdk.get_thread(thread["id"])
print(f"Thread name: {thread_info['name']}")
# Update thread metadata
sdk.update_thread(thread["id"], {
"name": "Q4 Sales Analysis - Updated",
"metadata": {
"department": "finance",
"priority": "high"
}
})
# Add a message to the thread
sdk.add_thread_message(thread["id"], {
"content": {
"user_query": "What are the top 5 regions by revenue?"
},
"role": "user"
})
# Send a message using the simplified API (thread_id in payload)
response = sdk.send_message({
"thread_id": thread["id"],
"content": {
"user_query": "Show me week-over-week trends"
},
"role": "user"
})
# Clean up when finished
sdk.clear_thread_messages(thread["id"])
sdk.delete_thread(thread["id"])
SQL Queries
# Query a dataset with SQL
results = sdk.query_dataset_in_sql("SELECT * FROM products WHERE price > 100 LIMIT 10")
# With aggregations
results = sdk.query_dataset_in_sql("SELECT category, AVG(price) FROM products GROUP BY category")
Query DSL
# Simple query on a dataset
query = '{"query": {"match_all": {}}}'
results = sdk.query_dataset_in_querydsl("products", query)
# Complex query with filters
query = '''
{
"query": {
"bool": {
"must": [{"range": {"price": {"gte": 10, "lte": 100}}}],
"filter": [{"term": {"in_stock": true}}]
}
}
}
'''
results = sdk.query_dataset_in_querydsl("products", query)
# Query data source
results = sdk.query_source(
connection_id="conn_opensearch",
dataset="dataset",
query=query
)
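Because the query methods take Query DSL as a JSON string, building the query as a Python dictionary and serializing it avoids quoting mistakes such as writing `True` instead of `true`. A minimal sketch, where `build_range_query` is a hypothetical helper that reproduces the bool query shown above:

```python
import json
from typing import Any, Dict, Optional


def build_range_query(
    field: str,
    gte: Optional[float] = None,
    lte: Optional[float] = None,
    filters: Optional[Dict[str, Any]] = None,
) -> str:
    """Serialize a bool query with one range clause and optional term filters."""
    bounds: Dict[str, Any] = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    bool_clause: Dict[str, Any] = {"must": [{"range": {field: bounds}}]}
    if filters:
        bool_clause["filter"] = [{"term": {name: value}} for name, value in filters.items()]
    # json.dumps converts Python True/False to JSON true/false.
    return json.dumps({"query": {"bool": bool_clause}})


query = build_range_query("price", gte=10, lte=100, filters={"in_stock": True})
print(query)
```

The resulting string can be passed wherever the examples above pass `query`.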
PromQL (Time-Series)
# Instant PromQL query
result = sdk.query_dataset_in_promql(
'http_requests_total{status="200"}',
dataset="metrics_example",
)
# Range PromQL query
result = sdk.query_dataset_in_promql_range(
query='rate(http_requests_total[5m])',
start=1609459200,
end=1609545600,
step=300,
dataset="metrics_example",
)
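The `start` and `end` values in the range query are Unix timestamps in seconds, and `step` is in seconds as well. The sketch below derives them from timezone-aware datetimes; `promql_range_args` and `expected_points` are hypothetical helpers, and the point count assumes Prometheus-style evaluation at `start`, `start + step`, and so on up to `end`, which this README does not confirm for Infino.

```python
from datetime import datetime, timezone
from typing import Dict


def promql_range_args(start: datetime, end: datetime, step_seconds: int) -> Dict[str, int]:
    """Convert timezone-aware datetimes into start/end/step integers."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    start_ts = int(start.timestamp())
    end_ts = int(end.timestamp())
    if end_ts < start_ts:
        raise ValueError("end must not be earlier than start")
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return {"start": start_ts, "end": end_ts, "step": step_seconds}


def expected_points(start: int, end: int, step: int) -> int:
    """Number of evaluation timestamps, assuming inclusive start and end."""
    return (end - start) // step + 1


args = promql_range_args(
    datetime(2021, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 2, tzinfo=timezone.utc),
    step_seconds=300,
)
print(args)
print(expected_points(**args))
```

The dictionary keys match the keyword arguments of `query_dataset_in_promql_range`, so they could be unpacked alongside `query` and `dataset`.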
Correlate – Cross-Source Operations
Use datasets to pull together data from different sources for correlation and analysis without schemas.
When to Use Datasets
- Cross-Source Joins: Correlate data from multiple data sources
- Unified Analysis: Ask deeper questions across silos
- Staging: Test queries before running in production
- Temporary Storage: Hold intermediate results for complex workflows
Create Datasets
# Create a dataset for staging
sdk.create_dataset("staging-analysis-2024")
Upload Data
# Upload JSON records to a dataset (NDJSON format)
bulk_data = '''
{"index": {"_id": "1"}}
{"product_id": "A123", "revenue": 15000, "@timestamp": "2024-10-15"}
{"index": {"_id": "2"}}
{"product_id": "B456", "revenue": 23000, "@timestamp": "2024-10-15"}
'''
sdk.upload_json_to_dataset("sales-correlation", bulk_data)
# Upload via SQL upsert (INSERT or UPDATE)
sql_upsert = """
INSERT INTO sales_correlation (_id, product_id, revenue, timestamp)
VALUES ('3', 'C789', 30000, '2024-10-15')
ON CONFLICT (_id) DO UPDATE SET revenue = 30000
"""
sdk.upsert_to_dataset(sql_upsert)
# Upload Prometheus metrics to a dataset
metrics_data = '''
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1234 1609459200000
http_requests_total{method="POST",status="201"} 567 1609459200000
'''
sdk.upload_metrics_to_dataset("metrics-correlation", metrics_data)
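The NDJSON payload above alternates an action line (`{"index": {"_id": ...}}`) with a document line. Generating it from Python objects avoids hand-formatting errors. A minimal sketch, where `to_bulk_ndjson` is a hypothetical helper; the trailing newline follows the Elasticsearch bulk convention, and whether Infino requires it is not stated here.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def to_bulk_ndjson(records: Iterable[Tuple[str, Dict[str, Any]]]) -> str:
    """Render (record_id, document) pairs as action/document NDJSON lines."""
    lines = []
    for record_id, document in records:
        lines.append(json.dumps({"index": {"_id": str(record_id)}}))
        lines.append(json.dumps(document))
    # One JSON object per line, terminated by a final newline.
    return "\n".join(lines) + "\n" if lines else ""


payload = to_bulk_ndjson([
    ("1", {"product_id": "A123", "revenue": 15000, "@timestamp": "2024-10-15"}),
    ("2", {"product_id": "B456", "revenue": 23000, "@timestamp": "2024-10-15"}),
])
print(payload)
```

The result is a string of the same shape as `bulk_data` and could be passed to `upload_json_to_dataset`.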
Manage Datasets
# Get dataset metadata
metadata = sdk.get_dataset_metadata("sales-correlation")
print(f"Document count: {metadata['count']}")
# Get dataset schema
schema = sdk.get_dataset_schema("sales-correlation")
print(f"Fields: {schema['mappings']}")
# List all datasets
datasets = sdk.get_datasets()
for dataset in datasets:
    print(f"Dataset: {dataset['name']}")
# Delete dataset
sdk.delete_dataset("old-staging-2023")
Record Operations
# Get a record
record = sdk.get_record("sales-correlation", "prod_123")
# Delete records
sdk.delete_records("sales-correlation", '{"query": {"range": {"@timestamp": {"lt": "2024-01-01"}}}}')
Dataset Enrichment
# Configure enrichment for a dataset
import json
enrich_policy = json.dumps({
"enrich_policy": {
"match_field": "user_id",
"enrich_fields": ["email", "name", "department"]
}
})
sdk.enrich_dataset("sales-correlation", enrich_policy)
Govern – Security & Access Control
Control access to your entire data stack with centralized governance for both humans and agents.
Complete Workflow Example
from infino_sdk import InfinoSDK
sdk = InfinoSDK(access_key, secret_key, endpoint)
# Step 1: Create a role with specific permissions
role_config = """
Version: 2025-01-01
Permissions:
  - ResourceType: record
    Actions: [read]
    Resources: ["logs-*", "metrics-*"]
  - ResourceType: metadata
    Actions: [read]
    Resources: ["*"]
"""
sdk.create_role("readonly-analyst", role_config)
# Step 2: Create user and assign the role
user_config = """
Version: 2025-01-01
Password: SecureP@ssw0rd123!
Roles:
- readonly-analyst
"""
sdk.create_user("analytics-agent", user_config)
# Step 3: Rotate API keys when needed
new_keys = sdk.rotate_keys("analytics-agent")
print(f"New access key: {new_keys['access_key']}")
User Management
# List all users
users = sdk.list_users()
# Get specific user
user = sdk.get_user("analytics-agent")
# Update user password or roles
updated_config = """
Version: 2025-01-01
Password: NewP@ssw0rd456!
Roles:
- readonly-analyst
- data-viewer
"""
sdk.update_user("analytics-agent", updated_config)
# Delete user
sdk.delete_user("analytics-agent")
Role Management
# Create role with field-level security
role_with_masking = """
Version: 2025-01-01
Permissions:
  - ResourceType: record
    Actions: [read]
    Resources: ["users-*"]
    Fields:
      Allow: ["id", "name", "email"]
      Mask:
        email: redact
        ssn: remove
      Deny:
        - password
        - api_key
"""
sdk.create_role("privacy-compliant-analyst", role_with_masking)
# List all roles
roles = sdk.list_roles()
for role in roles:
    print(f"Role: {role['name']}")
# Get role details
role = sdk.get_role("readonly-analyst")
print(f"Permissions: {role['permissions']}")
# Update role permissions
updated_role = """
Version: 2025-01-01
Permissions:
  - ResourceType: record
    Actions: [read, write]
    Resources: ["logs-*", "metrics-*"]
"""
sdk.update_role("readonly-analyst", updated_role)
# Delete role
sdk.delete_role("old-role")
Resource Types & Actions
Permissions use universal terminology that works across SQL, NoSQL, logs, and metrics:
| ResourceType | Actions | What It Controls |
|---|---|---|
| metadata | read | View schemas, mappings, list datasets |
| dataset | create, delete | Create/delete datasets |
| record | read, write | Query/insert/update/delete records |
| field | N/A | Controlled via Fields in record permissions |
Centralized Governance: Apply consistent policies across all connected sources for both humans and agents.
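The `Resources` entries in the role examples, such as `"logs-*"`, read like wildcard patterns. To reason locally about which datasets a role would cover, a small check can be sketched with the standard library. This assumes `*` behaves like a shell-style wildcard; the actual matching is performed by Infino and may differ, and `is_covered` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def matching_patterns(dataset: str, patterns: Iterable[str]) -> List[str]:
    """Return the patterns that match the dataset name (case-sensitive globs)."""
    return [pattern for pattern in patterns if fnmatchcase(dataset, pattern)]


def is_covered(dataset: str, patterns: Iterable[str]) -> bool:
    """True if at least one pattern matches the dataset name."""
    return bool(matching_patterns(dataset, patterns))


role_resources = ["logs-*", "metrics-*"]
for name in ("logs-2024", "metrics-cpu", "users-prod"):
    print(name, is_covered(name, role_resources))
```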
Error Handling
from infino_sdk import InfinoSDK, InfinoError
# The SDK calls here are synchronous, so use a plain `with` block
with InfinoSDK(access_key, secret_key, endpoint) as sdk:
    try:
        record = sdk.get_record("products", "missing_id")
    except InfinoError as e:
        if e.error_type == InfinoError.Type.REQUEST:
            # HTTP-level failure: inspect the status code
            if e.status_code() == 404:
                print("Record not found")
            elif e.status_code() == 403:
                print("Access denied - check user permissions")
            elif e.status_code() == 401:
                print("Authentication failed")
        elif e.error_type == InfinoError.Type.NETWORK:
            print(f"Network error: {e.message}")
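When the same status handling is needed in several places, the branches above can be collapsed into a lookup. A minimal sketch: `describe_status` is a hypothetical helper, the three messages come from the example above, and the handling of 5xx and other codes is an added assumption.

```python
def describe_status(status_code: int) -> str:
    """Map an HTTP status code to a short, user-facing message."""
    messages = {
        401: "Authentication failed",
        403: "Access denied - check user permissions",
        404: "Record not found",
    }
    if status_code in messages:
        return messages[status_code]
    if 500 <= status_code < 600:
        # Assumption: server-side errors are worth retrying.
        return "Server error - retrying may help"
    return f"Request failed with HTTP {status_code}"


for code in (401, 403, 404, 503, 418):
    print(code, describe_status(code))
```

Inside an `except InfinoError as e` block this could be called as `describe_status(e.status_code())` for request errors.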
Advanced Configuration
SDK Initialization
from infino_sdk import InfinoSDK, RetryConfig
# Standard initialization
sdk = InfinoSDK(
access_key="your_access_key",
secret_key="your_secret_key",
endpoint="https://api.infino.ws"
)
# Using class methods (alternative constructors)
sdk = InfinoSDK.new(access_key, secret_key, endpoint)
# With custom retry configuration
retry_config = RetryConfig()
retry_config.initial_interval = 500
retry_config.max_retries = 5
sdk = InfinoSDK.new_with_retry(
access_key,
secret_key,
endpoint,
retry_config
)
Context Manager Usage
from infino_sdk import InfinoSDK
# Automatic resource cleanup with context manager
with InfinoSDK(access_key, secret_key, endpoint) as sdk:
    results = sdk.query_dataset_in_sql("SELECT * FROM products")
    print(f"Found {len(results)} records")
# Session automatically closed on exit
# Manual resource management
sdk = InfinoSDK(access_key, secret_key, endpoint)
try:
    results = sdk.query_dataset_in_sql("SELECT * FROM products")
finally:
    sdk.close()  # Explicitly close session
Custom Retry Configuration
from infino_sdk import InfinoSDK, RetryConfig
retry_config = RetryConfig()
retry_config.initial_interval = 500 # milliseconds
retry_config.max_interval = 30000 # milliseconds
retry_config.max_retries = 5
retry_config.max_elapsed_time = 180000 # milliseconds
sdk = InfinoSDK(
access_key=access_key,
secret_key=secret_key,
endpoint=endpoint,
retry_config=retry_config
)
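To get a feel for what these four settings imply, the sketch below computes a plausible wait schedule. It is an illustration only: it assumes exponential growth with a multiplier of 2 and no jitter, neither of which this README states, and `backoff_schedule` is a hypothetical helper rather than SDK code. The SDK's actual retry timing may differ.

```python
from typing import List


def backoff_schedule(
    initial_interval: int,
    max_interval: int,
    max_retries: int,
    max_elapsed_time: int,
    multiplier: float = 2.0,
) -> List[int]:
    """Return the wait (in ms) before each retry under simple exponential backoff."""
    delays: List[int] = []
    delay = float(initial_interval)
    elapsed = 0.0
    for _ in range(max_retries):
        wait = min(delay, max_interval)  # Cap each wait at max_interval
        if elapsed + wait > max_elapsed_time:
            break  # Stop once the total time budget would be exceeded
        delays.append(int(wait))
        elapsed += wait
        delay *= multiplier
    return delays


schedule = backoff_schedule(
    initial_interval=500,
    max_interval=30000,
    max_retries=5,
    max_elapsed_time=180000,
)
print(schedule)  # [500, 1000, 2000, 4000, 8000]
```

Under these assumptions the values above allow five retries spread over about 15.5 seconds, well inside the 180-second budget.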
Logging
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("infino_sdk")
logger.setLevel(logging.DEBUG)
# SDK logs all requests and signing operations
sdk = InfinoSDK(access_key, secret_key, endpoint)
sdk.ping()
Examples
Complete working examples organized by workflow:
Connect Examples
- basic_query.py - Query data sources with Query DSL
- file_upload.py - Upload JSON, JSONL, and CSV files
Query Examples
- sql_analytics.py - SQL queries across sources
- fino_nl.py - Natural language with Fino AI
- promql_metrics.py - PromQL time-series queries
Correlate Examples
- upload_json.py - Pull data together for cross-source analysis
Govern Examples
- user_management.py - Centralized access control
Utilities
- error_handling.py - Robust error handling patterns
Requirements
- Python 3.8 or higher
- aiohttp >= 3.8.0
- websockets >= 10.0
- backoff >= 2.0.0
Development
See CONTRIBUTING.md for development setup and contribution guidelines.
# Clone repository
git clone https://github.com/infinohq/infino-sdk.git
cd infino-sdk
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest
# Run linter
flake8 infino_sdk tests
# Check types
mypy infino_sdk
Support
- 📧 Email: support@infino.ai
- 📖 Documentation: docs.infino.ai
- 🐛 Issues: GitHub Issues
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Changelog
See CHANGELOG.md for version history and release notes.