Text2Everything SDK
The official Python SDK for the Text2Everything API, providing easy access to text-to-SQL conversion, project management, and data operations.
Features
- Unified Client: Single entry point for all API operations
- Type Safety: Full Pydantic model integration with IDE support
- Error Handling: Comprehensive exception hierarchy with detailed error information
- Retry Logic: Automatic retry with exponential backoff for failed requests
- Pagination: Automatic handling of paginated responses
- Resource Management: Organized clients for each API resource type
- Context Manager: Proper resource cleanup with context manager support
- Custom Tools: Upload and manage custom Python tools with directory-based creation
- Multipart File Uploads: Native support for file uploads with proper Content-Type handling
- Nested Validation: Comprehensive schema validation with nested field requirements
- Environment Configuration: Support for .env files for easy local development setup
Installation
Install from PyPI:
pip install h2o-text-2-everything
# With optional dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "h2o-text-2-everything[integrations]" # pandas, jupyter, h2o-drive
pip install "h2o-text-2-everything[dev]" # development tools
pip install "h2o-text-2-everything[docs]" # documentation tools
For development installation and other options, see INSTALLATION.md
Quick Start
from text2everything_sdk import Text2EverythingClient
# Initialize the client
client = Text2EverythingClient(
base_url="https://your-api-endpoint.com",
access_token="your-access-token",
workspace_name="workspaces/my-workspace"
)
# Create a project
project = client.projects.create(
name="My Project",
description="A sample project for text-to-SQL conversion"
)
# Add context information
context = client.contexts.create(
project_id=project.id,
name="Business Rules",
content="Important business context and rules...",
is_always_displayed=True
)
# Add schema metadata
schema = client.schema_metadata.create(
project_id=project.id,
name="Customers Table",
description="Customer information table",
schema_data={
"table": {
"name": "customers",
"columns": [
{"name": "id", "type": "INTEGER"},
{"name": "name", "type": "VARCHAR(100)"},
{"name": "email", "type": "VARCHAR(255)"}
]
}
}
)
# Create a chat session
session = client.chat_sessions.create(project_id=project.id)
# Generate SQL for a query
response = client.chat.chat_to_sql(
project_id=project.id,
chat_session_id=session.id,
query="Show me all customers from California",
)
print(f"Generated SQL: {response.sql_query}")
API Resources
The SDK provides clients for all Text2Everything API resources:
Projects
# List projects
projects = client.projects.list()
# Get project by ID
project = client.projects.get("project_id")
# Create project
project = client.projects.create(name="New Project")
# Update project
project = client.projects.update("project_id", name="Updated Name")
# Delete project
client.projects.delete("project_id")
Contexts
# Add business context
context = client.contexts.create(
project_id="project_id",
name="Business Rules",
content="Context content...",
is_always_displayed=True
)
# List contexts for a project
contexts = client.contexts.list(project_id="project_id")
Schema Metadata
# Add table schema
table = client.schema_metadata.create(
project_id="project_id",
name="Users Table",
schema_data={
"table": {
"name": "users",
"columns": [...]
}
}
)
# Add dimension
dimension = client.schema_metadata.create(
project_id="project_id",
name="User Status",
schema_data={
"table": {
"dimension": {
"name": "status",
"content": {...}
}
}
}
)
Golden Examples
# Add example query-SQL pairs
example = client.golden_examples.create(
project_id="project_id",
name="High Value Customers",
user_query="Show me customers with orders over $1000",
sql_query="SELECT * FROM customers WHERE total_orders > 1000",
description="Example for high-value customer queries"
)
Chat Sessions and Chat
# Create chat session
session = client.chat_sessions.create(project_id="project_id")
# Convert natural language to SQL
resp = client.chat.chat_to_sql(
project_id="project_id",
chat_session_id=session.id,
query="Your natural language query here",
)
# Or convert and execute
ans = client.chat.chat_to_answer(
project_id="project_id",
chat_session_id=session.id,
query="Top 10 customers by revenue",
connector_id="your-connector-id"
)
Custom Tools
# Create custom tool from individual files
with open("my_tool.py", "rb") as f:
    tool = client.custom_tools.create(
        name="My Custom Tool",
        description="A custom Python tool for data processing",
        files=[f]
    )
# Create custom tool from directory (uploads all Python files)
tool = client.custom_tools.create_from_directory(
name="Data Processing Suite",
description="Complete data processing toolkit",
directory_path="/path/to/tool/directory"
)
# List custom tools
tools = client.custom_tools.list()
# Get custom tool details
tool = client.custom_tools.get("tool_id")
# Update custom tool
updated_tool = client.custom_tools.update(
"tool_id",
name="Updated Tool Name",
description="Updated description"
)
# Delete custom tool
client.custom_tools.delete("tool_id")
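Before uploading a directory, it can be useful to preview which Python files it contains. The helper below is a minimal sketch and not part of the SDK: `collect_python_files` is a hypothetical name, and it assumes a recursive search for `*.py` files, which may differ from what `create_from_directory` actually selects.

```python
from pathlib import Path


def collect_python_files(directory_path):
    """Return sorted relative paths of all .py files under directory_path.

    Hypothetical preview helper; the SDK's own selection rules may
    differ (for example, it may not recurse into subdirectories).
    """
    root = Path(directory_path)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory_path}")
    return sorted(str(p.relative_to(root)) for p in root.rglob("*.py"))
```

Printing the returned list before the upload makes it easier to spot stray scripts or missing modules.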
Connectors
# Add database connector
connector = client.connectors.create(
name="Production DB",
db_type="postgres",
host="localhost",
port=5432,
username="user",
password="password",
database="mydb"
)
# Test connection
result = client.connectors.test_connection(connector.id)
Bulk Operations
The SDK provides efficient bulk delete operations for managing multiple resources at once:
Bulk Delete Contexts
# Delete multiple contexts in one operation
context_ids = ["id1", "id2", "id3"]
result = client.contexts.bulk_delete(project_id="project_id", context_ids=context_ids)
print(f"Deleted: {result['deleted_count']}")
print(f"Failed: {result.get('failed_ids', [])}")
Bulk Delete Schema Metadata
# Bulk delete schemas (automatically handles split groups)
schema_ids = ["schema1", "schema2", "schema3"]
result = client.schema_metadata.bulk_delete(project_id="project_id", schema_ids=schema_ids)
# Returns structured response with success/failure details
print(f"Successfully deleted {result['deleted_count']} schemas")
Bulk Delete Golden Examples
# Delete multiple examples at once
example_ids = ["ex1", "ex2", "ex3"]
result = client.golden_examples.bulk_delete(project_id="project_id", example_ids=example_ids)
Bulk Delete Feedback
# Clean up multiple feedback items
feedback_ids = ["fb1", "fb2", "fb3"]
result = client.feedback.bulk_delete(project_id="project_id", feedback_ids=feedback_ids)
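For very long ID lists, one option is to send the IDs in smaller batches and merge the responses. The following is a sketch under assumptions: `bulk_delete_in_batches` is a hypothetical helper, the batch size of 100 is illustrative rather than a documented limit, and the response is assumed to be a dict with `deleted_count` and an optional `failed_ids`, as in the contexts example above.

```python
def bulk_delete_in_batches(delete_fn, ids, batch_size=100):
    """Call delete_fn on slices of ids and merge the results.

    delete_fn takes a list of IDs and returns a dict with
    'deleted_count' and, optionally, 'failed_ids'.
    batch_size is an illustrative value, not a documented API limit.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    summary = {"deleted_count": 0, "failed_ids": []}
    for start in range(0, len(ids), batch_size):
        result = delete_fn(ids[start:start + batch_size])
        summary["deleted_count"] += result["deleted_count"]
        summary["failed_ids"].extend(result.get("failed_ids", []))
    return summary


# Possible usage with the contexts client:
# summary = bulk_delete_in_batches(
#     lambda batch: client.contexts.bulk_delete(
#         project_id="project_id", context_ids=batch
#     ),
#     context_ids,
# )
```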
Chat Presets
Chat presets allow you to create reusable chat configurations with predefined settings, connectors, and prompt templates:
Creating and Managing Presets
# Create a basic chat preset with existing template
preset = client.chat_presets.create(
project_id="project_id",
name="Production Analytics",
collection_name="analytics_collection",
description="Preset for production data analysis",
prompt_template_id="template_id",
connector_id="connector_id",
chat_settings={
"llm": "gpt-4",
"include_chat_history": "auto"
}
)
# NOTE: Inline template creation - API limitation
# The prompt_template parameter is accepted for API parity but not currently processed.
# To use a custom template, create it first then reference by ID:
template = client.chat_presets.create_prompt_template(
project_id="project_id",
name="Custom Analytics Template",
system_prompt="You are an expert data analyst specializing in...",
description="Template for advanced analytics queries"
)
preset = client.chat_presets.create(
project_id="project_id",
name="Advanced Analytics",
collection_name="advanced_collection",
prompt_template_id=template["id"], # Use the created template ID
connector_id="connector_id",
workspace_id="workspace_123"
)
# Create preset with sharing and workspace settings
preset = client.chat_presets.create(
project_id="project_id",
name="Shared Team Preset",
collection_name="team_collection",
prompt_template={
"name": "Team Template",
"system_prompt": "You are a helpful assistant for the team..."
},
share_prompt_with_usernames=["user1@example.com", "user2@example.com"],
workspace_id="workspace_123",
t2e_url="https://custom-t2e.example.com"
)
# List all presets
presets = client.chat_presets.list(project_id="project_id")
# Search for specific presets
support_presets = client.chat_presets.list(
project_id="project_id",
search="support"
)
# Get specific preset by collection ID
preset = client.chat_presets.get(
project_id="project_id",
collection_id="collection_id"
)
# Update preset
updated = client.chat_presets.update(
project_id="project_id",
collection_id="collection_id",
name="Updated Analytics Preset",
description="Updated description",
chat_settings={
"llm": "gpt-4-turbo",
"include_chat_history": "true"
}
)
# Delete preset
client.chat_presets.delete(
project_id="project_id",
collection_id="collection_id"
)
Managing Prompt Templates
# Add prompt template to preset
template = client.chat_presets.add_prompt_template(
project_id="project_id",
preset_id="preset_id",
template_name="Analysis Template",
template_content="Analyze the following data: {query}"
)
# List templates for a preset
templates = client.chat_presets.list_prompt_templates(
project_id="project_id",
preset_id="preset_id"
)
# Delete template
client.chat_presets.delete_prompt_template(
project_id="project_id",
preset_id="preset_id",
template_id="template_id"
)
Using Presets in Chat Sessions
# Activate a preset for use
client.chat_presets.activate(project_id="project_id", preset_id="preset_id")
# Get currently active preset
active = client.chat_presets.get_active(project_id="project_id")
# Create chat session from preset
session = client.chat_sessions.create_from_preset(
project_id="project_id",
preset_id="preset_id"
)
# Or use the active preset
session = client.chat_sessions.create_from_active_preset(project_id="project_id")
Advanced Features
Project Collections
Access and manage H2OGPTE collections for your project resources:
# List all collections for a project
collections = client.projects.list_collections(project_id="project_id")
for collection in collections:
    print(f"{collection.component_type}: {collection.h2ogpte_collection_id}")
# Get collection by type
contexts_collection = client.projects.get_collection_by_type(
project_id="project_id",
component_type="contexts"
)
Execution Cache Lookup
Query the execution cache to find similar past queries for performance optimization:
# Look up cached executions for a query
cache_result = client.chat.execution_cache_lookup(
project_id="project_id",
user_query="Show me top 10 customers",
connector_id="connector_id",
similarity_threshold=0.8, # 0.0 to 1.0
top_n=5, # Return top 5 matches
only_positive_feedback=True # Only include positively rated executions
)
# Check if we got a cache hit
if cache_result.cache_hit:
    print(f"Found {len(cache_result.matches)} similar executions")
    for match in cache_result.matches:
        print(f"Similarity: {match.similarity_score}")
        print(f"SQL: {match.execution.sql_query}")
        print(f"Results: {match.execution.results}")
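A common follow-up is to reuse the SQL from the closest match instead of generating it again. The sketch below relies only on the attributes shown in the lookup example (`cache_hit`, `matches`, `similarity_score`, `execution.sql_query`); `best_cached_sql` is a hypothetical helper and the `min_score` cut-off is an illustrative value, not an SDK default.

```python
def best_cached_sql(cache_result, min_score=0.9):
    """Return the SQL of the highest-scoring match, or None.

    min_score is an illustrative cut-off chosen for this sketch.
    """
    if not cache_result.cache_hit or not cache_result.matches:
        return None
    best = max(cache_result.matches, key=lambda m: m.similarity_score)
    if best.similarity_score < min_score:
        return None
    return best.execution.sql_query
```

When the helper returns `None`, the caller would fall back to `chat_to_sql` as usual.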
Schema Splitting for Large Tables
Tables with more than 8 columns are automatically split into multiple parts. The create() method returns:
- Single SchemaMetadataResponse for small schemas (≤8 columns)
- List[SchemaMetadataResponse] for large schemas (>8 columns)
result = client.schema_metadata.create(
project_id="project_id",
name="My Table",
schema_data=my_schema_data
)
# Always check the return type
if isinstance(result, list):
    print(f"Schema split into {len(result)} parts")
    # All parts share the same split_group_id
else:
    print(f"Created single schema: {result.id}")
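Because the return type depends on the column count, calling code is often simpler if it always works with a list. The two helpers below are a sketch with hypothetical names: `as_schema_list` normalizes either return shape, and `expects_split` makes a local guess based on the 8-column threshold described above. The API, not this check, decides whether a split actually happens.

```python
def as_schema_list(result):
    """Normalize a create() result to a list of schema parts."""
    return result if isinstance(result, list) else [result]


def expects_split(schema_data, max_columns=8):
    """Guess whether a table schema exceeds the column threshold.

    Local estimate only; it simply counts schema_data["table"]["columns"].
    """
    columns = schema_data.get("table", {}).get("columns", [])
    return len(columns) > max_columns
```

With these, `for part in as_schema_list(result): ...` covers both the single and the split case.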
📖 For complete documentation on working with split schemas, see:
- docs/guides/schema_metadata.md - Basic split handling
- docs/how-to/bulk_operations.md - Bulk operations with splits
Error Handling
The SDK provides comprehensive error handling:
from text2everything_sdk import (
Text2EverythingClient,
AuthenticationError,
ValidationError,
NotFoundError,
RateLimitError
)
try:
    project = client.projects.get("invalid_id")
except NotFoundError as e:
    print(f"Project not found: {e.message}")
except AuthenticationError as e:
    print(f"Authentication failed: {e.message}")
except ValidationError as e:
    print(f"Validation error: {e.message}")
    print(f"Details: {e.response_data}")
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after: {e.retry_after} seconds")
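Since `RateLimitError` exposes `retry_after`, a caller can wait that long and try again. The wrapper below is a minimal sketch, not SDK functionality: `call_with_rate_limit_retry` is a hypothetical name, the exception class is passed in so the snippet stays self-contained, and `default_wait` is an assumed fallback for when `retry_after` is missing. Whether the SDK's built-in retry already covers rate-limit responses is not stated here, so this may be redundant in practice.

```python
import time


def call_with_rate_limit_retry(fn, rate_limit_error, max_attempts=3,
                               default_wait=1.0, sleep=time.sleep):
    """Call fn(), waiting and retrying when rate_limit_error is raised.

    rate_limit_error is the exception class to catch, for example the
    SDK's RateLimitError. The wait uses the exception's retry_after
    attribute when present, otherwise default_wait (an assumption).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except rate_limit_error as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            wait = getattr(exc, "retry_after", None)
            sleep(default_wait if wait is None else wait)


# Possible usage:
# project = call_with_rate_limit_retry(
#     lambda: client.projects.get("project_id"), RateLimitError
# )
```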
Configuration
Environment Variables
You can configure the SDK using environment variables:
export TEXT2EVERYTHING_BASE_URL="https://your-api-endpoint.com"
export T2E_ACCESS_TOKEN="your-oidc-access-token"
export T2E_WORKSPACE_NAME="workspaces/my-workspace"
import os
from text2everything_sdk import Text2EverythingClient
client = Text2EverythingClient(
base_url=os.getenv("TEXT2EVERYTHING_BASE_URL"),
access_token=os.getenv("T2E_ACCESS_TOKEN"),
workspace_name=os.getenv("T2E_WORKSPACE_NAME")
)
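With `os.getenv`, an unset variable silently becomes `None`. A small guard can turn that into a clear error before the client is constructed. This is a sketch with a hypothetical helper name, `client_kwargs_from_env`, using the three variable names exported above.

```python
import os

# Maps client keyword arguments to the variable names exported above.
REQUIRED_VARS = {
    "base_url": "TEXT2EVERYTHING_BASE_URL",
    "access_token": "T2E_ACCESS_TOKEN",
    "workspace_name": "T2E_WORKSPACE_NAME",
}


def client_kwargs_from_env(env=None):
    """Build client keyword arguments from the environment.

    Raises RuntimeError listing every unset or empty variable instead
    of passing None values on to the client.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS.values() if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[name] for key, name in REQUIRED_VARS.items()}


# Possible usage:
# client = Text2EverythingClient(**client_kwargs_from_env())
```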
.env File Support
For local development, create a .env file in your project root:
# .env file
T2E_BASE_URL=https://your-api-endpoint.com
T2E_ACCESS_TOKEN=your-oidc-access-token
T2E_WORKSPACE_NAME=workspaces/my-workspace
The SDK will automatically load these variables when running tests:
# The SDK automatically loads .env files for testing
from text2everything_sdk import Text2EverythingClient
# These will be loaded from .env file automatically
client = Text2EverythingClient()
Advanced Configuration
client = Text2EverythingClient(
base_url="https://your-api-endpoint.com",
access_token="your-oidc-access-token",
workspace_name="workspaces/my-workspace",
timeout=60, # Request timeout in seconds
max_retries=5, # Maximum retry attempts
retry_delay=2.0 # Initial retry delay in seconds
)
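To get a feel for how `max_retries` and `retry_delay` interact, the sketch below computes the wait before each retry under plain exponential backoff. The doubling factor is an assumption made for illustration; the SDK's actual multiplier, jitter, and caps are not documented in this README, and `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(retry_delay=2.0, max_retries=5, factor=2.0):
    """Return the wait in seconds before each retry.

    factor=2.0 (doubling) is an assumption, not a documented SDK value.
    """
    return [retry_delay * factor ** attempt for attempt in range(max_retries)]
```

If the SDK does double the delay each time, the settings above would give waits of 2, 4, 8, 16, and 32 seconds, so a request that keeps failing could block for about a minute on top of the per-request timeout.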
Context Manager
Use the client as a context manager for proper resource cleanup:
with Text2EverythingClient(base_url="...", access_token="...", workspace_name="workspaces/dev") as client:
    projects = client.projects.list()
# Client will be automatically closed when exiting the context
Pagination
The SDK automatically handles pagination for list operations:
# Get all projects (automatically handles pagination)
all_projects = client.projects.list()
# Manual pagination control
page1_projects = client.projects.list(page=1, per_page=10)
page2_projects = client.projects.list(page=2, per_page=10)
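When manual control is preferred, the `page` and `per_page` parameters can be wrapped in a generator so that only one page is held in memory at a time. This is a sketch under assumptions: `iter_all` is a hypothetical helper, pages are taken to be 1-indexed as in the example above, each call is assumed to return a list, and a page shorter than `per_page` is treated as the last one.

```python
def iter_all(list_fn, per_page=10):
    """Yield items from list_fn(page=..., per_page=...) page by page.

    Stops after the first page that holds fewer than per_page items.
    """
    page = 1
    while True:
        items = list_fn(page=page, per_page=per_page)
        yield from items
        if len(items) < per_page:
            break
        page += 1


# Possible usage:
# for project in iter_all(client.projects.list, per_page=50):
#     print(project.id)
```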
Schema Validation
The SDK includes comprehensive nested field validation for schema metadata:
Required Nested Fields
Different schema types require specific nested fields:
- Tables: schema_metadata.table and schema_metadata.table.columns
- Dimensions: schema_metadata.table, schema_metadata.table.dimension, and schema_metadata.table.dimension.content
- Metrics: schema_metadata.table, schema_metadata.table.metric, and schema_metadata.table.metric.content
- Relationships: schema_metadata.relationship
Validation Examples
# Valid table schema
table_schema = {
"table": {
"name": "customers",
"columns": [
{"name": "id", "type": "INTEGER"},
{"name": "name", "type": "VARCHAR(100)"}
]
}
}
# Valid dimension schema
dimension_schema = {
"table": {
"name": "customers",
"dimension": {
"name": "customer_status",
"content": {
"type": "categorical",
"values": ["active", "inactive", "pending"]
}
}
}
}
# Valid metric schema
metric_schema = {
"table": {
"name": "orders",
"metric": {
"name": "total_revenue",
"content": {
"aggregation": "sum",
"column": "amount"
}
}
}
}
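The required nested fields listed earlier can also be checked locally before sending a request. The following is a rough sketch, not the SDK's validator: `REQUIRED_PATHS` and `missing_fields` are hypothetical names, and the paths are written relative to the `schema_data` dictionary passed to `create()`.

```python
# Required key paths per schema type, relative to schema_data.
REQUIRED_PATHS = {
    "table": [("table",), ("table", "columns")],
    "dimension": [("table",), ("table", "dimension"),
                  ("table", "dimension", "content")],
    "metric": [("table",), ("table", "metric"),
               ("table", "metric", "content")],
    "relationship": [("relationship",)],
}


def missing_fields(schema_data, kind):
    """Return dotted paths required for kind that are absent from schema_data."""
    missing = []
    for path in REQUIRED_PATHS[kind]:
        node = schema_data
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing
```

An empty list means the basic structure is present; the SDK and API may still apply further checks that this sketch does not cover.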
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run the test suite
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Documentation: https://h2oai.github.io/text-2-everything-py/
- Issues: GitHub Issues
- Email: support@h2o.ai
Changelog
v0.1.7 (Current)
- 100% API Parity Achieved: Complete coverage of all Text2Everything API endpoints
- Bulk Delete Operations: Added bulk delete support for contexts, schema metadata, golden examples, and feedback
- Chat Presets: Full CRUD operations for chat presets with prompt templates and active preset management
- Project Collections: List and retrieve project collections by type
- Execution Cache Lookup: Query execution cache for performance optimization
- Schema Split Groups: Automatic handling of large table schemas (>8 columns)
- Custom Tools Support: Full CRUD operations for custom Python tools
- Directory-based Tool Creation: Upload entire directories as custom tools
- Multipart File Upload: Native support for file uploads with proper Content-Type handling
- Enhanced Validation: Comprehensive nested field validation for schema metadata
- Environment Configuration: Added .env file support for local development
- Improved Testing: Enhanced test suite with automatic environment loading
- Bug Fixes: Resolved Content-Type header conflicts in multipart requests
v0.1.0
- Initial release
- Complete API coverage for all Text2Everything endpoints
- Type-safe Pydantic models
- Comprehensive error handling
- Automatic pagination and retry logic
- Context manager support
- Integration with existing H2O Drive SDK