Production-ready Python SDK for the OHDSI Athena Concepts API
athena-client
A production-ready Python SDK for the OHDSI Athena Concepts API.
Installation
pip install athena-client
With optional dependencies:
pip install "athena-client[cli]"     # Command-line interface
pip install "athena-client[async]"   # Async client
pip install "athena-client[pandas]"  # DataFrame output support
pip install "athena-client[yaml]"    # YAML output format
pip install "athena-client[crypto]"  # HMAC authentication
pip install "athena-client[all]"     # All optional dependencies
Quick Start
from athena_client import Athena
# Create a client with default settings (public Athena server)
athena = Athena()
# Search for concepts
results = athena.search("aspirin")
# Various output formats
concepts = results.all() # List of Pydantic models
top_three = results.top(3) # First three results
as_dict = results.to_list() # List of dictionaries
as_json = results.to_json() # JSON string
as_df = results.to_df() # pandas DataFrame
# Get details for a specific concept
details = athena.details(concept_id=1127433)
# Get relationships
rels = athena.relationships(concept_id=1127433)
# Get graph
graph = athena.graph(concept_id=1127433, depth=5)
# Get comprehensive summary
summary = athena.summary(concept_id=1127433)
Concept Exploration - Finding Standard Concepts
The athena-client provides advanced concept exploration capabilities to help you find standard concepts that might not appear directly in search results. This is particularly useful when working with medical terminology where standard concepts may be referenced through synonyms, relationships, or cross-references.
Why Concept Exploration?
Medical terminology systems often have complex hierarchies where:
- Standard concepts are the preferred, canonical representations
- Non-standard concepts may be more commonly used terms
- Synonyms provide alternative names for the same concept
- Relationships connect related concepts across vocabularies
- Cross-references map concepts between different coding systems
The concept exploration functionality helps bridge the gap between user queries and standard medical concepts.
Basic Concept Exploration
from athena_client import Athena, create_concept_explorer
# Create client and explorer
athena = Athena()
explorer = create_concept_explorer(athena)
# Find standard concepts through exploration
results = explorer.find_standard_concepts(
    query="headache",
    max_exploration_depth=2,
    include_synonyms=True,
    include_relationships=True,
    vocabulary_priority=['SNOMED', 'RxNorm', 'ICD10']
)
print(f"Direct matches: {len(results['direct_matches'])}")
print(f"Synonym matches: {len(results['synonym_matches'])}")
print(f"Relationship matches: {len(results['relationship_matches'])}")
print(f"Cross-references: {len(results['cross_references'])}")
Mapping to Standard Concepts with Confidence Scores
# Map a query to standard concepts with confidence scoring
mappings = explorer.map_to_standard_concepts(
    query="migraine",
    target_vocabularies=['SNOMED', 'RxNorm'],
    confidence_threshold=0.5
)
for mapping in mappings:
    concept = mapping['concept']
    confidence = mapping['confidence']
    path = mapping['exploration_path']
    print(f"Concept: {concept.name}")
    print(f"Vocabulary: {concept.vocabulary}")
    print(f"Confidence: {confidence:.2f}")
    print(f"Discovery path: {path}")
    print()
Alternative Query Suggestions
When standard concepts aren't found directly, get alternative query suggestions:
# Get alternative query suggestions
suggestions = explorer.suggest_alternative_queries(
    query="heart attack",
    max_suggestions=8
)
print("Alternative suggestions:")
for suggestion in suggestions:
    print(f" - {suggestion}")
# Test a suggestion
test_results = athena.search(suggestions[0], size=5)
standard_concepts = [c for c in test_results.all() if c.standardConcept == "Standard"]
print(f"Found {len(standard_concepts)} standard concepts")
Concept Hierarchy Exploration
Explore the hierarchical relationships of concepts:
# Get concept hierarchy
hierarchy = explorer.get_concept_hierarchy(
    concept_id=12345,
    max_depth=3
)
print(f"Root concept: {hierarchy['root_concept'].name}")
print(f"Parent relationships: {len(hierarchy['parents'])}")
print(f"Child relationships: {len(hierarchy['children'])}")
print(f"Sibling relationships: {len(hierarchy['siblings'])}")
# Show parent concepts
for parent in hierarchy['parents'][:3]:
    print(f" Parent: {parent.targetConceptName} ({parent.relationshipName})")
Comprehensive Workflow Example
Here's a complete workflow for finding standard concepts:
def find_standard_concepts_workflow(query):
    """Comprehensive workflow for finding standard concepts."""
    # Step 1: Try direct search first
    direct_results = athena.search(query, size=10)
    direct_standard = [c for c in direct_results.all() if c.standardConcept == "Standard"]
    if direct_standard:
        print(f"✅ Found {len(direct_standard)} standard concepts directly")
        return direct_standard
    # Step 2: Use concept exploration
    print("🔍 Exploring for standard concepts...")
    exploration_results = explorer.find_standard_concepts(
        query=query,
        max_exploration_depth=3,
        include_synonyms=True,
        include_relationships=True
    )
    # Step 3: Get high-confidence mappings
    mappings = explorer.map_to_standard_concepts(
        query=query,
        confidence_threshold=0.4
    )
    if mappings:
        print(f"✅ Found {len(mappings)} high-confidence mappings")
        return [m['concept'] for m in mappings]
    # Step 4: Try alternative queries
    print("💡 Trying alternative queries...")
    suggestions = explorer.suggest_alternative_queries(query, max_suggestions=5)
    for suggestion in suggestions:
        test_results = athena.search(suggestion, size=5)
        standard_found = [c for c in test_results.all() if c.standardConcept == "Standard"]
        if standard_found:
            print(f"✅ Found standard concepts with suggestion: '{suggestion}'")
            return standard_found
    print("❌ No standard concepts found")
    return []
# Use the workflow
standard_concepts = find_standard_concepts_workflow("myocardial infarction")
Advanced Configuration
Configure exploration behavior for your specific needs:
# Create the explorer (same factory as in the basic example above)
explorer = create_concept_explorer(athena)
# Comprehensive exploration with all features
results = explorer.find_standard_concepts(
    query="diabetes",
    max_exploration_depth=3,     # How deep to explore relationships
    include_synonyms=True,       # Explore synonyms
    include_relationships=True,  # Explore relationships
    vocabulary_priority=[        # Preferred vocabularies
        'SNOMED',
        'RxNorm',
        'ICD10',
        'LOINC'
    ]
)
# High-confidence mapping with specific vocabularies
mappings = explorer.map_to_standard_concepts(
    query="hypertension",
    target_vocabularies=['SNOMED', 'ICD10'],  # Only these vocabularies
    confidence_threshold=0.7                  # High confidence threshold
)
Use Cases
1. Clinical Decision Support
# Find standard concepts for clinical conditions
conditions = ["chest pain", "shortness of breath", "fever"]
standard_concepts = {}
for condition in conditions:
    mappings = explorer.map_to_standard_concepts(
        condition,
        target_vocabularies=['SNOMED'],
        confidence_threshold=0.6
    )
    if mappings:
        standard_concepts[condition] = mappings[0]['concept']
2. Medication Mapping
# Map medication names to standard drug concepts
medications = ["aspirin", "ibuprofen", "acetaminophen"]
drug_concepts = {}
for med in medications:
    mappings = explorer.map_to_standard_concepts(
        med,
        target_vocabularies=['RxNorm'],
        confidence_threshold=0.5
    )
    if mappings:
        drug_concepts[med] = mappings[0]['concept']
3. Cross-Vocabulary Mapping
# Map between different coding systems
# .all() returns a plain list, so the first hit can be indexed directly
icd10_concept = athena.search("diabetes", vocabulary="ICD10").all()[0]
snomed_mappings = explorer.map_to_standard_concepts(
    icd10_concept.name,
    target_vocabularies=['SNOMED'],
    confidence_threshold=0.7
)
Best Practices
- Start with direct search - It's faster and often sufficient
- Use appropriate confidence thresholds - 0.5-0.7 for most use cases
- Specify target vocabularies - Focus on relevant coding systems
- Explore relationships - Useful for finding broader/narrower concepts
- Use synonyms - Helps with alternative terminology
- Monitor exploration depth - Balance thoroughness with performance
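The threshold and vocabulary advice above can also be applied locally to results you already have. The following is a minimal sketch, assuming each mapping is a dict with 'concept' and 'confidence' keys as in the earlier examples; the `select_mappings` helper and the sample data are hypothetical and not part of athena-client.

```python
from types import SimpleNamespace


def select_mappings(mappings, vocabularies=None, min_confidence=0.5):
    """Keep mappings at or above min_confidence, optionally restricted to
    the given vocabularies, sorted by descending confidence."""
    allowed = set(vocabularies) if vocabularies else None
    kept = [
        m for m in mappings
        if m["confidence"] >= min_confidence
        and (allowed is None or m["concept"].vocabulary in allowed)
    ]
    return sorted(kept, key=lambda m: m["confidence"], reverse=True)


# Stand-in data shaped like the mapping dicts shown earlier (values are made up).
sample = [
    {"concept": SimpleNamespace(name="Headache NOS", vocabulary="ICD10"), "confidence": 0.65},
    {"concept": SimpleNamespace(name="Headache", vocabulary="SNOMED"), "confidence": 0.9},
    {"concept": SimpleNamespace(name="Cephalalgia", vocabulary="SNOMED"), "confidence": 0.4},
]

best = select_mappings(sample, vocabularies=["SNOMED"], min_confidence=0.5)
print([m["concept"].name for m in best])
```

Filtering after the fact does not reduce API calls; passing target_vocabularies and confidence_threshold to the explorer itself remains the first choice.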
Performance Considerations
- Exploration depth affects performance - use 1-3 for most cases
- Vocabulary filtering reduces API calls and improves relevance
- Confidence thresholds help focus on high-quality matches
- Caching can be implemented for frequently used mappings
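One way to add the caching mentioned above is to wrap the mapping call in an in-memory cache. This is a minimal sketch built on the standard library's `functools.lru_cache`; `make_cached_mapper` and the `fake_map` stub are hypothetical names, and in real use the wrapped function would be something like `explorer.map_to_standard_concepts`.

```python
from functools import lru_cache


def make_cached_mapper(map_fn, maxsize=256):
    """Wrap a mapping function so repeated identical lookups are served from memory."""

    @lru_cache(maxsize=maxsize)
    def _cached(query, vocabularies, threshold):
        # Store an immutable tuple so callers cannot mutate the cached value.
        return tuple(map_fn(query, list(vocabularies), threshold))

    def mapper(query, target_vocabularies=(), confidence_threshold=0.5):
        # Lists are not hashable, so normalise to a sorted tuple for the cache key.
        key_vocabs = tuple(sorted(target_vocabularies))
        return list(_cached(query, key_vocabs, confidence_threshold))

    mapper.cache_info = _cached.cache_info
    return mapper


# Stand-in for the real mapping call (hypothetical stub that records its calls).
calls = []


def fake_map(query, vocabularies, threshold):
    calls.append(query)
    return [f"{query}:{v}" for v in vocabularies]


cached_map = make_cached_mapper(fake_map)
first = cached_map("aspirin", ["RxNorm"], 0.5)
second = cached_map("aspirin", ["RxNorm"], 0.5)  # served from the cache
print(first, len(calls))
```

Cached entries live for the lifetime of the process, so a long-running service may want to clear or expire them when the underlying vocabularies are updated.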
Error Handling During Exploration
The concept exploration functionality includes robust error handling:
try:
    mappings = explorer.map_to_standard_concepts("diabetes")
    print(f"Found {len(mappings)} mappings")
except Exception as e:
    print(f"Exploration failed: {e}")
    # Fall back to direct search
    results = athena.search("diabetes")
This concept exploration functionality helps ensure you can find the standard medical concepts you need, even when they don't appear directly in search results.
Error Handling
The athena-client provides automatic error handling and recovery out of the box. For most calls you don't need to write try/except blocks: the client retries transient failures itself and, when it cannot recover, reports the error with a clear, actionable message:
from athena_client import Athena
athena = Athena()
# Automatic error handling - no try/except needed!
results = athena.search("aspirin")
print(f"Found {len(results.all())} concepts")
# If there are network issues, the client automatically retries
# If there are API errors, you get clear, actionable messages
details = athena.details(concept_id=1127433)
print(f"Concept: {details.name}")
What Happens Automatically
✅ Network errors are automatically retried (up to 3 attempts)
✅ API errors provide clear, actionable messages
✅ Timeout issues are handled with exponential backoff
✅ Invalid parameters are caught with helpful suggestions
✅ Missing resources are reported with context
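To make the retry behaviour above concrete, here is a generic sketch of retrying with exponential backoff. It is not the library's implementation: `retry_with_backoff`, its parameters, the base delay of 0.5 seconds, and the reading of "3 attempts" as three retries after the first call are all assumptions made for illustration.

```python
import time


def retry_with_backoff(fn, max_retries=3, base_delay=0.5,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt and try again.

    max_retries counts retries after the first call. Once they are used up,
    the last error is re-raised to the caller.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except retry_on:
            if attempt >= max_retries:
                raise
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
            attempt += 1


# Demo: a function that fails twice, then succeeds. Waits are recorded, not slept.
waits = []
state = {"calls": 0}


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky, sleep=waits.append)
print(result, waits)
```

Injecting `sleep` keeps the sketch testable without real delays; production code would normally also add jitter so that many clients do not retry in lockstep.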
Advanced Error Handling (Optional)
If you want more control, you can still use try/except blocks:
from athena_client import Athena
from athena_client.exceptions import NetworkError, APIError, ClientError
athena = Athena()
try:
    results = athena.search("aspirin")
    print(f"Found {len(results.all())} concepts")
except NetworkError as e:
    print(f"Network issue: {e}")
    # Error includes troubleshooting suggestions
except APIError as e:
    print(f"API issue: {e}")
    # Specific API error messages with context
except ClientError as e:
    print(f"Client error: {e}")
    # HTTP 4xx errors with status codes
except Exception as e:
    print(f"Unexpected error: {e}")
Disabling Auto-Retry
If you prefer to handle retries yourself, you can disable automatic retry:
# Disable automatic retry for this call
results = athena.search("aspirin", auto_retry=False)
# Or disable for all calls
athena = Athena(max_retries=0)
Advanced Retry Configuration
Developers have fine-grained control over retry behavior:
# Configure retry settings at client level
athena = Athena(
    max_retries=5,                    # Maximum retry attempts
    retry_delay=2.0,                  # Fixed delay between retries (seconds)
    enable_throttling=True,           # Enable request throttling
    throttle_delay_range=(0.1, 0.5),  # Throttling delay range (min, max)
    timeout=30                        # Request timeout
)
# Override retry settings for specific calls
results = athena.search(
    "aspirin",
    max_retries=3,   # Override max retries for this call
    retry_delay=1.0  # Override retry delay for this call
)
Detailed Retry Error Reporting
When retries fail, you get comprehensive error information:
# Import location assumed: alongside the other exceptions shown above
from athena_client.exceptions import RetryFailedError
try:
    results = athena.search("aspirin")
except RetryFailedError as e:
    print(f"Retry failed after {e.max_attempts} attempts")
    print(f"Last error: {e.last_error}")
    print(f"Retry history: {e.retry_history}")
    # Error includes detailed retry information and troubleshooting
Retry Configuration Options
| Option | Description | Default | Example |
|---|---|---|---|
| max_retries | Maximum retry attempts for network errors | 3 | max_retries=5 |
| retry_delay | Fixed delay between retries in seconds (overrides exponential backoff) | None | retry_delay=2.0 |
| enable_throttling | Enable request throttling to prevent overwhelming the server | True | enable_throttling=False |
| throttle_delay_range | Range of delays for throttling (min, max) in seconds | (0.1, 0.3) | throttle_delay_range=(0.2, 0.5) |
| timeout | Request timeout in seconds | 15 | timeout=30 |
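As an illustration of what a throttling delay range means, the sketch below pauses for a random duration between the two bounds before each request. This is one plausible reading of throttle_delay_range, not the client's actual code; the `throttle` function and its injectable `sleep` and `uniform` parameters are hypothetical.

```python
import random
import time


def throttle(delay_range=(0.1, 0.3), sleep=time.sleep, uniform=random.uniform):
    """Pause for a random duration within delay_range (seconds) and return it."""
    low, high = delay_range
    if low < 0 or high < low:
        raise ValueError("delay_range must satisfy 0 <= min <= max")
    delay = uniform(low, high)
    sleep(delay)
    return delay


# Demo: record the pauses instead of actually sleeping.
pauses = []
for _ in range(3):
    throttle((0.2, 0.5), sleep=pauses.append)
print(len(pauses))
```

Randomising the pause spreads requests out so that a burst of calls does not hit the server at a fixed rhythm.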
Error Types
- NetworkError: DNS, connection, socket issues
- TimeoutError: Request timeout issues
- ClientError: 4xx HTTP status codes
- ServerError: 5xx HTTP status codes
- AuthenticationError: 401/403 authentication issues
- RateLimitError: 429 rate limiting issues
- ValidationError: Data validation failures
- APIError: API-specific error responses
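The HTTP-related categories in this list overlap (401 and 429 are also 4xx codes), so the most specific category has to be checked first. The sketch below shows that ordering with plain strings; `classify_status` is a hypothetical helper, and how the library itself dispatches on status codes is not documented here.

```python
def classify_status(status_code):
    """Return the error category name for an HTTP status code, or None if
    the code does not indicate an error."""
    if status_code in (401, 403):
        return "AuthenticationError"
    if status_code == 429:
        return "RateLimitError"
    if 400 <= status_code < 500:
        return "ClientError"
    if 500 <= status_code < 600:
        return "ServerError"
    return None


for code in (200, 401, 404, 429, 503):
    print(code, classify_status(code))
```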
Error Message Features
✅ Clear explanations of what went wrong
✅ Context about where the error occurred
✅ Specific troubleshooting suggestions
✅ Error codes for programmatic handling
✅ User-friendly language (not technical jargon)
✅ Automatic retry for recoverable errors
Enhanced Large Query Handling
The athena-client provides intelligent handling for large queries with enhanced timeouts, progress tracking, and user-friendly error messages.
Intelligent Timeout Management
Different operations use optimized timeouts based on query complexity:
from athena_client import Athena
# Default timeouts are automatically adjusted based on query size
athena = Athena()
# Small queries: 30s timeout
results = athena.search("aspirin 325mg tablet")
# Large queries: 45s+ timeout (auto-adjusted)
results = athena.search("pain") # Estimated 5000+ results
# Complex graphs: 60s+ timeout
graph = athena.graph(concept_id=1127433, depth=3, zoom_level=3)
Progress Tracking for Long Operations
Large queries automatically show progress bars with ETA:
# Progress tracking is enabled by default for large queries
results = athena.search("diabetes", size=100)
# Shows: Searching for 'diabetes': [██████████████████████████████] 100.0% (100/100) 2.3s
# Disable progress tracking if needed
results = athena.search("diabetes", show_progress=False)
User-Friendly Warnings
The client warns about potentially large queries:
results = athena.search("pain")
# Output:
# ⚠️ Large query detected: 'pain' (estimated 5,000+ results)
# 💡 Suggestions:
# • Add more specific terms to narrow results
# • Use domain or vocabulary filters
# • Consider using smaller page sizes
# • This query may take several minutes to complete
Smart Pagination
Enhanced pagination with automatic validation and optimization:
# Automatic page size validation
try:
    results = athena.search("aspirin", size=2000)  # Too large
except ValueError as e:
    print(e)  # "Page size 2000 exceeds maximum allowed size of 1000"
# Smart defaults based on query size
results = athena.search("pain") # Uses smaller page size for large queries
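The page size check can be pictured as a small guard that runs before any request is sent. The sketch below is an assumption about the shape of such a check, reusing the error message quoted above; `validate_page_size` is a hypothetical name, and the limit of 1000 comes from that message.

```python
def validate_page_size(size, max_size=1000):
    """Return size unchanged if it is a usable page size, else raise ValueError."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(size, bool) or not isinstance(size, int) or size < 1:
        raise ValueError("Page size must be a positive integer")
    if size > max_size:
        raise ValueError(
            f"Page size {size} exceeds maximum allowed size of {max_size}"
        )
    return size


print(validate_page_size(50))
try:
    validate_page_size(2000)
except ValueError as exc:
    print(exc)
```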
Enhanced Error Messages for Large Queries
Specific error messages for timeout and complexity issues:
try:
    results = athena.search("very broad search term")
except APIError as e:
    print(e)
    # Output:
    # Search timeout: The query 'very broad search term' is taking too long to process.
    # Try:
    # • Using more specific search terms
    # • Adding domain or vocabulary filters
    # • Reducing the page size
    # • Breaking the query into smaller parts
Configuration for Large Queries
Fine-tune large query behavior:
from athena_client.settings import get_settings
settings = get_settings()
# Timeout configuration
settings.ATHENA_SEARCH_TIMEOUT_SECONDS = 60 # Search operations
settings.ATHENA_GRAPH_TIMEOUT_SECONDS = 90 # Graph operations
settings.ATHENA_RELATIONSHIPS_TIMEOUT_SECONDS = 60 # Relationship queries
# Pagination configuration
settings.ATHENA_DEFAULT_PAGE_SIZE = 50 # Default page size
settings.ATHENA_MAX_PAGE_SIZE = 1000 # Maximum page size
settings.ATHENA_LARGE_QUERY_THRESHOLD = 100 # Threshold for "large" queries
# Progress configuration
settings.ATHENA_SHOW_PROGRESS = True # Enable progress tracking
settings.ATHENA_PROGRESS_UPDATE_INTERVAL = 2.0 # Update interval (seconds)
Large Query Best Practices
# 1. Use specific search terms
results = athena.search("acute myocardial infarction") # Better than "heart attack"
# 2. Add filters to narrow results
results = athena.search("diabetes", domain="Condition", vocabulary="SNOMED")
# 3. Use smaller page sizes for large queries
results = athena.search("pain", size=20) # Instead of 100
# 4. Enable progress tracking for visibility
results = athena.search("cancer", show_progress=True)
# 5. Monitor and adjust timeout settings
athena = Athena(timeout=60) # Increase timeout for complex operations
Large Query Features
✅ Automatic timeout adjustment based on query complexity
✅ Progress tracking with ETA for long operations
✅ User-friendly warnings for potentially large queries
✅ Smart pagination with automatic validation
✅ Enhanced error messages with specific suggestions
✅ Memory-efficient processing for large result sets
✅ Configurable thresholds for different query types
CLI Usage
# Install CLI dependencies
pip install "athena-client[cli]"
# Search for concepts
athena search "aspirin"
# Get details for a specific concept
athena details 1127433
# Get a summary with various output formats
athena summary 1127433 --output yaml
Configuration
The client can be configured through:
- Constructor arguments
- Environment variables
- A .env file
- Default values
# Explicit configuration
athena = Athena(
    base_url="https://custom.athena.server/api/v1",
    token="your-bearer-token",
    timeout=15,
    max_retries=5
)
Or use environment variables:
ATHENA_BASE_URL=https://custom.athena.server/api/v1
ATHENA_TOKEN=your-bearer-token
ATHENA_TIMEOUT_SECONDS=15
ATHENA_MAX_RETRIES=5
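With four configuration sources, it helps to think of each setting as being resolved through a fallback chain. The sketch below assumes the sources are consulted in the order listed above (constructor argument, environment variable, .env file, default); that precedence is an assumption, and `resolve_setting` is a hypothetical helper rather than the client's own loader.

```python
import os


def resolve_setting(name, explicit=None, dotenv=None, default=None, environ=None):
    """Resolve one setting: explicit argument first, then the environment,
    then values parsed from a .env file, then the default."""
    environ = os.environ if environ is None else environ
    dotenv = dotenv or {}
    if explicit is not None:
        return explicit
    if name in environ:
        return environ[name]
    if name in dotenv:
        return dotenv[name]
    return default


# Demo with stand-in dictionaries instead of the real environment and .env file.
fake_env = {"ATHENA_TIMEOUT_SECONDS": "15"}
fake_dotenv = {"ATHENA_TIMEOUT_SECONDS": "20", "ATHENA_MAX_RETRIES": "5"}

timeout = resolve_setting("ATHENA_TIMEOUT_SECONDS", environ=fake_env, dotenv=fake_dotenv, default=10)
retries = resolve_setting("ATHENA_MAX_RETRIES", environ=fake_env, dotenv=fake_dotenv, default=3)
base_url = resolve_setting("ATHENA_BASE_URL", explicit="https://custom.athena.server/api/v1", environ=fake_env)
print(timeout, retries, base_url)
```

Values read from the environment or a .env file arrive as strings, so a real loader would also convert them to the expected type.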
Advanced Query DSL
For complex queries, use the Query DSL:
from athena_client.query import Q
# Build complex queries
q = (Q.term("diabetes") & Q.term("type 2")) | Q.exact('"diabetic nephropathy"')
# Use with search
results = athena.search(q)
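For readers curious how a `&` / `|` query syntax like this can work, the sketch below builds a small expression tree with Python's operator overloading. It is a toy model, not the implementation of `athena_client.query.Q`; the `Node` class, the `term` helper, and the rendered AND/OR syntax are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str               # "term", "and", or "or"
    value: str = ""       # search text, used only by "term" nodes
    children: tuple = ()  # operands, used only by "and" / "or" nodes

    def __and__(self, other):
        return Node("and", children=(self, other))

    def __or__(self, other):
        return Node("or", children=(self, other))

    def render(self):
        """Render the tree as a parenthesised boolean expression."""
        if self.op == "term":
            return self.value
        joiner = " AND " if self.op == "and" else " OR "
        return "(" + joiner.join(child.render() for child in self.children) + ")"


def term(text):
    return Node("term", value=text)


expr = (term("diabetes") & term("type 2")) | term('"diabetic nephropathy"')
print(expr.render())
# ((diabetes AND type 2) OR "diabetic nephropathy")
```

Because `&` binds more tightly than `|` in Python, the parentheses around the first pair are optional, but keeping them makes the grouping explicit.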
Property-Based Tests
We use Hypothesis for edge-case discovery. New core utilities or parsers must include at least one Hypothesis scenario.
Modern Installation & Packaging
This project uses the modern Python packaging standard with pyproject.toml for build and dependency management. You do not need to use setup.py for installation or development. Instead, use the following commands:
Install with pip (recommended)
pip install .
Or, for development (editable install with dev dependencies):
Note: For editable installs with extras, make sure you have recent versions of pip and setuptools:
pip install --upgrade pip setuptools
pip install -e '.[dev]'
Why pyproject.toml?
- All build, dependency, and metadata configuration is in pyproject.toml.
- Compatible with modern Python tooling (pip, build, poetry, etc).
- setup.py is only needed for legacy or advanced customizations.
For more details, see Packaging Python Projects.
Documentation
For complete documentation, visit: https://athena-client.readthedocs.io
License
MIT