scibite-toolkit: a Python library for calling SciBite applications (TERMite, TExpress, SciBite Search, CENtree and Workbench) and for processing the JSON results of those requests.
SciBite Toolkit
Python library for making API calls to SciBite's suite of products and processing the JSON responses.
Supported Products
- TERMite - Entity recognition and semantic enrichment (version 6.x)
- TERMite 7 - Next-generation entity recognition with modern OAuth2 authentication
- TExpress - Pattern-based entity relationship extraction
- CENtree - Ontology management, navigation, and integration
- SciBite Search - Semantic search, document and entity analytics
- Workbench - Dataset annotation and management
Installation
pip install scibite-toolkit
See versions on PyPI
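To confirm which release ended up in the active environment, a minimal sketch using only the standard library could look like this. The helper name `installed_version` is hypothetical and not part of the toolkit.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="scibite-toolkit"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version if installed, otherwise a short notice.
    print(installed_version() or "scibite-toolkit is not installed")
```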
Quick Start Examples
- TERMite 7 - Modern client with OAuth2
- TERMite 6 - Legacy client
- TExpress - Pattern matching
- SciBite Search
- CENtree - Ontology navigation
- Workbench
TERMite 7 Examples
TERMite 7 is the current version, with OAuth2 authentication and an improved API.
OAuth2 Client Credentials (SaaS - Recommended)
For modern SaaS deployments using a separate authentication server:
from scibite_toolkit import termite7
# Initialize with context manager for automatic cleanup
with termite7.Termite7RequestBuilder() as t:
    # Set URLs
    t.set_url('https://termite.saas.scibite.com')
    t.set_token_url('https://auth.saas.scibite.com')
    # Authenticate with OAuth2 client credentials
    if not t.set_oauth2('your_client_id', 'your_client_secret'):
        print("Authentication failed!")
        exit(1)
    # Annotate text
    t.set_entities('DRUG,INDICATION')
    t.set_subsume(True)
    t.set_text('Aspirin is used to treat headaches and reduce inflammation.')
    response = t.annotate_text()
    # Process the response
    df = termite7.process_annotation_output(response)
    print(df.head())
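The example above hard-codes `'your_client_id'` and `'your_client_secret'`. One way to keep real credentials out of source files is to read them from the environment. The sketch below is an assumption-laden illustration: the function `load_credentials` and the variable names `SCIBITE_CLIENT_ID` and `SCIBITE_CLIENT_SECRET` are hypothetical and are not defined by the toolkit.

```python
import os


def load_credentials(env=os.environ):
    """Read client credentials from the environment.

    SCIBITE_CLIENT_ID and SCIBITE_CLIENT_SECRET are hypothetical variable
    names chosen for this sketch; the toolkit does not define them.
    """
    try:
        return env["SCIBITE_CLIENT_ID"], env["SCIBITE_CLIENT_SECRET"]
    except KeyError as missing:
        # Report which variable is absent without echoing any secret value.
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None
```

The returned pair can then be passed to `t.set_oauth2(client_id, client_secret)` in place of the literal strings.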
OAuth2 Password Grant (Legacy)
For on-premise deployments using username/password authentication:
from scibite_toolkit import termite7
t = termite7.Termite7RequestBuilder()
# Set main TERMite URL and token URL (same server for legacy)
t.set_url('https://termite.example.com')
t.set_token_url('https://termite.example.com')
# Authenticate with username and password
if not t.set_oauth2_legacy('client_id', 'username', 'password'):
    print("Authentication failed!")
    exit(1)
# Annotate a document
t.set_entities('INDICATION,DRUG')
t.set_parser_id('generic')
t.set_file('path/to/document.pdf')
response = t.annotate_document()
# Process the response
df = termite7.process_annotation_output(response)
print(df)
# Clean up file handles
t.close()
Get System Status
from scibite_toolkit import termite7
t = termite7.Termite7RequestBuilder()
t.set_url('https://termite.example.com')
t.set_token_url('https://auth.example.com')
t.set_oauth2('client_id', 'client_secret')
# Get system status
status = termite7.get_system_status(t.url, t.headers)
print(f"Server Version: {status['data']['serverVersion']}")
# Get available vocabularies
vocabs = termite7.get_vocabs(t.url, t.headers)
print(f"Available vocabularies: {len(vocabs['data'])}")
# Get runtime options
rtos = termite7.get_runtime_options(t.url, t.headers)
print(rtos)
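The status example indexes `status['data']['serverVersion']` directly, which raises `KeyError` if the response has a different shape (for instance after a failed request). A defensive variant might look like the sketch below. The helper `server_version` is hypothetical, and the response layout is assumed only from the example above.

```python
def server_version(status):
    """Return status['data']['serverVersion'], or None if either key is absent.

    The nested layout is taken from the example above; nothing else about
    the response is assumed.
    """
    data = status.get("data") if isinstance(status, dict) else None
    if not isinstance(data, dict):
        return None
    return data.get("serverVersion")
```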
TERMite 6 Examples
For legacy TERMite 6.x deployments.
SciBite Hosted (SaaS)
from scibite_toolkit import termite
# Initialize
t = termite.TermiteRequestBuilder()
# Configure
t.set_url('https://termite.saas.scibite.com')
t.set_saas_login_url('https://login.saas.scibite.com')
# Authenticate
t.set_auth_saas('username', 'password')
# Set runtime options
t.set_entities('INDICATION')
t.set_input_format('medline.xml')
t.set_output_format('json')
t.set_binary_content('path/to/file.xml')
t.set_subsume(True)
# Execute and process
response = t.execute()
df = termite.get_termite_dataframe(response)
print(df.head(3))
Local Instance (Customer Hosted)
from scibite_toolkit import termite
t = termite.TermiteRequestBuilder()
t.set_url('https://termite.local.example.com')
# Basic authentication for local instances
t.set_basic_auth('username', 'password')
# Configure and execute
t.set_entities('INDICATION')
t.set_input_format('medline.xml')
t.set_output_format('json')
t.set_binary_content('path/to/file.xml')
t.set_subsume(True)
response = t.execute()
df = termite.get_termite_dataframe(response)
print(df.head(3))
TExpress Examples
Pattern-based entity relationship extraction.
SciBite Hosted
from scibite_toolkit import texpress
t = texpress.TexpressRequestBuilder()
t.set_url('https://texpress.saas.scibite.com')
t.set_saas_login_url('https://login.saas.scibite.com')
t.set_auth_saas('username', 'password')
# Set pattern to find relationships
t.set_entities('INDICATION,DRUG')
t.set_pattern(':(DRUG):{0,5}:(INDICATION)') # Find DRUG within 5 words of INDICATION
t.set_input_format('medline.xml')
t.set_output_format('json')
t.set_binary_content('path/to/file.xml')
response = t.execute()
df = texpress.get_texpress_dataframe(response)
print(df.head())
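The pattern string used above, `':(DRUG):{0,5}:(INDICATION)'`, can be assembled programmatically when the entity types or the word gap vary. The following is a minimal sketch: `proximity_pattern` is a hypothetical helper, and it covers only the two-entity form shown in this example, not the full TExpress pattern syntax.

```python
def proximity_pattern(left, right, max_gap=5):
    """Build a pattern of the form ':(LEFT):{0,N}:(RIGHT)'.

    Mirrors the literal string from the example above; other TExpress
    pattern constructs are out of scope for this sketch.
    """
    if max_gap < 0:
        raise ValueError("max_gap must be non-negative")
    # Doubled braces emit literal '{' and '}' inside an f-string.
    return f":({left}):{{0,{max_gap}}}:({right})"


print(proximity_pattern("DRUG", "INDICATION"))  # :(DRUG):{0,5}:(INDICATION)
```

The result can be passed straight to `t.set_pattern(...)`.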
Local Instance
from scibite_toolkit import texpress
t = texpress.TexpressRequestBuilder()
t.set_url('https://texpress.local.example.com')
t.set_basic_auth('username', 'password')
t.set_entities('INDICATION,DRUG')
t.set_pattern(':(INDICATION):{0,5}:(INDICATION)')
t.set_input_format('pdf')
t.set_output_format('json')
t.set_binary_content('/path/to/file.pdf')
response = t.execute()
df = texpress.get_texpress_dataframe(response)
print(df.head())
SciBite Search Example
Semantic search with entity-based queries and aggregations.
from scibite_toolkit import scibite_search
# Configure
s = scibite_search.SBSRequestBuilder()
s.set_url('https://yourdomain-search.saas.scibite.com/')
s.set_auth_url('https://yourdomain.saas.scibite.com/')
# Authenticate with OAuth2
s.set_oauth2('your_client_id', 'your_client_secret')
# Search documents
query = 'schema_id="clinical_trial" AND (title~INDICATION$D011565 AND DRUG$*)'
# Preferred: request specific fields using the new 'fields' parameter (legacy: 'additional_fields')
response = s.get_docs(query=query, markup=True, limit=100, fields=['*'])
# Get co-occurrence aggregations
# Find top 50 genes co-occurring with psoriasis
response = s.get_aggregates(
    query='INDICATION$D011565',
    vocabs=['HGNCGENE'],
    limit=50
)
Note: The preferred parameter name is fields. The legacy additional_fields is still supported for backward compatibility. When both are provided, fields takes precedence.
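The precedence rule in the note can be pictured with a small stand-alone function. This is an illustration of the documented behaviour only, not the toolkit's internal code; `resolve_fields` is a hypothetical name.

```python
def resolve_fields(fields=None, additional_fields=None):
    """Pick the field list to send, preferring 'fields' over the legacy name."""
    if fields is not None:
        return list(fields)
    if additional_fields is not None:
        return list(additional_fields)
    return None


# 'fields' wins when both are supplied.
print(resolve_fields(fields=["title"], additional_fields=["abstract"]))  # ['title']
# The legacy parameter is used only when 'fields' is absent.
print(resolve_fields(additional_fields=["abstract"]))  # ['abstract']
```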
CENtree Examples
Ontology navigation and search.
Modern Client (Recommended)
The modern centree_clients module provides better error handling, retries, and context manager support.
from scibite_toolkit.centree_clients import CENtreeReaderClient
# Use context manager for automatic cleanup
with CENtreeReaderClient(
    base_url="https://centree.example.com",
    bearer_token="your_token",
    timeout=(3.0, None)  # Quick connect, unlimited read
) as reader:
    # Search by exact label
    hits = reader.get_classes_by_exact_label("efo", "neuron")
    print(f"Found {len(hits)} matches")
    # Get ontology roots
    roots = reader.get_root_entities("efo", "classes", size=10)
    # Get paths from root to target (great for LLM grounding)
    paths = reader.get_paths_from_root("efo", "MONDO_0007739", as_="labels")
    for path in paths:
        print(" → ".join(path))
# Or authenticate with OAuth2
from scibite_toolkit.centree_clients import CENtreeReaderClient
reader = CENtreeReaderClient(base_url="https://centree.example.com")
if reader.set_oauth2(client_id="...", client_secret="..."):
    hits = reader.get_classes_by_exact_label("efo", "lung")
    print(hits)
Workbench Example
Dataset management and annotation.
from scibite_toolkit import workbench
# Initialize
wb = workbench.WorkbenchRequestBuilder()
wb.set_url('https://workbench.example.com')
# Authenticate
wb.set_oauth2('client_id', 'username', 'password')
# Create dataset
wb.set_dataset_name('My Analysis Dataset')
wb.set_dataset_desc('Dataset for clinical trial analysis')
wb.create_dataset()
# Upload file
wb.set_file_input('path/to/data.xlsx')
wb.upload_file_to_dataset()
# Configure and run annotation
vocabs = [[5, 6], [8, 9]] # Vocabulary IDs
attrs = [200, 201] # Attribute IDs
wb.set_termite_config('', vocabs, attrs)
wb.auto_annotate_dataset()
Key Features
Context Manager Support (TERMite 7, CENtree Clients)
Modern clients support context managers for automatic resource cleanup:
with termite7.Termite7RequestBuilder() as t:
    t.set_url('...')
    # ... work with client ...
# File handles automatically closed
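For readers unfamiliar with the mechanism, the sketch below shows how a context manager can guarantee that file handles are closed, even when the body raises. `FileHoldingClient` is a toy class written for this illustration; it is not the toolkit's implementation and shares no code with it.

```python
import io


class FileHoldingClient:
    """Toy client that tracks open file handles and closes them on exit."""

    def __init__(self):
        self._handles = []

    def attach(self, handle):
        # Remember the handle so close() can release it later.
        self._handles.append(handle)
        return handle

    def close(self):
        while self._handles:
            self._handles.pop().close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions from the with-body


with FileHoldingClient() as client:
    buffer = client.attach(io.BytesIO(b"example"))
    print(buffer.closed)  # False: still open inside the block
print(buffer.closed)  # True: closed on leaving the block
```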
Error Handling
All OAuth2 methods return boolean status for easy error handling:
if not t.set_oauth2(client_id, client_secret):
    print("Authentication failed - check credentials")
    exit(1)
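In library code or tests, raising an exception is often preferable to calling `exit(1)`. Because the OAuth2 methods return a boolean, a thin wrapper can convert a `False` result into an exception. The names `AuthenticationError` and `require_auth` below are hypothetical and defined only for this sketch.

```python
class AuthenticationError(RuntimeError):
    """Raised when a boolean-returning auth call reports failure."""


def require_auth(auth_call, *args, **kwargs):
    """Invoke auth_call and raise if it returns a falsy value."""
    if not auth_call(*args, **kwargs):
        raise AuthenticationError("Authentication failed - check credentials")
```

With a client `t` as in the examples above, this would be used as `require_auth(t.set_oauth2, client_id, client_secret)`.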
Logging
Enable detailed logging for debugging:
import logging
logging.basicConfig(level=logging.DEBUG)
# Or set per-client
t = termite7.Termite7RequestBuilder(log_level='DEBUG')
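At `DEBUG` level the root logger also emits messages from the HTTP stack, which can drown out application output. A possible setup that keeps the root logger verbose while quieting `urllib3` (the HTTP library underneath `requests`) is sketched below. The function `configure_logging` and the chosen format string are assumptions for illustration, not toolkit defaults.

```python
import logging


def configure_logging(app_level=logging.DEBUG, http_level=logging.WARNING):
    """Set a verbose root logger but keep urllib3 connection chatter quiet."""
    logging.basicConfig(
        level=app_level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers left by earlier basicConfig calls (Python 3.8+)
    )
    logging.getLogger("urllib3").setLevel(http_level)


configure_logging()
```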
Session Management
All clients use requests.Session() for efficient connection pooling and automatic retry handling.
License
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
File details
Details for the file scibite_toolkit-1.3.0.tar.gz.
File metadata
- Download URL: scibite_toolkit-1.3.0.tar.gz
- Upload date:
- Size: 101.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 71ab5a50b9c7fa85488f78d96bd27d6dd4db1cbf93f1dcc4b244f0ef8b07494c |
| MD5 | 3a7fc7dcfaeca158775c81c9cd97008e |
| BLAKE2b-256 | 968061cd2a2fbe6606d0f4c1a00a86d342500a65010a1127a16d09c928e92881 |
File details
Details for the file scibite_toolkit-1.3.0-py3-none-any.whl.
File metadata
- Download URL: scibite_toolkit-1.3.0-py3-none-any.whl
- Upload date:
- Size: 110.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b7ed93c96500c88284d196117da211d50e22530edb8c32baba0aa3d57cfe3dbe |
| MD5 | 9d4dec12e56dbcca22256abff2822734 |
| BLAKE2b-256 | c3c7696dd524b8a6e4f631799ef871a52dc9d87e7575a83388e88099e547d8b5 |