Professional SDK for DSF Label Adaptive Formula API
DSF Label SDK
Accelerate AI development with programmatic data classification. Reduce labeling costs and time with configurable heuristics.
Why DSF Label?
Manual data labeling is slow and expensive. This SDK transforms domain expertise into configurable heuristics that classify datasets at scale with confidence scoring.
Built on the DSF evaluation formula (weighted similarity with parameter scaling), ensuring consistency across all DSF products.
Core Concepts
Define weighted heuristics based on domain knowledge. The system evaluates data points against these rules, producing classification scores. Uncertain cases get flagged for review.
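As a rough mental model, the score behaves like an importance-weighted average of per-field similarities. The exact DSF evaluation formula is proprietary and executes server-side, so the following is a minimal illustrative sketch, not the actual implementation:

# Illustrative only: the real DSF evaluation formula is proprietary
# and runs server-side. This sketch assumes an importance-weighted
# average of per-field similarities.
def sketch_score(data, config):
    total, weight_sum = 0.0, 0.0
    for field, spec in config.items():
        importance = spec['params']['importance']
        reference = spec['reference']
        value = data.get(field)
        if isinstance(reference, bool):
            # Boolean fields: similarity is an exact match (see note below)
            similarity = 1.0 if value == reference else 0.0
        else:
            # Assumed: similarity decays with relative distance from the reference
            similarity = max(0.0, 1.0 - abs(value - reference) / max(abs(reference), 1e-9))
        total += importance * similarity
        weight_sum += importance
    return total / weight_sum if weight_sum else 0.0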
Installation
pip install dsf-label-sdk
Quick Start
Community Edition
from dsf_label import LabelSDK
sdk = LabelSDK()
# Build configuration
config = (sdk.create_config()
    .add_field('feature_a', reference=True, params={'importance': 5.0})
    .add_field('feature_b', reference=True, params={'importance': 4.0})
    .add_field('feature_c', reference=50, params={'importance': 2.0})
)
# Classify data point
data = {
    'feature_a': True,
    'feature_b': True,
    'feature_c': 45
}
result = sdk.evaluate(data, config)
print(f"Score: {result.score:.3f}")
print(f"Above threshold: {result.is_above_threshold}")
Note: For boolean fields, reference=True means the expected value is True; similarity is computed as an exact match.
Professional Edition
import pandas as pd
sdk = LabelSDK(
    license_key='PRO-2026-12-31-XXXX-XXXX',
    tier='professional'
)
# Batch processing
df = pd.read_csv('unlabeled_data.csv')
results = sdk.batch_evaluate(
    data_points=df.to_dict('records'),
    config=config
)
# Access metrics
metrics = sdk.get_metrics()
print(f"Confidence level: {metrics['confidence_level']:.3f}")
print(f"Average score: {metrics['avg_score']:.3f}")
Enterprise Edition
sdk = LabelSDK(
    license_key='ENT-2026-12-31-XXXX-XXXX',
    tier='enterprise',
    mode='adaptive'
)
# Process large datasets with auto-calibration
for batch in dataset_batches:
    results = sdk.batch_evaluate(batch, config)
# View adaptation metrics
metrics = sdk.get_metrics()
print(f"Evaluations: {metrics['evaluations']}")
print(f"Auto-calibrated: {metrics.get('auto_calibration', False)}")
Config Builder
sdk = LabelSDK()
# Chained calls
config = (sdk.create_config()
    .add_field('metric_a', reference=20, params={'importance': 3.0})
    .add_field('metric_b', reference=1.0, params={'importance': 2.5})
)
# Sequential calls
config = sdk.create_config()
for field_name, params in field_definitions.items():
    config.add_field(field_name, **params)
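Here field_definitions can be any mapping whose values match add_field's keyword arguments, for example:

# Example shape for the field_definitions mapping used in the loop above.
field_definitions = {
    'metric_a': {'reference': 20, 'params': {'importance': 3.0}},
    'metric_b': {'reference': 1.0, 'params': {'importance': 2.5}},
}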
Context Manager
with LabelSDK(license_key='...', tier='enterprise') as sdk:
    result = sdk.evaluate(data, config)
    metrics = sdk.get_metrics()
Error Handling
from dsf_label import LabelSDK, LicenseError, ValidationError
try:
    sdk = LabelSDK(license_key='invalid', tier='professional')
    result = sdk.evaluate(data, config)
except LicenseError:
    sdk = LabelSDK()  # Fallback to community
except ValidationError as e:
    print(f"Invalid configuration: {e}")
Tier Comparison
| Feature | Community | Professional | Enterprise |
|---|---|---|---|
| Classifications/month | Unlimited† | Unlimited | Unlimited |
| Single Evaluation | ✅ | ✅ | ✅ |
| Batch Processing | ❌ | ✅ | ✅ |
| DataFrame Support | ❌ | ✅ | ✅ |
| Adaptive Thresholds | ❌ | ✅ | ✅ |
| Performance Metrics | ❌ | ✅ | ✅ Enhanced |
| Auto-Calibration | ❌ | ❌ | ✅ |
| Adaptive Modes | ❌ | ❌ | ✅ |
| Support | Community | Priority | SLA |
†Subject to fair-use policies. The Community tier is free for evaluation; production use requires registration.
Enterprise Features
Auto-Calibration (Enterprise)
The system automatically optimizes heuristic parameters based on observed data patterns.
Adaptation Modes (Enterprise)
# Standard: Full history
sdk = LabelSDK(tier='enterprise', mode='standard')
# Adaptive: Recent patterns priority
sdk = LabelSDK(tier='enterprise', mode='adaptive')
Cache Management (Enterprise)
sdk.invalidate_cache() # Reset when patterns change
Configuration Guidelines
config = {
    'field_name': {
        'reference': expected_value,
        'params': {'importance': <float>}  # 0.0-5.0
    }
}
Internally, all field contributions are computed using the DSF evaluation formula (weighted similarity with parameter scaling), ensuring consistency across all DSF SDKs.
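Plain-dict configurations of this shape appear throughout this README alongside the builder API; both forms are assumed interchangeable when passed to evaluate. A concrete instance (field names are illustrative):

# Concrete instance of the schema above; field names are illustrative.
config = {
    'is_verified': {'reference': True, 'params': {'importance': 5.0}},
    'response_time': {'reference': 200, 'params': {'importance': 4.0}}
}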
Hybrid Integration
Integrate ML models as additional heuristics:
# Example: Load pre-trained models
bert_classifier = your_model_loader('sentiment') # Conceptual
xgboost_model = your_model_loader('risk')
# Hybrid configuration
config = {
    'metric_standard': {'reference': 100, 'params': {'importance': 3.0}},
    'bert_score': {'reference': 0.8, 'params': {'importance': 4.0}},
    'xgb_score': {'reference': 0.3, 'params': {'importance': 4.5}}
}
# Process with ensemble
def process_item(item_data):
    bert_pred = bert_classifier(item_data['text'])
    xgb_pred = xgboost_model.predict(item_data['features'])
    hybrid_data = {
        'metric_standard': item_data['value'],
        'bert_score': bert_pred,
        'xgb_score': xgb_pred
    }
    return sdk.evaluate(hybrid_data, config)
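A hypothetical driver loop for process_item, splitting output the same way as the migration example below:

# Hypothetical driver loop; items is assumed to be a list of dicts
# with 'text', 'features', and 'value' keys as used by process_item.
labeled, review_queue = [], []
for item in items:
    result = process_item(item)
    (labeled if result.is_above_threshold else review_queue).append(item)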
API Reference
LabelSDK
Methods:
- __init__(tier='community', license_key=None, mode='standard')
- evaluate(data, config) - Single evaluation
- batch_evaluate(data_points, config) - Batch processing (Pro/Enterprise)
- create_config() - Config builder
- get_metrics() - Performance statistics (Pro/Enterprise)
- set_confidence_level(level) - Threshold adjustment
- invalidate_cache() - Reset cache (Enterprise)
EvaluationResult
Attributes:
- score (float): Confidence score [0.0-1.0]
- tier (str): License tier
- confidence_level (float): Current threshold
- is_above_threshold (bool): Threshold comparison
- metrics (dict): Metrics (Pro/Enterprise)
ConfigBuilder
Methods:
- add_field(name, reference, params={'importance': 1.0})
- remove_field(name)
- build()
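For example, assuming build() returns the evaluate()-ready configuration:

builder = sdk.create_config()
builder.add_field('metric_a', reference=20, params={'importance': 3.0})
builder.add_field('scratch', reference=1, params={'importance': 1.0})
builder.remove_field('scratch')  # drop a field added by mistake
config = builder.build()  # assumed to return the finished config object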
Performance Metrics (Pro/Enterprise)
metrics = sdk.get_metrics()
print(f"Evaluations: {metrics['evaluations']}")
print(f"Average score: {metrics['avg_score']:.3f}")
print(f"Confidence level: {metrics['confidence_level']:.3f}")
# Enterprise metrics
if metrics.get('auto_calibration'):
    print(f"Adapted fields: {metrics['adapted_fields']}")
Migration Example
Before:
for data_point in dataset:
    label = human_annotator.label(data_point)
    labeled_data.append((data_point, label))
After:
for data_point in dataset:
    result = sdk.evaluate(data_point, config)
    if result.score > 0.75:
        labeled_data.append((data_point, 'POSITIVE'))
    elif result.score < 0.35:
        labeled_data.append((data_point, 'NEGATIVE'))
    else:
        review_queue.append(data_point)
Use Cases
Classification Tasks
config = {
    'indicator_a': {'reference': True, 'params': {'importance': 5.0}},
    'metric_b': {'reference': 0.8, 'params': {'importance': 3.0}},
    'count_c': {'reference': 2, 'params': {'importance': 2.5}}
}
Quality Assessment
config = {
    'quality_metric': {'reference': 0.8, 'params': {'importance': 4.0}},
    'completeness': {'reference': 1.0, 'params': {'importance': 3.0}},
    'consistency': {'reference': 0.9, 'params': {'importance': 3.5}}
}
FAQ
How accurate are classifications?
Accuracy depends on heuristic design. Well-designed configurations can achieve results comparable to human annotators at scale.
Can I use with active learning?
Yes. Use confidence scores to identify uncertain examples for human review.
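For example, rank items by distance from the current confidence level and send the most ambiguous ones to annotators (a sketch using only the documented result attributes):

def uncertainty(item):
    result = sdk.evaluate(item, config)
    # Distance from the threshold measures how ambiguous the item is.
    return abs(result.score - result.confidence_level)

to_review = sorted(pool, key=uncertainty)[:100]  # most uncertain first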
Difference between modes?
- Standard: Uses full history for adaptation
- Adaptive: Prioritizes recent patterns (Enterprise only)
When to invalidate cache?
When data distribution changes significantly (new categories, seasonal shifts, different sources).
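A simple illustrative check (Enterprise only, since invalidate_cache requires that tier; the tolerance is arbitrary):

# Illustrative drift check: reset the cache when the rolling average
# score departs from baseline_avg, an average score recorded on a
# trusted earlier window (assumed to be tracked by the caller).
metrics = sdk.get_metrics()
if abs(metrics['avg_score'] - baseline_avg) > 0.15:
    sdk.invalidate_cache()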
⚠️ Important Notes
Client Responsibility:
Clients must validate classifications and ensure regulatory compliance. This SDK is a classification support tool; it does not make autonomous decisions.
Data Processing:
All evaluation logic executes server-side; the SDK exposes only the configuration interface.
Model Outputs:
ML model inputs are client-provided and client-controlled.
Support
Documentation: https://dsfuptech.cloud
Issues: contacto@dsfuptech.cloud
Licensing: contacto@dsfuptech.cloud
Licensing
License Format:
- Professional: PRO-YYYY-MM-DD-XXXX-XXXX
- Enterprise: ENT-YYYY-MM-DD-XXXX-XXXX
Contact: contacto@dsfuptech.cloud
📋 Credits
Technology Architect: Jaime Alexander Jimenez
© 2025 DSF UpTech. All rights reserved.