Purpose-scoped ADK agents for SDC4 data operations — SMB Edition (local LLM via Ollama)
SDC Agents SMB
Designed for personal use and for small and medium-sized businesses. It runs a local LLM via Ollama instead of requiring a Google API key, and connects to the commercial SDCStudio SaaS backend for the catalog, validation, and assembly APIs.
Positioning
| | SDC Agents | SDC Agents SMB | SDC Agents Sovereign |
|---|---|---|---|
| Target | Enterprise | Personal / SMB | Air-gapped / Regulated |
| Backend | SDCStudio SaaS | SDCStudio SaaS | SDCStudioSov (local) |
| LLM | Gemini (Google API key) | Local via Ollama | Local via Ollama |
| Google API Key | Required | Not required | Not required |
| BigQuery | Yes | No | No |
| Vertex AI Search | Yes | No | No |
| Wallet/Billing | Yes | Yes | No (site-licensed) |
Agents
8 purpose-scoped agents with 39+ tools (core + ToolsetHub plugins):
| Agent | Tools | Network | Datasource | Purpose |
|---|---|---|---|---|
| Catalog | 7 | HTTPS | None | Discover schemas, download artifacts and packages |
| Introspect | 6+ | None | Read-only | Extract datasource structure (SQL, CSV, JSON, MongoDB + ToolsetHub) |
| Mapping | 3 | None | None | Map columns to semantic components |
| Generator | 3 | None | Read-only | Produce XML instances from mapped data |
| Validation | 3 | HTTPS | None | Validate and sign XML via VaaS API |
| Distribution | 5 | Local | None | Route artifacts to Fuseki, Neo4j, REST, filesystem |
| Knowledge | 3 | None | Read-only | Ingest context into ChromaDB vector store |
| Assembly | 7 | HTTPS | None | Discover components, HITL review, assemble data models |
Introspect dynamically loads ToolsetHub plugins — install [notion], [sheets], or [airtable] for SMB-native datasource support.
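The Network and Datasource columns in the table above imply an invariant: no agent combines datasource access with network access. The sketch below encodes the table as a plain dictionary and checks that invariant. The `AGENTS` structure and `isolation_violations` function are hypothetical illustrations, not part of the package's API.

```python
# Hypothetical encoding of the agent table; not the package's own data model.
AGENTS = {
    "Catalog":      {"network": "HTTPS", "datasource": "None"},
    "Introspect":   {"network": "None",  "datasource": "Read-only"},
    "Mapping":      {"network": "None",  "datasource": "None"},
    "Generator":    {"network": "None",  "datasource": "Read-only"},
    "Validation":   {"network": "HTTPS", "datasource": "None"},
    "Distribution": {"network": "Local", "datasource": "None"},
    "Knowledge":    {"network": "None",  "datasource": "Read-only"},
    "Assembly":     {"network": "HTTPS", "datasource": "None"},
}


def isolation_violations(agents):
    """Return names of agents that combine datasource access with any network access."""
    return sorted(
        name
        for name, scope in agents.items()
        if scope["datasource"] != "None" and scope["network"] != "None"
    )


violations = isolation_violations(AGENTS)
print(violations)  # [] for the table as published
```

A check like this could run in CI so that a newly added agent with both kinds of access fails the build.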
Features
Core Pipeline
- Introspect datasources with 13-field standardized column analysis and 10 type inference patterns
- Discover matching catalog components with type compatibility scoring
- HITL review gate for billable minting operations — see costs before committing
- Assemble data models via the SDCStudio Assembly API (sync + async with hybrid polling)
- Download published data model packages (.zip with XSD, XML, JSON, JSON-LD, HTML, SHA1)
- Generate XML instances from mapped datasource records
- Validate instances against XSD 1.1 schemas via VaaS API (deterministic, not probabilistic)
- Distribute artifact packages to Fuseki, Neo4j, REST APIs, or filesystem
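To illustrate what pattern-based type inference during introspection can look like, here is a minimal sketch. The README does not list the package's ten patterns, so the four patterns and the `infer_type` function below are assumptions chosen for illustration only.

```python
import re

# Hypothetical patterns, checked in order; the package's actual ten patterns are not listed here.
PATTERNS = [
    ("boolean", re.compile(r"^(true|false)$", re.IGNORECASE)),
    ("integer", re.compile(r"^[+-]?\d+$")),
    ("decimal", re.compile(r"^[+-]?\d+\.\d+$")),
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
]


def infer_type(values):
    """Return the first type whose pattern matches every non-empty value, else 'string'."""
    samples = [v.strip() for v in values if v is not None and v.strip()]
    if not samples:
        return "string"
    for name, pattern in PATTERNS:
        if all(pattern.match(s) for s in samples):
            return name
    return "string"


print(infer_type(["1", "2", "-3"]))      # integer
print(infer_type(["2024-01-31", ""]))    # date (empty cells are ignored)
print(infer_type(["1", "x"]))            # string
```

Requiring every sample to match keeps the inference conservative: a single stray value demotes the column to `string` rather than producing a wrong type.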
SMB-Native Datasources (ToolsetHub)
- Notion: database properties, relations, rollups, select options (`pip install sdc-agents-smb[notion]`)
- Google Sheets: headers, inferred column types, sheet metadata (`pip install sdc-agents-smb[sheets]`)
- Airtable: field types, linked records, formula/lookup fields (`pip install sdc-agents-smb[airtable]`)
- Community extensible: add HubSpot, QuickBooks, Salesforce by following the reference pattern
Automation
- Scheduler: cron-based pipeline automation via APScheduler (`sdc-agents schedule run`)
- Notifications: push status to Slack webhooks, Telegram bots, or SMTP email
- Pipeline templates: 7 bundled workflows (`sdc-agents pipeline run healthcare-csv -p datasource=patients`)
Data Governance
- Schema drift detection: compares the current structure against the cached previous introspection and alerts on changes
- Data annotations: agents auto-detect anomalies (null violations, mixed date formats, sentinel values); users add manual notes; annotations persist across sessions and auto-merge into future introspections
- Cross-datasource lineage: tracks data flow from source through mapping, generation, and validation to distribution
- Compliance reports: generate JSON/Markdown/HTML evidence from audit + lineage logs (`sdc-agents compliance report`)
- Append-only audit: every tool call is logged, with credential redaction, to `.sdc-cache/audit.jsonl`
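Schema drift detection, as described in the list above, amounts to diffing two introspection snapshots. The sketch below assumes each snapshot is a simple `{column_name: type_name}` mapping. That shape and the `detect_drift` function are hypothetical; the package's cached introspection format is not documented here.

```python
def detect_drift(previous, current):
    """Compare two {column_name: type_name} snapshots and report the differences."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = {
        col: (previous[col], current[col])
        for col in sorted(set(previous) & set(current))
        if previous[col] != current[col]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative snapshots: one column changed type and one column was added.
cached = {"id": "integer", "email": "string", "signup": "date"}
latest = {"id": "integer", "email": "string", "signup": "string", "plan": "string"}

drift = detect_drift(cached, latest)
print(drift)
# {'added': ['plan'], 'removed': [], 'changed': {'signup': ('date', 'string')}}
```

An alert would fire whenever any of the three collections is non-empty.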
Integrations
- MCP server mode: serve any agent as an MCP server for Claude Desktop, Cursor, etc.
- Audit dashboard: web UI for browsing, filtering, and exporting audit records (`sdc-agents audit serve`)
- OpenClaw skill: 9-tool bridge exposing SDC tools to OpenClaw's messaging platform ecosystem
Quick Start
1. Install Ollama and pull a model
# Install Ollama: https://ollama.com/download
ollama pull gemma4:26b
2. Install SDC Agents SMB
pip install sdc-agents-smb
# Optional extras:
pip install sdc-agents-smb[knowledge] # PDF, DOCX, ChromaDB
pip install sdc-agents-smb[notion,sheets,airtable] # SMB datasources
pip install sdc-agents-smb[dashboard] # Audit dashboard web UI
3. Configure
cp sdc-agents.example.yaml sdc-agents.yaml
# Edit sdc-agents.yaml with your SDCStudio URL, API key, and datasources
4. Run
# MCP mode — serve an agent as an MCP server
sdc-agents serve --mcp catalog
sdc-agents serve --mcp introspect
# Check configuration and installed toolsets
sdc-agents info
sdc-agents toolset list
sdc-agents validate-config
# Run a pipeline template
sdc-agents pipeline list
sdc-agents pipeline run healthcare-csv -p datasource=patient_csv
# Start the scheduler
sdc-agents schedule list
sdc-agents schedule run
# View audit log and dashboard
sdc-agents audit show --last 24h
sdc-agents audit serve --port 8080
# Manage data annotations
sdc-agents annotate list-all
sdc-agents annotate add my_csv email "EU rows use comma decimal separator"
# Assembly review workflow
sdc-agents assembly list-pending
sdc-agents assembly review quarterly_model
sdc-agents assembly approve quarterly_model
# Generate compliance report
sdc-agents compliance report --format html --last 30d -o report.html
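The `--last 24h` and `--last 30d` options above select a trailing time window. If you want to post-process the audit log yourself, a minimal sketch of that filtering follows. It assumes each JSONL record carries an ISO 8601 `timestamp` field with a UTC offset; that field name, `parse_window`, and `records_since` are assumptions for illustration, not the CLI's implementation.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Supported suffixes in this sketch: minutes, hours, days.
UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_window(text):
    """Turn a string such as '24h' or '30d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text)
    if match is None:
        raise ValueError(f"unsupported window: {text!r}")
    return timedelta(**{UNITS[match.group(2)]: int(match.group(1))})


def records_since(lines, window, now):
    """Yield JSONL records whose 'timestamp' (assumed field) falls within the window ending at `now`."""
    cutoff = now - parse_window(window)
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        if datetime.fromisoformat(record["timestamp"]) >= cutoff:
            yield record


# Usage with in-memory lines; in practice these would come from open(".sdc-cache/audit.jsonl").
sample = [
    '{"timestamp": "2025-01-02T10:00:00+00:00", "tool": "catalog_list"}',
    '{"timestamp": "2024-12-30T10:00:00+00:00", "tool": "introspect_csv"}',
]
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
recent = [r["tool"] for r in records_since(sample, "24h", now)]
print(recent)  # ['catalog_list']
```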
5. ADK mode (standalone agent)
from sdc_agents.agents.catalog import create_catalog_agent
from sdc_agents.common.config import load_config
config = load_config("sdc-agents.yaml")
agent = create_catalog_agent(config)
# model defaults to ollama_chat/gemma4:26b from config
Model Configuration
The default model is ollama_chat/gemma4:26b. Configure in sdc-agents.yaml:
model:
  default: "ollama_chat/gemma4:26b"
  ollama_base_url: "http://localhost:11434"
Tested Models
| Model | Size | Tool Calling | Notes |
|---|---|---|---|
| `gemma4:26b` | 26B MoE | Native | Recommended default |
| `qwen3.5:32b` | 32B | Native | Strong reasoning |
| `llama3.1:8b` | 8B | Native | Lightweight option |
Any Ollama model with tool-calling support should work. Use the ollama_chat/ prefix for chat models.
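A model identifier such as `ollama_chat/gemma4:26b` combines a provider prefix, a model name, and a tag. The helper below is a hypothetical sketch for validating a configured value before handing it to an agent; it is not how the package itself parses the setting.

```python
def split_model_id(model_id):
    """Split '<provider>/<model>[:<tag>]' into (provider, model, tag); tag may be None."""
    provider, sep, rest = model_id.partition("/")
    if not sep or not provider or not rest:
        raise ValueError(f"expected '<provider>/<model>', got {model_id!r}")
    model, _, tag = rest.partition(":")
    return provider, model, tag or None


print(split_model_id("ollama_chat/gemma4:26b"))   # ('ollama_chat', 'gemma4', '26b')
print(split_model_id("ollama_chat/llama3.1:8b"))  # ('ollama_chat', 'llama3.1', '8b')
```

A value without the prefix, such as `gemma4:26b`, raises `ValueError`, which catches the most common misconfiguration early.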
Security Model
- Purpose scoping: each agent has a narrow tool set, no mega-agent
- Security isolation: no agent has both datasource access AND network access
- Private project enforcement: created components go to non-public SDCStudio projects only
- Read-only datasources: SQL write operations rejected; all introspection is read-only
- Credential redaction: audit logger redacts `connection`, `token`, `key`, `password`, `secret`
- Path confinement: validation/distribution restricted to configured output directory
- Append-only audit: every tool call logged to `.sdc-cache/audit.jsonl`
- ToolsetHub scope enforcement: plugins declare network hosts, datasource types, and file access; violations rejected at load time
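Two of the controls above, credential redaction and path confinement, are easy to picture in code. The sketch below is one possible shape, using only the standard library. The `redact` and `confine` functions and the `[REDACTED]` placeholder are assumptions for illustration; only the five sensitive words come from the list above.

```python
from pathlib import Path

# Key-name fragments named in the security model above.
SENSITIVE = ("connection", "token", "key", "password", "secret")


def redact(value):
    """Recursively replace values whose key name contains a sensitive word."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if any(word in str(k).lower() for word in SENSITIVE) else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def confine(output_dir, candidate):
    """Resolve `candidate` under `output_dir`, rejecting paths that escape it."""
    root = Path(output_dir).resolve()
    target = (root / candidate).resolve()
    if not target.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError(f"{candidate!r} escapes {str(root)!r}")
    return target


print(redact({"api_key": "abc", "args": {"password": "p", "table": "patients"}}))
# {'api_key': '[REDACTED]', 'args': {'password': '[REDACTED]', 'table': 'patients'}}
```

Substring matching on key names errs toward over-redaction (a key such as `monkey` would also be masked), which is usually the safer failure mode for an audit log.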
Documentation
- Repository Guide — catalog of all SDC FOSS repos
- Architecture Overview — how the stack fits together
- ClawFeatures — competitive positioning vs OpenClaw
- PRD — product requirements document
License
Apache License 2.0 — see LICENSE.