LangSmith observability for LLM and robot SDKs

One line. Full LangSmith observability for robot SDKs.

Quick Start: Robot Tracing

from unitree_sdk2py.go2.sport.sport_client import SportClient
from shadowdance import ShadowDance

# Your existing robot code
client = SportClient()
client.Init()

# ONE LINE - wrap with ShadowDance
client = ShadowDance(client)  # <- that's all you need!

# Everything else unchanged - now fully traced
client.StandUp()
client.Move(0.3, 0, 0)
client.Damp()

Every robot command is now a traced LangSmith event with full inputs, outputs, and timing.

Connect to LLMs: Code-as-Policies

Add LLM decision-making and trace the full stack (vision → planning → execution):

from openai import OpenAI
from shadowdance import ShadowDance
from unitree_sdk2py.go2.sport.sport_client import SportClient

# Wrap your robot (as above)
robot = SportClient()
robot.Init()
robot = ShadowDance(robot, run_type="tool")

# Wrap your LLM (ONE LINE)
llm = OpenAI()
llm = ShadowDance(llm, run_type="llm")

# Simple code-as-policies: LLM generates robot commands
task = "move forward and stop"
prompt = f"Generate robot commands for: {task}"

response = llm.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)

# Execute LLM-generated commands (traced!)
# Pass an explicit namespace so the generated code sees the wrapped `robot`
exec(response.choices[0].message.content, {"robot": robot})  # e.g., robot.Move(0.3, 0, 0)

Now in LangSmith you see the full chain: LLM reasoning → generated code → robot execution.
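
Chat models often wrap generated code in a Markdown fence, which exec cannot run as-is. The following is a minimal sketch of a helper that strips such a fence before execution. The name extract_code is hypothetical and not part of ShadowDance; the fence-handling rule (take the first fenced block, else the whole reply) is an assumption.

```python
import re

FENCE = "`" * 3  # Markdown code-fence marker, built at runtime

# Opening fence, optional language tag, newline, body, closing fence.
_FENCED = re.compile(
    re.escape(FENCE) + r"[a-zA-Z0-9_+-]*[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_code(reply: str) -> str:
    """Return the first fenced block in `reply`, or the whole reply if unfenced."""
    match = _FENCED.search(reply)
    body = match.group(1) if match else reply
    return body.strip()
```

With a helper like this, the last line of the example would pass extract_code(response.choices[0].message.content) to exec instead of the raw reply.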

Architecture

Modern LLM-powered robots use a layered architecture:

┌─────────────────────────────────────────┐
│  Your Agent Code                        │
│  ShadowDance(agent, run_type="chain")   │
├─────────────────────────────────────────┤
│  LLM (OpenAI, etc.)                     │
│  ShadowDance(llm, run_type="llm")       │
│  "pick up box" → [move, grasp, lift]    │
├─────────────────────────────────────────┤
│  Robot SDK (Unitree, etc.)              │
│  ShadowDance(robot, run_type="tool")    │
│  Move(0.3, 0, 0), StandUp(), etc.       │
└─────────────────────────────────────────┘

Wrap each layer with ShadowDance → see the full decision chain in LangSmith.

Installation

pip install shadowdance

Setup

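# Load environment variables (create .env with your keys)
# set -a exports every variable that .env defines, so Python processes can read them
set -a; source .env; set +a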

The .env file contains:

# LangSmith tracing
LANGCHAIN_API_KEY=...
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=shadowdance

# OpenRouter (OpenAI-compatible API)
OPENAI_API_KEY=...
OPENAI_BASE_URL=https://openrouter.ai/api/v1

# Default model for vision and planning
DEFAULT_MODEL=openrouter/hunter-alpha
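
As an illustration of how a script might consume the variables in the .env file above, the sketch below reads them from the environment and fails early when a required key is missing. The load_settings helper and the Settings dataclass are hypothetical, not part of ShadowDance, and the fallback values simply mirror the sample .env.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Keys without which neither tracing nor LLM calls can work.
REQUIRED = ("LANGCHAIN_API_KEY", "OPENAI_API_KEY")


@dataclass(frozen=True)
class Settings:
    langsmith_project: str
    openai_base_url: str
    default_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate required keys and collect the optional ones with fallbacks."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        langsmith_project=env.get("LANGCHAIN_PROJECT", "shadowdance"),
        openai_base_url=env.get("OPENAI_BASE_URL", "https://openrouter.ai/api/v1"),
        default_model=env.get("DEFAULT_MODEL", "openrouter/hunter-alpha"),
    )
```

Calling load_settings() with no arguments reads the current process environment; passing a plain dict makes the function easy to test.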

Using Different Models

Change DEFAULT_MODEL in .env to switch to a different model.

Test Connection

python examples/test_openrouter.py

Examples

Full Demo: Code-as-Policies

python examples/code_as_policies.py

This demonstrates the complete Code-as-Policies approach:

  1. VLM analyzes image → detects white box at [0.0, 0.1, 0.72]
  2. LLM generates Python code → robot.move_to(...), robot.close_gripper(...)
  3. Safe executor runs code → robot picks up box
  4. ShadowDance traces everything → debug in LangSmith

Task: "Pick up the white box"
  ↓
Vision: white_box detected at [0.0, 0.1, 0.72]
  ↓  
LLM: Generates 4-line Python program
  ↓
Robot: move_to → close_gripper → move_to (SUCCESS)
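
Step 3 above relies on a safe executor instead of a bare exec. The sketch below shows one way such an executor could work: parse the generated program with ast and accept only statements of the form robot.<method>(<literal arguments>) for an allow-listed set of methods. The run_policy function, the allow-list, and the FakeRobot class are illustrative assumptions; the executor in examples/code_as_policies.py may work differently.

```python
import ast

ALLOWED_METHODS = {"move_to", "close_gripper"}  # assumed allow-list


def run_policy(source: str, robot, allowed=ALLOWED_METHODS) -> list:
    """Run `source` only if every statement is robot.<allowed>(<literals>)."""
    tree = ast.parse(source)  # raises SyntaxError on malformed code
    calls = []
    for stmt in tree.body:
        call = stmt.value if isinstance(stmt, ast.Expr) else None
        if not (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and isinstance(call.func.value, ast.Name)
            and call.func.value.id == "robot"
            and call.func.attr in allowed
        ):
            raise ValueError(f"Disallowed statement on line {stmt.lineno}")
        # literal_eval raises ValueError for anything that is not a plain literal.
        args = [ast.literal_eval(arg) for arg in call.args]
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
        calls.append((call.func.attr, args, kwargs))
    # Validate the whole program first, then execute, so a bad line runs nothing.
    return [getattr(robot, name)(*args, **kwargs) for name, args, kwargs in calls]


class FakeRobot:
    """Stand-in robot that records the commands it receives."""

    def __init__(self):
        self.log = []

    def move_to(self, position):
        self.log.append(("move_to", position))
        return True

    def close_gripper(self):
        self.log.append(("close_gripper",))
        return True
```

Because nothing is executed until every statement has passed validation, a program that mixes valid commands with an import or an arbitrary expression is rejected as a whole.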

Run Types

LangSmith has different run types for better dashboard filtering:

Run Type      Use Case               Example
"llm"         LLM/VLM API calls      OpenAI, Anthropic, vision models
"tool"        Function/tool calls    Robot commands, API wrappers
"chain"       Orchestration logic    Agents, multi-step workflows
"retriever"   Document retrieval     RAG systems, vector stores
"embedding"   Embedding generation   Text embeddings
"prompt"      Prompt formatting      Custom prompt templates

# LLM calls
llm = ShadowDance(OpenAI(), run_type="llm")

# Robot/tool calls
robot = ShadowDance(SportClient(), run_type="tool")

# Agent orchestration
agent = ShadowDance(MyAgent(), run_type="chain")

Datasets & Experiments

Use ShadowDance with LangSmith datasets for robot evaluation and regression testing:

from shadowdance import ShadowDance
from unitree_sdk2py.go2.sport.sport_client import SportClient

# Log all executions to a dataset
robot = ShadowDance(
    SportClient(),
    run_type="tool",
    log_to_dataset="robot-tasks",  # Creates dataset automatically
)

# Every command is logged as an example
robot.StandUp()        # ✓ Logged with inputs, outputs, success
robot.Move(0.3, 0, 0)  # ✓ Logged with duration, result

In LangSmith:

  1. Go to Datasets & Experiments tab
  2. Find robot-tasks dataset with all executions
  3. Create experiments to compare robot versions
  4. Run regression tests on code changes

Example: Evaluate robot configurations

python examples/robot_evaluation.py

This creates datasets (robot-eval-v1, robot-eval-v2) and compares task success rates across configurations.
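
The comparison can be pictured as computing one success rate per dataset. The sketch below does this over in-memory records. The record shape (one success boolean per execution) and the sample values are made up for illustration; they are not output from examples/robot_evaluation.py.

```python
from typing import Iterable, Mapping


def success_rate(records: Iterable[Mapping]) -> float:
    """Fraction of records whose `success` field is truthy (0.0 for no records)."""
    records = list(records)
    if not records:
        return 0.0
    return sum(1 for r in records if r.get("success")) / len(records)


def compare(datasets: Mapping[str, list]) -> dict:
    """Map each dataset name to its success rate."""
    return {name: success_rate(rows) for name, rows in datasets.items()}


# Made-up sample data, keyed by the dataset names mentioned above.
sample = {
    "robot-eval-v1": [
        {"task": "stand", "success": True},
        {"task": "walk", "success": False},
    ],
    "robot-eval-v2": [
        {"task": "stand", "success": True},
        {"task": "walk", "success": True},
    ],
}
rates = compare(sample)
```

In a real evaluation the records would come from the logged LangSmith examples and not from a hard-coded dictionary.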

Example output in LangSmith

Run: robot_session
  ├── Move(vx=0.3, vy=0, vyaw=0)        12ms  ✓
  ├── StandUp()                          8ms  ✓
  ├── Move(vx=0, vy=0.3, vyaw=0)        11ms  ✓
  └── Damp()                             9ms  ✓

View your traces at smith.langchain.com

Testing

# Run unit tests
python test_shadowdance.py

# Run with virtual robot
python examples/with_virtual_robot.py

API

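ShadowDance(client)

Wraps a client object with LangSmith tracing.

Args:

  • client: The object to wrap, such as a Unitree SDK client or an OpenAI client
  • run_type (optional keyword): LangSmith run type for the traced calls, e.g. "tool", "llm", or "chain" (see Run Types)
  • log_to_dataset (optional keyword): Name of a LangSmith dataset to log executions to (see Datasets & Experiments)

Returns:

  • A proxy object that intercepts all method calls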

Example:

wrapped = ShadowDance(client)
wrapped.Move(0.3, 0, 0)  # Traced as "Move" in LangSmith
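
To make the proxy idea concrete, here is a minimal sketch of a wrapper that intercepts method calls and records the name, inputs, output or error, and duration of each one. It is not ShadowDance's implementation: TracingProxy and its in-memory events list are hypothetical, and a real wrapper would send each record to LangSmith instead of appending it to a list.

```python
import time


class TracingProxy:
    """Forward attribute access to `target`, recording each method call."""

    def __init__(self, target, events=None, prefix=""):
        self._target = target
        self._events = [] if events is None else events
        self._prefix = prefix

    @property
    def events(self):
        return self._events

    def __getattr__(self, name):
        # Only invoked for names that are not found on the proxy itself.
        attr = getattr(self._target, name)
        path = f"{self._prefix}{name}"
        if callable(attr):
            return self._wrap(path, attr)
        if hasattr(attr, "__dict__"):
            # Nested objects (e.g. llm.chat.completions) share one event list.
            return TracingProxy(attr, self._events, prefix=f"{path}.")
        return attr

    def _wrap(self, path, func):
        def traced(*args, **kwargs):
            event = {"name": path, "args": args, "kwargs": kwargs}
            start = time.perf_counter()
            try:
                event["output"] = func(*args, **kwargs)
                return event["output"]
            except Exception as exc:
                event["error"] = repr(exc)
                raise  # the caller still sees the original exception
            finally:
                event["duration_s"] = time.perf_counter() - start
                self._events.append(event)

        return traced
```

Wrapping nested attributes is what lets a single wrapper cover both flat robot clients (Move, StandUp) and deeper client objects such as llm.chat.completions.create.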

File structure

./shadowdance.py              # Main implementation
./test_shadowdance.py         # Unit tests
./examples/
├── basic.py                  # Basic usage
├── error_handling.py         # Error handling demo
├── virtual_robot.py          # Virtual robot server
└── with_virtual_robot.py     # Virtual robot + LangSmith demo
./pyproject.toml              # Package configuration
./requirements.txt            # Dependencies
./.env                        # LangSmith credentials (gitignored)

Why

The Unitree SDK has no logging and no observability, so there is no way to know why your robot did what it did. LangSmith provides that record, and this wrapper connects the two with one line of code.
