
LangChain integration for SHIP Protocol - AI coding reliability metrics

Project description

ship-langchain

LangChain Integration for SHIP Protocol - AI Coding Reliability Tools

Know if your AI coding task will succeed before you run it.

Why SHIP + LangChain?

70% of AI coding tasks fail. SHIP tells your agent the probability of success before execution.

from ship_langchain import SHIPAssessTool
from langchain.agents import create_react_agent

# Add SHIP assessment to your agent's toolkit
tools = [SHIPAssessTool(), *other_tools]  # other_tools: your existing tool list

# Now your agent can check reliability before modifying code
# Agent: "Let me assess this task first..."
# SHIP: "Score: 85 (A) - High confidence this will work"

Installation

pip install ship-langchain

Quick Start

As a Tool

from ship_langchain import SHIPAssessTool, SHIPFeedbackTool
from langchain.agents import AgentExecutor, create_react_agent
from langchain_openai import ChatOpenAI

# Create tools
ship_assess = SHIPAssessTool()
ship_feedback = SHIPFeedbackTool()

# Add to your agent
tools = [ship_assess, ship_feedback]

# The agent can now:
# 1. Assess code before modification
# 2. Submit feedback after task completion

As a Chain

from ship_langchain import SHIPAssessmentChain

chain = SHIPAssessmentChain()

result = await chain.ainvoke({
    "code": "def hello(): print('world')",
    "prompt": "Add type hints and docstring",
    "language": "python",
})

print(f"Score: {result.score} ({result.grade})")
print(f"Confidence: {result.confidence}")
print(f"Recommendations: {result.recommendations}")

With Callbacks

from ship_langchain import SHIPCallbackHandler

# Track all SHIP metrics automatically
handler = SHIPCallbackHandler()

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[handler],
)

# Run your agent
result = agent_executor.invoke({"input": "Refactor this code..."})

# Get SHIP summary
summary = handler.get_summary()
print(f"Average SHIP Score: {summary['average_score']}")
print(f"Assessments: {summary['total_assessments']}")
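The summary fields above can be sketched as a plain aggregator. This is an illustration of what `get_summary()` reports, not its implementation; the real handler collects scores from callback events rather than from a list.

```python
# Illustrative aggregator mirroring the get_summary() fields shown above.
# The real SHIPCallbackHandler gathers scores from callback events.
def summarize(scores):
    return {
        "total_assessments": len(scores),
        "average_score": sum(scores) / len(scores) if scores else 0.0,
    }

summary = summarize([80, 90])
# summary == {"total_assessments": 2, "average_score": 85.0}
```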

Tools Reference

SHIPAssessTool

Assess code reliability before AI modification.

Input:

  • code: The code to assess
  • prompt: Task description
  • language: Programming language (default: "python")
  • file_path: Virtual file path (default: "main.py")

Output:

  • ship_score: 0-100 reliability score
  • grade: Letter grade (A+ to F)
  • confidence, focus, context, efficiency: Component scores
  • recommendations: Top improvement suggestions
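As a sketch of the input schema listed above: the field names and defaults come from this reference, but the helper function itself is hypothetical, not part of ship_langchain.

```python
# Hypothetical helper illustrating the SHIPAssessTool input fields and
# their documented defaults; not part of the ship_langchain API.
def build_assess_input(code, prompt, language="python", file_path="main.py"):
    """Build an assessment payload with the documented defaults."""
    if not code or not prompt:
        raise ValueError("code and prompt are required")
    return {
        "code": code,
        "prompt": prompt,
        "language": language,
        "file_path": file_path,
    }

payload = build_assess_input("def hello(): pass", "Add a docstring")
# payload uses the defaults: language "python", file_path "main.py"
```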

SHIPFeedbackTool

Submit feedback on task outcomes to improve future predictions.

Input:

  • request_id: From assessment response
  • task_completed: Whether task succeeded
  • first_attempt_success: Success on first try
  • total_attempts: Number of attempts needed
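A sketch of assembling the feedback payload from the fields above. The field names come from this reference; the helper and its consistency check are illustrative, not part of the API.

```python
# Hypothetical helper for the SHIPFeedbackTool input fields listed above.
# The consistency check is illustrative; the real tool may accept any values.
def build_feedback(request_id, task_completed, first_attempt_success,
                   total_attempts=1):
    if first_attempt_success and total_attempts != 1:
        raise ValueError("first-attempt success implies exactly one attempt")
    return {
        "request_id": request_id,
        "task_completed": task_completed,
        "first_attempt_success": first_attempt_success,
        "total_attempts": total_attempts,
    }

fb = build_feedback("req-123", task_completed=True, first_attempt_success=True)
```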

SHIPQuickAssessTool

Fast assessment with minimal input.

Input:

  • code: Code snippet
  • prompt: Task description

Output:

  • Simple "SHIP Score: X (Grade)" string

SHIPHealthTool

Check API availability before batch operations.
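The gating pattern this enables can be sketched generically. `probe` stands in for a SHIPHealthTool call (whose exact return format this page does not specify) and `assess` for a per-item assessment; both are assumptions for illustration.

```python
# Generic health-gated batch: probe once up front, skip the whole batch
# if the API is down. probe/assess are stand-ins for the SHIP tools.
def run_batch(items, probe, assess):
    if not probe():  # one cheap availability check before N assessments
        return [{"item": i, "status": "skipped"} for i in items]
    return [{"item": i, "status": "assessed", "result": assess(i)}
            for i in items]

results = run_batch(["a.py", "b.py"], probe=lambda: False, assess=len)
# every item is marked "skipped" because the probe reported the API down
```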

Philosophy

This integration follows Talebian/Antifragile principles:

  1. Never Crash: Tools return degraded responses instead of exceptions
  2. Self-Learning: Feedback loop improves predictions over time
  3. Observable: Callback handler tracks all metrics
  4. Composable: Works with any LLM and agent setup
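The "Never Crash" principle can be sketched as a wrapper that converts exceptions into a degraded response. The response shape here is illustrative; ship_langchain's actual degraded format may differ.

```python
# Sketch of the "never crash" idea: failures become a degraded response
# instead of an exception. The dict shape is illustrative only.
def never_crash(fn, *args, **kwargs):
    try:
        return {"ok": True, "result": fn(*args, **kwargs)}
    except Exception as exc:
        return {"ok": False, "result": None, "error": str(exc)}

resp = never_crash(lambda: 1 / 0)
# resp == {"ok": False, "result": None, "error": "division by zero"}
```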

Grade Interpretation

Grade  Score   What It Means
A+     95-100  95%+ success rate - Ship with confidence
A      85-94   85%+ success rate - Reliable
B      70-84   70%+ success rate - Good, minor risks
C      50-69   50%+ success rate - Proceed with caution
D      30-49   30%+ success rate - High risk
F      0-29    <30% success rate - Likely to fail
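The table above can be written as a small lookup, useful when gating agent behavior on the score. The thresholds come straight from the table; the function name is ours.

```python
# Score-to-grade mapping taken directly from the grade table above.
def grade_for_score(score):
    if score >= 95:
        return "A+"
    if score >= 85:
        return "A"
    if score >= 70:
        return "B"
    if score >= 50:
        return "C"
    if score >= 30:
        return "D"
    return "F"

grade_for_score(72)  # "B", matching the workflow example below
```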

Example Agent Workflow

# 1. Agent receives coding task
"Add error handling to the payment processor"

# 2. Agent uses SHIPAssessTool first
ship_assess.run({
    "code": payment_processor_code,
    "prompt": "Add error handling",
})
# Result: Score 72 (B) - "Missing type information reduces confidence"

# 3. Agent decides based on score
if score >= 70:
    # Proceed with modification
    ...
else:
    # Request more context or simplify task
    ...

# 4. After completion, agent submits feedback
ship_feedback.run({
    "request_id": "req-123",
    "task_completed": True,
    "first_attempt_success": True,
})

License

MIT License - see LICENSE for details.


Built with love by VibeAtlas - Making AI coding reliable.

Project details


Download files

Download the file for your platform.

Source Distribution

ship_langchain-0.1.0b2.tar.gz (14.7 kB)

Uploaded Source

Built Distribution


ship_langchain-0.1.0b2-py3-none-any.whl (12.5 kB)

Uploaded Python 3

File details

Details for the file ship_langchain-0.1.0b2.tar.gz.

File metadata

  • Download URL: ship_langchain-0.1.0b2.tar.gz
  • Size: 14.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for ship_langchain-0.1.0b2.tar.gz
Algorithm Hash digest
SHA256 c5dfc397c6c150290bff715f2b6aa0fc478ec321840ae8e1f5cb1e5591c6b19a
MD5 f00826e15898fba579c7ef4fbd77211d
BLAKE2b-256 b376e96311bf153181ac31a5c417625c70dd0d53454e618acfddbf8ad9b69f37


File details

Details for the file ship_langchain-0.1.0b2-py3-none-any.whl.

File hashes

Hashes for ship_langchain-0.1.0b2-py3-none-any.whl
Algorithm Hash digest
SHA256 04ca6174d0c8c64b365dce7ff6eb56240c2a4bde27cee581ed27a0ff5acb61ec
MD5 659bb6993ac980104b497a37be455cc7
BLAKE2b-256 6b7927afc7bdfa775160cd441b0bbcaf4d9b15ff7851d3654cbfa184919f6fc5

