An integration package connecting Nimble and LangChain

langchain-nimble

Production-grade LangChain integration for Nimble's Web Search & Content Extraction API

Requires Python 3.10+. License: MIT.

langchain-nimble provides powerful web search and content extraction capabilities for LangChain applications. Built on the official Nimble Python SDK, it offers both retrievers and tools for seamless integration with LangChain agents and chains.

Features

  • Dual Interface: Retrievers for chains, Tools for agents
  • 🔍 Search Depth Levels: lite (metadata), fast (Enterprise), deep (full content)
  • 🤖 LLM Answers: Optional AI-generated answer summaries
  • 🎯 Focus Modes: Specialized search (general, news, location, shopping, geo, social)
  • 🛍️ AI-Powered WSA: Web Search Agents for shopping, geo, and social media
  • Time Range Filtering: Quick recency filters (hour, day, week, month, year)
  • 📅 Date Filtering: Search by specific date ranges
  • 🌐 Domain Control: Include/exclude specific domains
  • Full Async Support: Both sync and async implementations
  • 🔄 Smart Retry Logic: Built-in retry via Nimble SDK
  • 📊 Markdown Output: Clean markdown content from any page

Installation

pip install -U langchain-nimble

Quick Start

1. Get Your API Key

Sign up at Nimbleway to get your API key.

2. Set Environment Variable

export NIMBLE_API_KEY="your-api-key-here"

Or pass it directly: NimbleSearchRetriever(api_key="your-key")
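
The two ways of supplying the key can be sketched as a small lookup. The helper name `resolve_nimble_api_key` is hypothetical and not part of langchain-nimble, and the precedence shown (explicit argument first, then the environment variable) is an assumption made for illustration.

```python
import os
from typing import Optional


def resolve_nimble_api_key(explicit_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else read NIMBLE_API_KEY."""
    key = explicit_key or os.environ.get("NIMBLE_API_KEY")
    if not key:
        raise RuntimeError(
            "No Nimble API key found: pass one explicitly or set NIMBLE_API_KEY."
        )
    return key
```

Failing early with a clear message like this can be easier to debug than an authentication error surfacing on the first search call.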

3. Basic Usage

from langchain_nimble import NimbleSearchRetriever

# Create a retriever
retriever = NimbleSearchRetriever(max_results=5)

# Search (sync or async with ainvoke)
documents = retriever.invoke("latest developments in AI")

for doc in documents:
    print(f"{doc.metadata['title']}\n{doc.metadata['url']}\n")

Retrievers

Retrievers return LangChain Document objects, ideal for RAG pipelines and chains.

NimbleSearchRetriever

Basic Search

from langchain_nimble import NimbleSearchRetriever

# Lite search - returns metadata only (default)
retriever = NimbleSearchRetriever(
    max_results=5,
    search_depth="lite"
)
docs = retriever.invoke("Python best practices 2024")

Deep Search

Fetch full page content from each result:

retriever = NimbleSearchRetriever(
    max_results=3,
    search_depth="deep"  # Full page content extraction
)
docs = retriever.invoke("comprehensive guide to FastAPI")

Advanced Filtering

# Domain filtering
retriever = NimbleSearchRetriever(
    max_results=5,
    include_domains=["python.org", "docs.python.org"],
    exclude_domains=["pinterest.com"]
)

# Date filtering
retriever = NimbleSearchRetriever(
    max_results=10,
    start_date="2024-01-01",
    end_date="2024-12-31",
    focus="news"
)

# Time range filtering
recent_retriever = NimbleSearchRetriever(
    time_range="week"  # hour, day, week, month, year
)

# Focus-based search
news_retriever = NimbleSearchRetriever(focus="news")
location_retriever = NimbleSearchRetriever(focus="location")
shopping_retriever = NimbleSearchRetriever(focus="shopping")  # AI-powered WSA

LLM Answer Generation

Get AI-generated answers:

retriever = NimbleSearchRetriever(
    max_results=5,
    include_answer=True
)
docs = retriever.invoke("What is the capital of France?")

# First doc contains the LLM answer if available
if docs and docs[0].metadata.get("entity_type") == "answer":
    print(f"Answer: {docs[0].page_content}")
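
When the answer and the source documents need to be handled separately, the check above can be wrapped in a small function. This is a minimal sketch: `Doc` is a stand-in for LangChain's Document (it only mirrors the `page_content` and `metadata` attributes), and `split_answer` is a hypothetical helper, not part of the package.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document: page_content plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_answer(docs):
    """Return (answer_text_or_None, remaining_docs)."""
    if docs and docs[0].metadata.get("entity_type") == "answer":
        return docs[0].page_content, list(docs[1:])
    return None, list(docs)


docs = [
    Doc("Paris is the capital of France.", {"entity_type": "answer"}),
    Doc("France overview...", {"entity_type": "organic", "url": "https://example.com"}),
]
answer, sources = split_answer(docs)
```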

NimbleExtractRetriever

Extract content from specific URLs:

from langchain_nimble import NimbleExtractRetriever

retriever = NimbleExtractRetriever()
docs = retriever.invoke("https://www.python.org/about/")

# With render wait for dynamic content
retriever = NimbleExtractRetriever(
    driver="vx8",      # Optional: vx6, vx8, vx8-pro, vx10, vx10-pro, vx12, vx12-pro
    wait=3000,         # Wait for dynamic content (ms) - uses browser_actions
)

Tools for Agents

Tools provide structured input schemas for agent integration.

NimbleSearchTool

from langchain_nimble import NimbleSearchTool
from langchain.agents import create_agent

# Create agent with search tool
search_tool = NimbleSearchTool()
agent = create_agent(
    model="claude-haiku-4-5",
    tools=[search_tool]
)

# Agent searches the web
response = agent.invoke({
    "messages": [{"role": "user", "content": "What are the latest developments in quantum computing?"}]
})

NimbleExtractTool

from langchain_nimble import NimbleExtractTool

extract_tool = NimbleExtractTool()

# Extract a URL - returns markdown string
result = extract_tool.invoke({
    "url": "https://www.langchain.com/"
})

Multi-Tool Agent

from langchain_nimble import NimbleSearchTool, NimbleExtractTool
from langchain.agents import create_agent

search_tool = NimbleSearchTool()
extract_tool = NimbleExtractTool()

agent = create_agent(
    model="claude-haiku-4-5",
    tools=[search_tool, extract_tool]
)

# Agent can search, then extract specific URLs
response = agent.invoke({
    "messages": [{"role": "user", "content": "Find recent LangChain articles and summarize the top one"}]
})

Parameter Reference

Search Parameters (NimbleSearchRetriever & NimbleSearchTool)

Parameter        Type        Default    Description
api_key          str | None  None       API key (or set NIMBLE_API_KEY)
max_results      int         3 / 10*    Number of results (1-100). Alias: num_results
focus            str         "general"  Search focus mode
search_depth     str         "lite"     Search depth: lite, fast (Enterprise), deep
include_answer   bool        False      LLM answer summary
time_range       str         None       Recency filter: hour, day, week, month, year
include_domains  list[str]   None       Domain whitelist
exclude_domains  list[str]   None       Domain blacklist
start_date       str         None       Filter after date (YYYY-MM-DD or YYYY)
end_date         str         None       Filter before date (YYYY-MM-DD or YYYY)
locale           str         "en"       Language/locale (e.g., fr, es)
country          str         "US"       Country code (e.g., UK, FR)

* The max_results default differs: the retriever uses max_results=3, the tool uses max_results=10. Both default to search_depth="lite".
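
The allowed values listed above can be checked before a retriever is constructed. The following is a sketch only: `build_search_kwargs` is a hypothetical helper, and whether the package performs similar validation itself is not stated here.

```python
ALLOWED_DEPTHS = {"lite", "fast", "deep"}
ALLOWED_TIME_RANGES = {"hour", "day", "week", "month", "year"}


def build_search_kwargs(max_results=3, search_depth="lite", time_range=None):
    """Hypothetical helper: validate values against the parameter table."""
    if not 1 <= max_results <= 100:
        raise ValueError(f"max_results must be in 1-100, got {max_results}")
    if search_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"unknown search_depth: {search_depth!r}")
    if time_range is not None and time_range not in ALLOWED_TIME_RANGES:
        raise ValueError(f"unknown time_range: {time_range!r}")
    kwargs = {"max_results": max_results, "search_depth": search_depth}
    if time_range is not None:
        kwargs["time_range"] = time_range
    return kwargs


kwargs = build_search_kwargs(max_results=5, search_depth="deep", time_range="week")
# The dict could then be unpacked, e.g. NimbleSearchRetriever(**kwargs).
```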

Extract Parameters (NimbleExtractRetriever)

Parameter  Type        Default  Description
api_key    str | None  None     API key (or set NIMBLE_API_KEY)
driver     str | None  None     Browser driver: vx6, vx8, vx8-pro, vx10, vx10-pro, vx12, vx12-pro
wait       int | None  None     Render wait in milliseconds (uses browser_actions)
locale     str         "en"     Language/locale
country    str         "US"     Country code

NimbleExtractTool

The extract tool accepts a single url parameter and returns the page content as a markdown string.

Response Formats

Document Structure (Retrievers)

Document(
    page_content="Full content...",
    metadata={
        "title": "Page Title",
        "url": "https://example.com",
        "description": "Page description...",
        "position": 1,
        "entity_type": "organic"  # or "answer"
    }
)
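
For RAG prompts, documents with this structure are often flattened into a single context string. The sketch below assumes the metadata keys shown above. `Doc` is a stand-in for LangChain's Document and `format_context` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document: page_content plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(docs) -> str:
    """Join documents into numbered blocks: header line, then content."""
    blocks = []
    for doc in docs:
        meta = doc.metadata
        header = (
            f"[{meta.get('position', '?')}] "
            f"{meta.get('title', 'Untitled')} ({meta.get('url', 'no url')})"
        )
        blocks.append(f"{header}\n{doc.page_content}")
    return "\n\n".join(blocks)


context = format_context([
    Doc(
        "Full content...",
        {"title": "Page Title", "url": "https://example.com", "position": 1},
    )
])
```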

Search Tool Response (JSON)

{
    "results": [
        {
            "title": "Title",
            "url": "https://...",
            "description": "...",
            "content": "Full content...",
            "metadata": {
                "position": 1,
                "entity_type": "organic"
            }
        }
    ]
}
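
A response of this shape can be reduced to the fields an application needs. This sketch assumes the tool output arrives as a JSON string matching the structure above. If it is already a dict, the `json.loads` step can be skipped. `summarize_results` is a hypothetical helper.

```python
import json

# Sample payload mirroring the structure shown above.
raw = json.dumps({
    "results": [
        {
            "title": "Title",
            "url": "https://example.com",
            "description": "...",
            "content": "Full content...",
            "metadata": {"position": 1, "entity_type": "organic"},
        }
    ]
})


def summarize_results(raw_json: str) -> list:
    """Return (position, title, url) tuples for each result."""
    payload = json.loads(raw_json)
    return [
        (item["metadata"]["position"], item["title"], item["url"])
        for item in payload.get("results", [])
    ]


summary = summarize_results(raw)
```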

Extract Tool Response (String)

The extract tool returns a markdown string directly.

Best Practices

Search Depth Levels

Use search_depth="deep" for:

  • RAG applications needing full context
  • Content analysis and summarization
  • In-depth research tasks

Use search_depth="lite" (default) for:

  • Quick lookups
  • Getting lists of URLs
  • When you'll extract specific URLs later

Use search_depth="fast" (Enterprise only) for:

  • Production workloads needing rich content at low latency

Tools vs. Retrievers

  • Retrievers: Use in chains, RAG pipelines, vector store integration
  • Tools: Use with agents that need dynamic search control

Filtering Tips

  • Academic research: include_domains=["edu", "scholar.google.com"]
  • Documentation: include_domains=["docs.python.org", "readthedocs.io"]
  • Remove noise: exclude_domains=["pinterest.com", "facebook.com"]
  • Recent news: start_date="2024-01-01", focus="news"
  • Historical: start_date="2020", end_date="2021"
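
Since start_date and end_date accept either YYYY-MM-DD or YYYY, a quick format check can catch typos before a request is sent. This is a sketch with a hypothetical helper, `is_valid_date_filter`. How the API itself handles malformed dates is not covered here.

```python
import re
from datetime import datetime

# Exactly four digits, optionally followed by -MM-DD.
_DATE_RE = re.compile(r"\d{4}(-\d{2}-\d{2})?")


def is_valid_date_filter(value: str) -> bool:
    """Accept 'YYYY' or a real calendar date in 'YYYY-MM-DD' form."""
    if not _DATE_RE.fullmatch(value):
        return False
    if len(value) == 4:
        return True
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True
```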

Error Handling

The SDK handles retries automatically. For custom error handling:

from langchain_nimble import NimbleSearchRetriever

retriever = NimbleSearchRetriever()

try:
    docs = retriever.invoke("query")
except ValueError as e:
    print(f"API error: {e}")
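
In a pipeline it can be convenient to fall back to an empty result list instead of letting the exception propagate. The sketch below follows the ValueError handling shown above. `FailingRetriever` is a stand-in used only to simulate a failure, and `safe_invoke` is a hypothetical helper.

```python
class FailingRetriever:
    """Stand-in retriever whose invoke always fails, for demonstration."""

    def invoke(self, query):
        raise ValueError("simulated API failure")


def safe_invoke(retriever, query):
    """Return the retriever's documents, or an empty list on ValueError."""
    try:
        return retriever.invoke(query)
    except ValueError as exc:
        print(f"Search failed for {query!r}: {exc}")
        return []


docs = safe_invoke(FailingRetriever(), "query")
```

Returning an empty list keeps downstream code simple, at the cost of hiding the failure from callers. Logging the error, as done here with print, preserves at least a trace.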

Performance Tips

  1. Use async (ainvoke) for concurrent requests
  2. Request only needed results (max_results)
  3. Let the API auto-select the driver, or use lower driver levels (vx6/vx8) unless advanced rendering is needed
  4. Avoid the wait parameter for static content
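
The first tip can be sketched with asyncio.gather, which runs several ainvoke calls concurrently and returns their results in input order. `FakeRetriever` is a stand-in that simulates latency so the example runs without network access. A real retriever exposing ainvoke would be passed in its place.

```python
import asyncio


class FakeRetriever:
    """Stand-in exposing the same async entry point (ainvoke)."""

    async def ainvoke(self, query: str) -> list:
        await asyncio.sleep(0.01)  # simulate network latency
        return [f"result for {query}"]


async def search_all(retriever, queries):
    """Run all queries concurrently; results keep the order of queries."""
    return await asyncio.gather(*(retriever.ainvoke(q) for q in queries))


results = asyncio.run(search_all(FakeRetriever(), ["a", "b", "c"]))
```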

Contributing

Contributions welcome! Please submit Pull Requests.

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/name)
  3. Commit changes (git commit -m 'Add feature')
  4. Push branch (git push origin feature/name)
  5. Open Pull Request

License

MIT License - see LICENSE file for details.
