LangChain integration for Anakin API - web scraping, AI search, and intelligent data extraction

langchain-anakin

LangChain integration for Anakin AI - powerful web scraping, AI-powered search, and intelligent data extraction tools for your LangChain applications.

Features

  • 🔧 Three Powerful LangChain Tools:
    • AnakinScrapeTool - Scrape websites and extract structured data
    • AnakinSearchTool - AI-powered search with Perplexity AI
    • AnakinAgenticSearchTool - Advanced multi-stage AI pipeline for comprehensive research
  • 🔐 Simple authentication with API key
  • 📦 Easy integration with LangChain agents and chains
  • ⚡ Asynchronous operations with automatic polling
  • 🎯 Type-safe with Pydantic models
  • 📖 Comprehensive documentation and examples

Installation

pip install langchain-anakin

Quick Start

1. Get Your API Key

Sign up at Anakin.io to get your API key.
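
Rather than hardcoding the key in source, one option is to read it from an environment variable. The sketch below is a minimal example, not part of the package: `load_api_key` and the variable name `ANAKIN_API_KEY` are hypothetical choices made for illustration.

```python
import os


def load_api_key(env=None, var_name="ANAKIN_API_KEY"):
    """Return the API key stored in an environment variable.

    ANAKIN_API_KEY is a hypothetical name chosen for this sketch; the tools
    themselves only need the key passed in as `api_key`.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the Anakin tools")
    return key


# Usage (assumes the variable is exported in your shell):
# scrape_tool = AnakinScrapeTool(api_key=load_api_key())
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error.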

2. Use the Tools

from langchain_anakin import AnakinScrapeTool, AnakinSearchTool, AnakinAgenticSearchTool

# Initialize tools with your API key
api_key = "your-anakin-api-key"

scrape_tool = AnakinScrapeTool(api_key=api_key)
search_tool = AnakinSearchTool(api_key=api_key)
agentic_tool = AnakinAgenticSearchTool(api_key=api_key)

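# Use with LangChain agents
# (initialize_agent and agent.run are legacy APIs, deprecated since LangChain 0.1;
# this example assumes an installed version that still provides them)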
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)

tools = [scrape_tool, search_tool, agentic_tool]

agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)

# Ask the agent to use the tools
result = agent.run(
    "Search for the latest AI trends and scrape the top 3 sources"
)

Tools Overview

1. AnakinScrapeTool

Scrape websites and extract content, including HTML, Markdown, and structured data.

from langchain_anakin import AnakinScrapeTool

tool = AnakinScrapeTool(api_key="your-api-key")

# Direct invocation
result = tool.invoke({
    "url": "https://example.com/product",
    "country": "us",
    "force_fresh": False
})

print(result)

Input Parameters:

  • url (required): Website URL to scrape
  • country (optional): Proxy country code (default: "us")
  • force_fresh (optional): Bypass cache (default: False)
  • max_wait_time (optional): Max seconds to wait (default: 300)
  • poll_interval (optional): Seconds between checks (default: 3)
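
The parameters above can be collected and checked before a call is made. The following is a sketch only: `ScrapeRequest` is a hypothetical helper that mirrors the documented defaults, and it is not the package's own Pydantic schema.

```python
from dataclasses import asdict, dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ScrapeRequest:
    """Hypothetical helper mirroring the documented AnakinScrapeTool inputs."""

    url: str
    country: str = "us"
    force_fresh: bool = False
    max_wait_time: int = 300
    poll_interval: int = 3

    def __post_init__(self):
        # Reject relative or non-HTTP URLs before any request is sent.
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {self.url!r}")
        if self.poll_interval <= 0 or self.max_wait_time < self.poll_interval:
            raise ValueError("Expected 0 < poll_interval <= max_wait_time")

    def to_payload(self):
        """Return a plain dict suitable for tool.invoke(...)."""
        return asdict(self)


payload = ScrapeRequest(url="https://example.com/product", force_fresh=True).to_payload()
# result = tool.invoke(payload)
```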

Use Cases:

  • Extract product information from e-commerce sites
  • Scrape article content for analysis
  • Monitor website changes
  • Gather structured data from web pages

2. AnakinSearchTool

Perform AI-powered searches with instant answers and citations.

from langchain_anakin import AnakinSearchTool

tool = AnakinSearchTool(api_key="your-api-key")

# Direct invocation
result = tool.invoke({
    "query": "What are the latest developments in quantum computing?",
    "max_results": 5
})

print(result)

Input Parameters:

  • query (required): Search question or query
  • max_results (optional): Maximum results to return (default: 5)

Use Cases:

  • Research and fact-checking
  • Competitive intelligence
  • Real-time information gathering
  • Content research
  • Answer complex questions

3. AnakinAgenticSearchTool

Advanced multi-stage AI pipeline that automatically searches, scrapes, and extracts structured data.

from langchain_anakin import AnakinAgenticSearchTool

tool = AnakinAgenticSearchTool(api_key="your-api-key")

# Direct invocation
result = tool.invoke({
    "prompt": "Find pricing information for top 5 project management tools",
    "use_browser": True
})

print(result)

Input Parameters:

  • prompt (required): Research prompt or question
  • use_browser (optional): Use browser for scraping (default: True)
  • max_wait_time (optional): Max seconds to wait (default: 600)
  • poll_interval (optional): Seconds between checks (default: 5)

Use Cases:

  • Comprehensive market research
  • Competitive analysis with structured data
  • Lead generation with enrichment
  • Automated data collection for reports
  • Multi-source information synthesis

Advanced Examples

Example 1: Web Research Agent

from langchain_anakin import AnakinSearchTool, AnakinScrapeTool
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

api_key = "your-anakin-api-key"

# Initialize tools
search_tool = AnakinSearchTool(api_key=api_key)
scrape_tool = AnakinScrapeTool(api_key=api_key)

# Create agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = initialize_agent(
    tools=[search_tool, scrape_tool],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)

# Run research task
result = agent.run(
    "Research the top 3 AI companies in 2026 and provide detailed information about each"
)

Example 2: Market Research with Agentic Search

from langchain_anakin import AnakinAgenticSearchTool
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType

# Initialize tool
agentic_tool = AnakinAgenticSearchTool(api_key="your-anakin-api-key")

# Create specialized agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = initialize_agent(
    tools=[agentic_tool],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)

# Perform comprehensive research
result = agent.run(
    "Compare pricing, features, and user reviews of Salesforce, HubSpot, and Zoho CRM"
)

Example 3: Custom Chain with Multiple Tools

from langchain_anakin import (
    AnakinScrapeTool,
    AnakinSearchTool,
    AnakinAgenticSearchTool
)

# Initialize all tools
api_key = "your-anakin-api-key"
scrape = AnakinScrapeTool(api_key=api_key)
search = AnakinSearchTool(api_key=api_key)
agentic = AnakinAgenticSearchTool(api_key=api_key)

# Use tools in a custom workflow
# 1. Search for topics
search_result = search.invoke({
    "query": "Best AI tools for developers in 2026"
})

# 2. Extract URLs from search and scrape them
# (In a real scenario, you'd parse URLs from search_result)
scrape_result = scrape.invoke({
    "url": "https://example.com/ai-tools"
})

# 3. Do comprehensive analysis
agentic_result = agentic.invoke({
    "prompt": "Analyze features and pricing of the AI tools mentioned"
})
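
Step 2 above leaves URL parsing to the reader. A minimal sketch of that step follows, assuming the search result can be converted to text that contains plain `http(s)` links; the actual result format is not documented here, and `extract_urls` is a hypothetical helper.

```python
import re

# Matches http(s) URLs up to the first whitespace, quote, or closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text, limit=3):
    """Return up to `limit` distinct URLs found in `text`, in order of appearance."""
    # Trailing sentence punctuation is stripped from each match.
    urls = [match.rstrip(".,;:") for match in URL_PATTERN.findall(text)]
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(urls))[:limit]


# Usage with the tools from the example above:
# for url in extract_urls(str(search_result)):
#     scrape_result = scrape.invoke({"url": url})
```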

Configuration

Custom Base URL

If you're using a self-hosted or custom Anakin API endpoint:

from langchain_anakin import AnakinScrapeTool

tool = AnakinScrapeTool(
    api_key="your-api-key",
    base_url="https://your-custom-api.com"
)

Tool Timeouts and Polling

Adjust timeouts and polling intervals for long-running operations:

# For scraping with custom timeout
result = scrape_tool.invoke({
    "url": "https://example.com",
    "max_wait_time": 600,  # 10 minutes
    "poll_interval": 5  # Check every 5 seconds
})

# For agentic search with custom settings
result = agentic_tool.invoke({
    "prompt": "Research prompt",
    "max_wait_time": 900,  # 15 minutes
    "poll_interval": 10  # Check every 10 seconds
})
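
To illustrate how `max_wait_time` and `poll_interval` interact, here is a generic polling loop. It is a sketch of the general pattern, not the package's internal implementation; `check_status` is a hypothetical callable standing in for a job-status request.

```python
import time


def poll_until_done(check_status, max_wait_time=300, poll_interval=3,
                    clock=time.monotonic, sleep=time.sleep):
    """Call `check_status()` until it returns a non-None result or time runs out.

    `check_status` is a hypothetical stand-in for a status request; it should
    return None while the job is still running.
    """
    deadline = clock() + max_wait_time
    while True:
        result = check_status()
        if result is not None:
            return result
        # Stop if the next check would land after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"Job not finished after {max_wait_time} seconds")
        sleep(poll_interval)


# Simulated job that completes on the third status check.
statuses = iter([None, None, {"status": "completed"}])
outcome = poll_until_done(lambda: next(statuses), max_wait_time=10,
                          poll_interval=1, sleep=lambda seconds: None)
```

Under this pattern, a shorter `poll_interval` notices completion sooner at the cost of more status requests, while `max_wait_time` caps the total time spent waiting.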

Error Handling

All tools are designed to report failures with descriptive error messages. Wrapping calls in try/except also catches anything that is raised as an exception:

from langchain_anakin import AnakinScrapeTool

tool = AnakinScrapeTool(api_key="your-api-key")

try:
    result = tool.invoke({"url": "https://invalid-url"})
except Exception as e:
    print(f"Error: {e}")
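
For transient failures such as network timeouts, a retry wrapper can be layered on top of the try/except pattern above. This is a minimal sketch: `invoke_with_retry` is a hypothetical helper, and retrying on every exception type is a simplification you would likely narrow in practice.

```python
import time


def invoke_with_retry(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run `call()` and retry on any exception, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # Re-raise the last error once all attempts are used up.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Usage with the tool from the example above:
# result = invoke_with_retry(lambda: tool.invoke({"url": "https://example.com"}))
```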

API Reference

AnakinScrapeTool

class AnakinScrapeTool(BaseTool):
    """Tool for scraping websites using the Anakin API."""
    
    name: str = "anakin_scrape_url"
    api_key: str  # Required
    base_url: str = "https://api.anakin.io"  # Optional

AnakinSearchTool

class AnakinSearchTool(BaseTool):
    """Tool for AI-powered searches using the Anakin API."""
    
    name: str = "anakin_ai_search"
    api_key: str  # Required
    base_url: str = "https://api.anakin.io"  # Optional

AnakinAgenticSearchTool

class AnakinAgenticSearchTool(BaseTool):
    """Tool for advanced agentic searches using the Anakin API."""
    
    name: str = "anakin_agentic_search"
    api_key: str  # Required
    base_url: str = "https://api.anakin.io"  # Optional

Development

Set Up Development Environment

# Clone the repository
git clone https://github.com/Anakin-Inc/langchain-anakin.git
cd langchain-anakin

# Install dependencies
pip install -r requirements-dev.txt

# Install package in editable mode
pip install -e .

Running Tests

pytest

Code Formatting

black langchain_anakin/
ruff check langchain_anakin/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

MIT License - see the LICENSE file for details.

Changelog

0.1.0 (2026-01-23)

  • Initial release
  • Added AnakinScrapeTool for web scraping
  • Added AnakinSearchTool for AI-powered search
  • Added AnakinAgenticSearchTool for agentic research
  • Full LangChain integration with BaseTool
  • Comprehensive documentation and examples
