LangChain integration for Anakin API - web scraping, AI search, and intelligent data extraction
langchain-anakin
LangChain integration for Anakin AI - powerful web scraping, AI-powered search, and intelligent data extraction tools for your LangChain applications.
Features
- 🔧 Three Powerful LangChain Tools:
  - AnakinScrapeTool - Scrape websites and extract structured data
  - AnakinSearchTool - AI-powered search with Perplexity AI
  - AnakinAgenticSearchTool - Advanced multi-stage AI pipeline for comprehensive research
- 🔐 Simple authentication with API key
- 📦 Easy integration with LangChain agents and chains
- ⚡ Asynchronous operations with automatic polling
- 🎯 Type-safe with Pydantic models
- 📖 Comprehensive documentation and examples
Installation
pip install langchain-anakin
Quick Start
1. Get Your API Key
Sign up at Anakin.io to get your API key.
2. Use the Tools
from langchain_anakin import AnakinScrapeTool, AnakinSearchTool, AnakinAgenticSearchTool
# Initialize tools with your API key
api_key = "your-anakin-api-key"
scrape_tool = AnakinScrapeTool(api_key=api_key)
search_tool = AnakinSearchTool(api_key=api_key)
agentic_tool = AnakinAgenticSearchTool(api_key=api_key)
# Use with LangChain agents
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(temperature=0)
tools = [scrape_tool, search_tool, agentic_tool]
agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)
# Ask the agent to use the tools
result = agent.run(
    "Search for the latest AI trends and scrape the top 3 sources"
)
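The Quick Start hard-codes the API key for brevity. A minimal sketch of reading it from the environment instead is shown below; the variable name ANAKIN_API_KEY and the helper load_anakin_api_key are assumptions made for illustration and are not defined by langchain-anakin.

```python
import os


def load_anakin_api_key(env_var: str = "ANAKIN_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing."""
    # ANAKIN_API_KEY is an assumed variable name, not one the package defines.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Anakin API key"
        )
    return api_key
```

The returned value can then be passed as the api_key argument, for example AnakinScrapeTool(api_key=load_anakin_api_key()), which keeps the key out of source control.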
Tools Overview
1. AnakinScrapeTool
Scrape websites and extract content including HTML, markdown, and structured data.
from langchain_anakin import AnakinScrapeTool
tool = AnakinScrapeTool(api_key="your-api-key")
# Direct invocation
result = tool.invoke({
    "url": "https://example.com/product",
    "country": "us",
    "force_fresh": False
})
print(result)
Input Parameters:
- url (required): Website URL to scrape
- country (optional): Proxy country code (default: "us")
- force_fresh (optional): Bypass cache (default: False)
- max_wait_time (optional): Max seconds to wait (default: 300)
- poll_interval (optional): Seconds between checks (default: 3)
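When scrape inputs are assembled in several places, it can help to keep the documented defaults in one spot. The sketch below is a hypothetical helper (ScrapeRequest is not part of langchain-anakin) that mirrors the parameters listed above and checks the URL before building the dictionary passed to tool.invoke. The validation rules are assumptions, not the package's own checks.

```python
from dataclasses import asdict, dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ScrapeRequest:
    """Hypothetical container mirroring the documented scrape inputs."""

    url: str
    country: str = "us"          # documented default
    force_fresh: bool = False    # documented default
    max_wait_time: int = 300     # documented default, in seconds
    poll_interval: int = 3       # documented default, in seconds

    def __post_init__(self) -> None:
        # Assumed validation: require an absolute http(s) URL.
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {self.url!r}")
        # Assumed validation: the interval must fit inside the wait budget.
        if self.poll_interval <= 0 or self.max_wait_time < self.poll_interval:
            raise ValueError("Expected 0 < poll_interval <= max_wait_time")

    def to_payload(self) -> dict:
        """Return the inputs as a plain dict suitable for tool.invoke(...)."""
        return asdict(self)
```

A call such as tool.invoke(ScrapeRequest(url="https://example.com/product").to_payload()) would then send the same keys as the example above, with the defaults filled in explicitly.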
Use Cases:
- Extract product information from e-commerce sites
- Scrape article content for analysis
- Monitor website changes
- Gather structured data from web pages
2. AnakinSearchTool
Perform AI-powered searches with instant answers and citations.
from langchain_anakin import AnakinSearchTool
tool = AnakinSearchTool(api_key="your-api-key")
# Direct invocation
result = tool.invoke({
    "query": "What are the latest developments in quantum computing?",
    "max_results": 5
})
print(result)
Input Parameters:
- query (required): Search question or query
- max_results (optional): Maximum results to return (default: 5)
Use Cases:
- Research and fact-checking
- Competitive intelligence
- Real-time information gathering
- Content research
- Answer complex questions
3. AnakinAgenticSearchTool
Advanced multi-stage AI pipeline that automatically searches, scrapes, and extracts structured data.
from langchain_anakin import AnakinAgenticSearchTool
tool = AnakinAgenticSearchTool(api_key="your-api-key")
# Direct invocation
result = tool.invoke({
    "prompt": "Find pricing information for top 5 project management tools",
    "use_browser": True
})
print(result)
Input Parameters:
- prompt (required): Research prompt or question
- use_browser (optional): Use browser for scraping (default: True)
- max_wait_time (optional): Max seconds to wait (default: 600)
- poll_interval (optional): Seconds between checks (default: 5)
Use Cases:
- Comprehensive market research
- Competitive analysis with structured data
- Lead generation with enrichment
- Automated data collection for reports
- Multi-source information synthesis
Advanced Examples
Example 1: Web Research Agent
from langchain_anakin import AnakinSearchTool, AnakinScrapeTool
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType
api_key = "your-anakin-api-key"
# Initialize tools
search_tool = AnakinSearchTool(api_key=api_key)
scrape_tool = AnakinScrapeTool(api_key=api_key)
# Create agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = initialize_agent(
    tools=[search_tool, scrape_tool],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)
# Run research task
result = agent.run(
    "Research the top 3 AI companies in 2026 and provide detailed information about each"
)
Example 2: Market Research with Agentic Search
from langchain_anakin import AnakinAgenticSearchTool
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType
# Initialize tool
agentic_tool = AnakinAgenticSearchTool(api_key="your-anakin-api-key")
# Create specialized agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = initialize_agent(
    tools=[agentic_tool],
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,
    verbose=True
)
# Perform comprehensive research
result = agent.run(
    "Compare pricing, features, and user reviews of Salesforce, HubSpot, and Zoho CRM"
)
Example 3: Custom Chain with Multiple Tools
from langchain_anakin import (
    AnakinScrapeTool,
    AnakinSearchTool,
    AnakinAgenticSearchTool
)
from langchain.chains import LLMChain
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
# Initialize all tools
api_key = "your-anakin-api-key"
scrape = AnakinScrapeTool(api_key=api_key)
search = AnakinSearchTool(api_key=api_key)
agentic = AnakinAgenticSearchTool(api_key=api_key)
# Use tools in a custom workflow
# 1. Search for topics
search_result = search.invoke({
    "query": "Best AI tools for developers in 2026"
})
# 2. Extract URLs from search and scrape them
# (In a real scenario, you'd parse URLs from search_result)
scrape_result = scrape.invoke({
    "url": "https://example.com/ai-tools"
})
# 3. Do comprehensive analysis
agentic_result = agentic.invoke({
    "prompt": "Analyze features and pricing of the AI tools mentioned"
})
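Step 2 above leaves the URL parsing as an exercise. One possible sketch is shown below, assuming the search result can be treated as text that contains plain http(s) links; the actual return format of AnakinSearchTool is not documented here, so str(search_result) and the extract_urls helper are illustrative assumptions only.

```python
import re

# Matches http(s) links up to the next whitespace, quote, or closing bracket.
_URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text: str, limit: int = 3) -> list[str]:
    """Return up to `limit` distinct URLs in order of first appearance."""
    urls: list[str] = []
    for match in _URL_PATTERN.findall(text):
        if len(urls) >= limit:
            break
        # Drop sentence punctuation that often trails a link in prose.
        url = match.rstrip(".,;:!?")
        if url not in urls:
            urls.append(url)
    return urls
```

Each extracted URL could then be passed to scrape.invoke({"url": url}) in a loop, for example over extract_urls(str(search_result)).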
Configuration
Custom Base URL
If you're using a self-hosted or custom Anakin API endpoint:
from langchain_anakin import AnakinScrapeTool
tool = AnakinScrapeTool(
    api_key="your-api-key",
    base_url="https://your-custom-api.com"
)
Tool Timeouts and Polling
Adjust timeouts and polling intervals for long-running operations:
# For scraping with custom timeout
result = scrape_tool.invoke({
    "url": "https://example.com",
    "max_wait_time": 600,  # 10 minutes
    "poll_interval": 5     # Check every 5 seconds
})
# For agentic search with custom settings
result = agentic_tool.invoke({
    "prompt": "Research prompt",
    "max_wait_time": 900,  # 15 minutes
    "poll_interval": 10    # Check every 10 seconds
})
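To make the interaction between max_wait_time and poll_interval concrete, here is a generic polling loop. It is a sketch of the general technique under stated assumptions, not the package's actual implementation: poll_until_done, check, sleep, and clock are hypothetical names, and check stands in for whatever status request the tools make internally.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until_done(
    check: Callable[[], Optional[T]],
    max_wait_time: float = 300,
    poll_interval: float = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `check` until it returns a non-None result or time runs out."""
    deadline = clock() + max_wait_time
    while True:
        result = check()
        if result is not None:
            return result
        # Stop if the next wait would carry us past the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"No result after {max_wait_time} seconds")
        sleep(poll_interval)
```

Under this model, a 600-second budget with a 5-second interval allows roughly 120 status checks, so a larger poll_interval trades responsiveness for fewer requests.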
Error Handling
All tools handle errors gracefully and return descriptive error messages:
from langchain_anakin import AnakinScrapeTool
tool = AnakinScrapeTool(api_key="your-api-key")
try:
    result = tool.invoke({"url": "https://invalid-url"})
except Exception as e:
    print(f"Error: {e}")
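For transient failures such as network timeouts, the try/except pattern above can be extended with retries. The helper below is a hypothetical sketch (invoke_with_retry is not part of langchain-anakin) that retries any zero-argument callable with exponential backoff. Which exceptions the tools raise is not specified here, so it catches Exception broadly; a tool that reports a failure by returning an error message instead of raising would not trigger a retry.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def invoke_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception:
            # Re-raise the last failure once the retry budget is spent.
            if attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call could be wrapped as invoke_with_retry(lambda: tool.invoke({"url": "https://example.com"})); in practice the except clause is best narrowed to the errors known to be transient.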
API Reference
AnakinScrapeTool
class AnakinScrapeTool(BaseTool):
    """Tool for scraping websites using the Anakin API."""
    name: str = "anakin_scrape_url"
    api_key: str  # Required
    base_url: str = "https://api.anakin.io"  # Optional
AnakinSearchTool
class AnakinSearchTool(BaseTool):
    """Tool for AI-powered searches using the Anakin API."""
    name: str = "anakin_ai_search"
    api_key: str  # Required
    base_url: str = "https://api.anakin.io"  # Optional
AnakinAgenticSearchTool
class AnakinAgenticSearchTool(BaseTool):
    """Tool for advanced agentic searches using the Anakin API."""
    name: str = "anakin_agentic_search"
    api_key: str  # Required
    base_url: str = "https://api.anakin.io"  # Optional
Development
Setup Development Environment
# Clone the repository
git clone https://github.com/Anakin-Inc/langchain-anakin.git
cd langchain-anakin
# Install dependencies
pip install -r requirements-dev.txt
# Install package in editable mode
pip install -e .
Running Tests
pytest
Code Formatting
black langchain_anakin/
ruff check langchain_anakin/
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Support
- 📖 API Documentation
- 💬 Community Forum
- 🐛 Report Issues
- 📧 Email: anakin@anakininc.com
License
MIT License - see the LICENSE file for details.
Related Projects
- n8n-nodes-anakin-scraper - n8n integration
- Anakin API - Official API documentation
Changelog
0.1.0 (2026-01-23)
- Initial release
- Added AnakinScrapeTool for web scraping
- Added AnakinSearchTool for AI-powered search
- Added AnakinAgenticSearchTool for agentic research
- Full LangChain integration with BaseTool
- Comprehensive documentation and examples
File details
Details for the file langchain_anakin-0.1.0.tar.gz.
File metadata
- Download URL: langchain_anakin-0.1.0.tar.gz
- Upload date:
- Size: 16.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 939e8c3eb3330b3b6484372133a5215a0ef3e607d2e3a0087f10a467fccc152a |
| MD5 | 5946df43d3365d458c08eebeb1ee13b1 |
| BLAKE2b-256 | 71c6206b3a7bdbe16b15777f75f566e5b44776410e91e67e8da163f9d9e80096 |
File details
Details for the file langchain_anakin-0.1.0-py3-none-any.whl.
File metadata
- Download URL: langchain_anakin-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 00db347268223e0785b7eab17df9f9b54d76f4dfacf47cbdee21dd77a726d243 |
| MD5 | 3515f531811bc4c91859e8b0e3635bc5 |
| BLAKE2b-256 | 8ed9e204d9f2022e1d29c0e468c1b801b420b982391d7344774c7cff4a4eadb5 |