Firecrawl Tools

A comprehensive collection of async tools for web scraping, searching, and data extraction using the Firecrawl API. Built with LangChain for seamless integration with AI applications.

Features

URL Scraping: Extract content from single URLs with multiple format options
Web Search: Search the web and optionally scrape search results
Website Mapping: Discover all indexed URLs on a website
Structured Data Extraction: Extract specific information using LLM capabilities
Deep Research: Conduct comprehensive web research with intelligent crawling
Website Crawling: Asynchronous crawling of entire websites
Crawl Status Monitoring: Track and manage crawl jobs

Installation

pip install firecrawl-tools

Quick Start

import asyncio
from firecrawl_tools import FirecrawlTools

async def main():
    # Initialize with your API key
    tools = FirecrawlTools(api_key="your_firecrawl_api_key")

    # Get individual tools
    scrape_tool = await tools.get_scrape_tool()
    search_tool = await tools.get_search_tool()

    # Scrape a page as markdown, keeping only the main content
    content = await scrape_tool.ainvoke({
        "url": "https://example.com",
        "formats": ["markdown"],
        "only_main_content": True
    })
    print(content)

# await is only valid inside a coroutine, so run everything through asyncio.run
asyncio.run(main())

Available Tools

1. URL Scraping Tool

Extract content from a single URL with advanced options.

scrape_tool = await tools.get_scrape_tool()
result = await scrape_tool.ainvoke({
    "url": "https://example.com",
    "formats": ["markdown", "html"],
    "only_main_content": True,
    "wait_for": 2000,
    "mobile": False
})
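
Because the tools are async, several URLs can be scraped concurrently. The following is a minimal sketch under stated assumptions: `scrape_many` and `max_concurrency` are hypothetical names, not part of firecrawl-tools, and the code relies only on the tool exposing the `ainvoke` coroutine used above. A semaphore caps the number of requests in flight.

```python
import asyncio


async def scrape_many(tool, urls, max_concurrency=3):
    """Scrape several URLs concurrently with any tool exposing `ainvoke`.

    Hypothetical helper, not part of firecrawl-tools.
    Results are returned in the same order as `urls`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url):
        # The semaphore limits how many requests run at the same time.
        async with semaphore:
            return await tool.ainvoke({
                "url": url,
                "formats": ["markdown"],
                "only_main_content": True,
            })

    return await asyncio.gather(*(scrape_one(url) for url in urls))
```

Inside a coroutine this could be called as `pages = await scrape_many(scrape_tool, ["https://example.com", "https://example.org"])`. A suitable concurrency limit depends on your Firecrawl plan's rate limits.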

2. Web Search Tool

Search the web and optionally extract content from results.

search_tool = await tools.get_search_tool()
results = await search_tool.ainvoke({
    "query": "Python web scraping",
    "limit": 5,
    "scrape_options": {
        "formats": ["markdown"],
        "onlyMainContent": True
    }
})

3. Website Mapping Tool

Discover all indexed URLs on a website.

map_tool = await tools.get_map_tool()
urls = await map_tool.ainvoke({
    "url": "https://example.com",
    "include_subdomains": True,
    "limit": 100
})
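
A mapped URL list is often narrowed down before scraping. The sketch below assumes the mapping result is, or can be reduced to, a list of URL strings; the exact return shape is not documented here. `filter_urls` is a hypothetical helper, not part of firecrawl-tools.

```python
from urllib.parse import urlparse


def filter_urls(urls, path_prefix):
    """Keep unique URLs whose path starts with `path_prefix`, preserving order.

    Hypothetical helper; assumes `urls` is an iterable of URL strings.
    """
    seen = set()
    kept = []
    for url in urls:
        if url in seen:
            continue  # skip duplicates
        seen.add(url)
        if urlparse(url).path.startswith(path_prefix):
            kept.append(url)
    return kept
```

For example, `filter_urls(urls, "/blog/")` would keep only pages under the blog section.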

4. Structured Data Extraction Tool

Extract specific information using LLM capabilities.

extract_tool = await tools.get_extract_tool()
data = await extract_tool.ainvoke({
    "urls": ["https://example.com"],
    "prompt": "Extract all product names and prices",
    "schema": {
        "type": "object",
        "properties": {
            "products": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "price": {"type": "string"}
                    }
                }
            }
        }
    }
})
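
LLM-based extraction can return incomplete entries, so it may help to read the result defensively. The sketch below assumes the returned `data` is a dict shaped like the schema above, which this README does not confirm. `product_rows` is a hypothetical helper, not part of firecrawl-tools.

```python
def product_rows(data):
    """Turn extraction output into a list of (name, price) tuples.

    Hypothetical helper; assumes `data` is a dict shaped like the schema above.
    Entries missing a name or a price are skipped rather than raising.
    """
    rows = []
    for item in data.get("products", []):
        name = item.get("name")
        price = item.get("price")
        if name is not None and price is not None:
            rows.append((name, price))
    return rows
```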

5. Deep Research Tool

Conduct comprehensive web research.

research_tool = await tools.get_research_tool()
analysis = await research_tool.ainvoke({
    "query": "Latest developments in AI",
    "max_depth": 3,
    "time_limit": 120,
    "max_urls": 50
})

6. Website Crawling Tool

Crawl entire websites asynchronously.

crawl_tool = await tools.get_crawl_tool()
job_id = await crawl_tool.ainvoke({
    "url": "https://example.com",
    "max_depth": 2,
    "limit": 100,
    "allow_external_links": False
})

7. Crawl Status Tool

Check the status of crawl jobs.

status_tool = await tools.get_status_tool()
status = await status_tool.ainvoke({
    "crawl_id": "your_crawl_job_id"
})
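
Since crawling is asynchronous, a common pattern is to poll the status tool until the job finishes. The following is a sketch, not the library's own mechanism: `wait_for_crawl`, `is_done`, `interval`, and `max_checks` are hypothetical names, and the caller supplies the completion check because the shape of the status result is not documented here.

```python
import asyncio


async def wait_for_crawl(status_tool, crawl_id, is_done, interval=5.0, max_checks=60):
    """Poll a status tool until `is_done(status)` is true.

    Hypothetical helper, not part of firecrawl-tools. `is_done` is supplied by
    the caller because the shape of the status result is an assumption.
    """
    for _ in range(max_checks):
        status = await status_tool.ainvoke({"crawl_id": crawl_id})
        if is_done(status):
            return status
        # Sleep between checks so the API is not polled in a tight loop.
        await asyncio.sleep(interval)
    raise TimeoutError(f"crawl {crawl_id} not finished after {max_checks} checks")
```

If the status turns out to be a dict with a `status` field (an assumption), the check could be `lambda s: s.get("status") == "completed"`.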

ReAct Agent Integration

Firecrawl Tools can be handed to a LangChain ReAct agent, which then chooses the appropriate tool for each step of a task.

Basic ReAct Agent Setup

import asyncio
from langchain_openai import ChatOpenAI
from langchain.agents import initialize_agent, AgentType
from firecrawl_tools import FirecrawlTools

async def create_react_agent():
    # Initialize Firecrawl tools
    tools = FirecrawlTools(api_key="your_firecrawl_api_key")
    tools_dict = await tools.get_tools_dict()
    tool_list = list(tools_dict.values())
    
    # Initialize OpenAI LLM
    llm = ChatOpenAI(
        openai_api_key="your_openai_api_key",
        temperature=0,
        model="gpt-4o-mini"
    )
    
    # Create ReAct agent
    agent = initialize_agent(
        tool_list,
        llm,
        agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
        verbose=True,
        max_iterations=5,
        handle_parsing_errors=True,
    )
    
    return agent

async def main():
    # Build the agent and run a single query
    agent = await create_react_agent()
    result = await agent.ainvoke(
        {"input": "Find the main topic of https://example.com and summarize it in 2 sentences."}
    )
    print(result)

asyncio.run(main())

Example Queries

The ReAct agent can handle various natural language queries:

"What are the latest news headlines on cricbuzz.com?"
"Extract all product names and prices from https://example.com"
"Search for information about Python web scraping and provide a summary."
"Map all URLs on https://example.com and list the top 5 pages."

The agent automatically chooses the appropriate Firecrawl tool (scrape, search, extract, map, etc.) based on the query.

Complete Example

See examples/react_agent_example.py for a complete working example with multiple queries and error handling.

Configuration

You can configure the tools using environment variables or by passing configuration directly:

# Using an environment variable (shell)
export FIRECRAWL_API_KEY="your_api_key"

# Or pass the key directly (Python)
tools = FirecrawlTools(api_key="your_api_key")
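
To keep the key out of source code while still failing early when it is missing, the environment variable can be read explicitly. This is a minimal sketch; `load_api_key` is a hypothetical helper, not part of firecrawl-tools, and it uses only the `FIRECRAWL_API_KEY` name shown above.

```python
import os


def load_api_key(env_var="FIRECRAWL_API_KEY"):
    """Return the API key from the environment or fail with a clear message.

    Hypothetical helper, not part of firecrawl-tools.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return api_key
```

The result can then be passed along as `FirecrawlTools(api_key=load_api_key())`.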

Error Handling

Each tool raises ToolException (importable from langchain_core.tools) with a descriptive message when a call fails:

from langchain_core.tools import ToolException

try:
    result = await scrape_tool.ainvoke({"url": "https://example.com"})
except ToolException as e:
    print(f"Error: {e}")
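
Transient failures such as timeouts can sometimes be handled by retrying. The sketch below shows one possible retry wrapper with exponential backoff. `invoke_with_retry`, `retry_on`, `attempts`, and `base_delay` are hypothetical names, not part of firecrawl-tools, and the exception type is passed in so that the example does not depend on any particular library.

```python
import asyncio


async def invoke_with_retry(tool, payload, retry_on, attempts=3, base_delay=1.0):
    """Call `tool.ainvoke(payload)`, retrying on the given exception types.

    Hypothetical helper, not part of firecrawl-tools. Pass
    `retry_on=ToolException` to retry the errors described above.
    """
    for attempt in range(attempts):
        try:
            return await tool.ainvoke(payload)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
```

Not every ToolException is worth retrying; an invalid URL, for instance, will fail the same way each time, so a retry wrapper like this is best reserved for errors that are likely to be temporary.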

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

Documentation: https://github.com/ichbineshan/firecrawl-tools-py
Issues: https://github.com/ichbineshan/firecrawl-tools-py/issues
Discussions: https://github.com/ichbineshan/firecrawl-tools-py/discussions

Changelog

See CHANGELOG.md for a list of changes and version history.

Release history

0.1.1 (this version): Jun 29, 2025
0.1.0: Jun 29, 2025

Download files

Source Distribution

firecrawl_tools-0.1.1.tar.gz (22.4 kB)

Uploaded Jun 29, 2025 Source

Built Distribution

firecrawl_tools-0.1.1-py3-none-any.whl (15.1 kB)

Uploaded Jun 29, 2025 Python 3

File details

Details for the file firecrawl_tools-0.1.1.tar.gz.

File metadata

Download URL: firecrawl_tools-0.1.1.tar.gz
Upload date: Jun 29, 2025
Size: 22.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for firecrawl_tools-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`1808a5dedeb7eed6afd468fcf70b98f6f80d3037fe5de30c6a4dc0ae50e3e10f`
MD5	`a2f8b6d2278edc64ff92b6c86f841b5b`
BLAKE2b-256	`9f9c238eeb7ac5f1c7a5235e3b9afe4a49d5aa4faae5bd26a19fdcb5d5209341`

File details

Details for the file firecrawl_tools-0.1.1-py3-none-any.whl.

File metadata

Download URL: firecrawl_tools-0.1.1-py3-none-any.whl
Upload date: Jun 29, 2025
Size: 15.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for firecrawl_tools-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`842302b9bff4cdf837f0bd24937fe5f3c7d04be06475aeffb499d7566af6ceb5`
MD5	`f795f6b10b71c69eadc48288ffec01c4`
BLAKE2b-256	`c161da498377cce800913771c065b6b9105c70be5c5bcf6ecfc5d30cd0eeb768`
