CrewAI tool for loading web pages using Tzafon's headless browser infrastructure
crewai-tzafon
A CrewAI tool for loading web pages using Tzafon's cloud-based headless browser infrastructure.
crewai-tzafon enables your CrewAI agents to scrape modern web applications with full JavaScript rendering, handling SPAs and dynamically loaded content seamlessly.
Features
- Cloud-Based Headless Browser: Powered by Tzafon's remote browser instances
- JavaScript Support: Naturally handles SPAs and dynamically loaded content
- Easy Integration: Works seamlessly with CrewAI's tool system
- Flexible Output: Extract either text content or raw HTML
- Production Ready: Built on Playwright for reliable web scraping
Installation
pip install crewai-tzafon
Note: This package requires Playwright for connecting to the remote browser.
Configuration
To use this package, you need a Tzafon API Key.
- Sign up or log in at tzafon.ai to get your API key.
- Set it as an environment variable (recommended):
export TZAFON_API_KEY="your_api_key_here"
Alternatively, you can pass the API key directly when initializing the tool.
For more details, see the Tzafon documentation.
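The lookup order described above (explicit argument first, then the environment variable) can be sketched as follows. This is a minimal illustration and not the package's actual code: resolve_api_key is a hypothetical helper, and only the TZAFON_API_KEY name and the error message are taken from this README.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer the explicit key, then TZAFON_API_KEY."""
    key = api_key or os.environ.get("TZAFON_API_KEY")
    if not key:
        # Same message as the error shown under Troubleshooting.
        raise ValueError("Tzafon API key is required")
    return key
```

Calling resolve_api_key() with no argument and no environment variable set raises the ValueError, which is a quick way to confirm the key is visible to your Python process before running a crew.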
Usage
Basic Usage
from crewai import Agent, Task, Crew
from crewai_tzafon import TzafonLoadTool

# Initialize the tool
tzafon_tool = TzafonLoadTool()

# Create an agent with the tool
web_researcher = Agent(
    role="Web Researcher",
    goal="Extract information from web pages",
    backstory="Expert at web research and data extraction",
    tools=[tzafon_tool],
    verbose=True
)

# Create a task
research_task = Task(
    description="Research and summarize the content from https://example.com",
    expected_output="A comprehensive summary of the webpage content",
    agent=web_researcher
)

# Create and run the crew
crew = Crew(
    agents=[web_researcher],
    tasks=[research_task]
)
result = crew.kickoff()
print(result)
With Custom API Key
from crewai_tzafon import TzafonLoadTool
# Initialize with explicit API key
tool = TzafonLoadTool(api_key="your_api_key_here")
Extract HTML Instead of Text
from crewai import Agent, Task, Crew
from crewai_tzafon import TzafonLoadTool

# The agent chooses the tool inputs from the task description; a task that
# asks for HTML should lead it to call the tool with text_content=False
tzafon_tool = TzafonLoadTool()

html_extractor = Agent(
    role="HTML Extractor",
    goal="Extract raw HTML from web pages",
    backstory="Specialist in extracting structured HTML content",
    tools=[tzafon_tool],
    verbose=True
)

task = Task(
    description="Extract the HTML from https://example.com",
    expected_output="Raw HTML content of the webpage",
    agent=html_extractor
)

crew = Crew(agents=[html_extractor], tasks=[task])
result = crew.kickoff()
Advanced Example: Multi-Page Research
from crewai import Agent, Task, Crew, Process
from crewai_tzafon import TzafonLoadTool

# Initialize tool
tzafon_tool = TzafonLoadTool()

# Create researcher agent
researcher = Agent(
    role="Senior Web Researcher",
    goal="Conduct comprehensive research across multiple web sources",
    backstory="Expert researcher with deep knowledge of information extraction",
    tools=[tzafon_tool],
    verbose=True
)

# Create analyst agent
analyst = Agent(
    role="Data Analyst",
    goal="Analyze and synthesize research findings",
    backstory="Skilled at finding patterns and insights in data",
    verbose=True
)

# Research task
research_task = Task(
    description="""
    Research the following topics:
    1. Latest AI developments from https://openai.com/blog
    2. AI research trends from https://arxiv.org
    Extract key information from each source.
    """,
    expected_output="Comprehensive research findings from multiple sources",
    agent=researcher
)

# Analysis task
analysis_task = Task(
    description="Analyze the research findings and identify key trends and insights",
    expected_output="Detailed analysis report with key trends and recommendations",
    agent=analyst
)

# Create crew with sequential process
crew = Crew(
    agents=[researcher, analyst],
    tasks=[research_task, analysis_task],
    process=Process.sequential,
    verbose=True
)
result = crew.kickoff()
print(result)
API Reference
TzafonLoadTool
A CrewAI tool for loading web pages using Tzafon's headless browser.
Initialization Parameters
| Parameter | Type | Description |
|---|---|---|
| api_key | Optional[str] | Tzafon API key. Defaults to TZAFON_API_KEY env var. |
Tool Input Schema
When your agent uses this tool, it accepts the following parameters:
| Parameter | Type | Description |
|---|---|---|
| url | str | The URL of the web page to load (required). |
| text_content | bool | If True (default), extracts visible text. If False, returns raw HTML. |
Returns
str: The extracted page content (either text or HTML).
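To make the difference between the two output modes concrete, here is a rough standard-library approximation. It is only a sketch: the tool itself works on a page rendered in a remote browser, whereas VisibleTextParser and select_output below are hypothetical names that operate on a static HTML string and simply skip script and style contents.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def select_output(raw_html: str, text_content: bool = True) -> str:
    """Return visible text when text_content is True, otherwise the raw HTML."""
    if not text_content:
        return raw_html
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return "\n".join(parser.chunks)
```

With text_content=True the markup and script bodies are dropped and only readable text remains; with text_content=False the input string comes back unchanged, which is the mode to ask for when a downstream step needs to parse the page structure.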
How It Works
- Tool Initialization: The TzafonLoadTool connects to Tzafon's API using your API key
- Browser Creation: When invoked, it creates a cloud-based browser instance
- Page Loading: Uses Playwright to connect to the browser and navigate to the URL
- Content Extraction: Waits for the page to fully render, then extracts content
- Cleanup: Automatically terminates the browser instance after extraction
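The create, load, extract, and cleanup sequence above can be sketched with a context manager so that the browser is terminated even when loading fails. Everything here is an assumption for illustration: FakeRemoteBrowser is a local stand-in and does not reflect Tzafon's API or the package's internals, which would use Playwright against a remote instance.

```python
from contextlib import contextmanager


class FakeRemoteBrowser:
    """Hypothetical stand-in for a cloud browser session (not Tzafon's API)."""

    def __init__(self):
        self.closed = False
        self.html = ""

    def goto(self, url: str) -> None:
        # A real session would navigate and wait for rendering to finish.
        self.html = f"<html><body><p>Rendered {url}</p></body></html>"

    def content(self) -> str:
        return self.html

    def close(self) -> None:
        self.closed = True


@contextmanager
def browser_session(factory=FakeRemoteBrowser):
    """Create a browser, hand it to the caller, and always terminate it."""
    browser = factory()
    try:
        yield browser
    finally:
        browser.close()  # cleanup runs even if loading or extraction fails


def load_page(url: str, factory=FakeRemoteBrowser) -> str:
    """Browser creation, page loading, extraction, then cleanup on exit."""
    with browser_session(factory) as browser:
        browser.goto(url)
        return browser.content()
```

Putting the termination step in a finally clause is what keeps remote instances from leaking when a page load raises.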
Examples Repository
For more examples and use cases, check out:
Troubleshooting
Common Issues
API Key Not Found
ValueError: Tzafon API key is required
Solution: Set the TZAFON_API_KEY environment variable or pass api_key when initializing the tool.
Connection Timeout
If you experience timeouts, ensure:
- Your API key is valid and active
- You have a stable internet connection
- The target URL is accessible
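If the checks above pass and timeouts are only occasional, retrying with a growing delay is one common mitigation. The helper below is a generic sketch, not a feature of this package: with_retries and the load callable are hypothetical, and it catches Python's built-in TimeoutError, which may differ from the exception type the tool actually raises.

```python
import time


def with_retries(load, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call load(url), retrying on TimeoutError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return load(url)
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original timeout
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

The sleep parameter is injectable so the backoff schedule can be checked without actually waiting.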
Playwright Not Installed
ModuleNotFoundError: No module named 'playwright'
Solution: Install Playwright with pip install playwright
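To check from Python whether the module is importable in the current environment, a small standard-library probe is enough. The is_installed helper is a hypothetical name used for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("playwright"):
    print("playwright is missing; install it with: pip install playwright")
```

Running this with the same interpreter that runs your crew helps rule out the case where the package was installed into a different virtual environment.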
License
This project is licensed under the MIT License.
Links
Support
For issues, questions, or contributions:
- Open an issue on GitHub
- Check the documentation
- Join the Tzafon community
Download files
File details
Details for the file crewai_tzafon-1.0.0.tar.gz.
File metadata
- Download URL: crewai_tzafon-1.0.0.tar.gz
- Upload date:
- Size: 11.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25b7c76ed8d58175ae57480fc8dbcc65c8f1d21b242f9ccd0805ab9c6023a84d |
| MD5 | 978d093b74ed754a240efb97cd4aa54d |
| BLAKE2b-256 | af96e8f6d1aaf8e3eb4146d1f128c4b7efabd18a2e6094b36d423b74b9c3d693 |
File details
Details for the file crewai_tzafon-1.0.0-py3-none-any.whl.
File metadata
- Download URL: crewai_tzafon-1.0.0-py3-none-any.whl
- Upload date:
- Size: 8.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e4fa2150377f4437f166566ea10cf6aaa9a245694eebe936e2a38478bdf73bc3 |
| MD5 | acaf0d2bcb28057d73d5f5434d8167fe |
| BLAKE2b-256 | 132995e0588141e0388a1c2839f6c364b06078e5f696654dce3f09513c2ac7c3 |