Bright Data tools for LangChain
🌟 langchain-brightdata
Access powerful web data capabilities for your AI agents with Bright Data! 🚀
📋 Overview
This package provides LangChain integrations for Bright Data's suite of web data collection tools, allowing your AI agents to:
- 🔍 Collect search engine results with geo-targeting
- 🌐 Access websites that might be geo-restricted or protected by anti-bot systems
- 📊 Extract structured data from popular websites like Amazon, LinkedIn, and more
Perfect for AI agents that need real-time web data!
🛠️ Installation
pip install langchain-brightdata
🔑 Setup
You'll need a Bright Data API key to use these tools. Set it as an environment variable:
import os
os.environ["BRIGHT_DATA_API_KEY"] = "your-api-key"
Or pass it directly when initializing tools:
from langchain_brightdata import BrightDataSERP
tool = BrightDataSERP(bright_data_api_key="your-api-key")
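If the key is read from the environment, it can help to fail early with a clear message instead of hitting an authentication error on the first request. The sketch below is only an illustration: `get_bright_data_api_key` is a hypothetical helper, not part of `langchain-brightdata`, and it assumes the key lives in the `BRIGHT_DATA_API_KEY` variable shown above.

```python
import os


def get_bright_data_api_key(env_var: str = "BRIGHT_DATA_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not provided by the package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or pass "
            "bright_data_api_key=... when constructing the tool."
        )
    return key
```

The returned value can then be passed as `bright_data_api_key` when initializing any of the tools.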
🧰 Available Tools
🔍 BrightDataSERP
Perform search engine queries with customizable geo-targeting, device type, and language settings.
from langchain_brightdata import BrightDataSERP
# Basic usage
serp_tool = BrightDataSERP(bright_data_api_key="your-api-key")
results = serp_tool.invoke("latest AI research papers")
# Advanced usage with parameters
results = serp_tool.invoke({
    "query": "best electric vehicles",
    "country": "de",  # Get results as if searching from Germany
    "language": "de",  # Get results in German
    "search_type": "shop",  # Get shopping results
    "device_type": "mobile",  # Simulate a mobile device
    "results_count": 15
})
🎛️ Customization Options
Parameter | Type | Description |
---|---|---|
query | str | The search query to perform |
search_engine | str | Search engine to use (default: "google") |
country | str | Two-letter country code for localized results (default: "us") |
language | str | Two-letter language code (default: "en") |
results_count | int | Number of results to return (default: 10) |
search_type | str | Type of search: None (web), "isch" (images), "shop", "nws" (news), "jobs" |
device_type | str | Device type: None (desktop), "mobile", "ios", "android" |
parse_results | bool | Whether to return structured JSON (default: False) |
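When the input dictionary is assembled from user or agent input, checking it against the values in the table before calling the tool can catch typos early. The following is a minimal sketch under stated assumptions: `build_serp_input` is a hypothetical helper (not part of the package), the allowed values are copied from the table above, and it assumes optional keys may simply be omitted so that the tool's own defaults apply.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter table above.
SEARCH_TYPES = {None, "isch", "shop", "nws", "jobs"}
DEVICE_TYPES = {None, "mobile", "ios", "android"}


def build_serp_input(
    query: str,
    country: str = "us",
    language: str = "en",
    results_count: int = 10,
    search_type: Optional[str] = None,
    device_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments and return a dict suitable for serp_tool.invoke().

    Hypothetical helper for illustration only.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if len(country) != 2 or len(language) != 2:
        raise ValueError("country and language must be two-letter codes")
    if results_count < 1:
        raise ValueError("results_count must be at least 1")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unsupported search_type: {search_type!r}")
    if device_type not in DEVICE_TYPES:
        raise ValueError(f"unsupported device_type: {device_type!r}")

    payload: Dict[str, Any] = {
        "query": query,
        "country": country.lower(),
        "language": language.lower(),
        "results_count": results_count,
    }
    # Leave optional keys out when unset (assumption: the tool then
    # falls back to web search on desktop).
    if search_type is not None:
        payload["search_type"] = search_type
    if device_type is not None:
        payload["device_type"] = device_type
    return payload
```

The result can be passed straight to the tool, for example `serp_tool.invoke(build_serp_input("best electric vehicles", country="de", language="de", search_type="shop"))`.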
🌐 BrightDataUnlocker
Access ANY public website that might be geo-restricted or protected by anti-bot systems.
from langchain_brightdata import BrightDataUnlocker
# Basic usage
unlocker_tool = BrightDataUnlocker(bright_data_api_key="your-api-key")
result = unlocker_tool.invoke("https://example.com")
# Advanced usage with parameters
result = unlocker_tool.invoke({
    "url": "https://example.com/region-restricted-content",
    "country": "gb",  # Access as if from Great Britain
    "data_format": "markdown",  # Get content in markdown format
    "zone": "unlocker"  # Use the unlocker zone
})
🎛️ Customization Options
Parameter | Type | Description |
---|---|---|
url | str | The URL to access |
format | str | Format of the response content (default: "raw") |
country | str | Two-letter country code for geo-specific access (e.g., "us", "gb") |
zone | str | Bright Data zone to use (default: "unblocker") |
data_format | str | Output format: None (HTML), "markdown", or "screenshot" |
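A similar pre-flight check can be applied to Unlocker requests. This sketch assumes only what the table above states; `build_unlocker_input` is a hypothetical helper, not part of the package, and the `zone` and `format` keys are deliberately left out so that the tool's defaults are used.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse

# Allowed values copied from the parameter table above.
DATA_FORMATS = {None, "markdown", "screenshot"}


def build_unlocker_input(
    url: str,
    country: Optional[str] = None,
    data_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate arguments and return a dict for unlocker_tool.invoke().

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if country is not None and len(country) != 2:
        raise ValueError("country must be a two-letter code")
    if data_format not in DATA_FORMATS:
        raise ValueError(f"unsupported data_format: {data_format!r}")

    payload: Dict[str, Any] = {"url": url}
    # Only send optional keys that were explicitly requested.
    if country is not None:
        payload["country"] = country.lower()
    if data_format is not None:
        payload["data_format"] = data_format
    return payload
```

Rejecting relative or scheme-less URLs locally avoids spending a request on input that cannot succeed.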
📊 BrightDataWebScraperAPI
Extract structured data from 100+ popular domains, including Amazon, LinkedIn, and more.
from langchain_brightdata import BrightDataWebScraperAPI
# Initialize the tool
scraper_tool = BrightDataWebScraperAPI(bright_data_api_key="your-api-key")
# Extract Amazon product data
results = scraper_tool.invoke({
    "url": "https://www.amazon.com/dp/B08L5TNJHG",
    "dataset_type": "amazon_product"
})
# Extract LinkedIn profile data
linkedin_results = scraper_tool.invoke({
    "url": "https://www.linkedin.com/in/satyanadella/",
    "dataset_type": "linkedin_person_profile"
})
🎛️ Customization Options
Parameter | Type | Description |
---|---|---|
url | str | The URL to extract data from |
dataset_type | str | Type of dataset to use (e.g., "amazon_product") |
zipcode | str | Optional zipcode for location-specific data |
📂 Available Dataset Types
Dataset Type | Description |
---|---|
amazon_product | Extract detailed Amazon product data |
amazon_product_reviews | Extract Amazon product reviews |
linkedin_person_profile | Extract LinkedIn person profile data |
linkedin_company_profile | Extract LinkedIn company profile data |
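Because `dataset_type` must match the URL being scraped, an agent pipeline may want to pick it automatically. The sketch below is one possible heuristic, not behavior provided by the package: `guess_dataset_type` is a hypothetical helper, it only recognizes `amazon.com` and `linkedin.com` hosts, and it relies on LinkedIn's `/in/` and `/company/` path prefixes. Product pages and review requests share the same URL, so reviews are selected with an explicit flag.

```python
from urllib.parse import urlparse


def _host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def guess_dataset_type(url: str, reviews: bool = False) -> str:
    """Map a URL to one of the dataset types listed above.

    Hypothetical helper for illustration; the matching rules are assumptions.
    """
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    path = parsed.path

    if _host_matches(host, "amazon.com"):
        return "amazon_product_reviews" if reviews else "amazon_product"
    if _host_matches(host, "linkedin.com"):
        if path.startswith("/in/"):
            return "linkedin_person_profile"
        if path.startswith("/company/"):
            return "linkedin_company_profile"
    raise ValueError(f"no known dataset type for {url!r}")
```

The result can be combined with the URL when calling the tool, for example `scraper_tool.invoke({"url": url, "dataset_type": guess_dataset_type(url)})`.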
📚 Additional Resources