

langchain-brightdata

LangChain integration for Bright Data's web data APIs



Overview

langchain-brightdata provides LangChain tools for Bright Data's web data APIs, enabling your AI agents to:

  • Search - Query search engines with geo-targeting and language customization
  • Unlock - Access geo-restricted or bot-protected websites
  • Scrape - Extract structured data from Amazon, LinkedIn, and 100+ domains

Installation

pip install langchain-brightdata

Requirements: Python 3.9+


Quick Start

1. Get your API key

Sign up at Bright Data and get your API key from the dashboard.

2. Set up authentication

import os
os.environ["BRIGHT_DATA_API_KEY"] = "your-api-key"

Or pass it directly:

from langchain_brightdata import BrightDataSERP
tool = BrightDataSERP(bright_data_api_key="your-api-key")

3. Use with LangChain agents

from langchain_brightdata import BrightDataSERP, BrightDataUnlocker, BrightDataWebScraperAPI
from langchain.agents import initialize_agent, AgentType
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
    BrightDataSERP(),
    BrightDataUnlocker(),
    BrightDataWebScraperAPI()
]

# Create agent
llm = ChatOpenAI(model="gpt-4")
agent = initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS)

# Run
agent.run("Search for the latest AI news and summarize the top result")

Tools

BrightDataSERP

Search engine results with geo-targeting and customization.

from langchain_brightdata import BrightDataSERP

serp = BrightDataSERP()

# Simple search
results = serp.invoke("latest AI research")

# Advanced search
results = serp.invoke({
    "query": "electric vehicles",
    "country": "de",
    "language": "de",
    "search_type": "news",
    "results_count": 20
})

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| query | str | required | Search query |
| zone | str | "serp" | Bright Data zone name |
| search_engine | str | "google" | Search engine (google, bing, yahoo) |
| country | str | "us" | Two-letter country code |
| language | str | "en" | Two-letter language code |
| results_count | int | 10 | Number of results (max 100) |
| search_type | str | None | None (web), "isch" (images), "shop" (shopping), "nws" (news), "jobs" |
| device_type | str | None | None (desktop), "mobile", "ios", "android" |
| parse_results | bool | False | Return structured JSON |
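The table above maps directly onto the dict passed to `invoke()`. As a sketch, a small helper (hypothetical, not part of langchain-brightdata) can assemble and sanity-check such a dict before sending it:

```python
# Hypothetical helper: build and validate a BrightDataSERP parameter dict
# against the table above. Not part of langchain-brightdata.
VALID_SEARCH_TYPES = {None, "isch", "shop", "nws", "jobs"}
VALID_DEVICE_TYPES = {None, "mobile", "ios", "android"}

def build_serp_params(query, *, country="us", language="en", results_count=10,
                      search_type=None, device_type=None, parse_results=False):
    if not 1 <= results_count <= 100:
        raise ValueError("results_count must be between 1 and 100")
    if search_type not in VALID_SEARCH_TYPES:
        raise ValueError(f"unsupported search_type: {search_type!r}")
    if device_type not in VALID_DEVICE_TYPES:
        raise ValueError(f"unsupported device_type: {device_type!r}")
    params = {"query": query, "country": country, "language": language,
              "results_count": results_count, "parse_results": parse_results}
    # Omit the optional keys when unset, matching the defaults above.
    if search_type is not None:
        params["search_type"] = search_type
    if device_type is not None:
        params["device_type"] = device_type
    return params

params = build_serp_params("electric vehicles", country="de",
                           language="de", search_type="nws")
```

The resulting dict can then be passed to `serp.invoke(params)` as in the advanced-search example above.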

BrightDataUnlocker

Access any public website, bypassing geo-restrictions and bot protection.

from langchain_brightdata import BrightDataUnlocker

unlocker = BrightDataUnlocker()

# Simple access
content = unlocker.invoke("https://example.com")

# With options
content = unlocker.invoke({
    "url": "https://example.com/restricted",
    "country": "gb",
    "data_format": "markdown"
})

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| url | str | required | URL to access |
| zone | str | "unlocker" | Bright Data zone name |
| country | str | None | Two-letter country code |
| data_format | str | None | None (HTML), "markdown", "screenshot" |
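When persisting Unlocker responses, the `data_format` value determines what kind of file you are writing. A small hypothetical helper (not part of langchain-brightdata; the `.png` extension for screenshots is an assumption) makes that mapping explicit:

```python
# Hypothetical helper: choose a filename extension for saving Unlocker
# output, based on the data_format values in the table above.
# The screenshot format is assumed to be PNG.
EXTENSIONS = {None: ".html", "markdown": ".md", "screenshot": ".png"}

def output_path(slug, data_format=None):
    if data_format not in EXTENSIONS:
        raise ValueError(f"unsupported data_format: {data_format!r}")
    return slug + EXTENSIONS[data_format]
```

For example, `output_path("example", "markdown")` yields a `.md` path for the markdown variant of the page.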

BrightDataWebScraperAPI

Extract structured data from popular websites.

from langchain_brightdata import BrightDataWebScraperAPI

scraper = BrightDataWebScraperAPI()

# Amazon product
product = scraper.invoke({
    "url": "https://www.amazon.com/dp/B08L5TNJHG",
    "dataset_type": "amazon_product"
})

# LinkedIn profile
profile = scraper.invoke({
    "url": "https://www.linkedin.com/in/satyanadella/",
    "dataset_type": "linkedin_person_profile"
})

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| url | str | required | URL to scrape |
| dataset_type | str | required | Type of data to extract |
| zipcode | str | None | Zipcode for location-specific data |
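Since `zipcode` is optional, a payload builder can attach it only when provided. This helper is a hypothetical sketch, not part of langchain-brightdata:

```python
# Hypothetical helper: assemble the dict passed to
# BrightDataWebScraperAPI.invoke(), attaching the optional zipcode
# only when it is given.
def build_scrape_payload(url, dataset_type, zipcode=None):
    payload = {"url": url, "dataset_type": dataset_type}
    if zipcode is not None:
        payload["zipcode"] = zipcode
    return payload

payload = build_scrape_payload("https://www.amazon.com/dp/B08L5TNJHG",
                               "amazon_product", zipcode="10001")
```

The resulting dict can be passed to `scraper.invoke(payload)` to get, for instance, pricing as seen from that zipcode.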

Supported Dataset Types (44 Datasets)

E-Commerce (10 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| amazon_product | Product details, pricing, specs | url (with /dp/) |
| amazon_product_reviews | Customer reviews and ratings | url (with /dp/) |
| amazon_product_search | Search results from Amazon | keyword, url |
| walmart_product | Walmart product data | url (with /ip/) |
| walmart_seller | Walmart seller information | url |
| ebay_product | eBay product data | url |
| homedepot_products | Home Depot product data | url |
| zara_products | Zara product data | url |
| etsy_products | Etsy product data | url |
| bestbuy_products | Best Buy product data | url |

LinkedIn (5 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| linkedin_person_profile | Professional profile data | url |
| linkedin_company_profile | Company information | url |
| linkedin_job_listings | Job listing details | url |
| linkedin_posts | Post content and engagement | url |
| linkedin_people_search | Search for people | url, first_name, last_name |

Business Intelligence (2 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| crunchbase_company | Company funding, investors, metrics | url |
| zoominfo_company_profile | B2B company intelligence | url |

Instagram (4 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| instagram_profiles | Profile data and stats | url |
| instagram_posts | Post content and engagement | url |
| instagram_reels | Reel content and metrics | url |
| instagram_comments | Comments on posts | url |

Facebook (4 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| facebook_posts | Post content and engagement | url |
| facebook_marketplace_listings | Marketplace listing data | url |
| facebook_company_reviews | Company reviews | url, num_of_reviews |
| facebook_events | Event details | url |

TikTok (4 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| tiktok_profiles | Profile data and stats | url |
| tiktok_posts | Video content and metrics | url |
| tiktok_shop | Shop product data | url |
| tiktok_comments | Comments on videos | url |

YouTube (3 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| youtube_profiles | Channel profile data | url |
| youtube_videos | Video content and metrics | url |
| youtube_comments | Comments on videos | url, num_of_comments (default: 10) |

Google (3 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| google_maps_reviews | Business reviews from Maps | url, days_limit (default: 3) |
| google_shopping | Shopping product data | url |
| google_play_store | App store data | url |

Other Platforms (9 datasets)

| Type | Description | Required Inputs |
|------|-------------|-----------------|
| apple_app_store | iOS app data | url |
| x_posts | X (Twitter) post data | url |
| reddit_posts | Reddit post data | url |
| github_repository_file | GitHub file content | url |
| yahoo_finance_business | Financial business data | url |
| reuter_news | News article data | url |
| zillow_properties_listing | Real estate listing data | url |
| booking_hotel_listings | Hotel listing data | url |
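Several dataset types can be inferred from the URL's shape, as hinted by the "Required Inputs" column. As a sketch, a hypothetical helper (not part of langchain-brightdata, covering only a few entries above) could pick a `dataset_type` automatically:

```python
from urllib.parse import urlparse

# Hypothetical helper: guess a dataset_type for BrightDataWebScraperAPI
# from the URL's host and path. Covers only a handful of the dataset
# types listed above; returns None when no rule matches.
def guess_dataset_type(url):
    parts = urlparse(url)
    host, path = parts.netloc.lower(), parts.path
    if "amazon." in host and "/dp/" in path:
        return "amazon_product"
    if "linkedin.com" in host:
        if path.startswith("/in/"):
            return "linkedin_person_profile"
        if path.startswith("/company/"):
            return "linkedin_company_profile"
    if "walmart.com" in host and "/ip/" in path:
        return "walmart_product"
    return None
```

For example, `guess_dataset_type("https://www.linkedin.com/in/satyanadella/")` resolves to `linkedin_person_profile`, matching the scraper example earlier.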

Configuration

Zone Configuration

Bright Data uses "zones" to manage different API configurations. You can set the zone at initialization or per-request.

Setting zone at initialization

from langchain_brightdata import BrightDataSERP, BrightDataUnlocker

# SERP with custom zone
serp = BrightDataSERP(
    bright_data_api_key="your-api-key",
    zone="my_serp_zone"
)

# Unlocker with custom zone
unlocker = BrightDataUnlocker(
    bright_data_api_key="your-api-key",
    zone="my_unlocker_zone"
)

Setting zone per-request

# Override zone for a specific request
results = serp.invoke({
    "query": "AI news",
    "zone": "different_zone"
})

Default zones

| Tool | Default Zone |
|------|--------------|
| BrightDataSERP | serp |
| BrightDataUnlocker | unlocker |

Note: Zone names must match the zones configured in your Bright Data dashboard.
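Because zone names are specific to your dashboard, one common pattern (an assumption, not something the package requires) is to read them from environment variables, falling back to the defaults above:

```python
import os

# Assumed pattern: read zone names from the environment so code does not
# hard-code dashboard-specific configuration; fall back to the defaults.
serp_zone = os.environ.get("BRIGHT_DATA_SERP_ZONE", "serp")
unlocker_zone = os.environ.get("BRIGHT_DATA_UNLOCKER_ZONE", "unlocker")
```

These values can then be passed as `zone=` when constructing BrightDataSERP and BrightDataUnlocker.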



License

MIT License - see LICENSE for details.
