langchain-scrapeless

An integration package connecting Scrapeless and LangChain

These details have been verified by PyPI

Maintainers

scrapeless_team

Project description

LangChain Scrapeless: an all-in-one, highly scalable web scraping toolkit for enterprises and developers that also integrates with LangChain’s AI tools. Maintained by Scrapeless.

langchain-scrapeless is designed for seamless integration with LangChain, enabling you to:

- Run custom scraping tasks using your own crawlers or scraping logic.
- Automate data extraction and processing workflows in Python.
- Manage and interact with datasets produced by your scraping jobs.
- Access scraping and data handling capabilities as LangChain tools, making them easy to compose with LLM-powered chains and agents.

📦 Installation

pip install langchain-scrapeless

✅ Prerequisites

Set your Scrapeless API credentials as environment variables:

SCRAPELESS_API_KEY: Your Scrapeless API key.

If you don't have an API key, register for a Scrapeless account and follow the Scrapeless documentation to obtain one.
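As a minimal sketch of the prerequisite above, a script can check the variable up front and fail with a readable message instead of a later authentication error. The helper name `require_scrapeless_api_key` is hypothetical and not part of langchain-scrapeless; only the variable name `SCRAPELESS_API_KEY` comes from this README.

```python
import os


def require_scrapeless_api_key(env=os.environ):
    """Return the Scrapeless API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; it is not provided by the package.
    """
    key = env.get("SCRAPELESS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SCRAPELESS_API_KEY is not set; export it before creating any Scrapeless tool."
        )
    return key
```

Calling `require_scrapeless_api_key()` once at startup, before any tool is instantiated, keeps the failure close to its cause.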

🛠️ Available Tools

🔍 DeepSerp

🌐 ScrapelessDeepSerpGoogleSearchTool

Perform Google search queries and get the results.

from langchain_scrapeless import ScrapelessDeepSerpGoogleSearchTool

tool = ScrapelessDeepSerpGoogleSearchTool()

# Basic usage
# result = tool.invoke("I want to know Scrapeless")
# print(result)

# Advanced usage
result = tool.invoke({
    "q": "Scrapeless",
    "hl": "en",
    "google_domain": "google.com"
})
print(result)

# With LangChain (also requires the langchain-openai and langgraph packages)
from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessDeepSerpGoogleSearchTool
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI()

tool = ScrapelessDeepSerpGoogleSearchTool()

# Use the tool with an agent
tools = [tool]
agent = create_react_agent(llm, tools)

for chunk in agent.stream(
        {"messages": [("human", "I want to know what Scrapeless is")]},
        stream_mode="values"
):
    chunk["messages"][-1].pretty_print()

See the Scrapeless documentation for more customization options.
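The advanced call above passes a dictionary of options to `tool.invoke`. When some options are only sometimes set, it can help to build that dictionary in one place and leave out anything unset. The sketch below assumes only the three keys shown in this README (`q`, `hl`, `google_domain`); the function `build_search_payload` is a hypothetical helper, not part of the package.

```python
def build_search_payload(q, hl=None, google_domain=None):
    """Build a payload for tool.invoke(), omitting options that were not given.

    Hypothetical helper; only the keys q, hl and google_domain appear in the README.
    """
    if not q or not q.strip():
        raise ValueError("q must be a non-empty search query")
    options = {"q": q.strip(), "hl": hl, "google_domain": google_domain}
    # Drop unset options so the tool falls back to its own defaults.
    return {name: value for name, value in options.items() if value is not None}
```

The result can be passed straight through, for example `tool.invoke(build_search_payload("Scrapeless", hl="en"))`.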

🌐 ScrapelessDeepSerpGoogleTrendsTool

Perform Google Trends queries and get the results.

from langchain_scrapeless import ScrapelessDeepSerpGoogleTrendsTool

tool = ScrapelessDeepSerpGoogleTrendsTool()

# Basic usage
# result = tool.invoke("Funny 2048,negamon monster trainer")
# print(result)

# Advanced usage
result = tool.invoke({
    "q": "Scrapeless",
    "data_type": "related_topics",
    "hl": "en"
})
print(result)

# With LangChain
from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessDeepSerpGoogleTrendsTool
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI()

tool = ScrapelessDeepSerpGoogleTrendsTool()

# Use the tool with an agent
tools = [tool]
agent = create_react_agent(llm, tools)

for chunk in agent.stream(
        {"messages": [("human", "I want to know the iphone keyword trends")]},
        stream_mode="values"
):
    chunk["messages"][-1].pretty_print()

See the Scrapeless documentation for more customization options.

🔓 ScrapelessUniversalScrapingTool

Access any website at scale and say goodbye to blocks.

from langchain_scrapeless import ScrapelessUniversalScrapingTool

tool = ScrapelessUniversalScrapingTool()

# Basic usage
# result = tool.invoke("https://example.com")
# print(result)

# Advanced usage
result = tool.invoke({
    "url": "https://example.com",
    "response_type": "markdown"
})
print(result)

# With LangChain
from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessUniversalScrapingTool
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI()

tool = ScrapelessUniversalScrapingTool()

# Use the tool with an agent
tools = [tool]
agent = create_react_agent(llm, tools)

for chunk in agent.stream(
        {"messages": [("human", "Use the scrapeless scraping tool to fetch https://www.scrapeless.com/en and extract the h1 tag.")]},
        stream_mode="values"
):
    chunk["messages"][-1].pretty_print()

See the Scrapeless documentation for more customization options.
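Because the scraping tool takes a target URL, a malformed address is an easy mistake to send. The following sketch checks that the URL is an absolute http(s) address before building the payload. `build_scrape_payload` is a hypothetical helper; the keys `url` and `response_type` and the value `"markdown"` are the ones shown in this README, and any other `response_type` value would be an assumption.

```python
from urllib.parse import urlparse


def build_scrape_payload(url, response_type="markdown"):
    """Validate the URL shape and build a payload for tool.invoke().

    Hypothetical helper; it checks syntax only and cannot catch a misspelled domain.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return {"url": url, "response_type": response_type}
```

A call such as `tool.invoke(build_scrape_payload("https://example.com"))` then rejects inputs like `example.com` (no scheme) locally, before any request is made.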

🕷️ Crawler

🌐 ScrapelessCrawlerCrawlTool

Crawl a website and its linked pages to extract comprehensive data.

from langchain_scrapeless import ScrapelessCrawlerCrawlTool

tool = ScrapelessCrawlerCrawlTool()

# Basic usage
# result = tool.invoke("https://example.com")
# print(result)

# Advanced usage
result = tool.invoke({
    "url": "https://example.com",
    "limit": 4
})
print(result)

# With LangChain
from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessCrawlerCrawlTool
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI()

tool = ScrapelessCrawlerCrawlTool()

# Use the tool with an agent
tools = [tool]
agent = create_react_agent(llm, tools)

for chunk in agent.stream(
        {"messages": [("human", "Use the scrapeless crawler crawl tool to crawl the website https://example.com and output the markdown content as a string.")]},
        stream_mode="values"
):
    chunk["messages"][-1].pretty_print()

See the Scrapeless documentation for more customization options.

🌐 ScrapelessCrawlerScrapeTool

Extract data from one or more webpages.

from langchain_scrapeless import ScrapelessCrawlerScrapeTool

tool = ScrapelessCrawlerScrapeTool()

result = tool.invoke({
    "urls": ["https://example.com", "https://www.scrapeless.com/en"],
    "formats": ["markdown"]
})
print(result)

# With LangChain
from langchain_openai import ChatOpenAI
from langchain_scrapeless import ScrapelessCrawlerScrapeTool
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI()

tool = ScrapelessCrawlerScrapeTool()

# Use the tool with an agent
tools = [tool]
agent = create_react_agent(llm, tools)

for chunk in agent.stream(
        {"messages": [("human", "Use the scrapeless crawler scrape tool to get the website content of https://example.com and output the html content as a string.")]},
        stream_mode="values"
):
    chunk["messages"][-1].pretty_print()

Release history

This version: 0.1.3, released Jul 17, 2025

Download files

Source Distribution

langchain_scrapeless-0.1.3.tar.gz (18.0 kB)

Uploaded Jul 17, 2025 Source

Built Distribution

langchain_scrapeless-0.1.3-py3-none-any.whl (26.2 kB)

Uploaded Jul 17, 2025 Python 3

File details

Details for the file langchain_scrapeless-0.1.3.tar.gz.

File metadata

Download URL: langchain_scrapeless-0.1.3.tar.gz
Upload date: Jul 17, 2025
Size: 18.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for langchain_scrapeless-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`7eb799342c875b8074016cf2beec57a594763392e3110643263111b0abc35f59`
MD5	`d7132e9c1fa545ce4c3a010692fcecef`
BLAKE2b-256	`427f65d0aa635bdeab98bfb8c36745079846f47547a9cc955a8750cdea152a5e`

See more details on using hashes here.
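One way to use the digests above is to hash the downloaded file locally and compare the result. This is a minimal sketch using Python's standard `hashlib`; the function names and the chunk size are illustrative choices, and the expected value is the SHA256 listed above for the source distribution.

```python
import hashlib

# SHA256 digest listed above for langchain_scrapeless-0.1.3.tar.gz
EXPECTED_SHA256 = "7eb799342c875b8074016cf2beec57a594763392e3110643263111b0abc35f59"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected.lower()
```

For the wheel, the same check applies with the wheel's own SHA256 digest passed as `expected`.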

Provenance

The following attestation bundles were made for langchain_scrapeless-0.1.3.tar.gz:

Publisher: publish.yml on scrapeless-ai/langchain-scrapeless

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: langchain_scrapeless-0.1.3.tar.gz
- Subject digest: 7eb799342c875b8074016cf2beec57a594763392e3110643263111b0abc35f59
- Sigstore transparency entry: 280646761
- Sigstore integration time: Jul 17, 2025
Source repository:
- Permalink: scrapeless-ai/langchain-scrapeless@953d3a9194dfec9bf46a70492adde239afead6a8
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/scrapeless-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@953d3a9194dfec9bf46a70492adde239afead6a8
- Trigger Event: release

File details

Details for the file langchain_scrapeless-0.1.3-py3-none-any.whl.

File metadata

Download URL: langchain_scrapeless-0.1.3-py3-none-any.whl
Upload date: Jul 17, 2025
Size: 26.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for langchain_scrapeless-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`29f4f49f8d7a3017e7e311454c5b71cba76845c2e8a29a4508486bd7284a592a`
MD5	`51a765ed51ab4d19047d168488cc7790`
BLAKE2b-256	`929407bcdc6caf7652963d165a7740b420639466b621b67a265a9d42d09f5dae`

See more details on using hashes here.

Provenance

The following attestation bundles were made for langchain_scrapeless-0.1.3-py3-none-any.whl:

Publisher: publish.yml on scrapeless-ai/langchain-scrapeless

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: langchain_scrapeless-0.1.3-py3-none-any.whl
- Subject digest: 29f4f49f8d7a3017e7e311454c5b71cba76845c2e8a29a4508486bd7284a592a
- Sigstore transparency entry: 280646769
- Sigstore integration time: Jul 17, 2025
Source repository:
- Permalink: scrapeless-ai/langchain-scrapeless@953d3a9194dfec9bf46a70492adde239afead6a8
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/scrapeless-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@953d3a9194dfec9bf46a70492adde239afead6a8
- Trigger Event: release
