
Scrapling is an undetectable, powerful, flexible, high-performance Python library that makes Web Scraping as easy and effortless as it should be!


Automated translations: العربية | Español | Deutsch | 简体中文 | 日本語 | Русский


Easy, effortless Web Scraping as it should be!


Selection methods · Choosing a fetcher · CLI · MCP mode · Migrating from BeautifulSoup

Stop fighting anti-bot systems. Stop rewriting selectors after every website update.

Scrapling isn't just another Web Scraping library. It's the first adaptive scraping library that learns from website changes and evolves with them. While other libraries break when websites update their structure, Scrapling automatically relocates your elements and keeps your scrapers running.

Built for the modern Web, Scrapling features its own rapid parsing engine and fetchers to handle all the Web Scraping challenges you face or will face. Built by Web Scrapers for Web Scrapers and regular users alike, it has something for everyone.

>> from scrapling.fetchers import Fetcher, AsyncFetcher, StealthyFetcher, DynamicFetcher
>> StealthyFetcher.adaptive = True
# Fetch websites' source under the radar!
>> page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
>> print(page.status)
200
>> products = page.css('.product', auto_save=True)  # Scrape data that survives website design changes!
>> # Later, if the website structure changes, pass `adaptive=True`
>> products = page.css('.product', adaptive=True)  # and Scrapling still finds them!

Sponsors

Do you want to show your ad here? Click here and choose the tier that suits you!


Key Features

Advanced Website Fetching with Session Support

  • HTTP Requests: Fast and stealthy HTTP requests with the Fetcher class. It can impersonate browsers' TLS fingerprints and headers, and it supports HTTP/3.
  • Dynamic Loading: Fetch dynamic websites with full browser automation through the DynamicFetcher class supporting Playwright's Chromium and Google's Chrome.
  • Anti-bot Bypass: Advanced stealth capabilities with StealthyFetcher and fingerprint spoofing. It can bypass all types of Cloudflare's Turnstile and Interstitial challenges with automation.
  • Session Management: Persistent session support with FetcherSession, StealthySession, and DynamicSession classes for cookie and state management across requests.
  • Async Support: Complete async support across all fetchers and dedicated async session classes (see the sketch after this list).
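
For example, here is a minimal sketch of a one-off async request. It assumes AsyncFetcher.get mirrors the Fetcher.get interface shown later on this page:

import asyncio

from scrapling.fetchers import AsyncFetcher

async def main():
    # Assumption: `AsyncFetcher.get` takes the same arguments as `Fetcher.get`
    page = await AsyncFetcher.get('https://quotes.toscrape.com/')
    print(page.status)                     # e.g., 200
    print(page.css('.quote .text::text'))  # same selection API as the sync fetchers

asyncio.run(main())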

Adaptive Scraping & AI Integration

  • 🔄 Smart Element Tracking: Relocate elements after website changes using intelligent similarity algorithms (a short sketch follows this list).
  • 🎯 Smart Flexible Selection: CSS selectors, XPath selectors, filter-based search, text search, regex search, and more.
  • 🔍 Find Similar Elements: Automatically locate elements similar to found elements.
  • 🤖 MCP Server to be used with AI: Built-in MCP server for AI-assisted Web Scraping and data extraction. The MCP server features powerful, custom capabilities that leverage Scrapling to extract targeted content before passing it to the AI (Claude/Cursor/etc), thereby speeding up operations and reducing costs by minimizing token usage. (demo video)
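
As a quick illustration of the tracking workflow, here is a minimal sketch mirroring the snippet at the top of this page; setting the class-level adaptive flag on Fetcher is assumed to work the same way as StealthyFetcher.adaptive above:

from scrapling.fetchers import Fetcher

Fetcher.adaptive = True  # assumed to mirror `StealthyFetcher.adaptive` above
page = Fetcher.get('https://quotes.toscrape.com/')

# First run: match the elements and save their fingerprints
quotes = page.css('.quote', auto_save=True)

# Later runs, after the site changes its markup: relocate the same elements
quotes = page.css('.quote', adaptive=True)

# Locate elements structurally similar to a matched one
similar = page.css_first('.quote').find_similar()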

High-Performance & Battle-Tested Architecture

  • 🚀 Lightning Fast: Optimized performance outperforming most Python scraping libraries.
  • 🔋 Memory Efficient: Optimized data structures and lazy loading for a minimal memory footprint.
  • Fast JSON Serialization: 10x faster than the standard library.
  • 🏗️ Battle tested: Not only does Scrapling have 92% test coverage and full type hints coverage, but it has been used daily by hundreds of Web Scrapers over the past year.

Developer/Web Scraper Friendly Experience

  • 🎯 Interactive Web Scraping Shell: Optional built-in IPython shell with Scrapling integration, shortcuts, and new tools that speed up the development of Web Scraping scripts, like converting curl commands to Scrapling requests and viewing request results in your browser.
  • 🚀 Use it directly from the Terminal: Optionally, you can use Scrapling to scrape a URL without writing a single line of code!
  • 🛠️ Rich Navigation API: Advanced DOM traversal with parent, sibling, and child navigation methods (illustrated in the sketch after this list).
  • 🧬 Enhanced Text Processing: Built-in regex, cleaning methods, and optimized string operations.
  • 📝 Auto Selector Generation: Generate robust CSS/XPath selectors for any element.
  • 🔌 Familiar API: Similar to Scrapy/BeautifulSoup with the same pseudo-elements used in Scrapy/Parsel.
  • 📘 Complete Type Coverage: Full type hints for excellent IDE support and code completion.
  • 🔋 Ready Docker image: With each release, a Docker image containing all browsers is automatically built and pushed.
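
To tie the navigation and text-processing points together, here is a short sketch built from the APIs demonstrated elsewhere on this page; the re_first call assumes Scrapling keeps Parsel-style regex helpers on text results:

from scrapling.fetchers import Fetcher

page = Fetcher.get('https://quotes.toscrape.com/')

quote = page.css_first('.quote')       # fastest way to grab the first match
text = quote.css_first('.text::text')  # nested selection on an element
container = quote.parent               # parent navigation
neighbor = quote.next_sibling          # sibling navigation

# Parsel-style regex extraction on text content (assumed helper)
first_word = text.re_first(r'\w+')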

Getting Started

Basic Usage

from scrapling.fetchers import Fetcher, StealthyFetcher, DynamicFetcher
from scrapling.fetchers import FetcherSession, StealthySession, DynamicSession

# HTTP requests with session support
with FetcherSession(impersonate='chrome') as session:  # Use latest version of Chrome's TLS fingerprint
    page = session.get('https://quotes.toscrape.com/', stealthy_headers=True)
    quotes = page.css('.quote .text::text')

# Or use one-off requests
page = Fetcher.get('https://quotes.toscrape.com/')
quotes = page.css('.quote .text::text')

# Advanced stealth mode (Keep the browser open until you finish)
with StealthySession(headless=True, solve_cloudflare=True) as session:
    page = session.fetch('https://nopecha.com/demo/cloudflare', google_search=False)
    data = page.css('#padded_content a')

# Or use the one-off request style; it opens the browser for this request, then closes it after finishing
page = StealthyFetcher.fetch('https://nopecha.com/demo/cloudflare')
data = page.css('#padded_content a')
    
# Full browser automation (Keep the browser open until you finish)
with DynamicSession(headless=True, disable_resources=False, network_idle=True) as session:
    page = session.fetch('https://quotes.toscrape.com/', load_dom=False)
    data = page.xpath('//span[@class="text"]/text()')  # XPath selector if you prefer it

# Or use the one-off request style; it opens the browser for this request, then closes it after finishing
page = DynamicFetcher.fetch('https://quotes.toscrape.com/')
data = page.css('.quote .text::text')

[!NOTE] There's a wonderful guide written by The Web Scraping Club to get you started quickly with Scrapling here, in case you find it easier to follow than the documentation website.

Advanced Parsing & Navigation

from scrapling.fetchers import Fetcher

# Rich element selection and navigation
page = Fetcher.get('https://quotes.toscrape.com/')

# Get quotes with multiple selection methods
quotes = page.css('.quote')  # CSS selector
quotes = page.xpath('//div[@class="quote"]')  # XPath
quotes = page.find_all('div', {'class': 'quote'})  # BeautifulSoup-style
# Same as
quotes = page.find_all('div', class_='quote')
quotes = page.find_all(['div'], class_='quote')
quotes = page.find_all(class_='quote')  # and so on...
# Find element by text content
quotes = page.find_by_text('quote', tag='div')

# Advanced navigation
first_quote = page.css_first('.quote')
quote_text = first_quote.css('.text::text')
quote_text = page.css('.quote').css_first('.text::text')  # Chained selectors
quote_text = page.css_first('.quote .text').text  # Using `css_first` is faster than `css` if you want the first element
author = first_quote.next_sibling.css('.author::text')
parent_container = first_quote.parent

# Element relationships and similarity
similar_elements = first_quote.find_similar()
below_elements = first_quote.below_elements()

If you don't want to fetch websites, you can use the parser right away, like below:

from scrapling.parser import Selector

page = Selector("<html>...</html>")

And it works precisely the same way!
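
For instance, a self-contained run against a literal HTML string:

from scrapling.parser import Selector

html = '<html><body><div class="quote"><span class="text">To be, or not to be</span></div></body></html>'
page = Selector(html)

print(page.css_first('.quote .text::text'))  # To be, or not to be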

Async Session Management Examples

import asyncio
from scrapling.fetchers import FetcherSession, AsyncStealthySession, AsyncDynamicSession

async with FetcherSession(http3=True) as session:  # `FetcherSession` is context-aware and can work in both sync/async patterns
    page1 = session.get('https://quotes.toscrape.com/')
    page2 = session.get('https://quotes.toscrape.com/', impersonate='firefox135')

# Async session usage
async with AsyncStealthySession(max_pages=2) as session:
    tasks = []
    urls = ['https://example.com/page1', 'https://example.com/page2']
    
    for url in urls:
        task = session.fetch(url)
        tasks.append(task)
    
    print(session.get_pool_stats())  # Optional - The status of the browser tabs pool (busy/free/error)
    results = await asyncio.gather(*tasks)
    print(session.get_pool_stats())

CLI & Interactive Shell

Scrapling v0.3 includes a powerful command-line interface:


Launch the interactive Web Scraping shell

scrapling shell

Extract pages to a file directly without programming (the content inside the body tag is extracted by default). If the output file ends with .txt, the text content of the target is extracted; if it ends with .md, a Markdown representation of the HTML content is saved; if it ends with .html, the HTML content itself is saved.

scrapling extract get 'https://example.com' content.md
scrapling extract get 'https://example.com' content.txt --css-selector '#fromSkipToProducts' --impersonate 'chrome'  # All elements matching the CSS selector '#fromSkipToProducts'
scrapling extract fetch 'https://example.com' content.md --css-selector '#fromSkipToProducts' --no-headless
scrapling extract stealthy-fetch 'https://nopecha.com/demo/cloudflare' captchas.html --css-selector '#padded_content a' --solve-cloudflare

[!NOTE] There are many additional features, such as the MCP server and the interactive Web Scraping shell, but we want to keep this page concise. Check out the full documentation here.

Performance Benchmarks

Scrapling isn't just powerful; it's also blazing fast, and the updates since version 0.3 have delivered exceptional performance improvements across all operations. The following benchmarks compare Scrapling's parser with other popular libraries.

Text Extraction Speed Test (5000 nested elements)

#  Library            Time (ms)  vs Scrapling
1  Scrapling          1.99       1.0x
2  Parsel/Scrapy      2.01       1.01x
3  Raw Lxml           2.5        1.256x
4  PyQuery            22.93      ~11.5x
5  Selectolax         80.57      ~40.5x
6  BS4 with Lxml      1541.37    ~774.6x
7  MechanicalSoup     1547.35    ~777.6x
8  BS4 with html5lib  3410.58    ~1713.9x

Element Similarity & Text Search Performance

Scrapling's adaptive element finding capabilities significantly outperform alternatives:

Library      Time (ms)  vs Scrapling
Scrapling    2.46       1.0x
AutoScraper  13.3       5.407x

All benchmarks represent averages of 100+ runs. See benchmarks.py for methodology.

Installation

Scrapling requires Python 3.10 or higher:

pip install scrapling

Starting with v0.3.2, this installation includes only the parser engine and its dependencies, without any fetcher or command-line dependencies.
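
In other words, the base package is enough for parsing HTML you already have. A quick sketch:

# Works with the base `pip install scrapling` alone
from scrapling.parser import Selector

page = Selector('<div class="title">Scrapling</div>')
print(page.css_first('.title::text'))  # Scrapling

# The fetchers require the extras described below, e.g.:
# from scrapling.fetchers import Fetcher  # needs `pip install "scrapling[fetchers]"`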

Optional Dependencies

  1. If you are going to use any of the extra features below, the fetchers, or their classes, you will need to install fetchers' dependencies and their browser dependencies as follows:

    pip install "scrapling[fetchers]"
    
    scrapling install
    

    This downloads all browsers, along with their system dependencies and fingerprint manipulation dependencies.

  2. Extra features:

    • Install the MCP server feature:
      pip install "scrapling[ai]"
      
    • Install shell features (Web Scraping shell and the extract command):
      pip install "scrapling[shell]"
      
    • Install everything:
      pip install "scrapling[all]"
      

    Remember that you need to install the browser dependencies with scrapling install after installing any of these extras (if you haven't already).

Docker

You can also pull a Docker image with all extras and browsers installed, using the following command from DockerHub:

docker pull pyd4vinci/scrapling

Or download it from the GitHub registry:

docker pull ghcr.io/d4vinci/scrapling:latest

This image is automatically built and pushed using GitHub Actions and the repository's main branch.
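
As a hypothetical example of using the image interactively (the exact entrypoint depends on how the image is built, so treat this as a sketch):

docker run -it --rm ghcr.io/d4vinci/scrapling:latest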

Contributing

We welcome contributions! Please read our contributing guidelines before getting started.

Disclaimer

[!CAUTION] This library is provided for educational and research purposes only. By using this library, you agree to comply with local and international data scraping and privacy laws. The authors and contributors are not responsible for any misuse of this software. Always respect the terms of service of websites and robots.txt files.

License

This work is licensed under the BSD-3-Clause License.

Acknowledgments

This project includes code adapted from:

  • Parsel (BSD License): used for the translator submodule

Thanks and References


Designed & crafted with ❤️ by Karim Shoair.

