NebulaScrape — Ultra-powerful HTTP scraping library with smart bypass, async support, and modular transport.

NebulaScrape

NebulaScrape is a production-grade Python HTTP scraping library built for the modern web. It combines a modular transport system, intelligent session analysis, browser-realistic fingerprinting, async support, and a powerful WAF bypass engine into a single clean API.

Table of Contents

Overview
Installation
Quick Start
Fingerprint Profiles
Smart Retry Engine
Session Intelligence Layer
Modular Transport System
Async Support
Built-in Metrics
Plugin System
WAF Bypass Engine
Advanced Usage
API Reference
Configuration Reference

Overview

NebulaScrape was designed to solve the hardest problem in modern web scraping: getting a real HTTP response from a server that actively tries to block automated clients.

Most scraping libraries send requests that are easy to identify as automated. The TLS fingerprint and header order do not match a browser, the sec-ch-ua fields are missing, there are no browser-like timing patterns, and there is no way to recover intelligently from a block. NebulaScrape was built from the ground up to address all of these problems at once.

What Makes NebulaScrape Different

TLS fingerprint spoofing. Many modern WAFs inspect the TLS ClientHello. NebulaScrape is designed to send the cipher suite list, ECDH curve, and TLS extensions that Chrome 120 sends. A plain requests session presents the default Python/OpenSSL fingerprint, which is easy to flag as a non-browser client.

Ordered, realistic HTTP headers. Browsers send headers in a specific order that WAFs check. NebulaScrape uses OrderedDict-based profiles that match real browser traffic captures, including the correct sec-ch-ua, sec-ch-ua-mobile, sec-ch-ua-platform, Sec-Fetch-* fields.

Smart retry decisions. When a request fails with 403, 429, or 503, NebulaScrape does not blindly retry. It analyzes the response, reads the Retry-After header, calculates exponential backoff with jitter, decides whether to rotate the session fingerprint or rebuild the connection, and escalates to a more capable transport if needed.

Session intelligence. Every response is analyzed for WAF vendor signatures. The library tells you whether you hit a Cloudflare IUAM page, a DataDome challenge, a PerimeterX block, rate limiting, or a redirect loop, and attaches a risk score from 0 to 100 to each response.

Transport escalation. Against heavily protected targets, NebulaScrape automatically escalates from standard HTTP/1.1 with TLS spoofing, to HTTP/2 via httpx, to full browser impersonation via curl_cffi. No code change is required.

Async-first. NebulaScrape ships with a native asyncio client that supports all the same features, including retry logic, intelligence analysis, metrics, and plugins.

Installation

Minimum requirements: Python 3.8+

Install the core library:

pip install nebulascrape

Install with headless browser impersonation support (required for Cloudflare Turnstile, Managed Challenge, and the most aggressive WAF protections):

pip install "nebulascrape[headless]"

Install from source:

git clone https://github.com/6x-u/nebulascrape.git
cd nebulascrape
pip install -e .

Dependencies

Package	Purpose	Required
`requests` >= 2.9.2	Base HTTP transport	Yes
`requests_toolbelt` >= 0.9.1	Request debugging	Yes
`pyparsing` >= 2.4.7	JS challenge parsing	Yes
`httpx[http2]` >= 0.24.0	HTTP/2 transport	Yes
`h2` >= 4.0.0	HTTP/2 protocol	Yes
`aiohttp` >= 3.8.0	Async fallback transport	Yes
`curl_cffi` >= 0.5.0	Browser impersonation	Optional (headless)
`brotli` >= 1.0.9	Brotli decompression	Optional

Quick Start

The simplest way to use NebulaScrape is through the Client class. It handles everything internally.

from nebulascrape import Client

client = Client(profile="chrome_windows", auto_retry=True)
response = client.get("https://target.com")

print(response.status_code)
print(response.meta["challenge_type"])
print(response.meta["risk_score"])
print(response.meta["metrics"]["latency_ms"])

For full session control, use NebulaScraper directly:

from nebulascrape import NebulaScraper

scraper = NebulaScraper(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=5,
    mode="auto",
    interpreter="native",
    debug=False,
)

response = scraper.get("https://target.com")
print(response.meta)

For token extraction:

from nebulascrape import get_tokens

tokens, user_agent = get_tokens("https://target.com")
print(tokens)
print(user_agent)

Fingerprint Profiles

NebulaScrape ships with four pre-built browser fingerprint profiles. Each profile contains a real User-Agent string, browser-realistic headers in the correct order, a matching TLS cipher suite list, and the correct ECDH curve.

Profile Name	Browser	Platform	sec-ch-ua-mobile
`chrome_windows`	Chrome 120	Windows 10 x64	false
`chrome_linux`	Chrome 120	Linux x86_64	false
`firefox`	Firefox 121	Windows 10	N/A
`mobile`	Chrome 120	Android 13	true

Using a Profile

from nebulascrape import Client

# Use any built-in profile
client = Client(profile="chrome_linux")
client = Client(profile="firefox")
client = Client(profile="mobile")

Inspecting a Profile

from nebulascrape.fingerprints import get_profile, available_profiles

print(available_profiles())
# ['chrome_windows', 'chrome_linux', 'firefox', 'mobile']

profile = get_profile("chrome_windows")
print(profile["user_agent"])
print(profile["headers"])
print(profile["cipher_suite"])

Why Header Order Matters

A standard requests session sends only its own small default header set (User-Agent, Accept-Encoding, Accept, Connection), which does not resemble any browser. Real browsers send headers in a fixed, browser-specific order, and WAFs such as DataDome and Kasada can use header order as a bot signal. NebulaScrape uses OrderedDict to enforce the correct order for every profile:

User-Agent
Accept
Accept-Language
Accept-Encoding
sec-ch-ua
sec-ch-ua-mobile
sec-ch-ua-platform
Upgrade-Insecure-Requests
Sec-Fetch-Dest
Sec-Fetch-Mode
Sec-Fetch-Site
Sec-Fetch-User

This matches the exact order captured from a real Chrome 120 browser session.
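
The ordering idea can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `merge_headers` and `PROFILE_ORDER` are hypothetical names, and the header values are placeholders rather than a real capture. It shows how user-supplied headers can override profile values without disturbing the profile's order.

```python
from collections import OrderedDict

# Header order copied from the list above; values below are placeholders.
PROFILE_ORDER = [
    "User-Agent", "Accept", "Accept-Language", "Accept-Encoding",
    "sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform",
    "Upgrade-Insecure-Requests",
    "Sec-Fetch-Dest", "Sec-Fetch-Mode", "Sec-Fetch-Site", "Sec-Fetch-User",
]


def merge_headers(profile_headers, user_headers):
    """Apply user overrides in place; append headers the profile does not know."""
    lookup = {name.lower(): name for name in profile_headers}
    merged = OrderedDict(profile_headers)
    for name, value in user_headers.items():
        canonical = lookup.get(name.lower())
        if canonical is not None:
            merged[canonical] = value  # override keeps its original position
        else:
            merged[name] = value       # unknown header goes to the end
    return merged


profile = OrderedDict((name, "placeholder") for name in PROFILE_ORDER)
merged = merge_headers(
    profile,
    {"accept-language": "de-DE", "Referer": "https://example.com"},
)
print(list(merged))
```

Assigning to an existing OrderedDict key keeps that key's position, which is what makes the override order-preserving.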

Smart Retry Engine

The SmartRetryEngine replaces naive retry loops with a response-aware retry decision system. It analyzes each failed response and decides the appropriate action based on the error type, attempt count, and session intelligence.

How It Works

Every response is passed through analyze_response(), which returns a RetryDecision containing:

action — what to do next (pass, wait and retry, rotate session, rebuild connection, switch transport, or abort)
backoff_seconds — how long to wait before retrying
rotate_session — whether to change the User-Agent and fingerprint
rebuild_connection — whether to tear down and rebuild the connection pool
switch_transport — whether to escalate to a higher-tier transport

Per-Status Logic

HTTP 403 Forbidden

Indicates fingerprint detection or IP block. The engine rotates the browser fingerprint and session identity, waits a random jitter interval between 2 and 8 seconds to simulate human behavior, and rebuilds the connection on the third attempt to clear any connection-level state the server may be tracking.

HTTP 429 Too Many Requests

Indicates rate limiting. The engine first reads the Retry-After response header and uses that value if present, adding a small random jitter. If no header is present, it calculates exponential backoff: 1.5 * 2^attempt seconds, capped at 120 seconds. Session rotation activates from the second attempt onward.
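
As a worked example of the 429 rule, the sketch below computes the delay for successive attempts. It is an illustration under stated assumptions: `backoff_seconds` is a hypothetical helper, the attempt counter is assumed to be zero-based, the jitter range of up to one second is a guess, and only the numeric (delay-seconds) form of Retry-After is handled.

```python
import random


def backoff_seconds(attempt, retry_after=None, base=1.5, cap=120.0,
                    max_jitter=1.0, rng=random):
    """Delay before the next retry after a 429 response.

    attempt     -- zero-based attempt counter (assumption)
    retry_after -- parsed Retry-After value in seconds, or None if absent
    """
    if retry_after is not None:
        # A server-provided delay wins; add a small random jitter on top.
        return retry_after + rng.uniform(0.0, max_jitter)
    # Exponential backoff: base * 2^attempt, capped.
    return min(base * (2 ** attempt), cap)


for attempt in range(8):
    print(attempt, backoff_seconds(attempt))
```

With these assumptions the delays run 1.5, 3, 6, 12, 24, 48, and 96 seconds, and the 120-second cap applies from attempt 7 onward (1.5 * 2^7 = 192).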

HTTP 503 Service Unavailable

Indicates the connection itself may be flagged. The engine rebuilds the connection pool immediately and escalates to a higher transport tier from the second attempt onward.

HTTP 407, 408, 502, 504, 52x

Treated as transient infrastructure errors. Exponential backoff applies, capped at 90 seconds.
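
The status rules above can be sketched as a single decision function. This is a hedged illustration only: `Decision` is a hypothetical stand-in for the library's RetryDecision, the action strings are invented, the attempt counter is assumed to be zero-based, the 1.5-second base is reused for transient errors because the text does not state one, and statuses not covered above are simply passed through.

```python
import random
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical stand-in for RetryDecision."""
    action: str
    backoff_seconds: float = 0.0
    rotate_session: bool = False
    rebuild_connection: bool = False
    switch_transport: bool = False


TRANSIENT = {407, 408, 502, 504} | set(range(520, 530))  # 52x


def expo(attempt, cap, base=1.5):
    """Exponential backoff base * 2^attempt, capped."""
    return min(base * (2 ** attempt), cap)


def decide(status, attempt, max_retries=5, rng=random):
    """attempt is zero-based: 0 is the first failed attempt (assumption)."""
    if status not in TRANSIENT and status not in (403, 429, 503):
        return Decision("pass")
    if attempt >= max_retries:
        return Decision("abort")
    if status == 403:
        # Rotate identity, wait 2-8 s, rebuild the connection on the third attempt.
        return Decision("rotate_session", rng.uniform(2.0, 8.0),
                        rotate_session=True, rebuild_connection=(attempt == 2))
    if status == 429:
        # Back off; rotate the session from the second attempt onward.
        return Decision("wait_and_retry", expo(attempt, 120.0),
                        rotate_session=(attempt >= 1))
    if status == 503:
        # Rebuild immediately; escalate transport from the second attempt onward.
        return Decision("rebuild_connection", rebuild_connection=True,
                        switch_transport=(attempt >= 1))
    return Decision("wait_and_retry", expo(attempt, 90.0))


print(decide(429, 1))
```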

Intelligence-driven retry

If SessionIntelligence detects a challenge or high-risk response (even on a 200), the retry engine uses the intel result to decide whether to rotate, switch transport, or escalate.

Configuration

from nebulascrape import Client

client = Client(
    auto_retry=True,
    max_retries=7,  # default is 5
)

from nebulascrape.retry_engine import SmartRetryEngine

engine = SmartRetryEngine(max_retries=10, base_backoff=2.0)

Session Intelligence Layer

The SessionIntelligence class analyzes every response and classifies the WAF vendor and challenge type. This information is attached to response.meta on every request.

Challenge Types

Challenge Type	Description
`none`	Clean response, no challenge detected
`cf_iuam`	Cloudflare I'm Under Attack Mode (v1 JS challenge)
`cf_captcha`	Cloudflare hCaptcha / reCAPTCHA challenge
`cf_turnstile`	Cloudflare Turnstile (v3 challenge)
`cf_managed`	Cloudflare Managed Challenge
`cf_block_1020`	Cloudflare firewall rule block (error 1020)
`datadome`	DataDome bot detection challenge
`perimeterx`	PerimeterX / HUMAN Security block
`kasada`	Kasada protection challenge
`akamai`	Akamai Bot Manager challenge
`imperva`	Imperva / Incapsula protection
`shape`	F5 Shape Security protection
`rate_limited`	Generic rate limiting (429 or Retry-After header)
`js_required`	Page requires JavaScript execution
`redirect_loop`	Detected circular redirect chain

Risk Score

The risk score is an integer from 0 to 100 indicating how likely it is that the response is a block or detection event:

Score Range	Interpretation
0	Clean response
1-30	WAF present but not triggered
31-60	Rate limiting or soft block
61-80	Active JS or captcha challenge
81-100	Hard block, firewall, or advanced WAF challenge
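
If you want to act on the score in your own code, the table can be turned into a small lookup. `interpret_risk` is a hypothetical helper written for this example and is not part of the library.

```python
def interpret_risk(score):
    """Map a 0-100 risk score to the interpretation from the table above."""
    if not 0 <= score <= 100:
        raise ValueError("risk score must be between 0 and 100")
    if score == 0:
        return "Clean response"
    if score <= 30:
        return "WAF present but not triggered"
    if score <= 60:
        return "Rate limiting or soft block"
    if score <= 80:
        return "Active JS or captcha challenge"
    return "Hard block, firewall, or advanced WAF challenge"


print(interpret_risk(45))
```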

Reading the Meta

from nebulascrape import Client

client = Client(profile="chrome_windows", auto_retry=True)
response = client.get("https://target.com")

print(response.meta["challenge_type"])   # "cf_iuam" / "datadome" / "none" / ...
print(response.meta["risk_score"])       # 0 - 100
print(response.meta["waf_vendor"])       # "cloudflare" / "akamai" / "none" / ...
print(response.meta["retry_recommended"])
print(response.meta["rotate_session"])
print(response.meta["details"])          # {"retry_after": None, "cf_ray": "...", "status_code": 200}

Direct Usage

from nebulascrape.session_intel import SessionIntelligence
import requests

resp = requests.get("https://some-protected-site.com")
intel = SessionIntelligence()
result = intel.analyze(resp)

print(result.challenge_type)
print(result.risk_score)
print(result.waf_vendor)

Modular Transport System

NebulaScrape uses a three-tier transport system. Each tier provides a higher level of browser mimicry. The TransportManager can automatically escalate through tiers when lower tiers accumulate failures.

Transport Tiers

Tier 1 — TransportHTTP

Standard HTTPS over HTTP/1.1 with TLS fingerprint spoofing. Uses a custom HTTPAdapter that builds an SSL context with the exact cipher suite list, ECDH curve, and TLS version range from the selected fingerprint profile. This matches the JA3/JA4 fingerprint of real Chrome or Firefox and passes most WAF TLS fingerprint checks.

Tier 2 — TransportHTTP2

HTTP/2 transport backed by httpx. Sends the SETTINGS frame, WINDOW_UPDATE values, and pseudo-header order (:method, :authority, :scheme, :path) that match real Chrome HTTP/2 fingerprints. Some sites treat clients that cannot negotiate HTTP/2 as suspicious.

Tier 3 — TransportHeadless

Full browser impersonation using curl_cffi. This sends traffic that closely reproduces the target browser at the TLS and HTTP/2 layers, using libcurl compiled with BoringSSL. Used as a last resort for Cloudflare Turnstile, Managed Challenge, Kasada, and similar advanced protections.
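
The escalation behavior can be pictured as a small state machine. The sketch below is an assumption-laden illustration, not the TransportManager itself: `TierTracker` is a hypothetical class, and the threshold of two consecutive failures per tier is invented, since the text only says that tiers escalate when failures accumulate.

```python
TIERS = ["http1", "http2", "headless"]


class TierTracker:
    """Move up one tier after `failure_threshold` consecutive failures."""

    def __init__(self, failure_threshold=2):
        self.failure_threshold = failure_threshold
        self.index = 0
        self.failures = 0

    @property
    def current(self):
        return TIERS[self.index]

    def record(self, ok):
        """Record one request outcome and return the tier to use next."""
        if ok:
            self.failures = 0
            return self.current
        self.failures += 1
        if self.failures >= self.failure_threshold and self.index < len(TIERS) - 1:
            self.index += 1      # escalate one tier
            self.failures = 0    # start counting afresh on the new tier
        return self.current


tracker = TierTracker()
for outcome in (False, False, True, False, False, False):
    print(outcome, "->", tracker.record(outcome))
```

The tracker never moves past the last tier, so repeated failures on `headless` leave it there.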

Modes

from nebulascrape import Client

# Auto: starts at HTTP1, escalates to HTTP2, then Headless on repeated failures
client = Client(mode="auto")

# Force a specific transport
client = Client(mode="http1")
client = Client(mode="http2")
client = Client(mode="headless")

Manual Transport Control

from nebulascrape import NebulaScraper
from nebulascrape.transports import TransportManager

# The manager operates on an existing scraper session
scraper_session = NebulaScraper(profile="chrome_windows")

manager = TransportManager(profile_name="chrome_windows", mode="auto")
manager.mount_on(scraper_session)  # attach the current transport tier
manager.escalate(scraper_session)  # manually escalate one tier
manager.rebuild(scraper_session)   # rebuild the current transport

Async Support

NebulaScrape provides a native asyncio client through AsyncNebulaScraper (also exported as AsyncClient). It supports the same profile system, retry engine, session intelligence, metrics, and plugin hooks as the synchronous client.

The async client uses httpx.AsyncClient with HTTP/2 enabled as its primary backend, and falls back to aiohttp if httpx is not available.

Basic Async Usage

import asyncio
from nebulascrape import AsyncClient

async def main():
    client = AsyncClient(profile="chrome_windows", auto_retry=True)
    response = await client.get("https://target.com")
    print(response.status_code)
    print(response.meta)
    await client.close()

asyncio.run(main())

Context Manager

import asyncio
from nebulascrape import AsyncClient

async def main():
    async with AsyncClient(profile="chrome_linux", auto_retry=True, max_retries=5) as client:
        r1 = await client.get("https://httpbin.org/get")
        r2 = await client.post("https://httpbin.org/post", json={"key": "value"})
        print(r1.status_code, r2.status_code)

asyncio.run(main())

Concurrent Requests

import asyncio
from nebulascrape import AsyncClient

async def fetch(client, url):
    r = await client.get(url)
    return r.status_code, r.meta["risk_score"]

async def main():
    async with AsyncClient(profile="chrome_windows") as client:
        urls = [
            "https://httpbin.org/get",
            "https://httpbin.org/headers",
            "https://httpbin.org/ip",
        ]
        results = await asyncio.gather(*[fetch(client, u) for u in urls])
        for status, risk in results:
            print(f"Status: {status}  Risk: {risk}")

asyncio.run(main())
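
For larger URL lists you may want to cap how many requests are in flight at once. The sketch below uses only asyncio and a stand-in coroutine, so it runs without network access. `gather_bounded` and `fake_fetch` are hypothetical names, and in real use you would pass a coroutine function that wraps `client.get`.

```python
import asyncio


async def gather_bounded(fetch, urls, limit=5):
    """Run fetch(url) for every URL with at most `limit` running concurrently."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url):
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(u) for u in urls))


async def fake_fetch(url):
    """Stand-in for a real request; yields control once and returns a status."""
    await asyncio.sleep(0)
    return url, 200


results = asyncio.run(gather_bounded(fake_fetch, ["a", "b", "c"], limit=2))
print(results)
```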

Built-in Metrics

Every response returned by NebulaScrape contains a meta["metrics"] dictionary with timing and retry information collected during the request lifecycle.

Per-Request Metrics

Field	Type	Description
`latency_ms`	float	Total request duration in milliseconds
`tls_handshake_ms`	float	Approximate TLS handshake time in milliseconds
`retry_count`	int	Number of retries made for this request
`redirect_depth`	int	Number of redirects followed
`transport_used`	str	Which transport tier was active (`http1`, `http2`, `async_http2`)

Reading Metrics

from nebulascrape import Client

client = Client(profile="chrome_windows", auto_retry=True)
response = client.get("https://httpbin.org/get")

m = response.meta["metrics"]
print(f"Latency:   {m['latency_ms']} ms")
print(f"Handshake: {m['tls_handshake_ms']} ms")
print(f"Retries:   {m['retry_count']}")
print(f"Redirects: {m['redirect_depth']}")
print(f"Transport: {m['transport_used']}")

Session-Level Aggregate Metrics

from nebulascrape import Client

client = Client(profile="chrome_windows")

for url in ["https://httpbin.org/get", "https://httpbin.org/headers"]:
    client.get(url)

stats = client.metrics
print(f"Total requests:    {stats['total_requests']}")
print(f"Average latency:   {stats['avg_latency_ms']} ms")
print(f"Max latency:       {stats['max_latency_ms']} ms")
print(f"Total retries:     {stats['total_retries']}")
print(f"Challenges solved: {stats['challenges_solved']}")
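
To make the relationship between per-request and aggregate metrics concrete, the sketch below derives the aggregate fields from a list of per-request metric dictionaries. `aggregate` is a hypothetical helper and may not match how the library computes `client.metrics`. `challenges_solved` is left out because it cannot be derived from the per-request fields documented above.

```python
def aggregate(per_request):
    """Summarize a list of per-request metric dicts."""
    if not per_request:
        return {"total_requests": 0, "avg_latency_ms": 0.0,
                "max_latency_ms": 0.0, "total_retries": 0}
    latencies = [m["latency_ms"] for m in per_request]
    return {
        "total_requests": len(per_request),
        "avg_latency_ms": sum(latencies) / len(latencies),
        "max_latency_ms": max(latencies),
        "total_retries": sum(m["retry_count"] for m in per_request),
    }


sample = [
    {"latency_ms": 100.0, "retry_count": 0},
    {"latency_ms": 300.0, "retry_count": 2},
]
print(aggregate(sample))
```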

Plugin System

NebulaScrape includes a plugin registry that allows you to attach custom behavior to the request lifecycle without modifying the core library. All plugins inherit from BasePlugin and can hook into pre-request, post-request, challenge detection, and retry events.

Built-in Plugins

RateLimitPlugin

Adds adaptive pre-request delays based on request rate. Detects burst patterns and automatically increases delays. Respects Retry-After headers on 429 responses.

from nebulascrape import Client
from nebulascrape.plugins.rate_limit_handler import RateLimitPlugin

client = Client(profile="chrome_windows")
client.register_plugin(RateLimitPlugin(
    min_delay=0.3,
    max_delay=2.0,
    burst_threshold=10,
))

HeaderOptimizerPlugin

Ensures browser-realistic headers are applied to every request, merging them with any user-supplied headers while preserving the correct order. Adjusts Sec-Fetch headers automatically for POST requests.

from nebulascrape import Client
from nebulascrape.plugins.header_optimizer import HeaderOptimizerPlugin

client = Client(profile="chrome_windows")
client.register_plugin(HeaderOptimizerPlugin(profile_name="chrome_windows"))

ProxyManagerPlugin

Manages a pool of proxy servers with automatic rotation on failure. Tracks per-proxy failure counts and rotates after two consecutive failures on the same proxy.

from nebulascrape import Client
from nebulascrape.plugins.proxy_manager import ProxyManagerPlugin

proxies = [
    "http://user:pass@proxy1:8080",
    "http://user:pass@proxy2:8080",
    "http://user:pass@proxy3:8080",
]

client = Client(profile="chrome_windows")
client.register_plugin(ProxyManagerPlugin(
    proxies=proxies,
    rotate_on_fail=True,
    rotate_on_status=[403, 429, 503],
))
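
The rotation rule described above (rotate after two consecutive failures on the same proxy) can be sketched as follows. `ProxyPool` is a hypothetical class written for illustration. The round-robin choice of the next proxy is an assumption, and the plugin's real selection strategy may differ.

```python
class ProxyPool:
    """Round-robin proxy pool that rotates after N consecutive failures."""

    def __init__(self, proxies, max_consecutive_failures=2):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self.proxies = list(proxies)
        self.max_consecutive_failures = max_consecutive_failures
        self.index = 0
        self.failures = {proxy: 0 for proxy in self.proxies}

    @property
    def current(self):
        return self.proxies[self.index]

    def report(self, ok):
        """Record the outcome for the current proxy; return the proxy to use next."""
        proxy = self.current
        if ok:
            self.failures[proxy] = 0  # a success resets the streak
            return proxy
        self.failures[proxy] += 1
        if self.failures[proxy] >= self.max_consecutive_failures:
            self.failures[proxy] = 0
            self.index = (self.index + 1) % len(self.proxies)
        return self.current


pool = ProxyPool(["http://proxy1:8080", "http://proxy2:8080", "http://proxy3:8080"])
print(pool.report(False))  # first failure: stay on proxy1
print(pool.report(False))  # second consecutive failure: move to proxy2
```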

Writing a Custom Plugin

from nebulascrape.plugins import BasePlugin
from nebulascrape import Client

class LoggingPlugin(BasePlugin):
    name = "logging_plugin"
    priority = 5  # lower number = runs first

    def on_pre_request(self, scraper, method, url, kwargs):
        print(f"REQUEST  {method} {url}")
        return kwargs

    def on_post_request(self, scraper, response, kwargs):
        print(f"RESPONSE {response.status_code} - risk={response.meta.get('risk_score', 'n/a')}")
        return response

    def on_retry(self, scraper, attempt, decision):
        print(f"RETRY {attempt} - reason: {decision.reason} - waiting {decision.backoff_seconds:.1f}s")

client = Client(profile="chrome_windows", auto_retry=True)
client.register_plugin(LoggingPlugin())

response = client.get("https://httpbin.org/get")

Plugin Hook Reference

Hook	When it runs	Return value
`on_pre_request(scraper, method, url, kwargs)`	Before every request	Modified kwargs dict
`on_post_request(scraper, response, kwargs)`	After every response	response object
`on_challenge_detected(scraper, response, intel_result)`	When a challenge is found	bool
`on_retry(scraper, attempt, decision)`	Before each retry sleep	None
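
To illustrate how a priority-ordered hook chain behaves, here is a minimal registry sketch. `Registry` and `Tagger` are hypothetical names, and the library's real registry may differ. The sketch only demonstrates the two documented rules: a lower priority number runs first, and `on_pre_request` returns the (possibly modified) kwargs dict.

```python
class Registry:
    """Runs on_pre_request hooks in ascending priority order."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin):
        self._plugins.append(plugin)
        # sort() is stable, so equal priorities keep registration order.
        self._plugins.sort(key=lambda p: p.priority)

    def run_pre_request(self, scraper, method, url, kwargs):
        for plugin in self._plugins:
            hook = getattr(plugin, "on_pre_request", None)
            if hook is not None:
                kwargs = hook(scraper, method, url, kwargs)
        return kwargs


class Tagger:
    """Example plugin that records its name in kwargs['trace']."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority

    def on_pre_request(self, scraper, method, url, kwargs):
        kwargs = dict(kwargs)
        kwargs["trace"] = kwargs.get("trace", []) + [self.name]
        return kwargs


registry = Registry()
registry.register(Tagger("late", priority=10))
registry.register(Tagger("early", priority=1))
out = registry.run_pre_request(None, "GET", "https://example.com", {})
print(out["trace"])
```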

WAF Bypass Engine

NebulaScrape's bypass capabilities are integrated across multiple layers of the library. There is no single "bypass" function. Instead, bypass is the result of the fingerprint, transport, intelligence, and retry systems working together.

Cloudflare

I'm Under Attack Mode (v1)

Detected by inspecting the response body for the characteristic jsch trace image and challenge form. The library extracts the challenge parameters, waits a browser-realistic delay (parsed from the page's own JavaScript, with jitter added), solves the JavaScript challenge using the native interpreter, submits the solution as a POST request, and follows the redirect to retrieve the real page. The cf_clearance cookie is then retained in the session for future requests.

Turnstile

Detected by looking for cf-turnstile or challenges.cloudflare.com/turnstile in the response. When this challenge is detected, the library raises TurnstileChallengeError and recommends using TransportHeadless with curl_cffi, which passes the Turnstile check at the TLS and HTTP/2 fingerprint layer without requiring a browser.

Managed Challenge and v2

Detected by inspecting the CDN CGI orchestration endpoint pattern. Escalation to the headless transport is recommended.

Cloudflare Firewall 1020

Detected and raised as CloudflareCode1020. This is an IP-level block that requires proxy rotation.

Multi-WAF Detection

The SessionIntelligence layer detects the following vendors using header and body signature matching:

WAF	Detection Method
Cloudflare	`Server: cloudflare` header + body patterns
DataDome	`dd_sitekey`, `datadome.co` cookie domains
PerimeterX	`_pxdk` cookie, `PerimeterX` body references
Kasada	`kasada`, `kpsdk` body references
Akamai	`_abck`, `ak_bmsc` cookies, sensor_data
Imperva	`incap_ses_`, `visid_incap_` cookies
Shape Security	`shape.io`, `x-shape-` headers
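
Signature matching of this kind can be sketched in a few lines. `detect_vendor` is a hypothetical function, not the SessionIntelligence implementation. It checks only the signatures listed in the table, looks for the DataDome markers in the body as a simplification, returns the first match, and checks Cloudflare last so that a more specific vendor behind Cloudflare is reported first. All of these choices are assumptions.

```python
def detect_vendor(headers, cookie_names, body):
    """Return a vendor label based on header, cookie, and body signatures."""
    hdrs = {k.lower(): str(v).lower() for k, v in headers.items()}
    cookies = [c.lower() for c in cookie_names]
    text = body.lower()

    def has_cookie(*prefixes):
        return any(c.startswith(p) for c in cookies for p in prefixes)

    if "dd_sitekey" in text or "datadome.co" in text:
        return "datadome"
    if has_cookie("_pxdk") or "perimeterx" in text:
        return "perimeterx"
    if "kasada" in text or "kpsdk" in text:
        return "kasada"
    if has_cookie("_abck", "ak_bmsc") or "sensor_data" in text:
        return "akamai"
    if has_cookie("incap_ses_", "visid_incap_"):
        return "imperva"
    if "shape.io" in text or any(k.startswith("x-shape-") for k in hdrs):
        return "shape"
    if hdrs.get("server") == "cloudflare":
        return "cloudflare"
    return "none"


print(detect_vendor({"Server": "cloudflare"}, [], "<html></html>"))
```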

TLS Fingerprint Spoofing

Python's standard ssl module, with its default settings, produces a TLS fingerprint (JA3) that is easy to identify as a non-browser client. NebulaScrape replaces the default SSL context with one that:

Sets the cipher suite list to match Chrome 120's exact order
Sets the ECDH curve to prime256v1
Sets TLS minimum version to TLS 1.2 and maximum to TLS 1.3
Preserves the correct TLS extension set

This produces a JA3 fingerprint that matches a real Chrome browser.
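
The first three settings can be reproduced with the standard ssl module, as sketched below. This is a minimal illustration under assumptions: `build_context` is a hypothetical helper, and the cipher list is an illustrative subset of common ECDHE suites, not a verified Chrome 120 capture. Note also that the ssl module offers no control over ClientHello extension order and that `set_ciphers` does not affect TLS 1.3 suites, so this sketch covers only the cipher, curve, and version settings.

```python
import ssl

# Illustrative cipher subset in OpenSSL naming; NOT a verified Chrome 120 list.
CIPHERS = [
    "ECDHE-ECDSA-AES128-GCM-SHA256",
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-ECDSA-AES256-GCM-SHA384",
    "ECDHE-RSA-AES256-GCM-SHA384",
    "ECDHE-ECDSA-CHACHA20-POLY1305",
    "ECDHE-RSA-CHACHA20-POLY1305",
]


def build_context():
    """Build a client SSL context with a fixed cipher order, curve, and version range."""
    ctx = ssl.create_default_context()       # keeps certificate verification enabled
    ctx.set_ciphers(":".join(CIPHERS))       # TLS 1.2 cipher order
    ctx.set_ecdh_curve("prime256v1")         # ECDH curve
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = build_context()
print([c["name"] for c in ctx.get_ciphers()])
```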

Advanced Usage

All Options Together

from nebulascrape import Client
from nebulascrape.plugins.rate_limit_handler import RateLimitPlugin
from nebulascrape.plugins.proxy_manager import ProxyManagerPlugin
from nebulascrape.plugins.header_optimizer import HeaderOptimizerPlugin

client = Client(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=7,
    mode="auto",
    interpreter="native",
    debug=False,
)

client.register_plugin(RateLimitPlugin(min_delay=0.5, max_delay=3.0))
client.register_plugin(HeaderOptimizerPlugin(profile_name="chrome_windows"))
client.register_plugin(ProxyManagerPlugin(proxies=["http://proxy1:8080"]))

response = client.get("https://target.com", timeout=30)

print("Status:    ", response.status_code)
print("Challenge: ", response.meta["challenge_type"])
print("Risk:      ", response.meta["risk_score"])
print("WAF:       ", response.meta["waf_vendor"])
print("Latency:   ", response.meta["metrics"]["latency_ms"], "ms")
print("Retries:   ", response.meta["metrics"]["retry_count"])

Using the Low-Level NebulaScraper

from nebulascrape import NebulaScraper

scraper = NebulaScraper(
    browser={"browser": "chrome", "platform": "windows", "desktop": True},
    auto_retry=True,
    max_retries=5,
    mode="auto",
    captcha={"provider": "2captcha", "api_key": "YOUR_KEY"},
    solveDepth=3,
    doubleDown=True,
    delay=None,
)

response = scraper.get("https://target.com")
cookies = response.cookies                          # cookies set by this response
cf_clearance = scraper.cookies.get("cf_clearance")  # clearance cookie kept on the session

Passing Cookies or Proxies

from nebulascrape import Client

client = Client(profile="chrome_windows")

# Proxies
response = client.get("https://target.com", proxies={
    "http": "http://proxy:8080",
    "https": "http://proxy:8080",
})

# Custom cookies
response = client.get("https://target.com", cookies={
    "session_id": "abc123",
})

# Custom headers (merged with profile headers)
response = client.get("https://target.com", headers={
    "Referer": "https://google.com",
    "X-Custom-Header": "value",
})

Integrating with Existing Sessions

import requests
from nebulascrape import NebulaScraper

existing_session = requests.Session()
existing_session.headers.update({"Authorization": "Bearer token123"})

scraper = NebulaScraper.create_scraper(
    sess=existing_session,
    profile="chrome_linux",
    auto_retry=True,
)

response = scraper.get("https://api.target.com/data")

Captcha Integration

from nebulascrape import Client

client = Client(
    profile="chrome_windows",
    captcha={
        "provider": "2captcha",
        "api_key": "YOUR_2CAPTCHA_KEY",
    }
)

response = client.get("https://cloudflare-captcha-site.com")

Supported captcha providers: 2captcha, anticaptcha, capmonster, capsolver, 9kw, deathbycaptcha.

Getting Cloudflare Tokens

from nebulascrape import get_tokens, get_cookie_string

tokens, user_agent = get_tokens("https://cloudflare-protected-site.com")
print("cf_clearance:", tokens["cf_clearance"])
print("User-Agent:  ", user_agent)

cookie_string, user_agent = get_cookie_string("https://cloudflare-protected-site.com")
print("Cookie:", cookie_string)

API Reference

`Client`

Client(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=5,
    mode="auto",
    captcha={},
    interpreter="native",
    debug=False,
    **kwargs
)

Parameter	Type	Default	Description
`profile`	str	`chrome_windows`	Fingerprint profile to use
`auto_retry`	bool	`True`	Enable smart retry engine
`max_retries`	int	`5`	Maximum retry attempts
`mode`	str	`auto`	Transport mode (`auto`, `http1`, `http2`, `headless`)
`captcha`	dict	`{}`	Captcha provider configuration
`interpreter`	str	`native`	JS interpreter for challenge solving
`debug`	bool	`False`	Enable request/response debugging output

Methods: get(url, **kwargs), post(url, **kwargs), put(url, **kwargs), delete(url, **kwargs), request(method, url, **kwargs), register_plugin(plugin), session (property), metrics (property)

`AsyncClient` / `AsyncNebulaScraper`

AsyncClient(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=5,
    debug=False,
    **kwargs
)

Methods: await get(url, **kwargs), await post(url, **kwargs), await put(url, **kwargs), await delete(url, **kwargs), await request(method, url, **kwargs), register_plugin(plugin), await close(), supports async with.

`NebulaScraper`

Extends requests.Session. All requests.Session methods are available.

Additional parameters on top of Client:

Parameter	Type	Default	Description
`browser`	dict or None	None	Browser dict with keys `browser`, `platform`, `desktop`, `mobile`
`solveDepth`	int	`3`	Maximum Cloudflare challenge solve loops
`doubleDown`	bool	`True`	On captcha detection, send a second request to check whether the cfuid cookie alone is enough
`delay`	float or None	None	Manual Cloudflare challenge delay in seconds
`disableCloudflareV1`	bool	`False`	Disable built-in Cloudflare v1 bypass
`requestPreHook`	callable	None	Function called before each request
`requestPostHook`	callable	None	Function called after each response
`source_address`	str or tuple	None	Bind to a specific local IP
`ssl_context`	ssl.SSLContext	None	Custom SSL context

`response.meta` Fields

Field	Type	Description
`challenge_type`	str	Detected challenge type (see challenge type table)
`waf_vendor`	str	Detected WAF vendor
`risk_score`	int	Risk score 0-100
`retry_recommended`	bool	Whether retry is suggested
`rotate_session`	bool	Whether session rotation is suggested
`switch_transport`	bool	Whether transport escalation is suggested
`details`	dict	Raw details: retry_after, cf_ray, status_code
`metrics`	dict	latency_ms, tls_handshake_ms, retry_count, redirect_depth, transport_used

Configuration Reference

Fingerprint Profiles

Profile	User-Agent snippet	Platform
`chrome_windows`	Chrome/120.0.0.0 ... Windows NT 10.0	Windows
`chrome_linux`	Chrome/120.0.0.0 ... X11; Linux x86_64	Linux
`firefox`	Firefox/121.0 ... Windows NT 10.0	Windows
`mobile`	Chrome/120.0.6099.144 Mobile ... Android 13	Android

Transport Modes

Mode	Backend	HTTP Version	TLS Spoof	Impersonation Level
`http1`	requests	HTTP/1.1	JA3 cipher suite	High
`http2`	httpx	HTTP/2	JA3 + H2 SETTINGS	Very High
`headless`	curl_cffi	HTTP/2	Full BoringSSL	Maximum
`auto`	escalating	depends	depends	Adaptive

JavaScript Interpreters

Interpreter	Requirement	Description
`native`	None (built-in)	Pure Python JS evaluation for simple challenges
`js2py`	`pip install js2py`	Full JavaScript runtime
`nodejs`	Node.js installed	Executes via Node.js subprocess
`chakracore`	ChakraCore binary	Microsoft JS engine
`v8`	V8 binary	Google V8 JS engine

Author

Field	Value
Developer	MERO
Contact	TG@QP4M
GitHub	github.com/6x-u
License	MIT

Release history

This version

0.0.1

Feb 20, 2026

Download files

Source Distribution

nebulascrape-0.0.1.tar.gz (118.5 kB)

Uploaded Feb 20, 2026 Source

Built Distribution

nebulascrape-0.0.1-py2.py3-none-any.whl (118.0 kB)

Uploaded Feb 20, 2026 Python 2, Python 3

File details

Details for the file nebulascrape-0.0.1.tar.gz.

File metadata

Download URL: nebulascrape-0.0.1.tar.gz
Upload date: Feb 20, 2026
Size: 118.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for nebulascrape-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`36e9526f21d3d09bdfc47006911452de637cdd0d689fe69efe8febf57b9892b0`
MD5	`cf5f654cb7eb077a4ec067b05ea8abdd`
BLAKE2b-256	`ac603b377da005a3e051a78b61bc3e2ceff28a6bc956106f5b8101b079067709`

File details

Details for the file nebulascrape-0.0.1-py2.py3-none-any.whl.

File metadata

Download URL: nebulascrape-0.0.1-py2.py3-none-any.whl
Upload date: Feb 20, 2026
Size: 118.0 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for nebulascrape-0.0.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`160e3b894bf6bdf3e7a354c3713e93fbdd2713fd2649736abc83892175216694`
MD5	`11d9d474dbb5efc394f1bf10d19c6735`
BLAKE2b-256	`49d36711770cdcd794a42cfd8f5142f30d3570346ad440db7ff46332039bf1c9`
