NebulaScrape — Ultra-powerful HTTP scraping library with smart bypass, async support, and modular transport.
NebulaScrape
NebulaScrape is a production-grade Python HTTP scraping library built for the modern web. It combines a modular transport system, intelligent session analysis, browser-realistic fingerprinting, async support, and a powerful WAF bypass engine into a single clean API.
Table of Contents
- Overview
- Installation
- Quick Start
- Fingerprint Profiles
- Smart Retry Engine
- Session Intelligence Layer
- Modular Transport System
- Async Support
- Built-in Metrics
- Plugin System
- WAF Bypass Engine
- Advanced Usage
- API Reference
- Configuration Reference
Overview
NebulaScrape was designed to solve the hardest problem in modern web scraping: getting a real HTTP response from a server that actively tries to block automated clients.
Most scraping libraries send requests that are trivially identifiable as bots. They have wrong TLS fingerprints, wrong header order, no sec-ch-ua fields, no browser timing patterns, and no ability to recover intelligently from blocks. NebulaScrape was built from the ground up to solve all of these problems at once.
What Makes NebulaScrape Different
TLS fingerprint spoofing. Every modern WAF inspects the TLS ClientHello. NebulaScrape sends the exact cipher suite list, ECDH curve, and TLS extension order that real Chrome 120 sends. A plain requests session sends a fingerprint that gets flagged immediately.
Ordered, realistic HTTP headers. Browsers send headers in a specific order that WAFs check. NebulaScrape uses OrderedDict-based profiles that match real browser traffic captures, including the correct sec-ch-ua, sec-ch-ua-mobile, sec-ch-ua-platform, Sec-Fetch-* fields.
Smart retry decisions. When a request fails with 403, 429, or 503, NebulaScrape does not blindly retry. It analyzes the response, reads the Retry-After header, calculates exponential backoff with jitter, decides whether to rotate the session fingerprint or rebuild the connection, and escalates to a more capable transport if needed.
Session intelligence. Every response is analyzed for WAF vendor signatures. The library tells you whether you hit a Cloudflare IUAM page, a DataDome challenge, a PerimeterX block, rate limiting, or a redirect loop, and attaches a risk score from 0 to 100 to each response.
Transport escalation. Under high-protection targets, NebulaScrape automatically escalates from standard HTTP/1.1 with TLS spoofing, to HTTP/2 via httpx, to full browser impersonation via curl_cffi. No code change required.
Async-first. NebulaScrape ships with a native asyncio client that supports the same features as the synchronous client, including retry logic, intelligence analysis, metrics, and plugins.
Installation
Minimum requirements: Python 3.8+
Install the core library:
pip install nebulascrape
Install with headless browser impersonation support (required for Cloudflare Turnstile, Managed Challenge, and the most aggressive WAF protections):
pip install "nebulascrape[headless]"
Install from source:
git clone https://github.com/6x-u/nebulascrape.git
cd nebulascrape
pip install -e .
Dependencies
| Package | Purpose | Required |
|---|---|---|
| requests >= 2.9.2 | Base HTTP transport | Yes |
| requests_toolbelt >= 0.9.1 | Request debugging | Yes |
| pyparsing >= 2.4.7 | JS challenge parsing | Yes |
| httpx[http2] >= 0.24.0 | HTTP/2 transport | Yes |
| h2 >= 4.0.0 | HTTP/2 protocol | Yes |
| aiohttp >= 3.8.0 | Async fallback transport | Yes |
| curl_cffi >= 0.5.0 | Browser impersonation | Optional (headless) |
| brotli >= 1.0.9 | Brotli decompression | Optional |
Quick Start
The simplest way to use NebulaScrape is through the Client class. It handles everything internally.
from nebulascrape import Client
client = Client(profile="chrome_windows", auto_retry=True)
response = client.get("https://target.com")
print(response.status_code)
print(response.meta["challenge_type"])
print(response.meta["risk_score"])
print(response.meta["metrics"]["latency_ms"])
For full session control, use NebulaScraper directly:
from nebulascrape import NebulaScraper
scraper = NebulaScraper(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=5,
    mode="auto",
    interpreter="native",
    debug=False,
)
response = scraper.get("https://target.com")
print(response.meta)
For token extraction:
from nebulascrape import get_tokens
tokens, user_agent = get_tokens("https://target.com")
print(tokens)
print(user_agent)
Fingerprint Profiles
NebulaScrape ships with four pre-built browser fingerprint profiles. Each profile contains a real User-Agent string, browser-realistic headers in the correct order, a matching TLS cipher suite list, and the correct ECDH curve.
| Profile Name | Browser | Platform | sec-ch-ua-mobile |
|---|---|---|---|
| chrome_windows | Chrome 120 | Windows 10 x64 | false |
| chrome_linux | Chrome 120 | Linux x86_64 | false |
| firefox | Firefox 121 | Windows 10 | N/A |
| mobile | Chrome 120 | Android 13 | true |
Using a Profile
from nebulascrape import Client
# Use any built-in profile
client = Client(profile="chrome_linux")
client = Client(profile="firefox")
client = Client(profile="mobile")
Inspecting a Profile
from nebulascrape.fingerprints import get_profile, available_profiles
print(available_profiles())
# ['chrome_windows', 'chrome_linux', 'firefox', 'mobile']
profile = get_profile("chrome_windows")
print(profile["user_agent"])
print(profile["headers"])
print(profile["cipher_suite"])
Why Header Order Matters
A standard requests session sends a small default header set in an order that matches no real browser. Real browsers always send headers in a fixed, browser-specific order. WAFs such as DataDome and Kasada inspect header order as a primary bot signal. NebulaScrape uses OrderedDict to enforce the correct order for every profile:
User-Agent
Accept
Accept-Language
Accept-Encoding
sec-ch-ua
sec-ch-ua-mobile
sec-ch-ua-platform
Upgrade-Insecure-Requests
Sec-Fetch-Dest
Sec-Fetch-Mode
Sec-Fetch-Site
Sec-Fetch-User
This matches the exact order captured from a real Chrome 120 browser session.
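The ordering rule above can be illustrated with a minimal sketch. It is not the library's implementation: `PROFILE_ORDER` and `merge_headers` are hypothetical names, and the sketch assumes that user-supplied headers override profile values in place while any extra headers are appended after the profile block.

```python
from collections import OrderedDict

# Header order listed above for the Chrome profiles.
PROFILE_ORDER = [
    "User-Agent", "Accept", "Accept-Language", "Accept-Encoding",
    "sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform",
    "Upgrade-Insecure-Requests", "Sec-Fetch-Dest", "Sec-Fetch-Mode",
    "Sec-Fetch-Site", "Sec-Fetch-User",
]


def merge_headers(profile_headers, user_headers):
    """Merge user headers into profile headers, keeping the profile order.

    Hypothetical helper: profile_headers is assumed to use the same key
    casing as PROFILE_ORDER; user header names match case-insensitively.
    """
    pending = {name.lower(): value for name, value in user_headers.items()}
    merged = OrderedDict()
    for name in PROFILE_ORDER:
        key = name.lower()
        if key in pending:
            # A user value replaces the profile value but keeps its slot.
            merged[name] = pending.pop(key)
        elif name in profile_headers:
            merged[name] = profile_headers[name]
    # Headers unknown to the profile go last, in the order the user gave them.
    for name, value in user_headers.items():
        if name.lower() in pending:
            merged[name] = value
    return merged


profile = {"User-Agent": "UA", "Accept": "*/*", "Accept-Language": "en-US"}
merged = merge_headers(profile, {"Referer": "https://example.com", "accept": "text/html"})
print(list(merged.items()))
```

Keeping overrides in their original slot means a custom Accept value does not disturb the positions of the headers around it.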
Smart Retry Engine
The SmartRetryEngine replaces naive retry loops with a response-aware retry decision system. It analyzes each failed response and decides the appropriate action based on the error type, attempt count, and session intelligence.
How It Works
Every response is passed through analyze_response(), which returns a RetryDecision containing:
- action: what to do next (pass, wait and retry, rotate session, rebuild connection, switch transport, or abort)
- backoff_seconds: how long to wait before retrying
- rotate_session: whether to change the User-Agent and fingerprint
- rebuild_connection: whether to tear down and rebuild the connection pool
- switch_transport: whether to escalate to a higher-tier transport
Per-Status Logic
HTTP 403 Forbidden
Indicates fingerprint detection or IP block. The engine rotates the browser fingerprint and session identity, waits a random jitter interval between 2 and 8 seconds to simulate human behavior, and rebuilds the connection on the third attempt to clear any connection-level state the server may be tracking.
HTTP 429 Too Many Requests
Indicates rate limiting. The engine first reads the Retry-After response header and uses that value if present, adding a small random jitter. If no header is present, it calculates exponential backoff: 1.5 * 2^attempt seconds, capped at 120 seconds. Session rotation activates from the second attempt onward.
HTTP 503 Service Unavailable
Indicates the connection itself may be flagged. The engine rebuilds the connection pool immediately and escalates to a higher transport tier from the second attempt onward.
HTTP 407, 408, 502, 504, 52x
Treated as transient infrastructure errors. Exponential backoff applies, capped at 90 seconds.
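The per-status rules above can be sketched in a few lines. This is an illustration only, not the library's code. `Decision`, `exp_backoff`, and `decide` are hypothetical names, and the sketch assumes 1-based attempt numbers, reads "on the third attempt" as "from the third attempt onward", assumes up to one second of jitter on Retry-After, and applies the default backoff to 503 (the text does not specify a wait for that case).

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    """Hypothetical stand-in for the fields of RetryDecision."""
    action: str
    backoff_seconds: float = 0.0
    rotate_session: bool = False
    rebuild_connection: bool = False
    switch_transport: bool = False


def exp_backoff(attempt: int, base: float = 1.5, cap: float = 120.0) -> float:
    """Exponential backoff: base * 2**attempt, capped at `cap` seconds."""
    return min(base * (2 ** attempt), cap)


def decide(status: int, attempt: int, retry_after: Optional[float] = None,
           max_retries: int = 5) -> Decision:
    """Map a status code and a 1-based attempt number to a retry decision."""
    if status < 400:
        return Decision(action="pass")
    if attempt > max_retries:
        return Decision(action="abort")
    if status == 403:
        # Rotate identity, wait 2-8 s, rebuild from the third attempt.
        return Decision("rotate_session", random.uniform(2.0, 8.0),
                        rotate_session=True, rebuild_connection=attempt >= 3)
    if status == 429:
        if retry_after is not None:
            wait = retry_after + random.uniform(0.0, 1.0)  # small jitter (assumed range)
        else:
            wait = exp_backoff(attempt)
        return Decision("wait_and_retry", wait, rotate_session=attempt >= 2)
    if status == 503:
        # Backoff value for 503 is an assumption of this sketch.
        return Decision("rebuild_connection", exp_backoff(attempt),
                        rebuild_connection=True, switch_transport=attempt >= 2)
    if status in (407, 408, 502, 504) or 520 <= status <= 529:
        return Decision("wait_and_retry", exp_backoff(attempt, cap=90.0))
    return Decision(action="abort")


print(decide(429, attempt=2))
print(decide(503, attempt=1))
```

With the default base of 1.5, the uncapped waits for attempts 1 through 4 are 3, 6, 12, and 24 seconds.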
Intelligence-driven retry
If SessionIntelligence detects a challenge or high-risk response (even on a 200), the retry engine uses the intel result to decide whether to rotate, switch transport, or escalate.
Configuration
from nebulascrape import Client
client = Client(
    auto_retry=True,
    max_retries=7,  # default is 5
)
from nebulascrape.retry_engine import SmartRetryEngine
engine = SmartRetryEngine(max_retries=10, base_backoff=2.0)
Session Intelligence Layer
The SessionIntelligence class analyzes every response and classifies the WAF vendor and challenge type. This information is attached to response.meta on every request.
Challenge Types
| Challenge Type | Description |
|---|---|
| none | Clean response, no challenge detected |
| cf_iuam | Cloudflare I'm Under Attack Mode (v1 JS challenge) |
| cf_captcha | Cloudflare hCaptcha / reCaptcha challenge |
| cf_turnstile | Cloudflare Turnstile (v3 challenge) |
| cf_managed | Cloudflare Managed Challenge |
| cf_block_1020 | Cloudflare firewall rule block (error 1020) |
| datadome | DataDome bot detection challenge |
| perimeterx | PerimeterX / HUMAN Security block |
| kasada | Kasada protection challenge |
| akamai | Akamai Bot Manager challenge |
| imperva | Imperva / Incapsula protection |
| shape | F5 Shape Security protection |
| rate_limited | Generic rate limiting (429 or Retry-After header) |
| js_required | Page requires JavaScript execution |
| redirect_loop | Detected circular redirect chain |
Risk Score
The risk score is an integer from 0 to 100 representing how likely the response represents a blocking or detection event:
| Score Range | Interpretation |
|---|---|
| 0 | Clean response |
| 1-30 | WAF present but not triggered |
| 31-60 | Rate limiting or soft block |
| 61-80 | Active JS or captcha challenge |
| 81-100 | Hard block, firewall, or advanced WAF challenge |
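When acting on the score in application code, it can help to map it back to the bands in the table. The helper below is a hypothetical convenience function, not part of the library, and the band labels are names chosen for this sketch.

```python
def interpret_risk(score: int) -> str:
    """Map a 0-100 risk score to the band from the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"risk score out of range: {score}")
    if score == 0:
        return "clean"
    if score <= 30:
        return "waf_present"
    if score <= 60:
        return "soft_block"
    if score <= 80:
        return "active_challenge"
    return "hard_block"


for sample in (0, 15, 45, 70, 95):
    print(sample, interpret_risk(sample))
```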
Reading the Meta
from nebulascrape import Client
client = Client(profile="chrome_windows", auto_retry=True)
response = client.get("https://target.com")
print(response.meta["challenge_type"]) # "cf_iuam" / "datadome" / "none" / ...
print(response.meta["risk_score"]) # 0 - 100
print(response.meta["waf_vendor"]) # "cloudflare" / "akamai" / "none" / ...
print(response.meta["retry_recommended"])
print(response.meta["rotate_session"])
print(response.meta["details"]) # {"retry_after": None, "cf_ray": "...", "status_code": 200}
Direct Usage
from nebulascrape.session_intel import SessionIntelligence
import requests
resp = requests.get("https://some-protected-site.com")
intel = SessionIntelligence()
result = intel.analyze(resp)
print(result.challenge_type)
print(result.risk_score)
print(result.waf_vendor)
Modular Transport System
NebulaScrape uses a three-tier transport system. Each tier provides a higher level of browser mimicry. The TransportManager can automatically escalate through tiers when lower tiers accumulate failures.
Transport Tiers
Tier 1 — TransportHTTP
Standard HTTPS over HTTP/1.1 with TLS fingerprint spoofing. Uses a custom HTTPAdapter that builds an SSL context with the exact cipher suite list, ECDH curve, and TLS version range from the selected fingerprint profile. This matches the JA3/JA4 fingerprint of real Chrome or Firefox and passes most WAF TLS fingerprint checks.
Tier 2 — TransportHTTP2
HTTP/2 transport backed by httpx. Sends the correct SETTINGS frame, WINDOW_UPDATE values, and pseudo-header order (:method :authority :scheme :path) that match real Chrome HTTP/2 fingerprints. Many sites block HTTP/1.1 clients that cannot negotiate HTTP/2.
Tier 3 — TransportHeadless
Full browser impersonation using curl_cffi. This sends traffic that is byte-for-byte indistinguishable from the target browser at the TLS and HTTP/2 layers using libcurl compiled with BoringSSL. Used as a last resort for Cloudflare Turnstile, Managed Challenge, Kasada, and similar advanced protections.
Modes
from nebulascrape import Client
# Auto: starts at HTTP1, escalates to HTTP2, then Headless on repeated failures
client = Client(mode="auto")
# Force a specific transport
client = Client(mode="http1")
client = Client(mode="http2")
client = Client(mode="headless")
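The difference between auto mode and a forced mode can be pictured as a small state machine. The sketch below is illustrative only: `Escalator` is a hypothetical class, and the threshold of two consecutive failures per tier is an assumption, since the text does not state how many failures trigger escalation.

```python
TIERS = ["http1", "http2", "headless"]


class Escalator:
    """Tracks the active transport tier and escalates on repeated failures."""

    def __init__(self, mode: str = "auto", failure_threshold: int = 2):
        if mode != "auto" and mode not in TIERS:
            raise ValueError(f"unknown mode: {mode}")
        self.auto = mode == "auto"
        self.index = 0 if self.auto else TIERS.index(mode)
        self.failures = 0
        self.failure_threshold = failure_threshold  # assumed value

    @property
    def current(self) -> str:
        return TIERS[self.index]

    def record_success(self) -> None:
        self.failures = 0

    def record_failure(self) -> bool:
        """Count a failure; return True if the tier was escalated."""
        self.failures += 1
        can_escalate = self.auto and self.index < len(TIERS) - 1
        if can_escalate and self.failures >= self.failure_threshold:
            self.index += 1
            self.failures = 0
            return True
        return False


esc = Escalator(mode="auto")
for _ in range(4):
    esc.record_failure()
print(esc.current)
```

A forced mode never changes tier in this sketch, which mirrors the "Force a specific transport" comment above.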
Manual Transport Control
from nebulascrape.transports import TransportManager, TransportHTTP, TransportHTTP2, TransportHeadless
# scraper_session is assumed to be a NebulaScraper session created earlier
manager = TransportManager(profile_name="chrome_windows", mode="auto")
manager.mount_on(scraper_session)
manager.escalate(scraper_session)  # manually escalate one tier
manager.rebuild(scraper_session)  # rebuild the current transport
Async Support
NebulaScrape provides a native asyncio client through AsyncNebulaScraper (also exported as AsyncClient). It supports the same profile system, retry engine, session intelligence, metrics, and plugin hooks as the synchronous client.
The async client uses httpx.AsyncClient with HTTP/2 enabled as its primary backend, and falls back to aiohttp if httpx is not available.
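A backend fallback of this kind is commonly implemented by probing for installed packages. The following is a minimal sketch of that pattern using only the standard library; `pick_async_backend` is a hypothetical helper and does not reflect how NebulaScrape performs the check internally.

```python
import importlib.util
from typing import Optional, Sequence


def pick_async_backend(candidates: Sequence[str] = ("httpx", "aiohttp")) -> Optional[str]:
    """Return the first importable package name from `candidates`, or None."""
    for name in candidates:
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


print(pick_async_backend())
```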
Basic Async Usage
import asyncio
from nebulascrape import AsyncClient
async def main():
    client = AsyncClient(profile="chrome_windows", auto_retry=True)
    response = await client.get("https://target.com")
    print(response.status_code)
    print(response.meta)
    await client.close()
asyncio.run(main())
Context Manager
import asyncio
from nebulascrape import AsyncClient
async def main():
    async with AsyncClient(profile="chrome_linux", auto_retry=True, max_retries=5) as client:
        r1 = await client.get("https://httpbin.org/get")
        r2 = await client.post("https://httpbin.org/post", json={"key": "value"})
        print(r1.status_code, r2.status_code)
asyncio.run(main())
Concurrent Requests
import asyncio
from nebulascrape import AsyncClient
async def fetch(client, url):
    r = await client.get(url)
    return r.status_code, r.meta["risk_score"]
async def main():
    async with AsyncClient(profile="chrome_windows") as client:
        urls = [
            "https://httpbin.org/get",
            "https://httpbin.org/headers",
            "https://httpbin.org/ip",
        ]
        results = await asyncio.gather(*[fetch(client, u) for u in urls])
    for status, risk in results:
        print(f"Status: {status} Risk: {risk}")
asyncio.run(main())
Built-in Metrics
Every response returned by NebulaScrape contains a meta["metrics"] dictionary with timing and retry information collected during the request lifecycle.
Per-Request Metrics
| Field | Type | Description |
|---|---|---|
| latency_ms | float | Total request duration in milliseconds |
| tls_handshake_ms | float | Approximate TLS handshake time in milliseconds |
| retry_count | int | Number of retries made for this request |
| redirect_depth | int | Number of redirects followed |
| transport_used | str | Which transport tier was active (http1, http2, async_http2) |
Reading Metrics
from nebulascrape import Client
client = Client(profile="chrome_windows", auto_retry=True)
response = client.get("https://httpbin.org/get")
m = response.meta["metrics"]
print(f"Latency: {m['latency_ms']} ms")
print(f"Handshake: {m['tls_handshake_ms']} ms")
print(f"Retries: {m['retry_count']}")
print(f"Redirects: {m['redirect_depth']}")
print(f"Transport: {m['transport_used']}")
Session-Level Aggregate Metrics
from nebulascrape import Client
client = Client(profile="chrome_windows")
for url in ["https://httpbin.org/get", "https://httpbin.org/headers"]:
    client.get(url)
stats = client.metrics
print(f"Total requests: {stats['total_requests']}")
print(f"Average latency: {stats['avg_latency_ms']} ms")
print(f"Max latency: {stats['max_latency_ms']} ms")
print(f"Total retries: {stats['total_retries']}")
print(f"Challenges solved: {stats['challenges_solved']}")
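To make the relationship between per-request and session-level metrics concrete, the sketch below derives the aggregate fields from a list of per-request metric dictionaries. `aggregate` is a hypothetical function, and rounding the average to two decimals is an assumption of this sketch. The challenges_solved counter is omitted because it cannot be derived from the per-request fields listed above.

```python
from typing import Dict, List


def aggregate(per_request: List[Dict[str, float]]) -> Dict[str, float]:
    """Fold per-request metrics into session-level totals."""
    latencies = [m["latency_ms"] for m in per_request]
    total = len(latencies)
    return {
        "total_requests": total,
        "avg_latency_ms": round(sum(latencies) / total, 2) if total else 0.0,
        "max_latency_ms": max(latencies, default=0.0),
        "total_retries": sum(m["retry_count"] for m in per_request),
    }


samples = [
    {"latency_ms": 100.0, "retry_count": 0},
    {"latency_ms": 250.0, "retry_count": 2},
]
print(aggregate(samples))
```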
Plugin System
NebulaScrape includes a plugin registry that allows you to attach custom behavior to the request lifecycle without modifying the core library. All plugins inherit from BasePlugin and can hook into pre-request, post-request, challenge detection, and retry events.
Built-in Plugins
RateLimitPlugin
Adds adaptive pre-request delays based on request rate. Detects burst patterns and automatically increases delays. Respects Retry-After headers on 429 responses.
from nebulascrape import Client
from nebulascrape.plugins.rate_limit_handler import RateLimitPlugin
client = Client(profile="chrome_windows")
client.register_plugin(RateLimitPlugin(
    min_delay=0.3,
    max_delay=2.0,
    burst_threshold=10,
))
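One way to picture how min_delay, max_delay, and burst_threshold could interact is a sliding-window counter. The sketch below is an assumption-heavy illustration, not the plugin's algorithm: `AdaptiveDelay`, the 10-second window, and the doubling rule are all hypothetical.

```python
from collections import deque


class AdaptiveDelay:
    """Returns a pre-request delay that grows when requests arrive in bursts."""

    def __init__(self, min_delay: float = 0.3, max_delay: float = 2.0,
                 burst_threshold: int = 10, window: float = 10.0):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.burst_threshold = burst_threshold
        self.window = window  # seconds of history to keep (assumed)
        self._stamps = deque()

    def next_delay(self, now: float) -> float:
        """Record a request at time `now` and return the delay to apply."""
        # Drop timestamps that have fallen out of the window.
        while self._stamps and now - self._stamps[0] > self.window:
            self._stamps.popleft()
        self._stamps.append(now)
        over = len(self._stamps) - self.burst_threshold
        if over <= 0:
            return self.min_delay
        # Double the delay for each request beyond the threshold, up to max_delay.
        return min(self.max_delay, self.min_delay * (2 ** over))


limiter = AdaptiveDelay(burst_threshold=3)
for t in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(t, limiter.next_delay(t))
```

Passing the clock value in explicitly keeps the sketch deterministic; real code would typically call time.monotonic().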
HeaderOptimizerPlugin
Ensures browser-realistic headers are applied to every request, merging them with any user-supplied headers while preserving the correct order. Adjusts Sec-Fetch headers automatically for POST requests.
from nebulascrape import Client
from nebulascrape.plugins.header_optimizer import HeaderOptimizerPlugin
client = Client(profile="chrome_windows")
client.register_plugin(HeaderOptimizerPlugin(profile_name="chrome_windows"))
ProxyManagerPlugin
Manages a pool of proxy servers with automatic rotation on failure. Tracks per-proxy failure counts and rotates after two consecutive failures on the same proxy.
from nebulascrape import Client
from nebulascrape.plugins.proxy_manager import ProxyManagerPlugin
proxies = [
    "http://user:pass@proxy1:8080",
    "http://user:pass@proxy2:8080",
    "http://user:pass@proxy3:8080",
]
client = Client(profile="chrome_windows")
client.register_plugin(ProxyManagerPlugin(
    proxies=proxies,
    rotate_on_fail=True,
    rotate_on_status=[403, 429, 503],
))
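The rotation rule described above (rotate after two consecutive failures on the same proxy) can be sketched as a small round-robin pool. `ProxyPool` and its methods are hypothetical names used for illustration; the plugin's internals may differ.

```python
from typing import Dict, List


class ProxyPool:
    """Round-robin proxy pool that rotates after consecutive failures."""

    def __init__(self, proxies: List[str], max_failures: int = 2):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self.proxies = list(proxies)
        self.max_failures = max_failures
        self.index = 0
        self.failures: Dict[str, int] = {p: 0 for p in self.proxies}

    @property
    def current(self) -> str:
        return self.proxies[self.index]

    def report(self, ok: bool) -> str:
        """Record the outcome for the current proxy; return the proxy to use next."""
        proxy = self.current
        if ok:
            self.failures[proxy] = 0  # a success resets the streak
            return proxy
        self.failures[proxy] += 1
        if self.failures[proxy] >= self.max_failures:
            self.failures[proxy] = 0
            self.index = (self.index + 1) % len(self.proxies)
        return self.current


pool = ProxyPool(["http://proxy1:8080", "http://proxy2:8080"])
pool.report(ok=False)
print(pool.report(ok=False))
```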
Writing a Custom Plugin
from nebulascrape.plugins import BasePlugin
from nebulascrape import Client
class LoggingPlugin(BasePlugin):
    name = "logging_plugin"
    priority = 5  # lower number = runs first
    def on_pre_request(self, scraper, method, url, kwargs):
        print(f"REQUEST {method} {url}")
        return kwargs
    def on_post_request(self, scraper, response, kwargs):
        print(f"RESPONSE {response.status_code} - risk={response.meta.get('risk_score', 'n/a')}")
        return response
    def on_retry(self, scraper, attempt, decision):
        print(f"RETRY {attempt} - reason: {decision.reason} - waiting {decision.backoff_seconds:.1f}s")
client = Client(profile="chrome_windows", auto_retry=True)
client.register_plugin(LoggingPlugin())
response = client.get("https://httpbin.org/get")
Plugin Hook Reference
| Hook | When it runs | Return value |
|---|---|---|
| on_pre_request(scraper, method, url, kwargs) | Before every request | Modified kwargs dict |
| on_post_request(scraper, response, kwargs) | After every response | response object |
| on_challenge_detected(scraper, response, intel_result) | When a challenge is found | bool |
| on_retry(scraper, attempt, decision) | Before each retry sleep | None |
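The priority rule ("lower number = runs first") and the chaining of kwargs through on_pre_request can be sketched with a tiny stand-alone registry. `MiniRegistry` and the two sample plugins are hypothetical; they mimic the hook signature from the table but do not use the library's BasePlugin.

```python
class MiniRegistry:
    """Keeps plugins sorted by priority and chains kwargs through each hook."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin) -> None:
        self._plugins.append(plugin)
        # list.sort is stable, so equal priorities keep registration order.
        self._plugins.sort(key=lambda p: p.priority)

    def run_pre_request(self, scraper, method, url, kwargs):
        for plugin in self._plugins:
            kwargs = plugin.on_pre_request(scraper, method, url, kwargs)
        return kwargs


class AddTimeout:
    priority = 10

    def on_pre_request(self, scraper, method, url, kwargs):
        kwargs.setdefault("timeout", 30)
        kwargs.setdefault("trace", []).append("timeout")
        return kwargs


class AddReferer:
    priority = 5

    def on_pre_request(self, scraper, method, url, kwargs):
        kwargs.setdefault("headers", {})["Referer"] = "https://example.com"
        kwargs.setdefault("trace", []).append("referer")
        return kwargs


registry = MiniRegistry()
registry.register(AddTimeout())
registry.register(AddReferer())
print(registry.run_pre_request(None, "GET", "https://httpbin.org/get", {}))
```

Because each hook receives the kwargs returned by the previous one, a plugin that forgets to return kwargs would break the chain in this sketch.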
WAF Bypass Engine
NebulaScrape's bypass capabilities are integrated across multiple layers of the library. There is no single "bypass" function. Instead, bypass is the result of the fingerprint, transport, intelligence, and retry systems working together.
Cloudflare
I'm Under Attack Mode (v1)
Detected by inspecting the response body for the characteristic jsch trace image and challenge form. The library extracts the challenge parameters, waits a browser-realistic delay (parsed from the page's own JavaScript, with jitter added), solves the JavaScript challenge using the native interpreter, submits the solution as a POST request, and follows the redirect to retrieve the real page. The cf_clearance cookie is then retained in the session for future requests.
Turnstile
Detected by looking for cf-turnstile or challenges.cloudflare.com/turnstile in the response. When this challenge is detected, the library raises TurnstileChallengeError and recommends using TransportHeadless with curl_cffi, which passes the Turnstile check at the TLS and HTTP/2 fingerprint layer without requiring a browser.
Managed Challenge and v2
Detected by inspecting the CDN CGI orchestration endpoint pattern. Escalation to the headless transport is recommended.
Cloudflare Firewall 1020
Detected and raised as CloudflareCode1020. This is an IP-level block that requires a proxy rotation.
Multi-WAF Detection
The SessionIntelligence layer detects the following vendors using header and body signature matching:
| WAF | Detection Method |
|---|---|
| Cloudflare | Server: cloudflare header + body patterns |
| DataDome | dd_sitekey, datadome.co cookie domains |
| PerimeterX | _pxdk cookie, PerimeterX body references |
| Kasada | kasada, kpsdk body references |
| Akamai | _abck, ak_bmsc cookies, sensor_data |
| Imperva | incap_ses_, visid_incap_ cookies |
| Shape Security | shape.io, x-shape- headers |
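The signature table above lends itself to a simple lookup. The sketch below classifies a response from its headers, cookie names, and body using only the markers listed in the table. `detect_vendor` is a hypothetical function, and the check order and the case-insensitive substring matching are assumptions of this sketch.

```python
from typing import Iterable, Mapping


def detect_vendor(headers: Mapping[str, str], cookie_names: Iterable[str], body: str) -> str:
    """Return a vendor label based on the header, cookie, and body markers above."""
    h = {k.lower(): v.lower() for k, v in headers.items()}
    cookies = [c.lower() for c in cookie_names]
    text = body.lower()

    def has_cookie(*prefixes: str) -> bool:
        return any(c.startswith(p) for c in cookies for p in prefixes)

    if "cloudflare" in h.get("server", ""):
        return "cloudflare"
    if "dd_sitekey" in text or "datadome.co" in text:
        return "datadome"
    if has_cookie("_pxdk") or "perimeterx" in text:
        return "perimeterx"
    if "kasada" in text or "kpsdk" in text:
        return "kasada"
    if has_cookie("_abck", "ak_bmsc") or "sensor_data" in text:
        return "akamai"
    if has_cookie("incap_ses_", "visid_incap_"):
        return "imperva"
    if "shape.io" in text or any(k.startswith("x-shape-") for k in h):
        return "shape"
    return "none"


print(detect_vendor({"Server": "cloudflare"}, [], ""))
print(detect_vendor({}, ["_abck"], ""))
```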
TLS Fingerprint Spoofing
Standard Python ssl sends a TLS fingerprint (JA3) that is trivially identifiable as a non-browser client. NebulaScrape replaces the default SSL context with one that:
- Sets the cipher suite list to match Chrome 120's exact order
- Sets the ECDH curve to prime256v1
- Sets TLS minimum version to TLS 1.2 and maximum to TLS 1.3
- Preserves the correct TLS extension set
This produces a JA3 fingerprint that matches a real Chrome browser.
Advanced Usage
All Options Together
from nebulascrape import Client
from nebulascrape.plugins.rate_limit_handler import RateLimitPlugin
from nebulascrape.plugins.proxy_manager import ProxyManagerPlugin
from nebulascrape.plugins.header_optimizer import HeaderOptimizerPlugin
client = Client(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=7,
    mode="auto",
    interpreter="native",
    debug=False,
)
client.register_plugin(RateLimitPlugin(min_delay=0.5, max_delay=3.0))
client.register_plugin(HeaderOptimizerPlugin(profile_name="chrome_windows"))
client.register_plugin(ProxyManagerPlugin(proxies=["http://proxy1:8080"]))
response = client.get("https://target.com", timeout=30)
print("Status: ", response.status_code)
print("Challenge: ", response.meta["challenge_type"])
print("Risk: ", response.meta["risk_score"])
print("WAF: ", response.meta["waf_vendor"])
print("Latency: ", response.meta["metrics"]["latency_ms"], "ms")
print("Retries: ", response.meta["metrics"]["retry_count"])
Using the Low-Level NebulaScraper
from nebulascrape import NebulaScraper
scraper = NebulaScraper(
    browser={"browser": "chrome", "platform": "windows", "desktop": True},
    auto_retry=True,
    max_retries=5,
    mode="auto",
    captcha={"provider": "2captcha", "api_key": "YOUR_KEY"},
    solveDepth=3,
    doubleDown=True,
    delay=None,
)
response = scraper.get("https://target.com")
cookies = response.cookies
tokens = scraper.cookies.get("cf_clearance")
Passing Cookies or Proxies
from nebulascrape import Client
client = Client(profile="chrome_windows")
# Proxies
response = client.get("https://target.com", proxies={
    "http": "http://proxy:8080",
    "https": "http://proxy:8080",
})
# Custom cookies
response = client.get("https://target.com", cookies={
    "session_id": "abc123",
})
# Custom headers (merged with profile headers)
response = client.get("https://target.com", headers={
    "Referer": "https://google.com",
    "X-Custom-Header": "value",
})
Integrating with Existing Sessions
import requests
from nebulascrape import NebulaScraper
existing_session = requests.Session()
existing_session.headers.update({"Authorization": "Bearer token123"})
scraper = NebulaScraper.create_scraper(
    sess=existing_session,
    profile="chrome_linux",
    auto_retry=True,
)
response = scraper.get("https://api.target.com/data")
Captcha Integration
from nebulascrape import Client
client = Client(
    profile="chrome_windows",
    captcha={
        "provider": "2captcha",
        "api_key": "YOUR_2CAPTCHA_KEY",
    },
)
response = client.get("https://cloudflare-captcha-site.com")
Supported captcha providers: 2captcha, anticaptcha, capmonster, capsolver, 9kw, deathbycaptcha.
Getting Cloudflare Tokens
from nebulascrape import get_tokens, get_cookie_string
tokens, user_agent = get_tokens("https://cloudflare-protected-site.com")
print("cf_clearance:", tokens["cf_clearance"])
print("User-Agent: ", user_agent)
cookie_string, user_agent = get_cookie_string("https://cloudflare-protected-site.com")
print("Cookie:", cookie_string)
API Reference
Client
Client(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=5,
    mode="auto",
    captcha={},
    interpreter="native",
    debug=False,
    **kwargs
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| profile | str | chrome_windows | Fingerprint profile to use |
| auto_retry | bool | True | Enable smart retry engine |
| max_retries | int | 5 | Maximum retry attempts |
| mode | str | auto | Transport mode (auto, http1, http2, headless) |
| captcha | dict | {} | Captcha provider configuration |
| interpreter | str | native | JS interpreter for challenge solving |
| debug | bool | False | Enable request/response debugging output |
Methods: get(url, **kwargs), post(url, **kwargs), put(url, **kwargs), delete(url, **kwargs), request(method, url, **kwargs), register_plugin(plugin), session (property), metrics (property)
AsyncClient / AsyncNebulaScraper
AsyncClient(
    profile="chrome_windows",
    auto_retry=True,
    max_retries=5,
    debug=False,
    **kwargs
)
Methods: await get(url, **kwargs), await post(url, **kwargs), await put(url, **kwargs), await delete(url, **kwargs), await request(method, url, **kwargs), register_plugin(plugin), await close(), supports async with.
NebulaScraper
Extends requests.Session. All requests.Session methods are available.
Additional parameters on top of Client:
| Parameter | Type | Default | Description |
|---|---|---|---|
| browser | dict or None | None | Browser dict with keys browser, platform, desktop, mobile |
| solveDepth | int | 3 | Maximum Cloudflare challenge solve loops |
| doubleDown | bool | True | Double request on captcha to check if cfuid is enough |
| delay | float or None | None | Manual Cloudflare challenge delay in seconds |
| disableCloudflareV1 | bool | False | Disable built-in Cloudflare v1 bypass |
| requestPreHook | callable | None | Function called before each request |
| requestPostHook | callable | None | Function called after each response |
| source_address | str or tuple | None | Bind to a specific local IP |
| ssl_context | ssl.SSLContext | None | Custom SSL context |
response.meta Fields
| Field | Type | Description |
|---|---|---|
| challenge_type | str | Detected challenge type (see challenge type table) |
| waf_vendor | str | Detected WAF vendor |
| risk_score | int | Risk score 0-100 |
| retry_recommended | bool | Whether retry is suggested |
| rotate_session | bool | Whether session rotation is suggested |
| switch_transport | bool | Whether transport escalation is suggested |
| details | dict | Raw details: retry_after, cf_ray, status_code |
| metrics | dict | latency_ms, tls_handshake_ms, retry_count, redirect_depth, transport_used |
Configuration Reference
Fingerprint Profiles
| Profile | User-Agent snippet | Platform |
|---|---|---|
| chrome_windows | Chrome/120.0.0.0 ... Windows NT 10.0 | Windows |
| chrome_linux | Chrome/120.0.0.0 ... X11; Linux x86_64 | Linux |
| firefox | Firefox/121.0 ... Windows NT 10.0 | Windows |
| mobile | Chrome/120.0.6099.144 Mobile ... Android 13 | Android |
Transport Modes
| Mode | Backend | HTTP Version | TLS Spoof | Impersonation Level |
|---|---|---|---|---|
| http1 | requests | HTTP/1.1 | JA3 cipher suite | High |
| http2 | httpx | HTTP/2 | JA3 + H2 SETTINGS | Very High |
| headless | curl_cffi | HTTP/2 | Full BoringSSL | Maximum |
| auto | escalating | depends | depends | Adaptive |
JavaScript Interpreters
| Interpreter | Requirement | Description |
|---|---|---|
| native | None (built-in) | Pure Python JS evaluation for simple challenges |
| js2py | pip install js2py | Full JavaScript runtime |
| nodejs | Node.js installed | Executes via Node.js subprocess |
| chakracore | ChakraCore binary | Microsoft JS engine |
| v8 | V8 binary | Google V8 JS engine |
Author
| Field | Value |
|---|---|
| Developer | MERO |
| Contact | TG@QP4M |
| GitHub | github.com/6x-u |
| License | MIT |