
libcurl ffi bindings for Python, with impersonation support.


curl_cffi


Documentation

Python bindings for the curl-impersonate fork, via cffi.

Unlike other pure-Python HTTP clients such as httpx or requests, curl_cffi can impersonate browsers' TLS/JA3 and HTTP/2 fingerprints. If you are being blocked by a website for no obvious reason, give curl_cffi a try.

Only Python 3.8 and above are supported. Python 3.7 has reached its end of life.


ProxyCurl

Scrape public LinkedIn profile data at scale with Proxycurl APIs. Built for developers, by developers.

  • GDPR, CCPA, SOC2 compliant
  • High rate limit (300 requests/min), Fast (APIs respond in ~2s), High accuracy
  • Fresh data - 88% of data is scraped in real time, the other 12% is less than 29 days old
  • Tons of data points returned per profile

Yes Captcha!

Yescaptcha is a proxy service that bypasses Cloudflare via an API interface to obtain verified cookies (e.g. cf_clearance). Click here to register: https://yescaptcha.com/i/stfnIO


Scrape Ninja

ScrapeNinja is a web scraping API with two engines: a fast one with high performance and TLS fingerprinting, and a slower one that runs a real browser under the hood.

ScrapeNinja handles headless browsers, proxies, timeouts, retries, and helps with data extraction, so you can just get the data in JSON. Rotating proxies are available out of the box on all subscription plans.


Features

  • Supports JA3/TLS and HTTP/2 fingerprint impersonation, including recent browsers and custom fingerprints.
  • Much faster than requests/httpx, on par with aiohttp/pycurl, see benchmarks.
  • Mimics the requests API, so there is no need to learn another one.
  • Pre-compiled, so you don't have to compile on your machine.
  • Supports asyncio with proxy rotation on each request.
  • Supports HTTP/2, which requests does not.
  • Supports WebSockets.
              requests  aiohttp  httpx  pycurl  curl_cffi
http2         ❌        ❌       ✅     ✅      ✅
sync          ✅        ❌       ✅     ✅      ✅
async         ❌        ✅       ✅     ❌      ✅
websocket     ❌        ✅       ❌     ❌      ✅
fingerprints  ❌        ❌       ❌     ❌      ✅
speed         🐇        🐇🐇     🐇     🐇🐇    🐇🐇

Install

pip install curl_cffi --upgrade

This should work on Linux, macOS and Windows out of the box. If it does not work on your platform, you may need to compile and install curl-impersonate first and set some environment variables like LD_LIBRARY_PATH.

To install beta releases:

pip install curl_cffi --upgrade --pre

To install unstable version from GitHub:

git clone https://github.com/lexiforest/curl_cffi/
cd curl_cffi
make preprocess
pip install .

Usage

curl_cffi comes with a low-level curl API and a high-level requests-like API.

requests-like

from curl_cffi import requests

# Notice the impersonate parameter
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")

print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's

# To keep using the latest browser version as `curl_cffi` updates,
# simply set impersonate="chrome" without specifying a version.
# Other similar values are: "safari" and "safari_ios"
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")

# To pin a specific version, include the version number in the target.
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome124")

# To impersonate other than browsers, bring your own ja3/akamai strings
# See examples directory for details.
r = requests.get("https://tls.browserleaks.com/json", ja3=..., akamai=...)

# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome", proxies=proxies)

proxies = {"https": "socks://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome", proxies=proxies)
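The rest of the requests API is mirrored as well. A minimal sketch of a POST with a JSON body (httpbin.org is used here only as a test endpoint, not part of the original examples):

# POST a JSON body; the json= keyword works like in requests
r = requests.post("https://httpbin.org/post", json={"foo": "bar"}, impersonate="chrome")
print(r.status_code)
print(r.json()["json"])
# {'foo': 'bar'}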

Sessions

s = requests.Session()

# httpbin is an HTTP test website; this endpoint makes the server set cookies
s.get("https://httpbin.org/cookies/set/foo/bar")
print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>

# retrieve cookies again to verify
r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}
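
Per-request options such as impersonate and proxies can be combined with a session. A minimal sketch, using the session as a context manager so it is closed cleanly (the proxy URL is a placeholder):

with requests.Session() as s:
    # the same keyword arguments as requests.get() are accepted per request
    r = s.get(
        "https://tools.scrapfly.io/api/fp/ja3",
        impersonate="chrome",
        proxies={"https": "http://localhost:3128"},  # placeholder proxy
    )
    print(r.json())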

curl_cffi supports the same browser versions as my fork of curl-impersonate, listed below.

However, only Chromium- and WebKit-based browsers are supported. Firefox support is tracked in #59.

Browser versions are added only when their fingerprints change. If you see that a version, e.g. chrome122, was skipped, you can simply impersonate it with the previous version and your own headers (see the sketch after the notes below).

If you are trying to impersonate a target other than a browser, use ja3=... and akamai=... to specify your own customized fingerprints. See the docs on impersonation for details.

  • chrome99
  • chrome100
  • chrome101
  • chrome104
  • chrome107
  • chrome110
  • chrome116 [1]
  • chrome119 [1]
  • chrome120 [1]
  • chrome123 [3]
  • chrome124 [3]
  • chrome131 [4]
  • chrome99_android
  • chrome131_android [4]
  • edge99
  • edge101
  • safari15_3 [2]
  • safari15_5 [2]
  • safari17_0 [1]
  • safari17_2_ios [1]
  • safari18_0 [4]
  • safari18_0_ios [4]

Notes:

  1. Added in version 0.6.0.
  2. Fixed in version 0.6.0; the previous HTTP/2 fingerprints were incorrect.
  3. Added in version 0.7.0.
  4. Added in version 0.8.0.
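
As noted above, a skipped version can be approximated by combining the closest earlier fingerprint with your own headers. A minimal sketch for a hypothetical chrome122 target (the User-Agent string is a placeholder; use the real one for your target version):

# reuse the closest earlier fingerprint (chrome120) and override the UA yourself
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",  # placeholder UA
}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome120", headers=headers)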

asyncio

from curl_cffi.requests import AsyncSession

async with AsyncSession() as s:
    r = await s.get("https://example.com")

More concurrency:

import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async with AsyncSession() as s:
    tasks = []
    for url in urls:
        task = s.get(url)
        tasks.append(task)
    results = await asyncio.gather(*tasks)
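
Per-request proxies also work inside an AsyncSession, which is how the proxy rotation mentioned in the feature list is typically done. A minimal sketch, assuming a hypothetical pool of proxy URLs:

import asyncio
from itertools import cycle
from curl_cffi.requests import AsyncSession

# hypothetical proxy pool; replace with your own proxy URLs
proxy_pool = cycle(["http://localhost:3128", "http://localhost:3129"])

async def fetch_all(urls):
    async with AsyncSession() as s:
        # pick the next proxy for each request
        tasks = [s.get(url, proxies={"https": next(proxy_pool)}) for url in urls]
        return await asyncio.gather(*tasks)

results = asyncio.run(fetch_all(["https://example.com/"] * 4))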

WebSockets

from curl_cffi.requests import WebSocket

def on_message(ws: WebSocket, message: str | bytes):
    print(message)

ws = WebSocket(on_message=on_message)
ws.run_forever("wss://api.gemini.com/v1/marketdata/BTCUSD")

For low-level APIs, Scrapy integration and other advanced topics, see the docs for more details.
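
For a quick taste, here is a minimal sketch of the low-level API, which mirrors libcurl's setopt/perform flow (a concrete version string such as chrome110 is used here):

from io import BytesIO
from curl_cffi import Curl, CurlOpt

buffer = BytesIO()
c = Curl()
c.setopt(CurlOpt.URL, b"https://tools.scrapfly.io/api/fp/ja3")
c.setopt(CurlOpt.WRITEDATA, buffer)  # write the response body into the buffer
c.impersonate("chrome110")
c.perform()
c.close()
print(buffer.getvalue().decode())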

asyncio WebSockets

import asyncio
from curl_cffi.requests import AsyncSession

async with AsyncSession() as s:
    ws = await s.ws_connect("wss://echo.websocket.org")
    await asyncio.gather(*[ws.send_str("Hello, World!") for _ in range(10)])
    async for message in ws:
        print(message)

Acknowledgement

  • Originally forked from multippt/python_curl_cffi, which is under the MIT license.
  • Headers/Cookies files are copied from httpx, which is under the BSD license.
  • Asyncio support is inspired by Tornado's curl http client.
  • The synchronous WebSocket API is inspired by websocket_client.
  • The asynchronous WebSocket API is inspired by aiohttp.

Sponsor

Buy Me A Coffee
