
libcurl ffi bindings for Python, with impersonation support.

Project description

curl_cffi


Documentation

Python binding for curl-impersonate via cffi.

Unlike pure-Python HTTP clients such as httpx or requests, curl_cffi can impersonate browsers' TLS/JA3 and HTTP/2 fingerprints. If a website blocks you for no obvious reason, give curl_cffi a try.

The fingerprints in 0.6 on Windows are all wrong. If you are on Windows, update to 0.7. Sorry for the inconvenience.
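To check whether an existing installation is affected by the Windows issue above, something like the following sketch may help. The helper names `parse_major_minor` and `needs_windows_upgrade` are made up for this example and are not part of curl_cffi; the installed version is read with the standard library's `importlib.metadata`.

```python
import re
import sys
from importlib import metadata


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '0.7.3' or '0.7.0b4'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_windows_upgrade(version: str, platform: str) -> bool:
    """True when running on Windows with a curl_cffi release older than 0.7."""
    return platform.startswith("win") and parse_major_minor(version) < (0, 7)


if __name__ == "__main__":
    try:
        installed = metadata.version("curl_cffi")
    except metadata.PackageNotFoundError:
        print("curl_cffi is not installed")
    else:
        if needs_windows_upgrade(installed, sys.platform):
            print(f"curl_cffi {installed} has wrong fingerprints on Windows, please upgrade")
```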

Only Python 3.8 and above are supported. Python 3.7 has reached its end of life.


Scrapfly.io

Scrapfly is an enterprise-grade solution providing Web Scraping API that aims to simplify the scraping process by managing everything: real browser rendering, rotating proxies, and fingerprints (TLS, HTTP, browser) to bypass all major anti-bots. Scrapfly also unlocks the observability by providing an analytical dashboard and measuring the success rate/block rate in detail.

Scrapfly is a good choice if you are looking for a cloud-managed alternative to running curl_cffi yourself. If you manage TLS/HTTP fingerprints yourself with curl_cffi, they also maintain a curl-to-Python converter.


Features

  • Supports JA3/TLS and HTTP/2 fingerprint impersonation, including recent browsers and custom fingerprints.
  • Much faster than requests/httpx, on par with aiohttp/pycurl, see benchmarks.
  • Mimics requests API, no need to learn another one.
  • Pre-compiled, so you don't have to compile on your machine.
  • Supports asyncio with proxy rotation on each request.
  • Supports HTTP/2, which requests does not.
  • Supports WebSocket.

               requests  aiohttp  httpx  pycurl  curl_cffi
http2
sync
async
websocket
fingerprints
speed          🐇        🐇🐇       🐇     🐇🐇      🐇🐇

Install

pip install curl_cffi --upgrade

This should work on Linux, macOS and Windows out of the box. If it does not work on your platform, you may need to compile and install curl-impersonate first and set some environment variables, such as LD_LIBRARY_PATH.

To install beta releases:

pip install curl_cffi --upgrade --pre

To install the unstable version from GitHub:

git clone https://github.com/lexiforest/curl_cffi/
cd curl_cffi
make preprocess
pip install .

Usage

curl_cffi comes with a low-level curl API and a high-level requests-like API.

requests-like

from curl_cffi import requests

# Notice the impersonate parameter
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")

print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's

# To keep using the latest browser version as `curl_cffi` updates,
# simply set impersonate="chrome" without specifying a version.
# Other similar values are: "safari" and "safari_ios"
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")

# To pin a specific version, append the version number to the browser name.
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome124")

# To impersonate targets other than browsers, bring your own ja3/akamai strings
# See examples directory for details.
r = requests.get("https://tls.browserleaks.com/json", ja3=..., akamai=...)

# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome", proxies=proxies)

proxies = {"https": "socks://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome", proxies=proxies)
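When the proxy is optional, for example read from configuration, it can be convenient to build the `proxies` mapping in one place. The sketch below is only an illustration: `build_proxies` and `fetch_ja3` are hypothetical helper names, and it assumes the same proxy endpoint should be used for both `http` and `https` targets.

```python
from typing import Dict, Optional


def build_proxies(proxy_url: Optional[str]) -> Optional[Dict[str, str]]:
    """Return a requests-style proxies mapping, or None when no proxy is set."""
    if not proxy_url:
        return None
    # Assumption: one proxy endpoint serves both plain and TLS traffic.
    return {"http": proxy_url, "https": proxy_url}


def fetch_ja3(proxy_url: Optional[str] = None) -> dict:
    """Fetch the fingerprint report, optionally through a proxy."""
    # Imported lazily so build_proxies can be used on its own.
    from curl_cffi import requests

    r = requests.get(
        "https://tools.scrapfly.io/api/fp/ja3",
        impersonate="chrome",
        proxies=build_proxies(proxy_url),
    )
    return r.json()
```

Calling `fetch_ja3("http://localhost:3128")` would route the request through the local proxy, while `fetch_ja3()` connects directly.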

Sessions

s = requests.Session()

# httpbin is an HTTP test website; this endpoint makes the server set cookies
s.get("https://httpbin.org/cookies/set/foo/bar")
print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>

# retrieve cookies again to verify
r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}

curl_cffi supports the same browser versions as supported by my fork of curl-impersonate:

However, only Chrome-like browsers are supported. Firefox support is tracked in #59.

Browser versions will be added only when their fingerprints change. If you see that a version, e.g. chrome122, was skipped, you can simply impersonate it with your own headers and the previous version.

If you are trying to impersonate a target other than a browser, use ja3=... and akamai=... to specify your own customized fingerprints. See the docs on impersonation for details.
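Before passing a custom fingerprint, it may be worth sanity-checking the string. The sketch below assumes that `ja3=` takes the conventional JA3 text form of five comma-separated fields (TLS version, ciphers, extensions, curves, point formats), with the values inside each field joined by dashes. `parse_ja3` is a hypothetical helper, and the sample string is purely illustrative, not a real browser fingerprint.

```python
from typing import List, NamedTuple


class Ja3Fields(NamedTuple):
    tls_version: int
    ciphers: List[int]
    extensions: List[int]
    curves: List[int]
    point_formats: List[int]


def _ints(field: str) -> List[int]:
    """Split a dash-separated field into integers; an empty field means no values."""
    return [int(item) for item in field.split("-")] if field else []


def parse_ja3(ja3: str) -> Ja3Fields:
    """Parse a JA3 text string, raising ValueError if it is malformed."""
    parts = ja3.split(",")
    if len(parts) != 5:
        raise ValueError(f"expected 5 comma-separated fields, got {len(parts)}")
    return Ja3Fields(
        tls_version=int(parts[0]),
        ciphers=_ints(parts[1]),
        extensions=_ints(parts[2]),
        curves=_ints(parts[3]),
        point_formats=_ints(parts[4]),
    )


# Illustrative string only; 771 is the decimal form of 0x0303 (TLS 1.2).
sample = "771,4865-4866-4867,0-23-65281,29-23-24,0"
fields = parse_ja3(sample)
print(fields.tls_version, len(fields.ciphers), len(fields.extensions))
```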

  • chrome99
  • chrome100
  • chrome101
  • chrome104
  • chrome107
  • chrome110
  • chrome116 [1]
  • chrome119 [1]
  • chrome120 [1]
  • chrome123 [3]
  • chrome124 [3]
  • chrome99_android
  • edge99
  • edge101
  • safari15_3 [2]
  • safari15_5 [2]
  • safari17_0 [1]
  • safari17_2_ios [1]

Notes:

  1. Added in version 0.6.0.
  2. Fixed in version 0.6.0; the previous HTTP/2 fingerprints were incorrect.
  3. Added in version 0.7.0.
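Since skipped versions can be impersonated with the previous version plus your own headers, a small lookup can choose the target automatically. This is a minimal sketch: `nearest_chrome_target` is a hypothetical helper, and `CHROME_TARGETS` is copied by hand from the desktop Chrome entries in the list above, so it needs updating as new targets are added.

```python
from bisect import bisect_right

# Desktop Chrome targets from the list above (assumed complete for 0.7).
CHROME_TARGETS = [99, 100, 101, 104, 107, 110, 116, 119, 120, 123, 124]


def nearest_chrome_target(major: int) -> str:
    """Return the newest listed target that is not newer than `major`."""
    index = bisect_right(CHROME_TARGETS, major)
    if index == 0:
        raise ValueError(f"no listed target at or below chrome{major}")
    # Versions newer than the whole list also map to the last entry,
    # which may not match if the fingerprint has changed since then.
    return f"chrome{CHROME_TARGETS[index - 1]}"


print(nearest_chrome_target(122))  # chrome120
```

The result can then be passed as `impersonate=nearest_chrome_target(122)`, together with a `headers=` mapping carrying the headers of the version you actually want to present.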

asyncio

import asyncio
from curl_cffi.requests import AsyncSession

async def main():
    async with AsyncSession() as s:
        r = await s.get("https://example.com")

asyncio.run(main())

More concurrency:

import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async def main():
    async with AsyncSession() as s:
        # Schedule all requests, then wait for them concurrently
        tasks = [s.get(url) for url in urls]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())
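The feature list mentions proxy rotation on each request. One way this might look is sketched below, under the assumption that a per-request `proxies=` mapping is accepted by `AsyncSession.get` in the same way as in the synchronous examples. `pair_with_proxies` and `fetch_all` are hypothetical helper names, and the round-robin policy is just one possible choice.

```python
import asyncio
from itertools import cycle
from typing import Dict, Iterable, List, Tuple


def pair_with_proxies(
    urls: Iterable[str], proxy_pool: List[str]
) -> List[Tuple[str, Dict[str, str]]]:
    """Assign proxies to URLs in round-robin order."""
    if not proxy_pool:
        raise ValueError("proxy_pool must not be empty")
    rotation = cycle(proxy_pool)
    return [(url, {"http": p, "https": p}) for url, p in zip(urls, rotation)]


async def fetch_all(urls: Iterable[str], proxy_pool: List[str]) -> list:
    """Fetch all URLs concurrently, each through the next proxy in the pool."""
    # Imported lazily so pair_with_proxies can be used on its own.
    from curl_cffi.requests import AsyncSession

    async with AsyncSession() as s:
        tasks = [
            s.get(url, proxies=proxies)
            for url, proxies in pair_with_proxies(urls, proxy_pool)
        ]
        # return_exceptions=True keeps one failing proxy from cancelling the rest.
        return await asyncio.gather(*tasks, return_exceptions=True)
```

With this sketch, `asyncio.run(fetch_all(urls, ["http://localhost:3128"]))` returns a list in which each item is either a response or the exception raised for that URL.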

WebSockets

from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
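Market data feeds like the one above usually send JSON, so the callback often needs to decode it. The following sketch builds an `on_message` handler that collects parsed messages. `make_json_collector` is a hypothetical helper, and since the exact message type (bytes or str) is an assumption here, the handler accepts both.

```python
import json
from typing import Any, Callable, List, Union


def make_json_collector(sink: List[Any]) -> Callable[[Any, Union[bytes, str]], None]:
    """Build an on_message callback that appends decoded messages to `sink`."""

    def on_message(ws: Any, message: Union[bytes, str]) -> None:
        # Normalise to text first, whichever type the library delivers.
        text = message.decode("utf-8") if isinstance(message, bytes) else message
        try:
            sink.append(json.loads(text))
        except json.JSONDecodeError:
            # Keep non-JSON frames as raw text rather than dropping them.
            sink.append(text)

    return on_message
```

It would be passed as `on_message=make_json_collector(messages)` to `ws_connect`, with `messages` being a plain list that fills up while `run_forever()` is running.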

For low-level APIs, Scrapy integration and other advanced topics, see the docs for more details.

Acknowledgement

  • Originally forked from multippt/python_curl_cffi, which is under the MIT license.
  • Headers/Cookies files are copied from httpx, which is under the BSD license.
  • Asyncio support is inspired by Tornado's curl http client.
  • The WebSocket API is inspired by websocket_client.

[Sponsor] Bypass Cloudflare with API

Yes Captcha!

Yescaptcha is a proxy service that bypasses Cloudflare and uses the API interface to obtain verified cookies (e.g. cf_clearance). Click here to register: https://yescaptcha.com/i/stfnIO

[Sponsor] ScrapeNinja

Scrape Ninja

ScrapeNinja is a web scraping API with two engines: fast, with high performance and TLS fingerprint; and slower with a real browser under the hood.

ScrapeNinja handles headless browsers, proxies, timeouts, retries, and helps with data extraction, so you can just get the data in JSON. Rotating proxies are available out of the box on all subscription plans.

Sponsor

Buy Me A Coffee

