
libcurl ffi bindings for Python, with impersonation support.

Project description

curl_cffi


Documentation | 中文 README

Python binding for curl-impersonate via cffi.

Unlike pure-Python HTTP clients such as httpx or requests, curl_cffi can impersonate browsers' TLS/JA3 and HTTP/2 fingerprints. If a website blocks you for no obvious reason, give curl_cffi a try.


Scrapfly.io

Scrapfly is an enterprise-grade solution providing Web Scraping API that aims to simplify the scraping process by managing everything: real browser rendering, rotating proxies, and fingerprints (TLS, HTTP, browser) to bypass all major anti-bots. Scrapfly also unlocks the observability by providing an analytical dashboard and measuring the success rate/block rate in detail.

Scrapfly is a good solution if you are looking for a cloud-managed solution for curl_cffi. If you are managing TLS/HTTP fingerprint by yourself with curl_cffi, they also maintain a curl to python converter.


Features

  • Supports JA3/TLS and HTTP/2 fingerprint impersonation.
  • Much faster than requests/httpx, on par with aiohttp/pycurl, see benchmarks.
  • Mimics the requests API, so there is no need to learn another one.
  • Pre-compiled, so you don't have to compile on your machine.
  • Supports asyncio with proxy rotation on each request.
  • Supports HTTP/2, which requests does not.
  • Supports WebSocket.

              requests  aiohttp  httpx  pycurl  curl_cffi
http2         no        no       yes    yes     yes
sync          yes       no       yes    yes     yes
async         no        yes      yes    no      yes
websocket     no        yes      no     no      yes
fingerprints  no        no       no     no      yes
speed         🐇        🐇🐇     🐇     🐇🐇    🐇🐇

Install

pip install curl_cffi --upgrade

This should work on Linux, macOS and Windows out of the box. If it does not work on your platform, you may need to compile and install curl-impersonate first and set some environment variables such as LD_LIBRARY_PATH.

To install beta releases:

pip install curl_cffi --upgrade --pre

To install the unstable version from GitHub:

git clone https://github.com/yifeikong/curl_cffi/
cd curl_cffi
make preprocess
pip install .
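
After any of the install methods above, a quick check that the package is visible to your interpreter can save some debugging. The sketch below uses only the standard library (importlib.metadata, Python 3.8+); the helper name installed_version is made up for this example and is not part of curl_cffi.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("curl_cffi")
    if version is None:
        print("curl_cffi is not installed in this environment")
    else:
        print(f"curl_cffi {version} is installed")
```

If the package is reported as missing right after a successful pip install, pip and python are probably pointing at different environments.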

Usage

curl_cffi comes with a low-level curl API and a high-level requests-like API.

Use the latest impersonate version; do NOT copy chrome110 from the examples here without changing it.

requests-like

from curl_cffi import requests

# Notice the impersonate parameter
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110")

print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's

# To keep using the latest browser version as `curl_cffi` updates,
# simply set impersonate="chrome" without specifying a version.
# Other similar values are: "safari" and "safari_ios"
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome")

# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110", proxies=proxies)

proxies = {"https": "socks://localhost:3128"}
r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110", proxies=proxies)
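
To check programmatically that impersonation took effect, you can compare the ja3n_hash field of the response with the value you expect for the target browser. This is a minimal sketch: fingerprint_matches and check_chrome110 are hypothetical helper names, and the expected hash is simply the one from the sample output above, which may not hold for other targets or future versions.

```python
from typing import Any, Mapping


def fingerprint_matches(payload: Mapping[str, Any], expected_ja3n: str) -> bool:
    """Return True if the reported ja3n_hash equals the expected value."""
    return payload.get("ja3n_hash") == expected_ja3n


def check_chrome110() -> bool:
    # Imported lazily so the pure helper above works without curl_cffi installed.
    from curl_cffi import requests

    r = requests.get("https://tools.scrapfly.io/api/fp/ja3", impersonate="chrome110")
    # Hash copied from the sample output above (an assumption for other setups).
    return fingerprint_matches(r.json(), "aa56c057ad164ec4fdcb7a5a283be9fc")
```

Calling check_chrome110() performs a real network request, so it is best kept out of unit tests.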

Sessions

from curl_cffi import requests

s = requests.Session()

# httpbin is an HTTP test site; this endpoint makes the server set cookies
s.get("https://httpbin.org/cookies/set/foo/bar")
print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>

# retrieve cookies again to verify
r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}

The impersonate versions listed below are the ones supported by my fork of curl-impersonate.

Only Chrome-like browsers are supported. Firefox support is tracked in #59.

Browser versions are added only when their fingerprints change. If a version, e.g. chrome122, was skipped, you can impersonate it by using the previous version together with your own headers.

  • chrome99
  • chrome100
  • chrome101
  • chrome104
  • chrome107
  • chrome110
  • chrome116 [1]
  • chrome119 [1]
  • chrome120 [1]
  • chrome123 [3]
  • chrome124 [3]
  • chrome99_android
  • edge99
  • edge101
  • safari15_3 [2]
  • safari15_5 [2]
  • safari17_0 [1]
  • safari17_2_ios [1]

Notes:

  1. Added in version 0.6.0.
  2. Fixed in version 0.6.0, previous http2 fingerprints were not correct.
  3. Added in version 0.7.0.
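
As an illustration of impersonating a skipped version, the sketch below targets Chrome 122 by combining the previous listed version (chrome120) with a custom User-Agent. The helper names and the User-Agent template are assumptions made for this example; other version-specific headers (for instance sec-ch-ua) may also need adjusting, and how custom headers are merged with the impersonated defaults should be confirmed in the docs.

```python
from typing import Any, Dict

# Assumed User-Agent template for desktop Chrome on Windows.
UA_TEMPLATE = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/{major}.0.0.0 Safari/537.36"
)


def skipped_version_kwargs(major: int, fallback_target: str) -> Dict[str, Any]:
    """Build request kwargs: an older fingerprint plus a newer User-Agent."""
    return {
        "impersonate": fallback_target,
        "headers": {"User-Agent": UA_TEMPLATE.format(major=major)},
    }


def fetch_as_chrome122(url: str):
    # Imported lazily so the helper above works without curl_cffi installed.
    from curl_cffi import requests

    # chrome120 is the closest earlier version in the list above.
    return requests.get(url, **skipped_version_kwargs(122, "chrome120"))
```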

asyncio

import asyncio
from curl_cffi.requests import AsyncSession

async def main():
    async with AsyncSession() as s:
        r = await s.get("https://example.com")

asyncio.run(main())

More concurrency:

import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async def main():
    async with AsyncSession() as s:
        # Create one coroutine per URL, then await them all concurrently
        tasks = [s.get(url) for url in urls]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())
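
The feature list mentions proxy rotation on each request. One way to sketch it, assuming the async get accepts the same proxies argument as the synchronous examples above, is to cycle through a list of proxies and pass a different one to each request. The helper names here are hypothetical, and the proxy URLs you pass in are your own.

```python
import asyncio
from itertools import cycle
from typing import Dict, List, Sequence, Tuple


def assign_proxies(
    urls: Sequence[str], proxy_urls: Sequence[str]
) -> List[Tuple[str, Dict[str, str]]]:
    """Pair each URL with the next proxy in round-robin order."""
    if not proxy_urls:
        raise ValueError("at least one proxy URL is required")
    rotation = cycle(proxy_urls)
    return [(url, {"https": next(rotation)}) for url in urls]


async def fetch_all(urls: Sequence[str], proxy_urls: Sequence[str]):
    # Imported lazily so assign_proxies works without curl_cffi installed.
    from curl_cffi.requests import AsyncSession

    async with AsyncSession() as s:
        tasks = [
            s.get(url, proxies=proxies)
            for url, proxies in assign_proxies(urls, proxy_urls)
        ]
        return await asyncio.gather(*tasks)
```

Run it with asyncio.run(fetch_all(urls, ["http://localhost:3128", "http://localhost:3129"])), replacing the placeholder proxies with real ones.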

WebSockets

from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
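
Market data feeds like the one above typically send JSON text frames, so a slightly more useful on_message callback decodes each frame instead of printing it raw. This is a sketch that assumes the callback receives the message as str or bytes; the received list is just an illustrative place to collect results.

```python
import json
from typing import Any, List

# Decoded messages are collected here for later processing.
received: List[Any] = []


def on_message(ws, message) -> None:
    """Decode a frame as JSON and store it; keep non-JSON frames as text."""
    if isinstance(message, bytes):
        message = message.decode("utf-8", errors="replace")
    try:
        received.append(json.loads(message))
    except json.JSONDecodeError:
        received.append(message)
```

Pass it as on_message=on_message to s.ws_connect(...) exactly as in the example above.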

For low-level APIs, Scrapy integration and other advanced topics, see the docs for more details.

Acknowledgement

  • Originally forked from multippt/python_curl_cffi, which is under the MIT license.
  • Headers/Cookies files are copied from httpx, which is under the BSD license.
  • Asyncio support is inspired by Tornado's curl http client.
  • The WebSocket API is inspired by websocket_client.

[Sponsor] Bypass Cloudflare with API

Yes Captcha!

Yescaptcha is a proxy service that bypasses Cloudflare and uses the API interface to obtain verified cookies (e.g. cf_clearance). Click here to register: https://yescaptcha.com/i/stfnIO

[Sponsor] ScrapeNinja

Scrape Ninja

ScrapeNinja is a web scraping API with two engines: fast, with high performance and TLS fingerprint; and slower with a real browser under the hood.

ScrapeNinja handles headless browsers, proxies, timeouts, retries, and helps with data extraction, so you can just get the data in JSON. Rotating proxies are available out of the box on all subscription plans.

Sponsor

Buy Me A Coffee

Citation

If you find this project useful, please cite it as below:

@software{Kong2023,
  author = {Yifei Kong},
  title = {curl_cffi - A Python HTTP client for impersonating browser TLS and HTTP/2 fingerprints},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/yifeikong/curl_cffi},
}
