Cloudflare scraper and crawler written in async Python

Project description

cfcrawler

Cloudflare scraper and crawler written in async Python, designed as a drop-in replacement for the HTTPX client. Crawl websites that have Cloudflare enabled, easier than ever!

Getting started

To use the library, simply replace your HTTPX client with ours!

from cfcrawler import AsyncClient

async def get(url):
    client = AsyncClient()
    # Return the response so the caller can actually use it
    return await client.get(url)
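
When crawling more than one page, the single-request example above can be extended to fetch several URLs concurrently. The following is only a minimal sketch: the helper name `crawl` and its `limit` parameter are hypothetical and not part of cfcrawler. It assumes nothing about the client beyond an awaitable `get(url)` method, as shown above.

```python
import asyncio


async def crawl(client, urls, limit=5):
    """Fetch every URL with at most `limit` requests in flight at once."""
    # `crawl` and `limit` are illustrative names, not part of cfcrawler.
    semaphore = asyncio.Semaphore(limit)

    async def fetch(url):
        # The semaphore caps how many client.get() calls run concurrently.
        async with semaphore:
            return await client.get(url)

    # gather() returns the results in the same order as `urls`.
    return await asyncio.gather(*(fetch(url) for url in urls))


# Usage (assuming cfcrawler is installed):
#   from cfcrawler import AsyncClient
#   responses = asyncio.run(crawl(AsyncClient(), ["https://example.com"]))
```

Capping concurrency is a design choice to avoid sending a burst of requests to one site; adjust `limit` to whatever the target tolerates.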

You can also rotate user agents

from cfcrawler import AsyncClient

client = AsyncClient()
client.rotate_useragent()
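
One possible way to use rotation during a crawl is to switch user agents every few requests. This is a sketch under stated assumptions: `fetch_rotating` and `every` are hypothetical names, and the only client calls used are `rotate_useragent()` and `get(url)` from the examples above.

```python
import asyncio


async def fetch_rotating(client, urls, every=3):
    """Fetch URLs one by one, rotating the user agent every `every` requests."""
    # `fetch_rotating` and `every` are illustrative names, not part of cfcrawler.
    responses = []
    for index, url in enumerate(urls):
        # Rotate before request 0, `every`, 2 * `every`, and so on.
        if index % every == 0:
            client.rotate_useragent()
        responses.append(await client.get(url))
    return responses


# Usage (assuming cfcrawler is installed):
#   from cfcrawler import AsyncClient
#   asyncio.run(fetch_rotating(AsyncClient(), ["https://example.com"]))
```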

You can also specify which browser you want to use

from cfcrawler.types import Browser
from cfcrawler import AsyncClient

AsyncClient(browser=Browser.CHROME)

You can also use asyncer to syncify the implementation

from cfcrawler import AsyncClient
from asyncer import syncify

def get(url):
    client = AsyncClient()
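    # raise_sync_error=False lets syncify run from plain synchronous code
    # (outside a worker thread started by asyncify); return the response
    return syncify(client.get, raise_sync_error=False)(url)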
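
If you would rather not add the asyncer dependency, a similar blocking wrapper can be sketched with the standard library alone. The names `get_sync` and `make_client` are hypothetical; the sketch assumes only that the client exposes an awaitable `get(url)`, and it creates the client inside the event loop that uses it.

```python
import asyncio


def get_sync(make_client, url):
    """Create a client and run one GET request inside a fresh event loop."""
    # `get_sync` and `make_client` are illustrative names, not part of cfcrawler.

    async def run():
        client = make_client()
        return await client.get(url)

    # asyncio.run() starts a new event loop, so this helper must not be
    # called from code that is already running inside an event loop.
    return asyncio.run(run())


# Usage (assuming cfcrawler is installed):
#   from cfcrawler import AsyncClient
#   response = get_sync(AsyncClient, "https://example.com")
```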

Coming Next

  1. CF JS Challenge solver
  2. Captcha solver integration (2Captcha, etc.)

Contribution

I plan to pick this library up again in a few months. I don't have free time right now, but feel free to contribute. I'll check and test the PRs myself!

Download files

Source Distribution

cfcrawler-0.0.2.tar.gz (5.2 kB)

Uploaded Source

Built Distribution

cfcrawler-0.0.2-py3-none-any.whl (6.2 kB)

Uploaded Python 3
