Crawlee for Python

A web scraping and browser automation library

Crawlee covers your crawling and scraping end-to-end and helps you build reliable scrapers. Fast.

Your crawlers will appear almost human-like and fly under the radar of modern bot protections even with the default configuration. Crawlee gives you the tools to crawl the web for links, scrape data, and store it to disk or cloud while staying configurable to suit your project's needs.

We also have a TypeScript implementation of Crawlee, which you can explore and use in your projects. For more information, visit the Crawlee repository on GitHub.

Installation

Crawlee is available as the crawlee PyPI package.

pip install crawlee

Additional, optional dependencies unlocking more features are shipped as package extras.

If you plan to use BeautifulSoupCrawler, install crawlee with the beautifulsoup extra:

pip install 'crawlee[beautifulsoup]'

If you plan to use PlaywrightCrawler, install crawlee with the playwright extra:

pip install 'crawlee[playwright]'

Then, install the Playwright dependencies:

playwright install

You can install multiple extras at once by using a comma as a separator:

pip install 'crawlee[beautifulsoup,playwright]'

Features

Unified interface for HTTP and headless browser crawling.
Persistent queue for URLs to crawl (breadth & depth-first).
Pluggable storage of both tabular data and files.
Automatic scaling with available system resources.
Integrated proxy rotation and session management.
Configurable request routing - directing URLs to appropriate handlers.
Robust error handling.
Automatic retries when getting blocked.
Written in Python with type hints, which means better DX and fewer bugs.

Introduction

Your crawlers will appear human-like and fly under the radar of modern bot protections even with the default configuration. Crawlee gives you the tools to crawl the web for links, scrape data and persistently store it in machine-readable formats, without having to worry about the technical details. And thanks to rich configuration options, you can tweak almost any aspect of Crawlee to suit your project's needs if the default settings don't cut it.

Crawlers

Crawlee offers a framework for parallel web crawling through a variety of crawler classes, each designed to meet different crawling needs.

HttpCrawler

HttpCrawler provides a framework for the parallel crawling of web pages using plain HTTP requests. The URLs to crawl are fed from a request provider. It enables the recursive crawling of websites. The parsing of obtained HTML is the user's responsibility.

Since HttpCrawler uses raw HTTP requests to download web pages, it is very fast and efficient on data bandwidth. However, if the target website requires JavaScript to display its content, you might need a browser-based crawler instead, e.g. PlaywrightCrawler, because it loads pages using a full-featured headless browser.

HttpCrawler downloads each URL using a plain HTTP request, obtains the response, and then invokes the user-provided request handler to extract page data.

The source URLs are represented by Request objects that are fed from RequestList or RequestQueue instances provided by the request provider option.

The crawler finishes when there are no more Request objects to crawl.

If you want to parse data using BeautifulSoup, see the BeautifulSoupCrawler section.

Example usage:

import asyncio

from crawlee.http_crawler import HttpCrawler, HttpCrawlingContext


async def main() -> None:
    # Create an HttpCrawler instance (start URLs are passed to run() below)
    crawler = HttpCrawler()

    # Define a handler for processing requests
    @crawler.router.default_handler
    async def request_handler(context: HttpCrawlingContext) -> None:
        # Crawler will provide a HttpCrawlingContext instance,
        # from which you can access the request and response data
        data = {
            'url': context.request.url,
            'status_code': context.http_response.status_code,
            'headers': dict(context.http_response.headers),
            'response': context.http_response.read().decode()[:1000],
        }
        # Extract the record and push it to the dataset
        await context.push_data(data)

    # Run the crawler
    await crawler.run(['https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())
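
Because HttpCrawler leaves HTML parsing to the request handler, the decoded response body from the example above can be handed to any parser. The following is a minimal sketch using only the standard library's html.parser module; TitleParser and extract_title are hypothetical helper names used for illustration and are not part of Crawlee.

```python
from html.parser import HTMLParser
from typing import Optional


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (illustrative helper)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest
        if tag == 'title' and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'title' and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self) -> Optional[str]:
        text = ''.join(self._parts).strip()
        return text or None


def extract_title(html: str) -> Optional[str]:
    """Return the page title, or None when the document has no <title>."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title
```

Inside the request handler, the body obtained with context.http_response.read().decode() could be passed to extract_title and the result stored under an extra key of the pushed record.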

For further explanation of storages (dataset, request queue) see the storages section.

BeautifulSoupCrawler

BeautifulSoupCrawler extends the HttpCrawler. It provides the same features and, on top of that, parses pages with the BeautifulSoup HTML parser.

As with HttpCrawler, BeautifulSoupCrawler uses raw HTTP requests to download web pages, so it is very fast and efficient on data bandwidth. However, if the target website requires JavaScript to display its content, you might need to use PlaywrightCrawler instead, because it loads pages using a full-featured headless browser (Chrome, Firefox or others).

BeautifulSoupCrawler downloads each URL using a plain HTTP request, parses the HTML content using BeautifulSoup and then invokes the user-provided request handler to extract page data using an interface to the parsed HTML DOM.

Example usage:

import asyncio

from crawlee.beautifulsoup_crawler import BeautifulSoupCrawler, BeautifulSoupCrawlingContext


async def main() -> None:
    # Create a BeautifulSoupCrawler instance (start URLs are passed to run() below)
    crawler = BeautifulSoupCrawler()

    # Define a handler for processing requests
    @crawler.router.default_handler
    async def request_handler(context: BeautifulSoupCrawlingContext) -> None:
        # Crawler will provide a BeautifulSoupCrawlingContext instance,
        # from which you can access the request and response data
        data = {
            'title': context.soup.title.text,
            'url': context.request.url,
        }
        # Extract the record and push it to the dataset
        await context.push_data(data)

    # Run the crawler
    await crawler.run(['https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())

BeautifulSoupCrawler also provides a helper for enqueuing links found on the website currently being crawled. See the following example with the updated request handler:

from crawlee.enqueue_strategy import EnqueueStrategy

# ...

    @crawler.router.default_handler
    async def request_handler(context: BeautifulSoupCrawlingContext) -> None:
        # Use enqueue links helper to enqueue all links from the page with the same domain
        await context.enqueue_links(strategy=EnqueueStrategy.SAME_DOMAIN)

        data = {
            'title': context.soup.title.text,
            'url': context.request.url,
        }

        await context.push_data(data)
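
To make the idea behind a same-site strategy concrete, here is a simplified, standalone sketch of the filtering step: resolve each link against the page URL and keep only those that stay on the same host. It is an illustration under stated assumptions, not Crawlee's implementation. The function name same_host_links is hypothetical, and the sketch compares hostnames exactly, whereas Crawlee's own strategy may treat subdomains differently.

```python
from urllib.parse import urljoin, urlparse


def same_host_links(page_url: str, hrefs: list[str]) -> list[str]:
    """Resolve hrefs against page_url and keep unique links on the same hostname."""
    page_host = urlparse(page_url).hostname
    kept = []
    for href in hrefs:
        # Turn relative links such as '/docs' into absolute URLs
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        # Skip non-web schemes (mailto:, javascript:, ...) and other hosts
        if parsed.scheme not in ('http', 'https'):
            continue
        if parsed.hostname != page_host:
            continue
        if absolute not in kept:
            kept.append(absolute)
    return kept
```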

PlaywrightCrawler

PlaywrightCrawler extends the BasicCrawler. It provides the same features and, on top of that, uses the Playwright browser automation tool.

This crawler provides a straightforward framework for parallel web page crawling using headless versions of the Chromium, Firefox, and WebKit browsers through Playwright. URLs to be crawled are supplied by a request provider, which can be either a RequestList containing a static list of URLs or a dynamic RequestQueue.

Using a headless browser to download web pages and extract data, PlaywrightCrawler is ideal for crawling websites that require JavaScript execution. For websites that do not require JavaScript, consider using the BeautifulSoupCrawler, which utilizes raw HTTP requests and will be much faster.

Example usage:

import asyncio

from crawlee.playwright_crawler import PlaywrightCrawler, PlaywrightCrawlingContext


async def main() -> None:
    # Create a crawler instance; optional arguments are shown commented out
    crawler = PlaywrightCrawler(
        # headless=False,
        # browser_type='firefox',
    )

    @crawler.router.default_handler
    async def request_handler(context: PlaywrightCrawlingContext) -> None:
        data = {
            'request_url': context.request.url,
            'page_url': context.page.url,
            'page_title': await context.page.title(),
            'page_content': (await context.page.content())[:10000],
        }
        await context.push_data(data)

    await crawler.run(['https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())

Example usage with custom browser pool:

import asyncio

from crawlee.browsers import BrowserPool, PlaywrightBrowserPlugin
from crawlee.playwright_crawler import PlaywrightCrawler, PlaywrightCrawlingContext


async def main() -> None:
    # Create a browser pool with a Playwright browser plugin
    browser_pool = BrowserPool(
        plugins=[
            PlaywrightBrowserPlugin(
                browser_type='firefox',
                browser_options={'headless': False},
                page_options={'viewport': {'width': 1920, 'height': 1080}},
            )
        ]
    )

    # Create a crawler instance and provide the custom browser pool
    crawler = PlaywrightCrawler(browser_pool=browser_pool)

    @crawler.router.default_handler
    async def request_handler(context: PlaywrightCrawlingContext) -> None:
        data = {
            'request_url': context.request.url,
            'page_url': context.page.url,
            'page_title': await context.page.title(),
            'page_content': (await context.page.content())[:10000],
        }
        await context.push_data(data)

    await crawler.run(['https://apify.com', 'https://crawlee.dev'])


if __name__ == '__main__':
    asyncio.run(main())

Storages

Crawlee provides several storage types, each suited to a specific task. The underlying data are handled by a storage client. Currently, only a memory storage client is implemented; with it, data are either kept in memory or persisted to disk.

By default, the data are stored in the directory specified by the CRAWLEE_STORAGE_DIR environment variable, which defaults to .storage/.

Dataset

A Dataset is a type of storage mainly suitable for storing tabular data.

Datasets are used to store structured data where each object stored has the same attributes, such as online store products or real estate offers. The dataset can be imagined as a table, where each object is a row and its attributes are columns. The dataset is an append-only storage - we can only add new records to it, but we cannot modify or remove existing records.

Each Crawlee project run is associated with a default dataset. Typically, it is used to store crawling results specific to the crawler run. Its usage is optional.

The data are persisted as follows:

{CRAWLEE_STORAGE_DIR}/datasets/{DATASET_ID}/{INDEX}.json

The following code demonstrates the basic operations of the dataset:

import asyncio

from crawlee.storages import Dataset


async def main() -> None:
    # Open a default dataset
    dataset = await Dataset.open()

    # Push a single record
    await dataset.push_data({'key1': 'value1'})

    # Get records from the dataset
    data = await dataset.get_data()
    print(f'Dataset data: {data.items}')  # Dataset data: [{'key1': 'value1'}]

    # Open a named dataset
    dataset_named = await Dataset.open(name='some-name')

    # Push multiple records
    await dataset_named.push_data([{'key2': 'value2'}, {'key3': 'value3'}])


if __name__ == '__main__':
    asyncio.run(main())
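
Since each record is persisted as its own JSON file under the layout shown earlier, results can also be inspected without Crawlee. The following is a minimal sketch using only the standard library. The helper name load_dataset_items is hypothetical, and the sketch assumes that the default dataset lives in a directory named default, that index file names sort in insertion order, and that any file whose name starts with a double underscore holds metadata rather than a record.

```python
import json
from pathlib import Path


def load_dataset_items(storage_dir: str, dataset_id: str = 'default') -> list[dict]:
    """Read all records persisted under {storage_dir}/datasets/{dataset_id}/."""
    dataset_dir = Path(storage_dir) / 'datasets' / dataset_id
    items = []
    # Assumption: file names sort in the order the records were pushed
    for path in sorted(dataset_dir.glob('*.json')):
        # Assumption: files prefixed with '__' are metadata, not records
        if path.name.startswith('__'):
            continue
        items.append(json.loads(path.read_text(encoding='utf-8')))
    return items
```

For example, load_dataset_items('.storage') would return the records of the default dataset as a list of dictionaries, provided the assumptions above hold for your Crawlee version.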

Key-value store

The KeyValueStore is used for saving and reading data records or files. Each data record is represented by a unique key and associated with a MIME content type. Key-value stores are ideal for saving screenshots of web pages and PDFs, or for persisting the state of crawlers.

Each Crawlee project run is associated with a default key-value store. By convention, the project input and output are stored in the default key-value store under the INPUT and OUTPUT keys respectively. Typically, both input and output are JSON files, although they could be any other format.

The data are persisted as follows:

{CRAWLEE_STORAGE_DIR}/key_value_stores/{STORE_ID}/{KEY}.{EXT}

The following code demonstrates the basic operations of key-value stores:

import asyncio

from crawlee.storages import KeyValueStore


async def main() -> None:
    kvs = await KeyValueStore.open()  # Open a default key-value store

    # Write the OUTPUT to the default key-value store
    await kvs.set_value('OUTPUT', {'my_result': 123})

    # Read the OUTPUT from the default key-value store
    value = await kvs.get_value('OUTPUT')
    print(f'Value of OUTPUT: {value}')  # Value of OUTPUT: {'my_result': 123}

    # Open a named key-value store
    kvs_named = await KeyValueStore.open(name='some-name')

    # Write a record to the named key-value store
    await kvs_named.set_value('some-key', {'foo': 'bar'})

    # Delete a record from the named key-value store
    await kvs_named.set_value('some-key', None)


if __name__ == '__main__':
    asyncio.run(main())

Request queue

The RequestQueue is a storage of URLs (requests) to crawl. The queue is used for the deep crawling of websites, where we start with several URLs and then recursively follow links to other pages. The data structure supports both breadth-first and depth-first crawling orders.

Each Crawlee project run is associated with a default request queue. Typically, it is used to store URLs to crawl in the specific crawler run. Its usage is optional.

The data are persisted as follows:

{CRAWLEE_STORAGE_DIR}/request_queues/{QUEUE_ID}/entries.json

The following code demonstrates the basic usage of the request queue:

import asyncio

from crawlee.storages import RequestQueue


async def main() -> None:
    # Open a default request queue
    rq = await RequestQueue.open()

    # Add a single request
    await rq.add_request('https://crawlee.dev')

    # Open a named request queue
    rq_named = await RequestQueue.open(name='some-name')

    # Add multiple requests
    await rq_named.add_requests_batched(['https://apify.com', 'https://example.com'])

    # Fetch the next request
    request = await rq_named.fetch_next_request()
    print(f'Next request: {request.url}')  # Next request: https://apify.com


if __name__ == '__main__':
    asyncio.run(main())
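
How one queue can serve both breadth-first and depth-first crawling is easiest to see in a small model: the order depends only on whether newly discovered links are added to the back or to the front of the queue. The sketch below uses a plain collections.deque and a hypothetical in-memory link graph; it illustrates the idea and is not how RequestQueue is implemented.

```python
from collections import deque


def crawl_order(start: str, links: dict[str, list[str]], depth_first: bool = False) -> list[str]:
    """Return the order in which pages are visited for a given link graph."""
    queue = deque([start])
    seen = {start}
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        # Collect links that have not been enqueued yet
        new_links = []
        for link in links.get(url, []):
            if link not in seen:
                seen.add(link)
                new_links.append(link)
        if depth_first:
            # Front of the queue: the newest links are crawled next
            queue.extendleft(reversed(new_links))
        else:
            # Back of the queue: older links are crawled first
            queue.extend(new_links)
    return visited


# Hypothetical site structure used only for illustration
site = {'/': ['/a', '/b'], '/a': ['/a1'], '/b': ['/b1']}
print(crawl_order('/', site))                    # ['/', '/a', '/b', '/a1', '/b1']
print(crawl_order('/', site, depth_first=True))  # ['/', '/a', '/a1', '/b', '/b1']
```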

For an example of using the request queue with a crawler, see the BeautifulSoupCrawler example.

Session Management

SessionPool is a class that lets us handle the rotation of proxy IP addresses along with cookies and other custom settings in Crawlee.

The main benefit of using a session pool is that we can filter out blocked or non-working proxies, so our crawler does not retry requests over proxies known to be blocked or broken. Another benefit is that we can store information tied tightly to an IP address, such as cookies, auth tokens, and particular headers. Using cookies and other identifiers only with a specific IP reduces the chance of being blocked. Last but not least, the session pool rotates IP addresses evenly: it picks sessions at random, which should prevent burning out a small pool of available IPs.

To use a default session pool with automatic session rotation, set the use_session_pool option on the crawler.

from crawlee.http_crawler import HttpCrawler

crawler = HttpCrawler(use_session_pool=True)

If you want to configure your own session pool, instantiate it and provide it directly to the crawler.

import asyncio
from datetime import timedelta

from crawlee.http_crawler import HttpCrawler
from crawlee.sessions import Session, SessionPool


async def main() -> None:
    # Use dict as args for new sessions
    session_pool_v1 = SessionPool(
        max_pool_size=10,
        create_session_settings={'max_age': timedelta(minutes=10)},
    )

    # Use lambda creation function for new sessions
    session_pool_v2 = SessionPool(
        max_pool_size=10,
        create_session_function=lambda _: Session(max_age=timedelta(minutes=10)),
    )

    crawler = HttpCrawler(session_pool=session_pool_v1, use_session_pool=True)


if __name__ == '__main__':
    asyncio.run(main())
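
The benefits described in this section (skipping blocked sessions, keeping cookies tied to one IP, and random rotation) can be modelled in a few lines. The sketch below is a simplified illustration, not Crawlee's SessionPool: the class names SketchSession and SketchSessionPool, the error-score threshold, and the proxy URLs are all hypothetical, and unlike a real pool it never replaces retired sessions with fresh ones.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SketchSession:
    """State tied to one proxy IP: its cookies and how often it failed."""

    proxy_url: str
    cookies: dict = field(default_factory=dict)
    error_score: int = 0
    max_error_score: int = 3  # hypothetical threshold

    @property
    def is_usable(self) -> bool:
        return self.error_score < self.max_error_score

    def mark_bad(self) -> None:
        # Called after a failed or suspicious response
        self.error_score += 1

    def retire(self) -> None:
        # Called when the session is known to be blocked
        self.error_score = self.max_error_score


class SketchSessionPool:
    def __init__(self, proxy_urls, rng=None):
        self._sessions = [SketchSession(proxy_url=url) for url in proxy_urls]
        self._rng = rng or random.Random()

    def get_session(self) -> SketchSession:
        # Blocked sessions are filtered out; the rest are picked at random
        usable = [s for s in self._sessions if s.is_usable]
        if not usable:
            raise RuntimeError('No usable sessions left in the pool')
        return self._rng.choice(usable)
```

Picking at random from the usable sessions spreads requests across all remaining IPs, while each session keeps its own cookies so identifiers are never mixed between proxies.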

Running on the Apify platform

Crawlee is open-source and runs anywhere, but since it's developed by Apify, it's easy to set up on the Apify platform and run in the cloud. Visit the Apify SDK website to learn more about deploying Crawlee to the Apify platform.

Support

If you find any bug or issue with Crawlee, please submit an issue on GitHub. For questions, you can ask on Stack Overflow, in GitHub Discussions or you can join our Discord server.

Contributing

Your code contributions are welcome, and you'll be praised for eternity! If you have any ideas for improvements, either submit an issue or create a pull request. For contribution guidelines and the code of conduct, see CONTRIBUTING.md.

License

This project is licensed under the Apache License 2.0 - see the LICENSE.md file for details.
