
Crawlab AI SDK

This is the Python SDK for Crawlab AI, an AI-powered web scraping platform maintained by Crawlab.

Installation

pip install crawlab-ai

Prerequisites

An API token is required to use this SDK. You can get the API token from the Crawlab official website.
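How the token is supplied to the SDK is not documented here. As a minimal sketch, assuming the SDK reads the token from an environment variable named CRAWLAB_AI_API_TOKEN (a hypothetical name; check the Crawlab documentation for the actual configuration mechanism), you could set it before calling the SDK:

import os

# Hypothetical: assumes the SDK picks up the API token from this environment
# variable; verify the actual variable name or parameter in the Crawlab docs.
os.environ["CRAWLAB_AI_API_TOKEN"] = "<your-api-token>"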

Usage

Get data from a list page

from crawlab_ai import read_list

# Define the target list page URL
url = "https://example.com"

# Get the data without specifying fields
df = read_list(url=url)
print(df)

# You can also specify fields
fields = ["title", "content"]
df = read_list(url=url, fields=fields)

# You can also return a list of dictionaries instead of a DataFrame
data = read_list(url=url, as_dataframe=False)
print(data)
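
Since read_list returns a DataFrame by default (presumably a pandas DataFrame, given the as_dataframe flag), the result can be post-processed with standard pandas tooling. A minimal sketch under that assumption:

from crawlab_ai import read_list

# Assumes the default return type is a pandas DataFrame.
df = read_list(url="https://example.com", fields=["title", "content"])
df.to_csv("results.csv", index=False)  # persist the extracted rows to CSV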

Usage with Scrapy

Create a Scrapy spider by extending ScrapyListSpider:

from crawlab_ai import ScrapyListSpider


class MySpider(ScrapyListSpider):
    name = "my_spider"  # spider name used by the scrapy crawl command
    start_urls = ["https://example.com"]  # list pages to scrape
    fields = ["title", "content"]  # fields for the AI extractor to return

Then run the spider:

scrapy crawl my_spider
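
As with any Scrapy spider, the extracted items can be exported with Scrapy's standard feed options, for example writing them to a JSON file (Scrapy 2.1+):

scrapy crawl my_spider -O results.json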

