SDK for Crawlab AI

Crawlab AI SDK

This is the Python SDK for Crawlab AI, an AI-powered web scraping platform maintained by Crawlab.

Installation

pip install crawlab-ai

Pre-requisites

An API token is required to use this SDK. You can obtain one from the official Crawlab website.

Usage

Get data from a list page

from crawlab_ai import read_list

# Define the URL of the list page
url = "https://example.com"

# Get the data without specifying fields
df = read_list(url=url)
print(df)

# You can also specify fields
fields = ["title", "content"]
df = read_list(url=url, fields=fields)

# You can also return a list of dictionaries instead of a DataFrame
data = read_list(url=url, as_dataframe=False)
print(data)
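
When `as_dataframe=False` is used, the result is a list of dictionaries, which is convenient to hand to standard-library tools. The following is a minimal sketch of writing such records to CSV text. It assumes each dictionary is keyed by field name; `records_to_csv` is a hypothetical helper (not part of the SDK), and `sample_records` is made-up stand-in data so the snippet runs without an API token.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of dictionaries to CSV text (hypothetical helper, not part of the SDK)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently skip keys that are not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Keys missing from a record are written as empty cells.
        writer.writerow(record)
    return buffer.getvalue()


# Made-up stand-in for read_list(url=url, fields=fields, as_dataframe=False)
sample_records = [
    {"title": "First post", "content": "Hello"},
    {"title": "Second post"},
]

csv_text = records_to_csv(sample_records, ["title", "content"])
print(csv_text)
```

In real use, `sample_records` would be replaced by the value returned from `read_list`, assuming it has the shape described above.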

Usage with Scrapy

Create a Scrapy spider by extending ScrapyListSpider:

from crawlab_ai import ScrapyListSpider


class MySpider(ScrapyListSpider):
    name = "my_spider"
    start_urls = ["https://example.com"]
    fields = ["title", "content"]

Then run the spider:

scrapy crawl my_spider
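
Items produced by a spider can be post-processed with an ordinary Scrapy item pipeline. The sketch below trims whitespace from string values. It assumes the items yielded by `ScrapyListSpider` are dict-like and keyed by the names in `fields`, which this README does not confirm; `StripWhitespacePipeline` is a hypothetical name introduced for illustration.

```python
class StripWhitespacePipeline:
    """Hypothetical Scrapy item pipeline that trims string values.

    Assumes items are dict-like and keyed by field name.
    """

    def process_item(self, item, spider):
        # Copy the items into a list first so the item can be updated while looping.
        for key, value in list(item.items()):
            if isinstance(value, str):
                item[key] = value.strip()
        return item


# Quick check outside Scrapy, using a plain dict as a stand-in item
cleaned = StripWhitespacePipeline().process_item(
    {"title": "  Hello  ", "content": "World\n"}, spider=None
)
print(cleaned)
```

To enable a pipeline like this in a Scrapy project, register its import path in the `ITEM_PIPELINES` setting, for example `{"myproject.pipelines.StripWhitespacePipeline": 300}`, where `myproject.pipelines` is a placeholder module path.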
