SDK for Crawlab AI
Crawlab AI SDK
This is the Python SDK for Crawlab AI, an AI-powered web scraping platform maintained by Crawlab.
Installation
pip install crawlab-ai
Pre-requisites
An API token is required to use this SDK. You can obtain one from the official Crawlab website.
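Because every call depends on a token, it can help to fail early when none is available. The sketch below is only an illustration: the environment variable name `CRAWLAB_AI_TOKEN_EXAMPLE` and the helper `load_token` are hypothetical and not part of the SDK, and how the SDK itself consumes the token is not covered here.

```python
import os

# Hypothetical variable name chosen for this example; it is NOT defined by the SDK.
TOKEN_ENV_VAR = "CRAWLAB_AI_TOKEN_EXAMPLE"


def load_token(environ=os.environ):
    """Return the token stored in `environ`, or raise if it is missing or blank."""
    token = environ.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {TOKEN_ENV_VAR} to your Crawlab AI API token before scraping."
        )
    return token
```

Passing the environment in as a parameter keeps the helper easy to exercise without touching the real process environment.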
Usage
Get data from a list page
from crawlab_ai import read_list
# Define the URL of the list page
url = "https://example.com"
# Get the data without specifying fields
df = read_list(url=url)
print(df)
# You can also specify fields
fields = ["title", "content"]
df = read_list(url=url, fields=fields)
# You can also return a list of dictionaries instead of a DataFrame
data = read_list(url=url, as_dataframe=False)
print(data)
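Once the results come back as a list of dictionaries, they can be handled with the standard library alone. The following is a minimal sketch of writing such rows to CSV; the `rows` data is made up to stand in for the output of `read_list(url=url, as_dataframe=False)`, and `rows_to_csv` is a hypothetical helper, not an SDK function.

```python
import csv
import io

# Made-up rows standing in for read_list(url=url, as_dataframe=False).
rows = [
    {"title": "First post", "content": "Hello"},
    {"title": "Second post", "content": "World"},
]


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dictionaries to CSV text, keeping only `fieldnames`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently skip keys that are not requested
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows, ["title", "content"])
print(csv_text)
```

Keys missing from a row are written as empty cells, which is the default behavior of `csv.DictWriter`.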
Usage with Scrapy
Create a Scrapy spider by extending ScrapyListSpider:
from crawlab_ai import ScrapyListSpider
class MySpider(ScrapyListSpider):
    name = "my_spider"
    start_urls = ["https://example.com"]
    fields = ["title", "content"]
Then run the spider:
scrapy crawl my_spider
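Items produced by a spider can be cleaned with an ordinary Scrapy item pipeline. The sketch below assumes, without confirmation from the SDK, that the spider yields dict-like items keyed by the configured fields; `StripWhitespacePipeline` is a hypothetical name, and the only Scrapy convention it relies on is the standard `process_item(self, item, spider)` method.

```python
class StripWhitespacePipeline:
    """Collapse runs of whitespace in every string field of an item."""

    def process_item(self, item, spider):
        # Copy the pairs first so values can be reassigned while looping.
        for key, value in list(item.items()):
            if isinstance(value, str):
                item[key] = " ".join(value.split())
        return item


# Made-up item, shaped like the fields configured on the spider.
sample = {"title": "  Hello \n world ", "content": "Body\ttext"}
cleaned = StripWhitespacePipeline().process_item(sample, spider=None)
print(cleaned)
```

To enable a pipeline like this, add its import path to the `ITEM_PIPELINES` setting of your Scrapy project.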
Download files
Source Distribution
crawlab-ai-0.0.10.tar.gz (9.4 kB)
Built Distribution
crawlab_ai-0.0.10-py3-none-any.whl (14.2 kB)
File details
Details for the file crawlab-ai-0.0.10.tar.gz.
File metadata
- Download URL: crawlab-ai-0.0.10.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.14
File hashes
Algorithm | Hash digest
---|---
SHA256 | b211f8d597bbcaa9e889448d569cbe9db08a6c879e30879742a96edb60826254
MD5 | 59678f898c2ba6d6141ec8b9db20ddc8
BLAKE2b-256 | 2235ae028accd56b7dfc00d88eef4d4139fa31f2a506a2bd2040152aca7e1a76
File details
Details for the file crawlab_ai-0.0.10-py3-none-any.whl.
File metadata
- Download URL: crawlab_ai-0.0.10-py3-none-any.whl
- Upload date:
- Size: 14.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.14
File hashes
Algorithm | Hash digest
---|---
SHA256 | d155b529dc9fd76d3de79046aab948e907760a602e3472d0fcd69b4f976eda06
MD5 | 7585a35d870091f0e2cba2d2401bbfa6
BLAKE2b-256 | 0c8acf2cc283d802ed8a51c36cb40d10204c1afa81e8bb0102ca5810ce81cee8
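The SHA256 digests listed for the two files can be used to check a manual download. The sketch below uses only the standard library; `sha256_of` and `matches_published_digest` are hypothetical helper names, and the file is assumed to be in the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details on this page.
PUBLISHED_SHA256 = {
    "crawlab-ai-0.0.10.tar.gz":
        "b211f8d597bbcaa9e889448d569cbe9db08a6c879e30879742a96edb60826254",
    "crawlab_ai-0.0.10-py3-none-any.whl":
        "d155b529dc9fd76d3de79046aab948e907760a602e3472d0fcd69b4f976eda06",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path):
    """Compare a downloaded file against the digest published for its name."""
    expected = PUBLISHED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_digest("crawlab-ai-0.0.10.tar.gz")` should return `True` for an intact copy of the source distribution, and `False` for a corrupted file or an unknown file name.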