SDK for Crawlab AI
Crawlab AI SDK
This is the Python SDK for Crawlab AI, an AI-powered web scraping platform maintained by Crawlab.
Installation
pip install crawlab-ai
Pre-requisites
An API token is required to use this SDK. You can obtain one from the official Crawlab website.
Usage
Get data from a list page
from crawlab_ai import read_list
# Define the URL of the list page to scrape
url = "https://example.com"
# Get the data without specifying fields
df = read_list(url=url)
print(df)
# You can also specify fields
fields = ["title", "content"]
df = read_list(url=url, fields=fields)
print(df)
# You can also return a list of dictionaries instead of a DataFrame
data = read_list(url=url, as_dataframe=False)
print(data)
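Once the records are available as a list of dictionaries, they can be passed to ordinary Python tooling. The following is a minimal sketch that serializes such a list to CSV text with the standard library. The `records_to_csv` helper and the `sample` records are illustrative and not part of the SDK; the sample stands in for the result of `read_list(url=url, as_dataframe=False)`, assuming each dictionary is keyed by field name.

```python
import csv
import io
from typing import Any, Dict, List


def records_to_csv(records: List[Dict[str, Any]]) -> str:
    """Serialize a list of dictionaries to CSV text (hypothetical helper)."""
    # Collect column names in first-seen order, since rows may differ in keys.
    fieldnames: List[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample standing in for:
#   data = read_list(url=url, as_dataframe=False)
sample = [
    {"title": "First post", "content": "Hello"},
    {"title": "Second post", "content": "World"},
]
```

Calling `records_to_csv(sample)` returns the CSV as a string, which can then be written to a file or passed on to another tool.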
Usage with Scrapy
Create a Scrapy spider by extending ScrapyListSpider:
from crawlab_ai import ScrapyListSpider
class MySpider(ScrapyListSpider):
    name = "my_spider"
    start_urls = ["https://example.com"]
    fields = ["title", "content"]
Then run the spider:
scrapy crawl my_spider
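Items produced by the spider can be post-processed with a regular Scrapy item pipeline. The sketch below assumes the spider yields dict-like items keyed by the names in `fields`; this README does not confirm that, so check it before relying on the example. The class name is hypothetical.

```python
class NormalizeWhitespacePipeline:
    """Item pipeline sketch: collapse runs of whitespace in string fields.

    Assumes items are dict-like (an assumption, not documented SDK behavior).
    """

    def process_item(self, item, spider):
        # Copy the items list first so values can be reassigned while iterating.
        for key, value in list(item.items()):
            if isinstance(value, str):
                # str.split() with no arguments splits on any whitespace run.
                item[key] = " ".join(value.split())
        return item
```

To enable a pipeline like this, add its import path to the `ITEM_PIPELINES` setting of your Scrapy project, for example `{"myproject.pipelines.NormalizeWhitespacePipeline": 300}`, where `myproject.pipelines` is a placeholder module path.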
Download files
File details
Details for the file crawlab-ai-0.0.10.tar.gz.
File metadata
- Download URL: crawlab-ai-0.0.10.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b211f8d597bbcaa9e889448d569cbe9db08a6c879e30879742a96edb60826254 |
| MD5 | 59678f898c2ba6d6141ec8b9db20ddc8 |
| BLAKE2b-256 | 2235ae028accd56b7dfc00d88eef4d4139fa31f2a506a2bd2040152aca7e1a76 |
File details
Details for the file crawlab_ai-0.0.10-py3-none-any.whl.
File metadata
- Download URL: crawlab_ai-0.0.10-py3-none-any.whl
- Upload date:
- Size: 14.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d155b529dc9fd76d3de79046aab948e907760a602e3472d0fcd69b4f976eda06 |
| MD5 | 7585a35d870091f0e2cba2d2401bbfa6 |
| BLAKE2b-256 | 0c8acf2cc283d802ed8a51c36cb40d10204c1afa81e8bb0102ca5810ce81cee8 |
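A downloaded file can be checked against the SHA256 digests listed for each distribution. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are illustrative and not part of the SDK, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib
from typing import BinaryIO

# SHA256 digest listed for crawlab-ai-0.0.10.tar.gz
EXPECTED_SDIST_SHA256 = (
    "b211f8d597bbcaa9e889448d569cbe9db08a6c879e30879742a96edb60826254"
)


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    # iter() with a sentinel stops once read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path: str) -> str:
    """Return the hex SHA256 digest of the file at the given path."""
    with open(path, "rb") as f:
        return sha256_of_stream(f)


# Usage (assumes the archive was downloaded to the current directory):
#   ok = sha256_of_file("crawlab-ai-0.0.10.tar.gz") == EXPECTED_SDIST_SHA256
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for these small archives but makes the helper reusable for larger downloads.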