Official Python SDK for the Olostep Web Scraping API
This project has been archived.
The maintainers of this project have marked this project as archived. No new releases are expected.
Olostep SDK
A lightweight Python SDK for interacting with the Olostep scraping, crawling, and batching API.
🚀 Installation
Install from PyPI:
pip install olostep-sdk
🧰 Features
- Scrape single URLs with different parsers
- Batch process multiple items
- Crawl starting from a URL
- Retrieve and parse content in multiple formats (JSON, Markdown, etc.)
🔑 Getting Started
First, initialize the SDK with your API token:
from olostep_sdk import OlostepClient
from olostep_sdk.services.scrape import ScrapeService
from olostep_sdk.enums import OlostepParser, Format
client = OlostepClient(api_token="your-api-token")
scraper = ScrapeService(client)
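Hard-coding the token as in the snippet above is fine for a quick test, but a shared script usually reads it from the environment. The following is a minimal sketch of that idea. The `load_api_token` helper and the `OLOSTEP_API_TOKEN` variable name are assumptions made for illustration, not part of the SDK.

```python
import os


def load_api_token(env_var: str = "OLOSTEP_API_TOKEN") -> str:
    """Return the API token stored in the given environment variable.

    The variable name is a hypothetical convention, not something the SDK defines.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Olostep API token"
        )
    return token


# Intended use with the client shown above:
#     client = OlostepClient(api_token=load_api_token())
```

Failing early with a clear message avoids sending requests with an empty token.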
🔍 Scrape a URL
result = scraper.scrape(
    url="https://example.com",
    parser=OlostepParser.GOOGLE_SEARCH
)
print(result)
📦 Start a Batch
from olostep_sdk.services.batch import BatchService
batch = BatchService(client)
batch_id = batch.start_batch([
    {"url": "https://example1.com"},
    {"url": "https://example2.com"}
])
batch.wait_until_complete(batch_id)
items = batch.get_items(batch_id)
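The batch example passes a list of `{"url": ...}` dictionaries. When the URLs come from a file or user input, it can help to clean them first. Here is one possible sketch: `build_batch_items` is a hypothetical helper, not an SDK function, and it assumes each item needs only the `url` key shown above.

```python
from typing import Iterable


def build_batch_items(urls: Iterable[str]) -> list[dict[str, str]]:
    """Turn raw URL strings into {"url": ...} items, dropping blanks and duplicates."""
    seen: set[str] = set()
    items: list[dict[str, str]] = []
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue  # skip empty entries and repeated URLs
        seen.add(url)
        items.append({"url": url})
    return items


items_to_send = build_batch_items([
    "https://example1.com",
    "https://example2.com",
    "https://example1.com",  # duplicate, dropped
    "   ",                   # blank, dropped
])
print(items_to_send)
# [{'url': 'https://example1.com'}, {'url': 'https://example2.com'}]
```

The resulting list has the same shape as the one passed to `batch.start_batch(...)` above.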
🌐 Crawl a Website
from olostep_sdk.services.crawl import CrawlService
crawler = CrawlService(client)
crawl_id = crawler.start_crawl("https://example.com")
crawler.wait_until_complete(crawl_id)
results = crawler.get_items(crawl_id)
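Both the batch and crawl examples call `wait_until_complete`, which presumably blocks until the job finishes. How the SDK implements it is not shown here. The sketch below illustrates the general polling pattern with a timeout; `poll_until` and `fake_status_check` are hypothetical names, and the status check is a stand-in rather than a real API call.

```python
import time
from typing import Callable


def poll_until(
    is_done: Callable[[], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> None:
    """Call is_done() every `interval` seconds until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while not is_done():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job did not complete within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a status check: reports completion on the third call.
calls = {"count": 0}


def fake_status_check() -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


poll_until(fake_status_check, interval=0.01, timeout=1.0)
print(calls["count"])  # 3
```

A timeout like this is worth having in any wrapper script, so a stuck job does not block the caller indefinitely.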
📄 Formats and Parsers
from olostep_sdk.enums import Format, OlostepParser
Format.MARKDOWN
Format.JSON
OlostepParser.GOOGLE_SEARCH
OlostepParser.BASIC
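If the format or parser is chosen from a config file or command-line argument, the string has to be mapped to an enum member. The sketch below assumes `Format` behaves like a standard Python `Enum`, which is not confirmed here. `ExampleFormat` is a stand-in with placeholder values, and `parse_format` is a hypothetical helper.

```python
from enum import Enum


class ExampleFormat(Enum):
    """Stand-in enum; the values are placeholders, not the SDK's real ones."""

    MARKDOWN = "markdown"
    JSON = "json"


def parse_format(name: str) -> ExampleFormat:
    """Look up a member by name, case-insensitively, with a readable error."""
    try:
        return ExampleFormat[name.strip().upper()]
    except KeyError:
        valid = ", ".join(member.name for member in ExampleFormat)
        raise ValueError(f"Unknown format {name!r}; expected one of: {valid}") from None


print(parse_format("markdown"))  # ExampleFormat.MARKDOWN
```

Looking up by member name rather than value keeps the helper independent of whatever values the real enum uses.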
🧪 Running Tests
python -m unittest discover -s tests
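The command above collects test files in the `tests` directory whose names match `test*.py`, the default `unittest` discovery pattern. As an illustration of what such a file might contain, here is a sketch that tests calling code against a mocked scraper, so no network access or API token is needed. The file name `tests/test_scrape_all.py`, the `scrape_all` helper, and the response shape are all hypothetical.

```python
import unittest
from unittest.mock import Mock


def scrape_all(scraper, urls):
    """Hypothetical helper under test: scrape each URL and collect the results."""
    return [scraper.scrape(url=url) for url in urls]


class ScrapeAllTest(unittest.TestCase):
    def test_scrapes_every_url_in_order(self):
        scraper = Mock()  # stands in for ScrapeService(client)
        scraper.scrape.side_effect = ["first", "second"]  # placeholder responses

        results = scrape_all(scraper, ["https://example1.com", "https://example2.com"])

        self.assertEqual(results, ["first", "second"])
        self.assertEqual(scraper.scrape.call_count, 2)
        scraper.scrape.assert_called_with(url="https://example2.com")
```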
📬 License
This project is licensed under the MIT License.
File details
Details for the file olostep_sdk-0.1.2.tar.gz.
File metadata
- Download URL: olostep_sdk-0.1.2.tar.gz
- Upload date:
- Size: 7.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9007280bced8c792114609f072907f311f38afd4db7912cc049b8e5805472efe |
| MD5 | c294daf70d4f3cb572fd6fb23731d4d7 |
| BLAKE2b-256 | 87a45d63051848baeb95bd3be0dc990325953da7617a7255c14b7c0d0aab7734 |
File details
Details for the file olostep_sdk-0.1.2-py3-none-any.whl.
File metadata
- Download URL: olostep_sdk-0.1.2-py3-none-any.whl
- Upload date:
- Size: 8.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d8ab6178a9babd071e0321a03492c05de1ca2627f3e0c71b03718ebfdfe7d1c8 |
| MD5 | e001f78963d8517843cddddd89b5e793 |
| BLAKE2b-256 | 7d80e4d1349b7706554e293b107f6db0d1a07434973bdfff26b9dd93e2148724 |
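The SHA256 digests listed under File hashes can be used to check a downloaded file before installing it. The sketch below uses only the standard library. The helper names are hypothetical, and the local file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_sha256.strip().lower()


# Intended use, assuming the source distribution sits in the current directory:
#     verify_download(
#         Path("olostep_sdk-0.1.2.tar.gz"),
#         "9007280bced8c792114609f072907f311f38afd4db7912cc049b8e5805472efe",
#     )
```

Because the project is archived and no new releases are expected, the published digests should stay fixed.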