lobstrio-sdk

Python SDK for the Lobstr.io API — web scraping automation platform


  • Sync + async clients with the same API surface
  • Typed dataclass models for all responses
  • Lazy auto-pagination
  • Automatic token resolution from CLI config or environment

Installation

pip install lobstrio-sdk

Requires Python 3.10+. The only runtime dependency is httpx.

Authentication

The client resolves your API token in this order:

  1. Explicit: LobstrClient(token="your-token")
  2. Environment variable: LOBSTR_TOKEN
  3. CLI config file: ~/.config/lobstr/config.toml (the same file used by the lobstr CLI)

If you already have the CLI set up, the SDK works with no configuration:

from lobstrio import LobstrClient

client = LobstrClient()  # token auto-resolved
user = client.me()
print(user.email)

Quick Start

from lobstrio import LobstrClient

with LobstrClient() as client:
    # Account info
    user = client.me()
    balance = client.balance()
    print(f"{user.email}: {balance.credits} credits")

    # List crawlers
    for crawler in client.crawlers.list():
        print(f"{crawler.name} ({crawler.id})")

    # Create a squid, add tasks, run it
    squid = client.squids.create("google-maps-scraper", name="My Scrape")
    client.squids.update(squid.id, params={"language": "English (United States)"})
    client.tasks.add(squid=squid.id, tasks=[{"url": "https://maps.google.com/..."}])
    run = client.runs.start(squid=squid.id)

    # Wait for completion with progress callback
    final = client.runs.wait(run.id, callback=lambda s: print(f"{s.percent_done}%"))

    # Download results
    client.runs.download(run.id, "results.csv")

Resources

All API operations are organized under resource namespaces on the client.

User
user = client.me()           # User profile
balance = client.balance()   # Account balance (credits, subscription)
Crawlers — browse scraper templates
crawlers = client.crawlers.list()              # All crawlers
crawler = client.crawlers.get("crawler-id")    # Single crawler
params = client.crawlers.params("crawler-id")  # Parameter schema
attrs = client.crawlers.attributes("crawler-id")  # Result columns

Models: Crawler, CrawlerAttribute, CrawlerParams

Squids — manage scraper instances
# List & iterate
squids = client.squids.list(limit=50, page=1)
for squid in client.squids.iter():     # auto-paginate all squids
    print(squid.name)

# CRUD
squid = client.squids.create("crawler-id", name="My Project")
squid = client.squids.get("squid-id")
squid = client.squids.update("squid-id", name="Renamed", concurrency=2,
                              params={"language": "English"})
client.squids.empty("squid-id")        # remove all tasks
client.squids.delete("squid-id")

Model: Squid (id, name, crawler, is_active, concurrency, params, created_at, ...)

Tasks — manage input URLs and keywords
# List & iterate
tasks = client.tasks.list(squid="squid-id")
for task in client.tasks.iter(squid="squid-id"):
    print(task.id)

# Add tasks
result = client.tasks.add(
    squid="squid-id",
    tasks=[
        {"url": "https://maps.google.com/maps?cid=123"},
        {"url": "https://maps.google.com/maps?cid=456"},
    ],
)
print(f"Added {len(result.tasks)}, {result.duplicated_count} duplicates")

# Upload from CSV/TSV
resp = client.tasks.upload(squid="squid-id", file="tasks.csv")
status = client.tasks.upload_status(resp["id"])

# Get & delete
task = client.tasks.get("task-hash")
client.tasks.delete("task-hash")

Models: Task, TaskStatus, AddTasksResult, UploadStatus, UploadMeta
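
If you would rather build the tasks.add payload yourself than use tasks.upload, the list of {"url": ...} dicts is easy to assemble from a CSV with the standard library. The tasks_from_csv helper below is hypothetical, not part of the SDK:

```python
import csv
import io


def tasks_from_csv(text: str, url_column: str = "url") -> list[dict]:
    """Build a tasks payload (list of {"url": ...} dicts) from CSV text,
    skipping rows where the URL column is empty."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"url": row[url_column]} for row in reader if row.get(url_column)]
```

The resulting list can be passed directly as the tasks argument of client.tasks.add.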

Runs — start, monitor, and download
# Start a run
run = client.runs.start(squid="squid-id")

# List runs
runs = client.runs.list(squid="squid-id")
for run in client.runs.iter(squid="squid-id"):
    print(run.id, run.status)

# Monitor
run = client.runs.get("run-id")
stats = client.runs.stats("run-id")
print(f"{stats.percent_done}% done, {stats.total_results} results")

# Wait for completion (blocking, with optional progress callback)
final = client.runs.wait("run-id", poll_interval=5.0,
                          callback=lambda s: print(f"{s.percent_done}%"))

# Download results
url = client.runs.download_url("run-id")   # signed S3 URL
client.runs.download("run-id", "output.csv")  # download to file

# Abort
client.runs.abort("run-id")

# Tasks within a run
tasks = client.runs.tasks("run-id")

Models: Run, RunStats

Results — fetch scraped data
results = client.results.list(squid="squid-id", page_size=100)

# Auto-paginate all results
for row in client.results.iter(squid="squid-id"):
    print(row)  # dict

Results are returned as plain dict objects (the schema depends on the crawler).

Accounts — manage connected platform accounts
accounts = client.accounts.list()
account = client.accounts.get("account-id")
types = client.accounts.types()     # available account types

# Sync account with cookies
resp = client.accounts.sync(type="google", cookies={"SID": "...", "HSID": "..."})
status = client.accounts.sync_status(resp["id"])

# Update limits
client.accounts.update("account-id", type="google", params={"daily_limit": 100})

# Delete
client.accounts.delete("account-id")

Models: Account, AccountType, SyncStatus

Delivery — configure result delivery
# Email
client.delivery.email("squid-id", email="you@example.com")
client.delivery.test_email(email="you@example.com")

# Google Sheets
client.delivery.google_sheet("squid-id", url="https://docs.google.com/spreadsheets/d/...", append=True)
client.delivery.test_google_sheet(url="https://docs.google.com/spreadsheets/d/...")

# Webhook
client.delivery.webhook("squid-id", url="https://your-server.com/hook",
                         on_done=True, on_error=True)
client.delivery.test_webhook(url="https://your-server.com/hook")

# S3
client.delivery.s3("squid-id", bucket="my-bucket", target_path="scrapes/",
                    aws_access_key="...", aws_secret_key="...")
client.delivery.test_s3(bucket="my-bucket")

# SFTP
client.delivery.sftp("squid-id", host="ftp.example.com", username="user",
                      password="pass", directory="/uploads")
client.delivery.test_sftp(host="ftp.example.com", username="user",
                           password="pass", directory="/uploads")

Models: EmailDelivery, GoogleSheetDelivery, S3Delivery, WebhookDelivery, SFTPDelivery

Async Client

The async client mirrors the sync API exactly, using async/await:

from lobstrio import AsyncLobstrClient

async def main():
    async with AsyncLobstrClient() as client:
        user = await client.me()
        print(user.email)

        crawlers = await client.crawlers.list()
        for c in crawlers:
            print(c.name)

        squid = await client.squids.create("crawler-id", name="Async Scrape")
        await client.tasks.add(squid=squid.id, tasks=[{"url": "..."}])
        run = await client.runs.start(squid=squid.id)
        final = await client.runs.wait(run.id)
        await client.runs.download(run.id, "results.csv")

All resource methods (client.crawlers.*, client.squids.*, etc.) work identically — just add await.
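
A coroutine like main() above must be driven by an event loop; in a standalone script the usual entry point is asyncio.run. A self-contained sketch with a trivial stand-in coroutine:

```python
import asyncio


async def main() -> str:
    # Stand-in for the async workflow shown above.
    await asyncio.sleep(0)
    return "done"


# asyncio.run creates an event loop, runs main() to completion, then closes it.
result = asyncio.run(main())
```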

Pagination

Resources that return lists support two patterns:

Single page (.list()) — returns one page of results:

page1 = client.squids.list(limit=10, page=1)
page2 = client.squids.list(limit=10, page=2)

Auto-pagination (.iter()) — lazy iterator that fetches pages on demand:

for squid in client.squids.iter(limit=50):
    print(squid.name)  # automatically fetches next pages

The async client provides AsyncPageIterator for use with async for.
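
The lazy iterator pattern behind .iter() can be sketched in a few lines. This is an illustrative generator, not the SDK's implementation; iter_pages is a hypothetical name, and it assumes a short page signals the final page:

```python
from typing import Callable, Iterator


def iter_pages(fetch_page: Callable[[int, int], list],
               limit: int = 50) -> Iterator:
    """Yield items one at a time, fetching the next page only when the
    current one is exhausted."""
    page = 1
    while True:
        items = fetch_page(page, limit)
        yield from items
        if len(items) < limit:  # short page => no more pages
            return
        page += 1
```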

Error Handling

All API errors raise typed exceptions with status_code, message, and body:

from lobstrio import LobstrClient, AuthError, NotFoundError, RateLimitError, APIError

try:
    client.squids.get("nonexistent")
except NotFoundError as e:
    print(f"Not found: {e.message}")
except AuthError:
    print("Invalid or expired token")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except APIError as e:
    print(f"API error [{e.status_code}]: {e.message}")

| Exception | HTTP status | When |
| --- | --- | --- |
| AuthError | 401 | Invalid or missing token |
| NotFoundError | 404 | Resource doesn't exist |
| RateLimitError | 429 | Too many requests (has retry_after) |
| APIError | 4xx/5xx | All other API errors |

CLI vs SDK

| | CLI (pip install lobstrio) | SDK (pip install lobstrio-sdk) |
| --- | --- | --- |
| Use case | Terminal workflows, quick scrapes, cron jobs | Scripts, pipelines, applications |
| Interface | Shell commands | Python API |
| Output | Rich tables, progress bars, CSV files | Typed dataclass models |
| Async | No | Yes (AsyncLobstrClient) |
| Pagination | Manual (--page, --limit) | Auto (client.squids.iter()) |

For terminal workflows, see lobstrio — the companion CLI tool.

FAQ

Where do I get an API token?

Go to Dashboard → API to find your token. It is pre-generated and always available there.

Do I need the CLI installed for the SDK to work?

No. The SDK is standalone. However, if you have the CLI configured (lobstr config set-token), the SDK will automatically pick up the token from ~/.config/lobstr/config.toml — no code changes needed.

How do I handle rate limiting?

Catch RateLimitError and use its retry_after attribute:

from lobstrio import RateLimitError
import time

try:
    results = client.results.list(squid="squid-id")
except RateLimitError as e:
    time.sleep(float(e.retry_after or 5))
    results = client.results.list(squid="squid-id")
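
For repeated calls, the retry-once pattern above generalizes to a small helper. A sketch — with_retries is a hypothetical function, written exception-agnostic so it can be tested without the SDK; in real code you would pass lobstrio's RateLimitError as exc_type:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], exc_type: type[Exception],
                 max_attempts: int = 3, default_wait: float = 5.0) -> T:
    """Retry `call` when it raises `exc_type`, sleeping for the exception's
    retry_after (falling back to default_wait) between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except exc_type as e:
            if attempt == max_attempts:
                raise
            time.sleep(float(getattr(e, "retry_after", None) or default_wait))
    raise RuntimeError("unreachable")
```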
Can I use the async client with Django/FastAPI?

Yes. Use AsyncLobstrClient in any async context:

from lobstrio import AsyncLobstrClient

async def scrape_view(request):
    async with AsyncLobstrClient() as client:
        results = await client.results.list(squid="squid-id")
        return results

Development

# Clone and install
git clone https://github.com/lobstrio/lobstrio-sdk.git
cd lobstrio-sdk
pip install -e ".[dev]"

# Run unit tests
pytest

# Run live tests (requires API token)
pytest tests/test_live.py -v

# Lint & type check
ruff check src/ tests/
mypy src/lobstrio/

Contributing

Contributions are welcome! See CONTRIBUTING.md for development setup, code style, and versioning guidelines.

Changelog

See CHANGELOG.md for release history.

License

Apache 2.0
