
OxyLabs AI Studio Python SDK


A simple Python SDK for interacting with Oxylabs AI Studio API services, including AI-Scraper, AI-Crawler, AI-Browser-Agent, and other data extraction tools.

Requirements

  • Python 3.10 or above
  • An Oxylabs AI Studio API key

Installation

pip install oxylabs-ai-studio

Usage

Crawl (AiCrawler.crawl)

from oxylabs_ai_studio.apps.ai_crawler import AiCrawler

crawler = AiCrawler(api_key="<API_KEY>")

url = "https://oxylabs.io"
result = crawler.crawl(
    url=url,
    user_prompt="Find all pages with proxy products pricing",
    output_format="markdown",
    render_javascript=False,
    return_sources_limit=3,
    geo_location="United States",
)
print("Results:")
for item in result.data:
    print(item, "\n")

Parameters:

  • url (str): Starting URL to crawl (required)
  • user_prompt (str): Natural language prompt to guide extraction (required)
  • output_format (Literal["json", "markdown", "csv", "toon"]): Output format (default: "markdown")
  • schema (dict | None): JSON schema for structured extraction (required if output_format is "json", "csv" or "toon")
  • render_javascript (bool): Render JavaScript (default: False)
  • return_sources_limit (int): Max number of sources to return (default: 25)
  • geo_location (str): Proxy location in ISO2 format or country canonical name. See docs
  • max_credits (int | None): Maximum number of credits to use (optional)
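When output_format is "json", "csv", or "toon", crawl needs a schema. A minimal hand-written sketch follows; the field names are illustrative assumptions, not mandated by the SDK:

```python
# Illustrative JSON schema for structured crawl extraction.
# The field names here are hypothetical examples, not part of the SDK.
pricing_schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "price": {"type": "string"},
        "billing_period": {"type": "string"},
    },
    "required": ["product_name", "price"],
}
```

A schema like this would be passed as schema=pricing_schema together with output_format="json".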

Scrape (AiScraper.scrape)

from oxylabs_ai_studio.apps.ai_scraper import AiScraper

scraper = AiScraper(api_key="<API_KEY>")

schema = scraper.generate_schema(prompt="want to parse developer, platform, type, price, game title, genre (array) and description")
print(f"Generated schema: {schema}")

url = "https://sandbox.oxylabs.io/products/3"
result = scraper.scrape(
    url=url,
    output_format="json",
    schema=schema,
    render_javascript=False,
)
print(result)

Parameters:

  • url (str): Target URL to scrape (required)
  • output_format (Literal["json", "markdown", "csv", "screenshot", "toon"]): Output format (default: "markdown")
  • schema (dict | None): JSON schema for structured extraction (required if output_format is "json", "csv" or "toon")
  • render_javascript (bool | str): Render JavaScript. Can be set to "auto", in which case the service detects whether rendering is needed (default: False)
  • geo_location (str): Proxy location in ISO2 format or country canonical name. See docs
  • user_agent (str): User-Agent request header. See more at https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/http-context-and-job-management/user-agent-type.
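Since render_javascript accepts either a bool or the literal "auto", a small client-side guard can catch bad values before a request is sent. This helper is a hypothetical sketch mirroring the documented values, not part of the SDK:

```python
# Hypothetical pre-flight check for the documented render_javascript
# values: True, False, or the string "auto".
def validate_render_javascript(value):
    if isinstance(value, bool) or value == "auto":
        return value
    raise ValueError(f"render_javascript must be a bool or 'auto', got {value!r}")
```

For example, validate_render_javascript("auto") passes the value through, while validate_render_javascript("yes") raises ValueError.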

Browser Agent (BrowserAgent.run)

from oxylabs_ai_studio.apps.browser_agent import BrowserAgent

browser_agent = BrowserAgent(api_key="<API_KEY>")

schema = browser_agent.generate_schema(
    prompt="game name, platform, review stars and price"
)
print("schema: ", schema)

prompt = "Find if there is game 'super mario odyssey' in the store. If there is, find the price. Use search bar to find the game."
url = "https://sandbox.oxylabs.io/"
result = browser_agent.run(
    url=url,
    user_prompt=prompt,
    output_format="json",
    schema=schema,
)
print(result.data)

Parameters:

  • url (str): Starting URL to browse (required)
  • user_prompt (str): Natural language prompt for extraction (required)
  • output_format (Literal["json", "markdown", "html", "screenshot", "csv", "toon"]): Output format (default: "markdown")
  • schema (dict | None): JSON schema for structured extraction (required if output_format is "json", "csv" or "toon")
  • geo_location (str): Proxy location in ISO2 format or country canonical name. For example 'Germany' (capitalized).
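For output_format="json", result.data follows the generated schema, so fields the agent could not extract may be missing. A defensive-access sketch over a mock record (the field names are illustrative, since the real ones depend on the generated schema):

```python
# Mock of a result.data payload; the field names are purely illustrative.
mock_data = {"title": "Super Mario Odyssey", "price": "$59.99"}

# Access fields defensively in case one could not be extracted.
price = mock_data.get("price", "not found")
stars = mock_data.get("review_stars", "not found")
```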

Search (AiSearch.search)

from oxylabs_ai_studio.apps.ai_search import AiSearch


search = AiSearch(api_key="<API_KEY>")

query = "lasagna recipe"
result = search.search(
    query=query,
    limit=5,
    render_javascript=False,
    return_content=True,
)
print(result.data)

# Or for fast search
result = search.instant_search(
    query=query,
    limit=10,
)
print(result.data)

Parameters:

  • query (str): What to search for (required)
  • limit (int): Maximum number of results to return (default: 10, maximum: 50)
  • render_javascript (bool): Render JavaScript (default: False)
  • return_content (bool): Whether to return markdown contents in results (default: True)
  • geo_location (str): ISO 2-letter codes, country names, and coordinate formats are supported. See more at SERP Localization.

Note: When limit <= 10 and return_content=False, the search automatically uses the instant endpoint (/search/instant), which returns results immediately without polling, giving faster response times.
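That routing rule can be restated as a one-liner (an assumed restatement of the documented behavior, not the SDK's actual internals):

```python
# Assumed restatement of the documented routing rule: small requests
# that skip content retrieval are served by the instant endpoint.
def uses_instant_endpoint(limit: int, return_content: bool) -> bool:
    return limit <= 10 and not return_content
```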

Instant search supported parameters:

  • query (str): The search query.
  • limit (int): The maximum number of search results to return. Maximum: 10.
  • geo_location (str): Google's canonical name of the location. See more at Google Ads GeoTargets.

Map (AiMap.map)

from oxylabs_ai_studio.apps.ai_map import AiMap


ai_map = AiMap(api_key="<API_KEY>")
payload = {
    "url": "https://career.oxylabs.io",
    "search_keywords": ["career", "jobs", "vacancy"],
    "user_prompt": "job ad pages",
    "max_crawl_depth": 2,
    "limit": 10,
    "geo_location": "Germany",
    "render_javascript": False,
    "include_sitemap": True,
    "max_credits": None,
    "allow_subdomains": False,
    "allow_external_domains": False,
}
result = ai_map.map(**payload)
print(result.data)

Parameters:

  • url (str): Starting URL or domain to map (required)
  • search_keywords (list[str]): Keywords for filtering URL paths (default: None)
  • user_prompt (str | None): Natural language prompt for keyword search. Can be used together with 'search_keywords' or standalone (optional)
  • max_crawl_depth (int): Max crawl depth (1..5, default: 1)
  • limit (int): Max number of URLs to return (default: 25)
  • geo_location (str): Proxy location in ISO2 format or country canonical name. See docs
  • render_javascript (bool): JavaScript rendering (default: False)
  • include_sitemap (bool): Whether to include sitemap as seed (default: True)
  • max_credits (int | None): Maximum number of credits to use (optional)
  • allow_subdomains (bool): Include subdomains (default: False)
  • allow_external_domains (bool): Include external domains (default: False)
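Since max_crawl_depth is documented as 1..5, a pre-flight check on the payload can fail fast before spending credits. This helper is a hypothetical sketch mirroring the documented range, not SDK behavior:

```python
# Hypothetical validation of the documented 1..5 range for max_crawl_depth.
def check_map_payload(payload: dict) -> dict:
    depth = payload.get("max_crawl_depth", 1)
    if not 1 <= depth <= 5:
        raise ValueError(f"max_crawl_depth must be in 1..5, got {depth}")
    return payload
```

A payload like the one in the example above would be checked before calling ai_map.map(**payload).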

See the examples folder for usage examples of each method. Each method has a corresponding async version.
