Oxylabs AI Studio Python SDK
A simple Python SDK for seamlessly interacting with Oxylabs AI Studio API services, including AI-Scraper, AI-Crawler, AI-Browser-Agent and other data extraction tools.
Requirements
- Python 3.10 or above
- An API key
Installation
pip install oxylabs-ai-studio
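The usage examples below pass the API key as a literal string. A minimal sketch of reading it from the environment instead, so the key stays out of source control; the variable name `OXYLABS_AI_STUDIO_API_KEY` and the helper `load_api_key` are assumptions made for illustration, not part of the SDK:

```python
import os
from collections.abc import Mapping

# Hypothetical variable name chosen for this example; the SDK does not define it.
ENV_VAR = "OXYLABS_AI_STUDIO_API_KEY"


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable first")
    return key
```

The returned value can then be passed wherever the examples use `api_key="<API_KEY>"`, for instance `AiCrawler(api_key=load_api_key())`.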
Usage
Crawl (AiCrawler.crawl)
from oxylabs_ai_studio.apps.ai_crawler import AiCrawler
crawler = AiCrawler(api_key="<API_KEY>")
url = "https://oxylabs.io"
result = crawler.crawl(
    url=url,
    user_prompt="Find all pages with proxy products pricing",
    output_format="markdown",
    render_javascript=False,
    return_sources_limit=3,
    geo_location="United States",
)
print("Results:")
for item in result.data:
    print(item, "\n")
Parameters:
- url (str): Starting URL to crawl (required)
- user_prompt (str): Natural language prompt to guide extraction (required)
- output_format (Literal["json", "markdown", "csv", "toon"]): Output format (default: "markdown")
- schema (dict | None): JSON schema for structured extraction (required if output_format is "json", "csv" or "toon")
- render_javascript (bool): Render JavaScript (default: False)
- return_sources_limit (int): Maximum number of sources to return (default: 25)
- geo_location (str): Proxy location in ISO2 format or country canonical name. See docs
- max_credits (int | None): Maximum number of credits to use (optional)
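The schema rule in the parameter list can be checked locally before a request is sent. The following is a rough sketch of such a pre-flight check; `check_schema` is a hypothetical helper written for this example, and the SDK may well perform its own validation:

```python
from typing import Optional

# Formats that, per the parameter list above, require a schema.
SCHEMA_REQUIRED = {"json", "csv", "toon"}


def check_schema(output_format: str = "markdown", schema: Optional[dict] = None) -> None:
    """Raise ValueError if a structured format is requested without a schema."""
    if output_format in SCHEMA_REQUIRED and schema is None:
        raise ValueError(f"output_format={output_format!r} requires a schema")


check_schema("markdown")                  # passes: markdown needs no schema
check_schema("json", {"type": "object"})  # passes: schema supplied
```

Calling `check_schema("csv")` raises `ValueError`, which surfaces the mistake before any credits are spent.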
Scrape (AiScraper.scrape)
from oxylabs_ai_studio.apps.ai_scraper import AiScraper
scraper = AiScraper(api_key="<API_KEY>")
schema = scraper.generate_schema(prompt="want to parse developer, platform, type, price game title, genre (array) and description")
print(f"Generated schema: {schema}")
url = "https://sandbox.oxylabs.io/products/3"
result = scraper.scrape(
    url=url,
    output_format="json",
    schema=schema,
    render_javascript=False,
)
print(result)
Parameters:
- url (str): Target URL to scrape (required)
- output_format (Literal["json", "markdown", "csv", "screenshot", "toon"]): Output format (default: "markdown")
- schema (dict | None): JSON schema for structured extraction (required if output_format is "json", "csv" or "toon")
- render_javascript (bool | str): Render JavaScript. Can be set to "auto", in which case the service detects whether rendering is needed (default: False)
- geo_location (str): Proxy location in ISO2 format or country canonical name. See docs
- user_agent (str): User-Agent request header. See more at https://developers.oxylabs.io/scraping-solutions/web-scraper-api/features/http-context-and-job-management/user-agent-type.
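The `schema` parameter is typed as a dict, so a schema can presumably also be written by hand instead of being generated from a prompt. Below is an illustrative sketch for the fields named in the example prompt. It assumes the service accepts a plain JSON Schema object; the exact shape returned by `generate_schema` is not shown here, and the field types (for example, price as a string) are choices made for this example:

```python
import json

# Hand-written JSON Schema for the fields named in the example prompt.
# Assumption: the service accepts a standard JSON Schema object like this one.
game_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "developer": {"type": "string"},
        "platform": {"type": "string"},
        "type": {"type": "string"},
        "price": {"type": "string"},  # kept as text so currency symbols survive
        "genre": {"type": "array", "items": {"type": "string"}},
        "description": {"type": "string"},
    },
}

# Pretty-print the schema to inspect it before passing it as schema=...
print(json.dumps(game_schema, indent=2))
```

A fixed schema like this keeps output fields stable between runs, whereas a generated schema may vary with the prompt wording.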
Browser Agent (BrowserAgent.run)
from oxylabs_ai_studio.apps.browser_agent import BrowserAgent
browser_agent = BrowserAgent(api_key="<API_KEY>")
schema = browser_agent.generate_schema(
    prompt="game name, platform, review stars and price"
)
print("schema: ", schema)
prompt = "Find if there is game 'super mario odyssey' in the store. If there is, find the price. Use search bar to find the game."
url = "https://sandbox.oxylabs.io/"
result = browser_agent.run(
    url=url,
    user_prompt=prompt,
    output_format="json",
    schema=schema,
)
print(result.data)
Parameters:
- url (str): Starting URL to browse (required)
- user_prompt (str): Natural language prompt for extraction (required)
- output_format (Literal["json", "markdown", "html", "screenshot", "csv", "toon"]): Output format (default: "markdown")
- schema (dict | None): JSON schema for structured extraction (required if output_format is "json", "csv" or "toon")
- geo_location (str): Proxy location in ISO2 format or country canonical name, for example 'Germany' (capitalized).
Search (AiSearch.search)
from oxylabs_ai_studio.apps.ai_search import AiSearch
search = AiSearch(api_key="<API_KEY>")
query = "lasagna recipe"
result = search.search(
    query=query,
    limit=5,
    render_javascript=False,
    return_content=True,
)
print(result.data)
# Or for fast search
result = search.instant_search(
    query=query,
    limit=10,
)
print(result.data)
Parameters:
- query (str): What to search for (required)
- limit (int): Maximum number of results to return (default: 10, maximum: 50)
- render_javascript (bool): Render JavaScript (default: False)
- return_content (bool): Whether to return markdown contents in results (default: True)
- geo_location (str): ISO 2-letter code, country name, and coordinate formats are supported. See more at SERP Localization.
Note: When limit <= 10 and return_content=False, the search automatically uses the instant endpoint (/search/instant), which returns results immediately without polling and therefore responds faster.
Instant search supported parameters:
- query (string): The search query.
- limit (integer): The maximum number of search results to return. Maximum: 10.
- geo_location (string): Google's canonical name of the location. See more at Google Ads GeoTargets.
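The endpoint-selection rule from the note above can be written as a one-line predicate. This is only a restatement of the documented rule for reasoning about which path a call takes, not the SDK's internal code; `uses_instant_endpoint` is a hypothetical name, and the defaults mirror the documented defaults of `search`:

```python
def uses_instant_endpoint(limit: int = 10, return_content: bool = True) -> bool:
    """Mirror the documented rule: instant when limit <= 10 and no content is requested."""
    return limit <= 10 and not return_content


print(uses_instant_endpoint(limit=5, return_content=False))   # instant path
print(uses_instant_endpoint(limit=5, return_content=True))    # regular, polled path
print(uses_instant_endpoint(limit=20, return_content=False))  # regular, polled path
```

Note that with the default `return_content=True`, a plain `search.search(query=...)` call stays on the regular endpoint.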
Map (AiMap.map)
from oxylabs_ai_studio.apps.ai_map import AiMap
ai_map = AiMap(api_key="<API_KEY>")
payload = {
    "url": "https://career.oxylabs.io",
    "search_keywords": ["career", "jobs", "vacancy"],
    "user_prompt": "job ad pages",
    "max_crawl_depth": 2,
    "limit": 10,
    "geo_location": "Germany",
    "render_javascript": False,
    "include_sitemap": True,
    "max_credits": None,
    "allow_subdomains": False,
    "allow_external_domains": False,
}
result = ai_map.map(**payload)
print(result.data)
Parameters:
- url (str): Starting URL or domain to map (required)
- search_keywords (list[str]): Keywords for filtering URL paths (default: None)
- user_prompt (str | None): Natural language prompt for keyword search. Can be used together with 'search_keywords' or standalone (optional)
- max_crawl_depth (int): Maximum crawl depth (1..5, default: 1)
- limit (int): Maximum number of URLs to return (default: 25)
- geo_location (str): Proxy location in ISO2 format or country canonical name. See docs
- render_javascript (bool): JavaScript rendering (default: False)
- include_sitemap (bool): Whether to include sitemap as seed (default: True)
- max_credits (int | None): Maximum number of credits to use (optional)
- allow_subdomains (bool): Include subdomains (default: False)
- allow_external_domains (bool): Include external domains (default: False)
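Because `max_crawl_depth` is limited to the range 1..5, it can be worth validating the payload before calling `map`. The sketch below uses a hypothetical helper, `build_map_payload`, which is not part of the SDK; it checks only the documented depth range and passes any other keyword arguments through unchanged:

```python
def build_map_payload(url: str, *, max_crawl_depth: int = 1, limit: int = 25, **extra) -> dict:
    """Build keyword arguments for a map call, rejecting an out-of-range depth."""
    if not 1 <= max_crawl_depth <= 5:
        raise ValueError("max_crawl_depth must be between 1 and 5")
    # Remaining options (geo_location, include_sitemap, ...) are forwarded as given.
    return {"url": url, "max_crawl_depth": max_crawl_depth, "limit": limit, **extra}


payload = build_map_payload(
    "https://career.oxylabs.io",
    max_crawl_depth=2,
    limit=10,
    search_keywords=["career", "jobs", "vacancy"],
)
print(payload)
```

The resulting dict can be unpacked the same way as in the example above, that is, `ai_map.map(**payload)`.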
See the examples folder for usage examples of each method. Each method has a corresponding async version.
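The names of the async methods are not listed in this section, so the sketch below does not guess them. It shows a different, generic technique instead: running several blocking calls concurrently with the standard library's `asyncio.to_thread`. `blocking_call` is a stand-in for any synchronous SDK call, such as `search.search(query=...)`; the SDK's own async versions are likely the better choice once their names are confirmed in the examples folder.

```python
import asyncio
import time


def blocking_call(query: str) -> str:
    """Stand-in for a synchronous SDK call; sleeps to simulate network latency."""
    time.sleep(0.05)
    return f"results for {query}"


async def run_concurrently(queries: list[str]) -> list[str]:
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the calls overlap instead of running one after another.
    tasks = [asyncio.to_thread(blocking_call, q) for q in queries]
    # gather returns results in the same order as the input queries.
    return await asyncio.gather(*tasks)


results = asyncio.run(run_concurrently(["lasagna recipe", "pizza dough"]))
print(results)
```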
File details
Details for the file oxylabs_ai_studio-0.2.20.tar.gz.
File metadata
- Download URL: oxylabs_ai_studio-0.2.20.tar.gz
- Upload date:
- Size: 2.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a5e699b840b90a02fab9f610695f2be8ddd52d92fc838bb828335381cef4221 |
| MD5 | 5b332480ecf062970a98157ec3ad3acc |
| BLAKE2b-256 | e6f3be85d45068d7c63a5b0dfdcb49eb67050c6dad7db8a5a83bdec81ece5edc |
File details
Details for the file oxylabs_ai_studio-0.2.20-py3-none-any.whl.
File metadata
- Download URL: oxylabs_ai_studio-0.2.20-py3-none-any.whl
- Upload date:
- Size: 16.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3e14ebb9bfce1339d08f375f37b8ac0725cacc670374cae6a03df4b7afe15391 |
| MD5 | 85d62cd149c65e09dc2446f8ca049317 |
| BLAKE2b-256 | fc32ea81eaabaf2569781ae17bc2d79c1692e5a44b759b242283847b18c52a6c |