
Evomi Python Client

The official Python client for Evomi — a powerful web scraping and proxy platform. Extract data from any website with AI-powered processing, browser rendering, and a global proxy network.

Installation

pip install evomi-client

Quick Start

from evomi_client import EvomiClient

# Initialize the client
client = EvomiClient(api_key="your-api-key")

# Scrape a webpage (run this inside an async function; see Basic Scraping below)
result = await client.scrape("https://example.com")
print(result["content"])

Core Features

  • Web Scraping — Extract content from any URL with automatic JS rendering detection
  • AI-Powered Extraction — Get structured data using natural language prompts
  • Crawling & Mapping — Discover and scrape entire websites
  • Proxy Network — Access residential, datacenter, and mobile proxies worldwide

Usage Examples

Basic Scraping

import asyncio
from evomi_client import EvomiClient

async def main():
    client = EvomiClient(api_key="your-api-key")
    
    # Simple scrape (auto-detects if JS rendering is needed)
    result = await client.scrape("https://example.com")
    print(result["content"])

asyncio.run(main())

AI-Powered Data Extraction

Extract structured data without writing selectors:

result = await client.scrape(
    "https://example.com/products",
    ai_enhance=True,
    ai_prompt="Extract all product names, prices, and availability"
)
print(result["ai_data"])

Browser Mode for JavaScript Sites

Force browser rendering for dynamic content:

result = await client.scrape(
    "https://spa-example.com",
    mode="browser",  # Forces headless browser
    wait_seconds=2   # Wait for dynamic content
)
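
Browser mode can also interact with the page before capture. The sketch below pairs js_instructions with execute_js (both documented in the scrape() reference below); the selector and script are placeholders for your own page:

result = await client.scrape(
    "https://spa-example.com",
    mode="browser",
    js_instructions=[{"click": ".load-more"}],  # click a "load more" button (placeholder selector)
    execute_js="window.scrollTo(0, document.body.scrollHeight)",  # then scroll to the bottom
    wait_seconds=2,
)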

Crawling Websites

Discover and scrape multiple pages:

result = await client.crawl(
    domain="example.com",
    max_urls=50,
    depth=2,
    url_pattern="/blog/.*"  # Only crawl blog pages
)

Synchronous Client

For non-async code:

from evomi_client import EvomiClientSync

client = EvomiClientSync(api_key="your-api-key")
result = client.scrape("https://example.com")
print(result["content"])

Proxy String Builder

Evomi provides a proxy network you can use with any HTTP client. Build proxy strings for tools like requests, httpx, or aiohttp:

from evomi_client import EvomiClient, ProxyConfig, ProxyType

client = EvomiClient(api_key="your-api-key")

# Build a proxy string for US residential proxy
proxy_string = await client.build_proxy_string(
    proxy_type=ProxyType.RESIDENTIAL,
    country="US",
    session="abc12345"  # Sticky session
)
print(proxy_string)
# Output: http://user:pass_country-US_session-abc12345@rp.evomi.com:1000
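
The returned string can be plugged straight into any HTTP client. A minimal sketch with requests (the target URL is just an example):

import requests

# Route a request through the proxy string built above
response = requests.get(
    "https://example.com",
    proxies={"http": proxy_string, "https": proxy_string},
)
print(response.status_code)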

Manual Proxy Configuration

from evomi_client import ProxyConfig, ProxyType, ProxyProtocol

config = ProxyConfig(
    proxy_type=ProxyType.RESIDENTIAL,
    protocol=ProxyProtocol.HTTP,
    country="US",
    city="New York",
    username="your-username",
    password="your-password"
)

proxy_string = config.build_proxy_string()

Proxy Types

Type        | Endpoint           | Use Case
Residential | rp.evomi.com:1000  | Human-like browsing, anti-bot bypass
Datacenter  | dcp.evomi.com:2000 | Fast, high-volume requests
Mobile      | mp.evomi.com:3000  | Highest trust, mobile-specific targets
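
The scraper itself selects a pool through the proxy_type parameter, so no manual proxy string is needed there. A sketch using the datacenter pool for fast, high-volume fetches:

result = await client.scrape(
    "https://example.com",
    proxy_type="datacenter",   # fast, high-volume pool (see table above)
    proxy_country="DE",
)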

API Reference

Scraping Operations

scrape(url, ...)

Scrape a single URL with configurable options.

result = await client.scrape(
    "https://example.com",
    mode="auto",           # "request", "browser", or "auto"
    output="markdown",     # "html", "markdown", "screenshot", "pdf"
    device="windows",      # "windows", "macos", "android"
    proxy_type="residential",
    proxy_country="US",
    proxy_session_id="abc123",
    wait_until="domcontentloaded",
    ai_enhance=True,
    ai_prompt="Extract product data",
    ai_source="markdown",
    js_instructions=[{"click": ".load-more"}],
    execute_js="window.scrollTo(0, document.body.scrollHeight)",
    wait_seconds=2,
    screenshot=False,
    pdf=False,
    excluded_tags=["nav", "footer"],
    excluded_selectors=[".ads"],
    block_resources=["image", "stylesheet"],
    additional_headers={"X-Custom": "value"},
    capture_headers=True,
    network_capture=[{"url_pattern": "/api/.*"}],
    async_mode=False,
    config_id="cfg_abc123",
    scheme_id="sch_abc123",
    extract_scheme=[{"label": "title", "type": "content", "selector": "h1"}],
    storage_id="stor_abc123",
    use_default_storage=False,
    no_html=False,
)

Parameter | Type | Default | Description
url | str | required | URL to scrape
mode | str | "auto" | Scraping mode: "request" (fast), "browser" (JS), "auto" (detect)
output | str | "markdown" | Output format: "html", "markdown", "screenshot", "pdf"
device | str | "windows" | Device type: "windows", "macos", "android"
proxy_type | str | "residential" | Proxy type: "datacenter", "residential"
proxy_country | str | "US" | Two-letter country code
proxy_session_id | str | None | Proxy session ID (6-8 chars)
wait_until | str | "domcontentloaded" | Wait condition: "load", "domcontentloaded", "networkidle", "commit"
ai_enhance | bool | False | Enable AI extraction
ai_prompt | str | None | Prompt for AI extraction
ai_source | str | None | AI source: "markdown", "screenshot"
ai_force_json | bool | True | Force AI response to valid JSON
js_instructions | list | None | JS actions: click, wait, fill, wait_for
execute_js | str | None | Raw JavaScript to execute
wait_seconds | int | 0 | Seconds to wait after page load
screenshot | bool | False | Capture screenshot
pdf | bool | False | Capture PDF
excluded_tags | list | None | HTML tags to remove
excluded_selectors | list | None | CSS selectors to remove
block_resources | list | None | Resource types to block
additional_headers | dict | None | Extra HTTP headers
capture_headers | bool | False | Capture response headers
network_capture | list | None | Network capture filters
async_mode | bool | False | Return immediately with task ID
config_id | str | None | Saved config ID
scheme_id | str | None | Saved extraction schema ID
extract_scheme | list | None | Inline extraction schema
storage_id | str | None | Storage config ID
use_default_storage | bool | False | Use default storage
no_html | bool | False | Exclude HTML from response
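
For deterministic, selector-based extraction without AI, extract_scheme accepts inline rules (the same format used by create_schema below); a minimal sketch with placeholder selectors:

result = await client.scrape(
    "https://example.com/product",
    extract_scheme=[
        {"label": "title", "type": "content", "selector": "h1"},
        {"label": "price", "type": "content", "selector": ".price"},
    ],
)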

crawl(domain, ...)

Crawl a website to discover and scrape multiple pages.

result = await client.crawl(
    domain="example.com",
    max_urls=100,
    depth=2,
    url_pattern="/blog/.*",
    scraper_config={"mode": "browser", "output": "markdown"},
    async_mode=False,
)

Parameter | Type | Default | Description
domain | str | required | Domain to crawl
max_urls | int | 100 | Maximum URLs to crawl
depth | int | 2 | Crawl depth
url_pattern | str | None | Regex pattern to filter URLs
scraper_config | dict | None | Config for scraping each page
async_mode | bool | False | Return immediately with task ID

map_website(domain, ...)

Discover URLs from a website via sitemaps, CommonCrawl, or crawling.

result = await client.map_website(
    domain="example.com",
    sources=["sitemap", "commoncrawl"],
    max_urls=500,
    url_pattern="/products/.*",
    check_if_live=False,
    depth=1,
    async_mode=False,
)

Parameter | Type | Default | Description
domain | str | required | Domain to map
sources | list | ["sitemap", "commoncrawl"] | Sources: "sitemap", "commoncrawl", "crawl"
max_urls | int | 500 | Maximum URLs to discover
url_pattern | str | None | Regex pattern to filter URLs
check_if_live | bool | False | Check if URLs are live
depth | int | 1 | Crawl depth if using crawl source
async_mode | bool | False | Return immediately with task ID

search_domains(query, ...)

Find domains by searching the web.

# Single query
result = await client.search_domains(
    query="e-commerce platforms",
    max_urls=20,
    region="us-en",
)

# Multiple queries (up to 10)
result = await client.search_domains(
    query=["web scraping tools", "data extraction services"],
    max_urls=20,
    region="us-en",
)

Parameter | Type | Default | Description
query | str or list | required | Search query or list of up to 10 queries
max_urls | int | 20 | Max domains per query (max: 100)
region | str | "us-en" | Region for results (e.g., "us-en", "de-de")

agent_request(message)

Send a natural language request to the AI agent.

result = await client.agent_request(
    "Scrape example.com and extract all product prices"
)

Parameter | Type | Default | Description
message | str | required | Natural language request

get_task_status(task_id, task_type)

Check the status of an async task.

result = await client.get_task_status(
    task_id="abc123",
    task_type="scrape"  # "scrape", "crawl", "map", "config_generate", "schema"
)

Parameter | Type | Default | Description
task_id | str | required | Task ID to check
task_type | str | "scrape" | Task type: "scrape", "crawl", "map", "config_generate", "schema"
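
Long-running jobs can be submitted with async_mode=True and polled until they finish. A rough sketch, assuming the submit response carries a task ID and the status payload includes a status field (key names here are hypothetical):

import asyncio

# Submit the crawl without waiting for completion
submitted = await client.crawl(domain="example.com", max_urls=50, async_mode=True)
task_id = submitted["task_id"]  # hypothetical key name

# Poll until the task reports a terminal state
while True:
    status = await client.get_task_status(task_id=task_id, task_type="crawl")
    if status.get("status") in ("completed", "failed"):  # hypothetical status values
        break
    await asyncio.sleep(5)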

Config Management

Save and reuse scrape configurations.

list_configs(...)

List all saved scrape configs.

configs = await client.list_configs(
    page=1,
    per_page=20,
    sort_by="created_at",
    sort_order="desc",
)

create_config(name, config)

Create a new scrape config.

config = await client.create_config(
    name="Product Scraper",
    config={"mode": "browser", "output": "markdown"}
)
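
A saved config can then be reused on any scrape via the config_id parameter (see the scrape() reference above):

result = await client.scrape(
    "https://example.com/product",
    config_id="cfg_abc123",  # ID of the saved config
)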

get_config(config_id)

Get a scrape config by ID.

config = await client.get_config("cfg_abc123")

update_config(config_id, ...)

Update an existing scrape config.

config = await client.update_config(
    "cfg_abc123",
    name="New Name",
    config={"mode": "request"}
)

delete_config(config_id)

Delete a scrape config.

await client.delete_config("cfg_abc123")

generate_config(name, prompt)

Generate a scrape config from natural language using AI.

config = await client.generate_config(
    name="Amazon Scraper",
    prompt="Scrape product title and price from Amazon product pages"
)

Schema Management

Define reusable structured data extraction schemas.

list_schemas(...)

List all saved extraction schemas.

schemas = await client.list_schemas(
    page=1,
    per_page=20,
    sort_by="created_at",
    sort_order="desc",
)

create_schema(name, config, ...)

Create a new extraction schema.

schema = await client.create_schema(
    name="Product Schema",
    config={
        "url": "https://example.com/product",
        "extract_scheme": [
            {"label": "title", "type": "content", "selector": "h1"},
            {"label": "price", "type": "content", "selector": ".price"}
        ]
    },
    test=True,  # Test the schema
    fix=False,  # Auto-fix issues
)
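
A saved schema can likewise be applied to a scrape through the scheme_id parameter:

result = await client.scrape(
    "https://example.com/product",
    scheme_id="sch_abc123",  # ID of the saved extraction schema
)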

get_schema(scheme_id)

Get an extraction schema by ID.

schema = await client.get_schema("sch_abc123")

update_schema(scheme_id, name, config, ...)

Update an existing extraction schema.

schema = await client.update_schema(
    "sch_abc123",
    name="Updated Schema",
    config={"url": "...", "extract_scheme": [...]},
    test=True,
)

delete_schema(scheme_id)

Delete an extraction schema.

await client.delete_schema("sch_abc123")

get_schema_status(scheme_id)

Get the test status of a schema.

status = await client.get_schema_status("sch_abc123")

Schedule Management

Run scrape configs on a recurring schedule.

list_schedules(...)

List all scheduled jobs.

schedules = await client.list_schedules(
    page=1,
    per_page=20,
    active_only=False,
)

create_schedule(name, config_id, interval_minutes, ...)

Create a new scheduled scrape job.

schedule = await client.create_schedule(
    name="Daily Price Check",
    config_id="cfg_abc123",
    interval_minutes=1440,  # Daily
    start_time="09:00",     # UTC
    stop_on_error=True,
)

get_schedule(schedule_id)

Get a scheduled job by ID.

schedule = await client.get_schedule("sched_abc123")

update_schedule(schedule_id, ...)

Update an existing scheduled job.

schedule = await client.update_schedule(
    "sched_abc123",
    name="New Name",
    interval_minutes=720,
)

delete_schedule(schedule_id)

Delete a scheduled job.

await client.delete_schedule("sched_abc123")

toggle_schedule(schedule_id)

Toggle a scheduled job active/inactive.

await client.toggle_schedule("sched_abc123")

list_schedule_runs(schedule_id, ...)

Get execution history for a scheduled job.

runs = await client.list_schedule_runs(
    "sched_abc123",
    page=1,
    per_page=20,
)

Storage Management

Connect cloud storage to automatically save scrape results.

list_storage_configs()

List all storage configurations.

configs = await client.list_storage_configs()

create_storage_config(name, storage_type, config, ...)

Create a new storage configuration.

# S3-compatible storage
storage = await client.create_storage_config(
    name="My S3",
    storage_type="s3_compatible",
    config={
        "bucket": "my-bucket",
        "region": "us-east-1",
        "access_key": "...",
        "secret_key": "...",
    },
    set_as_default=True,
)

# Google Cloud Storage
storage = await client.create_storage_config(
    name="My GCS",
    storage_type="gcs",
    config={
        "bucket": "my-bucket",
        "credentials_json": "...",
    },
)

# Azure Blob Storage
storage = await client.create_storage_config(
    name="My Azure",
    storage_type="azure_blob",
    config={
        "container": "my-container",
        "connection_string": "...",
    },
)
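
Results can then be routed to a connected storage target per scrape, either by ID or via the default flag (both parameters appear in the scrape() reference above):

result = await client.scrape(
    "https://example.com",
    storage_id="stor_abc123",    # write the result to a specific storage config
    # use_default_storage=True,  # or write to the default storage target instead
)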

update_storage_config(storage_id, ...)

Update an existing storage configuration.

storage = await client.update_storage_config(
    "stor_abc123",
    name="Renamed Storage",
    set_as_default=True,
)

delete_storage_config(storage_id)

Delete a storage configuration.

await client.delete_storage_config("stor_abc123")

Public API

Access proxy credentials and related data.

get_proxy_data()

Get detailed information about your proxy products.

data = await client.get_proxy_data()
# Returns: {"products": {"rp": {...}, "sdc": {...}, "mp": {...}}, ...}

get_targeting_options()

Get available targeting parameters for different proxy types.

options = await client.get_targeting_options()

get_scraper_data()

Get information about your Scraper API access.

data = await client.get_scraper_data()
# Returns: {"credits": ..., "concurrency_limit": ..., ...}

get_browser_data()

Get information about your Browser API access.

data = await client.get_browser_data()
# Returns: {"credits": ..., "concurrency_limit": ..., "endpoint": ..., ...}

rotate_session(session_id, product)

Force an IP address change for an existing proxy session.

result = await client.rotate_session(
    session_id="abc12345",
    product="rp"  # "rpc", "rp", "sdc", "mp"
)

generate_proxies(product, ...)

Generate proxy strings with specific targeting parameters.

proxies = await client.generate_proxies(
    product="rp",
    countries="US,GB,DE",
    city="New York",
    session="sticky",
    amount=10,
    protocol="http",
    lifetime=30,
    adblock=True,
)
# Returns plain text, one proxy per line

Parameter | Type | Description
product | str | Proxy product type
countries | str | ISO country codes, comma-separated
city | str | Target city name
region | str | Target region
isp | str | Target ISP name
session | str | "sticky" or "hard"
amount | int | Number of proxies to generate (1-100)
format | str | Output format (1, 2, or 3)
prepend_protocol | bool | Prepend protocol to proxy string
protocol | str | "http" or "socks5"
lifetime | int | Session duration in minutes
adblock | bool | Enable ad-blocking
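
Because the call returns plain text with one proxy per line, splitting the response yields a ready-to-use list; a small sketch:

proxy_list = [line for line in proxies.splitlines() if line.strip()]
print(len(proxy_list), "proxies generated")
print(proxy_list[0])  # usable with requests/httpx as shown under Proxy String Builder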

Account Info

get_account_info()

Get account info including credit balance.

info = await client.get_account_info()
print(info.get("credits", "N/A"))

Proxy Helpers

build_proxy_config(...)

Build a proxy configuration with credentials from the Public API.

from evomi_client import ProxyType, ProxyProtocol, ResidentialMode

config = await client.build_proxy_config(
    proxy_type=ProxyType.RESIDENTIAL,
    protocol=ProxyProtocol.HTTP,
    country="US",
    city="New York",
    region="California",
    continent="north.america",
    isp="att",
    session="abc12345",
    hardsession=None,
    lifetime=30,
    mode=ResidentialMode.SPEED,
    latency=100,
    fraudscore=20,
    device="windows",
    http3=True,
)

build_proxy_string(...)

Build a proxy connection string directly.

proxy_string = await client.build_proxy_string(
    proxy_type=ProxyType.RESIDENTIAL,
    country="US",
    session="abc12345",
)

Configuration

API Key

Set your API key via environment variable:

export EVOMI_API_KEY="your-api-key"

Or pass it directly:

client = EvomiClient(api_key="your-api-key")

Proxy Credentials (Optional)

If you have separate credentials for the proxy API:

client = EvomiClient(
    api_key="your-api-key",
    public_api_key="your-proxy-api-key"
)

Error Handling

import httpx

try:
    result = await client.scrape("https://example.com")
except httpx.HTTPStatusError as e:
    print(f"API error: {e.response.status_code}")
    print(f"Details: {e.response.text}")

Credits & Pricing

All operations consume credits:

  • Base request: 1 credit
  • Browser mode: 5x multiplier
  • Residential proxy: 2x multiplier
  • AI enhancement: +30 credits
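
As a rough illustration, assuming the multipliers stack multiplicatively (they are listed separately above, so treat this as an estimate only):

# One browser-mode scrape through a residential proxy with AI enhancement
estimated_cost = 1 * 5 * 2 + 30  # base x browser x residential, plus AI enhancement
print(estimated_cost)            # 40 credits (estimate only)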

Credit usage is returned in response headers and mirrored onto the result dict:

print(result["_credits_used"])
print(result["_credits_remaining"])

License

MIT
