Web Search Proxy implementation
Unique Search Proxy
Unified web egress proxy for search engines and crawlers. Three publishable packages in this repo:
| PyPI name | Module | Role |
|---|---|---|
| unique-search-proxy | unique_search_proxy_client.web | FastAPI server (proxy pod) |
| unique-search-proxy-sdk | unique_search_proxy_sdk | Async HTTP client for callers |
| unique-search-proxy-core | unique_search_proxy_core | Shared Pydantic types (no FastAPI) |
flowchart LR
    subgraph caller["Caller pod"]
        SDK["unique_search_proxy_sdk"]
    end
    subgraph proxy["Proxy pod"]
        API["unique_search_proxy_client.web"]
        Pool["HttpClientPool"]
    end
    Core["unique_search_proxy_core"]
    Internet["Google / public web"]
    SDK --> Core
    API --> Core
    SDK -->|"POST /v1/search"| API
    API --> Pool
    Pool --> Internet
- Server owns registry, secrets, Prometheus, and egress (HttpClientPool).
- SDK wraps the OpenAPI contract; depends on core for GoogleConfig, errors, etc.
- Core is server-free and safe to install without FastAPI/uvicorn.
Quick Start
Prerequisites
- Python 3.12+
- uv for dependency management
Installation
uv sync
cp .env.example .env
# Edit .env: set GOOGLE_SEARCH_API_KEY and GOOGLE_SEARCH_ENGINE_ID for live /v1/search
Running
uv run python -m unique_search_proxy_client.web.app
# or
uv run uvicorn unique_search_proxy_client.web.app:app --reload --port 2349
Python SDK (unique-search-proxy-sdk)
Workspace path: connectors/unique_search_proxy/unique_search_proxy_sdk/. Generated from the server OpenAPI spec via openapi-python-client.
| Path | Role |
|---|---|
| unique_search_proxy_sdk/_generated/ | Regenerated httpx client + attrs models |
| unique_search_proxy_sdk/client.py | UniqueSearchProxyClient facade |
| connectors/unique_search_proxy/unique_search_proxy_client/openapi.json | Exported spec (codegen input) |
Regenerate after API changes
cd connectors/unique_search_proxy/unique_search_proxy_client
uv sync
uv run python scripts/generate_sdk.py
Usage
from unique_search_proxy_sdk import UniqueSearchProxyClient

async with UniqueSearchProxyClient("http://unique-search-proxy:2349") as client:
    await client.health()
    result = await client.search.search("unique ag", engine="google", fetchSize=10)
    crawl = await client.crawl.crawl(["https://example.com"], crawler="basic")
    # Low-level: one generated function per route
    raw = client.openapi  # OpenAPIClient from _generated
| Facade method | HTTP |
|---|---|
| health() | GET /health |
| ready() | GET /ready |
| search.search(...) | POST /v1/search |
| crawl.crawl(...) | POST /v1/crawl |
Deployment config JSON Schema, defaults, and LLM call-schema projection live in unique_search_proxy_core (not HTTP). Assistants-core and tooling import those helpers directly.
Non-success responses raise the same ProxyError subclasses as the service. Generated request/response models live under sdk._generated.models.
For tests, pass an httpx.AsyncClient with ASGITransport(app=create_app()) and run the app lifespan so in-app egress is initialized.
Other OpenAPI codegen tools
| Tool | Notes |
|---|---|
| OpenAPI Generator | Broad language support; verbose Python output |
| openapi-python-client | Used here — async httpx + attrs |
| datamodel-code-generator | Pydantic models only |
| Kiota | Multi-language SDKs |
API (application)
| Endpoint | Description |
|---|---|
| GET /health | Liveness |
| GET /ready | Readiness (httpx pool + registered providers) |
| GET /v1/configuration/providers | Registered search engine and crawler ids |
| POST /v1/search | Execute search (flat request: engine, query, provider params, timeout) |
| POST /v1/crawl | Crawl URLs via configured crawler (flat request: crawler, urls, timeout, …) |
| GET /metrics | Prometheus scrape endpoint (when enabled) |
| /docs | OpenAPI (Swagger UI): use Try it out and the request-body Examples dropdown on /v1/search and /v1/crawl |
Set PROMETHEUS_ENABLED=false (the ENABLED field of the monitoring settings, PrometheusSettings) to disable metrics. With WORKERS > 1, the entrypoint sets PROMETHEUS_MULTIPROC_DIR so that metrics are aggregated correctly across uvicorn workers.
Settings are colocated with each component and use env prefixes:
| Component | Prefix / vars | Example |
|---|---|---|
| Google search | (no prefix) | GOOGLE_SEARCH_API_KEY, GOOGLE_SEARCH_ENGINE_ID |
| HTTP client | HTTP_CLIENT_ | HTTP_CLIENT_PROXY_HOST, HTTP_CLIENT_POOL_TIMEOUT_SECONDS |
| Prometheus | PROMETHEUS_ | PROMETHEUS_ENABLED |
| Container entrypoint | (shell) | HOST, PORT, WORKERS, LOG_LEVEL, PROMETHEUS_MULTIPROC_DIR |
Copy .env.example to .env for an annotated template of all settings. Shared helpers live in web/settings/.
Runtime discovery (GET /v1/configuration/providers)
Lists search engine and crawler ids registered in the proxy pod (depends on env/secrets). Use this for health checks and capability discovery at runtime.
Deployment config JSON Schema, defaults, and LLM call-schema projection are core library concerns — import from unique_search_proxy_core.providers.schema and unique_search_proxy_core.search_engines.call_schema (or the crawl equivalents). Assistants-core embeds those shapes in tool manifests rather than calling extra HTTP routes on the proxy.
Search (POST /v1/search)
Flat request body: all execution fields at the top level (engine, query, optional provider knobs, timeout). Tooling merges deployment config with LLM invocation in core (merge_config_and_invocation) before calling the proxy.
{
  "engine": "google",
  "query": "example query",
  "fetchSize": 10,
  "gl": "de",
  "dateRestrict": "d7",
  "timeout": 30
}
- engine: registered search engine id (discriminator)
- query, fetchSize, optional provider knobs, timeout: flat execution payload on POST /v1/search
- Deployment config (ExposableParam with expose + value): resolved in core before building the flat search request, not a separate HTTP surface on the proxy
- LLM call schema: unique_search_proxy_core.search_engines.call_schema.resolve_search_call_schema(...) with optional strict=False for nullable exposed fields
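As a rough illustration of how deployment config and an LLM invocation might be combined into the flat body, here is a stdlib-only sketch. `ExposableParamSketch`, `merge_sketch`, and `build_search_body` are hypothetical names, not the core API; the actual signature and rules of merge_config_and_invocation are not shown in this README. The sketch assumes that an LLM-supplied value overrides the configured value only when the parameter is exposed.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ExposableParamSketch:
    """Hypothetical stand-in for ExposableParam: a configured value plus an expose flag."""

    value: Any
    expose: bool = False


def merge_sketch(
    config: dict[str, ExposableParamSketch], invocation: dict[str, Any]
) -> dict[str, Any]:
    """Resolve each knob; LLM-supplied values win only for exposed params (assumed rule)."""
    resolved: dict[str, Any] = {}
    for name, param in config.items():
        supplied = invocation.get(name)
        chosen = supplied if (param.expose and supplied is not None) else param.value
        if chosen is not None:  # leave unset optional knobs out of the flat body
            resolved[name] = chosen
    return resolved


def build_search_body(
    engine: str, query: str, timeout: int, knobs: dict[str, Any]
) -> dict[str, Any]:
    """Place every execution field at the top level of the request body."""
    return {"engine": engine, "query": query, **knobs, "timeout": timeout}


deployment_config = {
    "fetchSize": ExposableParamSketch(value=10, expose=True),
    "gl": ExposableParamSketch(value="de", expose=False),
    "dateRestrict": ExposableParamSketch(value=None, expose=True),
}
llm_invocation = {"fetchSize": 5, "gl": "us", "dateRestrict": "d7"}

body = build_search_body(
    "google", "example query", 30, merge_sketch(deployment_config, llm_invocation)
)
payload = json.dumps(body)  # JSON text for the POST /v1/search request body
```

In this sketch the non-exposed `gl` keeps its configured value even though the invocation tried to change it, while the exposed `fetchSize` and `dateRestrict` take the invocation's values.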
Response:
{
  "engine": "google",
  "query": "example query",
  "raw": {
    "pages": [
      {
        "pageIndex": 1,
        "offset": 1,
        "requestedCount": 10,
        "response": {}
      }
    ]
  },
  "curated": [
    {
      "url": "https://example.com",
      "title": "Example",
      "snippet": "...",
      "content": ""
    }
  ]
}
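A caller that works with the raw JSON rather than the SDK models could read the response along these lines. This is a minimal sketch based only on the field names in the example above; the helper names `curated_hits` and `requested_total` are hypothetical, and the generated SDK models may expose the same data differently.

```python
import json
from typing import Any

# The example response from above, condensed.
SAMPLE_RESPONSE = """
{
  "engine": "google",
  "query": "example query",
  "raw": {"pages": [{"pageIndex": 1, "offset": 1, "requestedCount": 10, "response": {}}]},
  "curated": [{"url": "https://example.com", "title": "Example", "snippet": "...", "content": ""}]
}
"""


def curated_hits(payload: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (title, url) pairs from the `curated` list."""
    return [(hit["title"], hit["url"]) for hit in payload.get("curated", [])]


def requested_total(payload: dict[str, Any]) -> int:
    """Sum `requestedCount` over the raw provider pages."""
    pages = payload.get("raw", {}).get("pages", [])
    return sum(page.get("requestedCount", 0) for page in pages)


parsed = json.loads(SAMPLE_RESPONSE)
hits = curated_hits(parsed)
total = requested_total(parsed)
```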
Crawl (POST /v1/crawl)
{
  "urls": ["https://example.com"],
  "crawler": "basic",
  "timeout": 30
}
Errors
Non-2xx responses use a structured envelope:
{
  "error": {
    "code": "ENGINE_NOT_CONFIGURED",
    "message": "Engine 'google' is not registered or not configured",
    "engine": "google",
    "retryable": false
  }
}
Project Structure
connectors/unique_search_proxy/
├── unique_search_proxy/
│   ├── sdk/                      # HTTP SDK (callers → proxy API)
│   │   ├── _generated/           # openapi-python-client output (regenerate via scripts/)
│   │   ├── client.py             # UniqueSearchProxyClient facade
│   │   ├── converters.py         # App Pydantic config → generated models
│   │   └── errors.py             # Maps API error envelope → ProxyError
│   ├── openapi.json              # Exported OpenAPI (codegen input)
│   ├── scripts/generate_sdk.py
│   └── web/                      # FastAPI application (proxy pod)
│       ├── app.py                # App factory + lifespan (HttpClientPool)
│       ├── settings/
│       ├── api/
│       │   ├── health.py
│       │   └── v1/
│       │       ├── configuration.py
│       │       ├── search.py
│       │       └── crawl.py
│       ├── monitoring/
│       └── core/
│           ├── client/           # Egress pool: application only, not SDK
│           ├── search_engines/
│           └── crawlers/
├── tests/
└── deploy/
Engines and crawlers register via web/core/registry.py at application startup.
Development
uv run ruff check .
uv run ruff format .
uv run pytest
uv run basedpyright
License
Proprietary - Unique AG