
LlamaIndex integration for Parallel AI Search and Extract APIs

Project description

Parallel AI Tool

This tool provides integration between LlamaIndex and Parallel AI's Search and Extract APIs, enabling LLM agents to perform web research and content extraction.

  • Search API: Returns structured, compressed excerpts from web search results optimized for LLM consumption
  • Extract API: Converts public URLs into clean, LLM-optimized markdown, handling JavaScript-heavy pages and PDFs

Installation

pip install llama-index-tools-parallel-web-systems

Setup

  1. Get your API key from Parallel AI Platform
  2. Set your API key as an environment variable or pass it directly
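For example, a minimal sketch of wiring the key in through an environment variable. The variable name PARALLEL_API_KEY is just a convention used here, not a name the package requires, since the key is passed explicitly:

```python
import os

# Read the key from the environment; PARALLEL_API_KEY is a conventional
# name chosen for this sketch, not one the package looks up itself.
api_key = os.environ.get("PARALLEL_API_KEY", "your-api-key-here")

# Then pass it to the tool spec:
# parallel_tool = ParallelWebSystemsToolSpec(api_key=api_key)
```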

Usage

import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
from llama_index.tools.parallel_web_systems import ParallelWebSystemsToolSpec

# Initialize the tool with your API key
parallel_tool = ParallelWebSystemsToolSpec(
    api_key="your-api-key-here",
)

# Create an agent with the tool
agent = FunctionAgent(
    tools=parallel_tool.to_tool_list(),
    llm=OpenAI(model="gpt-4o"),
)

# Use the agent to perform web research. agent.run is a coroutine, so
# drive it with asyncio.run (or plain `await` inside a notebook).
async def main():
    response = await agent.run("What was the GDP of France in 2023?")
    print(response)

asyncio.run(main())

Available Functions

search

Search the web using Parallel AI's Search API. Returns structured excerpts optimized for LLM consumption.

Parameters:

  • objective (str, optional): Natural-language description of what to search for
  • search_queries (list[str], optional): Traditional keyword search queries (max 5)
  • max_results (int): Maximum results to return, 1-40 (default: 10)
  • mode (str, optional): 'one-shot' for comprehensive results, 'agentic' for token-efficient results
  • excerpts (dict, optional): Excerpt settings, e.g., {'max_chars_per_result': 1500}
  • source_policy (dict, optional): Domain and date preferences
  • fetch_policy (dict, optional): Cache vs live content policy

At least one of objective or search_queries must be provided.

Example:

from llama_index.tools.parallel_web_systems import ParallelWebSystemsToolSpec

parallel_tool = ParallelWebSystemsToolSpec(api_key="your-api-key")

# Search with an objective
results = parallel_tool.search(
    objective="What are the latest developments in renewable energy?",
    max_results=5,
    mode="one-shot",
)

for doc in results:
    print(f"Title: {doc.metadata.get('title')}")
    print(f"URL: {doc.metadata.get('url')}")
    print(f"Excerpts: {doc.text[:300]}...")
    print("---")

# Search with specific queries
results = parallel_tool.search(
    search_queries=["solar power 2024", "wind energy statistics"],
    max_results=8,
    mode="agentic",
)
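The source_policy and fetch_policy dicts are passed through to the Search API. A hedged sketch of their shape follows; the field names (include_domains, exclude_domains, max_age_seconds) are assumptions about Parallel's API and should be verified against its current documentation:

```python
# Hypothetical source_policy: steer results toward or away from domains.
# Field names are assumptions; check Parallel's API docs for the real ones.
source_policy = {
    "include_domains": ["iea.org", "nature.com"],  # prefer these sources
    "exclude_domains": ["pinterest.com"],          # skip these sources
}

# Hypothetical fetch_policy: accept cached content up to one day old.
fetch_policy = {"max_age_seconds": 86400}

# results = parallel_tool.search(
#     objective="Latest renewable energy capacity figures",
#     source_policy=source_policy,
#     fetch_policy=fetch_policy,
# )
```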

extract

Extract clean, structured content from web pages using Parallel AI's Extract API.

Parameters:

  • urls (list[str]): List of URLs to extract content from
  • objective (str, optional): Natural language objective to focus extraction
  • search_queries (list[str], optional): Specific keyword queries to focus extraction
  • excerpts (bool | dict): Include excerpts (default: True). Can be a dict like {'max_chars_per_result': 2000}
  • full_content (bool | dict): Include full page content (default: False)
  • fetch_policy (dict, optional): Cache vs live content policy

Example:

from llama_index.tools.parallel_web_systems import ParallelWebSystemsToolSpec

parallel_tool = ParallelWebSystemsToolSpec(api_key="your-api-key")

# Extract content focused on a specific objective
results = parallel_tool.extract(
    urls=["https://en.wikipedia.org/wiki/Artificial_intelligence"],
    objective="What are the main applications and ethical concerns of AI?",
    excerpts={"max_chars_per_result": 2000},
)

for doc in results:
    print(f"Title: {doc.metadata.get('title')}")
    print(f"Content: {doc.text[:500]}...")

# Extract full content from multiple URLs
results = parallel_tool.extract(
    urls=[
        "https://example.com/article1",
        "https://example.com/article2",
    ],
    full_content=True,
    excerpts=False,
)

Error Handling

The tool includes built-in error handling. If an API call fails, it returns an empty list, allowing your agent to continue:

results = parallel_tool.search(objective="test query")
if not results:
    print("No results found or API error occurred")

For extract operations, failed URLs are included in results with error information:

results = parallel_tool.extract(urls=["https://invalid-url.com/"])
for doc in results:
    if doc.metadata.get("error_type"):
        print(f"Failed: {doc.metadata['url']} - {doc.text}")
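Building on that metadata flag, one pattern is to partition extract results into successes and failures before further processing. In this sketch, plain dicts stand in for the Document objects the tool actually returns; the metadata keys ("url", "error_type") follow the example above, while the error_type value shown is made up:

```python
# Stand-in results; real calls return llama_index Document objects that
# expose the same metadata keys ("url", "error_type").
docs = [
    {"metadata": {"url": "https://ok.example.com"}, "text": "page content"},
    {"metadata": {"url": "https://invalid-url.com/", "error_type": "fetch_error"},
     "text": "could not fetch"},
]

# Documents carrying an error_type are failures; the rest succeeded.
failed = [d for d in docs if d["metadata"].get("error_type")]
succeeded = [d for d in docs if not d["metadata"].get("error_type")]
```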

License

MIT


File details

Details for the file llama_index_tools_parallel_web_systems-0.1.0.tar.gz.

File metadata

  • Download URL: llama_index_tools_parallel_web_systems-0.1.0.tar.gz
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24

File hashes

Hashes for llama_index_tools_parallel_web_systems-0.1.0.tar.gz

  • SHA256: de0d8a8479fb32f86340d83c47bdc2ff6c8233feca03a36788be7237e9d400aa
  • MD5: 99f37d876ac2df56362dcdedd3e04fed
  • BLAKE2b-256: ae62b3f9e6e83f76934ac003a188fa4f02c52455eff172c04d7f725e43920358


File details

Details for the file llama_index_tools_parallel_web_systems-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llama_index_tools_parallel_web_systems-0.1.0-py3-none-any.whl
  • Size: 6.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24

File hashes

Hashes for llama_index_tools_parallel_web_systems-0.1.0-py3-none-any.whl

  • SHA256: dc850f0b6694d5b8460e62e9d7aaad0f07b4c8eb71e87c63059e4607f9b8dd4c
  • MD5: 0f5664940553716d9e134aba78f8e49a
  • BLAKE2b-256: cfd0ece0bffc0d8571a04c6ab1a750c2879447950f6cb1e1c69d16ddac2fa46f

