II Researcher Package
Project description
II-Researcher
A powerful deep search agent that uses BAML functions to perform intelligent web searches and generate comprehensive answers to questions.
For more details about our project, please visit our blog post.
Features
- 🔍 Intelligent web search using Tavily and SerpAPI search providers
- 🕸️ Web scraping and content extraction with multiple providers (Firecrawl, Browser, BS4, Tavily)
- 🧠 Multi-step reasoning and reflection
- ⚙️ Configurable LLM models for different tasks
- ⚡ Asynchronous operation for better performance
- 📝 Comprehensive answer generation with references
- 🛠️ Support for customizable pipelines and reasoning methods for deep search
🎬 Demo
https://github.com/user-attachments/assets/d862b900-a06b-46c6-9694-cccd1edac6f6
🎬 MCP
https://github.com/user-attachments/assets/2c1542f0-0e1b-44d5-8fc5-0446a07b3821
🔧 Required Software
- Python 3.7+ (required for local development)
- Docker and Docker Compose (required for containerized deployment)
- Node.js and npm (required for local frontend development)
🛠️ Installation and Setup
Option 1: Install from PyPI
pip install ii-researcher
Option 2: Install from Source
1. Clone the repository:
git clone https://github.com/Intelligent-Internet/ii-researcher.git
cd ii-researcher
2. Install the package in development mode:
pip install -e .
3. Set up your environment variables:
# API Keys
export OPENAI_API_KEY="your-openai-api-key"
export TAVILY_API_KEY="your-tavily-api-key" # required when SEARCH_PROVIDER is set to tavily
export SERPAPI_API_KEY="your-serpapi-api-key" # required when SEARCH_PROVIDER is set to serpapi
export FIRECRAWL_API_KEY="your-firecrawl-api-key" # required when SCRAPER_PROVIDER is set to firecrawl
# API Endpoints
export OPENAI_BASE_URL="http://localhost:4000"
# Compress Configuration
export COMPRESS_EMBEDDING_MODEL="text-embedding-3-large"
export COMPRESS_SIMILARITY_THRESHOLD="0.3"
export COMPRESS_MAX_OUTPUT_WORDS="4096"
export COMPRESS_MAX_INPUT_WORDS="32000"
# Search and Scraping Configuration
export SEARCH_PROVIDER="serpapi" # Options: 'serpapi' | 'tavily'
export SCRAPER_PROVIDER="firecrawl" # Options: 'firecrawl' | 'bs' | 'browser' | 'tavily_extract'
# Timeouts and Performance Settings
export SEARCH_PROCESS_TIMEOUT="300" # in seconds
export SEARCH_QUERY_TIMEOUT="20" # in seconds
export SCRAPE_URL_TIMEOUT="30" # in seconds
export STEP_SLEEP="100" # in milliseconds
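To illustrate what the compression settings above control: content scraped from the web is condensed before it reaches the model, and a threshold like COMPRESS_SIMILARITY_THRESHOLD typically decides which text chunks are similar enough to the query to keep. The following is a minimal, hypothetical sketch of that idea (not the package's actual implementation; function names and the chunk format are assumptions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def compress_chunks(query_emb, chunks, threshold=0.3, max_words=4096):
    """Keep chunks similar enough to the query, capped at max_words total.

    `chunks` is a list of (text, embedding) pairs; in a real pipeline the
    embeddings would come from the COMPRESS_EMBEDDING_MODEL.
    """
    kept, words = [], 0
    for text, emb in chunks:
        if cosine_similarity(query_emb, emb) < threshold:
            continue  # discard chunks unrelated to the query
        n = len(text.split())
        if words + n > max_words:
            break  # respect the COMPRESS_MAX_OUTPUT_WORDS-style cap
        kept.append(text)
        words += n
    return " ".join(kept)
```

Raising the threshold keeps fewer, more relevant chunks; lowering it keeps more context at the cost of noise.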
Environment variables when using LLM-based compression (optional, for better compression quality):
export USE_LLM_COMPRESSOR="TRUE"
export FAST_LLM="gemini-lite" # The model used for context compression
Environment variables when running in Pipeline mode:
# Model Configuration
export STRATEGIC_LLM="gpt-4o" # The model used to choose the next action
export SMART_LLM="gpt-4o" # The model used for the other tasks in the pipeline
Environment variables when running in Reasoning mode:
export R_MODEL=r1 # The model used for reasoning
export R_TEMPERATURE=0.2 # Temperature for the reasoning model
export R_REPORT_MODEL=gpt-4o # The model used for writing the report
export R_PRESENCE_PENALTY=0 # presence_penalty for the reasoning model
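Since these are plain environment variables, they can be collected into a typed config object at startup. A minimal sketch with the documented defaults (hypothetical helper names; the package's own configuration loading may differ):

```python
import os
from dataclasses import dataclass

@dataclass
class ReasoningConfig:
    model: str
    temperature: float
    report_model: str
    presence_penalty: float

def load_reasoning_config() -> ReasoningConfig:
    """Read the R_* variables, falling back to the documented defaults."""
    return ReasoningConfig(
        model=os.getenv("R_MODEL", "r1"),
        temperature=float(os.getenv("R_TEMPERATURE", "0.2")),
        report_model=os.getenv("R_REPORT_MODEL", "gpt-4o"),
        presence_penalty=float(os.getenv("R_PRESENCE_PENALTY", "0")),
    )
```

Parsing once at startup surfaces malformed values (e.g. a non-numeric R_TEMPERATURE) immediately rather than mid-run.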
4. Configure and Run LiteLLM (Local LLM Server):
# Install LiteLLM
pip install litellm
# Create litellm_config.yaml file
cat > litellm_config.yaml << EOL
model_list:
  - model_name: text-embedding-3-large
    litellm_params:
      model: text-embedding-3-large
      api_key: ${OPENAI_API_KEY}
  - model_name: gpt-4o
    litellm_params:
      model: gpt-4o
      api_key: ${OPENAI_API_KEY}
  - model_name: o1-mini
    litellm_params:
      model: o1-mini
      api_key: ${OPENAI_API_KEY}
  - model_name: r1
    litellm_params:
      model: deepseek-reasoner
      api_base: https://api.deepseek.com/beta
      api_key: ${DEEPSEEK_API_KEY}

litellm_settings:
  drop_params: true
EOL
# Start LiteLLM server
litellm --config litellm_config.yaml
The LiteLLM server will run on http://localhost:4000 by default.
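Before pointing the agent at the proxy, it can help to confirm something is actually listening there. A small, hypothetical reachability check (the /health/liveliness path is a LiteLLM proxy endpoint as of recent versions; adjust the URL if your server runs elsewhere):

```python
import urllib.error
import urllib.request

def litellm_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print(litellm_reachable("http://localhost:4000/health/liveliness"))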
5. (Optional) Configure and Run LiteLLM with OpenRouter:
cat > litellm_config.yaml << EOL
model_list:
  - model_name: text-embedding-3-large
    litellm_params:
      model: text-embedding-3-large
      api_key: ${OPENAI_API_KEY}
  - model_name: gpt-4o
    litellm_params:
      model: openai/chatgpt-4o-latest
      api_base: https://openrouter.ai/api/v1
      api_key: "your_openrouter_api_key_here"
  - model_name: r1
    litellm_params:
      model: deepseek/deepseek-r1
      api_base: https://openrouter.ai/api/v1
      api_key: "your_openrouter_api_key_here"
  - model_name: gemini-lite
    litellm_params:
      model: google/gemini-2.0-flash-lite-001
      api_base: https://openrouter.ai/api/v1
      api_key: "your_openrouter_api_key_here"

litellm_settings:
  drop_params: true
EOL
🖥️ Usage
Using the CLI
Run the deep search agent with your question:
There are two modes:
- Pipeline Mode: This mode is suitable for general questions and tasks.
python ii_researcher/cli.py --question "your question here"
- Reasoning Mode: This mode is suitable for complex questions and tasks.
python ii_researcher/cli.py --question "your question here" --use-reasoning --stream
Using MCP
1. Set up your environment variables. Copy the .env.example file to create a new file named .env:
cp .env.example .env
2. Edit the .env file to add your API keys and configure other settings.
3. Integrate the MCP server with Claude via Claude Desktop Integration by installing it:
mcp install mcp/server.py -f .env
4. Restart your Claude app.
Using the Web Interface
- Install and Run Backend API (required for serving the frontend):
# Start the API server
python api.py
The API server will run on http://localhost:8000
- Setup env for Frontend
Create a .env file in the frontend directory with the following content:
NEXT_PUBLIC_API_URL=http://localhost:8000
- Install and Run Frontend:
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Start the development server
npm run dev
The frontend will be available at http://localhost:3000
🐳 Run with Docker
Important: Make sure you have set up all environment variables from step 3 before proceeding.
Start the services using Docker Compose:
# Build and start all services
docker compose up --build -d
The following services will be started:
- frontend: Next.js frontend application
- api: FastAPI backend service
- litellm: LiteLLM proxy server
The services will be available at:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- LiteLLM Server: http://localhost:4000
- View logs:
# View all logs
docker compose logs -f
# View specific service logs
docker compose logs -f frontend
docker compose logs -f api
docker compose logs -f litellm
- Stop the services:
docker compose down
🛠️ Running QwQ Model with SGLang
To run the Qwen/QwQ-32B model using SGLang, use the following command:
python3 -m sglang.launch_server --model-path Qwen/QwQ-32B --host 0.0.0.0 --port 30000 --tp 8 --context-length 131072
💡 Acknowledgments
II-Researcher is inspired by and built with the support of the open-source community:
- LiteLLM – Used for efficient AI model integration
- node-DeepResearch – Prompt inspiration
- gpt-researcher – Prompt inspiration, web scraper tool
- BAML – Structured outputs
Download files
Source Distribution
Built Distribution
File details
Details for the file ii_researcher-0.1.3.tar.gz.
File metadata
- Download URL: ii_researcher-0.1.3.tar.gz
- Upload date:
- Size: 100.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b1870650e75d7f4f5d04d863e23019ede9711124bfb2cb7079a27f1b0a0e210b |
| MD5 | 9edcd7b3dc9888f568f7c20161d0252b |
| BLAKE2b-256 | d1dbe1d9c6c703acd3bde42c35de7bc5cc16dae45a6065e3a5a4c6ae99393096 |
Provenance
The following attestation bundles were made for ii_researcher-0.1.3.tar.gz:
Publisher: release_pypi.yaml on Intelligent-Internet/ii-researcher
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ii_researcher-0.1.3.tar.gz
- Subject digest: b1870650e75d7f4f5d04d863e23019ede9711124bfb2cb7079a27f1b0a0e210b
- Sigstore transparency entry: 196615686
- Sigstore integration time:
- Permalink: Intelligent-Internet/ii-researcher@8232733a9816c469d2b099b6265b584703e21d6b
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/Intelligent-Internet
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release_pypi.yaml@8232733a9816c469d2b099b6265b584703e21d6b
- Trigger Event: push
File details
Details for the file ii_researcher-0.1.3-py3-none-any.whl.
File metadata
- Download URL: ii_researcher-0.1.3-py3-none-any.whl
- Upload date:
- Size: 120.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9b324855cbf90a7c5298d431a5ff9031f5921ee8c80b71a8de5abe93e24e8dbf |
| MD5 | 5814a78eb73c498bc38a9744f1aa10a1 |
| BLAKE2b-256 | ad6c5a9116a04edb9b12ae571b37cf0da43bbaa9aa455bc8310679710a2d37c4 |
Provenance
The following attestation bundles were made for ii_researcher-0.1.3-py3-none-any.whl:
Publisher: release_pypi.yaml on Intelligent-Internet/ii-researcher
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ii_researcher-0.1.3-py3-none-any.whl
- Subject digest: 9b324855cbf90a7c5298d431a5ff9031f5921ee8c80b71a8de5abe93e24e8dbf
- Sigstore transparency entry: 196615688
- Sigstore integration time:
- Permalink: Intelligent-Internet/ii-researcher@8232733a9816c469d2b099b6265b584703e21d6b
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/Intelligent-Internet
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release_pypi.yaml@8232733a9816c469d2b099b6265b584703e21d6b
- Trigger Event: push