SmartWebSearch is a Python package that combines the Tavily search API with Retrieval-Augmented Generation (RAG), LLM-powered query expansion, and web content extraction to perform intelligent, deep web searches with automated summarization.
Project description
Smart Web Search Package
SmartWebSearch is a Python package that combines the Tavily search API with Retrieval-Augmented Generation (RAG), LLM-powered query expansion, and web content extraction to perform intelligent, deep web searches with automated summarization.
Package Version
- 1.3.1
Features
- 🌐 Web Search – Uses Tavily API to fetch relevant search results.
- 🧠 Query Expansion – Leverages LLMs (e.g., DeepSeek) to decompose complex queries and generate auxiliary searches.
- 📄 Content Extraction – Fetches full page content using headless Chrome and filters noise.
- 🔍 RAG Pipeline – Embeds documents with multilingual models (e.g., multilingual-e5-base) and retrieves context-aware chunks.
- 📝 Summarization – Summarizes retrieved content using LLMs.
Environment
- Python 3.12 or above
- Windows 11 Pro 64-bit (macOS haven't tested)
- Python Packages (requests, bs4, selenium, markdownify, tavily, numpy, sentence_transformers, langchain_text_splitters)
Installation
Method 1
- PYPI: Install the SmartWebSearch package from PYPI through command
pip install smartwebsearch
Method 2
- The SmartWebSearch Package: Install the SmartWebSearch package here or with git command
git clone https://github.com/LittleWai07/smart-web-search-package.git(Git is required to run this command) - Required Python Packages: Install the required Python packages by command
pip install -r requirements.txt
API Keys
You need two API keys
- Tavily API key: Sign up and get the API key here (1,000 free quotas per month)
- OpenAI Compatible API key: eg., from OpenAI, DeepSeek, etc.
🔒 Security Note
For security reasons, never hard-code your API keys directly in your source code.
Instead, store them in environment variables, a .env file or a *.json file and load them into your program.
Quick Start
Fill in the API keys and following required parameters manually.
- Tavily API Key: The Tavily search API key (The key starts with
tvly-dev-). - OpenAI Compatible API Key: The API key for the OpenAI Compatible API platform (The key usually starts with
sk-). - AI Model: The id of the AI model used for summarization. (Default:
deepseek-chat) - OpenAI Compatible API Base URL: The base url of the OpenAI Compatible API platform (The URL usually end with
/chat/completions) (Default:https://api.deepseek.com/chat/completions)
"""
SmartWebSearch
~~~~~~~~~~~~
An example of how to use the SmartWebSearch package.
"""
# Import the SmartWebSearch package
import SmartWebSearch as sws
# --------------------------------------------------------------------
# You can configure for different API providers by changing the
# model and base_url. Below are some examples:
# --------------------------------------------------------------------
# Example 1: Using DeepSeek (default)
search = sws.SmartWebSearch(
"<Tavily API Key>",
"<OpenAI Compatible API Key>",
model="deepseek-chat",
openai_comp_api_base_url="https://api.deepseek.com/chat/completions"
)
# Example 2: Using OpenAI
# search = sws.SmartWebSearch(
# "<Tavily API Key>",
# "<OpenAI Compatible API Key>",
# model="gpt-4-turbo-preview",
# openai_comp_api_base_url="https://api.openai.com/v1/chat/completions"
# )
# --------------------------------------------------------------------
# Run a search
# --------------------------------------------------------------------
prompt = input("Enter a prompt: ")
print("=== Normal Search (Tavily summaries) ===")
print(search.search(prompt))
print("\n=== Deep Search (full page content + RAG) ===")
print(search.deepsearch(prompt))
Note: The documentation of this package will be completed in the future.
License
This project is licensed under the MIT License - see the LICENSE file for details
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file smartwebsearch-1.3.1.tar.gz.
File metadata
- Download URL: smartwebsearch-1.3.1.tar.gz
- Upload date:
- Size: 26.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c84152631dc3ca7e04f6c0ceffac8ddcf2bcdcf090ddbad44a9202989e771ac1
|
|
| MD5 |
8f92ee65dc199c2e20536aefa78476b6
|
|
| BLAKE2b-256 |
78c77b59a6f3b689b7fa713dd7f1daabf2a696a98eb2c7445f1daeaaaa106b47
|
File details
Details for the file smartwebsearch-1.3.1-py3-none-any.whl.
File metadata
- Download URL: smartwebsearch-1.3.1-py3-none-any.whl
- Upload date:
- Size: 28.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8c81d8c92e24fe6dad0d0139cf7aee384b81b68d7553b720fc451bff56e4b3c9
|
|
| MD5 |
fe15059e6a259c6172460359ea4831f3
|
|
| BLAKE2b-256 |
7431cdc4aa761846a65f69163cd79dc08c204266c3ac26e08bebc21458b3c394
|