Async web search library supporting Google, Wikipedia, and arXiv
Web Search
Async web search library supporting Google Custom Search, Wikipedia, and arXiv APIs.
It queries multiple sources concurrently and returns cleaned, formatted results.
🌟 Features
- ⚡ Asynchronous Searching: Perform searches concurrently across multiple sources
- 🔗 Multi-Source Support: Query Google Custom Search, Wikipedia, and arXiv
- 🧹 Content extraction and cleaning
- 🔧 Configurable Search Parameters: Adjust maximum results, preview length, and sources.
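To make the concurrency claim above concrete, here is a minimal sketch of querying several sources at once with `asyncio.gather`. It does not use this library: `fake_source` and `search_all` are hypothetical stand-ins that sleep instead of doing network I/O, so the snippet only illustrates the general pattern, not how `WebSearch` is implemented.

```python
import asyncio
import time
from typing import List


async def fake_source(name: str, delay: float) -> List[str]:
    """Hypothetical stand-in for one search backend; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return [f"{name}: result for demo query"]


async def search_all() -> List[str]:
    # gather() runs the three coroutines concurrently and returns their
    # results in the same order as the arguments.
    batches = await asyncio.gather(
        fake_source("google", 0.2),
        fake_source("wikipedia", 0.2),
        fake_source("arxiv", 0.2),
    )
    # Flatten the per-source lists into a single list.
    return [item for batch in batches for item in batch]


start = time.perf_counter()
results = asyncio.run(search_all())
elapsed = time.perf_counter() - start

print(results)
# Total time is roughly one delay rather than the sum of all three.
print(f"elapsed: {elapsed:.2f}s")
```

Because the waits overlap, adding another source of similar latency should barely change the total time.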
📋 Prerequisites
- 🐍 Python 3.8 or newer
- 🔑 API keys and configuration:
  - Google Search: Requires a Google API key and a Custom Search Engine (CSE) ID.
  - arXiv: No API key required.
  - Wikipedia: No API key required.
Set the environment variables for the Google API:
export GOOGLE_API_KEY="your_google_api_key"
export CSE_ID="your_cse_id"
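If you prefer to fail fast when these variables are missing, a small check before searching can help. The sketch below uses only the standard library; `load_google_credentials` is a hypothetical helper, not part of this package, and how the library itself reads these variables is not shown here.

```python
import os
from typing import Mapping, Tuple


def load_google_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Hypothetical helper: return (api_key, cse_id) or raise if either is unset."""
    missing = [name for name in ("GOOGLE_API_KEY", "CSE_ID") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["GOOGLE_API_KEY"], env["CSE_ID"]


# Demo with an explicit mapping so the example runs without real credentials.
api_key, cse_id = load_google_credentials(
    {"GOOGLE_API_KEY": "your_google_api_key", "CSE_ID": "your_cse_id"}
)
print(api_key, cse_id)
```

In real use you would call `load_google_credentials()` with no arguments and pass the two values to `GoogleSearchConfig(api_key=..., cse_id=...)`, as in Example 2.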
📦 Installation
pip install async-web-search
🛠️ Usage
Example 1: Search across multiple sources
import asyncio
from web_search import WebSearch, WebSearchConfig

async def main():
    config = WebSearchConfig(sources=["google", "arxiv"])  # sources to query
    results = await WebSearch(config).search("quantum computing")
    print(results)
asyncio.run(main())  # await is only valid inside a coroutine
Example 2: Google Search
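import asyncio
from web_search import GoogleSearch, GoogleSearchConfig

async def main():
    # Credentials are passed explicitly here instead of via environment variables.
    config = GoogleSearchConfig(
        api_key="your_google_api_key",
        cse_id="your_cse_id",
        max_results=5,
    )
    results = await GoogleSearch(config)._search("quantum computing")
    for result in results:
        print(result)

asyncio.run(main())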
Example 3: Wikipedia Search
import asyncio
from web_search import WikipediaSearch, BaseConfig

async def main():
    wiki_config = BaseConfig(max_results=5, max_preview_chars=500)
    results = await WikipediaSearch(wiki_config)._search("deep learning")
    for result in results:
        print(result)

asyncio.run(main())
Example 4: ArXiv Search
import asyncio
from web_search import ArxivSearch, BaseConfig

async def main():
    arxiv_config = BaseConfig(max_results=3, max_preview_chars=800)
    results = await ArxivSearch(arxiv_config)._search("neural networks")
    for result in results:
        print(result)

asyncio.run(main())
📘 API Overview
🔧 Configuration
- BaseConfig: Shared configuration for all sources (e.g., max_results, max_preview_chars).
- GoogleSearchConfig: Google-specific settings (e.g., api_key, cse_id).
- WebSearchConfig: Configuration for the overall search process (e.g., sources to query).
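As a rough illustration of what `max_results` and `max_preview_chars` control, the sketch below caps a result list and truncates long previews. It is an assumption-based example, not the library's code: `apply_limits` is a hypothetical function, and the dict shape with `title` and `preview` keys is invented here because the real result type is not documented in this README.

```python
from typing import Dict, List


def apply_limits(
    results: List[Dict[str, str]], max_results: int, max_preview_chars: int
) -> List[Dict[str, str]]:
    """Hypothetical helper: keep the first max_results items and shorten previews."""
    trimmed = []
    for item in results[:max_results]:
        preview = item["preview"]
        if len(preview) > max_preview_chars:
            # Cut at the limit and mark the truncation.
            preview = preview[:max_preview_chars].rstrip() + "..."
        # Copy the item so the caller's data is left untouched.
        trimmed.append({**item, "preview": preview})
    return trimmed


raw = [
    {"title": "A", "preview": "x" * 50},
    {"title": "B", "preview": "short"},
    {"title": "C", "preview": "dropped by max_results"},
]
limited = apply_limits(raw, max_results=2, max_preview_chars=10)
for item in limited:
    print(item["title"], item["preview"])
```

Here the third item is dropped by `max_results=2`, and only the first preview is long enough to be truncated.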
📚 Classes
- WebSearch: Entry point for performing searches across multiple sources.
- GoogleSearch: Handles searches via Google Custom Search Engine API.
- WikipediaSearch: Searches Wikipedia and retrieves article previews.
- ArxivSearch: Queries arXiv for academic papers.
⚙️ Methods
- search(query: str): Main search method for WebSearch.
- _search(query: str): Source-specific search logic for GoogleSearch, WikipediaSearch, and ArxivSearch.
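The split between a public `search` and per-source `_search` methods can be pictured with the following sketch. All class names here (`SketchSource`, `SketchWebSearch`, and so on) are hypothetical and the backends return canned strings; this shows one plausible way to structure such an API, including isolating a failing source, and is not taken from the package's source code.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchSource(ABC):
    """Hypothetical base class: each backend implements its own _search."""

    @abstractmethod
    async def _search(self, query: str) -> List[str]:
        ...


class SketchWiki(SketchSource):
    async def _search(self, query: str) -> List[str]:
        return [f"wikipedia: {query}"]


class SketchArxiv(SketchSource):
    async def _search(self, query: str) -> List[str]:
        return [f"arxiv: {query}"]


class SketchFailing(SketchSource):
    async def _search(self, query: str) -> List[str]:
        raise ConnectionError("backend unavailable")


class SketchWebSearch:
    """Hypothetical aggregator exposing a single public search method."""

    def __init__(self, sources: Dict[str, SketchSource]) -> None:
        self.sources = sources

    async def search(self, query: str) -> List[str]:
        # return_exceptions=True keeps one failing backend from cancelling the rest.
        outcomes = await asyncio.gather(
            *(source._search(query) for source in self.sources.values()),
            return_exceptions=True,
        )
        merged: List[str] = []
        for name, outcome in zip(self.sources, outcomes):
            if isinstance(outcome, Exception):
                merged.append(f"{name}: failed ({outcome})")
            else:
                merged.extend(outcome)
        return merged


engine = SketchWebSearch(
    {"wikipedia": SketchWiki(), "arxiv": SketchArxiv(), "broken": SketchFailing()}
)
merged = asyncio.run(engine.search("deep learning"))
print(merged)
```

Keeping the per-source logic behind `_search` means a new backend only has to implement one coroutine to plug into the aggregator.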
🤝 Contributing
We welcome contributions! To contribute:
- Fork the repository.
- Create a new branch (git checkout -b feature-name).
- Commit your changes (git commit -am "Add new feature").
- Push to the branch (git push origin feature-name).
- Open a pull request.
🧪 Running Tests
pytest -v
License
MIT
File details
Details for the file async_web_search-0.2.0.tar.gz.
File metadata
- Download URL: async_web_search-0.2.0.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | b84cfc1eaae8240ffed33bdc429974777acedbc1c3df428418a4e8656253930d
MD5 | 96ee2959164730737bd3efb441a91044
BLAKE2b-256 | be020f5392c746f43fb89f5d84c832997e6a5e42258badf60ed452ba98c1a14e
File details
Details for the file async_web_search-0.2.0-py3-none-any.whl.
File metadata
- Download URL: async_web_search-0.2.0-py3-none-any.whl
- Upload date:
- Size: 8.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6d71795fcb35a8df3045aa8b860fddf7ff9e4d4e0133b49c339c2e327c2642cb
MD5 | 82489ec039b5beb61056578af868c234
BLAKE2b-256 | 4501494150de7a56b0fa44f729d9bbbd4e4bad21769de5440c9e8deb7d44296e