AI-powered Wikipedia navigation using semantic similarity
WikiRaces 🏁
AI-powered Wikipedia navigation using semantic similarity. WikiRaces finds intelligent paths between Wikipedia articles by understanding content semantically, not just following random links.
Features ✨
- Semantic Navigation: Uses sentence transformers to understand article content and find meaningful connections
- Smart Path Finding: Avoids dead ends and cycles while navigating toward the target
- Real-time Progress: Progress bars show the current article and the navigation confidence as the search runs
- Robust Error Handling: Gracefully handles missing pages, disambiguation pages, and network issues
- Local AI Models: No external API dependencies - everything runs locally
Installation 📦
pip install wikiraces
Quick Start 🚀
from wikiraces import WikiBot
# Create a bot to navigate from Python to Artificial Intelligence
bot = WikiBot("Python (programming language)", "Artificial intelligence")
# Run the navigation
success = bot.run()
if success:
    print(f"Found path in {len(bot.path) - 1} steps!")
    print(" -> ".join(bot.path))
else:
    print("Could not find a path")
Advanced Usage 🔧
Customize Search Parameters
# Limit the number of candidate links to consider at each step
bot = WikiBot("Source Article", "Target Article", limit=20)
# Check if articles exist before starting
if bot.exists("Some Article"):
    print("Article exists!")
# Get links from any Wikipedia page
links = bot.links("Python (programming language)")
print(f"Found {len(links)} outgoing links")
Semantic Similarity
from wikiraces.embed import most_similar_with_scores
# Find most semantically similar articles
candidates = ["Machine Learning", "Data Science", "Web Development"]
similar = most_similar_with_scores("Artificial Intelligence", candidates)
for article, score in similar:
    print(f"{article}: {score:.3f}")
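For intuition, ranking by semantic similarity typically comes down to comparing embedding vectors with cosine similarity. The sketch below is not the package's implementation: `cosine` and `rank_by_cosine` are hypothetical helper names, and the three-dimensional vectors are invented for illustration (a real sentence-transformer model produces much larger vectors).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_cosine(target_vec, candidate_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(target_vec, vec)) for name, vec in candidate_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, invented for this example only.
target = [0.9, 0.1, 0.0]
candidates = {
    "Machine Learning": [0.8, 0.2, 0.1],
    "Data Science": [0.5, 0.5, 0.2],
    "Web Development": [0.1, 0.2, 0.9],
}

ranking = rank_by_cosine(target, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The output has the same (article, score) shape as the loop above, so the toy ranking can be swapped for real scores without changing the printing code.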
How It Works 🧠
- Start at the source Wikipedia article
- Extract all outgoing links from the current article
- Filter out dead ends and previously visited pages
- Rank candidate links by semantic similarity to the target
- Rerank using article summaries for better context understanding
- Move to the most promising next article
- Repeat until reaching the target or getting stuck
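The steps above can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the code WikiRaces ships: `navigate`, `get_links`, `title_score`, `summary_score`, and `max_steps` are hypothetical names, the link graph is a hand-written dict instead of live Wikipedia, and the scores are a fixed lookup table standing in for embedding similarity.

```python
def navigate(source, target, get_links, title_score, summary_score,
             limit=15, max_steps=50):
    """Greedy walk from source to target; returns the path, or None if stuck."""
    path, visited, current = [source], {source}, source
    for _ in range(max_steps):
        if current == target:
            return path
        # Drop visited pages (cycles) and pages with no outgoing links (dead ends).
        # Checking each candidate's links is cheap here but costly against a live API.
        candidates = [
            link for link in get_links(current)
            if link not in visited and (link == target or get_links(link))
        ]
        if not candidates:
            return None  # stuck: nowhere left to go
        # First pass: rank by the cheap score and keep the top `limit`.
        candidates.sort(key=lambda link: title_score(link, target), reverse=True)
        shortlist = candidates[:limit]
        # Second pass: rerank the shortlist with the richer score, then move.
        current = max(shortlist, key=lambda link: summary_score(link, target))
        visited.add(current)
        path.append(current)
    return path if current == target else None


# Toy link graph, invented for this example.
graph = {
    "Python": ["Snake", "Programming language", "Monty Python"],
    "Snake": ["Reptile"],
    "Reptile": [],
    "Monty Python": ["Python"],
    "Programming language": ["Computer science", "Python"],
    "Computer science": ["Artificial intelligence", "Programming language"],
    "Artificial intelligence": [],
}

# Fixed relevance table standing in for semantic similarity to the target.
relevance = {
    "Snake": 0.1, "Monty Python": 0.2, "Programming language": 0.6,
    "Computer science": 0.8, "Artificial intelligence": 1.0,
}


def get_links(page):
    return graph.get(page, [])


def toy_score(link, target):
    return relevance.get(link, 0.0)


path = navigate("Python", "Artificial intelligence", get_links, toy_score, toy_score)
print(" -> ".join(path))
```

Because the walk is greedy, it commits to the best-looking link at each step and never backtracks, which is why a run can end in "getting stuck" even when a path exists.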
API Reference 📚
WikiBot Class
class WikiBot:
    def __init__(self, source: str, destination: str, limit: int = 15): ...
    def run(self) -> bool: ...
    def exists(self, page: str) -> bool: ...
    def links(self, page: str) -> list[str]: ...
Parameters:
- source: Starting Wikipedia article title
- destination: Target Wikipedia article title
- limit: Maximum number of candidate links to consider (default: 15)
Returns:
- run(): True if path found, False otherwise
- exists(): True if Wikipedia page exists
- links(): List of outgoing links from the page
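For code that depends on a `WikiBot`-like object, one option is to describe the documented methods with `typing.Protocol` so tests can use an offline stand-in instead of hitting Wikipedia. In the sketch below only the three method signatures come from the reference above; `Navigator`, `FakeBot`, `describe`, and the `graph` argument are hypothetical names introduced for illustration.

```python
from typing import Protocol


class Navigator(Protocol):
    """Hypothetical protocol mirroring the documented WikiBot methods."""

    def run(self) -> bool: ...
    def exists(self, page: str) -> bool: ...
    def links(self, page: str) -> list[str]: ...


class FakeBot:
    """Offline stand-in backed by a dict, intended for tests only."""

    def __init__(self, source: str, destination: str, graph: dict[str, list[str]]):
        self.source = source
        self.destination = destination
        self.graph = graph
        self.path = [source]

    def exists(self, page: str) -> bool:
        return page in self.graph

    def links(self, page: str) -> list[str]:
        return list(self.graph.get(page, []))

    def run(self) -> bool:
        # Trivial strategy: succeed only if the destination is one hop away.
        if self.destination in self.links(self.source):
            self.path.append(self.destination)
            return True
        return False


def describe(bot: Navigator, page: str) -> str:
    """Works with any object that satisfies the protocol."""
    if not bot.exists(page):
        return f"{page}: missing"
    return f"{page}: {len(bot.links(page))} outgoing links"


bot = FakeBot("A", "B", {"A": ["B", "C"], "B": [], "C": []})
print(describe(bot, "A"))
```

Since protocols are structural, a real `WikiBot` instance should also be accepted by `describe` without inheriting from `Navigator`, provided its methods match the signatures listed above.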
Development 🛠️
# Clone the repository
git clone https://github.com/markshteyn/wikiraces.git
cd wikiraces
# Install with Poetry
poetry install
# Run tests
poetry run pytest
# Run with verbose output
poetry run pytest -v -s
Requirements 📋
- Python 3.9+
- sentence-transformers
- wikipedia
- numpy
- tqdm
License 📄
MIT License - see LICENSE file for details.
Contributing 🤝
Contributions welcome! Please feel free to submit a Pull Request.
Acknowledgments 🙏
- Built with sentence-transformers for semantic understanding
- Uses the wikipedia library for API access
- Progress bars powered by tqdm