LocalSearch: An open source framework and dataset for webscale local search.
AgentSearch [ΨΦ]: A Comprehensive Agent-First Framework and Dataset for Webscale Search
AgentSearch is a tool that lets you operate a webscale search engine locally, catering to both Large Language Models (LLMs) and human users. This open-source initiative provides access to over one billion high-quality embeddings sourced from a wide array of content, including selectively filtered Creative Commons data and the entirety of arXiv, Wikipedia, and Project Gutenberg.
Features of AgentSearch
- Gated Access: Controlled and secure access to the search engine, ensuring data integrity and privacy.
- Offline Support: Ability to operate in a fully offline environment.
- Customizable: Upload your own local data or tailor the provided datasets according to your needs.
- API Endpoint: AgentSearch offers fully managed access through a dedicated API, facilitating easy and efficient integration into various workflows.
Quickstart Guide for AgentSearch
Follow this guide for a streamlined setup and demonstration of the AgentSearch project.
Prerequisites
Make sure Docker is installed on your system. If not, download and install it from Docker's official website.
Quick Setup
- Install the AgentSearch client by executing:
git clone https://github.com/AgentSearch-AI/agent-search.git && cd agent-search
pip install -e .
Running a Query
- To perform a query and see AgentSearch in action, use:
python agent_search/script/run_query.py query --query="What is Fermat's last theorem?"
Note that this command assumes you have followed the steps below to launch your local agent-first search engine. For remote access to our search engine, please register for a free API key at AgentSearch.
Local Setup and Initialization
- Database Population: Populate the SQLite database with this command:
python agent_search/script/populate_dbs.py populate_sqlite
This creates a SQLite database, open_web_search.db, in the data directory. This script can be readily adapted to your own bespoke datasets. For a direct installation of the 1TB data into the database, please use [insert link].
- Start Qdrant (vector database) Service with Docker: Run the Qdrant service in a Docker container with this command, which sets up the necessary ports and storage:
docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant
For installation guidance on Qdrant, refer to their documentation.
- Run the Server: Launch the AgentSearch server:
python agent_search/app/server.py
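Before running a query, it can help to confirm that the local pieces are in place. The following is a minimal sketch using only the Python standard library; it is not part of AgentSearch itself. It lists the tables in the SQLite file and checks whether anything is listening on the Qdrant port. The path data/open_web_search.db and port 6333 come from the setup steps; the function names and the localhost host are illustrative assumptions.

```python
import socket
import sqlite3
from pathlib import Path


def list_sqlite_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    path = Path(db_path)
    if not path.is_file():
        # sqlite3.connect would silently create an empty file, so check first.
        raise FileNotFoundError(f"No database found at {path}")
    conn = sqlite3.connect(str(path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Path and port are taken from the setup steps; the host is an assumption.
    try:
        print("SQLite tables:", list_sqlite_tables("data/open_web_search.db"))
    except FileNotFoundError as exc:
        print(exc)
    print("Qdrant reachable on 6333:", port_is_open("localhost", 6333))
```

Run it from the project root so that the relative database path resolves. An open port only shows that some service is listening there; it does not confirm that the service is Qdrant or that any collections are loaded.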
Additional Notes
- Run all commands from the root directory of the AgentSearch project.
- Replace the query in the run command with your desired search query.
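To swap in different queries without retyping the command, one option is a small wrapper around the script. This is a rough sketch, not an AgentSearch API: build_query_command and run_query are hypothetical helper names, and the only things taken from this guide are the script path, the query subcommand, and the --query flag.

```python
import shlex
import subprocess
import sys


def build_query_command(query, script="agent_search/script/run_query.py"):
    """Build the argument list for the run_query script shown in this guide."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # Passing a list (no shell) means quotes in the query need no escaping.
    return [sys.executable, script, "query", f"--query={query}"]


def run_query(query):
    """Run the query script from the project root and return its stdout."""
    result = subprocess.run(
        build_query_command(query), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    cmd = build_query_command("What is Fermat's last theorem?")
    # shlex.join renders the command in a form that can be pasted into a shell.
    print(shlex.join(cmd))
```

Calling run_query requires the local server and databases to be running, as described in the setup steps; build_query_command alone has no such requirement.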
Hashes for agent_search-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | f0ce768df95febf337ffa996bca6e64ebbad24dad6245e564bee2f5f0557d8e8
MD5 | 382817d9f0243fbbea36a3edbdc87000
BLAKE2b-256 | 220835826bcd1a9db00f19d40f38fc1b71a3fbd5db88c7a0b47dde5be98c14c1
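A downloaded wheel can be checked against the SHA256 digest listed here. The sketch below uses only the standard library; the function names are illustrative, and the local file path in the usage note is an assumption about where the wheel was saved.

```python
import hashlib
import hmac

# SHA256 digest listed for agent_search-0.0.1-py3-none-any.whl.
PUBLISHED_SHA256 = "f0ce768df95febf337ffa996bca6e64ebbad24dad6245e564bee2f5f0557d8e8"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=PUBLISHED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```

For example, matches_published_hash("agent_search-0.0.1-py3-none-any.whl") returns True only if the file in the current directory is byte-for-byte the one the digest was computed from.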