LocalSearch: An open source framework and dataset for webscale local search.

Project description

AgentSearch [ΨΦ]: A Comprehensive Agent-First Framework and Dataset for Webscale Search

AgentSearch lets you run a webscale search engine locally, serving both Large Language Models (LLMs) and human users. This open-source project provides access to over one billion high-quality embeddings drawn from a wide range of content, including selectively filtered Creative Commons data and the entirety of arXiv, Wikipedia, and Project Gutenberg.

Features of AgentSearch

  • Gated Access: Controlled and secure access to the search engine, ensuring data integrity and privacy.
  • Offline Support: Ability to operate in a fully offline environment.
  • Customizable: Upload your own local data or tailor the provided datasets according to your needs.
  • API Endpoint: AgentSearch offers fully managed access through a dedicated API, making it easy to integrate into existing workflows.

Quickstart Guide for AgentSearch

Follow this guide for a streamlined setup and demonstration of the AgentSearch project.

Prerequisites

Make sure Docker is installed on your system. If not, download and install it from Docker's official website.

Quick Setup

  1. Install the AgentSearch client by executing:

    git clone https://github.com/AgentSearch-AI/agent-search.git && cd agent-search
    pip install -e .
    

Running a Query

  • To perform a query and witness AgentSearch in action, use:

    python agent_search/script/run_query.py query --query="What is Fermat's last theorem?"
    

    This command assumes you have already completed the Local Setup and Initialization steps below and that your local agent-first search engine is running. For remote access to our search engine, please register for a free API key at AgentSearch.
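
    If you want to issue several queries from Python rather than retyping the command, one option is a thin wrapper around the documented script. The following is a minimal sketch, not part of the AgentSearch package: the helper name build_query_command is hypothetical, and only the script path, the query subcommand, and the --query flag come from the command above.

```python
from __future__ import annotations

import shlex
import sys

# Script path, subcommand, and flag are copied from the quickstart command.
QUERY_SCRIPT = "agent_search/script/run_query.py"


def build_query_command(query: str, python: str = sys.executable) -> list[str]:
    """Return the argv list for the documented run_query.py invocation.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    # Passing the query as a single argv element avoids shell-quoting issues
    # with apostrophes and spaces.
    return [python, QUERY_SCRIPT, "query", f"--query={query}"]


if __name__ == "__main__":
    cmd = build_query_command("What is Fermat's last theorem?")
    # Print a copy-pasteable form of the command.
    print(shlex.join(cmd))
    # To actually execute it, run from the project root with the local
    # services started:
    #   import subprocess; subprocess.run(cmd, check=True)
```

    Building the argument list separately from running it keeps the quoting logic easy to check before any services are up.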

Local Setup and Initialization

  1. Database Population:

    • Populate the SQLite database with this command:

      python agent_search/script/populate_dbs.py populate_sqlite
      

      This creates a SQLite database, open_web_search.db, in the data directory. The script can be readily adapted to your own bespoke datasets. For a direct installation of the 1TB data into the database, please use [insert link].

  2. Start Qdrant (vector database) Service with Docker:

    • Run Qdrant service in a Docker container with this command, which sets up the necessary ports and storage:

      docker run -p 6333:6333 -p 6334:6334 \
          -v $(pwd)/qdrant_storage:/qdrant/storage:z \
          qdrant/qdrant
      

      For installation guidance on Qdrant, refer to their documentation.

  3. Run the Server:

    • Launch the AgentSearch server:

      python agent_search/app/server.py
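
Before running a query, it can help to confirm that the pieces from the steps above are in place. The sketch below is an illustration, not part of AgentSearch: it assumes the database sits at data/open_web_search.db relative to the project root, that Qdrant is reachable on localhost at the two ports published by the docker run command (6333 for REST, 6334 for gRPC), and the helper names table_row_counts and port_is_open are hypothetical. It uses only the Python standard library.

```python
from __future__ import annotations

import socket
import sqlite3
from contextlib import closing
from pathlib import Path

# Assumed locations, based on the setup steps above.
DB_PATH = Path("data/open_web_search.db")
QDRANT_PORTS = (6333, 6334)  # ports published by the docker run command


def table_row_counts(db_path: Path | str) -> dict[str, int]:
    """Open the SQLite file read-only and map each user table to its row count."""
    # mode=ro prevents sqlite3 from silently creating an empty database
    # when the path is wrong.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            quoted = name.replace('"', '""')  # escape quotes in identifiers
            counts[name] = conn.execute(
                f'SELECT COUNT(*) FROM "{quoted}"'
            ).fetchone()[0]
        return counts


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if DB_PATH.exists():
        print("SQLite tables:", table_row_counts(DB_PATH))
    else:
        print(f"{DB_PATH} not found; run populate_dbs.py populate_sqlite first")
    for port in QDRANT_PORTS:
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"Qdrant port {port}: {state}")
```

An open port only shows that something is listening; it does not confirm that the listener is Qdrant or that any collections have been loaded.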
      

Additional Notes

  • Run all commands from the root directory of the AgentSearch project.
  • Replace the query in the run command with your desired search query.


Download files

Source Distribution

agent_search-0.0.1.tar.gz (159.0 kB)

Built Distribution

agent_search-0.0.1-py3-none-any.whl (158.7 kB, Python 3)
