Sahara
Extended storage, searchable memory and instant retrieval.
Sahara turns folders on your computer into searchable memory. Find files by meaning, ask questions with cited sources, and expose the same local index to MCP clients. External drives, MinIO, and AWS storage are optional extensions, not prerequisites.
Local-first: indexing and semantic search run on your computer. No account, API key, storage bucket, or additional drive is required for the core search experience.
Latest release: v0.2.1 (June 7, 2026) adds index-only setup, multiple content roots, verified offload/fetch, one-command Claude Desktop setup, and trusted sahara-memory packaging. See the changelog.
Fictional documents shown in a generic MCP client. Sahara retrieves from the configured local index, cites its sources, and reports when a requested detail is absent.
What Sahara Does
- Searches PDFs, DOCX files, notes, code, and other text documents by meaning
- Answers questions over indexed files with source paths and supporting snippets
- Indexes multiple folders without copying them to a storage backend
- Exposes read-only search and Q&A tools through MCP
- Optionally syncs selected folders to a drive, NAS, MinIO, or AWS S3
- Can offload verified stored files while keeping their indexed content searchable
Sahara is a single-user CLI and local retrieval service. It is not a hosted cloud service, autonomous agent, or general filesystem access layer.
Quick Start
Sahara requires Python 3.11 or newer.
The Python distribution is named sahara-memory, but it installs the sahara
command. Do not run pip install sahara; that name belongs to the unrelated OpenStack
project.
Install Sahara from PyPI:
python3 -m pip install "sahara-memory[search,mcp]"
sahara init --mode basic --folder ~/Documents
sahara index
sahara search "my tax return from 2024" --snippet
On Windows, use py -3.11 -m pip instead of python3 -m pip.
The first sahara index downloads a local embedding model of roughly 200 MB.
Hugging Face may show an unauthenticated-download warning; no account or token is
required.
Add more folders whenever you need them:
sahara folder add ~/Projects
sahara index
Every folder added this way remains index-only unless you explicitly enable storage sync for it.
Ask Questions
Semantic search does not require an LLM. sahara ask adds an optional answer-generation
step after Sahara retrieves the relevant local passages.
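To illustrate why retrieval needs no LLM, here is a toy sketch of ranking passages by cosine similarity between vectors. It is not Sahara's implementation: the three-dimensional vectors, file names, and function names are made up for the example, and a real embedding model produces much longer vectors.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_passages(query_vec: list[float], index: list[tuple], k: int = 2) -> list[tuple]:
    """Return the k best (score, source_path, snippet) tuples, highest score first."""
    scored = [(cosine(query_vec, vec), path, snippet) for path, snippet, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical index entries: (source path, snippet, hand-written toy vector).
INDEX = [
    ("lease.pdf", "Pets are allowed with a deposit.", [0.9, 0.1, 0.0]),
    ("taxes.pdf", "2024 federal return summary.", [0.0, 0.2, 0.9]),
    ("notes.md", "Call the landlord about the dog.", [0.7, 0.3, 0.1]),
]

# A toy query vector pointing along the first axis.
results = top_passages([1.0, 0.0, 0.0], INDEX, k=2)
for score, path, snippet in results:
    print(f"{score:.3f}  {path}  {snippet}")
```

An answer provider, when configured, would only see the passages selected by a step like this one; the ranking itself is plain arithmetic.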
Local answers with Ollama
Install Ollama, then download Sahara's default model:
ollama pull mistral
sahara ask --snippet "what does the lease say about pets?"
The current Mistral download is approximately 4.4 GB. New Sahara installations use
Ollama as the answer provider. If Ollama is not already running, launch the application
or run ollama serve in another terminal.
OpenAI without Ollama
Ollama is not required when you prefer OpenAI:
export OPENAI_API_KEY="your-api-key"
sahara config set answer_provider openai
sahara ask --snippet "what does the lease say about pets?"
Windows PowerShell:
$env:OPENAI_API_KEY = "your-api-key"
sahara config set answer_provider openai
sahara ask --snippet "what does the lease say about pets?"
Sahara stores the provider preference, not the API key. When OpenAI is selected, the question and retrieved snippets needed to answer it are sent to OpenAI. OpenAI API billing is separate from a ChatGPT subscription.
See Answer Provider Setup for installation, model selection, privacy details, and troubleshooting.
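The privacy statement above says that only the question and the retrieved snippets leave your machine when OpenAI is selected. A minimal sketch of that idea follows; the field names (path, snippet, full_text) and the function are hypothetical and do not describe Sahara's actual request format.

```python
def build_answer_request(question: str, passages: list[dict], max_chars: int = 500) -> dict:
    """Collect only the question plus trimmed snippets and their source paths."""
    return {
        "question": question,
        "context": [
            # Copy two fields explicitly; anything else on the passage stays local.
            {"source": p["path"], "snippet": p["snippet"][:max_chars]}
            for p in passages
        ],
    }


# Hypothetical retrieved passage that also carries the whole document text.
passages = [
    {
        "path": "Documents/lease.pdf",
        "snippet": "Pets are allowed with a deposit.",
        "full_text": "(entire document text, never included in the request)",
    }
]

request = build_answer_request("what does the lease say about pets?", passages)
print(request)
```

Building the outgoing payload from an explicit allow-list of fields, rather than forwarding whole records, keeps the boundary between local and remote data easy to audit.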
Connect an MCP Client
Sahara exposes five read-only MCP tools for search, cited Q&A, chunk reads, folder listing, and index status. These tools operate only on Sahara's indexed corpus; they cannot browse arbitrary files or modify your data.
Claude Desktop is the first tested client:
sahara mcp install-claude
Fully quit and reopen Claude Desktop, then confirm sahara appears under Connectors. The installer preserves existing settings and MCP servers, uses Sahara's absolute executable path, and creates a backup before changing an existing config.
See Claude Desktop Setup for verification and troubleshooting, or MCP Integrations for the tool surface and authenticated remote transport.
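The installer behavior described above (preserve existing settings, keep other MCP servers, back up before writing) can be sketched as follows. This is an illustration, not the code behind sahara mcp install-claude: it assumes a JSON config with a top-level mcpServers object, and the function name, backup suffix, and example path are hypothetical.

```python
import json
import shutil
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Add one server entry to a JSON config while keeping everything else intact."""
    config: dict = {}
    if config_path.exists():
        # Back up the existing file before changing it.
        backup = config_path.with_name(config_path.name + ".bak")
        shutil.copy2(config_path, backup)
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Reuse the existing server map if present so other servers survive.
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call (the executable path is a placeholder, not a real location):
# add_mcp_server(Path("claude_desktop_config.json"), "sahara",
#                "/absolute/path/to/sahara", ["mcp", "serve"])
```

Using an absolute executable path matters because a desktop application may not inherit the PATH of your shell.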
Optional Storage
Start with local indexing. Add storage later without rebuilding the semantic index.
| Setup | What it provides | Status |
|---|---|---|
| Basic | Local indexing across one or more folders | Core mode |
| Local drive | Copies selected folders to an external drive, NAS, or network share | Optional |
| AWS | Copies selected folders to S3, with optional Glacier features | Optional |
Attach a local drive:
sahara storage configure local --drive /Volumes/Archive/Sahara
sahara folder sync ~/Documents --enable
sahara sync
Attach AWS:
sahara storage configure aws \
--bucket my-sahara-bucket \
--region us-east-1
sahara folder sync ~/Documents --enable
sahara sync
MinIO and local-plus-Glacier modes are available through the interactive
sahara init wizard. See Getting Started for storage
credentials, content-root behavior, deletion semantics, and migration paths.
After a file has been synced and indexed, Sahara can free its source disk space while retaining search metadata:
sahara offload Documents/archive/report.pdf
sahara fetch Documents/archive/report.pdf
Offload verifies the stored copy before removing the local source. Ordinary filesystem deletion is not treated as offload.
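The verify-then-remove rule can be illustrated with a short sketch. It assumes verification means comparing SHA-256 digests of the source and the stored copy, which is an assumption made for this example; Sahara's actual verification method is not described here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def offload(source: Path, stored_copy: Path) -> bool:
    """Delete source only when stored_copy exists and has identical content."""
    if not stored_copy.is_file():
        return False  # nothing to verify against, so keep the local file
    if sha256_of(source) != sha256_of(stored_copy):
        return False  # stored copy differs, so keep the local file
    source.unlink()
    return True
```

The important property is the ordering: the local file is removed only after the stored copy has been checked, so a failed or partial sync never results in data loss.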
Privacy and Security
- The semantic index is stored locally in ~/.sahara/state.db.
- Indexing, embeddings, and sahara search stay local.
- Ollama answer generation stays local.
- OpenAI receives the question and retrieved snippets when explicitly selected.
- MCP is read-only and scoped to indexed content.
- Remote MCP requires authentication by default and supports tool, folder, and snippet limits.
- Optional storage encryption uses client-side AES-256-GCM.
Review SECURITY.md before exposing MCP remotely or relying on encrypted storage.
Supported Content
Sahara extracts text from:
- PDF and DOCX documents
- Markdown, reStructuredText, and plain text
- Python, JavaScript, TypeScript, JSON, YAML, TOML, CSV, HTML, and XML
- Other files that can be safely detected as UTF-8 text
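One plausible way to decide whether an unknown file is "safely detected as UTF-8 text" is to sample its first bytes, reject NUL bytes, and attempt a strict decode. The sketch below shows that heuristic; it is an assumption for illustration, not Sahara's detection logic, and the function name and sample size are hypothetical.

```python
import codecs
from pathlib import Path


def looks_like_utf8_text(path: Path, sample_size: int = 8192) -> bool:
    """Heuristically decide whether the start of a file is valid UTF-8 text."""
    with path.open("rb") as handle:
        sample = handle.read(sample_size)
    if b"\x00" in sample:
        return False  # NUL bytes strongly suggest binary data
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        # final=False tolerates a multi-byte character cut off at the sample edge.
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Sampling keeps the check cheap on large files, at the cost of missing invalid bytes that appear only later in the file.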
Current limitations:
- Scanned PDFs and images are not searchable because OCR is not implemented yet.
- Audio and video transcription is not supported.
- Sahara is designed for one user and one local index.
- The project is beta; keep independent backups of important files.
Use sahara index-report to inspect indexed files, unsupported content, and failures.
Create a .saharaignore file in any indexed folder to exclude content using
gitignore-style patterns:
.env*
secrets/
node_modules/
*.tmp
Start from the example ignore file for common operating-system, editor, build, and credential exclusions.
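To get a feel for how the patterns above behave, here is a simplified matcher written with the standard library. It covers only name patterns and trailing-slash directory patterns; full gitignore semantics (negation with !, **, anchored patterns, comments) are deliberately left out, and the function is an illustration rather than Sahara's matcher.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(relative_path: str, patterns: list[str]) -> bool:
    """Return True if a POSIX-style relative path matches any simplified pattern."""
    parts = PurePosixPath(relative_path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: compare against directory components only.
            name = pattern.rstrip("/")
            if any(fnmatchcase(part, name) for part in parts[:-1]):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Name pattern: may match a file or directory at any depth.
            return True
    return False


PATTERNS = [".env*", "secrets/", "node_modules/", "*.tmp"]

for candidate in [".env.local", "secrets/key.pem", "web/node_modules/pkg/index.js",
                  "notes/draft.tmp", "notes/draft.md"]:
    print(candidate, is_ignored(candidate, PATTERNS))
```

With these patterns, only notes/draft.md is kept; every other sample path is excluded.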
Core Commands
| Command | Purpose |
|---|---|
| sahara init --mode basic --folder PATH | Create an index-only local library |
| sahara folder add/list/remove | Manage indexed folders |
| sahara index [--force] | Build or refresh the semantic index |
| sahara index-report | Inspect indexing coverage and failures |
| sahara search QUERY | Find files and passages by meaning |
| sahara ask --snippet QUESTION | Generate an answer and show supporting sources |
| sahara mcp install-claude | Connect Sahara to Claude Desktop |
| sahara mcp serve | Run the read-only MCP server |
Storage and operational command groups
| Command group | Purpose |
|---|---|
| sahara storage ... | Configure, inspect, or disable optional storage |
| sahara folder sync ... | Choose which indexed folders also sync |
| sahara sync/push/pull/status | Inspect and execute storage synchronization |
| sahara offload/fetch | Free and restore local space with verification |
| sahara encryption ... | Configure or rotate storage encryption |
| sahara doctor | Diagnose configuration and connectivity |
| sahara daemon ... | Manage background watching and synchronization |
| sahara config ... | Inspect or change configuration |
See the complete command reference, or run
sahara --help and sahara COMMAND --help for live CLI help.
Documentation
- Getting Started: index-only, local-drive, and AWS paths
- Command Reference: every CLI command grouped by purpose
- Answer Providers: Ollama and OpenAI setup
- Claude Desktop: installation, MCP contract, and troubleshooting
- Security: threat model, encryption, and vulnerability reporting
- Roadmap: current scope, planned work, and non-goals
- Architecture: system design and extension points
- Contributing: development setup, tests, and pull requests
- Changelog: release history
License
Sahara is available under the MIT License.