Local document/photo search with WhatsApp bot
Pinpoint
Search all your local files from WhatsApp. No cloud uploads. No tracking. Everything stays on your machine.
Pinpoint indexes your documents, PDFs, spreadsheets, images, and media into a local SQLite database, then lets you search and work with them through natural language — either via WhatsApp or direct API calls.
What You Can Do
- "Find the Sharma invoice from last month" — searches across all your indexed documents instantly
- "What's in that Excel in Downloads?" — reads and analyzes spreadsheets, CSVs, PDFs on the fly
- "Find everyone over 30 in the contacts spreadsheet" — search, filter, group, sort within Excel/CSV files
- "Create an Excel with these expense totals" — generates spreadsheets, text files, charts from conversation
- Send a purchase order image, ask "turn this into Excel" — extracts tables from images/PDFs into spreadsheets
- "Merge these 3 PDFs into one" — merge, split PDFs, convert images to PDF and back
- "Move all receipts to the Tax folder" — batch file operations through conversation
- "Group my wedding photos by category" — AI classifies photos into groups and sorts them into folders
- "Cull my camera roll — keep the best 80%" — AI scores every photo for quality and separates rejects
- "Who is this person?" — remembers faces, recognizes them across photos later
- "OCR this scanned document" — extracts text from images and scanned PDFs
- "Send me that PDF" — sends files directly to your WhatsApp
- Send a photo/file to the bot — saves it to your PC, renames it, organizes into folders you choose. Important WhatsApp images and documents never get lost
- "Create a folder called Tax 2025 and move all receipts there" — creates folders and organizes files through conversation
- "Find Sharma's phone number from that Excel" — searches inside spreadsheets with smart phone/ID normalization (finds "920-889-6630" when you type "9208896630")
- "Where's that chart I made yesterday?" — searches files Pinpoint itself created in past conversations
- "Send an email to john@company.com with the Q1 report" — Gmail send, Calendar create, Drive upload (requires gws CLI + Google auth)
- "Remind me to call the dentist at 3pm" — persistent reminders that survive restarts and reconnects
- "Remember that my car insurance expires in March" — persistent memory across conversations
- "Watch my Documents folder" — auto-indexes new files as they appear (watched folders are checked every 60 minutes)
- Voice messages — transcribes and responds to audio messages
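The phone/ID normalization mentioned in the list above can be illustrated with a minimal sketch. It is not Pinpoint's actual implementation: the function names `normalize_id` and `find_rows` and the sample data are hypothetical, and the approach (strip everything except digits, then compare) is an assumption about how such matching might work.

```python
import csv
import io
import re


def normalize_id(value: str) -> str:
    """Keep only digits so '920-889-6630' and '9208896630' compare equal."""
    return re.sub(r"\D", "", value)


def find_rows(csv_text: str, query: str) -> list:
    """Return rows where any cell contains the query after normalization."""
    needle = normalize_id(query)
    if not needle:
        return []
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Normalize every cell the same way as the query before comparing.
        if any(needle in normalize_id(cell or "") for cell in row.values()):
            matches.append(row)
    return matches


# Hypothetical sample data standing in for a contacts spreadsheet.
CONTACTS = """name,phone
Sharma,920-889-6630
Patel,(415) 555-0199
"""

if __name__ == "__main__":
    for row in find_rows(CONTACTS, "9208896630"):
        print(row["name"], row["phone"])
```

Normalizing both sides keeps the comparison independent of how the number was typed or stored.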
How It Works
Your Files ──> Indexer ──> SQLite/FTS5 ──> Search API ──> WhatsApp Bot
 (local)      (extract)    (local DB)      (FastAPI)      (Gemini AI)
- Index — Pinpoint extracts text from PDFs, Office docs, images (OCR), spreadsheets, and plain text files
- Search — FTS5 full-text search with metadata-aware ranking and smart fallbacks
- Clarify — When results are ambiguous (multiple similar matches), Pinpoint asks "which one do you mean?" instead of guessing wrong
- Act — The WhatsApp bot turns your questions into tool calls: search, read, move, analyze, transform
Indexing, storage, and search run locally. The only external calls are to Gemini (for the AI layer) and, optionally, to web search providers and Google Workspace.
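The Search and Clarify steps can be sketched with Python's built-in `sqlite3` module, assuming the local SQLite build includes FTS5. The table name `docs`, the helper names, and the `margin` threshold are illustrative assumptions, not Pinpoint's actual schema or ranking logic.

```python
import sqlite3


def build_index(docs: dict) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load {path: text} pairs into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
    conn.executemany("INSERT INTO docs (path, body) VALUES (?, ?)", docs.items())
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list:
    """Return (path, score) pairs, best match first.

    bm25() yields lower (more negative) values for better matches,
    so ascending order puts the best result at the top.
    """
    return conn.execute(
        "SELECT path, bm25(docs) AS score FROM docs "
        "WHERE docs MATCH ? ORDER BY score LIMIT ?",
        (query, limit),
    ).fetchall()


def is_ambiguous(results: list, margin: float = 0.05) -> bool:
    """Treat the result as ambiguous when the top two scores are nearly tied."""
    if len(results) < 2:
        return False
    return abs(results[0][1] - results[1][1]) <= margin


if __name__ == "__main__":
    conn = build_index({
        "invoices/sharma_march.pdf": "Invoice for Sharma Traders, March, total 5000",
        "notes/meeting.txt": "Meeting notes about the office move",
    })
    hits = search(conn, "sharma invoice")
    print(hits, "ask the user" if is_ambiguous(hits) else "answer directly")
```

Raw user input passed to `MATCH` can raise a syntax error when it contains FTS5 operators or punctuation, so a real pipeline would likely sanitize or quote the query first.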
Important: Search only finds files that have been indexed. Files get indexed when you:
- Explicitly index a file or folder (/index-file, /index)
- Watch a folder — new files are picked up every 60 minutes
- Read or analyze a file — auto-indexes in the background for future searches
Pinpoint does not scan your entire computer automatically. You control what gets indexed.
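As a rough illustration of how a watched folder could be polled, the sketch below returns only files that were not seen on a previous scan. The function name `find_new_files` and the in-memory `seen` set are assumptions for the example; Pinpoint's real watcher may track state differently (for instance in its SQLite database).

```python
import tempfile
from pathlib import Path


def find_new_files(folder: Path, seen: set) -> list:
    """Return files under `folder` not seen before, and remember them."""
    new = sorted(p for p in folder.rglob("*") if p.is_file() and p not in seen)
    seen.update(new)
    return new


def demo() -> tuple:
    """Simulate three polling passes over a temporary folder."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        seen = set()
        (folder / "a.txt").write_text("first")
        first = [p.name for p in find_new_files(folder, seen)]
        (folder / "b.txt").write_text("second")
        second = [p.name for p in find_new_files(folder, seen)]
        third = find_new_files(folder, seen)  # nothing new this time
        return first, second, third


if __name__ == "__main__":
    print(demo())
```

In a long-running process, a loop would call `find_new_files` and then sleep for the polling interval (60 minutes in the text above) before the next pass.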
Gets Smarter Over Time
Every interaction builds a local cache that makes future operations faster and cheaper:
- Documents — text, chunks, and embeddings stored after first index. Re-search is instant, no re-extraction.
- Images — embeddings cached after first search or group. Next time you search or group the same folder, cached images are free.
- Videos — frame embeddings stored per video. Searching the same video again costs nothing.
- Photo scores — culling scores cached by file mtime. Re-running cull on the same folder skips already-scored photos.
- Photo classifications — grouping results cached. Re-grouping reuses existing classifications.
- Faces — detected face data cached per image. Recognition on already-scanned photos is instant.
- Facts — extracted key facts stored per document. Fact search never re-extracts.
- Search queries — query expansion and reranking results cached. Repeated searches are free.
If you cancel a long job halfway (like embedding 1000 photos), the work already done is saved. Next run picks up where it left off.
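The caching and resume behavior described above can be sketched as a small mtime-keyed cache. This is an assumption-laden illustration: the `scores` table, the function names, and the per-file commit are hypothetical choices, not Pinpoint's actual schema.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Callable


def open_cache(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a cache keyed by file path and modification time."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scores "
        "(path TEXT PRIMARY KEY, mtime REAL, score REAL)"
    )
    return conn


def score_folder(conn: sqlite3.Connection, folder: Path,
                 scorer: Callable[[Path], float]) -> int:
    """Score files in `folder`, skipping unchanged ones. Returns scorer calls."""
    calls = 0
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime
        row = conn.execute(
            "SELECT mtime FROM scores WHERE path = ?", (str(path),)
        ).fetchone()
        if row and row[0] == mtime:
            continue  # cached and unchanged: no work needed
        score = scorer(path)
        calls += 1
        conn.execute(
            "INSERT OR REPLACE INTO scores VALUES (?, ?, ?)",
            (str(path), mtime, score),
        )
        conn.commit()  # commit per file so a cancelled run keeps finished work
    return calls


def demo() -> tuple:
    """Run the same job twice; the second run should do no scoring."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        for name in ("a.jpg", "b.jpg", "c.jpg"):
            (folder / name).write_bytes(b"fake image data")
        conn = open_cache()
        scorer = lambda path: float(len(path.name))  # stand-in for a real model
        first = score_folder(conn, folder, scorer)
        second = score_folder(conn, folder, scorer)
        conn.close()
        return first, second


if __name__ == "__main__":
    print(demo())
```

Committing after each file trades a little write throughput for the ability to stop at any point without losing completed results.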
Memory System
Pinpoint has a 4-layer memory system that learns from everyday use:
Conversation memory — Keeps the last 50 messages per session. In the bot flow, long conversations are compacted instead of simply truncated, so important outcomes can survive even when older turns are compressed. Idle chats reset after 60 minutes.
Persistent personal memory — "Remember that my passport number is X12345." Stored permanently in SQLite, searchable with FTS5, and survives restarts. When you save a new fact, Gemini can decide to add it, update an existing memory, merge complementary details, ignore duplicates, or supersede a contradiction with an audit trail. You can also forget by description — "forget my old address" — without needing an internal ID.
Document fact extraction — When a file is indexed, Gemini extracts key facts such as names, dates, amounts, and topics, then stores them separately from the raw document text. You can search facts directly without reopening the full file.
Face memory — "Remember this is John." Saves face embeddings persistently so future face detection runs can recognize the same person across photos.
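The conversation-memory layer can be pictured with a short sketch. The 50-message limit and 60-minute idle reset come from the description above; the `Session` class, the policy of folding the older half into one summary entry, and the `summarize` callback are assumptions for illustration only (in the bot, a model call would presumably produce the summary).

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_MESSAGES = 50             # limit stated in the text above
IDLE_RESET_SECONDS = 60 * 60  # idle chats reset after 60 minutes


@dataclass
class Session:
    summarize: Callable[[List[str]], str]  # hypothetical summarizer callback
    messages: List[str] = field(default_factory=list)
    last_seen: float = field(default_factory=time.time)

    def add(self, text: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        if now - self.last_seen > IDLE_RESET_SECONDS:
            self.messages.clear()  # idle too long: start a fresh conversation
        self.last_seen = now
        self.messages.append(text)
        if len(self.messages) > MAX_MESSAGES:
            # Compact instead of truncating: fold the older half into one entry.
            half = MAX_MESSAGES // 2
            summary = self.summarize(self.messages[:half])
            self.messages = [summary] + self.messages[half:]


if __name__ == "__main__":
    session = Session(summarize=lambda msgs: f"summary of {len(msgs)} messages")
    for i in range(51):
        session.add(f"message {i}", now=0.0)
    print(len(session.messages), session.messages[0])
```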
Quick Start
Backend only (search + file APIs)
# 1. Create the environment
conda env create -f environment.yml
conda activate pinpoint
# 2. Configure shared user settings
pinpoint setup
# 3. Validate local setup
pinpoint doctor
# 4. Start
pinpoint api
The API runs at http://localhost:5123. Try http://localhost:5123/docs for interactive API docs.
With WhatsApp bot
# Install bot deps
cd bot && npm install && cd ..
# Start everything
pinpoint start
Scan the QR code with WhatsApp to pair. Then just message your files.
Common Things You Can Do
| Ask this | Pinpoint does this |
|---|---|
| "Find invoice 4821" | Searches indexed documents by content and filename |
| "Read the quarterly report PDF" | Extracts and returns the text |
| "Search for Sharma across all files" | Full-text search with ranked results |
| "Analyze the sales spreadsheet" | Loads Excel/CSV into pandas, runs queries |
| "Filter rows where amount > 5000" | Search, filter, group, sort within spreadsheets |
| "Create an Excel summary of Q1 expenses" | Generates new spreadsheets from conversation |
| (send purchase order image) "make this an Excel" | Extracts tables from images/PDFs into spreadsheets |
| "Merge invoice_1.pdf and invoice_2.pdf" | Merge, split PDFs |
| "Convert these images to a single PDF" | Images to PDF, PDF to images |
| "Make a bar chart of sales by month" | Generates charts from data |
| "Move old files to archive" | Batch file operations |
| "Watch my Downloads folder" | Auto-indexes new files every 60 minutes |
| "Find photos of the beach" | Visual image search across your photos |
| "OCR this scanned receipt" | Extracts text from images/scanned PDFs |
| "Group wedding photos by category" | AI classifies and sorts photos into folders |
| "Cull my camera roll, keep best 80%" | AI scores photos, separates rejects into a folder |
| "Who is this person?" | Face detection + recognition across photos |
| "Send me that report" | Sends the file to your WhatsApp chat |
| (send a photo/file to bot) | Saves to PC, renames, puts in your chosen folder |
| "Make a folder called Invoices 2025" | Creates folders on your computer |
| "Find Sharma's number from contacts.xlsx" | Searches inside spreadsheets, normalizes phone/ID formats |
| "Where's that chart I made yesterday?" | Searches files Pinpoint created in past conversations |
| "Email john@company.com the Q1 report" | Gmail send with attachment (needs gws CLI setup) |
| "Remind me at 5pm to call bank" | Persistent reminders — survive restarts |
| "Remember my passport number is X" | Stores in persistent memory |
| (voice message) | Transcribes audio and responds |
What's Stable vs Optional
Stable core — works without any API keys:
- Document search (FTS5)
- File read/list/move/rename/delete
- Auto-indexing on file access
- Watch folders
- Background job tracking
- Data analysis (Excel, CSV)
Optional — needs Gemini API key or extra setup:
- WhatsApp bot (needs Gemini + WhatsApp pairing)
- OCR, captioning, fact extraction (Gemini-powered)
- Photo culling/scoring (Gemini Flash — vision judges quality, needs to "see" each photo)
- Photo grouping by category (Gemini Embedding 2 — cheap, classifies by similarity not vision)
- Visual image/video search (Gemini Embedding 2 — text-to-image similarity)
- Face recognition (needs insightface + GPU)
- Google Workspace — Gmail, Calendar, Drive (needs gws CLI: npm install -g @googleworkspace/cli && gws auth login)
- Web search (needs Jina or LangSearch API key)
Note: Gemini Embedding 2 is used for image/video/photo features. Document text search uses FTS5 by default — embedding-based document search exists but is not the default path.
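To make "classifies by similarity, not vision" concrete, here is a minimal sketch of nearest-category classification with cosine similarity. The three-dimensional vectors are toy values invented for the example; real embeddings would come from the embedding model and have many more dimensions, and the function names are hypothetical.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(image_vec: list, category_vecs: dict) -> str:
    """Pick the category whose embedding is most similar to the image's."""
    return max(category_vecs, key=lambda name: cosine(image_vec, category_vecs[name]))


# Toy embeddings (assumed values, for illustration only).
CATEGORIES = {
    "ceremony": [1.0, 0.0, 0.0],
    "reception": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    print(classify([0.9, 0.2, 0.1], CATEGORIES))
```

The same comparison works for text-to-image search: embed the query text, then rank images by similarity to that vector.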
Architecture
pinpoint/
  run_api.py              # Backend entrypoint (port 5123)
  api/                    # FastAPI routers
    core.py               # health, indexing
    search.py             # document search, facts, web read
    files.py              # file ops, watch folders, background jobs
    data.py               # Excel/CSV analysis
    media.py              # image/video/audio search, OCR
    photos.py             # photo scoring, culling, grouping
    faces.py              # face detection and recognition
    transform.py          # file/image/PDF transforms
    memory.py             # conversation memory
    google.py             # Google Workspace integration
  search_pipeline.py      # Search: FTS5, ranking, ambiguity detection
  indexing_service.py     # Shared index/chunk/embed pipeline
  job_service.py          # Persistent background job lifecycle
  database.py             # SQLite schema and helpers
  extractors.py           # Text extraction (PDF, Office, images, OCR)
  bot/
    index.js              # WhatsApp bot entrypoint
    src/tools.js          # Gemini tool declarations
    src/llm.js            # LLM loop (Gemini / Ollama)
    src/skills.js         # Skill system for tool routing
For deeper details: docs/architecture.md
Repo Layout
The main product surface is intentionally small:
- api/ — FastAPI routers and API-facing behavior
- pinpoint/ — Python package and CLI entry points
- bot/ — WhatsApp bot package
- skills/ — skill markdowns shipped with the product
- benchmarks/ — search evaluation datasets, reports, and benchmark scripts
- tests/ — regression and packaging coverage
- docs/ — product docs, troubleshooting, and release notes
Internal planning notes and downloaded comparison repos are kept out of the GitHub-facing product surface.
Environment Variables
Pinpoint stores shared CLI/bot config in ~/.pinpoint/.env. Running pinpoint setup writes that file for you.
Key variables:
| Variable | Required? | What it does |
|---|---|---|
| GEMINI_API_KEY | For AI features | Enables bot, OCR, media, photo workflows |
| API_SECRET | No | If set, all API requests need X-API-Secret header |
| GEMINI_MODEL | No | Defaults to gemini-3.1-flash-lite-preview |
| OLLAMA_MODEL | No | Use local Ollama instead of Gemini for bot |
| JINA_API_KEY | No | Enables web search via Jina |
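For scripts that call the API directly, the sketch below shows one way to read the shared config and build the X-API-Secret header when API_SECRET is set. It assumes the file uses plain KEY=VALUE lines, which is the common .env convention but is not confirmed by this README; `parse_env` and `auth_headers` are hypothetical helper names.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def auth_headers(env: dict) -> dict:
    """Return the X-API-Secret header only when API_SECRET is configured."""
    secret = env.get("API_SECRET")
    return {"X-API-Secret": secret} if secret else {}


if __name__ == "__main__":
    sample = 'GEMINI_API_KEY=abc\n# local only\nAPI_SECRET="s3cret"\n'
    print(auth_headers(parse_env(sample)))
```

The returned dictionary can be passed as request headers by whichever HTTP client the script uses.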
Tests
conda run -n pinpoint python -m pytest tests/ -q
The suite covers search, indexing, file operations, jobs, packaging, security, and API contracts.
CLI
Useful commands:
pinpoint setup
pinpoint doctor
pinpoint api
pinpoint start
pinpoint search "invoice 4821"
pinpoint index /path/to/folder
pinpoint status
pinpoint logs
Benchmarks
Pinpoint includes offline search evaluation and load testing:
# Search quality
python evaluate_search.py --dataset benchmarks/search_relevance_v4_mixed.json --corpus benchmarks/corpus_v4_mixed
# Concurrent load
python load_test_search.py --corpus benchmarks/corpus_v4_mixed --rounds 10 --concurrency 8
Current results on the mixed-domain offline benchmark: 94.4% success@1, 100% recall, ~4.3 ms average query latency. A separate concurrent load test, run with lighter settings than the command above (concurrency=2, rounds=1), measured ~216 QPS and ~9.2 ms average wall latency.
See benchmarks/README.md for details.
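For readers unfamiliar with the metrics, the sketch below shows one common way to compute success@1 and recall from ranked results. The exact definitions used by Pinpoint's benchmark scripts may differ; the function names and the toy data are assumptions for illustration.

```python
def success_at_1(ranked: dict, relevant: dict) -> float:
    """Fraction of queries whose top-ranked result is a relevant document."""
    hits = sum(
        1 for query, results in ranked.items()
        if results and results[0] in relevant[query]
    )
    return hits / len(ranked)


def recall(ranked: dict, relevant: dict) -> float:
    """Fraction of all relevant documents that appear anywhere in the results."""
    found = total = 0
    for query, results in ranked.items():
        total += len(relevant[query])
        found += len(relevant[query] & set(results))
    return found / total


if __name__ == "__main__":
    # Toy data: two queries, each with one relevant document.
    ranked = {"q1": ["a", "b"], "q2": ["c", "d"]}
    relevant = {"q1": {"a"}, "q2": {"d"}}
    print(success_at_1(ranked, relevant), recall(ranked, relevant))
```

In this toy case every relevant document is retrieved, but only one query has it in first place, which is the distinction between the two numbers.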
Docs
- Architecture — how the system is built
- Benchmark Summary — what Pinpoint has measured so far
- Stability Policy — what Pinpoint treats as stable, optional, or experimental
- Troubleshooting — common issues and fixes
- Release Checklist — what to check before pushing
- Contributing — how to contribute
License