Local-first paper manager with semantic search and LLM reasoning
MAVYN - Your AI Research Assistant
A privacy-first research paper manager with natural language interface and local semantic search.
Features
- Privacy-First: All papers stored locally, no cloud uploads
- Natural Language Interface: Interactive REPL - just chat with your papers
- Fast Semantic Search: Local vector search across all papers
- AI Q&A: Ask questions naturally without command prefixes
- Paper Comparison: Compare multiple papers with intelligent caching
- Similar Papers: Find related work in your library with optional arXiv suggestions
- Auto-Processing: One command to scan, rename, and embed
- Incremental Updates: 70-90% faster re-embedding
- Smart Cleanup: Automatically removes deleted papers from database
Quick Start
1. Installation
pip install -r requirements.txt
2. Start MAVYN
mavyn
This launches the interactive assistant.
3. Initial Setup
MAVYN> /sync ~/Papers
# Your papers are now:
# ✓ Scanned and indexed
# ✓ Renamed with metadata
# ✓ Embedded for semantic search
# ✓ Ready for questions
4. List Your Papers
MAVYN> /list
# Shows numbered list of all papers
5. Ask Questions Naturally
MAVYN> tell me the methodology summary of paper 4
MAVYN> compare papers 1 and 5
MAVYN> find similar papers about transformers
MAVYN> what are the main findings in paper 7?
No command prefixes needed! Just type your question naturally.
Complete Workflow
First-Time Setup
$ mavyn
MAVYN> /sync ~/Papers
# MAVYN will:
# 1. Scan all PDFs in ~/Papers
# 2. Extract metadata (title, authors, year, etc.)
# 3. Rename files for easy identification
# 4. Generate embeddings for semantic search
Daily Use
$ mavyn
MAVYN> /list
# See all your papers with IDs
MAVYN> what is the main contribution of paper 5?
# Get AI-powered summaries
MAVYN> compare the methodology in papers 2 and 7
# Side-by-side comparisons
MAVYN> find papers similar to paper 3
# Discover related work
MAVYN> /exit
# Exit when done
Auto-Sync New Papers
MAVYN> /sync ~/Papers --watch
# Now just drop new PDFs into ~/Papers
# They'll be automatically processed!
Available Commands
Slash Commands
- /sync [directory] - Setup and sync your papers
- /sync --watch - Continuously monitor for new papers
- /list - List all indexed papers with IDs
- /help - Show help message
- /exit - Exit MAVYN
Natural Language Queries
No special syntax needed! Just ask naturally:
- "tell me about paper 5"
- "summarize the methodology of paper 3"
- "compare papers 1, 4, and 7"
- "find papers similar to paper 2"
- "what are the key contributions in paper 6?"
AI Q&A Setup (Optional)
Set up an API key to enable AI-powered question answering:
# Option 1: Environment variable
export GROQ_API_KEY="your_key_here"
# Option 2: .env file
echo "GROQ_API_KEY=your_key_here" > ~/.MAVYN/.env
Get Free API Keys:
- Groq - Fast and generous free tier (recommended)
- Google Gemini - Alternative option
Without API keys, MAVYN will still:
- Index and search papers
- Show relevant papers for your queries
- Perform all local operations
Privacy
- All papers stay on your machine - never uploaded anywhere
- Embeddings generated locally - no external API calls
- Cloud APIs only for Q&A - and only if you configure them
- Similar papers + arXiv (optional) - If enabled, MAVYN sends a keyword search query (derived from your question or seed paper title/abstract) to export.arxiv.org. No PDFs are uploaded. Responses are cached locally for 24 hours.
- Database stored locally at ~/.MAVYN/MAVYN.db
Requirements
- Python 3.10 or higher
- ~500MB disk space for embedding models
- Internet connection only for optional AI Q&A and optional arXiv similar-paper search
Example Session
$ mavyn
Welcome to MAVYN - Your AI Research Assistant
Available Commands:
- /sync [directory] - Setup and sync your papers
- /list - List all indexed papers
- /help - Show help message
- /exit - Exit MAVYN
Natural Language Queries:
Just type your question naturally!
MAVYN> /sync ~/Documents/Papers
Scanning: /Users/you/Documents/Papers
✓ paper1.pdf [renamed] [embedded]
✓ paper2.pdf [renamed] [embedded]
✓ paper3.pdf [renamed] [embedded]
Sync Summary
────────────
Total files: 3
New papers: 3
Embedded: 3
MAVYN> /list
┌────┬─────────────────────────────────────┬────────────┬──────┐
│ ID │ Title                               │ Authors    │ Year │
├────┼─────────────────────────────────────┼────────────┼──────┤
│ 1  │ Attention Is All You Need           │ Vaswani et │ 2017 │
│ 2  │ BERT: Pre-training of Deep Bidirect │ Devlin et  │ 2018 │
│ 3  │ GPT-3: Language Models are Few-Shot │ Brown et a │ 2020 │
└────┴─────────────────────────────────────┴────────────┴──────┘
MAVYN> tell me about the methodology in paper 1
[AI generates detailed methodology summary...]
MAVYN> compare papers 1 and 2
[AI generates side-by-side comparison...]
MAVYN> /exit
Goodbye!
Advanced Features
Incremental Updates
MAVYN avoids redundant work when re-processing papers:
- Only embeds new or modified papers
- Reuses unchanged content chunks
- 70-90% faster when updating existing papers
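Reusing unchanged chunks can be sketched with a content-hash cache: a chunk is re-embedded only when its hash has not been seen before. This is an assumed mechanism for illustration; `chunk_key`, `embed_incrementally`, and the `embed` callable are hypothetical names, and MAVYN's own chunking and storage are not shown.

```python
import hashlib
from typing import Callable


def chunk_key(text: str) -> str:
    """Stable content hash used to recognise an unchanged chunk."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_incrementally(
    chunks: list[str],
    cache: dict[str, list[float]],
    embed: Callable[[str], list[float]],
) -> tuple[list[list[float]], int]:
    """Embed chunks, calling `embed` only for content not already in `cache`.

    Returns the vectors in chunk order and the number of fresh embed calls.
    """
    vectors: list[list[float]] = []
    fresh = 0
    for text in chunks:
        key = chunk_key(text)
        if key not in cache:
            cache[key] = embed(text)  # the expensive step, skipped on a cache hit
            fresh += 1
        vectors.append(cache[key])
    return vectors, fresh
```

With this scheme, editing one section of a paper costs only as many embedding calls as there are changed chunks, which is where a large speed-up on updates would come from.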
Paper Comparison Caching
When you compare papers, results are cached:
- Instant results for repeat comparisons
- Reduces API costs
- Cached for 24 hours
arXiv Integration
Find related papers beyond your library:
- Set MAVYN_ARXIV_RELATED=1 or use the --arxiv flag (when available)
- MAVYN queries the arXiv API with a keyword search
- Results are deduplicated against your library
- Ranked by semantic similarity
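The last two steps, deduplication and ranking, can be sketched in plain Python. Matching on normalized titles is an assumption (matching on arXiv IDs or DOIs would work equally well), the vectors are assumed to come from the same embedding model used for the library, and all function names are hypothetical.

```python
import math
import re


def normalize_title(title: str) -> str:
    """Lowercase and drop punctuation so near-identical titles compare equal."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(
    query_vec: list[float],
    candidates: list[tuple[str, list[float]]],
    library_titles: list[str],
) -> list[str]:
    """Return candidate titles not already in the library, most similar first."""
    known = {normalize_title(t) for t in library_titles}
    fresh = [(t, v) for t, v in candidates if normalize_title(t) not in known]
    fresh.sort(key=lambda tv: cosine(query_vec, tv[1]), reverse=True)
    return [t for t, _ in fresh]
```

Here `query_vec` would be the embedding of the question or seed paper, and each candidate pairs an arXiv result title with the embedding of its title or abstract.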
Project Structure
MAVYN/
├── ~/.MAVYN/            # User data directory
│   ├── MAVYN.db         # Local database
│   ├── search.index     # FAISS vector index
│   └── .env             # Optional API keys
└── src/MAVYN/           # Source code
    ├── cli/             # Interactive REPL & commands
    ├── core/            # PDF processing & sync
    ├── db/              # Database & models
    ├── embeddings/      # Vector search
    ├── llm/             # AI providers & caching
    └── integrations/    # arXiv client
License
MIT License - see LICENSE file for details
Support
- Report issues: GitHub Issues
- Questions: Open a discussion on GitHub
Acknowledgments
MAVYN builds on open source technologies:
- Sentence Transformers for embeddings
- FAISS for vector search
- Rich for beautiful terminal UI
- Click for CLI framework
File details
Details for the file mavyn-2.0.0.tar.gz.
File metadata
- Download URL: mavyn-2.0.0.tar.gz
- Upload date:
- Size: 90.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d9038870f929288fe967d95df6c9f4ab1972f5beccd57699d8b34d9734008283 |
| MD5 | 219e640ac5e03b5e4bf13ccac09132f0 |
| BLAKE2b-256 | b160ea67b2e1082bdad7f7453135e50c631e5bbda97f05863b6defa686e9dd09 |
File details
Details for the file mavyn-2.0.0-py3-none-any.whl.
File metadata
- Download URL: mavyn-2.0.0-py3-none-any.whl
- Upload date:
- Size: 99.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64e24dd035df92eb357c572786b702934f2997c009a56f6d951b70f26eb50dff |
| MD5 | 81f74a699d33e274591e96e4fd8e246b |
| BLAKE2b-256 | cf822bd6b9ec476bf985402419b3ae9ca53652ea1fc2d0d09c0d28ec2ef3f002 |