
Local-first paper manager with semantic search and LLM reasoning

MAVYN - Your AI Research Assistant

A privacy-first research paper manager with natural language interface and local semantic search.

Python 3.10+ License: MIT

✨ Features

  • 🔒 Privacy-First: All papers stored locally, no cloud uploads
  • 💬 Natural Language Interface: Interactive REPL - just chat with your papers
  • 🚀 Fast Semantic Search: Local vector search across all papers
  • 🤖 AI Q&A: Ask questions naturally without command prefixes
  • 📊 Paper Comparison: Compare multiple papers with intelligent caching
  • 🔗 Similar Papers: Find related work in your library with optional arXiv suggestions
  • 📈 Auto-Processing: One command to scan, rename, and embed
  • 🔄 Incremental Updates: Re-embedding is 70-90% faster because unchanged chunks are reused
  • 📂 Smart Cleanup: Automatically removes deleted papers from database

🚀 Quick Start

1. Installation

pip install mavyn

# Or, from a source checkout:
pip install -r requirements.txt

2. Start MAVYN

mavyn

This launches the interactive assistant.

3. Initial Setup

MAVYN> /sync ~/Papers

# Your papers are now:
# ✓ Scanned and indexed
# ✓ Renamed with metadata
# ✓ Embedded for semantic search
# ✓ Ready for questions

4. List Your Papers

MAVYN> /list

# Shows numbered list of all papers

5. Ask Questions Naturally

MAVYN> tell me the methodology summary of paper 4

MAVYN> compare papers 1 and 5

MAVYN> find similar papers about transformers

MAVYN> what are the main findings in paper 7?

No command prefixes needed! Just type your question naturally.

📖 Complete Workflow

First-Time Setup

$ mavyn

MAVYN> /sync ~/Papers

# MAVYN will:
# 1. Scan all PDFs in ~/Papers
# 2. Extract metadata (title, authors, year, etc.)
# 3. Rename files for easy identification
# 4. Generate embeddings for semantic search
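The scan and rename steps can be pictured roughly as follows. This is a minimal sketch, not MAVYN's actual code: `find_pdfs` and `suggest_filename` are hypothetical helpers, and the `Year - Author - Title.pdf` naming scheme is an assumption made purely for illustration.

```python
import re
from pathlib import Path


def find_pdfs(root: Path) -> list[Path]:
    """Return every PDF under root (any extension case), sorted for stable order."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )


def suggest_filename(title: str, first_author: str, year: int) -> str:
    """Build a filesystem-safe name (hypothetical scheme, not MAVYN's own)."""
    # Replace characters that are unsafe in filenames, then collapse whitespace.
    safe_title = re.sub(r'[\\/:*?"<>|]', " ", title)
    safe_title = re.sub(r"\s+", " ", safe_title).strip()
    return f"{year} - {first_author} - {safe_title}.pdf"
```

Metadata extraction and embedding are left out here because they depend on the PDF parser and embedding model in use.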

Daily Use

$ mavyn

MAVYN> /list
# See all your papers with IDs

MAVYN> what is the main contribution of paper 5?
# Get AI-powered summaries

MAVYN> compare the methodology in papers 2 and 7
# Side-by-side comparisons

MAVYN> find papers similar to paper 3
# Discover related work

MAVYN> /exit
# Exit when done

Auto-Sync New Papers

MAVYN> /sync ~/Papers --watch

# Now just drop new PDFs into ~/Papers
# They'll be automatically processed!
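How MAVYN implements watching is not documented here. One simple way to get this behaviour is to poll the directory and process anything not seen before. The sketch below assumes that approach; `poll_new_pdfs` and `watch` are hypothetical names, and `handle` stands in for whatever processing a new paper needs.

```python
import time
from pathlib import Path


def poll_new_pdfs(root: Path, seen: set[Path]) -> list[Path]:
    """Return PDFs under root that are not yet in seen, and record them."""
    current = {p for p in root.rglob("*.pdf") if p.is_file()}
    new_files = sorted(current - seen)
    seen.update(new_files)
    return new_files


def watch(root: Path, handle, interval: float = 5.0, max_rounds=None) -> None:
    """Call handle(path) for each new PDF; loop forever unless max_rounds is set."""
    seen: set[Path] = set()
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for pdf in poll_new_pdfs(root, seen):
            handle(pdf)
        rounds += 1
        time.sleep(interval)
```

Polling keeps the sketch dependency-free; a file-system event library would react faster at the cost of an extra dependency.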

🔧 Available Commands

Slash Commands

  • /sync [directory] - Setup and sync your papers
  • /sync [directory] --watch - Continuously monitor the directory for new papers
  • /list - List all indexed papers with IDs
  • /help - Show help message
  • /exit - Exit MAVYN

Natural Language Queries

No special syntax needed! Just ask naturally:

  • "tell me about paper 5"
  • "summarize the methodology of paper 3"
  • "compare papers 1, 4, and 7"
  • "find papers similar to paper 2"
  • "what are the key contributions in paper 6?"
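To illustrate how a REPL might tell slash commands apart from free-form questions and pick out the paper IDs mentioned in them, here is a small sketch. `parse_input` is a hypothetical helper and the regular expression is an assumption; MAVYN's real routing may well rely on the LLM instead.

```python
import re

# Matches "paper 5", "papers 2 and 7", "papers 1, 4, and 7".
_ID_RUN = re.compile(
    r"\bpapers?\s+(\d+(?:(?:\s*,\s*(?:and\s+)?|\s+and\s+)\d+)*)",
    re.IGNORECASE,
)


def parse_input(line: str) -> dict:
    """Classify one REPL line as a slash command or a natural language query."""
    text = line.strip()
    if text.startswith("/"):
        parts = text[1:].split()
        name = parts[0] if parts else ""
        return {"kind": "command", "name": name, "args": parts[1:]}
    ids: list[int] = []
    for match in _ID_RUN.finditer(text):
        ids.extend(int(n) for n in re.findall(r"\d+", match.group(1)))
    return {"kind": "query", "text": text, "paper_ids": ids}
```

For example, `parse_input("compare papers 1, 4, and 7")` yields `paper_ids` of `[1, 4, 7]`, while `parse_input("/sync ~/Papers --watch")` is classified as the `sync` command with two arguments.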

🤖 AI Q&A Setup (Optional)

Set up an API key to enable AI-powered question answering:

# Option 1: Environment variable
export GROQ_API_KEY="your_key_here"

# Option 2: .env file (create the directory first; >> appends so existing keys are kept)
mkdir -p ~/.MAVYN
echo "GROQ_API_KEY=your_key_here" >> ~/.MAVYN/.env

Get Free API Keys:

  • Groq - Fast and generous free tier (recommended)
  • Google Gemini - Alternative option

Without API keys, MAVYN will still:

  • Index and search papers
  • Show relevant papers for your queries
  • Perform all local operations
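The two configuration options and the local-only fallback could be resolved along these lines. This is a sketch under assumptions: `load_env_file` and `resolve_api_key` are hypothetical names, and the precedence shown (environment variable first, then the `.env` file) is a guess rather than documented behaviour.

```python
import os
from pathlib import Path
from typing import Optional


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(name: str, env_file: Path, environ=None) -> Optional[str]:
    """Prefer the process environment, then fall back to the .env file."""
    environ = os.environ if environ is None else environ
    return environ.get(name) or load_env_file(env_file).get(name)
```

A caller would treat a `None` result as "no key configured" and stay in local-only mode, for example `resolve_api_key("GROQ_API_KEY", Path.home() / ".MAVYN" / ".env")`.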

🔒 Privacy

  • All papers stay on your machine - never uploaded anywhere
  • Embeddings generated locally - no external API calls
  • Cloud APIs only for Q&A - and only if you configure them
  • Similar papers + arXiv (optional) - If enabled, MAVYN sends a keyword search query (derived from your question or seed paper title/abstract) to export.arxiv.org. No PDFs are uploaded. Responses are cached locally for 24 hours.
  • Database stored locally at ~/.MAVYN/MAVYN.db

📦 Requirements

  • Python 3.10 or higher
  • ~500MB disk space for embedding models
  • Internet connection only for optional AI Q&A and optional arXiv similar-paper search

🎯 Example Session

$ mavyn

Welcome to MAVYN - Your AI Research Assistant

Available Commands:
- /sync [directory] - Setup and sync your papers
- /list - List all indexed papers
- /help - Show help message
- /exit - Exit MAVYN

Natural Language Queries:
Just type your question naturally!

MAVYN> /sync ~/Documents/Papers
📂 Scanning: /Users/you/Documents/Papers
✓ paper1.pdf [renamed] [embedded]
✓ paper2.pdf [renamed] [embedded]
✓ paper3.pdf [renamed] [embedded]

Sync Summary
────────────
Total files: 3
New papers: 3
Embedded: 3

MAVYN> /list
┌────┬──────────────────────────────────────┬────────────┬──────┐
│ ID │ Title                                │ Authors    │ Year │
├────┼──────────────────────────────────────┼────────────┼──────┤
│ 1  │ Attention Is All You Need            │ Vaswani et │ 2017 │
│ 2  │ BERT: Pre-training of Deep Bidirect  │ Devlin et  │ 2018 │
│ 3  │ GPT-3: Language Models are Few-Shot  │ Brown et a │ 2020 │
└────┴──────────────────────────────────────┴────────────┴──────┘

MAVYN> tell me about the methodology in paper 1

[AI generates detailed methodology summary...]

MAVYN> compare papers 1 and 2

[AI generates side-by-side comparison...]

MAVYN> /exit
Goodbye!

🛠️ Advanced Features

Incremental Updates

MAVYN is smart about re-processing papers:

  • Only embeds new or modified papers
  • Reuses unchanged content chunks
  • 70-90% faster when updating existing papers
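One common way to reuse unchanged chunks is to key stored vectors by a hash of each chunk's text, so only chunks with unseen hashes are sent to the embedding model. The sketch below assumes that technique; the README does not say how MAVYN detects unchanged chunks, and `chunk_text` and `plan_embedding` are hypothetical names.

```python
import hashlib


def chunk_text(text: str, size: int = 200) -> list[str]:
    """Split text into fixed-size character chunks (illustrative chunking only)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_digest(chunk: str) -> str:
    """Stable content fingerprint for a chunk."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def plan_embedding(chunks: list[str], cache: dict) -> tuple[list[str], list[str]]:
    """Return (chunks that need embedding, chunks whose vectors can be reused).

    cache maps a chunk digest to its previously stored vector.
    """
    to_embed, reused = [], []
    for chunk in chunks:
        (reused if chunk_digest(chunk) in cache else to_embed).append(chunk)
    return to_embed, reused
```

With fixed-size chunks, an edit near the start of a paper shifts every later chunk, so the savings in practice depend heavily on how chunk boundaries are chosen.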

Paper Comparison Caching

When you compare papers, results are cached:

  • Instant results for repeat comparisons
  • Reduces API costs
  • Cached for 24 hours
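A 24-hour cache of this kind can be sketched as a small time-to-live store. The class below is illustrative only: `ComparisonCache` is a hypothetical name, the order-insensitive key is an assumption, and a real cache would probably also key on the question text and persist to disk.

```python
import time


class ComparisonCache:
    """In-memory TTL cache keyed by the set of compared paper IDs."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._store: dict = {}

    @staticmethod
    def _key(paper_ids) -> tuple:
        # Sort so that comparing (1, 5) and (5, 1) hits the same entry.
        return tuple(sorted(set(paper_ids)))

    def get(self, paper_ids):
        """Return the cached result, or None if missing or expired."""
        key = self._key(paper_ids)
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self.clock() - stored_at >= self.ttl:
            del self._store[key]
            return None
        return result

    def put(self, paper_ids, result) -> None:
        self._store[self._key(paper_ids)] = (self.clock(), result)
```

On a cache hit the LLM call is skipped entirely, which is where both the instant results and the lower API cost come from.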

arXiv Integration

Find related papers beyond your library:

  • Set MAVYN_ARXIV_RELATED=1 or use the --arxiv flag (when available)
  • MAVYN queries the arXiv API with a keyword search
  • Results are deduplicated against your library
  • Ranked by semantic similarity
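The query and deduplication steps above might look roughly like this. It is a sketch, not MAVYN's client: the helper names are hypothetical, title normalization is an assumed way to detect duplicates, and the ranking step is omitted because it depends on the embedding model. The endpoint and the `search_query`, `start`, and `max_results` parameters are those of the public arXiv API.

```python
import re
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_url(keywords: list[str], max_results: int = 10) -> str:
    """Build a keyword search URL; every keyword must match in some field."""
    query = " AND ".join(f"all:{word}" for word in keywords)
    params = {"search_query": query, "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urlencode(params)}"


def normalize_title(title: str) -> str:
    """Lowercase and strip punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def drop_known(candidates: list[str], library_titles: list[str]) -> list[str]:
    """Remove arXiv results whose titles already appear in the local library."""
    known = {normalize_title(t) for t in library_titles}
    return [t for t in candidates if normalize_title(t) not in known]
```

Only the keyword string leaves the machine in this flow, which matches the privacy note that no PDFs are uploaded.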

🗺️ Project Structure

MAVYN/
├── ~/.MAVYN/             # User data directory
│   ├── MAVYN.db          # Local database
│   ├── search.index      # FAISS vector index
│   └── .env              # Optional API keys
└── src/MAVYN/            # Source code
    ├── cli/              # Interactive REPL & commands
    ├── core/             # PDF processing & sync
    ├── db/               # Database & models
    ├── embeddings/       # Vector search
    ├── llm/              # AI providers & caching
    └── integrations/     # arXiv client

📝 License

MIT License - see LICENSE file for details

💬 Support

  • Report issues: GitHub Issues
  • Questions: Open a discussion on GitHub

🙏 Acknowledgments

MAVYN builds on open source technologies:

  • Sentence Transformers for embeddings
  • FAISS for vector search
  • Rich for beautiful terminal UI
  • Click for CLI framework
