Local-first paper manager with semantic search and LLM reasoning

mavyn

MAVYN - Your AI Research Assistant

A privacy-first research paper manager with natural language interface and local semantic search.

Python 3.10+ | License: MIT

✨ Features

  • 🔒 Privacy-First: All papers stored locally, no cloud uploads
  • 💬 Natural Language Interface: Interactive REPL - just chat with your papers
  • 🚀 Fast Semantic Search: Local vector search across all papers
  • 🤖 AI Q&A: Ask questions naturally without command prefixes
  • 📊 Paper Comparison: Compare multiple papers with intelligent caching
  • 🔗 Similar Papers: Find related work in your library with optional arXiv suggestions
  • 📈 Auto-Processing: One command to scan, rename, and embed
  • 🔄 Incremental Updates: 70-90% faster re-embedding
  • 📂 Smart Cleanup: Automatically removes deleted papers from database

🚀 Quick Start

1. Installation

pip install mavyn

# Or, from a source checkout, install the dependencies:
pip install -r requirements.txt

2. Start MAVYN

mavyn

This launches the interactive assistant.

3. Initial Setup

MAVYN> /sync ~/Papers

# Your papers are now:
# ✓ Scanned and indexed
# ✓ Renamed with metadata
# ✓ Embedded for semantic search
# ✓ Ready for questions

4. List Your Papers

MAVYN> /list

# Shows numbered list of all papers

5. Ask Questions Naturally

MAVYN> tell me the methodology summary of paper 4

MAVYN> compare papers 1 and 5

MAVYN> find similar papers about transformers

MAVYN> what are the main findings in paper 7?

No command prefixes needed! Just type your question naturally.

📖 Complete Workflow

First-Time Setup

$ mavyn

MAVYN> /sync ~/Papers

# MAVYN will:
# 1. Scan all PDFs in ~/Papers
# 2. Extract metadata (title, authors, year, etc.)
# 3. Rename files for easy identification
# 4. Generate embeddings for semantic search
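
Step 3 can be pictured as building a filename from the extracted metadata. The sketch below is only an illustration: the `Year - FirstAuthor - Title.pdf` pattern and the `build_filename` helper are assumptions made for this example, not MAVYN's actual naming scheme.

```python
import re
from typing import List, Optional


def build_filename(title: str, authors: List[str], year: Optional[int]) -> str:
    """Build a filesystem-safe PDF name from paper metadata (hypothetical scheme)."""
    # Use the first author's last name, or a placeholder when authors are missing.
    first_author = authors[0].split()[-1] if authors else "Unknown"
    year_part = str(year) if year else "n.d."
    # Drop characters that are invalid or awkward in filenames on common platforms.
    safe_title = re.sub(r'[\\/:*?"<>|]', "", title)
    # Collapse runs of whitespace and cap the length so paths stay manageable.
    safe_title = re.sub(r"\s+", " ", safe_title).strip()[:80]
    return f"{year_part} - {first_author} - {safe_title}.pdf"


print(build_filename("Attention Is All You Need", ["Ashish Vaswani"], 2017))
```

Falling back to placeholders keeps the rename step usable when a PDF has incomplete metadata.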

Daily Use

$ mavyn

MAVYN> /list
# See all your papers with IDs

MAVYN> what is the main contribution of paper 5?
# Get AI-powered summaries

MAVYN> compare the methodology in papers 2 and 7
# Side-by-side comparisons

MAVYN> find papers similar to paper 3
# Discover related work

MAVYN> /exit
# Exit when done

Auto-Sync New Papers

MAVYN> /sync ~/Papers --watch

# Now just drop new PDFs into ~/Papers
# They'll be automatically processed!
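
One way to think about syncing and cleanup is as a set difference between the PDFs on disk and the papers already indexed. The sketch below assumes a flat folder and a hypothetical `diff_library` helper; how MAVYN actually watches the directory is not described here, though a watch loop could call something like this at intervals.

```python
import tempfile
from pathlib import Path
from typing import Set, Tuple


def diff_library(directory: Path, indexed: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (new, deleted) PDF filenames relative to what is already indexed."""
    on_disk = {p.name for p in directory.glob("*.pdf")}
    new = on_disk - indexed        # on disk, not yet indexed
    deleted = indexed - on_disk    # indexed earlier, no longer on disk
    return new, deleted


# Demo with a throwaway folder: a.pdf is new, c.pdf was deleted since the last sync.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.pdf").touch()
    (folder / "b.pdf").touch()
    new, deleted = diff_library(folder, indexed={"b.pdf", "c.pdf"})
    print(sorted(new), sorted(deleted))
```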

🔧 Available Commands

Slash Commands

  • /sync [directory] - Setup and sync your papers
  • /sync [directory] --watch - Continuously monitor the directory for new papers
  • /list - List all indexed papers with IDs
  • /help - Show help message
  • /exit - Exit MAVYN

Natural Language Queries

No special syntax needed! Just ask naturally:

  • "tell me about paper 5"
  • "summarize the methodology of paper 3"
  • "compare papers 1, 4, and 7"
  • "find papers similar to paper 2"
  • "what are the key contributions in paper 6?"
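
The split between slash commands and free-form questions can be sketched as a small input router. This is a simplified illustration under stated assumptions: `parse_input` and the `PAPER_REF` pattern are hypothetical, and MAVYN may interpret queries quite differently (for example, by passing them to the LLM).

```python
import re

# Matches "paper 5", "papers 1 and 5", "papers 1, 4, and 7" (assumed phrasing).
PAPER_REF = re.compile(
    r"\bpapers?\s+(\d+(?:(?:\s*,\s*(?:and\s+)?|\s+and\s+)\d+)*)",
    flags=re.IGNORECASE,
)


def parse_input(line: str) -> dict:
    """Classify one REPL line as a slash command or a natural-language query."""
    text = line.strip()
    if text.startswith("/"):
        name, _, rest = text[1:].partition(" ")
        return {"kind": "command", "name": name, "args": rest.split()}
    # Collect every paper ID mentioned after the word "paper" or "papers".
    ids = []
    for match in PAPER_REF.finditer(text):
        ids.extend(int(n) for n in re.findall(r"\d+", match.group(1)))
    return {"kind": "query", "text": text, "paper_ids": ids}


print(parse_input("/sync ~/Papers --watch"))
print(parse_input("compare papers 1, 4, and 7"))
```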

🤖 AI Q&A Setup (Optional)

Set up an API key to enable AI-powered question answering:

# Option 1: Environment variable
export GROQ_API_KEY="your_key_here"

# Option 2: .env file
# (>> appends, so any keys already in the file are kept)
mkdir -p ~/.MAVYN
echo "GROQ_API_KEY=your_key_here" >> ~/.MAVYN/.env
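
The two options above can be combined into a single lookup, sketched below. The precedence (environment variable first, then the .env file) and the `load_api_key` helper are assumptions for illustration; MAVYN's real loading logic may differ.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(
    name: str = "GROQ_API_KEY",
    env_file: Path = Path.home() / ".MAVYN" / ".env",
) -> Optional[str]:
    """Return the key from the environment, else from the .env file, else None."""
    value = os.environ.get(name)
    if value:
        return value
    if not env_file.is_file():
        return None
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            # Tolerate optional surrounding quotes around the value.
            return val.strip().strip('"').strip("'") or None
    return None
```

Returning None rather than raising matches the behavior described below, where local features keep working without a key.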

Get Free API Keys:

  • Groq - Fast and generous free tier (recommended)
  • Google Gemini - Alternative option

Without API keys, MAVYN will still:

  • Index and search papers
  • Show relevant papers for your queries
  • Perform all local operations

🔒 Privacy

  • All papers stay on your machine - never uploaded anywhere
  • Embeddings generated locally - no external API calls
  • Cloud APIs only for Q&A - and only if you configure them
  • Similar papers + arXiv (optional) - If enabled, MAVYN sends a keyword search query (derived from your question or seed paper title/abstract) to export.arxiv.org. No PDFs are uploaded. Responses are cached locally for 24 hours.
  • Database stored locally at ~/.MAVYN/MAVYN.db

📦 Requirements

  • Python 3.10 or higher
  • ~500MB disk space for embedding models
  • Internet connection only for optional AI Q&A and optional arXiv similar-paper search

🎯 Example Session

$ mavyn

Welcome to MAVYN - Your AI Research Assistant

Available Commands:
- /sync [directory] - Setup and sync your papers
- /list - List all indexed papers
- /help - Show help message
- /exit - Exit MAVYN

Natural Language Queries:
Just type your question naturally!

MAVYN> /sync ~/Documents/Papers
📂 Scanning: /Users/you/Documents/Papers
✓ paper1.pdf [renamed] [embedded]
✓ paper2.pdf [renamed] [embedded]
✓ paper3.pdf [renamed] [embedded]

Sync Summary
────────────
Total files: 3
New papers: 3
Embedded: 3

MAVYN> /list
┌────┬──────────────────────────────────────┬────────────┬──────┐
│ ID │ Title                                │ Authors    │ Year │
├────┼──────────────────────────────────────┼────────────┼──────┤
│ 1  │ Attention Is All You Need            │ Vaswani et │ 2017 │
│ 2  │ BERT: Pre-training of Deep Bidirect  │ Devlin et  │ 2018 │
│ 3  │ GPT-3: Language Models are Few-Shot  │ Brown et a │ 2020 │
└────┴──────────────────────────────────────┴────────────┴──────┘

MAVYN> tell me about the methodology in paper 1

[AI generates detailed methodology summary...]

MAVYN> compare papers 1 and 2

[AI generates side-by-side comparison...]

MAVYN> /exit
Goodbye!

🛠️ Advanced Features

Incremental Updates

MAVYN is smart about re-processing papers:

  • Only embeds new or modified papers
  • Reuses unchanged content chunks
  • 70-90% faster when updating existing papers
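
Reusing unchanged chunks is commonly done by keying stored vectors on a content hash, so only text that actually changed is sent to the embedding model. The sketch below illustrates that general idea; the `plan_embedding` helper and the in-memory `cache` dict are assumptions, not MAVYN's internals.

```python
import hashlib
from typing import Dict, List, Tuple


def chunk_hash(text: str) -> str:
    """Stable fingerprint of a chunk's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_embedding(
    chunks: List[str], cache: Dict[str, List[float]]
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Split chunks into cached vectors to reuse and chunks that need embedding."""
    reused: Dict[str, List[float]] = {}
    to_embed: List[str] = []
    for chunk in chunks:
        key = chunk_hash(chunk)
        if key in cache:
            reused[key] = cache[key]   # unchanged content: reuse the stored vector
        else:
            to_embed.append(chunk)     # new or edited content: embed it
    return reused, to_embed


# Demo: one chunk was embedded earlier, one is new.
cache = {chunk_hash("intro"): [0.1, 0.2]}
reused, to_embed = plan_embedding(["intro", "new results"], cache)
print(len(reused), to_embed)
```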

Paper Comparison Caching

When you compare papers, results are cached:

  • Instant results for repeat comparisons
  • Reduces API costs
  • Cached for 24 hours
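
A 24-hour cache like this can be sketched as a time-to-live (TTL) map keyed by the set of compared papers. The `ComparisonCache` class is hypothetical, it lives in memory only, and treating "papers 1 and 5" the same as "papers 5 and 1" is an assumption of this example.

```python
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as stated above


class ComparisonCache:
    """In-memory TTL cache keyed by the set of paper IDs being compared."""

    def __init__(self, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    @staticmethod
    def _key(paper_ids):
        # Order-insensitive key: [5, 1] and [1, 5] map to the same entry.
        return tuple(sorted(set(paper_ids)))

    def get(self, paper_ids):
        key = self._key(paper_ids)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return result

    def put(self, paper_ids, result):
        self._entries[self._key(paper_ids)] = (self._clock(), result)
```

A cache hit avoids a second LLM call, which is where the saving in API cost comes from.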

arXiv Integration

Find related papers beyond your library:

  • Set MAVYN_ARXIV_RELATED=1 or pass the --arxiv flag (when available)
  • MAVYN queries the arXiv API with a keyword search
  • Results are deduplicated against your library
  • Ranked by semantic similarity
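
The keyword query and the deduplication step can be sketched as follows. The URL uses the public arXiv API query endpoint on export.arxiv.org; the helper names, the `all:` field prefix, and title-based deduplication are assumptions for illustration, and the semantic ranking step is omitted. No request is sent here; the code only builds the URL.

```python
from typing import List
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_url(keywords: List[str], max_results: int = 10) -> str:
    """Build an arXiv API query URL that searches all fields for every keyword."""
    query = " AND ".join(f"all:{word}" for word in keywords)
    params = {"search_query": query, "start": 0, "max_results": max_results}
    return ARXIV_API + "?" + urlencode(params)


def normalize_title(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def drop_known(candidate_titles: List[str], library_titles: List[str]) -> List[str]:
    """Remove arXiv candidates whose title already exists in the local library."""
    known = {normalize_title(t) for t in library_titles}
    return [t for t in candidate_titles if normalize_title(t) not in known]


print(build_arxiv_url(["transformer", "attention"]))
```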

🗺️ Project Structure

MAVYN/
├── ~/.MAVYN/              # User data directory
│   ├── MAVYN.db          # Local database
│   ├── search.index      # FAISS vector index
│   └── .env              # Optional API keys
└── src/MAVYN/            # Source code
    ├── cli/              # Interactive REPL & commands
    ├── core/             # PDF processing & sync
    ├── db/               # Database & models
    ├── embeddings/       # Vector search
    ├── llm/              # AI providers & caching
    └── integrations/     # arXiv client

📝 License

MIT License - see LICENSE file for details

💬 Support

  • Report issues: GitHub Issues
  • Questions: Open a discussion on GitHub

🙏 Acknowledgments

MAVYN builds on open source technologies:

  • Sentence Transformers for embeddings
  • FAISS for vector search
  • Rich for beautiful terminal UI
  • Click for CLI framework
