beyin

base engine for your information nodes

also means “brain” in Turkish.

Build local, queryable packs from videos, articles, podcasts, and local files. Query them through MCP with your AI agent, or explore them directly with a local model.

PyPI · Python 3.11+ · MCP · License: MIT

✨ Features

  • 🔗 MCP compatible: works with Claude Code, Codex, Cursor, Windsurf, Zed and more
  • 📦 Local-first pipeline: processing, embedding, and storage all happen on your machine
  • 🎬 Rich source support: YouTube videos and playlists, podcasts, PDFs, articles, local files
  • 🌍 50+ languages: multilingual embedding model out of the box
  • 🤖 Ollama support: run fully offline with a local model
  • Plug and play: one command to connect via MCP, then manage everything by just talking to your agent
  • 🎯 Multi-query expansion: generates query variants automatically for better retrieval

⚙️ How it works

The recommended way to use beyin is through MCP with the AI agent you already use.

  1. Install beyin and connect it to your agent once
  2. Build a pack from your sources
  3. Ask questions naturally. Your agent handles retrieval automatically.

Once set up, you can ask your agent to create, build, and manage packs, add sources, check status, and retrieve relevant results, all in plain language. See Example Usage with MCP.

You can also query packs directly with a local model, no external API or agent needed. See Query with a Local Model.


📂 Supported Sources

| Type | Examples |
| --- | --- |
| Web articles | Public URLs |
| YouTube | Videos and playlists |
| Podcasts | RSS feed URLs |
| Local documents | .pdf, .docx, .pptx, .epub, .xlsx, .csv |
| Local text | .txt, .md, .rst, .html |
| Local audio | .mp3, .m4a, .wav |
| Local video | .mp4, .mov, .mkv, .webm |

beyin is built for local processing on your own machine. Use it with content you are allowed to process, preferably public, permitted sources or material you own or have rights to use. Avoid copied, paywalled, private, restricted, or illegally shared content.


📦 Installation

The recommended way to install beyin is with uv:

uv tool install beyin

Why uv is the main path:

  • uv tool install installs beyin as a standalone CLI app instead of dropping it into whatever Python environment happens to be active
  • you get a plain beyin command on your PATH, so day-to-day usage stays simple
  • the install is isolated, which avoids environment drift and version clashes from unrelated Python projects
  • it is easier to keep CLI usage and agent/MCP usage aligned when beyin is installed as one managed tool

If you already manage a dedicated Python environment and explicitly want beyin inside it, pip install beyin still works, but it is a fallback path rather than the recommended default.

ffmpeg is required for video and audio sources, including local audio and video files. Skip this step if you only use articles, documents, and text files:

# macOS
brew install ffmpeg

# Linux
sudo apt install ffmpeg

# Windows
winget install ffmpeg

No Homebrew on macOS or winget not working? Download directly from ffmpeg.org/download.html.
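
If you want to confirm from a script that ffmpeg is reachable, a minimal Python sketch could look like the following. It only checks whether an executable is on your PATH; it is not how beyin check-deps works internally, and the helper names are made up for illustration.

```python
import shutil


def find_tools(names):
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    # ffmpeg is only needed for audio and video sources.
    for name, path in find_tools(["ffmpeg"]).items():
        print(f"{name}: {path or 'not found'}")
```

An empty result from missing_tools(["ffmpeg"]) means the executable was found; it does not verify the ffmpeg version or its codecs.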


🖥️ Using via CLI

You can use beyin directly in your terminal without MCP:

beyin

If you want to verify runtime dependencies before building anything:

beyin check-deps

What happens on first run:

  • beyin starts a guided setup flow
  • after setup, if you do not have any packs yet, beyin shows a start screen where you can create a new pack or import an existing one
  • if setup is already complete and no packs are available, that same start screen is shown again
  • if you already have packs, you can manage and build them from the CLI as usual

Useful examples:

beyin
beyin list
beyin build my-pack
beyin settings
beyin check-deps

If you do not want a persistent install, you can run the same commands with uvx beyin ... instead.


🔌 Connect to Your Agent

You only need to do this once.

beyin runs as a stdio MCP server. After registration, your MCP client launches and manages the server for you, so you do not need to keep a separate terminal open. If the tool does not appear immediately, restart the client.

Claude Code

claude mcp add beyin -- beyin mcp-server

To make it available across all your projects:

claude mcp add --scope user beyin -- beyin mcp-server

Codex

codex mcp add beyin -- beyin mcp-server

Cursor

Open or create ~/.cursor/mcp.json and add:

{
  "mcpServers": {
    "beyin": {
      "command": "beyin",
      "args": ["mcp-server"]
    }
  }
}

Or go to Command Palette → "View: Open MCP Settings".
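
If the config file already lists other MCP servers, you will want to add the beyin entry without overwriting them. The sketch below shows one way to do that in Python, assuming the file uses the "mcpServers" shape shown above. The function names are hypothetical and this is not part of beyin.

```python
import json
from pathlib import Path


def with_beyin(config):
    """Return a copy of an MCP config dict with the beyin server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["beyin"] = {"command": "beyin", "args": ["mcp-server"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path):
    """Read a JSON config file (if present), add beyin, and write it back."""
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(with_beyin(config), indent=2) + "\n")
```

For example, update_config_file(Path.home() / ".cursor" / "mcp.json") would keep any existing servers and add or refresh the beyin entry. Back up the file first if it contains comments or formatting you care about, since plain JSON parsing will not preserve them.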

Windsurf

Open or create ~/.codeium/windsurf/mcp_config.json and add:

{
  "mcpServers": {
    "beyin": {
      "command": "beyin",
      "args": ["mcp-server"]
    }
  }
}

Or go to Command Palette → "MCP: Add Server".

Zed

In ~/.config/zed/settings.json:

{
  "context_servers": {
    "beyin": {
      "source": "custom",
      "command": "beyin",
      "args": ["mcp-server"]
    }
  }
}

Any other MCP-compatible agent

Recommended command:

beyin mcp-server

It runs a stdio MCP server that works with any agent that supports MCP.


💬 Example Usage with MCP

Once beyin is connected through MCP, you can talk to your agent naturally. You do not need to memorize commands or even say "beyin" every time. Just ask for what you want.

Some prompts that mention local files or folders may require your AI agent to have read access to those locations first.

| What you want | What to say |
| --- | --- |
| Build a new pack | create a pack called "yt-research", add this YouTube playlist: https://youtube.com/playlist?list=..., and build it |
| Add a source | I have a PDF about growth strategy in my Downloads folder, add it to my "mobile-marketing" pack and rebuild |
| Add more sources | add these to my "product-ideas" pack and rebuild: https://example.com/article-1, https://example.com/article-2, https://example.com/article-3 |
| Ask a question | any useful info about onboarding screens in my "mobile marketing" pack? |
| Control the response | ask yt-research pack about building an audience from scratch, include sources and timestamps |
| Check your packs | list my packs and show me their status |
| Ask about a pack | whats the status of mobile marketing pack? and also its sources? |
| Remove a source | remove sources 2 and 3 from mobile marketing pack |
| Remove a pack | remove that pack about tech podcast |

🛠️ MCP Tools Reference

These are the tools beyin exposes to your agent. Your agent uses them automatically; you do not need to call them yourself.

| Tool | What it does |
| --- | --- |
| packs | List all installed packs |
| status | Show details and readiness for a pack |
| retrieve | Return relevant results for one or more queries |
| build | Build or update a pack. Pass sources to build only selected sources by index or range. Automatically purges chunks of removed sources. |
| add | Add a pack from a path, URL, or YAML |
| add_sources | Add new sources to a pack. Rebuilds automatically for single sources; playlists/feeds are expanded for review first. |
| remove_sources | Remove sources by index, range, or text match. Removed chunks stay in the vector store until you rebuild. |
| remove | Remove an installed pack (moves to trash) |
| registry | Browse the beyin community registry by topic, tag, or keyword |
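
For the curious, here is a rough Python sketch of what a tool call looks like on the wire. MCP uses JSON-RPC 2.0, and the stdio transport sends one JSON message per line. The argument names "pack" and "queries" are assumptions for illustration only; beyin's actual input schema for retrieve is not documented here, and a real session also begins with an initialize handshake that this sketch leaves out.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def frame(message):
    """Serialize a message as a single line, as the stdio transport expects."""
    return json.dumps(message, separators=(",", ":")) + "\n"


# "pack" and "queries" are hypothetical argument names, not beyin's documented schema.
request = tool_call("retrieve", {"pack": "my-pack", "queries": ["onboarding screens"]})
print(frame(request), end="")
```

Your MCP client builds and sends these messages for you, so this is only useful for understanding or debugging what happens behind a prompt.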

📋 All Commands

Pack lifecycle

| Command | What it does |
| --- | --- |
| beyin create | Create a new pack interactively |
| beyin add <path-or-url> | Import an existing pack from a file or URL |
| beyin build <pack> | Build or rebuild a pack |
| beyin build <pack> --source 1 3 5 | Build only selected sources by index or range |
| beyin update <pack> | Fetch new content and rebuild incrementally |
| beyin remove <pack> | Remove a pack |
| beyin list | List all installed packs |
| beyin status <pack> | Show pack details and readiness |

Sources

| Command | What it does |
| --- | --- |
| beyin add-source <pack> <url> | Add a new source to an installed pack |
| beyin remove-source <pack> 2 | Remove source by index |
| beyin remove-source <pack> 1 3 5 | Remove multiple sources by index |
| beyin remove-source <pack> 1-3 | Remove a range of sources |
| beyin remove-source <pack> "keyword" | Remove a source by title/URL text match |
| beyin remove-source <pack> 2 --build | Remove and rebuild immediately to clean up vector store |
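
To make the index and range forms concrete, here is a small Python sketch that expands selectors such as 1 3 5 or 1-3 into a list of source indices. It assumes indices are 1-based, which the examples suggest but do not state, and it is an illustration rather than beyin's own parser. Text matches such as "keyword" are out of scope here.

```python
def parse_selectors(tokens):
    """Expand tokens like ["1", "3-5"] into a sorted list of unique 1-based indices."""
    indices = set()
    for token in tokens:
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"range start exceeds end: {token!r}")
            indices.update(range(start, end + 1))  # ranges are inclusive
        else:
            indices.add(int(token))
    if any(i < 1 for i in indices):
        raise ValueError("source indices start at 1")
    return sorted(indices)
```

With this sketch, parse_selectors(["1", "3", "5"]) gives [1, 3, 5] and parse_selectors(["1-3"]) gives [1, 2, 3]. Overlapping selectors are merged, so a source is never listed twice.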

Query

| Command | What it does |
| --- | --- |
| beyin query <pack> "question" | Ask a question directly (requires Ollama) |

Server & config

| Command | What it does |
| --- | --- |
| beyin mcp-server | Start the MCP server |
| beyin settings | View and configure settings |
| beyin check-deps | Verify runtime dependencies |
| beyin about | Version and info |
| beyin help | List all commands |

🤖 Query with a Local Model

You can query your packs with a local model using Ollama, without sending anything to an external API. Everything stays on your machine.

If you use beyin through an MCP-connected agent (Claude Code, Codex, etc.), you do not need Ollama. Your agent is the LLM. beyin just retrieves results for it.

Setup:

  1. Download and install Ollama from ollama.com
  2. Pull a model:
ollama pull llama3.2     # 2 GB, fast, good for most queries
ollama pull qwen2.5:7b   # 4.7 GB, stronger reasoning
  3. Start Ollama:
ollama serve
  4. Build a pack and query it:
beyin query my-pack "What does this source say about X?"

To change the model, run beyin settings.
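
Conceptually, a local query combines retrieved chunks with your question and sends the result to Ollama. The Python sketch below illustrates that idea using Ollama's local /api/generate endpoint on its default port. The prompt wording and function names are assumptions for illustration; beyin's real prompt and request handling may differ.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prompt(question, chunks):
    """Combine retrieved chunks and the question into a single grounded prompt."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the numbered context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def ask_ollama(prompt, model="llama3.2"):
    """Send one non-streaming generate request to a running Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling ask_ollama(build_prompt(question, chunks)) only works while ollama serve is running and the model has been pulled. Nothing in this sketch leaves your machine.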


🔧 Troubleshooting

Pack is not queryable yet

beyin status my-pack
beyin build my-pack

A partially-ready pack is still queryable: sources that built successfully are available. Rebuilding recovers any failed sources.

MCP is connected but retrieval is not working

  • Make sure the pack was built: beyin status my-pack
  • Restart your agent after adding beyin for the first time
  • Verify the server is registered. In Claude Code: claude mcp list
  • Make sure the same beyin installation is used by both CLI and the MCP server

Audio and video builds are slow

beyin uses Whisper to transcribe audio and video sources. Model size controls the trade-off between speed, memory use, and transcription quality. English-only .en variants are available for every size up to medium.en and are useful when you know the audio is English only.

| Model | Type | Parameters | Download size | Speed | Accuracy | Best for |
| --- | --- | --- | --- | --- | --- | --- |
| tiny | multilingual | 39M | ~75 MB | fastest | lowest | Quick tests, clean audio, mixed-language detection |
| tiny.en | English-only | 39M | ~75 MB | fastest | low | Fastest English-only transcripts |
| base | multilingual | 74M | ~145 MB | fast | low | Simple podcasts, lightweight multilingual audio |
| base.en | English-only | 74M | ~145 MB | fast | low+ | English podcasts and interviews |
| small | multilingual | 244M | ~483 MB | moderate | good | Most use cases, multilingual content |
| small.en | English-only | 244M | ~483 MB | moderate | good+ | Strong default for English-only speech |
| medium | multilingual | 769M | ~1.5 GB | slow | better | Harder English, multilingual, accented, or noisy audio |
| medium.en | English-only | 769M | ~1.5 GB | slow | better+ | Higher English accuracy without multilingual support |
| large | multilingual | 1.55B | ~3 GB | slowest | best | Maximum accuracy, difficult audio |

The default model is small. To use a faster or English-only model, change it in settings:

beyin settings

Or pass it per build:

beyin build my-pack --model small

small is a good default for most content. If your audio is strictly English, small.en is the same size and slightly more accurate for English speech. Use medium, medium.en, or large for harder audio.
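
If you want to encode that guidance in a script, one possible approach is to pick the largest model that fits a download budget. The Python sketch below uses the approximate sizes from the table above. The selection rule and the function name are assumptions for illustration, not something beyin does for you.

```python
# (name, english_only, approximate download size in MB), taken from the model table
MODELS = [
    ("tiny", False, 75), ("tiny.en", True, 75),
    ("base", False, 145), ("base.en", True, 145),
    ("small", False, 483), ("small.en", True, 483),
    ("medium", False, 1500), ("medium.en", True, 1500),
    ("large", False, 3000),
]


def pick_model(english_only_audio, max_download_mb):
    """Pick the largest model that fits the size budget, or None if nothing fits.

    English-only variants are considered only when the audio is known to be
    English, and are preferred over the multilingual model of the same size.
    """
    candidates = [
        m for m in MODELS
        if m[2] <= max_download_mb and (english_only_audio or not m[1])
    ]
    if not candidates:
        return None
    # Largest size wins; at equal size, True sorts above False, so .en is preferred.
    return max(candidates, key=lambda m: (m[2], m[1]))[0]
```

For instance, pick_model(True, 500) selects small.en, and the result can be passed to beyin build with --model.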

Video or audio builds fail

  • Check that ffmpeg is installed: ffmpeg -version
  • Check that yt-dlp is installed and current: yt-dlp --version
  • Make sure the source URL is still reachable

Pack name with spaces is not recognized

Pack IDs use kebab-case, not spaces. Use my-pack instead of my pack. The display name can be anything, but the ID used in commands must be kebab-case.
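
As an illustration of the naming rule, the Python sketch below turns a display name into a kebab-case ID. It is an assumption about what a reasonable conversion looks like, not beyin's actual ID generator, and it drops any characters outside a-z and 0-9.

```python
import re


def to_pack_id(display_name):
    """Lowercase a display name and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", display_name.lower())
    return "-".join(words)
```

With this sketch, to_pack_id("my pack") returns "my-pack" and to_pack_id("Mobile Marketing") returns "mobile-marketing".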

Which install path should I use?

Prefer one of these two patterns and stay consistent:

  • Installed workflow: uv tool install beyin, then run beyin ...
  • No-install workflow: run uvx beyin ...

Do not mix them casually across CLI and MCP setup, or you may end up checking one environment and running another. pip install beyin is available if you explicitly want beyin inside a managed Python environment, but it is not the main path.

Windows notes

The main beyin ... commands stay the same on Windows after installation. The examples above use Unix-like shell syntax and config paths in a few places, so if a path does not match your system, use the equivalent settings location for your MCP client on Windows. For media builds, install ffmpeg first, then rerun beyin check-deps.


🧑‍💻 Development

git clone https://github.com/buralog/beyin.git
cd beyin
uv sync

Run commands from the repo:

uv run beyin help

MCP config for a local repo install:

claude mcp add beyin -- uv run beyin mcp-server --cwd /absolute/path/to/beyin

Or manually in your agent's config file:

{
  "mcpServers": {
    "beyin": {
      "command": "uv",
      "args": ["run", "beyin", "mcp-server"],
      "cwd": "/absolute/path/to/beyin"
    }
  }
}

Run tests:

uv run pytest tests/test_cli.py tests/test_mcp_server.py

🔍 Behind the Scenes

  1. beyin fetches or loads your source content
  2. It extracts text or generates transcripts (for audio/video)
  3. It chunks the content into indexed segments
  4. It embeds those chunks into a local vector store
  5. At query time, it retrieves the best-matching chunks using multi-query expansion
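
To give a feel for step 3, here is a minimal Python sketch of word-based chunking with overlap. The chunk size, the overlap, and the word-based splitting are all assumptions chosen for illustration; beyin's real chunker may split on different boundaries and store more metadata.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append({"index": len(chunks), "text": " ".join(words[start:start + size])})
        if start + size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated text.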

beyin uses a multilingual embedding model by default, so it works well across 50+ languages, not just English.

Privacy note: Steps 1–4 are entirely local. At step 5, only the retrieved chunks reach your LLM. For full privacy, use beyin with Ollama so nothing leaves your machine.
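
The idea behind multi-query expansion in step 5 can be sketched in a few lines of Python. This toy version uses bag-of-words vectors in place of a real embedding model, takes the query variants as input instead of generating them, and ranks each chunk by its best score across variants. All three choices are assumptions for illustration; beyin's embedding model, variant generation, and score fusion are not described here.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_multi(query_variants, chunks, top_k=3):
    """Score every chunk against every variant and rank by each chunk's best score."""
    chunk_vectors = [embed(chunk) for chunk in chunks]
    best = [0.0] * len(chunks)
    for variant in query_variants:
        query_vector = embed(variant)
        for i, chunk_vector in enumerate(chunk_vectors):
            best[i] = max(best[i], cosine(query_vector, chunk_vector))
    ranked = sorted(range(len(chunks)), key=lambda i: best[i], reverse=True)
    # Chunks that matched no variant at all are left out.
    return [(chunks[i], best[i]) for i in ranked[:top_k] if best[i] > 0]
```

The benefit shows up when a relevant chunk uses different wording than the original question: a variant such as "welcome flow" can surface a chunk that "onboarding screens" alone would miss.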


🤝 Contributing

Issues and pull requests are welcome at github.com/buralog/beyin.

See CONTRIBUTING.md for pack submissions, pack policy, and code contribution guidelines.


⚖️ Legal

beyin does not host, publish, or redistribute third-party content. Any retrieval, transcription, indexing, or embedding of source material happens locally on the end user's own machine.

Users are responsible for ensuring that their use of beyin complies with applicable laws, copyright rules, and the terms of service of the source platforms.


📄 License

MIT
