Build local, queryable packs from videos, articles, podcasts, and files for MCP and local LLM use.
beyin
base engine for your information nodes
also means “brain” in Turkish.
Build local, queryable packs from videos, articles, podcasts, and local files. Query them through MCP with your AI agent, or explore them directly with a local model.
✨ Features
- 🔗 MCP compatible: works with Claude Code, Codex, Cursor, Windsurf, Zed and more
- 📦 Local-first pipeline: processing, embedding, and storage all happen on your machine
- 🎬 Rich source support: YouTube videos and playlists, podcasts, PDFs, articles, local files
- 🌍 50+ languages: multilingual embedding model out of the box
- 🤖 Ollama support: run fully offline with a local model
- ⚡ Plug and play: one command to connect via MCP, then manage everything by just talking to your agent
- 🎯 Multi-query expansion: generates query variants automatically for better retrieval
⚙️ How it works
The recommended way to use beyin is through MCP with the AI agent you already use.
- Install beyin and connect it to your agent once
- Build a pack from your sources
- Ask questions naturally. Your agent handles retrieval automatically.
Once set up, you can ask your agent to create, build, and manage packs, add sources, check status, and retrieve relevant results, all in plain language. See Example Usage with MCP.
You can also query packs directly with a local model, no external API or agent needed. See Query with a Local Model.
📂 Supported Sources
| Type | Examples |
|---|---|
| Web articles | Public URLs |
| YouTube | Videos and playlists |
| Podcasts | RSS feed URLs |
| Local documents | .pdf, .docx, .pptx, .epub, .xlsx, .csv |
| Local text | .txt, .md, .rst, .html |
| Local audio | .mp3, .m4a, .wav |
| Local video | .mp4, .mov, .mkv, .webm |
beyin is built for local processing on your own machine. Use it with content you are allowed to process, preferably public, permitted sources or material you own or have rights to use. Avoid copied, paywalled, private, restricted, or illegally shared content.
📦 Installation
The recommended way to install beyin is with uv:
uv tool install beyin
uvx beyin check-deps
Why uv / uvx is the main path:
- it installs beyin like a standalone CLI app, instead of mixing it into whatever Python environment you happen to be using
- it avoids the common "it works in my terminal, but my MCP server can't find it" problem
- it gives you one consistent way to run beyin in both CLI and MCP setups:
uvx beyin ...
In practice, that consistency matters a lot for agents like Codex, Claude Code, Cursor, Windsurf, and Zed, because they often launch MCP servers in a different environment from your interactive shell.
If you already manage Python environments carefully and want beyin inside a specific environment, pip install beyin still works:
pip install beyin
beyin check-deps
But unless you specifically need that, prefer the uv tool install + uvx route.
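The "my MCP server can't find it" problem comes down to which executables are visible on the PATH of the launching process. As a rough illustration, the sketch below checks this from Python using only the standard library. The `pick_launcher` helper is hypothetical and not part of beyin; it simply mirrors the preference order recommended above.

```python
import shutil


def pick_launcher(which=shutil.which):
    """Return the argv prefix for running beyin, or None if nothing is found.

    Prefers `uvx beyin` (the recommended route), then a plain `beyin` from pip.
    The `which` parameter exists only so the lookup can be swapped out in tests.
    """
    if which("uvx"):
        return ["uvx", "beyin"]
    if which("beyin"):
        return ["beyin"]
    return None


if __name__ == "__main__":
    launcher = pick_launcher()
    print(launcher or "neither uvx nor beyin is on this process's PATH")
```

Running this from your interactive shell and from the environment your agent uses can show whether the two see the same tools.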
ffmpeg is required for video and audio sources, including local media files. You can skip this step if you only use articles, documents, and text files. To install it:
# macOS
brew install ffmpeg
# Linux
sudo apt install ffmpeg
# Windows
winget install ffmpeg
No Homebrew on macOS or winget not working? Download directly from ffmpeg.org/download.html.
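If you are unsure whether a given local file needs ffmpeg, a small check like the one below can help. It is a sketch only: the extension set is copied from the "Local audio" and "Local video" rows of the Supported Sources table, the function names are hypothetical, and it does not cover YouTube or podcast URLs, which also need ffmpeg.

```python
from pathlib import Path
import shutil

# Extensions taken from the "Local audio" and "Local video" rows above.
MEDIA_EXTENSIONS = {".mp3", ".m4a", ".wav", ".mp4", ".mov", ".mkv", ".webm"}


def needs_ffmpeg(source: str) -> bool:
    """True if a local file path has an audio or video extension."""
    return Path(source).suffix.lower() in MEDIA_EXTENSIONS


def ffmpeg_available() -> bool:
    """True if an `ffmpeg` executable is visible on this process's PATH."""
    return shutil.which("ffmpeg") is not None
```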
🖥️ Using via CLI
You can use beyin directly in your terminal without MCP:
uvx beyin
What happens on first run:
- beyin starts a guided setup flow
- after setup, if you do not have any packs yet, beyin shows a start screen where you can create a new pack or import an existing one
- if setup is already complete and no packs are available, that same start screen is shown again
- if you already have packs, you can manage and build them from the CLI as usual
Useful examples:
uvx beyin
uvx beyin list
uvx beyin build my-pack
uvx beyin settings
uvx beyin check-deps
If you installed with pip, use beyin ... instead of uvx beyin ....
🔌 Connect to Your Agent
You only need to do this once.
If you installed beyin with uv tool install, use uvx beyin mcp-server.
If you installed beyin with pip install, use beyin mcp-server.
The uvx form is recommended because it makes the MCP server use the same tool-managed installation every time.
Claude Code
Recommended (uv tool install beyin):
claude mcp add beyin -- uvx beyin mcp-server
If you installed with pip and normally run beyin directly in your terminal:
claude mcp add beyin -- beyin mcp-server
No config file editing needed, and no need to keep a terminal open. Claude Code launches and manages the server process automatically. Restart Claude Code and beyin will appear in your MCP tools.
To make it available across all your projects:
claude mcp add --scope user beyin -- uvx beyin mcp-server
Codex (OpenAI)
Recommended (uv tool install beyin):
codex mcp add beyin -- uvx beyin mcp-server
If you installed with pip and normally run beyin directly in your terminal:
codex mcp add beyin -- beyin mcp-server
Cursor
Open or create ~/.cursor/mcp.json and add:
{
  "mcpServers": {
    "beyin": {
      "command": "uvx",
      "args": ["beyin", "mcp-server"]
    }
  }
}
Or go to Command Palette → "View: Open MCP Settings".
If you installed with pip, use "command": "beyin" and "args": ["mcp-server"] instead.
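If the config file already lists other MCP servers, the beyin entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge in Python, assuming the `mcpServers` layout shown above. The `add_beyin_server` helper is hypothetical, not something beyin ships.

```python
import json
from pathlib import Path

# The uvx form from this section; for a pip install, use
# {"command": "beyin", "args": ["mcp-server"]} instead.
BEYIN_ENTRY = {"command": "uvx", "args": ["beyin", "mcp-server"]}


def add_beyin_server(config_path: Path) -> dict:
    """Add the beyin entry to an MCP config file, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:  # tolerate an empty file
            config = json.loads(text)
    # Create the "mcpServers" object if missing, then set only the beyin key.
    config.setdefault("mcpServers", {})["beyin"] = dict(BEYIN_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For Cursor, this would be called as `add_beyin_server(Path.home() / ".cursor" / "mcp.json")`. Back up the file first if it contains settings you care about.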
Windsurf
Open or create ~/.codeium/windsurf/mcp_config.json and add:
{
  "mcpServers": {
    "beyin": {
      "command": "uvx",
      "args": ["beyin", "mcp-server"]
    }
  }
}
Or go to Command Palette → "MCP: Add Server".
If you installed with pip, use "command": "beyin" and "args": ["mcp-server"] instead.
Zed
In ~/.config/zed/settings.json:
{
  "context_servers": {
    "beyin": {
      "source": "custom",
      "command": "uvx",
      "args": ["beyin", "mcp-server"]
    }
  }
}
If you installed with pip, use "command": "beyin" and "args": ["mcp-server"] instead.
Any other MCP-compatible agent
Recommended command:
uvx beyin mcp-server
If you installed with pip, use beyin mcp-server instead.
It runs a stdio MCP server, compatible with any agent that supports the MCP protocol.
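For anyone wiring up a custom client, it may help to see what "stdio MCP server" means on the wire. In the MCP stdio transport, the client writes JSON-RPC 2.0 messages to the server's stdin, one per line. The sketch below only builds those messages and does not launch beyin. The client name and version are placeholders, and the protocol version string is one published MCP revision that your client may replace with a newer one.

```python
import json


def frame(message: dict) -> bytes:
    """Encode one JSON-RPC message as a single newline-terminated line."""
    line = json.dumps(message, separators=(",", ":"))
    return (line + "\n").encode("utf-8")


# First message a client sends after starting the server process.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # assumption: pick the revision your client supports
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder values
    },
}

# Sent once the server has answered the initialize request (no id: it is a notification).
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}

# Asks the server which tools it exposes.
list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
```

In practice an MCP client library handles this framing for you. The point is only that no network port is involved: the agent starts the process and talks to it over stdin and stdout.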
💬 Example Usage with MCP
Once beyin is connected through MCP, you can talk to your agent naturally. You do not need to memorize commands or even say "beyin" every time. Just ask for what you want.
Some prompts that mention local files or folders may require your AI agent to have read access to those locations first.
| What you want | What to say |
|---|---|
| Build a new pack | create a pack called "yt-research", add this YouTube playlist: https://youtube.com/playlist?list=..., and build it |
| Add a source | I have a PDF about growth strategy in my Downloads folder, add it to my "mobile-marketing" pack and rebuild |
| Add more sources | add these to my "product-ideas" pack and rebuild: https://example.com/article-1, https://example.com/article-2, https://example.com/article-3 |
| Ask a question | any useful info about onboarding screens in my "mobile marketing" pack? |
| Control the response | ask yt-research pack about building an audience from scratch, include sources and timestamps |
| Check your packs | list my packs and show me their status |
| Ask about a pack | whats the status of mobile marketing pack? and also its sources? |
| Remove a source | remove sources 2 and 3 from mobile marketing pack |
| Remove a pack | remove that pack about tech podcast |
🛠️ MCP Tools Reference
These are the tools beyin exposes to your agent. Your agent uses them automatically; you do not need to call them yourself.
| Tool | What it does |
|---|---|
| `packs` | List all installed packs |
| `status` | Show details and readiness for a pack |
| `retrieve` | Return relevant results for one or more queries |
| `build` | Build or update a pack. Pass sources to build only selected sources by index or range. Automatically purges chunks of removed sources. |
| `add` | Add a pack from a path, URL, or YAML |
| `add_sources` | Add new sources to a pack. Rebuilds automatically for single sources; playlists/feeds are expanded for review first. |
| `remove_sources` | Remove sources by index, range, or text match. Removed chunks stay in the vector store until you rebuild. |
| `remove` | Remove an installed pack (moves to trash) |
| `registry` | Browse the beyin community registry by topic, tag, or keyword |
📋 All Commands
Pack lifecycle
| Command | What it does |
|---|---|
| `uvx beyin create` | Create a new pack interactively |
| `uvx beyin add <path-or-url>` | Import an existing pack from a file or URL |
| `uvx beyin build <pack>` | Build or rebuild a pack |
| `uvx beyin build <pack> --source 1 3 5` | Build only selected sources by index or range |
| `uvx beyin update <pack>` | Fetch new content and rebuild incrementally |
| `uvx beyin remove <pack>` | Remove a pack |
| `uvx beyin list` | List all installed packs |
| `uvx beyin status <pack>` | Show pack details and readiness |
Sources
| Command | What it does |
|---|---|
| `uvx beyin add-source <pack> <url>` | Add a new source to an installed pack |
| `uvx beyin remove-source <pack> 2` | Remove source by index |
| `uvx beyin remove-source <pack> 1 3 5` | Remove multiple sources by index |
| `uvx beyin remove-source <pack> 1-3` | Remove a range of sources |
| `uvx beyin remove-source <pack> "keyword"` | Remove a source by title/URL text match |
| `uvx beyin remove-source <pack> 2 --build` | Remove and rebuild immediately to clean up vector store |
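To make the index and range forms concrete, here is one way such arguments could be expanded into a list of source indices. This is an illustrative sketch, not beyin's actual parser: the function name is hypothetical, it assumes indices start at 1, and it does not handle the text-match form.

```python
def parse_source_selection(tokens: list[str]) -> list[int]:
    """Expand tokens such as ["1", "3-5"] into sorted, de-duplicated indices."""
    selected = set()
    for token in tokens:
        start, sep, end = token.partition("-")
        if sep:  # a range like "1-3", inclusive on both ends
            lo, hi = int(start), int(end)
            if lo > hi:
                raise ValueError(f"descending range: {token!r}")
            selected.update(range(lo, hi + 1))
        else:  # a single index like "2"
            selected.add(int(token))
    if any(i < 1 for i in selected):
        raise ValueError("source indices start at 1")  # assumption, see above
    return sorted(selected)
```

Under these assumptions, `1 3 5` selects three separate sources, while `1-3` selects sources 1, 2, and 3.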
Query
| Command | What it does |
|---|---|
| `uvx beyin query <pack> "question"` | Ask a question directly (requires Ollama) |
Server & config
| Command | What it does |
|---|---|
| `uvx beyin mcp-server` | Start the MCP server |
| `uvx beyin settings` | View and configure settings |
| `uvx beyin check-deps` | Verify runtime dependencies |
| `uvx beyin about` | Version and info |
| `uvx beyin help` | List all commands |
🤖 Query with a Local Model
You can query your packs with a local model using Ollama, without sending anything to an external API. Everything stays on your machine.
If you use beyin through an MCP-connected agent (Claude Code, Codex, etc.), you do not need Ollama. Your agent is the LLM. beyin just retrieves results for it.
Setup:
- Download and install Ollama from ollama.com
- Pull a model:
ollama pull llama3.2 # 2 GB, fast, good for most queries
ollama pull qwen2.5:7b # 4.7 GB, stronger reasoning
- Start Ollama:
ollama serve
- Build a pack and query it:
uvx beyin query my-pack "What does this source say about X?"
To change the model, run uvx beyin settings.
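Before querying, you may want to confirm that Ollama is running and that a model has been pulled. The sketch below asks Ollama's local HTTP API for its model list. It assumes Ollama is listening on its default address, `localhost:11434`; the helper names are hypothetical and independent of beyin.

```python
import json
import urllib.error
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local address


def model_names(payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]


def installed_models(url: str = OLLAMA_TAGS_URL, timeout: float = 2.0):
    """Return the list of pulled models, or None if Ollama is not reachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return model_names(json.load(response))
    except (urllib.error.URLError, OSError, ValueError):
        return None
```

A result of `None` suggests `ollama serve` is not running, and an empty list suggests no model has been pulled yet.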
🔧 Troubleshooting
Pack is not queryable yet
uvx beyin status my-pack
uvx beyin build my-pack
A partially ready pack is still queryable: sources that built successfully are available, and rebuilding recovers any failed sources.
MCP is connected but retrieval is not working
- Make sure the pack was built: `uvx beyin status my-pack`
- Restart your agent after adding beyin for the first time
- Verify the server is registered: `claude mcp list`
- Make sure the same beyin installation is used by both CLI and the MCP server
Audio and video builds are slow
beyin uses Whisper to transcribe audio and video sources. The model size controls the trade-off between speed and accuracy. OpenAI’s official model family also includes English-only .en variants through medium.en, which are useful when you know the audio is only English.
| Model | Type | Download size | Speed | Accuracy | Best for |
|---|---|---|---|---|---|
| `tiny` | multilingual | ~75 MB | fastest | lowest | Quick tests, clean audio, mixed-language detection |
| `tiny.en` | English-only | ~75 MB | fastest | low | Fastest English-only transcripts |
| `base` | multilingual | ~145 MB | fast | low | Simple podcasts, lightweight multilingual audio |
| `base.en` | English-only | ~145 MB | fast | low+ | English podcasts and interviews |
| `small` | multilingual | ~483 MB | moderate | good | Most use cases, multilingual content |
| `small.en` | English-only | ~483 MB | moderate | good+ | Strong default for English-only speech |
| `medium` | multilingual | ~1.5 GB | slow | better | Harder English, multilingual, accented, or noisy audio |
| `medium.en` | English-only | ~1.5 GB | slow | better+ | Higher English accuracy without multilingual support |
| `large` | multilingual | ~3 GB | slowest | best | Maximum accuracy, difficult audio |
The default model is small. To use a faster or English-only model, change it in settings:
uvx beyin settings
Or pass it per build:
uvx beyin build my-pack --model small
small is a good default for most content. If your audio is strictly English, small.en is a good English-only alternative at the same size. Use medium, medium.en, or large for harder audio.
Video or audio builds fail
- Check that `ffmpeg` is installed: `ffmpeg -version`
- Check that `yt-dlp` is installed and current: `yt-dlp --version`
- Make sure the source URL is still reachable
Pack name with spaces is not recognized
Pack IDs use kebab-case, not spaces. Use my-pack instead of my pack. The display name can be anything, but the ID used in commands must be kebab-case.
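As an illustration of the naming rule, the sketch below turns a display name into a kebab-case string. It is not beyin's own logic: the `to_pack_id` helper is hypothetical, and it keeps only ASCII letters and digits, so accented or non-Latin characters are dropped.

```python
import re


def to_pack_id(display_name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", display_name.lower())
    return "-".join(words)
```

For example, `to_pack_id("My Pack")` gives `my-pack`, which is the form to use in commands.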
Which python / pip should I use?
Use the same installation path for both CLI commands and the MCP server:
- If you installed with `pip install beyin`, use `beyin ...`
- If you installed with `uv tool install beyin`, use `uvx beyin ...`
Mixing them can make the CLI and MCP server point at different environments.
🧑‍💻 Development
git clone https://github.com/buralog/beyin.git
cd beyin
uv sync
Run commands from the repo:
uv run beyin help
MCP config for a local repo install:
claude mcp add beyin -- uv run beyin mcp-server --cwd /absolute/path/to/beyin
Or manually in your agent's config file:
{
  "mcpServers": {
    "beyin": {
      "command": "uv",
      "args": ["run", "beyin", "mcp-server"],
      "cwd": "/absolute/path/to/beyin"
    }
  }
}
Run tests:
uv run pytest tests/test_cli.py tests/test_mcp_server.py
🔍 Behind the Scenes
1. beyin fetches or loads your source content
2. It extracts text or generates transcripts (for audio/video)
3. It chunks the content into indexed segments
4. It embeds those chunks into a local vector store
5. At query time, it retrieves the best-matching chunks using multi-query expansion
beyin uses a multilingual embedding model by default, so it works well across 50+ languages, not just English.
Privacy note: Steps 1–4 are entirely local. At step 5, only the retrieved chunks reach your LLM. For full privacy, use beyin with Ollama so nothing leaves your machine.
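The chunk, embed, and retrieve steps can be sketched in a few lines of Python. This is a toy model under loud assumptions, not beyin's implementation: word counts stand in for the real multilingual embedding model, an in-memory list stands in for the vector store, and the query variants are supplied by hand rather than generated automatically. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real pipeline uses a neural model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query_variants: list[str], chunks: list[str], k: int = 2) -> list[str]:
    """Score each chunk by its best match across all query variants, return the top k."""
    vectors = [embed(c) for c in chunks]
    queries = [embed(q) for q in query_variants]
    scored = [(max(cosine(q, v) for q in queries), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by position
    return [chunks[i] for _, i in scored[:k]]
```

The idea behind multi-query expansion is visible in `retrieve`: a chunk that matches any one phrasing of the question can surface, even if it shares no words with the original wording.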
🤝 Contributing
Issues and pull requests are welcome at github.com/buralog/beyin.
See CONTRIBUTING.md for pack submissions, pack policy, and code contribution guidelines.
⚖️ Legal
beyin does not host, publish, or redistribute third-party content. Any retrieval, transcription, indexing, or embedding of source material happens locally on the end user's own machine.
Users are responsible for ensuring that their use of beyin complies with applicable laws, copyright rules, and the terms of service of the source platforms.
📄 License
MIT