AI-powered semantic search and chat for Obsidian notes
Obsidian-AI
A command-line AI assistant that chats with your personal knowledge base using OpenAI's GPT models. Search, read, and semantically explore your notes with natural language queries.
Features
- Smart Search: Keyword and semantic search across your note collection
- Safe File Access: Read-only operations with directory sandboxing
- Interactive Chat: Both single-query and REPL modes
- Local Embeddings: TF-IDF-based semantic search with local caching
- Rich Output: Beautiful terminal UI with syntax highlighting
Quick Start
Installation
# Clone and install
git clone <repository-url>
cd obsidian-ai
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
Configuration
export OPENAI_API_KEY="your-api-key-here"
export OBSIDIAN_AI_BRAIN_DIR="$HOME/brain" # Optional: defaults to ~/brain
export OBSIDIAN_AI_MODEL="gpt-4o" # Optional: defaults to gpt-4o
export OBSIDIAN_AI_IGNORE_PATTERNS="*.tmp,cache/*,30. Areas/Roleplay" # Optional: comma-separated ignore patterns
Usage
# Single question
obsidian-ai chat "What notes do I have about machine learning?"
# Interactive chat
obsidian-ai repl
# Direct search
obsidian-ai search "project ideas"
# Read specific file
obsidian-ai read "projects/ai-assistant.md"
# Ignore specific patterns for this session
obsidian-ai --ignore "temp/*" --ignore "*.draft" chat "What are my project ideas?"
How It Works
Obsidian-AI gives your chosen GPT model three tools for exploring your notes:
- `search(query)` - Keyword search across filenames and content
- `read_file(path)` - Safe file reading with byte-range support
- `semantic_search(query)` - Similarity search using local TF-IDF embeddings
The assistant uses these tools to ground its responses in your actual notes, providing specific file citations and relevant excerpts.
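The tool loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `tools.py`: the `NOTES` dictionary, `dispatch`, and `run_tool_calls` are hypothetical names, and the notes live in memory rather than on disk. It assumes tool arguments arrive as a JSON string, as they do in OpenAI's function-calling responses, and that the call budget mirrors `OBSIDIAN_AI_MAX_TOOL_CALLS`.

```python
import json

# Hypothetical in-memory stand-in for a notes directory.
NOTES = {
    "projects/ai-assistant.md": "Ideas for an AI assistant that reads notes.",
    "recipes/pasta.md": "Tomato sauce and basil.",
}


def search(query):
    """Keyword search across filenames and content (case-insensitive)."""
    q = query.lower()
    return [p for p, text in NOTES.items() if q in p.lower() or q in text.lower()]


def read_file(path):
    """Return a note's text, or an error message for unknown paths."""
    return NOTES.get(path, f"error: no such file: {path}")


# Registry mapping tool names (as the model sees them) to Python callables.
TOOLS = {"search": search, "read_file": read_file}


def dispatch(name, arguments_json):
    """Run one tool call and return a JSON string to send back to the model."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
        result = TOOLS[name](**args)
    except (TypeError, ValueError) as exc:
        # Malformed JSON or wrong argument names are reported, not raised.
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})


def run_tool_calls(calls, max_tool_calls=5):
    """Execute (name, arguments_json) pairs in order, stopping at the budget."""
    return [dispatch(name, args) for name, args in calls[:max_tool_calls]]
```

Returning errors as data rather than raising lets the model see what went wrong and retry with corrected arguments.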
Architecture
src/obsidian_ai/
├── cli.py # Command-line interface
├── chat.py # OpenAI chat integration with tool calling
├── config.py # Environment configuration
├── tools.py # Tool definitions and dispatch
├── search.py # Keyword search implementation
├── semsearch.py # Semantic search with local embeddings
├── local_embed.py # TF-IDF vectorizer implementation
└── fs.py # File system utilities
Supported File Types
- Markdown (`.md`)
- Text files (`.txt`)
- Org-mode (`.org`)
- reStructuredText (`.rst`)
- Code files (`.py`, `.js`, `.ts`, `.java`, `.go`)
- Data files (`.csv`, `.json`, `.yaml`, `.yml`)
Safety & Security
- Read-only: No file modification capabilities
- Directory sandboxing: File access restricted to configured brain directory
- No secrets in code: API keys only via environment variables
- Size limits: Files over 2MB are skipped to prevent abuse
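One common way to implement directory sandboxing and a size limit is sketched below. It is an illustration under assumptions, not the project's `fs.py`: `resolve_in_brain`, `safe_read`, and `SandboxError` are hypothetical names, and the 2 MB threshold is taken from the list above. The sketch resolves symlinks and `..` segments before checking containment, and needs Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

MAX_FILE_BYTES = 2 * 1024 * 1024  # 2 MB limit, as stated above


class SandboxError(ValueError):
    """Raised when a requested path escapes the brain directory."""


def resolve_in_brain(brain_dir, relative_path):
    """Resolve relative_path under brain_dir, rejecting any escape."""
    root = Path(brain_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below sees the real target location.
    candidate = (root / relative_path).resolve()
    if not candidate.is_relative_to(root):
        raise SandboxError(f"path escapes brain directory: {relative_path}")
    return candidate


def safe_read(brain_dir, relative_path):
    """Read a note as text; return None for missing or oversized files."""
    path = resolve_in_brain(brain_dir, relative_path)
    if not path.is_file():
        return None
    if path.stat().st_size > MAX_FILE_BYTES:
        return None  # skipped, mirroring the size limit above
    return path.read_text(encoding="utf-8", errors="replace")
```

Absolute paths are rejected as well: joining an absolute path onto `root` discards `root`, so the containment check fails.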
Configuration Options
| Environment Variable | Default | Description |
|---|---|---|
| `OBSIDIAN_AI_BRAIN_DIR` | `~/brain` | Directory containing your notes |
| `OBSIDIAN_AI_MODEL` | `gpt-4o` | OpenAI model to use |
| `OBSIDIAN_AI_MAX_TOOL_CALLS` | `5` | Maximum tool calls per query |
| `OBSIDIAN_AI_IGNORE_PATTERNS` | Built-in defaults | Comma-separated patterns to ignore |
| `OPENAI_API_KEY` | required | Your OpenAI API key |
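To make the defaults concrete, here is a small sketch of how these variables could be read into a settings object. The `Settings` class and `load_settings` function are hypothetical and not taken from the project's `config.py`. How user-supplied ignore patterns combine with the built-in defaults is not specified here, so the sketch only parses the user-supplied list.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    api_key: str
    brain_dir: Path
    model: str
    max_tool_calls: int
    ignore_patterns: tuple


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    # Comma-separated list; surrounding whitespace and empty entries are dropped.
    raw = env.get("OBSIDIAN_AI_IGNORE_PATTERNS", "")
    patterns = tuple(p.strip() for p in raw.split(",") if p.strip())
    return Settings(
        api_key=api_key,
        brain_dir=Path(env.get("OBSIDIAN_AI_BRAIN_DIR", "~/brain")).expanduser(),
        model=env.get("OBSIDIAN_AI_MODEL", "gpt-4o"),
        max_tool_calls=int(env.get("OBSIDIAN_AI_MAX_TOOL_CALLS", "5")),
        ignore_patterns=patterns,
    )
```

Accepting an explicit mapping instead of always reading `os.environ` keeps the loader easy to test.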
Advanced Usage
Ignore Patterns
Control which directories and files are excluded from search:
# Environment variable (persistent)
export OBSIDIAN_AI_IGNORE_PATTERNS="30. Areas/Roleplay,temp/*,*.draft,private/*"
# Command-line flags (session-only)
obsidian-ai --ignore "temp/*" --ignore "*.draft" search "project ideas"
obsidian-ai --ignore "30. Areas/Roleplay" chat "Tell me about my notes"
Built-in ignore patterns:
- `.git`, `.obsidian`, `.obsidian_ai_cache`
- `node_modules`, `__pycache__`
- `.DS_Store`, `Thumbs.db`
Pattern matching:
- `*` matches any characters: `temp/*` ignores anything in temp directories
- `*.ext` matches files with specific extensions
- `dirname` matches exact directory names anywhere in the path
- `path/to/dir` matches specific paths relative to brain directory
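The matching rules above can be approximated with Python's standard `fnmatch` module. The sketch below is one plausible reading of those rules, not the project's actual matcher: `is_ignored` is a hypothetical helper, patterns containing `/` are assumed to match against the path relative to the brain directory (or any leading directory prefix of it), and patterns without `/` are assumed to match any single path component.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(rel_path, patterns):
    """Return True if rel_path (relative to the brain dir) matches a pattern."""
    parts = PurePosixPath(rel_path).parts
    # Every leading prefix of the path: "a", "a/b", "a/b/c.md", ...
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    for pattern in patterns:
        pattern = pattern.rstrip("/")
        if "/" in pattern:
            # Path-style pattern: compare against the relative path and its
            # parent directories, so "path/to/dir" also covers its contents.
            if any(fnmatchcase(prefix, pattern) for prefix in prefixes):
                return True
        elif any(fnmatchcase(part, pattern) for part in parts):
            # Name-style pattern: matches a directory or file name at any depth.
            return True
    return False


patterns = ["temp/*", "*.draft", ".git", "30. Areas/Roleplay"]
print(is_ignored("temp/scratch.md", patterns))
print(is_ignored("projects/ai-assistant.md", patterns))
```

Note that `fnmatch`'s `*` also matches `/`, so under this reading `temp/*` covers nested files such as `temp/a/b.md`.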
Semantic Search Caching
Semantic search builds a local TF-IDF index and caches it in `.obsidian_ai_cache/`. The index is rebuilt automatically when files change.
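As an illustration of the idea, the sketch below builds a tiny TF-IDF index, ranks notes by cosine similarity, and rebuilds the index only when a fingerprint of the notes changes. Everything here is an assumption made for the example: the function names are hypothetical, notes are passed in as a `{name: text}` dictionary instead of being read from disk, the cache is an in-memory dictionary rather than files under `.obsidian_ai_cache/`, and the fingerprint hashes contents where a real implementation might compare modification times.

```python
import hashlib
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def fingerprint(docs):
    """Hash of note names and contents; changes whenever any note changes."""
    h = hashlib.sha256()
    for name in sorted(docs):
        h.update(name.encode("utf-8") + b"\0")
        h.update(docs[name].encode("utf-8") + b"\0")
    return h.hexdigest()


def build_index(docs):
    """Compute smoothed IDF weights and one sparse TF-IDF vector per note."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        vectors[name] = {t: (c / len(toks)) * idf[t] for t, c in tf.items()}
    return {"fingerprint": fingerprint(docs), "idf": idf, "vectors": vectors}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def semantic_search(query, docs, cache=None, k=3):
    """Return (ranked results, cache); the cache is reused only if still fresh."""
    if cache is None or cache["fingerprint"] != fingerprint(docs):
        cache = build_index(docs)
    q_tf = Counter(tokenize(query))
    q_vec = {t: c * cache["idf"][t] for t, c in q_tf.items() if t in cache["idf"]}
    scored = [(cosine(q_vec, vec), name) for name, vec in cache["vectors"].items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:k] if score > 0], cache
```

Passing the returned cache back into the next call avoids rebuilding the index as long as no note has been added, removed, or edited.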
File Reading with Ranges
# Read first 1KB of a large file
obsidian-ai read "large-document.md" --start 0 --max-bytes 1024
# Read from specific byte offset
obsidian-ai read "large-document.md" --start 1024 --max-bytes 2048
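The byte-range behavior behind `--start` and `--max-bytes` can be sketched with a plain binary read. The `read_range` function and its default `max_bytes` value are hypothetical. Decoding with `errors="replace"` is a choice made for this sketch so that a range cutting through a multi-byte UTF-8 character does not raise.

```python
def read_range(path, start=0, max_bytes=4096):
    """Return up to max_bytes of the file at path, starting at byte offset start."""
    if start < 0 or max_bytes < 0:
        raise ValueError("start and max_bytes must be non-negative")
    with open(path, "rb") as handle:
        handle.seek(start)
        data = handle.read(max_bytes)
    # A range may split a multi-byte character; replace instead of failing.
    return data.decode("utf-8", errors="replace")
```

Seeking past the end of the file is not an error here; the function simply returns an empty string.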
Verbose Logging
# Enable debug logging
obsidian-ai -v chat "your question"
obsidian-ai -vv repl # Even more verbose
Development
Testing
# Run tests
uv run pytest tests/
# Run specific test
uv run pytest tests/test_search.py -v
Project Structure
The codebase follows bacterial coding principles - small, modular, self-contained functions that could easily be copied and reused. Each module has a single clear purpose:
- `fs.py` - File system iteration
- `search.py` - Text search logic
- `local_embed.py` - Embedding vectorization
- `semsearch.py` - Semantic search coordination
- `tools.py` - OpenAI tool integration
- `chat.py` - Conversation management
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
License
MIT License - see LICENSE for details.
Author
Created by Sumuk Shashidhar