AI-powered semantic search and chat for Obsidian notes

Obsidian-AI

A command-line AI assistant that chats with your personal knowledge base using OpenAI's GPT models. Search, read, and semantically explore your notes with natural language queries.

Features

Smart Search: Keyword and semantic search across your note collection
Safe File Access: Read-only operations with directory sandboxing
Interactive Chat: Both single-query and REPL modes
Local Embeddings: TF-IDF-based semantic search with local caching
Rich Output: Beautiful terminal UI with syntax highlighting

Quick Start

Installation

# Clone and install
git clone <repository-url>
cd obsidian-ai
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e .

Configuration

export OPENAI_API_KEY="your-api-key-here"
export OBSIDIAN_AI_BRAIN_DIR="$HOME/brain"              # Optional: defaults to ~/brain
export OBSIDIAN_AI_MODEL="gpt-4o"                       # Optional: defaults to gpt-4o
export OBSIDIAN_AI_IGNORE_PATTERNS="*.tmp,cache/*,30. Areas/Roleplay"  # Optional: comma-separated ignore patterns
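
For readers who want to see how these variables might be consumed, here is a minimal Python sketch. The `Settings` class and `load_settings` function are hypothetical names chosen for illustration and are not taken from the project's `config.py`; only the variable names and the `~/brain` and `gpt-4o` defaults come from the lines above.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; not the project's actual class."""
    api_key: str
    brain_dir: Path
    model: str
    ignore_patterns: tuple[str, ...]


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    # Assumption: an unset variable yields no extra patterns here; the real
    # tool falls back to its built-in defaults instead.
    raw_patterns = env.get("OBSIDIAN_AI_IGNORE_PATTERNS", "")
    patterns = tuple(p.strip() for p in raw_patterns.split(",") if p.strip())
    return Settings(
        api_key=api_key,
        brain_dir=Path(env.get("OBSIDIAN_AI_BRAIN_DIR", "~/brain")).expanduser(),
        model=env.get("OBSIDIAN_AI_MODEL", "gpt-4o"),
        ignore_patterns=patterns,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.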

Usage

# Single question
obsidian-ai chat "What notes do I have about machine learning?"

# Interactive chat
obsidian-ai repl

# Direct search
obsidian-ai search "project ideas"

# Read specific file
obsidian-ai read "projects/ai-assistant.md"

# Ignore specific patterns for this session
obsidian-ai --ignore "temp/*" --ignore "*.draft" chat "What are my project ideas?"

How It Works

Obsidian-AI provides your chosen GPT model with three powerful tools to explore your notes:

search(query) - Keyword search across filenames and content
read_file(path) - Safe file reading with byte-range support
semantic_search(query) - Similarity search using local TF-IDF embeddings

The assistant uses these tools to ground its responses in your actual notes, providing specific file citations and relevant excerpts.
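
The tool-calling loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `tools.py`: the three tool bodies are placeholders, the `TOOLS` table and `dispatch` function are hypothetical names, and the `max_bytes` default is invented for the example. The idea is that the model names a tool and supplies JSON arguments, and the program looks the tool up, runs it, and returns a JSON string.

```python
import json


def search(query: str) -> list[str]:
    """Placeholder for keyword search; returns no hits in this sketch."""
    return []


def read_file(path: str, start: int = 0, max_bytes: int = 4096) -> str:
    """Placeholder for sandboxed file reading (max_bytes default is assumed)."""
    return ""


def semantic_search(query: str) -> list[str]:
    """Placeholder for TF-IDF similarity search."""
    return []


# Hypothetical registry mapping tool names to callables.
TOOLS = {"search": search, "read_file": read_file, "semantic_search": semantic_search}


def dispatch(name: str, arguments_json: str) -> str:
    """Run one tool call and return a JSON string for the model to read."""
    tool = TOOLS.get(name)
    if tool is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        # Invalid JSON raises ValueError; wrong argument names raise TypeError.
        result = tool(**json.loads(arguments_json))
    except (TypeError, ValueError) as exc:
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})
```

Returning errors as data rather than raising lets the model see what went wrong and retry with corrected arguments.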

Architecture

src/obsidian_ai/
├── cli.py          # Command-line interface
├── chat.py         # OpenAI chat integration with tool calling
├── config.py       # Environment configuration
├── tools.py        # Tool definitions and dispatch
├── search.py       # Keyword search implementation
├── semsearch.py    # Semantic search with local embeddings
├── local_embed.py  # TF-IDF vectorizer implementation
└── fs.py           # File system utilities

Supported File Types

Markdown (.md)
Text files (.txt)
Org-mode (.org)
reStructuredText (.rst)
Code files (.py, .js, .ts, .java, .go)
Data files (.csv, .json, .yaml, .yml)

Safety & Security

Read-only: No file modification capabilities
Directory sandboxing: File access restricted to configured brain directory
No secrets in code: API keys only via environment variables
Size limits: Files over 2 MB are skipped to prevent abuse
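
The sandboxing and size-limit rules above could be enforced along these lines. This is a sketch, assuming Python 3.9 or later; `resolve_in_sandbox`, `is_readable_note`, and `MAX_FILE_BYTES` are hypothetical names and do not come from the project's `fs.py`.

```python
from pathlib import Path

MAX_FILE_BYTES = 2 * 1024 * 1024  # 2 MB limit taken from the list above


def resolve_in_sandbox(brain_dir: Path, relative_path: str) -> Path:
    """Resolve relative_path and refuse anything outside brain_dir."""
    root = brain_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / relative_path).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise PermissionError(f"{relative_path!r} escapes the brain directory")
    return candidate


def is_readable_note(path: Path) -> bool:
    """True for regular files at or under the size limit."""
    return path.is_file() and path.stat().st_size <= MAX_FILE_BYTES
```

Resolving both the root and the candidate before comparing them is what stops `../` segments and symlinks from reaching files outside the brain directory.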

Configuration Options

Environment Variable	Default	Description
`OBSIDIAN_AI_BRAIN_DIR`	`~/brain`	Directory containing your notes
`OBSIDIAN_AI_MODEL`	`gpt-4o`	OpenAI model to use
`OBSIDIAN_AI_MAX_TOOL_CALLS`	`5`	Maximum tool calls per query
`OBSIDIAN_AI_IGNORE_PATTERNS`	Built-in defaults	Comma-separated patterns to ignore
`OPENAI_API_KEY`	required	Your OpenAI API key

Advanced Usage

Ignore Patterns

Control which directories and files are excluded from search:

# Environment variable (persistent)
export OBSIDIAN_AI_IGNORE_PATTERNS="30. Areas/Roleplay,temp/*,*.draft,private/*"

# Command-line flags (session-only)
obsidian-ai --ignore "temp/*" --ignore "*.draft" search "project ideas"
obsidian-ai --ignore "30. Areas/Roleplay" chat "Tell me about my notes"

Built-in ignore patterns:

.git, .obsidian, .obsidian_ai_cache
node_modules, __pycache__
.DS_Store, Thumbs.db

Pattern matching:

* matches any characters: temp/* ignores anything in temp directories
*.ext matches files with specific extensions
dirname matches exact directory names anywhere in the path
path/to/dir matches specific paths relative to brain directory
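
One way to read the matching rules above is sketched below with the standard-library `fnmatch` module. The `is_ignored` function is a hypothetical helper, not the project's implementation. It assumes that patterns without a slash are tested against every path component, and that patterns containing a slash are anchored at the brain directory; the real tool may treat `temp/*` more loosely.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_ignored(relative_path: str, patterns: list[str]) -> bool:
    """Return True if a POSIX-style path relative to the brain dir matches any pattern."""
    parts = PurePosixPath(relative_path).parts
    for pattern in patterns:
        if "/" not in pattern:
            # Bare names and globs such as "*.draft" are tested against every component.
            if any(fnmatch(part, pattern) for part in parts):
                return True
            continue
        # Patterns with a slash are tested against each leading prefix of the path,
        # so "private/*" or "30. Areas/Roleplay" also covers files nested below.
        for end in range(1, len(parts) + 1):
            if fnmatch("/".join(parts[:end]), pattern):
                return True
    return False
```

For example, `is_ignored("temp/a/b.md", ["temp/*"])` is true because the prefix `temp/a` matches the pattern, while `projects/ai-assistant.md` matches none of the patterns shown in this section.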

Semantic Search Caching

Semantic search builds a local TF-IDF index and caches it in .obsidian_ai_cache/. The index is rebuilt automatically when files change.
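
How the tool detects changed files is not documented here. A common approach, shown below purely as an assumption, is to fingerprint the note collection by path, size, and modification time, and to rebuild when the stored fingerprint no longer matches. `corpus_fingerprint`, `index_is_stale`, and the stamp file are hypothetical.

```python
import hashlib
from pathlib import Path

CACHE_DIR_NAME = ".obsidian_ai_cache"  # cache directory named in this section


def corpus_fingerprint(brain_dir: Path, suffixes=(".md", ".txt")) -> str:
    """Hash each note's relative path, size, and modification time into one digest."""
    digest = hashlib.sha256()
    for path in sorted(brain_dir.rglob("*")):
        if CACHE_DIR_NAME in path.parts:
            continue  # the cache must not influence its own fingerprint
        if path.is_file() and path.suffix in suffixes:
            stat = path.stat()
            rel = path.relative_to(brain_dir).as_posix()
            digest.update(f"{rel}|{stat.st_size}|{stat.st_mtime_ns}\n".encode("utf-8"))
    return digest.hexdigest()


def index_is_stale(brain_dir: Path, stamp_file: Path) -> bool:
    """Compare the current fingerprint with the one stored next to the cached index."""
    if not stamp_file.exists():
        return True
    return stamp_file.read_text(encoding="utf-8").strip() != corpus_fingerprint(brain_dir)
```

Hashing metadata instead of file contents keeps the check cheap on large vaults, at the cost of missing edits that preserve both size and timestamp.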

File Reading with Ranges

# Read first 1KB of a large file
obsidian-ai read "large-document.md" --start 0 --max-bytes 1024

# Read from specific byte offset
obsidian-ai read "large-document.md" --start 1024 --max-bytes 2048
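
In Python terms, a byte-range read like the ones above amounts to a seek followed by a bounded read. The sketch below is illustrative only: `read_range` is a hypothetical helper, and decoding with `errors="replace"` is an assumption about how a split multi-byte character might be handled.

```python
from pathlib import Path


def read_range(path: Path, start: int = 0, max_bytes: int = 1024) -> str:
    """Return up to max_bytes of the file starting at byte offset start, decoded as UTF-8."""
    if start < 0 or max_bytes < 0:
        raise ValueError("start and max_bytes must be non-negative")
    with path.open("rb") as handle:
        handle.seek(start)
        chunk = handle.read(max_bytes)
    # A byte range can cut a multi-byte character in half; replace rather than fail.
    return chunk.decode("utf-8", errors="replace")
```

Reading past the end of the file is not an error in this sketch: the result is simply shorter than `max_bytes`, or empty.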

Verbose Logging

# Enable debug logging
obsidian-ai -v chat "your question"
obsidian-ai -vv repl  # Even more verbose

Development

Testing

# Run tests
uv run pytest tests/

# Run specific test
uv run pytest tests/test_search.py -v

Project Structure

The codebase follows bacterial coding principles - small, modular, self-contained functions that could easily be copied and reused. Each module has a single clear purpose:

fs.py - File system iteration
search.py - Text search logic
local_embed.py - Embedding vectorization
semsearch.py - Semantic search coordination
tools.py - OpenAI tool integration
chat.py - Conversation management
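
To give a feel for what a small, self-contained TF-IDF vectorizer and ranker might look like, here is a standard-library sketch. It is not the code in `local_embed.py` or `semsearch.py`; the function names, the tokenizer, and the smoothed IDF formula are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]):
    """Return (idf, vectors), where vectors maps each doc name to a sparse TF-IDF dict."""
    counts = {name: Counter(tokenize(body)) for name, body in docs.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n = len(docs)
    # Smoothed inverse document frequency, so no term gets a zero or infinite weight.
    idf = {term: math.log((1 + n) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    vectors = {name: {t: tf * idf[t] for t, tf in c.items()} for name, c in counts.items()}
    return idf, vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def rank(query: str, idf, vectors, top_k: int = 3) -> list[tuple[str, float]]:
    """Score every document against the query and return the top_k best matches."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(name, cosine(q, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]
```

Each function takes plain dictionaries and returns plain values, which fits the copy-and-reuse style described above.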

Contributing

Fork the repository
Create a feature branch
Add tests for new functionality
Ensure all tests pass
Submit a pull request

License

MIT License - see LICENSE for details.

Author

Created by Sumuk Shashidhar

Release history

0.1.2 - Aug 19, 2025
0.1.1 - Aug 14, 2025
0.1.0 - Aug 14, 2025

