A Python library for parsing Obsidian Markdown (.md) files and vaults.
obsidianmd-parser
A Python package for parsing Obsidian Markdown vaults and notes, with support for Obsidian's built-in markdown format and Dataview queries.
Features
- Complete Vault Parsing: Load and parse entire Obsidian vaults
- Note Object Model: Work with notes as Python objects with attributes and methods
- Obsidian Markdown Support:
- Wikilinks ([[links]] and [[links|aliases]])
- Tags (#tag, #nested/tag)
- Task lists with status tracking
- Obsidian callouts
- Relationship Tracking: Analyze backlinks and relationships between notes
- Dataview Support:
- Parse Dataview queries from notes
- Evaluate Dataview queries programmatically
- Search Capabilities:
- Exact search for notes
- Similarity search using various algorithms
- Code Block Handling: Correctly excludes parsing within code blocks
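To make the wikilink, tag, and code-block features above concrete, here is a minimal standalone sketch using only the standard library. It is not the package's implementation: the regular expressions and the helper names (strip_code, extract_wikilinks, extract_tags) are assumptions chosen for illustration, and the tag rule is deliberately simplified.

```python
import re
from typing import Optional

FENCE = "`" * 3  # a Markdown code fence, built indirectly so this example stays readable

# Fenced blocks are removed first, then inline code spans.
FENCED_CODE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
INLINE_CODE = re.compile(r"`[^`\n]*`")
# Matches [[target]] or [[target|alias]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
# Matches #tag or #nested/tag. Simplification: a tag must start with a letter
# and must not directly follow a word character or another '#'.
TAG = re.compile(r"(?<![\w#])#([A-Za-z][\w/-]*)")


def strip_code(markdown: str) -> str:
    """Remove fenced and inline code so it is not scanned for links or tags."""
    without_fences = FENCED_CODE.sub("", markdown)
    return INLINE_CODE.sub("", without_fences)


def extract_wikilinks(markdown: str) -> list[tuple[str, Optional[str]]]:
    """Return (target, alias) pairs; alias is None when no '|' is present."""
    return [
        (m.group(1).strip(), m.group(2).strip() if m.group(2) else None)
        for m in WIKILINK.finditer(strip_code(markdown))
    ]


def extract_tags(markdown: str) -> list[str]:
    """Return tag names without the leading '#'."""
    return TAG.findall(strip_code(markdown))


sample = "\n".join([
    "See [[Projects]] and [[Reading List|books]]. #idea #area/research",
    FENCE + "python",
    "print('[[not a link]] #nottag')",
    FENCE,
    "Inline `[[also ignored]]` code.",
])

print(extract_wikilinks(sample))
print(extract_tags(sample))
```

Stripping code before matching keeps the link and tag patterns simple, at the cost of losing character offsets into the original text.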
Installation
pip install obsidianmd-parser
Quick Start
from obsidian_parser import Vault
# Load a vault
vault = Vault("path/to/your/obsidian/vault")
# Find notes by exact name
note = vault.get_note("My Note")
# Search notes by similarity
similar_notes = vault.find_notes("machine learning", case_sensitive=False)
# Access note properties
print(note.title)
print(note.tags)
print(note.wikilinks)
print(note.tasks)
# Work with relationships
backlinks = note.get_backlinks(vault=vault)
related = note.get_forward_links(vault=vault)
most_linked = note.get_most_linked()
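The relationship between forward links and backlinks can be sketched without the package: backlinks are the forward-link mapping inverted. The function build_backlinks and the sample data below are hypothetical and only illustrate the idea; they do not describe how the package computes get_backlinks().

```python
from collections import defaultdict


def build_backlinks(forward_links: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert a note -> targets mapping into a target -> sources mapping."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for source, targets in forward_links.items():
        for target in targets:
            # Record each source only once, even if it links several times.
            if source not in backlinks[target]:
                backlinks[target].append(source)
    return dict(backlinks)


# Hypothetical vault: note title -> titles it links to.
forward_links = {
    "Index": ["Machine Learning", "Reading List"],
    "Machine Learning": ["Reading List"],
    "Reading List": [],
}

backlinks = build_backlinks(forward_links)
print(backlinks["Reading List"])
```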
Core API
Vault
The Vault class represents an entire Obsidian vault:
# lazy_load = notes are parsed only when accessed (default: True)
vault = Vault("path/to/vault", lazy_load=True)
# Search and retrieval
note = vault.get_note("Note Title")
notes = vault.find_similar_notes("search query", threshold=0.5)
# Vault analysis
note_graph = vault.get_note_graph() # Produces a note graph tuple object
dataview_usage = vault.analyze_dataview_usage() # Get vault statistics for dataview queries
broken_links = vault.find_broken_links() # Finds all broken links in the vault
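As a rough illustration of what a broken-link check involves, the sketch below flags link targets that do not correspond to any note in the vault. It works on a plain dictionary and does not use the package; the name broken_links_by_note is hypothetical, and matching titles exactly (case-sensitive, no paths or headings) is a simplifying assumption.

```python
def broken_links_by_note(forward_links: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each note to the link targets that do not exist in the vault."""
    existing = set(forward_links)
    broken: dict[str, list[str]] = {}
    for source, targets in forward_links.items():
        missing = [target for target in targets if target not in existing]
        if missing:
            broken[source] = missing
    return broken


# Hypothetical vault: "Archive" and "Roadmap" are linked but never created.
vault_links = {
    "Index": ["Projects", "Archive"],
    "Projects": ["Index", "Roadmap"],
}

print(broken_links_by_note(vault_links))
```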
Note
The Note class represents an individual note:
# Access note metadata
note.title # Note title
note.path # File path
note.content # Raw markdown content
note.frontmatter # Parsed YAML frontmatter
# Access parsed elements
note.tags # List of tags in the note
note.wikilinks # List of wikilinks (forward)
note.tasks # List of tasks
note.callouts # List of callouts
# Access raw frontmatter
raw = note.frontmatter # Dict-like object with raw values
# Get cleaned frontmatter (removes wikilinks, formats dates)
cleaned = note.frontmatter.clean()
# Custom date formatting
cleaned = note.frontmatter.clean(date_format='DD-MM-YYYY')
cleaned = note.frontmatter.clean(date_format='%B %d, %Y') # "March 24, 2025"
# Relationships
vault = Vault("path/to/vault")
note.get_backlinks(vault) # Notes that link to this note
note.get_forward_links(vault) # Notes this note links to
note.get_related_notes() # Related notes by various metrics
note.get_link_context("Target") # Get the context for a piece of text in your note
note.get_link_context(  # E.g. context for a wikilink.
    target=note.wikilinks[0].display_text,
    context_chars=40)
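The frontmatter cleaning described earlier in this section (removing wikilinks and formatting dates) can be approximated in a few lines. This is a sketch under stated assumptions, not the behavior of Frontmatter.clean(): the helper clean_value is hypothetical, it replaces a wikilink with its alias when one is present, and it only understands strftime-style format strings such as '%B %d, %Y'.

```python
import re
from datetime import date, datetime

# Matches [[target]] or [[target|alias]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def clean_value(value, date_format: str = "%Y-%m-%d"):
    """Strip wikilink brackets from strings and format dates; recurse into lists."""
    if isinstance(value, (date, datetime)):
        return value.strftime(date_format)
    if isinstance(value, str):
        # Prefer the alias when the link has one, otherwise keep the target.
        return WIKILINK.sub(lambda m: m.group(2) or m.group(1), value)
    if isinstance(value, list):
        return [clean_value(item, date_format) for item in value]
    return value  # numbers, booleans, None are left untouched


# Hypothetical parsed frontmatter.
raw = {
    "author": "[[Jane Doe]]",
    "related": ["[[Notes/ML|ML]]", "plain"],
    "created": date(2025, 3, 24),
    "rating": 5,
}

cleaned = {key: clean_value(value, "%B %d, %Y") for key, value in raw.items()}
print(cleaned)
```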
Dataview Support
Parse and evaluate Dataview queries:
# Parse Dataview queries from a note
queries = note.dataview_queries
query = queries[0]
query.evaluate(vault, note)
# Evaluate a Dataview query in notes or sections
print(note.get_evaluated_view(vault))
note_section = note.sections[10]
print(note_section.get_evaluated_view(vault))
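For readers unfamiliar with how Dataview queries appear in a note, the sketch below pulls the bodies of dataview code blocks out of raw Markdown and reads the query type from the first keyword. It uses only the standard library; extract_dataview_queries and query_type are hypothetical helpers, not part of the package, and inline queries are not handled.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built indirectly so this example stays readable

# A block that opens with a fence followed by "dataview" and closes with a bare fence.
DATAVIEW_BLOCK = re.compile(
    r"^" + re.escape(FENCE) + r"dataview[ \t]*\n(.*?)^" + re.escape(FENCE) + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_dataview_queries(markdown: str) -> list[str]:
    """Return the text of every dataview code block, without the fences."""
    return [m.group(1).strip() for m in DATAVIEW_BLOCK.finditer(markdown)]


def query_type(query: str) -> str:
    """Return the leading keyword of a query, e.g. TABLE, LIST, TASK, or CALENDAR."""
    return query.split(None, 1)[0].upper()


note_text = "\n".join([
    "# Reading",
    FENCE + "dataview",
    "TABLE author, rating",
    'FROM "Books"',
    "SORT rating DESC",
    FENCE,
    "Some prose.",
    FENCE + "python",
    "print('not a query')",
    FENCE,
])

found = extract_dataview_queries(note_text)
print(found)
print(query_type(found[0]))
```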
Advanced Usage
Custom Search
# Configure similarity search
results = vault.search(
    query="machine learning",
    limit=10,
    threshold=0.6,
)
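To show how limit and threshold typically interact in a similarity search, here is a small sketch built on difflib.SequenceMatcher. The scoring algorithm is an assumption for illustration only (the package supports "various algorithms" and may score differently), and search_titles is a hypothetical helper.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search_titles(titles: list[str], query: str, limit: int = 10,
                  threshold: float = 0.6) -> list[tuple[float, str]]:
    """Return up to `limit` (score, title) pairs scoring at least `threshold`, best first."""
    scored = [(similarity(query, title), title) for title in titles]
    hits = [pair for pair in scored if pair[0] >= threshold]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:limit]


# Hypothetical note titles.
titles = ["Grocery List", "Machine Learnings", "Machine Learning"]

results = search_titles(titles, "machine learning", limit=10, threshold=0.6)
for score, title in results:
    print(f"{score:.2f}  {title}")
```

The threshold filters out weak matches before the limit is applied, so a high threshold can return fewer than limit results.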
Vault Analysis
# Build a note index DataFrame of the vault
vault_index = vault.build_index()
# Build and analyze vault graph
graph = vault.get_note_graph()
# Find broken links
broken_links = vault.find_broken_links()
# Relationship analysis
relationship_stats = vault.analyze_relationships() # Builds a Relationship Analyzer object
stats_report = relationship_stats.build_statistics_report()
df = relationship_stats.export_to_dataframe() # Pandas dataframe object
relationship_stats.find_hub_notes(  # Find notes with many connections (default min_connections = 10)
    min_connections=50
)
orphaned_notes = relationship_stats.find_orphaned_notes() # Find orphaned notes (no backlinks)
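The hub and orphan concepts used above can be sketched on a plain link mapping. This is an illustration under assumptions, not the package's analyzer: a "connection" is counted here as one incoming or outgoing link to an existing note, an orphan is a note with no backlinks (as the comment above describes), and the helper names are hypothetical.

```python
def connection_counts(forward_links: dict[str, list[str]]) -> dict[str, int]:
    """Count incoming plus outgoing links per note, ignoring self-links and missing targets."""
    counts = {name: 0 for name in forward_links}
    for source, targets in forward_links.items():
        for target in set(targets):  # count repeated links to the same note once
            if target in counts and target != source:
                counts[source] += 1  # outgoing link
                counts[target] += 1  # incoming link
    return counts


def find_hubs(forward_links: dict[str, list[str]], min_connections: int = 10) -> list[str]:
    """Notes with at least `min_connections` connections."""
    counts = connection_counts(forward_links)
    return sorted(name for name, count in counts.items() if count >= min_connections)


def find_orphans(forward_links: dict[str, list[str]]) -> list[str]:
    """Notes that no other note links to."""
    linked_to = {
        target
        for source, targets in forward_links.items()
        for target in targets
        if target != source
    }
    return sorted(name for name in forward_links if name not in linked_to)


# Hypothetical vault: note title -> titles it links to.
links = {
    "Index": ["A", "B", "C"],
    "A": ["Index"],
    "B": ["Index"],
    "C": [],
    "Scratch": ["Index"],
}

print(find_hubs(links, min_connections=3))
print(find_orphans(links))
```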
Working with Parsed Elements
# Access specific elements
for link in note.wikilinks:
    print(f"Link to: {link.target}, alias: {link.alias}")
for task in note.tasks:
    if task.status == " ":
        print(f"TODO: {task.text}")
for tag in note.tags:
    print(f"Tag: #{tag.name}")
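The status check in the loop above relies on the character between the brackets of a Markdown task item. The sketch below shows one way such items could be parsed from raw text; ParsedTask and parse_tasks are hypothetical names and do not reflect the package's task objects beyond the status and text attributes used above.

```python
import re
from dataclasses import dataclass

# A list marker, then "[<one character>]", then the task text.
TASK = re.compile(r"^[ \t]*[-*+] \[(.)\] (.*)$", re.MULTILINE)


@dataclass
class ParsedTask:
    status: str  # " " for open, "x" for done, other characters for custom states
    text: str


def parse_tasks(markdown: str) -> list[ParsedTask]:
    """Return every task list item found in the text, in document order."""
    return [ParsedTask(m.group(1), m.group(2).strip()) for m in TASK.finditer(markdown)]


sample_tasks = "\n".join([
    "- [ ] Write tests",
    "- [x] Ship release",
    "- Regular list item",
    "  - [/] Draft docs",
])

tasks = parse_tasks(sample_tasks)
open_tasks = [task.text for task in tasks if task.status == " "]
print(open_tasks)
```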
Requirements
- Python 3.12+ (earlier versions may work but have not yet been tested)
- Dependencies are automatically installed with pip
Contributing
Contributions are welcome! The project is hosted on Codeberg:
https://codeberg.org/paddyd/obsidian-parser
Please feel free to submit issues and pull requests.
License
MIT
Changelog
0.2.0 (2025-01-13)
- Added Frontmatter.clean() method for cleaning frontmatter values
- Frontmatter now returns a dict-like object instead of plain dict
- Improved wikilink parsing in frontmatter values
0.1.0 (Initial Release)
- Core vault and note parsing functionality
- Obsidian markdown format support
- Dataview query parsing and evaluation
- Search capabilities (exact and similarity)
- Relationship tracking and graph building
File details
Details for the file obsidianmd_parser-0.2.0.tar.gz.
File metadata
- Download URL: obsidianmd_parser-0.2.0.tar.gz
- Upload date:
- Size: 47.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | af8d7c40ba1d5efe98cf9a3fda53c12863def93b3c9c888acf9da118eb3e9a21 |
| MD5 | afd4f400f89dfe339c31401f2cfc20a4 |
| BLAKE2b-256 | 57ba8839babac364b984f930fd69ec766a45583fa6a4fee9513a2f88b062cdce |
File details
Details for the file obsidianmd_parser-0.2.0-py3-none-any.whl.
File metadata
- Download URL: obsidianmd_parser-0.2.0-py3-none-any.whl
- Upload date:
- Size: 52.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 56d5eb86abb9547c8cce56a14fa2857494447e3a87f6e69f131c0a75e607a690 |
| MD5 | bf9259b6d15adba779f26b3c1ba57ca4 |
| BLAKE2b-256 | 85299e6e4b76be895df8eeadcb622d6d3b40ca5ad69ad90c77f2b87e950f15cb |