A Zettelkasten-inspired knowledge base

ZKB: Zettelkasten-inspired Knowledge Base with EBR

A Zettelkasten-inspired knowledge base with Embeddings-Based Retrieval

ZKB is a command-line tool for managing a Zettelkasten-inspired knowledge base of markdown notes. It helps you organize, link, and analyze your notes, and it answers natural language questions about them through Embeddings-Based Retrieval (EBR).

Features

  • Scan and index markdown notes
  • Find orphaned notes (notes not linked to by any other note)
  • Detect broken links between notes
  • Find backlinks to a specific note
  • Create, read, update, and delete notes
  • Search notes based on content or metadata
  • Generate and index question-answer pairs from notes
  • Query the knowledge base using natural language questions (EBR)

Installation

  1. Clone the repository:

    git clone https://github.com/witt3rd/zkb.git
    
  2. Install the package and its dependencies from the repository root:

    cd zkb
    pip install .
    

Usage

python -m zkb.cli [--data-dir DATA_DIR] [--db-path DB_PATH] {command} [args]

Available commands:

  • scan: Scan notes and update the database
  • find-orphans: Find orphaned notes
  • find-broken-links: Find broken links
  • find-backlinks {filename}: Find backlinks to a specific note
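
The usage line and command list above map naturally onto a parser with global options and subcommands. The following is a minimal sketch of that shape using `argparse`; it is not ZKB's actual CLI code. The `build_parser` function, the help strings, and the default values are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser shaped like the usage line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="zkb")
    # Global options, named after the usage line; defaults are assumed.
    parser.add_argument("--data-dir", default="data/", help="directory of markdown notes")
    parser.add_argument("--db-path", default=None, help="location of the SQLite database")

    # One subparser per documented command.
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("scan", help="scan notes and update the database")
    sub.add_parser("find-orphans", help="find orphaned notes")
    sub.add_parser("find-broken-links", help="find broken links")
    backlinks = sub.add_parser("find-backlinks", help="find backlinks to a note")
    backlinks.add_argument("filename")
    return parser


args = build_parser().parse_args(["--data-dir", "notes/", "find-backlinks", "index.md"])
print(args.command, args.data_dir, args.filename)
```

Global options come before the command name, which matches the order shown in the usage line.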

Configuration

You can set the following environment variables or use a .env file:

  • DATA_DIR: Directory containing markdown notes (default: "data/")
  • DB_DIR: Directory for the SQLite database (default: "db/")
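
As an illustration of how these two variables and their defaults could be resolved, here is a minimal sketch using only the standard library. The `load_config` function is hypothetical and is not part of ZKB; loading a `.env` file would need an extra step (for example a dotenv loader) that is not shown here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve DATA_DIR and DB_DIR, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "data_dir": Path(env.get("DATA_DIR", "data/")),
        "db_dir": Path(env.get("DB_DIR", "db/")),
    }


# With no variables set, the documented defaults apply.
print(load_config({}))
# An explicit DATA_DIR overrides only that setting.
print(load_config({"DATA_DIR": "/tmp/notes"}))
```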

Embeddings-Based Retrieval (EBR)

ZKB incorporates Embeddings-Based Retrieval (EBR) through the qa-store library. This feature enables:

  1. Automatic generation of question-answer pairs from your notes.
  2. Indexing of these pairs using embeddings for efficient retrieval.
  3. Natural language querying of your knowledge base.

EBR lets you ask questions about your notes and receive relevant answers even when the wording of the question differs from the wording in the notes, which makes the contents of the knowledge base easier to find.
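
To make the retrieval idea concrete, the sketch below indexes a few question-answer pairs as vectors and ranks them by cosine similarity to a query. It does not use the qa-store API. The `embed` function is a toy bag-of-words stand-in for a real embedding model, so it only matches shared words rather than meaning, and the QA pairs are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical QA pairs, as if generated from notes.
qa_pairs = [
    ("What is an orphaned note?", "A note that no other note links to."),
    ("What is a broken link?", "A link that points to a note that does not exist."),
    ("Where is the index stored?", "In an SQLite database."),
]
# Index each pair by the embedding of its question.
index = [(embed(q), q, a) for q, a in qa_pairs]


def query(question: str, top_k: int = 1):
    """Return the top_k (question, answer) pairs most similar to the query."""
    q_vec = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q_vec, item[0]), reverse=True)
    return [(q, a) for _, q, a in ranked[:top_k]]


print(query("which notes are orphaned?"))
```

Swapping `embed` for a learned embedding model is what allows matches on meaning instead of exact words; the indexing and ranking steps stay the same.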

Design

How it works

  1. ZKB scans markdown files in the specified directory.
  2. It parses each note, extracting metadata, content, and links.
  3. The extracted information is stored in an SQLite database.
  4. Question-answer pairs are generated from the notes and indexed using embeddings.
  5. Various operations can be performed on the indexed data, including EBR-based querying.
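
Step 2 above, parsing a note and extracting its links, can be sketched as follows. The link syntax is an assumption: the example treats links as wiki-style `[[target]]` or `[[target|display text]]`, which may differ from what ZKB actually parses. `ParsedNote` and `parse_note` are hypothetical names.

```python
import re
from dataclasses import dataclass, field

# Assumed link syntax: [[target]] or [[target|display text]].
WIKI_LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


@dataclass
class ParsedNote:
    filename: str
    content: str
    links: list = field(default_factory=list)  # (target, display_text) pairs


def parse_note(filename: str, text: str) -> ParsedNote:
    """Extract (target, display_text) pairs; display text defaults to the target."""
    links = [
        (m.group(1).strip(), (m.group(2) or m.group(1)).strip())
        for m in WIKI_LINK.finditer(text)
    ]
    return ParsedNote(filename=filename, content=text, links=links)


note = parse_note("a.md", "See [[b.md|Note B]] and [[c.md]].")
print(note.links)  # [('b.md', 'Note B'), ('c.md', 'c.md')]
```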

Key Operations

  1. Scanning notes: Parses markdown files, extracts metadata and links, and updates the database.
  2. Finding orphaned notes: Identifies notes that are not linked to by any other note.
  3. Detecting broken links: Finds links that point to non-existent notes.
  4. Finding backlinks: Discovers which notes link to a specific note.
  5. Creating/Reading/Updating/Deleting notes: Manages individual notes in the knowledge base.
  6. Searching notes: Finds notes based on content or metadata.
  7. Generating and indexing QA pairs: Creates question-answer pairs from notes and indexes them for retrieval.
  8. Querying the knowledge base: Uses natural language questions to retrieve relevant information from the notes.
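
The three link-analysis operations (2 to 4 above) reduce to simple set logic over the notes and their links. The sketch below shows that logic on in-memory data; the function names and sample data are illustrative and are not taken from ZKB's code.

```python
# Sample data: known notes and (from_note, to_note) link pairs.
notes = {"a.md", "b.md", "c.md"}
links = [("a.md", "b.md"), ("b.md", "a.md"), ("a.md", "missing.md")]


def find_orphans(notes, links):
    """Notes that no link points to."""
    linked = {to for _, to in links}
    return sorted(notes - linked)


def find_broken_links(notes, links):
    """Links whose target is not a known note."""
    return [(src, to) for src, to in links if to not in notes]


def find_backlinks(target, links):
    """Notes that link to the given target."""
    return sorted({src for src, to in links if to == target})


print(find_orphans(notes, links))       # ['c.md']
print(find_broken_links(notes, links))  # [('a.md', 'missing.md')]
print(find_backlinks("a.md", links))    # ['b.md']
```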

Data Structures

  1. Note: Represents a markdown note with properties like filename, full path, metadata, content, and links.
  2. Database: SQLite database with two main tables:
    • notes: Stores information about each note (id, filename, full_path, title)
    • links: Stores links between notes (from_note, to_note, display_text)
  3. QA Knowledge Base: Stores and indexes question-answer pairs generated from notes.
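
A schema along these lines could look like the following sketch, which uses an in-memory SQLite database. Only the table and column names come from the description above. The column types, the constraints, and the choice to store filenames in `from_note` and `to_note` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE notes (
        id INTEGER PRIMARY KEY,
        filename TEXT UNIQUE NOT NULL,
        full_path TEXT NOT NULL,
        title TEXT
    );
    CREATE TABLE links (
        from_note TEXT NOT NULL,  -- assumed to hold the linking note's filename
        to_note TEXT NOT NULL,    -- assumed to hold the target note's filename
        display_text TEXT
    );
    """
)

conn.executemany(
    "INSERT INTO notes (filename, full_path, title) VALUES (?, ?, ?)",
    [("a.md", "data/a.md", "A"), ("b.md", "data/b.md", "B")],
)
conn.execute("INSERT INTO links VALUES (?, ?, ?)", ("a.md", "b.md", "Note B"))

# Backlinks to b.md: every link row whose target is b.md.
backlinks = conn.execute(
    "SELECT from_note, display_text FROM links WHERE to_note = ?", ("b.md",)
).fetchall()
print(backlinks)  # [('a.md', 'Note B')]
```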

Components

  1. ZKB: Main class that orchestrates the operations.
  2. Database: Handles database operations.
  3. Note: Represents and parses individual markdown notes.
  4. CLI: Provides the command-line interface.
  5. QuestionAnswerKB: Manages the generation, indexing, and retrieval of QA pairs.

TODO

We're working to improve ZKB. Planned features and enhancements:

  • Implement a web-based user interface
  • Add support for tags and categories
  • Improve the natural language processing capabilities
  • Implement a plugin system for extensibility
  • Add visualization tools for exploring the knowledge graph

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Donald Thompson - @dt_public - witt3rd@witt3rd.com

Project Link: https://github.com/witt3rd/zkb
