Transform handwritten images into structured documents (Markdown, JSON, YAML, XML)

Handmark

Handmark is a Python CLI tool that converts images of handwritten notes into structured documents. It supports two AI providers (Azure AI, which runs remotely, and Ollama, which runs locally) and four output formats (Markdown, JSON, YAML, XML).

Architecture

graph TD
    subgraph "User Interface"
        A[User] -->|Interacts with| B{Handmark CLI}
    end

    subgraph "Application Core"
        B --> C[main.py]
        C --> D[config.py]
        C --> E[model.py]
        C --> F[utils.py]
        C --> G{ImageDissector}
    end

    subgraph "Configuration & Models"
        D --> H[config.yaml]
        H --> J((Model definitions and provider info))
        E --> J
    end

    subgraph "Providers"
        G --> R[providers.factory.create_provider]
        R --> S[AzureProvider]
        R --> T[OllamaProvider]
        R --> U[BaseProvider]
    end

    subgraph "Image Processing"
        G -->|calls| U
        U --> I[AzureService]
        U --> V[OllamaService]
        G --> M((FormatModels))
        G --> W[ContentProcessors]
    end

    subgraph "Output & Filesystem"
        G --> P[Output File]
        A --> Q[Input Image]
    end

    subgraph "CLI Flows"
        B --> X[Model selection & validation]
        X --> R
        B --> G
    end

    subgraph "External Auth & Tokens"
        I --> O[GitHubToken]
    end

    %% Styling
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style G fill:#ccf,stroke:#333,stroke-width:2px
    style I fill:#f96,stroke:#333,stroke-width:2px
    style P fill:#9f9,stroke:#333,stroke-width:2px
    style Q fill:#9f9,stroke:#333,stroke-width:2px

Features

🖼️ Multi-Format Document Generation - Transform handwritten notes into Markdown, JSON, YAML, or XML
🧠 Intelligent Title Extraction - Automatically detects and extracts titles from content for smart file naming
⚡ Easy CLI Interface - Simple, intuitive commands with rich console output and comprehensive error handling
🤖 Dual AI Provider Support - Choose between Azure AI (remote) or Ollama (local) for processing
🔧 Advanced Model Configuration - Select from multiple AI models with availability validation
🔐 Secure Authentication - GitHub token-based authentication with secure local storage
📁 Flexible Output - Customize output directory and filename options with intelligent fallbacks
⚙️ YAML Configuration - Centralized configuration via config.yaml for easy customization
🎯 Multiple Output Formats - Support for Markdown, JSON, YAML, and XML output formats with format-specific processing
🏠 Local Processing Option - Use Ollama for completely local, offline image processing
🔄 Provider Factory Pattern - Automatic provider selection based on model configuration and availability

Quick Start

Install Handmark:
```
pip install handmark
```
Configure authentication:
```
handmark auth
```
Process your first image:
```
handmark digest path/to/your/image.jpg
```

That's it! By default, your handwritten notes are converted to a Markdown file in the current directory.

Installation

Requirements

Python 3.10 or higher
A GitHub token (for Azure AI access)

Install from PyPI

pip install handmark

Install with uv (recommended)

uv pip install handmark

Install from source

git clone https://github.com/devgabrielsborges/handmark.git
cd handmark
pip install -e .

Usage

Getting Started

Before processing images, you need to configure authentication:

handmark auth

This will prompt you to enter your GitHub token, which provides access to Azure AI services.

Commands Overview

| Command | Description |
| --- | --- |
| `handmark digest <image>` | Convert a handwritten image to the specified format (Markdown, JSON, YAML, or XML) |
| `handmark auth` | Configure GitHub token authentication |
| `handmark set-model` | Select and configure the AI model (Azure/Ollama) |
| `handmark config` | View current configuration settings |
| `handmark status` | Check provider availability and model status |
| `handmark test-connection` | Test the connection to the AI service |
| `handmark --version` | Show version information |

Process an Image

handmark digest <image_path> [options]

Options:

-o, --output <directory> - Specify output directory (default: current directory)
-f, --format <format> - Output format: markdown, json, yaml, xml (default: markdown)
--filename <name> - Custom output filename (default: auto-generated)

Examples:

# Basic usage - process image to markdown
handmark digest samples/prova.jpeg

# Custom output format
handmark digest samples/prova.jpeg -f json

# Custom output directory and format
handmark digest samples/prova.jpeg -o ./notes -f yaml

# Custom filename with XML format
handmark digest samples/prova.jpeg --filename lecture-notes.xml -f xml

# All options combined
handmark digest samples/prova.jpeg -o ./outputs --filename my-notes.json -f json
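
To process several images in one go, the same options can be assembled from a script. The sketch below uses only the `digest` flags documented above; the helper names `build_digest_command` and `digest_folder` are hypothetical, and it assumes `handmark` is on your `PATH` and that your images use the `.jpeg` extension.

```python
import subprocess
from pathlib import Path

# Output formats listed for the -f/--format option.
FORMATS = ("markdown", "json", "yaml", "xml")


def build_digest_command(image, output_dir=None, fmt="markdown", filename=None):
    """Return the argument list for one `handmark digest` call."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported format: {fmt}")
    cmd = ["handmark", "digest", str(image), "-f", fmt]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    if filename is not None:
        cmd += ["--filename", filename]
    return cmd


def digest_folder(folder, output_dir, fmt="markdown"):
    """Run `handmark digest` on every .jpeg file in a folder, one at a time."""
    for image in sorted(Path(folder).glob("*.jpeg")):
        subprocess.run(build_digest_command(image, output_dir, fmt), check=True)


if __name__ == "__main__":
    print(build_digest_command("samples/prova.jpeg", "./notes", "yaml"))
```

Building the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.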

Supported Image Formats

Handmark supports common image formats including:

JPEG/JPG
PNG
And other formats supported by Azure AI Vision

Configure Authentication

handmark auth

This will prompt you to enter your GitHub token, which is required for Azure AI integration. The token is securely stored in a .env file in the project directory.

Configure Model

handmark set-model

This command lets you select and configure the AI model used for image processing. You can choose from:

Azure AI Models (Remote) - GitHub token-based access to cloud models
Ollama Models (Local) - Locally installed models for offline processing

The system will show model availability and guide you through installation if needed. Your selection will be saved for future runs. If no model is configured, the system will use a default Azure model.

Check Provider Status

handmark status

This command shows the availability of both Azure and Ollama providers, installed models, and current configuration status.

Check Version

handmark --version

Configuration

Handmark uses a centralized YAML configuration system that allows you to customize:

AI model prompts - Customize how the AI processes your images
Output format settings - Configure file extensions, content types, and format-specific options
Available models - Add or modify the list of AI models
Default settings - Set default output formats and directories

Configuration File

The main configuration is stored in config.yaml in the project root. You can customize:

# Example customizations
formats:
  markdown:
    system_message_content: "Custom prompt for better academic note processing"
    user_message_content: "Convert this academic content with proper citations"

available_models:
  - name: "custom/model"
    pretty_name: "Custom Model"
    provider: "Custom Provider"
    rate_limit: "100 requests/day"
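
Before adding an entry to `available_models`, it can help to check that it carries the same keys as the example above. The following is only a sketch: treating all four keys as required is an assumption, and `missing_model_keys` is a hypothetical helper, not part of Handmark.

```python
# Keys taken from the example entry above; whether Handmark requires
# all of them is an assumption.
REQUIRED_MODEL_KEYS = ("name", "pretty_name", "provider", "rate_limit")


def missing_model_keys(entry: dict) -> list[str]:
    """Return the expected keys that are absent or empty in a model entry."""
    return [key for key in REQUIRED_MODEL_KEYS if not entry.get(key)]


if __name__ == "__main__":
    entry = {
        "name": "custom/model",
        "pretty_name": "Custom Model",
        "provider": "Custom Provider",
        "rate_limit": "100 requests/day",
    }
    print(missing_model_keys(entry))
```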

Configuration Commands

handmark config - View current configuration

For detailed configuration options, see CONFIG.md.

Example

Here's a real-world example of Handmark in action:

Input image (samples/prova.jpeg):

[Image: handwritten notes example]

Command used:

handmark digest samples/prova.jpeg -f markdown

Output (primeiro-exercicio-escolar-2025-1.md):

# Primeiro Exercício Escolar - 2025.1

Leia atentamente todas as questões antes de começar a prova. As respostas obtidas somente terão validade se respondidas nas folhas entregues. Os cálculos podem ser escritos à lápis e em qualquer ordem. Evite usar material diferente do que foi apresentado em sala ou justifique o material extra adequadamente para validá-lo. Não é permitido uso de celular ou calculadora.

1. (2 pontos) Determine a equação do plano tangente a função $f(x,y) = \sqrt{20 - x^2 - 7y^2}$ em (2,1). Em seguida, calcule um valor aproximado para $f(1,9 , 1,1)$.

2. (2 pontos) Determine a derivada direcional de $f(x,y) = (xy)^{1/2}$ em $P(2,8)$, na direção de $Q(5,4)$.

3. (2 pontos) Determine e classifique os extremos de $f(x,y) = x^4 + y^4 - 4xy + 2$

4. (2 pontos) Usando integrais duplas, calcule o volume acima do cone $z = (x^2 + y^2)^{1/2}$ e abaixo da esfera $x^2 + y^2 + z^2 = 1$

5. (2 pontos). Sabendo que $E$ é o volume delimitado pelo cilindro parabólico $z = 1 - y^2$, e pelos planos $z = 0$, $x = 1$, $x = -1$, apresente um esboço deste volume e calcule a integral tripla.

$$\iiint_E x^2e^y dV$$

Alternative output formats:

# Generate JSON format
handmark digest samples/prova.jpeg -f json --filename exam-content.json

# Generate YAML format  
handmark digest samples/prova.jpeg -f yaml -o ./structured-notes

# Generate XML format with custom filename
handmark digest samples/prova.jpeg -f xml --filename exam-questions.xml

The output filename is automatically derived from the detected title, and the content is processed according to the selected format with proper validation and formatting.
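
One way a title such as the one above could be turned into a filename is shown below. This is an illustrative sketch that happens to reproduce the filename in this example; it is not necessarily the algorithm Handmark uses, and `title_to_filename` and its `untitled` fallback are hypothetical.

```python
import re
import unicodedata


def title_to_filename(title: str, extension: str = "md") -> str:
    """Turn a detected title into a lowercase, hyphen-separated filename."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Hypothetical fallback for titles that contain no usable characters.
    return f"{slug or 'untitled'}.{extension}"


if __name__ == "__main__":
    print(title_to_filename("Primeiro Exercício Escolar - 2025.1"))
    # primeiro-exercicio-escolar-2025-1.md
```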

Troubleshooting

Common Issues

Authentication Error:

Error: GitHub token not configured or invalid

Solution: Run handmark auth to configure your GitHub token.

Image Format Error:

Error: Unsupported image format

Solution: Ensure your image is in a supported format (JPEG, PNG, etc.).

Timeout Error:

HTTPSConnectionPool(host='models.github.ai', port=443): Read timed out

Solution: The AI service might be experiencing high load. Try:

Wait a few minutes and retry
Use a different model with handmark set-model
Check service status with handmark test-connection
Consider using local Ollama models for offline processing
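
The "wait and retry" advice can be automated with a small wrapper. This sketch assumes that `handmark digest` exits with a non-zero status when a request fails, which the documentation above does not confirm; `run_with_retries` and `digest_once` are hypothetical helpers.

```python
import subprocess
import time
from typing import Callable


def run_with_retries(
    run: Callable[[], int],
    attempts: int = 3,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `run` until it returns 0, doubling the wait after each failure."""
    for attempt in range(attempts):
        if run() == 0:
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False


def digest_once(image: str) -> int:
    """Run a single `handmark digest` and return its exit status."""
    return subprocess.run(["handmark", "digest", image]).returncode


if __name__ == "__main__":
    ok = run_with_retries(lambda: digest_once("samples/prova.jpeg"))
    print("converted" if ok else "giving up; try another model or Ollama")
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.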

No Model Configured Warning:

No model configured. Using default model

Solution: Run handmark set-model to select your preferred AI model.

Ollama Service Issues:

Ollama service is not running

Solution: Install and start the Ollama service:

Visit ollama.com for installation instructions
Start the service and pull a vision model: ollama pull llama3.2-vision
Check status with handmark status

Getting Help

Check the issues page for known problems
Create a new issue if you encounter a bug
Use handmark --help for command-line help
Use handmark test-connection to diagnose connection issues

Development

Prerequisites

Python 3.10 or higher
A GitHub token for Azure AI integration
uv (recommended) or pip for package management

Setup

Clone the repository:

git clone https://github.com/devgabrielsborges/handmark.git
cd handmark

Install dependencies:

# Using uv (recommended)
uv pip install -e .

# Or using pip
pip install -e .

Configure for development:

handmark auth  # Configure your GitHub token
handmark set-model  # Select preferred AI model

Project Structure

src/ - Source code
- main.py - CLI interface and command handlers
- dissector.py - Image processing and Azure AI API interaction
- model.py - AI model management and configuration
- utils.py - Helper functions and utilities
samples/ - Sample images for testing and demonstration
tests/ - Comprehensive unit tests
.github/ - GitHub workflows and project instructions

Contributing

Contributions are welcome! Please feel free to:

Open an issue for bug reports or feature requests
Submit a pull request with improvements
Help improve documentation
Share examples of your handwritten notes processed with Handmark

Development Workflow

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes
Add tests if applicable
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Gabriel Borges (@devgabrielsborges)

Release history

0.5.2.1 - Aug 19, 2025
0.5.2 - Aug 18, 2025 (this version)
0.5.1 - Aug 18, 2025
0.5.0 - Aug 18, 2025
0.4.2 - Jul 3, 2025
0.4.1.1 - Jul 2, 2025
0.4.1 - Jul 2, 2025
0.4.0 - Jul 2, 2025
0.3.2.1 - May 29, 2025
0.3.2 - May 29, 2025
0.3.1 - May 21, 2025
0.3 - May 21, 2025
0.2.1 - May 19, 2025
0.2.0 - May 19, 2025
0.1.0 - May 19, 2025
