Interfaces and data models for the Merger CLI plugin system.

Maintainers

diogotoporcov

Merger Plugin API

Interfaces and data models for extending the merger-cli tool with custom parsers and exporters.

This package provides:

Abstract base classes for custom Parsers and Exporters.
Data models for the File Tree structure.
Type definitions for seamless integration with merger-cli.

Compatibility

merger-plugin-api is designed for broad compatibility so that plugin developers can work in a variety of environments.

Supported Python Versions: 3.8, 3.9, 3.10, and 3.11.

Installation

pip install merger-plugin-api

Creating Plugins

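Plugins are standalone Python modules that define a Parser or TreeExporter subclass and expose it through a module-level parser_cls or exporter_cls variable, as the examples below show.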

Custom Parsers

To support non-text file formats (e.g., PDF, images), implement a custom parser. More complete examples are available in the examples/parsers/ directory.

Here is an example of a PDF parser using pymupdf:

from pathlib import Path
from typing import Union, Optional, Set, Type

import pymupdf
from merger_plugin_api import Parser

# Optional: List of Python packages required for this plugin
REQUIREMENTS = ["pymupdf"]

# File extensions this parser supports
EXTENSIONS: Set[str] = {".pdf"}


class PdfParser(Parser):
    MAX_BYTES_FOR_VALIDATION: Optional[int] = None

    @classmethod
    def validate(
        cls,
        file_chunk_bytes: Union[bytes, bytearray],
        file_path: Path
    ) -> bool:
        """
        Validate that the given file bytes represent a readable PDF document.
        """
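        try:
            # Opening the stream and reading the first page raises if the
            # bytes are not a readable PDF (or the document has no pages).
            with pymupdf.open(stream=file_chunk_bytes) as doc:
                _ = doc[0]
            return True

        except Exception:
            return False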

    @classmethod
    def parse(
        cls,
        file_bytes: Union[bytes, bytearray],
        file_path: Path,
    ) -> str:
        """
        Extracts and concatenates text from all pages of a PDF file.
        """
        texts = []
        with pymupdf.open(stream=file_bytes) as doc:
            for page in doc:
                text = page.get_text()
                if text:
                    # Collapse blank lines between paragraphs into single newlines.
                    text = text.replace("\n\n", "\n")
                    texts.append(text)

        # Join the per-page texts with a single space.
        return " ".join(texts)


# Export the parser class
parser_cls: Type[Parser] = PdfParser
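
For comparison, a parser for a format that needs no third-party package can be much shorter. The following is a minimal sketch, not code from the merger-cli repository. It assumes the validate and parse signatures shown in the PDF example, and it defines a stand-in Parser base class so that the snippet runs without merger-plugin-api installed; a real plugin would import Parser from merger_plugin_api instead. CsvParser and its pipe-separated output format are hypothetical choices.

```python
import csv
import io
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional, Set, Type, Union


class Parser(ABC):
    """Stand-in for merger_plugin_api.Parser (assumption: same two classmethods)."""

    MAX_BYTES_FOR_VALIDATION: Optional[int] = None

    @classmethod
    @abstractmethod
    def validate(cls, file_chunk_bytes: Union[bytes, bytearray], file_path: Path) -> bool: ...

    @classmethod
    @abstractmethod
    def parse(cls, file_bytes: Union[bytes, bytearray], file_path: Path) -> str: ...


# File extensions this parser supports
EXTENSIONS: Set[str] = {".csv"}


class CsvParser(Parser):
    """Hypothetical parser that renders CSV rows as pipe-separated lines."""

    @classmethod
    def validate(cls, file_chunk_bytes, file_path):
        # Accept the file only if it decodes as UTF-8 and has at least one row.
        try:
            text = bytes(file_chunk_bytes).decode("utf-8")
            return next(csv.reader(io.StringIO(text)), None) is not None
        except (UnicodeDecodeError, csv.Error):
            return False

    @classmethod
    def parse(cls, file_bytes, file_path):
        text = bytes(file_bytes).decode("utf-8")
        rows = csv.reader(io.StringIO(text))
        return "\n".join(" | ".join(row) for row in rows)


# Export the parser class
parser_cls: Type[Parser] = CsvParser
```

Because the sketch uses only the standard library, it needs no REQUIREMENTS list.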

Custom Exporters

To output the merged data in a custom format (e.g., XML, Markdown), implement a TreeExporter. More complete examples are available in the examples/exporters/ directory.

Here is an example of an XML exporter:

import xml.etree.ElementTree as ET
from typing import Type
from merger_plugin_api import FileEntry, DirectoryEntry, FileTreeEntry, TreeExporter, FileTree

# The name of the exporter (used in --exporter argument)
NAME = "XML"
# The extension of the output file
FILE_EXTENSION = ".xml"

class XmlExporter(TreeExporter):
    """
    A custom exporter that generates an XML representation of the file tree.
    """

    @classmethod
    def export(cls, tree: FileTree) -> bytes:
        """
        Export the file tree into an XML representation.
        """
        root = ET.Element("filetree")
        cls._to_xml(tree.root, root)

        cls._indent(root)

        return ET.tostring(root, encoding="utf-8", xml_declaration=True)

    @classmethod
    def _to_xml(cls, entry: FileTreeEntry, parent: ET.Element):
        if isinstance(entry, FileEntry):
            file_el = ET.SubElement(parent, "file", {
                "name": entry.name,
                "path": entry.path.as_posix()
            })
            content_el = ET.SubElement(file_el, "content")
            content_el.text = entry.content

        elif isinstance(entry, DirectoryEntry):
            dir_el = ET.SubElement(parent, "directory", {
                "name": entry.name,
                "path": entry.path.as_posix()
            })
            for child in sorted(entry.children.values(), key=lambda e: e.name.lower()):
                cls._to_xml(child, dir_el)

    @classmethod
    def _indent(cls, elem: ET.Element, level: int = 0):
        """
        Recursive function to indent XML elements while preserving text content.
        """
        i = "\n" + level * "  "
        if len(elem):
            if not elem.text or not elem.text.strip():
                elem.text = i + "  "

            if not elem.tail or not elem.tail.strip():
                elem.tail = i

            for child in elem:
                cls._indent(child, level + 1)

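            # elem has children here, so elem[-1] exists; dedent its tail so
            # the parent's closing tag lines up with the opening tag.
            last_child = elem[-1]
            if not last_child.tail or not last_child.tail.strip():
                last_child.tail = i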

        else:
            if level and (not elem.tail or not elem.tail.strip()):
                elem.tail = i

# Export the exporter class
exporter_cls: Type[TreeExporter] = XmlExporter
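
The same pattern carries over to other output formats. As a rough sketch, a JSON exporter might look like the following. This is an illustration under stated assumptions, not code from the repository: FileEntry, DirectoryEntry, FileTree, and TreeExporter are local stand-ins that model only the fields and the export signature documented here, so the snippet runs without merger-plugin-api installed. In a real plugin they would be imported from merger_plugin_api. JsonExporter and its output layout are hypothetical.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Type, Union


# Stand-ins for the merger_plugin_api models (assumption: documented fields only).
@dataclass
class FileEntry:
    name: str
    path: Path
    content: str
    extension: str


@dataclass
class DirectoryEntry:
    name: str
    path: Path
    children: Dict[str, Union["DirectoryEntry", FileEntry]] = field(default_factory=dict)


@dataclass
class FileTree:
    root: DirectoryEntry


class TreeExporter:
    """Stand-in for merger_plugin_api.TreeExporter."""


# The name of the exporter and the extension of the output file
NAME = "JSON"
FILE_EXTENSION = ".json"


class JsonExporter(TreeExporter):
    """Hypothetical exporter that serializes the tree as nested JSON objects."""

    @classmethod
    def export(cls, tree: FileTree) -> bytes:
        return json.dumps(cls._to_dict(tree.root), indent=2).encode("utf-8")

    @classmethod
    def _to_dict(cls, entry) -> dict:
        node = {"name": entry.name, "path": entry.path.as_posix()}
        if isinstance(entry, FileEntry):
            node["content"] = entry.content
        else:
            # Sort children case-insensitively, as the XML example does.
            children = sorted(entry.children.values(), key=lambda e: e.name.lower())
            node["children"] = [cls._to_dict(child) for child in children]
        return node


# Export the exporter class
exporter_cls: Type[TreeExporter] = JsonExporter
```

Returning bytes rather than str mirrors the XML example, whose export method also returns encoded output.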

Data Models

The FileTree object represents the hierarchical structure of the scanned directory.

`FileTree`

root: A DirectoryEntry representing the scan root.

`DirectoryEntry`

name: Name of the directory.
path: pathlib.Path relative to the scan root.
children: A dictionary mapping names to FileTreeEntry (FileEntry or DirectoryEntry).

`FileEntry`

name: Name of the file.
path: pathlib.Path relative to the scan root.
content: The parsed text content of the file.
extension: File extension (including the dot).
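
To make the shape of these models concrete, the sketch below builds a small tree by hand and walks it depth-first. The dataclasses are stand-ins that carry only the fields listed above (the real classes in merger_plugin_api may define more), and iter_files is a hypothetical helper, not part of the package.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, Iterator, Union


# Stand-ins for the documented models (assumption: only the fields listed above).
@dataclass
class FileEntry:
    name: str
    path: Path
    content: str
    extension: str


@dataclass
class DirectoryEntry:
    name: str
    path: Path
    children: Dict[str, Union["DirectoryEntry", FileEntry]] = field(default_factory=dict)


@dataclass
class FileTree:
    root: DirectoryEntry


def iter_files(entry) -> Iterator[FileEntry]:
    """Yield every FileEntry at or beneath entry, depth-first (hypothetical helper)."""
    if isinstance(entry, FileEntry):
        yield entry
    else:
        # Visit children in name order so the traversal is deterministic.
        for name in sorted(entry.children):
            yield from iter_files(entry.children[name])


tree = FileTree(root=DirectoryEntry(".", Path("."), {
    "README.md": FileEntry("README.md", Path("README.md"), "# Demo", ".md"),
    "src": DirectoryEntry("src", Path("src"), {
        "main.py": FileEntry("main.py", Path("src/main.py"), "print('hi')", ".py"),
    }),
}))

for file_entry in iter_files(tree.root):
    print(file_entry.path.as_posix(), file_entry.extension, len(file_entry.content))
```

An exporter would typically perform a traversal of this kind inside its export method, as _to_xml does in the XML example.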

Using Your Plugins

Once you have implemented your plugin, install it via the CLI:

merger --install-plugin path/to/your_plugin.py

Merger automatically detects whether the plugin is a parser or an exporter and installs any packages listed in REQUIREMENTS using its internal uv manager.
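
Before installing, it can be useful to check locally that a plugin file exposes one of the expected module-level names. The sketch below is a hypothetical pre-flight helper and does not describe how Merger itself performs detection. It assumes only the parser_cls and exporter_cls naming convention used in the examples above. Note that it executes the plugin's top-level code, so the plugin's own imports must be installed.

```python
import importlib.util
import tempfile
from pathlib import Path


def plugin_kind(plugin_path: Path) -> str:
    """Return "parser", "exporter", or "unknown" for a plugin file (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(plugin_path.stem, plugin_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # Runs the plugin's top-level code.
    if hasattr(module, "parser_cls"):
        return "parser"
    if hasattr(module, "exporter_cls"):
        return "exporter"
    return "unknown"


# Demonstration with a throwaway file standing in for a real plugin.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_plugin = Path(tmp_dir) / "demo_plugin.py"
    demo_plugin.write_text("exporter_cls = object\n")
    print(plugin_kind(demo_plugin))
```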

For more information, visit the main repository (diogotoporcov/merger-cli on GitHub).

Release history

This version

1.0.0

Apr 2, 2026

Download files

Source Distribution

merger_plugin_api-1.0.0.tar.gz (44.1 kB)

Uploaded Apr 2, 2026 Source

Built Distribution

merger_plugin_api-1.0.0-py3-none-any.whl (30.9 kB)

Uploaded Apr 2, 2026 Python 3

File details

Details for the file merger_plugin_api-1.0.0.tar.gz.

File metadata

Download URL: merger_plugin_api-1.0.0.tar.gz
Upload date: Apr 2, 2026
Size: 44.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for merger_plugin_api-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`c540ea8539a91ef69ab10a8b3d932a7dffa3e430afa89e445cd758eb02c2a68b`
MD5	`a7ea7786f32986694c18839e4bb99dd7`
BLAKE2b-256	`f7755146ea0fa36940bbbda44cc44715bc3ceeb658a95cdfe71d3f9926f8e158`

Provenance

The following attestation bundles were made for merger_plugin_api-1.0.0.tar.gz:

Publisher: publish-plugin-api.yml on diogotoporcov/merger-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: merger_plugin_api-1.0.0.tar.gz
- Subject digest: c540ea8539a91ef69ab10a8b3d932a7dffa3e430afa89e445cd758eb02c2a68b
- Sigstore transparency entry: 1214843240
- Sigstore integration time: Apr 2, 2026
Source repository:
- Permalink: diogotoporcov/merger-cli@e7a8a4e3c0b47b3dfa998d5ef42e9320b3de448b
- Branch / Tag: refs/tags/api-v1.0.0
- Owner: https://github.com/diogotoporcov
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-plugin-api.yml@e7a8a4e3c0b47b3dfa998d5ef42e9320b3de448b
- Trigger Event: push

File details

Details for the file merger_plugin_api-1.0.0-py3-none-any.whl.

File metadata

Download URL: merger_plugin_api-1.0.0-py3-none-any.whl
Upload date: Apr 2, 2026
Size: 30.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for merger_plugin_api-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bb7a09d42d68b63cc503ac9a5d09f50f315eaa3aab26b70aa175ca9bc8b0d4fb`
MD5	`818fd97f7dbe137ab10749d5937c0131`
BLAKE2b-256	`bd4914521b79623e86dd5e7525b5587738c65d1e00bd09d1f2b14f0e50490f3e`

Provenance

The following attestation bundles were made for merger_plugin_api-1.0.0-py3-none-any.whl:

Publisher: publish-plugin-api.yml on diogotoporcov/merger-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: merger_plugin_api-1.0.0-py3-none-any.whl
- Subject digest: bb7a09d42d68b63cc503ac9a5d09f50f315eaa3aab26b70aa175ca9bc8b0d4fb
- Sigstore transparency entry: 1214843348
- Sigstore integration time: Apr 2, 2026
Source repository:
- Permalink: diogotoporcov/merger-cli@e7a8a4e3c0b47b3dfa998d5ef42e9320b3de448b
- Branch / Tag: refs/tags/api-v1.0.0
- Owner: https://github.com/diogotoporcov
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-plugin-api.yml@e7a8a4e3c0b47b3dfa998d5ef42e9320b3de448b
- Trigger Event: push
