avdoc

CLI tool to generate human-readable HTML documentation for an Apache Avro schema AVSC file.

Want Avro schema docs? 'avdoc!

[Image: screenshot showing avdoc]

Installation

Requirements

Software required outside of Python package dependencies:

Install as Python package

Install the avdoc package on PyPI:

pip install --upgrade avdoc

Usage

[python -m] avdoc tests/example.avsc > out/example.html && open out/example.html

To provide a version ID, e.g. the current git commit:

[python -m] avdoc --schema-version $(git rev-parse --short HEAD) example.avsc > out/example.html

$ avdoc --help

usage: avdoc [-h] [--schema-title SCHEMA_TITLE]
             [--schema-version SCHEMA_VERSION]
             avsc

CLI tool to generate HTML documentation for an Apache Avro schema

positional arguments:
  avsc

options:
  -h, --help            show this help message and exit
  --schema-title SCHEMA_TITLE
  --schema-version SCHEMA_VERSION
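
If you would rather drive avdoc from a build script than from the shell, the invocation above can be wrapped in a few lines of Python. This is a minimal sketch, assuming avdoc is installed and on your PATH and that git is available. The function names (build_avdoc_command, current_git_commit, generate_docs) are hypothetical helpers for illustration, not part of avdoc. Only the flags shown in the help output are used.

```python
import subprocess
from pathlib import Path


def build_avdoc_command(avsc, schema_version=None, schema_title=None):
    """Build the argument list for an avdoc invocation (hypothetical helper)."""
    cmd = ["avdoc"]
    if schema_title is not None:
        cmd += ["--schema-title", schema_title]
    if schema_version is not None:
        cmd += ["--schema-version", schema_version]
    # The AVSC path is the single positional argument.
    cmd.append(str(avsc))
    return cmd


def current_git_commit():
    """Return the short hash of the current git commit."""
    result = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def generate_docs(avsc, out_html, schema_version=None, schema_title=None):
    """Run avdoc and write its standard output to out_html."""
    out_path = Path(out_html)
    # Unlike a shell redirect, create the output directory if it is missing.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    cmd = build_avdoc_command(avsc, schema_version, schema_title)
    with out_path.open("wb") as fh:
        # avdoc prints the HTML to stdout, so redirect stdout into the file.
        subprocess.run(cmd, check=True, stdout=fh)
```

For example, generate_docs("example.avsc", "out/example.html", schema_version=current_git_commit()) should be equivalent to the shell command above that passes the current git commit as the version ID.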

Features

  • graph/diagram of which record schemas reference each other
    • reference graph for every complex (record) type
  • Markdown support in "doc" strings
  • supports all complex Avro types
    • record
    • enum
    • array
    • union
    • error
    • map
    • request
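
To make the features above concrete, here is a small, made-up schema that uses several of the listed types and Markdown in its "doc" strings, together with a sketch of how record-to-record references could be collected for a reference graph. Everything here is an assumption for illustration: the schema names are hypothetical, namespaces are ignored for simplicity, and record_references is not avdoc's actual implementation.

```python
import json

# Hypothetical example schema; names and doc text are made up for illustration.
EXAMPLE_SCHEMA = {
    "type": "record",
    "name": "Order",
    "namespace": "com.example",
    "doc": "A customer **order**. The `lines` field holds one entry per product.",
    "fields": [
        {"name": "id", "type": "string", "doc": "Unique order ID."},
        {"name": "status", "type": {
            "type": "enum", "name": "Status", "symbols": ["OPEN", "SHIPPED"],
        }},
        {"name": "lines", "type": {"type": "array", "items": {
            "type": "record",
            "name": "OrderLine",
            "fields": [
                {"name": "sku", "type": "string"},
                {"name": "quantity", "type": "int"},
            ],
        }}},
        {"name": "tags", "type": {"type": "map", "values": "string"}},
        {"name": "parent", "type": ["null", "Order"], "default": None},
    ],
}


def record_references(schema):
    """Return sorted (referrer, referenced) pairs between record types.

    Simplification: records are matched by short name, ignoring namespaces.
    """
    records = set()   # names of every record definition seen
    pending = []      # (owner record, referenced type name) candidates

    def walk(node, owner):
        if isinstance(node, str):
            # A bare string is a primitive or a reference to a named type.
            if owner is not None:
                pending.append((owner, node))
        elif isinstance(node, list):
            # A union: visit every branch.
            for branch in node:
                walk(branch, owner)
        elif isinstance(node, dict):
            kind = node.get("type")
            if kind == "record":
                name = node["name"]
                records.add(name)
                if owner is not None:
                    pending.append((owner, name))
                for field in node.get("fields", []):
                    walk(field["type"], name)
            elif kind == "array":
                walk(node["items"], owner)
            elif kind == "map":
                walk(node["values"], owner)
            elif isinstance(kind, (dict, list)):
                # Nested type definition such as {"type": ["null", "X"]}.
                walk(kind, owner)

    walk(schema, None)
    # Keep only references that point at a record defined in this schema.
    return sorted({(a, b) for a, b in pending if b in records})
```

Saving json.dumps(EXAMPLE_SCHEMA, indent=2) to a file such as example.avsc gives you something to feed to avdoc. For this schema, record_references returns the self-reference from Order to Order and the reference from Order to OrderLine.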

Design Goals

The output should:

  • be well-formatted semantic HTML.
  • be legible in basic browsers without styling.
  • aid understanding of the underlying schema.
  • be a single static file for sharing without dependencies.
  • be linkable to reference specific schemas and fields.
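
As a rough illustration of what "semantic" and "linkable" can mean in practice, the sketch below renders one record as a section with an id, and each field as a definition-list entry with its own id, so that a URL fragment such as #Order.id points at a specific field. The id scheme, the markup, and the function names are assumptions for illustration only. avdoc's real output may differ.

```python
from html import escape


def anchor_id(*parts):
    """Join name parts into an id usable as a URL fragment (hypothetical scheme)."""
    return ".".join(parts)


def render_record(name, doc, fields):
    """Render a record as a <section> with a definition list of its fields.

    fields is a list of (field_name, field_type) string pairs.
    """
    rows = []
    for field_name, field_type in fields:
        field_id = anchor_id(name, field_name)
        rows.append(
            f'    <dt id="{escape(field_id)}"><code>{escape(field_name)}</code></dt>\n'
            f"    <dd><code>{escape(field_type)}</code></dd>"
        )
    body = "\n".join(rows)
    # Plain semantic elements only, so the page stays legible without CSS or JS.
    return (
        f'<section id="{escape(anchor_id(name))}">\n'
        f"  <h2>{escape(name)}</h2>\n"
        f"  <p>{escape(doc)}</p>\n"
        f"  <dl>\n{body}\n  </dl>\n"
        f"</section>"
    )
```

Calling render_record("Order", "A customer order.", [("id", "string")]) produces a section that can be linked as #Order, with the field linkable as #Order.id.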

Development

  • devenv for development environment
  • direnv for automatic shell activation (optional)

Running devenv shell should set up Python and Poetry with dependencies installed. Use .venv/bin/python as your Python interpreter.

Publishing

Bump version

bumpversion major|minor|patch

Update documentation

Run mdsh.

Publish Python package to PyPI

Configure Poetry credentials with PyPI token and run:

poetry publish --build

Architecture

Not much to speak of.

avdoc is a couple of hundred lines of Python that generate static HTML, with a bit of string munging to get component outputs into the final HTML page. The code is purpose-oriented. The output is opinionated, but not much time has been spent on the code past getting it working for my own needs. It's not intended to be exemplary of anything in particular.

Maintenance

I probably won't pay too much attention to avdoc maintenance once it's suitable for my own needs. I'd like to try to ensure that dependencies are kept up to date.

Fork for your own needs. Raise a PR if you'd like me to consider including your changes. Make sure you adhere to the license by ensuring your users have access to your modifications.

License

AGPL:

[…] requires the operator of a network server to provide the source code of the modified version running there to the users of that server. Therefore, public use of a modified version, on a publicly accessible server, gives the public access to the source code of the modified version.

avdoc is released as copyleft software. If you modify avdoc then you must make your changes available to your users.

If the AGPL license is an issue, and you want to relicense avdoc privately, then reach out to discuss pricing.

Prior Art

avdoc is intended as a replacement for avrodoc-plus, which itself was intended as a replacement for avrodoc, via a long line of forks.

To run avrodoc-plus and see its output:

npm install @mikaello/avrodoc-plus
node_modules/@mikaello/avrodoc-plus/bin/avrodoc-plus.js example.avsc --output out/avrodocplus.html

Why?

Unfortunately the original avrodoc and its forks are all in varying stages of software decay, mostly due to Node.js ecosystem churn. Their npm dependencies include packages which have themselves gone unmaintained or had breaking changes in later versions, with CVEs piling up against the transitive dependencies. avrodoc-plus has about 10 critical CVEs in its dependency graph. This isn't necessarily an issue in itself unless you're running these avrodoc tools in an online capacity or on untrusted input. But at $WORK it was generating a lot of false positives in automatic SBOM security scanners, which had to be explained to infosec specialists.

The HTML output from the avrodoc tools is also rather dynamic, requiring JS to render, when it could just be a classic HTML page.

I have taken the opportunity to implement some quality-of-life improvements for readers. See §Design Goals for more info.

Why the name avdoc specifically? The Apache Software Foundation protects project name trademarks (quite rightly) and I wanted to avoid the kcat naming issue.

avdoc is "Powered by Apache Avro™" but not a part of Apache Avro™.
