Doc-Warden is an internal project created by the Azure SDK Team. It is intended to be used by CI Builds to ensure that documentation standards are met. See readme for more details.

Doc Warden

Every CI build owned by the Azure-SDK team also needs to verify that the documentation within the target repo meets a set of standards. Doc-warden is intended to ease the implementation of these checks in CI builds.

Features:

  • Enforces Readme Standards
    • Readmes present - completed
    • Readmes have appropriate contents - completed
    • Files issues for failed standards checks - pending
  • Generates report for included observed packages - pending

This package is tested on Python 2.7 through 3.8.

Prerequisites

This package is intended to be run as part of a pipeline within Azure DevOps, so Python must be installed before you install or use Doc-Warden. pip comes pre-installed with most modern Python installs. If pip is not a recognized command when you attempt to install warden, run the ensurepip command below once your Python installation is complete.

In addition, warden is distributed using setuptools and wheel, so those packages should also be present prior to install.

/:> python -m ensurepip
/:> pip install setuptools wheel

Usage

Right now, warden has a single command, scan, which by default looks for a .docsettings.yml file within the target repo. Any setting that can be read from the .docsettings file can also be passed as a command-line parameter, and a parameter overrides the value in the file.
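The override rule described above can be sketched in a few lines of Python. This is a minimal illustration, not warden's actual code: the function name resolve_settings and the convention that None means "parameter not passed" are assumptions made for the example.

```python
def resolve_settings(file_settings, cli_overrides):
    """Merge settings, letting explicitly supplied CLI values win.

    Hypothetical helper: a value of None in cli_overrides is taken to mean
    "this parameter was not passed on the command line".
    """
    merged = dict(file_settings)  # start from the .docsettings values
    for key, value in cli_overrides.items():
        if value is not None:
            merged[key] = value  # the command-line value overrides the file
    return merged


file_settings = {"language": "java", "root_check_enabled": True}
cli_overrides = {"language": "python", "root_check_enabled": None}
settings = resolve_settings(file_settings, cli_overrides)
print(settings)
```

Here language comes from the command line, while root_check_enabled keeps the value from the file because no override was supplied.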

Example usage:


<pre-step, clone target repository>
...
/:> pip install setuptools wheel
/:> sudo pip install doc-warden
/:> ward scan -d $(Build.SourcesDirectory)

Notes for example above

  • Azure DevOps can be finicky about registering a console entry point, hence the sudo on the installation step only. sudo is only required on DevOps machines.
  • Assumption is that the .docsettings file is placed at the root of the repository.

To provide a different path (as azure-sdk-for-java does), use:


/:> ward scan -d $(Build.SourcesDirectory) -c $(Build.SourcesDirectory)/eng/.docsettings.yml

Parameter Options

command Currently supports the scan command. Additional commands may be supported in the future. Required.

--scan-directory The target directory warden should be scanning. Required.

--scan-language warden checks for packages by convention, so it needs to know which language it is looking at. This must be set either in the .docsettings file or by parameter. Required.

--config-location By default, warden looks for the .docsettings file in the root of the repository. However, populating this location will override this behavior and instead pull the file from the location in this parameter. Optional.
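The lookup described for --config-location amounts to a simple fallback. The sketch below is illustrative only; resolve_config_path is a hypothetical name, and it assumes the scan directory is the repository root.

```python
import os


def resolve_config_path(scan_directory, config_location=None):
    """Return the path of the .docsettings file to load.

    Hypothetical helper: an explicit config_location wins, otherwise the
    file is expected at the root of the scanned repository.
    """
    if config_location:
        return config_location
    return os.path.join(scan_directory, ".docsettings.yml")
```

Calling resolve_config_path("repo") yields the default location under repo, while passing a second argument returns that argument unchanged.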

--verbose-output Enable or disable output of an html report. Defaults to false. Optional.

Notes for Devops Usage

The -d argument should be $(Build.SourcesDirectory). This will point warden at the repo that has been associated with CI.

Methodology

Enforcing Readme Presence

When should we expect a readme to be present?

Always:

  • At the root of the repo
  • Associated with a package directory

.NET

A package directory is indicated by:

  • a *.sln file under the sdk directory

Python

A package directory is indicated by:

  • the presence of a setup.py file

Java

A package directory is indicated by:

  • the presence of a pom.xml file
    • The POM <packaging> value within is set to JAR

Node & JS

A package directory is indicated by:

  • The presence of a package.json file
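The per-language conventions above can be sketched as a directory walk. This is a rough illustration under stated assumptions, not warden's implementation: the function names are hypothetical, the "sdk directory" is assumed to sit directly under the repo root, and a POM with no <packaging> element is not counted here even though Maven itself defaults to jar.

```python
import os
import xml.etree.ElementTree as ET

README_NAMES = {"readme.md", "readme.rst"}


def _pom_is_jar(pom_path):
    """Return True if the POM's <packaging> element is set to jar."""
    root = ET.parse(pom_path).getroot()
    for element in root:
        # Drop any XML namespace prefix such as "{http://maven.apache.org/POM/4.0.0}".
        if element.tag.split("}")[-1] == "packaging":
            return (element.text or "").strip().lower() == "jar"
    return False  # assumption: a missing <packaging> element is not counted


def is_package_dir(dirpath, filenames, language, repo_root):
    """Apply the convention for one language to a single directory."""
    if language == "python":
        return "setup.py" in filenames
    if language == "js":
        return "package.json" in filenames
    if language == "java":
        return "pom.xml" in filenames and _pom_is_jar(os.path.join(dirpath, "pom.xml"))
    if language == "net":
        rel = os.path.relpath(dirpath, repo_root).replace(os.sep, "/")
        under_sdk = rel == "sdk" or rel.startswith("sdk/")
        return under_sdk and any(name.endswith(".sln") for name in filenames)
    raise ValueError("unsupported language: %s" % language)


def find_missing_readmes(repo_root, language):
    """List directories (relative to repo_root) that should have a readme but do not."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        is_root = os.path.abspath(dirpath) == os.path.abspath(repo_root)
        if is_root or is_package_dir(dirpath, filenames, language, repo_root):
            if not any(name.lower() in README_NAMES for name in filenames):
                missing.append(os.path.relpath(dirpath, repo_root))
    return sorted(missing)
```

The repo root is always checked, and every other directory is checked only when it looks like a package for the configured language.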

Enforcing Readme Content

doc-warden can check discovered readme files to ensure that a set of configured sections is present. How does it work? doc-warden checks each pattern in required_readme_sections against all headers present within a target readme. If every pattern matches at least one header, the readme passes content verification.

Other Notes:

  • A section title is any markdown or RST that will result in an <h1> or <h2> HTML tag.
  • warden will verify the content of any readme.rst or readme.md file found outside the omitted_paths in the targeted repo.
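The content check described above can be sketched as follows. This is a simplified illustration, not warden's code: it handles only ATX-style markdown headers ("#" and "##"), ignores RST entirely, does not skip fenced code blocks, and assumes each pattern must match a whole header title (re.fullmatch). The names extract_headers and missing_sections are hypothetical.

```python
import re

# Level 1 or 2 ATX headers, e.g. "# Title" or "## Title" (not "### Title").
HEADER_PATTERN = re.compile(r"^#{1,2}\s+(.+?)\s*$")


def extract_headers(markdown_text):
    """Collect the titles of all level 1 and level 2 markdown headers."""
    headers = []
    for line in markdown_text.splitlines():
        match = HEADER_PATTERN.match(line)
        if match:
            headers.append(match.group(1))
    return headers


def missing_sections(markdown_text, required_patterns):
    """Return the patterns that match no header in the readme."""
    headers = extract_headers(markdown_text)
    return [
        pattern
        for pattern in required_patterns
        if not any(re.fullmatch(pattern, header) for header in headers)
    ]


readme = "# Microsoft Azure SDK for Java\n\nIntro text.\n\n## Usage\n"
patterns = [
    "(Client Library for Azure .*|Microsoft Azure SDK for .*)",
    "Getting Started",
]
result = missing_sections(readme, patterns)
print(result)  # ['Getting Started']
```

A readme passes when the returned list is empty. In the example the title pattern is satisfied, but no header is titled "Getting Started", so that pattern is reported.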

Control, the .docsettings.yml File, and You

Special cases often need to be configured, so each repo needs a central location to override conventional settings. To that end, a new .docsettings.yml file will be added to each repo.

<repo-root>
│   README.md
│   .docsettings.yml
│
└───.azure-pipelines
│   │   <build def>
│   
└───<other files and folders>

The presence of this file allows each repository to customize how enforcement takes place within their repo.

Example DocSettings File for Java Repo

omitted_paths:
  - archive/*
language: java
root_check_enabled: True
required_readme_sections:
  - "(Client Library for Azure .*|Microsoft Azure SDK for .*)"
  - Getting Started
known_presence_issues:
  - ['cognitiveservices/data-plane/language/bingspellcheck', '#2847']
known_content_issues:
  - ['sdk/template/azure-sdk-template/README.md','#1368']

The above configuration tells warden...

  • The language within the repo is java
  • To ensure that a README.md is present at the root of the repository.
  • To omit any paths under archive/ from the readme checks.
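One way to picture the omitted_paths behavior is as glob matching against repo-relative paths. The sketch below assumes shell-style wildcards via Python's fnmatch module; whether warden matches paths exactly this way is an assumption, and is_omitted is a hypothetical name.

```python
import fnmatch


def is_omitted(relative_path, omitted_paths):
    """Return True if a repo-relative path matches any omitted_paths glob."""
    normalized = relative_path.replace("\\", "/")  # tolerate Windows separators
    return any(fnmatch.fnmatchcase(normalized, pattern) for pattern in omitted_paths)


omitted = ["archive/*"]
print(is_omitted("archive/old-sdk/README.md", omitted))
print(is_omitted("sdk/template/README.md", omitted))
```

With the example configuration, anything under archive/ is skipped, while readmes elsewhere in the repo are still checked.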

Possible values for language right now are ['net', 'java', 'js', 'python']. More than one target language is not currently supported.

required_readme_sections Configuration

This section instructs warden to verify that there is at least one matching section title for each provided section pattern in any discovered readme. Regex is fully supported.

The two items listed from the example .docsettings file will:

  • Match any header that satisfies the regular expression
  • Match a header exactly titled "Getting Started"

Note that a regex must be surrounded by quotation marks whenever it would otherwise break YAML parsing of the configuration file.

known_presence_issues and known_content_issues Configuration

doc-warden is designed to crash builds if it detects failures. However, the vast majority of the time, these issues cannot be fixed immediately. In the above configuration, there are two paths highlighted as known issues.

The first, known_presence_issues, tells warden that a presence failure detected at the specified paths should be ignored and should not result in a crashed build. A tuple describing each known issue specifies both what the known issue is, as well as some sort of justification. Having an exception with an issueId attached is a good justification for not failing the build.

In effect, each entry says: "We're aware of this issue, and it is tracked in the following GitHub issue."

The known_content_issues parameter functions identically to known_presence_issues. If a readme is listed as "already known" to have failures, Warden will not crash the CI build because of it.
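The known-issue handling can be sketched as a partition of detected failures. This is an illustration under assumptions, not warden's code: split_failures is a hypothetical name, paths are compared by exact string equality, and "sdk/storage" is a made-up failing path used only for the example.

```python
def split_failures(failed_paths, known_issues):
    """Separate failures covered by a known issue from those that break the build.

    known_issues is a list of [path, justification] pairs, as in .docsettings.yml.
    """
    known = {path: issue for path, issue in known_issues}
    ignored = [(path, known[path]) for path in failed_paths if path in known]
    blocking = [path for path in failed_paths if path not in known]
    return ignored, blocking


known_presence_issues = [
    ["cognitiveservices/data-plane/language/bingspellcheck", "#2847"],
]
failures = [
    "cognitiveservices/data-plane/language/bingspellcheck",
    "sdk/storage",  # hypothetical path with no known issue on file
]
ignored, blocking = split_failures(failures, known_presence_issues)
print(ignored)
print(blocking)
```

Only a non-empty blocking list would crash the build; ignored entries keep their issue reference so the justification stays visible in any report.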

Provide Feedback

If you encounter any bugs or have suggestions, please file an issue here and assign it to scbedd.
