A standard language for machine-readable code comments

Many source code analysis tools rely on specially formatted comments to annotate code. Such comments are an important part of the Python ecosystem, but there is still no single standard for them. This library offers one.

Why?

In the Python ecosystem, there are many tools dealing with source code: linters, test coverage collection systems, and many others. Many of them use special comments, and as a rule, the style of these comments is very similar. Here are some examples:

  • Ruff, Vulture —> # noqa, # noqa: E741, F841.
  • Black and Ruff —> # fmt: on, # fmt: off.
  • Mypy —> # type: ignore, # type: ignore[error-code].
  • Coverage —> # pragma: no cover, # pragma: no branch.
  • Isort —> # isort: skip, # isort: off.
  • Bandit —> # nosec.

But you know what? There is no single standard for such comments.

The internal implementations that read such comments also differ. Some tools use regular expressions, some use even more primitive string processing, and some use full-fledged parsers, either the Python parser or one written from scratch.

As a result, as a user, you need to remember the comment rules of each specific tool. At the same time, you can't be sure that double comments (when you want to leave 2 comments for different tools on one line of code) will work at all. And as the creator of such a tool, you face a seemingly simple task, just reading a comment, and then discover that it is surprisingly difficult and leaves room for many mistakes.

This is exactly the problem that this library solves. It describes a simple and intuitive standard for action comments, and also offers a ready-made parser that creators of other tools can use. The standard offered by this library is based entirely on a subset of the Python syntax and can be easily reimplemented even if you do not want to use this library directly.

The language

So, this library offers a language for action comments. Its syntax is a subset of Python syntax, but without Python semantics, as full-fledged execution does not occur. The purpose of the language is simply to provide the developer with the content of the comment in a convenient way, if it is written in a compatible format. If the comment format is not compatible with the parser, it is ignored.

From the point of view of the language, any meaningful comment can consist of 3 elements:

  • Key. This is usually the name of the specific tool for which this comment is intended, but in some cases it may be something else. This can be any string allowed as an identifier in Python.
  • Action. The short name of the action that you want to link to this line. It must also be a valid Python identifier.
  • List of arguments. These are often identifiers of specific linting rules or other arguments associated with this action. The allowed data types are described below.

Consider a comment designed to ignore a specific mypy rule:

# type: ignore[error-code]
└-key-┘└action┴-arguments┘

↑ The key here is the word type, that is, what comes before the colon. The action is the word ignore, that is, what comes after the colon but before the square brackets. Finally, the list of arguments is what is inside the square brackets; in this case it contains a single argument: error-code.

Simplified writing is also possible, without a list of arguments:

# type: ignore
└-key-┘└action┘

↑ In this case, the parser assumes that there is an argument list, but it is empty.
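Because the syntax is a subset of Python, the key/action/arguments split can be illustrated with the standard ast module alone: a comment such as type: ignore[error-code] happens to parse as an annotated name. The following is only a sketch under that observation, not the library's actual implementation; the function name split_comment is hypothetical, and Python 3.9+ is assumed.

```python
import ast


def split_comment(text):
    """Split one comment into (key, action, argument_nodes), or return None."""
    try:
        module = ast.parse(text.strip())
    except SyntaxError:
        return None  # not valid Python syntax, so not in the expected format
    if len(module.body) != 1:
        return None
    statement = module.body[0]
    # "key: action[...]" parses as an annotated name without an assigned value.
    if not isinstance(statement, ast.AnnAssign) or statement.value is not None:
        return None
    if not isinstance(statement.target, ast.Name):
        return None
    key, annotation = statement.target.id, statement.annotation
    if isinstance(annotation, ast.Name):  # "key: action" means an empty argument list
        return key, annotation.id, []
    if isinstance(annotation, ast.Subscript) and isinstance(annotation.value, ast.Name):
        inner = annotation.slice  # assumes Python 3.9+, where this is a plain expression
        nodes = list(inner.elts) if isinstance(inner, ast.Tuple) else [inner]
        return key, annotation.value.id, nodes
    return None


key, action, nodes = split_comment('type: ignore[error-code]')
print(key, action, len(nodes))
#> type ignore 1
print(split_comment('type: ignore'))
#> ('type', 'ignore', [])
```

The arguments are left as raw AST nodes here; turning them into values is a separate step.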

The list can hold any number of arguments, separated by commas. Here are the valid data types for arguments:

  • Valid Python identifiers. They are interpreted as strings.
  • Two valid Python identifiers separated by the - symbol, like this: error-code. Any number of spaces may surround the - symbol; they are ignored. The whole thing is interpreted as a single string.
  • String literals.
  • Numeric literals (int, float, complex).
  • Boolean literals (True and False).
  • None.
  • ... (ellipsis).
  • Any other Python-compatible code. This is disabled by default, but you can enable a mode in which such code is read and returned to you as AST objects, which you can then process yourself.

The syntax of all these data types follows the Python original (except that multi-line forms are not allowed). The metacode syntax may be extended over time, but this basic form will always be supported.

A single line can hold several comments in the metacode format. In this case, they should be separated by the # symbol, as if each subsequent comment were a comment on the previous one. You can also add regular text comments; the parser simply ignores them if they are not in the metacode format:
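To make the mapping from argument syntax to values concrete, here is a minimal sketch built on the standard ast module. It is an illustration, not the library's code: convert_argument and read_arguments are hypothetical names, and the sketch covers only simple cases (for example, it does not handle negative numbers or composite complex literals such as 1+2j).

```python
import ast

# Literal value types accepted by this sketch.
LITERAL_TYPES = (str, int, float, complex, bool, type(None), type(...))


def convert_argument(node):
    """Turn one argument AST node into a plain Python value."""
    if isinstance(node, ast.Name):
        return node.id  # a bare identifier is read as a string
    if (isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub)
            and isinstance(node.left, ast.Name) and isinstance(node.right, ast.Name)):
        # "error-code" parses as a subtraction of two names; glue it back together
        return f'{node.left.id}-{node.right.id}'
    if isinstance(node, ast.Constant) and isinstance(node.value, LITERAL_TYPES):
        return node.value  # strings, numbers, True, False, None, ...
    raise ValueError(f'unsupported argument: {ast.dump(node)}')


def read_arguments(source):
    """Read a comma-separated argument list written without the brackets."""
    body = ast.parse(source.strip(), mode='eval').body
    nodes = body.elts if isinstance(body, ast.Tuple) else [body]
    return [convert_argument(node) for node in nodes]


print(read_arguments("E741, error - code, 'text', 42, 1.5, 2j, True, None, ..."))
#> ['E741', 'error-code', 'text', 42, 1.5, 2j, True, None, Ellipsis]
```

Note that the spaces around the - symbol disappear in the result, because the two identifiers are rejoined from the parsed nodes rather than copied from the source text.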

# type: ignore # <- This is a comment for mypy! # fmt: off # <- And this is a comment for Ruff!
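A rough sketch of how such a chained line could be taken apart, using only the standard library: split on the # symbol, then keep the pieces that look like key: action or key: action[...]. The function names are hypothetical, and the split is deliberately naive (a # inside a string literal argument would break it), so treat this as an illustration rather than the library's behavior.

```python
import ast


def is_metacode(text):
    """Check whether a piece of text has the shape `key: action` or `key: action[...]`."""
    try:
        body = ast.parse(text.strip()).body
    except SyntaxError:
        return False
    if len(body) != 1 or not isinstance(body[0], ast.AnnAssign):
        return False
    statement = body[0]
    if statement.value is not None or not isinstance(statement.target, ast.Name):
        return False
    return isinstance(statement.annotation, (ast.Name, ast.Subscript))


def split_chain(comment):
    """Return the metacode pieces of a comment line, dropping plain text."""
    # Naive split: a '#' inside a string literal argument is not handled.
    parts = [part.strip() for part in comment.split('#')]
    return [part for part in parts if part and is_metacode(part)]


line = '# type: ignore # <- This is a comment for mypy! # fmt: off # <- And this is a comment for Ruff!'
print(split_chain(line))
#> ['type: ignore', 'fmt: off']
```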

If you scroll back up to the examples of action comments from various tools, you may notice that the syntax of most of them (but not all) can be described using metacode, and the rest can be easily adapted to it. Read on to learn how to use the ready-made parser in practice.

Installation

Install it:

pip install metacode

You can also quickly try out this and other packages without installing them, using instld.

Usage

The parser offered by this library is just one function that is imported like this:

from metacode import parse

To use it, extract the text of the comment by some other means (preferably, but not necessarily, without the leading # symbol) and pass it as the first argument, with the expected key as the second argument. The function returns a list with the contents of all the comments that were parsed:

print(parse('type: ignore[error-code]', 'type'))
#> [ParsedComment(key='type', command='ignore', arguments=['error-code'])]
print(parse('type: ignore[error-code] # type: not_ignore[another-error]', 'type'))
#> [ParsedComment(key='type', command='ignore', arguments=['error-code']), ParsedComment(key='type', command='not_ignore', arguments=['another-error'])]

As you can see, the parse() function returns a list of ParsedComment objects. Here are the fields of this type's objects and their expected types:

key: str 
command: str
arguments: List[Optional[Union[str, int, float, complex, bool, EllipsisType, AST]]]

↑ Please note that because you pass a key, the result is filtered by that key. This way you can read only the comments that relate to your tool and ignore the rest.

By default, an argument in a comment must be of one of the strictly allowed types. However, you can enable reading of arbitrary other code, in which case it is returned as AST nodes. To do this, pass allow_ast=True:
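How the comment text is extracted is left to the caller. One possible approach, sketched here with the standard tokenize module, collects every comment in a piece of source code together with its line number. The helper name extract_comments is hypothetical and is not part of metacode; each resulting string could then be handed to parse().

```python
import io
import tokenize


def extract_comments(source):
    """Map line numbers to comment text with the leading '#' removed."""
    comments = {}
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type == tokenize.COMMENT:
            # token.string starts with '#'; drop it and the surrounding spaces
            comments[token.start[0]] = token.string[1:].strip()
    return comments


source = 'x = 1  # type: ignore[error-code]\ny = "# not a comment"\n'
print(extract_comments(source))
#> {1: 'type: ignore[error-code]'}
```

Using the tokenizer rather than searching for # in each line avoids mistaking a # inside a string literal for the start of a comment.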

print(parse('key: action[a + b]', 'key', allow_ast=True))
#> [ParsedComment(key='key', command='action', arguments=[<ast.BinOp object at 0x102e44eb0>])]

↑ If you do not pass allow_ast=True, a metacode.errors.UnknownArgumentTypeError exception will be raised. When processing an argument, you can also raise this exception for an AST node of a format that your tool does not expect.

⚠️ Be careful when writing code that analyzes the AST. Different versions of the Python interpreter can generate different ASTs from the same code, so don't forget to test your code well across versions (for example, using a CI matrix or tox). Otherwise, it is better to use the standard metacode argument types.
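As an illustration of defensive AST handling, the sketch below accepts exactly one node shape (name + name) and rejects everything else. It relies on isinstance checks rather than on the textual output of ast.dump(), which is more likely to change between interpreter versions. The function name read_sum is hypothetical, the node is built with ast.parse() as a stand-in for what parse(..., allow_ast=True) would return, and ValueError stands in for metacode.errors.UnknownArgumentTypeError.

```python
import ast


def read_sum(node):
    """Accept only `name + name` and return the two names."""
    if (isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add)
            and isinstance(node.left, ast.Name) and isinstance(node.right, ast.Name)):
        return node.left.id, node.right.id
    # A real tool could raise metacode.errors.UnknownArgumentTypeError here instead.
    raise ValueError(f'unexpected argument: {ast.dump(node)}')


node = ast.parse('a + b', mode='eval').body  # stand-in for an AST argument
print(read_sum(node))
#> ('a', 'b')
```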

You can allow your users to write keys in any case. To do this, pass ignore_case=True:

print(parse('KEY: action', 'key', ignore_case=True))
#> [ParsedComment(key='KEY', command='action', arguments=[])]

You can also easily add support for several different keys. To do this, pass a list of keys instead of one key:

print(parse('key: action # other_key: other_action', ['key', 'other_key']))
#> [ParsedComment(key='key', command='action', arguments=[]), ParsedComment(key='other_key', command='other_action', arguments=[])]
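For readers reimplementing the format, the key filtering described above might be sketched as follows. This is an assumption about the observable behavior (one key or a list of keys, with optional case-insensitive matching), not the library's code, and key_matches is a hypothetical name.

```python
def key_matches(found, expected, ignore_case=False):
    """Check a key found in a comment against one expected key or a list of keys."""
    expected_keys = [expected] if isinstance(expected, str) else list(expected)
    if ignore_case:
        return found.lower() in {key.lower() for key in expected_keys}
    return found in expected_keys


print(key_matches('KEY', 'key'))
#> False
print(key_matches('KEY', 'key', ignore_case=True))
#> True
print(key_matches('other_key', ['key', 'other_key']))
#> True
```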

Well, now we can read comments. But what if we want to write them? There is another function for this, insert():

from metacode import insert, ParsedComment

Pass it the comment you want to insert, as well as the existing comment (an empty string if there is none, or text starting with # if there is one), and you get the new comment text back:

print(insert(ParsedComment(key='key', command='command', arguments=['lol', 'lol-kek']), ''))
# key: command[lol, 'lol-kek']
print(insert(ParsedComment(key='key', command='command', arguments=['lol', 'lol-kek']), '# some existing text'))
# key: command[lol, 'lol-kek'] # some existing text

As you can see, our comment is inserted before the existing comment. However, you can do the opposite:

print(insert(ParsedComment(key='key', command='command', arguments=['lol', 'lol-kek']), '# some existing text', at_end=True))
# some existing text # key: command[lol, 'lol-kek']

⚠️ Be careful: AST nodes can be read, but cannot be written.
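The outputs above suggest a simple rendering rule: string arguments that are valid identifiers are written bare, and everything else is written as a Python literal. The sketch below reproduces those outputs with the standard library only. It is a guess at the visible behavior, not the implementation of insert(); all function names are hypothetical, and plain values are used in place of ParsedComment.

```python
import keyword


def render_argument(value):
    """Write one argument: bare identifier where possible, Python literal otherwise."""
    if value is Ellipsis:
        return '...'
    if isinstance(value, str) and value.isidentifier() and not keyword.iskeyword(value):
        return value
    return repr(value)


def render_comment(key, command, arguments):
    """Build the `key: command[arguments]` text, omitting empty brackets."""
    if not arguments:
        return f'{key}: {command}'
    rendered = ', '.join(render_argument(argument) for argument in arguments)
    return f'{key}: {command}[{rendered}]'


def insert_comment(new, existing, at_end=False):
    """Combine a rendered comment with the existing comment text."""
    new = f'# {new}'
    if not existing:
        return new
    return f'{existing} {new}' if at_end else f'{new} {existing}'


text = render_comment('key', 'command', ['lol', 'lol-kek'])
print(insert_comment(text, ''))
#> # key: command[lol, 'lol-kek']
print(insert_comment(text, '# some existing text', at_end=True))
#> # some existing text # key: command[lol, 'lol-kek']
```

The keyword check keeps a string such as 'True' quoted, so that it is not read back as a Boolean.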

What about other languages?

If you are writing your Python-related tool not in Python but, as is currently fashionable, in some other language such as Rust, you may still want to adhere to the metacode standard for machine-readable comments. However, you cannot directly use the ready-made parser described above. What to do?

The proposed metacode language is a syntactic subset of Python. The original metacode parser allows you to read arbitrary arguments written in Python as AST nodes. The rules for such parsing are determined by the specific version of the interpreter that metacode runs under, and they cannot be strictly standardized, since Python syntax is gradually evolving in an unpredictable direction. However, you can use a "safe" subset of the valid syntax by implementing your parser based on this EBNF grammar:

line ::= element { "#" element }
element ::= statement | ignored_content
statement ::= key ":" action [ "[" arguments "]" ]
ignored_content ::= ? any sequence of characters excluding "#" ?

key ::= identifier
action ::= identifier { "-" identifier }
arguments ::= argument { "," argument }

argument ::= hyphenated_identifier 
           | identifier 
           | string_literal 
           | complex_literal 
           | number_literal 
           | "True" | "False" | "None" | "..."

hyphenated_identifier ::= identifier "-" identifier
identifier ::= ? python-style identifier ?
string_literal ::= ? python-style string ?
number_literal ::= ? python-style number ?
complex_literal ::= ? python-style complex number ?
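A compact sketch of a parser written directly from this grammar, without relying on Python's own parser, might look like the following. It is shown in Python only for consistency with the rest of this text; the same structure should carry over to other languages. It is deliberately simplified: identifiers are ASCII-only, strings may not contain escapes, commas, ] or #, and only plain integers and floats are supported. All names are hypothetical.

```python
import re

IDENT = r'[A-Za-z_][A-Za-z0-9_]*'  # ASCII-only simplification of Python identifiers
STATEMENT = re.compile(
    rf'\s*(?P<key>{IDENT})\s*:\s*'
    rf'(?P<action>{IDENT}(?:\s*-\s*{IDENT})*)\s*'
    r'(?:\[(?P<args>[^\]]*)\])?\s*'
)
CONSTANTS = {'True': True, 'False': False, 'None': None, '...': Ellipsis}


def read_argument(text):
    """Convert the source text of one argument into a value."""
    text = text.strip()
    if text in CONSTANTS:
        return CONSTANTS[text]
    if re.fullmatch(rf'{IDENT}(\s*-\s*{IDENT})?', text):
        return re.sub(r'\s+', '', text)  # identifier or hyphenated_identifier
    if re.fullmatch(r'[0-9]+', text):
        return int(text)
    if re.fullmatch(r'[0-9]+\.[0-9]+', text):
        return float(text)
    if re.fullmatch(r"'[^'\\]*'", text) or re.fullmatch(r'"[^"\\]*"', text):
        return text[1:-1]  # simple string literal without escapes
    raise ValueError(f'unsupported argument: {text!r}')


def parse_line(line):
    """Return (key, action, arguments) for every statement in the line."""
    statements = []
    for element in line.split('#'):
        match = STATEMENT.fullmatch(element)
        if match is None:
            continue  # ignored_content
        raw = match['args']
        try:
            arguments = [] if raw is None else [read_argument(part) for part in raw.split(',')]
        except ValueError:
            continue  # an unsupported argument makes the whole element ignored
        action = re.sub(r'\s+', '', match['action'])
        statements.append((match['key'], action, arguments))
    return statements


print(parse_line("# type: ignore[error-code, 42, 'text', True] # plain words # fmt: off"))
#> [('type', 'ignore', ['error-code', 42, 'text', True]), ('fmt', 'off', [])]
```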

If you implement an open-source parser for this grammar in a language other than Python, please let me know, and I can mention it in this text.
