Annotations

This repository contains a framework for generating annotations for command invocations. It comprises a parser which turns a string into a command invocation data structure. For the time being, there are two sets of annotation generators:

  • input-output information which specifies how a command invocation interacts with the files, pipes, stdin, stdout, etc.
  • parallelizability information, which describes how a command invocation can be parallelized, including how to split inputs, mappers and aggregators, etc.

Command-line tool

main.py contains a command-line tool which, given a command invocation, returns:

  • the parsed command invocation data structure
  • the input-output information generated
  • the parallelizability information generated

Adding an annotation

Parser

The parser uses the JSON files in command_flag_option_info to parse xbd-type terminal commands. It splits the input on spaces (" ") and equal signs ("=").
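
As a rough illustration of the splitting step only, the sketch below breaks an invocation string on spaces and equal signs. The function name split_invocation is hypothetical and not taken from this repository, and the sketch does not consult the JSON files or handle quoting.

```python
import re


def split_invocation(command: str) -> list[str]:
    """Split a command string on runs of spaces and equal signs.

    Hypothetical helper for illustration; not the repository's parser.
    """
    # re.split yields empty strings for leading or trailing separators,
    # so those are filtered out.
    return [token for token in re.split(r"[ =]+", command) if token]


tokens = split_invocation("head --lines=10 input.txt")
print(tokens)
```

Splitting on "=" is what lets --lines=10 and --lines 10 yield the same token sequence.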

Flag and Option Information

The folder command_flag_option_info contains [command_name].json files with a list of flags and options for each command. For arguments that have two forms (e.g. -a and --all), store them as a pair in the format [short version, long version]. In addition, we record here how an argument is accessed where applicable, e.g., whether it is a file.
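
To make the [short version, long version] convention concrete, here is a minimal sketch that loads such pairs and builds a lookup from either spelling. The JSON layout, the key names "flags" and "options", and the helper build_lookup are assumptions for illustration; the actual schema of the files in command_flag_option_info may differ.

```python
import json

# Assumed layout for a file such as command_flag_option_info/grep.json.
# The key names are hypothetical; only the [short, long] pairing comes
# from the description above.
EXAMPLE_JSON = """
{
  "flags": [["-i", "--ignore-case"], ["-v", "--invert-match"]],
  "options": [["-f", "--file"], ["-e", "--regexp"]]
}
"""


def build_lookup(pairs: list[list[str]]) -> dict[str, tuple[str, str]]:
    """Map every spelling (short or long) to its (short, long) pair."""
    lookup: dict[str, tuple[str, str]] = {}
    for short, long_form in pairs:
        lookup[short] = (short, long_form)
        lookup[long_form] = (short, long_form)
    return lookup


info = json.loads(EXAMPLE_JSON)
flag_lookup = build_lookup(info["flags"])
option_lookup = build_lookup(info["options"])
```

With such a lookup, -i and --ignore-case resolve to the same entry, so later stages need not care which spelling the user typed.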

We also have a regex-based script that can be used to generate initial JSON files with parsed arguments. Since man pages follow no standard format, the quality of the results varies, but the script usually provides a good skeleton and saves considerable time.

Annotation Generation

Currently, annotation generators for input-output information and parallelizability information have been implemented. Each annotation generator implements a specific generator interface (e.g., InputOutputInfoGenerator_Interface.py) which specializes a more general generator interface (Generator_Interface.py). The general generator interface contains functions that help to check conditions on the command invocation, while the more specific generator interface provides functionality to change the respective information object being generated.
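
The layering described above can be sketched as follows. Every class and method name in this block (Invocation, GeneratorSketch, InputOutputGeneratorSketch, CatSketch, has_flag, and so on) is hypothetical and chosen for illustration; the sketch does not reproduce the actual contents of Generator_Interface.py or InputOutputInfoGenerator_Interface.py.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Invocation:
    """Hypothetical stand-in for the parsed command invocation."""
    name: str
    flags: set[str] = field(default_factory=set)
    operands: list[str] = field(default_factory=list)


class GeneratorSketch(ABC):
    """General layer: helpers that check conditions on the invocation."""

    def __init__(self, invocation: Invocation) -> None:
        self.invocation = invocation

    def has_flag(self, flag: str) -> bool:
        return flag in self.invocation.flags

    def operand_count(self) -> int:
        return len(self.invocation.operands)

    @abstractmethod
    def generate(self) -> None:
        """Fill in the information object for this invocation."""


class InputOutputGeneratorSketch(GeneratorSketch):
    """Specific layer: helpers that change the information being built."""

    def __init__(self, invocation: Invocation) -> None:
        super().__init__(invocation)
        self.reads_stdin = False
        self.input_files: list[str] = []

    def set_reads_stdin(self) -> None:
        self.reads_stdin = True

    def add_input_files(self, files: list[str]) -> None:
        self.input_files.extend(files)


class CatSketch(InputOutputGeneratorSketch):
    """Per-command generator: combines checks and updates."""

    def generate(self) -> None:
        if self.operand_count() == 0:
            self.set_reads_stdin()
        else:
            self.add_input_files(self.invocation.operands)


gen = CatSketch(Invocation("cat", operands=["a.txt", "b.txt"]))
gen.generate()
```

The point of the split is that a per-command generator only combines condition checks from the general layer with update calls from the specific layer.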

Terms

  • flag = takes no arguments, e.g. --verbose
  • option = takes arguments, e.g. -n 10
  • operand = argument with no flag, e.g. input.txt
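
The three terms can be illustrated with a small classifier over already-split tokens. The function classify and its parameters are hypothetical; the sets of known flags and options are passed in by hand here, whereas the repository takes them from the JSON files.

```python
def classify(
    tokens: list[str], flags: set[str], options: set[str]
) -> list[tuple[str, str]]:
    """Label each token as a flag, an option (with its argument), or an operand."""
    result: list[tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in flags:
            result.append(("flag", token))
        elif token in options and i + 1 < len(tokens):
            # An option consumes the following token as its argument.
            result.append(("option", f"{token} {tokens[i + 1]}"))
            i += 1
        else:
            result.append(("operand", token))
        i += 1
    return result


labels = classify(
    ["--verbose", "-n", "10", "input.txt"],
    flags={"--verbose"},
    options={"-n"},
)
print(labels)
```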

Coding

typing

We strive to use types and typecheck with pyright (v1.1.232). This not only helps to catch bugs but should also help future developers understand the code more easily.

tests

Use pytest to run tests. It will run all tests found (recursively) in the current directory.
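
For orientation, pytest by default collects functions whose names start with test_ from files named test_*.py or *_test.py. A minimal sketch of such a file follows; the file name and the helper is_long_flag are hypothetical and not part of this repository.

```python
# test_terms_sketch.py (hypothetical file name)


def is_long_flag(token: str) -> bool:
    """Return True for tokens written in the long form, e.g. --verbose."""
    return token.startswith("--")


def test_is_long_flag() -> None:
    # pytest reports a failure if any assert does not hold.
    assert is_long_flag("--verbose")
    assert not is_long_flag("-n")
```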

imports

For clean imports, we add an empty __init__.py module to every non-root directory. As a result, pytest adds the root directory to sys.path, and we can import modules using their path from the root. For instance, to import Parallelizer from Parallelizer.py, we use

from annotation_generation.parallelizers.Parallelizer import Parallelizer
