Bash Tab-completion (data) server - total recall

Project status: working prototype

What's this?

An integration framework to provide contextual Tab-auto-completion
for command line interfaces (CLI) in the Bash shell.

Original use case

Auto-complete based on arbitrary structured data sets (e.g. config or ref data)
directly from the standard shell.[^1]

This requires data indexing for responsive lookup
(the client has to start and find relevant data on each Tab-request).

To meet performance requirements, argrelay takes the straightforward approach:
it runs a standby data server.

For example, with several thousands of service instances,
even if someone manages to generate Bash completion config,
it takes considerable time to load it for every shell instance.

Unlike a static, generated, or offline index, a standby server also naturally supports dynamic data.
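
The indexing idea can be sketched as an inverted index that is built once (as a standby server would do at start-up) and then reused for every lookup. This is only a minimal illustration: the `ValueIndex` class and its methods are hypothetical names, not part of argrelay, and the real server keeps its data in a MongoDB-compatible backend.

```python
from collections import defaultdict


class ValueIndex:
    """Hypothetical inverted index: property value -> ids of objects carrying it."""

    def __init__(self, objects):
        self.objects = list(objects)
        self.ids_by_value = defaultdict(set)
        # Built once at start-up; every later lookup reuses it.
        for object_id, obj in enumerate(self.objects):
            for value in obj.values():
                self.ids_by_value[value].add(object_id)

    def lookup(self, values):
        """Return the objects that carry all given values (set intersection)."""
        id_sets = [self.ids_by_value.get(value, set()) for value in values]
        if not id_sets:
            return list(self.objects)
        matching = set.intersection(*id_sets)
        return [self.objects[object_id] for object_id in sorted(matching)]
```

A process that starts fresh on every Tab-request would have to repeat the constructor work each time, while a standby process pays for it once.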

What's in a name?

Eventually, argrelay will "relay" command line arguments (hence, the name)
with associated data to a user's domain-specific program.

To clarify, let's compare side-by-side
the (independent) argparse library and the argrelay framework:

graph RL;

    %% user --> library
    %% user --> framework

    subgraph `argparse` library

        direction LR

        some.py <--> argparse;

    end

    argrelay_client -. delegates = relays .-> some.py;

    subgraph `argrelay` framework

        direction TB

        subgraph client

            direction LR

            relay2some --> argrelay_client[argrelay client];

        end
        
        subgraph server

            direction TB

            argrelay_server[argrelay server] <--> data[(data)];

        end

    end

| Category | argparse is a library | argrelay is a framework |
|---|---|---|
| Given: | some.py is some script | relay2some is a "wrapper" command configured in Bash to call argrelay |
| In Bash: | type some.py to execute it | type relay2some to let argrelay decide whether to execute some.py |
| Execution: | some.py calls argparse library | some.py is called by the framework when relay2some is invoked |
| Function: | some.py directly does domain-specific task | relay2some directly only "relays" the command line to argrelay |
| CLI source: | some.py defines its CLI itself via argparse | CLI for relay2some is defined by the framework via configs/plugins/data |
| CLI is: | mostly code-driven | mostly data-driven |
| Modify CLI: | modify some.py | keep some.py intact, re-configure argrelay instead |
| Prog lang: | some.py has to be a Python script to use argparse | some.py can be anything somehow executable by argrelay |
| Important: | some.py/argparse have no domain data to query | relay2some may access any domain data from argrelay server |
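
For the argparse side of the comparison, a minimal sketch of what some.py might look like is shown below. The arguments (`action`, `--env`) and their choices are made up for illustration; the point is only that the CLI is defined in code, inside the script itself.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # The CLI is defined in code: changing it means editing this script.
    parser = argparse.ArgumentParser(
        prog="some.py",
        description="Hypothetical domain-specific script",
    )
    parser.add_argument("action", choices=["goto", "list"], help="what to do")
    parser.add_argument("--env", choices=["dev", "prod"], default="dev", help="target environment")
    return parser


def main(argv=None) -> str:
    args = build_parser().parse_args(argv)
    # The script performs its task directly; it has no data server to query.
    message = f"{args.action} in {args.env}"
    print(message)
    return message
```

Calling `main(["goto", "--env", "prod"])` runs the script logic directly. The allowed values are fixed in the source, which is the "code-driven" property the comparison refers to.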

What's missing?

argrelay excludes:

  • Any (real) domain-specific data
  • Any (useful) domain-specific plugins

What's in the package?

argrelay includes:

  • Client to be invoked by Bash hook on every Tab to
    send command line arguments to the server.
  • Server to parse command line and propose values from
    pre-loaded data for the argument under the cursor.
  • Plugins to customize:
    • actions the client can run
    • objects the server can search
    • grammar the command line can have
  • Interfaces to bind these all together.
  • Demo example to start from.
  • Testing support and coverage.
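
To illustrate what a client invoked on every Tab has to deal with, here is a minimal sketch. It assumes the hook is registered via Bash `complete -C`, in which case Bash passes the command line in the `COMP_LINE` and `COMP_POINT` environment variables and reads proposals from stdout, one per line. Whether argrelay registers its hook exactly this way is an assumption, and the local prefix filter below merely stands in for the request to the server; all function names are hypothetical.

```python
import os


def words_before_cursor(comp_line: str, comp_point: int) -> list[str]:
    """Split the part of the command line left of the cursor into words."""
    left = comp_line[:comp_point]
    words = left.split()
    # A trailing space means the cursor sits on a new, still empty word.
    if left.endswith(" ") or not words:
        words.append("")
    return words


def propose(words: list[str], known_values: list[str]) -> list[str]:
    """Return known values that start with the word under the cursor."""
    prefix = words[-1]
    return [value for value in known_values if value.startswith(prefix)]


def main(known_values: list[str]) -> None:
    # Bash sets COMP_LINE and COMP_POINT for commands registered via `complete -C`.
    comp_line = os.environ.get("COMP_LINE", "")
    comp_point = int(os.environ.get("COMP_POINT", len(comp_line)))
    for value in propose(words_before_cursor(comp_line, comp_point), known_values):
        print(value)  # one proposal per line
```

In the real framework the proposals come from the server's pre-loaded data rather than from a list passed to the client.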

CLI-friendly completion: primary focus

GUI-s are secondary for argrelay's niche because
GUI-s do not have the restrictions CLI-s have:

  • Technically, the server can handle requests for any GUI.
  • But API-s are primarily feature-tailored to support CLI.
For example, in GUI-s, typing a query into a search bar may easily be accompanied by
(1) a separate (from the search bar) window area
(2) with individually selectable
(3) full-text-search results
(4) populated **asynchronously** with typing.

In CLI-s, grep does (3) full-text-search, but what about the rest (1), (2), (4)?

To facilitate selection of results,
catalogue-like navigation with auto-completion (rather than full-text-search)
seems to be the answer.

Syntax: origin story

When an interface is limited...

You have probably heard about research where
apes were taught to communicate with humans in sign language
(their vocal apparatus cannot reproduce speech effectively).

Naturally, with limited vocabulary,
they combined known words to describe unnamed things.

For example,
to ask for a watermelon (without knowing the exact sign),
they used a combination of the known signs "drink" + "sweet".

The default argrelay CLI-interpretation plugin (see FuncArgsInterp)
prompts for object properties to disambiguate search results until a single one is found.

Narrow down options

Without any context, just two words "drink" + "sweet" leave
a lot of ambiguity to guess a watermelon (many drinks are sweet).

A more clarified "sentence" could be:

drink striped red sweet fruit

Each word narrows down the matching object set
to more specific candidates (including watermelon).

Avoid strict order

Notice that the word order is not important -
this line provides (almost) equivalent hints for guessing:

striped sweet fruit red drink

It is not valid English grammar, but it somewhat works.
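
The narrowing and the irrelevance of word order can be sketched in a few lines. The objects and their properties below are invented for this illustration, and `narrow` is a hypothetical helper, not the argrelay search implementation.

```python
def narrow(objects: list[dict[str, str]], words: list[str]) -> list[dict[str, str]]:
    """Keep only the objects that carry every given word as a property value."""
    return [obj for obj in objects if set(words) <= set(obj.values())]


# Hypothetical data: each object is described by enum-like properties.
OBJECTS = [
    {"name": "watermelon", "action": "drink", "pattern": "striped",
     "color": "red", "taste": "sweet", "kind": "fruit"},
    {"name": "cola", "action": "drink", "pattern": "plain",
     "color": "brown", "taste": "sweet", "kind": "soda"},
    {"name": "tomato", "action": "eat", "pattern": "plain",
     "color": "red", "taste": "savory", "kind": "fruit"},
]
```

With this data, `narrow(OBJECTS, ["drink", "sweet"])` still leaves two candidates, while both `"drink striped red sweet fruit".split()` and `"striped sweet fruit red drink".split()` single out the watermelon, because the words are treated as a set.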

Use "enum language"

Think of speaking "enum language":

  • Each word is an enum value of some enum type:
    • Color: red, green, ...
    • Taste: sweet, salty, ...
    • Temperature: hot, cold, ...
    • Action: drink, play, ...
  • Word order is irrelevant because enum value spaces do not overlap (almost).
  • To "say" something, one keeps clarifying meaning by more enum values.

Now, imagine the enum types and values are not supposed to be memorized:
they are proposed to select from (based on the current context).

Address any object

Suppose enums are binary = having only two values
(cardinality = 2: black/white, hot/cold, true/false, ...).

For example,
5 words could slice the object space to
single out (identify exactly) up to 2^5 = 32 objects.

To "address" larger object spaces,
larger enum cardinalities or more word places are required.

  • Each enum type ~ a dimension.
  • Each specific enum value ~ a coordinate.
  • Each object fills a slot in such multi-dimensional discrete space.
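
The arithmetic behind this can be written down as a short sketch: the size of the addressable space is the product of the enum cardinalities, and the number of word places needed grows logarithmically with the object count. Both function names are hypothetical.

```python
import math


def address_space_size(cardinalities: list[int]) -> int:
    """Number of distinct objects that one word per enum type can single out."""
    return math.prod(cardinalities)


def words_needed(object_count: int, cardinality: int) -> int:
    """Smallest number of word places so that cardinality ** places >= object_count."""
    if cardinality < 2:
        raise ValueError("an enum needs at least two values to narrow anything down")
    places = 0
    capacity = 1
    while capacity < object_count:
        capacity *= cardinality
        places += 1
    return places
```

For five binary enums, `address_space_size([2] * 5)` gives 32, matching the example above. Addressing 33 objects with binary enums needs a sixth word, whereas enums with 10 values each cover 1000 objects in 3 words.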

Apply to CLI

CLI-s are used to write commands - imperative sentences:
specific actions on specific objects.

The "enum language" above covers searching both
an action and any object it requires.

Suggest contextually

Not every combination of enum values may point to an existing object.

For data with sparse object spaces,
the CLI-suggestion can be shaped by coordinates applicable to
remaining (narrowed down) object sets.
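
One way to picture contextual suggestions: after some properties are fixed, propose only those values of the remaining properties that still occur in the narrowed-down set. The sketch below uses invented host data and a hypothetical `suggest` helper; it does not reproduce the actual server logic.

```python
def suggest(objects, chosen):
    """For each property not yet fixed, list the values still present in the narrowed set."""
    remaining = [
        obj for obj in objects
        if all(obj.get(prop) == value for prop, value in chosen.items())
    ]
    suggestions = {}
    for obj in remaining:
        for prop, value in obj.items():
            if prop not in chosen:
                suggestions.setdefault(prop, set()).add(value)
    return {prop: sorted(values) for prop, values in suggestions.items()}


# Hypothetical sparse data: not every env/region combination has a host.
HOSTS = [
    {"env": "dev", "region": "emea", "host": "host-a"},
    {"env": "dev", "region": "apac", "host": "host-b"},
    {"env": "prod", "region": "amer", "host": "host-c"},
]
```

Here `suggest(HOSTS, {"env": "dev"})` proposes only the regions `apac` and `emea`: `amer` is never offered because no dev host exists there.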

Differentiate on purpose

All of the above may seem an obvious approach,
but it is unusual for the CLI-s of most common commands (due to lack of data):


| Common commands (think ls, git, ssh, ...): | argrelay-wrapped actions: |
|---|---|
| have succinct syntax and prefer single-char switches (defined by code) | prefer explicit "enum language" defined by data |
| rely on humans to memorize syntax (options, ordering, etc.) | assume humans have a loose idea about the syntax |
| auto-complete only for objects known to the OS (hosts, files, etc.) | auto-complete from domain-specific data |

Learn more about how search works.

Quick demo

This is a non-intrusive demo
(without permanent changes to user env, e.g. no ~/.bashrc changes).

Clone this repo somewhere.

If dev-shell.bash is run for the first time,
it will ask for a python-conf.bash file - follow the instructions in the error message.

To start both the server and the client,
two terminal windows are required.

  • Server:

    Start the first sub-shell:

    ./dev-shell.bash
    

    In this sub-shell, start the server:

    # in server `dev-shell.bash`:
    run_argrelay_server
    
  • Client:

    Start the second sub-shell:

    ./dev-shell.bash
    

    While it is running (temporarily),
    this sub-shell is configured for Bash Tab-completion of the relay_demo command.

  • Try to Tab-complete command relay_demo using demo test data:

    # in client `dev-shell.bash`:
    relay_demo goto host            # press Tab one or multiple times
    
    # in client `dev-shell.bash`:
    relay_demo goto host dev        # press Alt+Shift+Q shortcut to describe command line args
    
  • Inspect how auto-completion binds to relay_demo command:

    # in client `dev-shell.bash`:
    complete -p relay_demo
    
  • Inspect client and server config:

    • server config: ~/.argrelay.server.yaml
    • client config: ~/.argrelay.client.json
  • To clean up, exit the sub-shells:

    # in client or server `dev-shell.bash`:
    exit
    

Data backend

There are two options at the moment - both using MongoDB API:

| Category | mongomock (default) | PyMongo |
|---|---|---|
| Data set size: | practical limit ~ 10K | tested at 1M |
| Pro: | nothing else to install | no practical data set size limit found (yet) for argrelay intended use cases |
| Con: | understandably, does not meet non-functional requirements for large data sets | requires some knowledge of MongoDB, additional setup, additional running processes |

PyMongo connects to a running MongoDB instance, which has to be configured in mongo_config,
and mongomock should be disabled in argrelay.server.yaml:

-    use_mongomock_only: True
+    use_mongomock_only: False

What's next?

  • After trying the non-intrusive demo, try the intrusive one for a permanent setup.

  • Modify the ServiceLoader.py plugin to provide data beyond the demo data set.

    The data can simply be hard-coded with a different test_data tag
    (other than the TD_63_37_05_36 demo) and selected in argrelay.server.yaml:

        ServiceLoader:
            plugin_module_name: argrelay.custom_integ.ServiceLoader
            plugin_class_name: ServiceLoader
            plugin_type: LoaderPlugin
            plugin_config:
                test_data_ids_to_load:
                    #-   TD_70_69_38_46  # no data
    -               -   TD_63_37_05_36  # demo
    +               -   TD_NN_NN_NN_NN  # custom data
                    #-   TD_38_03_48_51  # large generated
    

    If hard-coding is boring, soft-code it to load from an external data source.

  • Replace the redirect to the ErrorInvocator.py plugin
    to execute something useful instead when the user hits Enter.

  • ...

  • Many features and docs are actively taking their shape -
    any (minimal, unfiltered, first-thought) feedback is welcome.

    Raise questions or suggestions as issues to influence the dev direction.

[footnotes]

[^1]: Brief History

Tab-completion with custom (domain-specific) arg values is
constantly on a dev wish list for complex backends.
*   DEC 2022: Attempts to find an adequate solution for sizeable data yielded no results.
*   JAN 2023: The [earlier question][earlier_stack_question] received zero activity for a month
    (with a single silent downvote, auto-deleted by a bot).
    Request to restore it was 🎵 Shut Down In Flames.
*   FEB 2023: The [explanation hangs on the appropriate site][later_stack_question] now -
    recommendations are still very welcome there.
    But, with some patience for integration, `argrelay` already became satisfying enough.

Source distribution: argrelay-0.0.0.dev28.tar.gz (113.7 kB)