Bash Tab-completion (data) server - total recall

Project status: working prototype

What's this?

An integration framework to provide contextual Tab-auto-completion
for command line interfaces (CLI) in Bash shell.

Original use case:
Auto-complete based on large structured data sets (e.g. config or ref data).[^1]

This requires data indexing for responsive lookup
(the client has to start and find relevant data on each Tab-request).

The straightforward approach argrelay takes to meet performance requirements is
to run a standby data server.

For example, with several thousands of service instances,
even if someone manages to generate Bash completion config,
it takes considerable time to load it for every shell instance.
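
To illustrate why a pre-built index helps, here is a minimal sketch (not argrelay's actual implementation) of an inverted index that a standby server could build once at start-up and reuse for every Tab-request. The function names `build_index` and `lookup` and the sample service records are hypothetical.

```python
from collections import defaultdict


def build_index(objects):
    """Map each (property, value) pair to the positions of objects having it."""
    index = defaultdict(set)
    for position, obj in enumerate(objects):
        for prop, value in obj.items():
            index[(prop, value)].add(position)
    return index


def lookup(index, objects, **criteria):
    """Return objects matching all property=value criteria via set intersection."""
    matching = set(range(len(objects)))
    for prop, value in criteria.items():
        matching &= index.get((prop, value), set())
    return [objects[position] for position in sorted(matching)]


# Hypothetical sample data (not taken from the argrelay demo data set).
services = [
    {"env": "dev", "host": "host-a", "service": "web"},
    {"env": "dev", "host": "host-b", "service": "db"},
    {"env": "prod", "host": "host-c", "service": "web"},
]

index = build_index(services)  # done once, while the server starts
result = lookup(index, services, env="dev", service="web")  # done per request
print(result)
```

The expensive step (`build_index`) runs once in the long-lived server process, so a short-lived client only pays for the cheap lookup.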

Extended use case:
Catalogues of searchable functions and (live) data
with auto-completion of keywords -
directly from standard shell.

What's in a name?

Eventually, argrelay will "relay" (hence the name) command line arguments to
a user's domain-specific command or procedure.

To clarify,
the argrelay framework can be compared with the (independent)
argparse library:

| Category    | argparse is a library                          | argrelay is a framework                                              |
|-------------|------------------------------------------------|----------------------------------------------------------------------|
| Given:      | A.py is some script                            | A_relay is a "wrapper" command configured in Bash to call argrelay   |
| In Bash:    | type A.py to execute it                        | type A_relay to let argrelay decide whether to execute A.py          |
| Execution:  | A.py calls argparse library                    | A.py is called by the framework when A_relay is invoked              |
| Function:   | A.py directly does some domain-specific task   | A_relay directly only "relays" the command line to argrelay          |
| CLI source: | A.py defines its CLI itself via argparse       | CLI for A_relay is defined by the framework via configs/plugins/data |
| CLI is:     | mostly code-driven                             | mostly data-driven                                                   |
| Modify CLI: | modify A.py                                    | keep A.py intact, re-configure argrelay instead                      |
| Prog lang:  | A.py has to be a Python script to use argparse | A.py can be anything somehow executable by argrelay                  |
| Important:  | A.py/argparse have no domain data to query     | A_relay may access any domain data from argrelay server              |

What's missing?

  • Any (real) domain-specific data
  • Any (useful) domain-specific plugins

What's in the package?

  • Client to be invoked by Bash hook on every Tab to
    send command line arguments to the server.
  • Server to parse command line and propose values from
    pre-loaded data for the argument under the cursor.
  • Plugins to customize:
    • actions the client can run
    • objects the server can search
    • grammar the command line can have
  • Interfaces to bind these all together.
  • Demo example to start from.
  • Testing support and coverage.
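
As a rough illustration of the client's role, the sketch below assumes the Bash hook is registered with `complete -C` (an assumption, check `complete -p relay_demo` in the demo). In that mode Bash passes the command line in the `COMP_LINE` environment variable and the cursor offset in `COMP_POINT`, and reads proposed values from stdout, one per line. The function names and the hard-coded candidate list (standing in for a server response) are hypothetical.

```python
def parse_completion_request(environ):
    """Split the part of the command line left of the cursor into args and a prefix."""
    comp_line = environ.get("COMP_LINE", "")
    comp_point = int(environ.get("COMP_POINT", len(comp_line)))
    left_of_cursor = comp_line[:comp_point]
    tokens = left_of_cursor.split()
    # If the cursor follows whitespace, a new (still empty) argument is being completed.
    if left_of_cursor.endswith(" ") or not tokens:
        prefix = ""
    else:
        prefix = tokens.pop()
    return tokens, prefix


def propose(candidates, prefix):
    """Keep candidates starting with the prefix typed so far."""
    return [candidate for candidate in candidates if candidate.startswith(prefix)]


# In a real hook the mapping would be os.environ; a sample mapping is used here.
sample_env = {"COMP_LINE": "relay_demo goto ho", "COMP_POINT": "18"}
preceding_args, prefix = parse_completion_request(sample_env)

# Hypothetical stand-in for values that a server would return.
proposals = propose(["host", "service"], prefix)
print("\n".join(proposals))
```

A real client would forward `preceding_args` and `prefix` to the server instead of filtering a local list.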

CLI-friendly completion: primary focus

GUI-s are secondary for argrelay's niche because
GUI-s do not have the restrictions CLI-s have:

  • Technically, the server can handle requests for any GUI.
  • But API-s are primarily feature-tailored to support CLI.

For example, in GUI-s, typing a query into a search bar may easily be accompanied by
(1) a separate (from the search bar) area
(2) with individually selectable
(3) full-text-search results
(4) populated in async execution.

In CLI-s, grep does (3) full-text-search, but what about the rest (1), (2), (4)?

To facilitate selection of results via auto-completion,
catalogue-like navigation (rather than full-text-search) seems the answer.

Syntax: origin story

When an interface is limited...

You have probably heard about research where
apes were taught to communicate with humans in sign language
(their vocal apparatus cannot reproduce speech effectively).

Naturally, with limited vocabulary,
they combined known words to describe unnamed things.

For example,
to ask for a watermelon (without knowing the exact sign),
they used a combination of the known "drink" + "sweet".

The default argrelay CLI-interpretation plugin (see FuncArgsInterp)
prompts for object properties to disambiguate search results until a single one is found.

Narrow down options

Without any context, just two words "drink" + "sweet" leave
a lot of ambiguity to guess a watermelon (many drinks are sweet).

A more clarified "sentence" could be:

drink striped red sweet fruit

Each word narrows down the set of matching objects
to more specific candidates (including watermelon).

Avoid strict order

Notice that the word order is not important -
this line provides (almost) equivalent hints for guessing:

striped sweet fruit red drink

It is not valid English grammar, but it somewhat works.
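
The narrowing and the order-independence can be sketched in a few lines. This is only an illustration under assumed data: the `things` list and the `narrow_down` helper are hypothetical and are not part of argrelay.

```python
def narrow_down(objects, words):
    """Keep only objects whose property values include every given word."""
    remaining = objects
    for word in words:
        remaining = [obj for obj in remaining if word in obj.values()]
    return remaining


# Hypothetical objects described by enum-like properties.
things = [
    {"name": "watermelon", "action": "drink", "pattern": "striped",
     "color": "red", "taste": "sweet", "kind": "fruit"},
    {"name": "lemonade", "action": "drink", "pattern": "plain",
     "color": "yellow", "taste": "sweet", "kind": "beverage"},
    {"name": "tomato", "action": "eat", "pattern": "plain",
     "color": "red", "taste": "savory", "kind": "fruit"},
]

ambiguous = narrow_down(things, ["drink", "sweet"])
first_order = narrow_down(things, "drink striped red sweet fruit".split())
second_order = narrow_down(things, "striped sweet fruit red drink".split())

print([thing["name"] for thing in ambiguous])
print([thing["name"] for thing in first_order])
print(first_order == second_order)
```

With this data, two words leave two candidates, while either five-word ordering leaves only the watermelon, because each word is an independent filter.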

Use "enum language"

Think of speaking "enum language":

  • Each word is an enum value of some enum type:
    • Color: red, green, ...
    • Taste: sweet, salty, ...
    • Temperature: hot, cold, ...
    • Action: drink, play, ...
  • Word order is irrelevant because enum value spaces do not overlap (almost).
  • To "say" something, one keeps clarifying meaning by more enum values.

Now, imagine the enum types and values are not supposed to be memorized:
they are proposed for selection (based on the current context).

Address any object

Suppose enums are binary = having only two values
(cardinality = 2: black/white, hot/cold, true/false, ...).

For example,
5 words could slice the object space to
single out (identify exactly) up to 2^5 = 32 objects.

To "address" larger object spaces,
larger enum cardinalities or more word places are required.

  • Each enum type ~ a dimension.
  • Each specific enum value ~ a coordinate.
  • Each object fills a slot in such multi-dimensional discrete space.
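
The arithmetic generalizes beyond binary enums: the number of addressable slots is the product of the enum cardinalities. A minimal sketch (helper names are hypothetical):

```python
import math


def addressable_objects(cardinalities):
    """Number of distinct slots when each word comes from an enum of given cardinality."""
    return math.prod(cardinalities)


def words_needed(object_count, cardinality):
    """Smallest number of words (all of the same cardinality) to single out any object."""
    if cardinality < 2:
        raise ValueError("an enum needs at least two values to distinguish anything")
    words = 0
    capacity = 1
    while capacity < object_count:
        capacity *= cardinality
        words += 1
    return words


print(addressable_objects([2] * 5))   # five binary enums
print(words_needed(32, 2))            # binary words for 32 objects
print(words_needed(10_000, 10))       # ten-valued words for 10K objects
```

Raising the cardinality from 2 to 10 lets four words address as many as 10,000 objects, which is why larger enums or more word places both enlarge the addressable space.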

Apply to CLI

CLI-s are used to write commands - imperative sentences:
specific actions on specific objects.

The "enum language" above covers searching both
an action and any object it requires.

Suggest contextually

Not every combination of enum values may point to an existing object.

For data with sparse object spaces,
the CLI-suggestions should be limited to coordinates applicable to
the remaining (narrowed down) object sets.
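
One way to picture contextual suggestion is to collect, for every property not chosen yet, only the values that still occur among the remaining objects. The sketch below uses assumed data and a hypothetical `suggest` helper. It is not how argrelay's plugins are implemented.

```python
def suggest(objects, chosen):
    """For each property not chosen yet, list values present in the remaining objects."""
    remaining = [
        obj for obj in objects
        if all(obj.get(prop) == value for prop, value in chosen.items())
    ]
    suggestions = {}
    for obj in remaining:
        for prop, value in obj.items():
            if prop not in chosen:
                suggestions.setdefault(prop, set()).add(value)
    return {prop: sorted(values) for prop, values in suggestions.items()}


# Hypothetical sparse object space: not every env/region combination exists.
hosts = [
    {"env": "dev", "region": "emea", "host": "host-a"},
    {"env": "dev", "region": "apac", "host": "host-b"},
    {"env": "prod", "region": "amer", "host": "host-c"},
]

print(suggest(hosts, {}))
print(suggest(hosts, {"env": "dev"}))
```

After `env` is fixed to `dev`, the region `amer` is no longer proposed because no remaining object has it.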

Differentiate on purpose

All of the above may seem an obvious approach,
but it is unusual for the CLI-s of most common commands:

| Common commands (think ls, git, ssh, ...):                             | argrelay-wrapped actions:                        |
|------------------------------------------------------------------------|--------------------------------------------------|
| have succinct syntax and prefer single-char switches (defined by code) | prefer explicit "enum language" defined by data  |
| rely on humans to memorize syntax (options, ordering, etc.)            | assume humans have a loose idea about the syntax |
| auto-complete only for objects known to the OS (hosts, files, etc.)    | auto-complete from a domain-specific index       |

Learn more about how search works.

Quick demo

This is a non-intrusive demo (without permanent changes to user env).

Clone this repo somewhere.

If dev-shell.bash is run for the first time,
it will ask you to provide the python-conf.bash file - follow the instructions in the error message.

To start both the server and the client,
two terminal windows are required.

  • Server:

    Start the first sub-shell:

    ./dev-shell.bash
    

    In this sub-shell, start the server:

    # in server `dev-shell.bash`:
    run_argrelay_server
    
  • Client:

    Start the second sub-shell:

    ./dev-shell.bash
    

    While it is running (temporarily),
    this sub-shell is configured for Bash Tab-completion for relay_demo command.

  • Try to Tab-complete command relay_demo using demo test data:

    # in client `dev-shell.bash`:
    relay_demo goto host            # press Tab one or multiple times
    
    # in client `dev-shell.bash`:
    relay_demo goto host dev        # press Alt+Shift+Q shortcut to describe command line args
    
  • Inspect how auto-completion binds to relay_demo command:

    # in client `dev-shell.bash`:
    complete -p relay_demo
    
  • Inspect client and server config:

    • server config: ~/.argrelay.server.yaml
    • client config: ~/.argrelay.client.json
  • To clean up, exit the sub-shells:

    # in client or server `dev-shell.bash`:
    exit
    

Data backend

There are two options at the moment - both using MongoDB API:

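| Category       | mongomock (default)                                                           | PyMongo                                                                          |
|----------------|-------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| Data set size: | practical limit ~ 10K                                                         | tested at 1M                                                                     |
| Pro:           | nothing else to install                                                       | no practical data set size limit found (yet) for argrelay intended use cases     |
| Con:           | understandably, does not meet non-functional requirements for large data sets | require some knowledge of MongoDB, additional setup, additional running processes |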

PyMongo connects to a running MongoDB instance, which has to be configured in mongo_config,
and mongomock should be disabled in argrelay.server.yaml:

-    use_mongomock_only: True
+    use_mongomock_only: False
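
Because both backends expose the MongoDB API, switching between them can come down to which `MongoClient` class gets instantiated. The sketch below is an assumption-laden illustration, not argrelay's code: it assumes `use_mongomock_only` sits under `mongo_config`, and the `connection_string` key and both function names are hypothetical. `mongomock.MongoClient` and `pymongo.MongoClient` are real third-party classes and are imported lazily, so the selection logic runs without either package installed.

```python
def select_backend(server_config):
    """Return "mongomock" or "pymongo" based on the (assumed) config layout."""
    mongo_config = server_config.get("mongo_config", {})
    # mongomock is described as the default backend, hence the True fallback.
    return "mongomock" if mongo_config.get("use_mongomock_only", True) else "pymongo"


def make_client(server_config):
    """Create a client object; both libraries provide a `MongoClient` class."""
    if select_backend(server_config) == "mongomock":
        import mongomock  # third-party: in-memory stand-in for MongoDB
        return mongomock.MongoClient()
    import pymongo  # third-party: connects to a running MongoDB instance
    # "connection_string" is a hypothetical key name used for illustration only.
    return pymongo.MongoClient(server_config["mongo_config"]["connection_string"])


demo_config = {
    "mongo_config": {
        "use_mongomock_only": False,
        "connection_string": "mongodb://localhost:27017",
    },
}
print(select_backend(demo_config))
```

Code written against the client returned by `make_client` can stay the same for both backends, which is what keeps the switch a one-line config change.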

What's next?

  • After trying the non-intrusive demo, try the intrusive one for permanent setup.

  • Modify ServiceLoader.py plugin to provide data beyond demo data set.

    The data can be simply hard-coded with a different test_data tag
    (other than the TD_63_37_05_36 demo) and selected in argrelay.server.yaml:

        ServiceLoader:
            plugin_module_name: argrelay.custom_integ.ServiceLoader
            plugin_class_name: ServiceLoader
            plugin_type: LoaderPlugin
            plugin_config:
                test_data_ids_to_load:
                    #-   TD_70_69_38_46  # no data
    -               -   TD_63_37_05_36  # demo
    +               -   TD_NN_NN_NN_NN  # custom data
                    #-   TD_38_03_48_51  # large generated
    

    If hard-coding is boring, soft-code to load it from an external data source.

  • Replace the redirect to the ErrorInvocator.py plugin
    to execute something useful instead when the user hits Enter.

  • ...

  • Many features and docs are actively taking their shape -
    any (minimal, unfiltered, first-thought) feedback is welcome.

    Raise questions or suggestions as issues to influence the dev direction.

[footnotes]

[^1]: Brief History

Tab-completion with custom (domain-specific) arg values is
constantly on a dev wish list for complex backend.
*   DEC 2022: Attempts to find an adequate solution for sizeable data yielded no results.
*   JAN 2023: The [earlier question][earlier_stack_question] received zero activity for a month
    (with a single silent downvote, auto-deleted by a bot).
    Request to restore it was 🎵 Shut Down In Flames.
*   FEB 2023: The [explanation hangs on the appropriate site][later_stack_question] now -
    recommendations are still very welcome there.
    But, with some patience for integration, `argrelay` already became satisfying enough.
