
Run a trained AllenNLP model and serve it with a web API.

Project description




Run the program

Execute the program in a terminal; the --help option shows the help message:

$ allennlp-runmodel --help
Usage: allennlp-runmodel [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Start a webservice for running AllenNLP models.

  -V, --version
  -h, --host TEXT                 TCP/IP host for HTTP server.  [default:
  -p, --port INTEGER              TCP/IP port for HTTP server.  [default:
  -a, --path TEXT                 File system path for HTTP server Unix domain
                                  socket. Listening on Unix domain sockets is
                                  not supported by all operating systems.
  -l, --logging-config FILE       Path to logging configuration file (JSON,
                                  YAML or INI) (ref:
  -v, --logging-level [critical|fatal|error|warn|warning|info|debug|notset]
                                  Sets the logging level; only effective when
                                  `--logging-config` is not specified.  [default:
  --help                          Show this message and exit.

  load  Load a pre-trained AllenNLP model from its archive file, and put

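The `-l/--logging-config` option accepts a logging configuration file in JSON, YAML, or INI form. The exact schema is not shown above (the reference link is truncated), but a reasonable assumption is that JSON files follow Python's standard `logging.config.dictConfig` schema. A minimal sketch under that assumption:

```python
# Sketch of a logging config file that -l/--logging-config might accept.
# ASSUMPTION: JSON configs follow Python's logging.config.dictConfig schema;
# verify against the reference linked in the actual help output.
import json
import logging.config

config = {
    "version": 1,
    "formatters": {
        "simple": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "simple",
            "level": "INFO",
        }
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

# Write the config to a file that could then be passed as `-l logging.json`.
with open("logging.json", "w") as f:
    json.dump(config, f, indent=2)

# Sanity-check that the dictionary is valid under the dictConfig schema.
logging.config.dictConfig(config)
logging.getLogger("demo").info("logging configured")
```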

$ allennlp-runmodel load --help
Usage: allennlp-runmodel load [OPTIONS] ARCHIVE

  Load a pre-trained AllenNLP model from its archive file, and put it into
  the webservice container.

  -m, --model-name TEXT           Model name used in URL. eg:
  -t, --num-threads INTEGER       Sets the number of OpenMP threads used for
                                  parallelizing CPU operations. [default: 4
                                  (on this machine)]
  -w, --max-workers INTEGER       Uses a pool of at most max_workers threads
                                  to execute calls asynchronously. [default:
                                  num_threads/cpu_count (1 on this machine)]
  -w, --worker-type [process|thread]
                                  Sets whether workers execute in threads or
                                  processes.  [default: process]
  -d, --cuda-device INTEGER       If CUDA_DEVICE is >= 0, the model will be
                                  loaded onto the corresponding GPU. Otherwise
                                  it will be loaded onto the CPU.  [default:
  -e, --predictor-name TEXT       Optionally specify which `Predictor`
                                  subclass; otherwise, the default one for the
                                  model will be used.
  --help                          Show this message and exit.
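
The `--worker-type` and `--max-workers` options choose between thread and process pools for running predictions. The trade-off can be illustrated with Python's `concurrent.futures` (this mirrors the concept only; it is not allennlp-runmodel's actual implementation):

```python
# Illustration of the thread-vs-process distinction behind --worker-type
# and --max-workers. NOT the package's internal code; a conceptual sketch.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def make_executor(worker_type: str, max_workers: int):
    """Return a pool analogous to the --worker-type/--max-workers options."""
    if worker_type == "process":
        # Separate processes: no GIL contention, but the loaded model's
        # memory is duplicated in each worker.
        return ProcessPoolExecutor(max_workers=max_workers)
    # Threads: all workers share one loaded model, suitable when the heavy
    # work (e.g. native-code inference) releases the GIL.
    return ThreadPoolExecutor(max_workers=max_workers)


def predict(x: int) -> int:
    return x * x  # stand-in for a model prediction


if __name__ == "__main__":
    with make_executor("thread", max_workers=2) as pool:
        results = list(pool.map(predict, [1, 2, 3]))
    print(results)  # [1, 4, 9]
```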

The load sub-command can be invoked multiple times to load multiple models:


allennlp-runmodel --port 8080 load --model-name model1 /path/of/model1.tar.gz load --model-name model2 /path/of/model2.tar.gz

Make a prediction from an HTTP client:

curl \
  --header "Content-Type: application/json" \
  --request POST \
  --data '{"premise":"Two women are embracing while holding to go packages.","hypothesis":"The sisters are hugging goodbye while holding to go packages after just eating lunch."}' \
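
The curl command above is truncated before the target URL. Assuming the server started in the earlier example (port 8080) and that `--model-name` determines the URL path (per the `-m` help text, "Model name used in URL"), an equivalent request could be sketched in Python. The endpoint path here is an assumption; verify it against your running server:

```python
# Hypothetical client for the service started above. The endpoint URL is an
# ASSUMPTION based on "-m/--model-name ... Model name used in URL": the model
# is presumed reachable at http://localhost:8080/model1. Adjust as needed.
import json
from urllib import request

payload = {
    "premise": "Two women are embracing while holding to go packages.",
    "hypothesis": "The sisters are hugging goodbye while holding "
                  "to go packages after just eating lunch.",
}
body = json.dumps(payload).encode("utf-8")

req = request.Request(
    "http://localhost:8080/model1",  # assumed URL scheme
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending the request requires a running server; uncomment to execute:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
print(req.get_method(), req.full_url, len(body), "bytes")
```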

Download files

Source distribution: allennlp-runmodel-0.2.1.tar.gz (18.4 kB)

Built distribution: allennlp_runmodel-0.2.1-py3-none-any.whl (9.1 kB)
