Skip to main content

An open source library for building end-to-end dialog systems and training chatbots.

Project description

License Apache 2.0 Python 3.6, 3.7 Downloads

DeepPavlov is an open-source conversational AI library built on TensorFlow and Keras.

DeepPavlov is designed for

  • development of production ready chat-bots and complex conversational systems,
  • research in the area of NLP and, particularly, of dialog systems.

Quick Links

Models

Named Entity Recognition | Slot filling

Intent/Sentence Classification | Question Answering over Text (SQuAD)

Sentence Similarity/Ranking | TF-IDF Ranking

Morphological tagging | Automatic Spelling Correction

ELMo training and fine-tuning

Skills

Goal(Task)-oriented Bot | Seq2seq Goal-Oriented bot

Open Domain Questions Answering | eCommerce Bot

Frequently Asked Questions Answering | Pattern Matching

Embeddings

BERT embeddings for the Russian, Polish, Bulgarian, Czech, and informal English

ELMo embeddings for the Russian language

FastText embeddings for the Russian language

Auto ML

Tuning Models with Evolutionary Algorithm

Installation

  1. We support Linux and Windows platforms, Python 3.6 and Python 3.7

    • Python 3.5 is not supported!
    • installation for Windows requires Git(for example, git) and Visual Studio 2015/2017 with C++ build tools installed!
  2. Create and activate a virtual environment:

    • Linux
    python -m venv env
    source ./env/bin/activate
    
    • Windows
    python -m venv env
    .\env\Scripts\activate.bat
    
  3. Install the package inside the environment:

    pip install deeppavlov
    

QuickStart

There is a bunch of great pre-trained NLP models in DeepPavlov. Each model is determined by its config file.

List of models is available on the doc page in the deeppavlov.configs (Python):

    from deeppavlov import configs

When you're decided on the model (+ config file), there are two ways to train, evaluate and infer it:

Before making choice of an interface, install model's package requirements (CLI):

    python -m deeppavlov install <config_path>
  • where <config_path> is path to the chosen model's config file (e.g. deeppavlov/configs/ner/slotfill_dstc2.json) or just name without .json extension (e.g. slotfill_dstc2)

Command line interface (CLI)

To get predictions from a model interactively through CLI, run

    python -m deeppavlov interact <config_path> [-d]
  • -d downloads required data -- pretrained model files and embeddings (optional).

You can train it in the same simple way:

    python -m deeppavlov train <config_path> [-d]

Dataset will be downloaded regardless of whether there was -d flag or not.

To train on your own data you need to modify dataset reader path in the train config doc. The data format is specified in the corresponding model doc page.

There are even more actions you can perform with configs:

    python -m deeppavlov <action> <config_path> [-d]
  • <action> can be
    • download to download model's data (same as -d),
    • train to train the model on the data specified in the config file,
    • evaluate to calculate metrics on the same dataset,
    • interact to interact via CLI,
    • riseapi to run a REST API server (see doc),
    • interactbot to run as a Telegram bot (see doc),
    • interactmsbot to run a Miscrosoft Bot Framework server (see doc),
    • predict to get prediction for samples from stdin or from <file_path> if -f <file_path> is specified.
  • <config_path> specifies path (or name) of model's config file
  • -d downloads required data

Python

To get predictions from a model interactively through Python, run

    from deeppavlov import build_model

    model = build_model(<config_path>, download=True)

    # get predictions for 'input_text1', 'input_text2'
    model(['input_text1', 'input_text2'])
  • where download=True downloads required data from web -- pretrained model files and embeddings (optional),
  • <config_path> is path to the chosen model's config file (e.g. "deeppavlov/configs/ner/ner_ontonotes_bert_mult.json") or deeppavlov.configs attribute (e.g. deeppavlov.configs.ner.ner_ontonotes_bert_mult without quotation marks).

You can train it in the same simple way:

    from deeppavlov import train_model 

    model = train_model(<config_path>, download=True)
  • download=True downloads pretrained model, therefore the pretrained model will be, first, loaded and then train (optional).

Dataset will be downloaded regardless of whether there was -d flag or not.

To train on your own data you need to modify dataset reader path in the train config doc. The data format is specified in the corresponding model doc page.

You can also calculate metrics on the dataset specified in your config file:

    from deeppavlov import evaluate_model 

    model = evaluate_model(<config_path>, download=True)

There are also available integrations with various messengers, see Telegram Bot doc page and others in the Integrations section for more info.

Breaking Changes

Breaking changes in version 0.5.0

  • dependencies have to be reinstalled for most pipeline configurations
  • models depending on tensorflow require CUDA 10.0 to run on GPU instead of CUDA 9.0
  • scikit-learn models have to be redownloaded or retrained

Breaking changes in version 0.5.0

  • dependencies have to be reinstalled for most pipeline configurations
  • models depending on tensorflow require CUDA 10.0 to run on GPU instead of CUDA 9.0
  • scikit-learn models have to be redownloaded or retrained

Breaking changes in version 0.4.0!

  • default target variable name for neural evolution was changed from MODELS_PATH to MODEL_PATH.

Breaking changes in version 0.3.0!

  • component option fit_on_batch in configuration files was removed and replaced with adaptive usage of the fit_on parameter.

Breaking changes in version 0.2.0!

  • utils module was moved from repository root in to deeppavlov module
  • ms_bot_framework_utils,server_utils, telegram utils modules was renamed to ms_bot_framework, server and telegram correspondingly
  • rename metric functions exact_match to squad_v2_em and squad_f1 to squad_v2_f1
  • replace dashes in configs name with underscores

Breaking changes in version 0.1.0!

  • As of version 0.1.0 all models, embeddings and other downloaded data for provided configurations are by default downloaded to the .deeppavlov directory in current user's home directory. This can be changed on per-model basis by modifying a ROOT_PATH variable or related fields one by one in model's configuration file.

  • In configuration files, for all features/models, dataset readers and iterators "name" and "class" fields are combined into the "class_name" field.

  • deeppavlov.core.commands.infer.build_model_from_config() was renamed to build_model and can be imported from the deeppavlov module directly.

  • The way arguments are passed to metrics functions during training and evaluation was changed and documented.

License

DeepPavlov is Apache 2.0 - licensed.

The Team

DeepPavlov is built and maintained by Neural Networks and Deep Learning Lab at MIPT within iPavlov project (part of National Technology Initiative) and in partnership with Sberbank.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

deeppavlov-0.5.0.tar.gz (362.9 kB view details)

Uploaded Source

Built Distribution

deeppavlov-0.5.0-py3-none-any.whl (692.8 kB view details)

Uploaded Python 3

File details

Details for the file deeppavlov-0.5.0.tar.gz.

File metadata

  • Download URL: deeppavlov-0.5.0.tar.gz
  • Upload date:
  • Size: 362.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.1

File hashes

Hashes for deeppavlov-0.5.0.tar.gz
Algorithm Hash digest
SHA256 450a1e81c71edad1a2db3793bc0de258b96559da0df0e3d72aaa70ac9f075244
MD5 ef59bb2695d0994a7191709ee231770d
BLAKE2b-256 cd68b3654b8d2da2161b7398115d8895792fa89656433207c2e4dab5381b2b5d

See more details on using hashes here.

File details

Details for the file deeppavlov-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: deeppavlov-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 692.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.1

File hashes

Hashes for deeppavlov-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e8998a3842960832b43b0f008c3b713d3828e750a95df5e8a216d2e83552b8e3
MD5 212cf6c4c43bc9f8cc22d7d7c433fc6a
BLAKE2b-256 702925902eca715082d4fe91d780974ae2e398d8b9efad8d44ed31a3032be049

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page