Reusable Intent and Slot-filling tool
Intent and slot-filling on the ATIS dataset.
For this task, a model was trained to jointly predict a sentence's intent and slots (entities). Each word is embedded (using pre-defined word vectors) to capture the word's meaning, while a character-level bidirectional Long Short-Term Memory (LSTM) network encodes the word's letters to capture its lexical structure.
The word vector and the outputs of the character-level bi-LSTM are then fed into a word-level bi-LSTM, which predicts the intent. This second layer also feeds into a Conditional Random Field (CRF) layer to predict the individual slots (entities).
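The architecture above can be sketched with the Keras functional API. This is a minimal illustration, not the tool's actual code: all sizes (sentence/word lengths, vocabulary and label counts, LSTM units) are assumed, and the per-token softmax stands in for the CRF output layer used by the real model (a CRF layer, e.g. from keras-contrib, would replace it).

```python
# Hedged sketch of the joint intent + slot model; sizes are illustrative
# and the CRF slot layer is approximated by a per-token softmax.
from tensorflow.keras.layers import (Input, Embedding, LSTM, Bidirectional,
                                     Dense, TimeDistributed, Concatenate)
from tensorflow.keras.models import Model

MAX_WORDS, MAX_CHARS = 20, 12   # padded sentence / word lengths (assumed)
N_WORDS, N_CHARS = 800, 40      # vocabulary sizes (assumed)
N_INTENTS, N_SLOTS = 22, 120    # label set sizes (assumed)

word_in = Input(shape=(MAX_WORDS,), name="word_ids")
char_in = Input(shape=(MAX_WORDS, MAX_CHARS), name="char_ids")

# pre-trained 100-d word vectors would be supplied via the weights argument
word_emb = Embedding(N_WORDS, 100)(word_in)

# character-level bi-LSTM encodes each word's spelling
char_emb = TimeDistributed(Embedding(N_CHARS, 30))(char_in)
char_enc = TimeDistributed(Bidirectional(LSTM(25)))(char_emb)

# word vector + character encoding feed the word-level bi-LSTM
merged = Concatenate()([word_emb, char_enc])
seq = Bidirectional(LSTM(64, return_sequences=True))(merged)

# sentence-level intent from the final word-level bi-LSTM states
intent_out = Dense(N_INTENTS, activation="softmax", name="intent")(
    Bidirectional(LSTM(64))(merged))
# per-token slot tags; a CRF layer replaces this softmax in the actual model
slot_out = TimeDistributed(Dense(N_SLOTS, activation="softmax"),
                           name="slots")(seq)

model = Model(inputs=[word_in, char_in], outputs=[intent_out, slot_out])
```

The two outputs share the lower layers, so the intent and slot losses are trained jointly.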
The model is stored in the pretrained_models directory.
The newly released Poincaré word embeddings (100-dimensional) were used, as they have been reported to better encode the hierarchical relationships between words.
You can find the word vectors used in
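Loading such pre-trained vectors into a lookup table might look like the sketch below. The file format is assumed to be plain text, one word per line followed by its 100 floats; `load_vectors` is an illustrative helper, not part of this tool.

```python
# Sketch: parse "word f1 f2 ... fN" lines into a {word: vector} dict.
import io
import numpy as np

def load_vectors(handle):
    """Read whitespace-separated embedding lines into a dict of arrays."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

# toy 3-d example standing in for the real 100-d embedding file
sample = io.StringIO("flight 0.1 0.2 0.3\nboston -0.4 0.0 0.9\n")
vecs = load_vectors(sample)
```

The resulting dict can be used to initialise the embedding layer's weight matrix, row by row.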
You can run a demo of the pre-trained model by running
The model was trained for 50 epochs and stored in the pretrained_models directory:
- pretrained_models/dataset_info contains all the vocabularies used by the model (character, word, intent, entity) and their mappings to numbers for encoding/decoding
- pretrained_models/model.h5 contains the model's weights
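The kind of vocabulary mapping stored in dataset_info can be illustrated as follows. The helper names (`build_vocab`, `encode`, `decode`) and the reserved ids for padding and unknown tokens are hypothetical, not the tool's actual API:

```python
# Sketch: token <-> integer id mappings, with reserved <PAD>/<UNK> ids.
def build_vocab(tokens):
    """Assign each distinct token an id; 0 is <PAD>, 1 is <UNK> (assumed)."""
    vocab = {"<PAD>": 0, "<UNK>": 1}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab

def encode(words, vocab):
    """Map words to ids, falling back to <UNK> for unseen words."""
    return [vocab.get(w, vocab["<UNK>"]) for w in words]

def decode(ids, vocab):
    """Map ids back to tokens."""
    inverse = {i: w for w, i in vocab.items()}
    return [inverse[i] for i in ids]

word_vocab = build_vocab("show me flights from boston to denver".split())
ids = decoded = None
ids = encode("flights from chicago".split(), word_vocab)
decoded = decode(ids, word_vocab)
```

The same scheme applies to the character, intent and entity vocabularies.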
The model's loss during training over the epochs is shown below:
The model's accuracy at predicting intents and entities (slots) over time is shown below:
You can retrain the model by running
The above results can be obtained by running
To improve the model's robustness to out-of-vocabulary words, the training data was lemmatised prior to training and the model was retrained. Numbers were also masked with a placeholder (e.g. *) to avoid out-of-vocabulary tokens at test time (e.g. 9:30 may appear in training but 9:29 may not).
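The number-masking step could be implemented with a regular expression along these lines; lemmatisation (e.g. via NLTK or spaCy) is omitted here, and `mask_numbers` is an illustrative helper rather than the tool's actual function:

```python
# Sketch: replace digit sequences and clock times with a '*' placeholder,
# so unseen values like 9:29 map to the same token as 9:30.
import re

TIME_OR_NUMBER = re.compile(r"\b\d+(?::\d+)?\b")

def mask_numbers(sentence):
    """Replace standalone numbers and H:MM times with '*'."""
    return TIME_OR_NUMBER.sub("*", sentence)

masked = mask_numbers("show flights after 9:30 on the 21st")
# note: ordinals like "21st" are not matched by this simple pattern
```

Applying the same masking at training and inference time keeps the two vocabularies consistent.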
These tweaks slightly improved the results: precision, recall and F1 scores improved across the board, for both intents and entities. Surprisingly, perfect scores were achieved on the validation set.
Future Improvements (TO DO):
- Balance out the training data: the intent ATIS_FLIGHT clearly dominates the training set, and the 'O' tag dominates the entity tags (or, discounting 'O' as an entity tag, the toloc.city_name/fromloc.city_name tags). This could be achieved by subsampling, or by artificially perturbing the data to generate more samples (e.g. increasing the number of training instances by sliding each sentence along by one, two, three, etc. places).
- Investigate the intents and entities with relatively low F1 scores, e.g. intents such as ATIS_DAY_NAME, ATIS_MEAL and ATIS_FLIGHT_TIME, and entities such as compartment, booking_class and meal_code
- Preprocess intent labels joined with # (combined intents)?
- Embed unknown words too (if possible), rather than mapping them all to <UNK> (id 1)
- Convert word numbers (e.g. "one") into digits
- Improve slot extraction using additional pre-trained Named Entity Recognition (NER) models from various libraries
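The subsampling idea from the first TODO item could be sketched as follows. This is one possible strategy, capping the dominant intent class at the size of the next-largest class; the helper name `subsample_majority` is illustrative, not part of the tool:

```python
# Sketch: downsample the dominant intent class to the second-largest class size.
import random
from collections import defaultdict

def subsample_majority(samples, seed=0):
    """samples: list of (sentence, intent) pairs. Cap oversized classes."""
    by_intent = defaultdict(list)
    for sent, intent in samples:
        by_intent[intent].append((sent, intent))
    sizes = sorted(len(group) for group in by_intent.values())
    cap = sizes[-2] if len(sizes) > 1 else sizes[-1]  # second-largest size
    rng = random.Random(seed)
    balanced = []
    for group in by_intent.values():
        if len(group) > cap:
            group = rng.sample(group, cap)  # subsample the dominant class
        balanced.extend(group)
    return balanced

data = [("s%d" % i, "ATIS_FLIGHT") for i in range(50)] + \
       [("t%d" % i, "ATIS_AIRFARE") for i in range(5)]
balanced = subsample_majority(data)
```

Perturbation-based augmentation (e.g. the sentence-sliding idea above) would instead grow the minority classes rather than shrink the majority one.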
| Filename | Size | File type | Python version |
| --- | --- | --- | --- |
| slize-0.0.21-py3-none-any.whl | 10.6 kB | Wheel | py3 |
| slize-0.0.21.tar.gz | 7.1 kB | Source | None |