deep-keyphrase

Project description

Implementations of keyphrase generation algorithms.

Description

Implemented Paper

CopyRNN

Deep Keyphrase Generation (Meng et al., 2017)

ToDo List

CopyCNN

CopyTransformer

Usage

Required files (4 files in total):

vocab_file: one word per line (do not include an index)
```
this
paper
proposes
```
training, validation and test files

Data format for the training, validation and test files

JSON Lines format; every line is a dict:
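
The vocab_file layout described above can be produced with a few lines of Python. This is a minimal sketch, assuming the vocabulary is simply the most frequent tokens in the tokenized documents. The helper names `build_vocab` and `format_vocab` are hypothetical and not part of deep-keyphrase, and since the text above does not say whether special tokens must be listed, none are added here.

```python
from collections import Counter


def build_vocab(token_lists, max_size=None):
    """Return words ordered from most to least frequent (hypothetical helper)."""
    counter = Counter(tok for tokens in token_lists for tok in tokens)
    # most_common(None) returns every word; ties keep first-seen order
    return [word for word, _ in counter.most_common(max_size)]


def format_vocab(words):
    """Render one word per line, without any index column."""
    return "\n".join(words) + "\n"


# Toy tokenized documents, used only for illustration
docs = [["this", "paper", "proposes"], ["this", "paper"], ["this"]]
vocab_words = build_vocab(docs)
vocab_text = format_vocab(vocab_words)
print(vocab_text, end="")
```

Writing `vocab_text` to a file gives the one-word-per-line layout shown in the example above.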

{'tokens': ['this', 'paper', 'proposes', 'using', 'virtual', 'reality', 'to', 'enhance', 'the', 'perception', 'of', 'actions', 'by', 'distant', 'users', 'on', 'a', 'shared', 'application', '.', 'here', ',', 'distance', 'may', 'refer', 'either', 'to', 'space', '(', 'e.g.', 'in', 'a', 'remote', 'synchronous', 'collaboration', ')', 'or', 'time', '(', 'e.g.', 'during', 'playback', 'of', 'recorded', 'actions', ')', '.', 'our', 'approach', 'consists', 'in', 'immersing', 'the', 'application', 'in', 'a', 'virtual', 'inhabited', '3d', 'space', 'and', 'mimicking', 'user', 'actions', 'by', 'animating', 'avatars', '.', 'we', 'illustrate', 'this', 'approach', 'with', 'two', 'applications', ',', 'the', 'one', 'for', 'remote', 'collaboration', 'on', 'a', 'shared', 'application', 'and', 'the', 'other', 'to', 'playback', 'recorded', 'sequences', 'of', 'user', 'actions', '.', 'we', 'suggest', 'this', 'could', 'be', 'a', 'low', 'cost', 'enhancement', 'for', 'telepresence', '.'] ,
'keyphrases': [['telepresence'], ['animation'], ['avatars'], ['application', 'sharing'], ['collaborative', 'virtual', 'environments']]}
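
The record above is shown in Python dict notation. A minimal sketch of writing and reading such records as JSON Lines with the standard `json` module follows, assuming the data loader expects one standard JSON object per line. The helper names `write_jsonl` and `read_jsonl` are hypothetical and not part of deep-keyphrase.

```python
import io
import json


def write_jsonl(records, fp):
    """Write each record as one JSON object on its own line."""
    for record in records:
        fp.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(fp):
    """Yield one dict per non-empty line."""
    for line in fp:
        line = line.strip()
        if line:
            yield json.loads(line)


# Shortened record with the same two keys as the example above
record = {
    "tokens": ["this", "paper", "proposes", "using", "virtual", "reality"],
    "keyphrases": [["telepresence"], ["application", "sharing"]],
}

buffer = io.StringIO()  # stands in for a real file opened in text mode
write_jsonl([record], buffer)
buffer.seek(0)
loaded = list(read_jsonl(buffer))
```

Using `json.dumps` keeps each record on a single line and emits double-quoted strings, which is what a standard JSON parser requires.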

Training

Download the kp20k dataset

mkdir data
mkdir data/raw
mkdir data/raw/kp20k_new
# !! Please unzip the kp20k data and put the files into the folder above manually
python -m nltk.downloader punkt
bash scripts/prepare_kp20k.sh
bash scripts/train_copyrnn_kp20k.sh

# start TensorBoard
# enter the experiment result dir; the suffix is the time the experiment started
cd data/kp20k/copyrnn_kp20k_basic-20191212-080000
# start the TensorBoard service
tensorboard --bind_all --logdir logs --port 6006

Notes

Compared with the original seq2seq-keyphrase-pytorch, this project offers:
1. fixes for implementation errors:
  
  the copy mechanism
  
  a mismatch between training and inference (training did not use input feeding, while inference did)
2. easier data preparation
3. TensorBoard support
4. faster beam search (6x faster on CPU and more than 10x faster on GPU)
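
For readers unfamiliar with the copy mechanism mentioned in the notes above, here is a minimal sketch of one common formulation: generation scores over the vocabulary and copy scores over source positions share a single softmax, and the probability of a word is the sum of its generation probability and the copy probabilities of every source position holding that word. This is an illustration under that assumption, not the code used in deep-keyphrase; the function name `copy_distribution` and the toy inputs are hypothetical.

```python
import math
from collections import defaultdict


def copy_distribution(gen_scores, copy_scores, vocab, source_tokens):
    """Merge generation and copy scores into one distribution over words."""
    # Joint softmax over vocabulary entries followed by source positions
    all_scores = list(gen_scores) + list(copy_scores)
    max_score = max(all_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in all_scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    word_probs = defaultdict(float)
    # Generation part: one probability per vocabulary word
    for word, p in zip(vocab, probs[:len(vocab)]):
        word_probs[word] += p
    # Copy part: add each source position's probability to its token,
    # which lets out-of-vocabulary source words receive probability mass
    for token, p in zip(source_tokens, probs[len(vocab):]):
        word_probs[token] += p
    return dict(word_probs)


# Toy example: "telepresence" is not in the vocabulary but appears in the source
vocab = ["this", "paper", "<unk>"]
source_tokens = ["this", "telepresence", "paper"]
dist = copy_distribution([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], vocab, source_tokens)
```

With uniform scores, a word that is both in the vocabulary and in the source receives twice the mass of a word that appears in only one of them, and the out-of-vocabulary source word still gets a nonzero probability.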

Project details

Release history

0.0.9: Jul 15, 2020
0.0.8: Jul 15, 2020
0.0.7: May 12, 2020
0.0.6: Jan 12, 2020 (this version)
0.0.5: Dec 25, 2019
0.0.4: Dec 25, 2019
0.0.3: Dec 6, 2019
