Make Word2Vec from aozorabunko/aozorabunko
aovec
- Build a Word2Vec model from the aozorabunko/aozorabunko repository
- This code is inspired by sheepover96/aozora_analyzer
- Pre-built models are available from Releases
Requirements
- Git
- MeCab
- MeCab Checker: src/check_mecab.py
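Before running aovec, it can help to confirm that both requirements are reachable from the shell. The following is a minimal sketch and is not the project's src/check_mecab.py; the function name find_requirements is hypothetical, and it assumes MeCab is installed as a `mecab` executable on PATH.

```python
import shutil


def find_requirements(commands=("git", "mecab")):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    # Print one line per requirement so a missing tool is easy to spot.
    for name, path in find_requirements().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

A `None` entry means the tool needs to be installed, or added to PATH, before `aovec clone` or `aovec parse` is likely to work.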
How to use
# Install from pypi
$ pip install aovec
# Clone aozorabunko/aozorabunko (>20GB)
$ aovec clone
# Parse HTML files and write the results to novels/
$ aovec parse
# Make word2vec and write to aozora_model.model
$ aovec mkvec
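Once aozora_model.model exists, it can be queried from Python. The sketch below assumes the file is in gensim's native format, i.e. that it can be read with `Word2Vec.load`; the source does not state this, so treat it as an assumption. The helper names load_model and similar_words are hypothetical.

```python
def load_model(path="aozora_model.model"):
    """Load a trained model from disk."""
    # Assumption: the file was written with gensim's Word2Vec.save().
    # Imported lazily so the helper below works without gensim installed.
    from gensim.models import Word2Vec

    return Word2Vec.load(path)


def similar_words(model, word, topn=5):
    """Return up to `topn` (word, similarity) pairs, or [] if `word` is out of vocabulary."""
    if word not in model.wv:
        return []
    return model.wv.most_similar(word, topn=topn)


# Example usage (needs gensim and a trained model file):
#   model = load_model()
#   for neighbor, score in similar_words(model, "猫"):
#       print(neighbor, score)
```

Checking vocabulary membership first avoids the KeyError that `most_similar` raises for unknown words, which is common with rare or unusually segmented Japanese tokens.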
Help
$ aovec -h
usage: aovec [-h] [-V] {clone,c,parse,p,mkvec,m} ...

Make Word2Vec from aozorabunko/aozorabunko

positional arguments:
  {clone,c,parse,p,mkvec,m}
    clone (c)           clone aozorabunko/aozorabunko (>20GB)
    parse (p)           parse html files and write to results
    mkvec (m)           make word2vec and write to *.model

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit

$ aovec clone -h
usage: aovec clone [-h]

optional arguments:
  -h, --help  show this help message and exit

$ aovec parse -h
usage: aovec parse [-h] [-d DIR]

optional arguments:
  -h, --help            show this help message and exit
  -d DIR, --savedir DIR
                        directory name of saving results (default: novels)

$ aovec mkvec -h
usage: aovec mkvec [-h] [-d DIR] [-o NAME]

optional arguments:
  -h, --help            show this help message and exit
  -d DIR, --parsedir DIR
                        directory name of saved parsing results (default:
                        novels)
  -o NAME, --model NAME
                        name of word2vec model (default: aozora_model.model)
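
The options listed in the help output can be combined to run all three steps with a custom output directory and model name. This is a sketch that uses only the subcommands and flags shown in the help text; the helper names build_pipeline and run_pipeline are hypothetical and not part of aovec.

```python
import subprocess


def build_pipeline(parse_dir="novels", model_name="aozora_model.model"):
    """Return the three aovec command lines in the order they should run."""
    return [
        ["aovec", "clone"],
        ["aovec", "parse", "--savedir", parse_dir],
        # mkvec must read from the same directory that parse wrote to.
        ["aovec", "mkvec", "--parsedir", parse_dir, "--model", model_name],
    ]


def run_pipeline(parse_dir="novels", model_name="aozora_model.model"):
    """Run each step in turn, stopping at the first one that fails."""
    for command in build_pipeline(parse_dir, model_name):
        subprocess.run(command, check=True)
```

Passing the same directory to `--savedir` and `--parsedir` is the important part: if the two differ, `mkvec` will not find the files written by `parse`.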
Download files
Source Distribution
aovec-0.9.tar.gz (6.0 kB)
Built Distribution
aovec-0.9-py3-none-any.whl (7.2 kB)