A Python package for jatools
jatool
A Python package for downloading Japanese literature and running downstream analyses on it.
Install
pip install jatool
# The dependency 'SudachiPy' requires Rust, so install Rust first.
##Linux##
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
##MacOS##
brew install rustup
rustup-init
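Before running pip install, it can help to confirm that the Rust toolchain is actually on your PATH. The following is a minimal sketch using only the standard library; the helper name missing_tools is hypothetical and not part of jatool.

```python
import shutil


def missing_tools(tools=("rustc", "cargo")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Rust toolchain found.")
```

If either tool is reported as missing right after installing Rust, opening a new terminal session is usually worth trying before reinstalling.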
Load the jatool package
from jatool.function import *
Download literature
Pass the author's name and an output path to ja_download_fiction_author(). Example:
ja_download_fiction_author(author = '島崎藤村',output_path='fiction_download')
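To check what the download produced, you can list the text files in the output folder. This sketch assumes the downloaded works are saved as .txt files somewhere under output_path, which matches how the later sections read them but is not confirmed here. list_txt_files is a hypothetical helper, not a jatool function.

```python
from pathlib import Path


def list_txt_files(folder):
    """Return sorted paths of all .txt files under `folder` (recursively)."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.txt"))


if __name__ == "__main__":
    files = list_txt_files("fiction_download")
    print(f"{len(files)} text file(s) found")
    for path in files:
        print(path.name, path.stat().st_size, "bytes")
```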
Topic model (LDA)
Pass the path of a folder containing the .txt files to topic_model_fition_corpus(), then pass its result to topic_model_fition_text(). Example:
excluded_words = []  # Optional extra stopwords for tuning the topic model, e.g. excluded_words = ['人', '事', 'ぬ', 'やう']
topic_result = topic_model_fition_corpus(folder_path='fiction2', topics_num=2, added_stopwords=excluded_words)
# Try different 'topics_num' values to adjust the topic model results.
topic_model_fition_text(input_corpus=topic_result, topics_num=3)
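To see why the extra stopwords change the topics, it helps to look at which words dominate a corpus before and after filtering. The sketch below is not jatool's implementation. It assumes the text has already been split into tokens (jatool's own tokenization is not shown here) and uses a hypothetical helper, top_words, to count the most frequent tokens with and without the excluded words. The documents are toy data made up for illustration.

```python
from collections import Counter


def top_words(tokenized_docs, excluded=(), n=3):
    """Count tokens across all documents, skipping any in `excluded`."""
    excluded = set(excluded)
    counts = Counter(
        token
        for doc in tokenized_docs
        for token in doc
        if token not in excluded
    )
    return counts.most_common(n)


# Toy, hand-tokenized documents (illustrative data only).
docs = [
    ["人", "事", "山", "人"],
    ["人", "川", "事", "山"],
]

print(top_words(docs))                         # [('人', 3), ('事', 2), ('山', 2)]
print(top_words(docs, excluded=["人", "事"]))  # [('山', 2), ('川', 1)]
```

Very common function-like words crowd out content words in the counts, which is the same effect that removing them from an LDA vocabulary is meant to counter.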
Clustering analysis
Pass the path of a folder containing the .txt files to get_features_path(), then pass the resulting features to feature_clustering(). Example:
features = get_features_path(folder_path='fiction_download')
df = feature_clustering(feature_list_result=features, clusters_n=3)
# If the plot does not display Japanese characters correctly, install a Japanese font such as 'Yu Gothic'.
# The following code switches the font (fm and plt are matplotlib's font_manager and pyplot):
# import matplotlib.font_manager as fm
# import matplotlib.pyplot as plt
# fpath = '/Your/Fonts/directory/YuGothic.ttf'
# prop = fm.FontProperties(fname=fpath)
# font_dir = ['/Your/Fonts/directory/']
# for font in fm.findSystemFonts(font_dir):
#     fm.fontManager.addfont(font)
# plt.rcParams['font.family'] = 'Yu Gothic'
Emotion analysis
Create an analyzer with get_sentiment_analyzer(), then pass it the text to analyze. Example:
sentiment_analyzer = get_sentiment_analyzer()
sentiment_analyzer("私は幸福である。")
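For a longer passage, one option is to split it into sentences and score each one separately. The sketch below assumes only that the analyzer is a callable taking a string, as in the example above. The return format of jatool's analyzer is not documented here, so results are stored as-is. split_sentences and analyze_by_sentence are hypothetical helpers, and the stand-in analyzer exists only to make the sketch runnable without jatool.

```python
import re


def split_sentences(text):
    """Split Japanese text after each 。, ！ or ？ and drop empty pieces."""
    parts = re.split(r"(?<=[。！？])", text)
    return [p.strip() for p in parts if p.strip()]


def analyze_by_sentence(analyzer, text):
    """Apply `analyzer` to each sentence; return (sentence, result) pairs."""
    return [(s, analyzer(s)) for s in split_sentences(text)]


if __name__ == "__main__":
    # Stand-in analyzer for illustration only; with jatool you would pass
    # the object returned by get_sentiment_analyzer() instead.
    def fake_analyzer(sentence):
        return len(sentence)

    text = "私は幸福である。今日は雨だ。"
    for sentence, result in analyze_by_sentence(fake_analyzer, text):
        print(sentence, result)
```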
Translation
Pass the text to translate to translation_from_jp_to_en(), translation_from_jp_to_cn(), or translation_from_lan_to_jp(). Example:
translation_from_jp_to_en("私は幸福である。")
translation_from_jp_to_cn("私は幸福である。")
translation_from_lan_to_jp('I am happy.')
Download files
File details
Details for the file jatool-1.992.tar.gz.
File metadata
- Download URL: jatool-1.992.tar.gz
- Upload date:
- Size: 6.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ffd9c9590b50dd4274cad3abc499dbb330b0f263a36883a77b46734c86be56fd |
| MD5 | d88e1d97f3a31c70a3574c43ba4a3447 |
| BLAKE2b-256 | 28c503c777bcac2c4b6d5fac7684834d278e8b3459e90d56aa4dd6a9b103d3c1 |
File details
Details for the file jatool-1.992-py3-none-any.whl.
File metadata
- Download URL: jatool-1.992-py3-none-any.whl
- Upload date:
- Size: 2.1 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef9f88e4e2a9c80ce8cf28bdaab4281dd3fb3c1719fd8482b98cca69b4ae1fc2 |
| MD5 | 4719d674d416c61e720358d252e47b0f |
| BLAKE2b-256 | 25b5044f651375c005e4afcfee1f906c17097debc4e2392fbae2481253cc78ac |
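The SHA256 digests listed above can be used to check a downloaded file before installing it. This is a minimal sketch using the standard library's hashlib. sha256_of is a hypothetical helper name, and the file path assumes the source archive was saved in the current directory.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Digest copied from the table for jatool-1.992.tar.gz.
    expected = "ffd9c9590b50dd4274cad3abc499dbb330b0f263a36883a77b46734c86be56fd"
    path = "jatool-1.992.tar.gz"  # assumed download location
    if os.path.exists(path):
        print("match" if sha256_of(path) == expected else "MISMATCH")
    else:
        print("file not found:", path)
```

To check the wheel instead, swap in its file name and the SHA256 value from its own table.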