Web scraping tool for NHK News Web Easy
Project description
Let’s study Easy Japanese (やさしいにほんご) with NHK News Web Easy!
This module (package) helps you study intermediate-level Japanese!
This module mainly covers the tasks listed below.
Getting the URLs of the most recent Easy Japanese news articles
Extracting the body texts of those articles
Getting the URLs of the corresponding original (regular) articles
Extracting the body texts of the original articles as well
Concatenating all of the above data
Saving everything as a csv file
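The last two tasks, concatenating and saving, can be pictured with a small sketch. This is not the module's own code: it only uses Python's standard csv module, and the function name append_rows is hypothetical. The column names are the ones this module expects in the csv file.

```python
import csv
import os

# Column names taken from this project description.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def append_rows(save_path, new_rows):
    """Concatenate new_rows (a list of dicts keyed by COLUMNS) with the rows
    already stored in save_path, rewrite the file, and return the row count.

    Hypothetical helper for illustration; the real module may work differently.
    """
    rows = []
    if os.path.exists(save_path):
        # Read the existing data, using the header row as dict keys.
        with open(save_path, newline="", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))

    rows.extend(new_rows)

    # Write the old and new rows back with a single header row.
    with open(save_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Note that this sketch does not remove duplicates, so calling it twice with the same article stores that article twice.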
Specifically, you can use this module to retrieve daily Easy Japanese (やさしいにほんご) news articles.
Here’s how it works.
1. Prepare a dataframe that has the columns named
{‘Date’, ‘Easy URL’, ‘Easy article’, ‘Regular URL’, ‘Regular article’}
and save it as a csv file.
This module concatenates the newly retrieved data with this dataframe.
2. Run the code below:
>>> from easyjapanese2 import EasyJapanese
>>> EasyJapanese()
3. Input DRIVER_PATH, which is the filepath of your Chrome Driver,
and SAVE_PATH, which is the filepath of the csv file you prepared in step 1.
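For step 1, one way to prepare the starting csv file is sketched below. It is only an illustration that assumes an empty file with a header row is an acceptable starting point; the function names create_empty_csv and has_expected_columns are hypothetical and are not part of this module.

```python
import csv

# Column names taken from step 1 above.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def create_empty_csv(save_path):
    """Write a csv file that contains only the header row."""
    with open(save_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(COLUMNS)


def has_expected_columns(save_path):
    """Return True if the first row of the csv file contains every expected column."""
    with open(save_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), [])
    return set(COLUMNS) <= set(header)
```

The path passed to create_empty_csv would then be the one you enter as SAVE_PATH in step 3.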
Have a great Easy Japanese (やさしいにほんご) life!
I do not own any rights to the articles, nor do I take any responsibility for anything caused by the use of this module.
Built Distribution
Hashes for easyjapanese-0.1.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 482e642a2adf6b26d65be8c55ff3cea63d253a9c34322278787e4c1021bcd9d7
MD5 | 940448a62b558cadc943685e8ed53044
BLAKE2b-256 | 62b092fef04721b11092e1ce9db352c60e265635d91d42af6a9af8373ce97798