Web scraping tool for NHK NEWS WEB EASY
# EasyJapanese
This module does the following:
- Gets the URLs of the most recent Easy Japanese news articles
- Extracts their body text
- Gets the URL of the original (regular) article for each one
- Extracts the body text of the original articles as well
- Concatenates all of the above data
- Saves everything as a CSV file
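The last two steps, concatenating the data and saving it as CSV, might look roughly like the sketch below. This is not the module's actual code: the function `append_rows` is hypothetical, and only the column names are taken from the usage instructions in this description. It uses the standard library `csv` module so that it runs without extra dependencies.

```python
import csv
import os

# Column names as listed in the usage instructions.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def append_rows(csv_path, new_rows):
    """Append rows (dicts keyed by COLUMNS) to csv_path (hypothetical helper).

    The header row is written only when the file is missing or empty.
    """
    needs_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    # utf-8 keeps the Japanese text intact; newline="" is what the csv module expects.
    with open(csv_path, "a", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        if needs_header:
            writer.writeheader()
        writer.writerows(new_rows)
```

Each row is a dictionary with one entry per column; any column left out of a row is written as an empty field.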
1. Prepare a dataframe, saved as a CSV file, with the columns 'Date', 'Easy URL', 'Easy article', 'Regular URL', and 'Regular article'. This module concatenates the newly retrieved data with this dataframe.
2. Run the code below:
>>> from easyjapanese2 import EasyJapanese
>>> EasyJapanese()
3. Enter DRIVER_PATH, the file path of your ChromeDriver executable, and SAVE_PATH, the file path of the CSV file you prepared in step 1.
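Step 1 can be sketched as follows. This is a minimal sketch using only the standard library; the helper `create_empty_csv` and the file name in the usage note are hypothetical. It assumes that a CSV file containing only the header row is an acceptable starting point for the module, which this description does not confirm.

```python
import csv
from pathlib import Path

# Column names taken from step 1 of the instructions.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def create_empty_csv(save_path):
    """Write a CSV file containing only the header row (hypothetical helper)."""
    path = Path(save_path)
    # utf-8 keeps Japanese article text intact; newline="" is what the csv module expects.
    with path.open("w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerow(COLUMNS)
    return path
```

For example, calling `create_empty_csv("easy_japanese.csv")` creates the file in the current directory, and its path is what you would then enter as SAVE_PATH.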
Have a great Easy Japanese (やさしいにほんご) life!
I do not own any rights to the articles, and I take no responsibility for any consequences of using this module.
Hashes for easyjapanese-0.0.7-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5328d2fde699e05080a038a8bcbec45919a3674bae297ae9225d4ead50edf846
MD5 | 65f6568e918edafa81c3beda9289eb90
BLAKE2b-256 | c1166545b6727245f5c6102bd8695c208ba1f612b891ac4f53ba4e9adbd0fb38