easyjapanese

Web scraping tool for NHK News Web Easy

Project description

Let’s study Easy Japanese (やさしいにほんご) with NHK news web easy!
This module (package) helps you study intermediate-level Japanese!
This module mainly does the following:

Get the URLs of the most recent Easy Japanese news articles
Extract their body text
Get the URLs of the corresponding original articles
Extract the body text of the original articles as well
Concatenate all of the above data
Save everything as a csv file
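
The last two items, concatenating the data and saving it as a csv file, can be sketched with the standard library alone. This is a minimal illustration and not the module's actual implementation: the helper names `merge_rows` and `save_rows` are hypothetical, and skipping rows whose 'Easy URL' is already present is an assumption made for this example. The column names are the ones this page lists for the prepared dataframe.

```python
import csv

# Column names as listed on this page for the prepared dataframe.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def merge_rows(existing, new):
    """Return the existing rows followed by new rows not already present.

    Assumption for this sketch: a row counts as a duplicate when its
    'Easy URL' matches a row that has already been kept.
    """
    seen = {row["Easy URL"] for row in existing}
    merged = list(existing)
    for row in new:
        if row["Easy URL"] not in seen:
            merged.append(row)
            seen.add(row["Easy URL"])
    return merged


def save_rows(rows, path):
    """Write the rows to a csv file, header row first."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Keying on the URL keeps a repeated daily run from storing the same article twice; whether the module itself does this is not stated here.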

Specifically, you can use this module to retrieve daily Easy Japanese (やさしいにほんご) news articles.

Here’s how it works.

Prepare a dataframe with the columns named {‘Date’, ‘Easy URL’, ‘Easy article’, ‘Regular URL’, ‘Regular article’}. This module concatenates the newly retrieved data with this dataframe.

Run the code below:

>>> from easyjapanese2 import EasyJapanese

>>> EasyJapanese()

Input DRIVER_PATH, the filepath of your Chrome Driver, and SAVE_PATH, the filepath of the csv file you prepared in the first step.
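
For the first step, one way to prepare the csv file is to write a file that contains only the header row. This sketch assumes that a file with no data rows is an acceptable starting point, which this page does not state. `prepare_csv` is a hypothetical helper and not part of the module.

```python
import csv
from pathlib import Path

# Column names as listed in the first step above.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def prepare_csv(save_path):
    """Create the csv file with only the header row.

    Returns True if the file was created, False if it already existed
    (an existing file is left untouched so earlier data is not lost).
    """
    path = Path(save_path)
    if path.exists():
        return False
    with path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(COLUMNS)
    return True
```

The path passed to `prepare_csv` would then be the one given as SAVE_PATH.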

Have a great Easy Japanese (やさしいにほんご) life!

I do not own any rights to the articles, nor do I take any responsibility for consequences of using this module.

Project details

Release history

Version	Released
0.1.4	Dec 6, 2022
0.1.3	Dec 3, 2022
0.1.2 (this version)	Dec 1, 2022
0.1.1	Dec 1, 2022
0.1.0	Dec 1, 2022
0.0.9	Dec 1, 2022
0.0.8	Dec 1, 2022
0.0.7	Dec 1, 2022
0.0.6	Dec 1, 2022
0.0.5	Dec 1, 2022
0.0.4	Dec 1, 2022
0.0.3	Dec 1, 2022
0.0.2	Dec 1, 2022
0.0.1	Dec 1, 2022

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

easyjapanese-0.1.2-py3-none-any.whl (9.6 kB)

Uploaded Dec 1, 2022 Python 3

Hashes for easyjapanese-0.1.2-py3-none-any.whl

Algorithm	Hash digest
SHA256	`6854be42fdd06cc241bbf19ed14d58cf483e829c45bcffebe3782295097a20f4`
MD5	`898239eb122fb594decaa21bb4ce7568`
BLAKE2b-256	`a5ca6d25b7a5773a1dc370b974e83157da3a8a8ad7a9982068f8d3fdef209522`
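
The digests above can be used to check a downloaded wheel before installing it. A minimal sketch using Python's standard `hashlib`, with the SHA256 value copied from the table; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the file is assumed to have been downloaded already:

```python
import hashlib

# SHA256 digest listed above for easyjapanese-0.1.2-py3-none-any.whl.
EXPECTED_SHA256 = "6854be42fdd06cc241bbf19ed14d58cf483e829c45bcffebe3782295097a20f4"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected one."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected` with the local path of the downloaded wheel returns False if the file differs from the one described on this page.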