Web scraping tool for NHK News Web Easy

Project description

Let’s study Easy Japanese (やさしいにほんご) with NHK News Web Easy!
This module (package) helps you study intermediate-level Japanese.
It mainly covers the tasks listed below.
  • Get the URLs of the most recent Easy Japanese news articles

  • Extract the body text of those articles

  • Get the URLs of the corresponding original (regular) articles

  • Extract the body text of the original articles as well

  • Concatenate all of the above data

  • Save everything as a CSV file
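
The last two tasks, concatenating and saving, can be illustrated without the package itself. The following is only a sketch that uses the standard library `csv` module; it is not the module's actual implementation, which according to the description works on a dataframe. The helper `append_articles`, the rule of skipping rows whose 'Easy URL' is already stored, and the assumption that the file uses the five column names from the description are all hypothetical choices made for illustration.

```python
import csv
from pathlib import Path

# Column names as given in the project description.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def append_articles(save_path, new_rows):
    """Merge new_rows (dicts keyed by COLUMNS) into the CSV at save_path.

    Rows whose "Easy URL" is already stored are skipped (an assumed rule,
    not confirmed behaviour of the package). Returns the number of rows added.
    """
    path = Path(save_path)

    # Load whatever was saved on earlier runs, if the file exists.
    existing = []
    if path.exists():
        with path.open(newline="", encoding="utf-8") as f:
            existing = list(csv.DictReader(f))

    # Keep only rows whose Easy URL has not been seen before.
    seen = {row["Easy URL"] for row in existing}
    added = []
    for row in new_rows:
        if row["Easy URL"] not in seen:
            seen.add(row["Easy URL"])
            added.append(row)

    # Write the old and new rows back as a single CSV file.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(existing + added)
    return len(added)
```

Calling `append_articles` twice with the same row adds it only the first time, so running the sketch once a day would not duplicate articles that are still listed as recent.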

Specifically, you can use this module to retrieve the daily Easy Japanese (やさしいにほんご) news articles.
Here’s how it works.
  1. Prepare a dataframe with the columns 'Date', 'Easy URL', 'Easy article', 'Regular URL', and 'Regular article'. This module concatenates the newly retrieved data with this dataframe.

  2. Run the code below:

    >>> from easyjapanese2 import EasyJapanese
    
    >>> EasyJapanese()
  3. Enter DRIVER_PATH, the file path of your ChromeDriver executable, and SAVE_PATH, the file path of the CSV file you prepared in step 1.
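
The CSV file for step 1 can be prepared in a few lines. The sketch below is a minimal example using only the standard library: it writes a file that contains just the header row with the five column names from step 1. The helper name `prepare_csv` and the file name in the usage note are hypothetical, and it is an assumption that the module accepts a header-only file as its starting point.

```python
import csv
from pathlib import Path

# Column names taken from step 1.
COLUMNS = ["Date", "Easy URL", "Easy article", "Regular URL", "Regular article"]


def prepare_csv(save_path):
    """Create a header-only CSV at save_path unless the file already exists."""
    path = Path(save_path)
    if path.exists():
        # Leave previously saved articles untouched.
        return path
    with path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(COLUMNS)
    return path
```

For example, `prepare_csv("easy_japanese.csv")` creates the file in the current directory, and its path can then be given as SAVE_PATH in step 3.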

Have a great Easy Japanese (やさしいにほんご) life!

I do not own any rights to the articles, nor do I take any responsibility for any consequences of using this module.

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

easyjapanese-0.1.2-py3-none-any.whl (9.6 kB, Python 3)
