Quickly build a news/web corpus on specific topics or terms, either automatically from Google News or from article links listed in a file. This module automatically extracts the body and title from each article and saves the result to either flat files or a SQLite database.
Project description
News Corpus Builder
A simple module for quickly building a corpus from news articles. The generated corpus can be stored in a SQLite database or as flat files.
pip install news-corpus-builder
See http://skillachie.github.io/news-corpus-builder/ for installation and usage
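To make the two storage options concrete, here is a minimal sketch of what saving extracted articles to a SQLite database or to flat files might look like. It uses only the Python standard library and does not call or reproduce this module's own API (see the documentation link above for that). The record fields, the table schema, the per-category directory layout, and the function names `save_to_sqlite` and `save_to_flat_files` are all assumptions made for illustration.

```python
import sqlite3
from pathlib import Path

# Hypothetical records standing in for extracted articles (title + body).
# The field names and example URLs are assumptions for this sketch.
ARTICLES = [
    {"url": "https://example.com/a", "category": "technology",
     "title": "First article", "body": "Body text of the first article."},
    {"url": "https://example.com/b", "category": "business",
     "title": "Second article", "body": "Body text of the second article."},
]


def save_to_sqlite(articles, db_path):
    """Store articles in one table, keyed by URL so reruns do not duplicate rows."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS articles ("
                "url TEXT PRIMARY KEY, category TEXT, title TEXT, body TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO articles (url, category, title, body) "
                "VALUES (:url, :category, :title, :body)",
                articles,
            )
    finally:
        conn.close()


def save_to_flat_files(articles, corpus_dir):
    """Write one text file per article, grouped into a directory per category."""
    corpus_dir = Path(corpus_dir)
    paths = []
    for index, article in enumerate(articles):
        category_dir = corpus_dir / article["category"]
        category_dir.mkdir(parents=True, exist_ok=True)
        path = category_dir / f"{index:04d}.txt"
        # Title on the first line, a blank line, then the body.
        path.write_text(article["title"] + "\n\n" + article["body"], encoding="utf-8")
        paths.append(path)
    return paths
```

Keying the table on the article URL is one possible design choice: it lets the same link list be processed twice without producing duplicate rows. Flat files are easier to feed into tools that expect one document per file, while a database is easier to query by category.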
Download files
Source Distribution: news-corpus-builder-0.1.2.zip (6.1 kB)