Get the title and main body text from an article in a web page
Project description
# HTMLText
htmltext is a simple tool for extracting the title and main body text of articles in HTML web pages, such as news stories and blog posts.
Installation:
pip install htmltext
Usage:
from htmltext import HTMLText
title, text = HTMLText(html_data)
Example:
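import requests
from htmltext import HTMLText

# Placeholder URL: replace it with the address of the article to extract.
url_of_the_article = "https://example.com/article"
r = requests.get(url_of_the_article)
# Pass the raw response body to HTMLText; it returns the title and body text.
title, text = HTMLText(r.content)
print(title)
print(text)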
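Extraction can come back empty on pages with no clear article body, so it may help to wrap the call in a small helper. The following is a minimal sketch, not part of htmltext. The name `extract_article` is hypothetical, and the code assumes the extractor behaves as in the usage above: it takes the page HTML and returns a `(title, text)` pair of strings (or `None`).

```python
from typing import Callable, Dict


def extract_article(html_data, extractor: Callable) -> Dict[str, str]:
    """Call an HTMLText-style extractor and tidy up its result.

    `extractor` is assumed to take the page HTML and return a
    (title, text) pair, as in the usage shown above.
    """
    title, text = extractor(html_data)
    # Assumption: values are strings or None; treat None as empty and trim.
    title = (title or "").strip()
    text = (text or "").strip()
    if not text:
        # Fail loudly instead of silently returning an empty article.
        raise ValueError("no body text could be extracted")
    return {"title": title, "text": text}
```

With the package installed, the helper would be called as `extract_article(r.content, HTMLText)`. Passing the extractor as an argument keeps the helper independent of htmltext, so it can be exercised with a stub function.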
Download files
Source Distribution: htmltext-0.0.6.tar.gz (2.4 kB)