html-stripper

A simple package to extract text from (even broken/invalid) HTML

Development Status
- 4 - Beta
License
- OSI Approved :: GNU General Public License v3 (GPLv3)
Operating System
- OS Independent
Programming Language
- Python
- Python :: 3

Project description

A simple package to extract text from HTML, including broken or invalid markup. It has no dependencies: it uses Python's built-in HTMLParser with a few tweaks.
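To illustrate the general approach of extracting text with Python's HTMLParser, here is a minimal sketch. It is not the package's actual code: the class name `TextCollector`, the function name `strip_tags_sketch`, and the choice to skip `<script>` and `<style>` contents are assumptions made for this example, and the package's own "tweaks" may differ.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: keeps text nodes and discards the tags around them."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style> (an assumption)

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only text outside skipped elements is collected.
        if self._skip_depth == 0:
            self.chunks.append(data)


def strip_tags_sketch(html):
    """Return the text content of html; unclosed tags do not raise an error."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


print(strip_tags_sketch("<p>Hello <b>world</b></p>"))
```

Because HTMLParser is event-based and does not build a tree, it does not need tags to be balanced, which is why this style of parser copes with broken markup.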

Usage:

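# Basic usage: extract the text from an HTML string.
from html_stripper import strip_tags
text = strip_tags("<html>…")

# Fetch a page with requests (a third-party package, installed separately) and strip its tags.
from html_stripper import strip_tags
import requests
text = strip_tags(requests.get("https://foo.bar/").text)

# Replace chained newlines in the extracted text with a single \n.
from html_stripper import strip_tags, strip_multiple_newlines
text = strip_multiple_newlines(strip_tags("<html>…"))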
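The newline-collapsing step can be pictured with a short regular-expression sketch. This is an illustration of the described behavior only, not the package's implementation: the function name `collapse_newlines` is hypothetical, and the real `strip_multiple_newlines` may treat whitespace-only lines or `\r\n` line endings differently.

```python
import re


def collapse_newlines(text):
    """Replace every run of two or more consecutive newlines with one newline."""
    return re.sub(r"\n{2,}", "\n", text)


print(repr(collapse_newlines("Title\n\n\nFirst paragraph\n\nSecond paragraph")))
```

Stripping tags from a typical page tends to leave many blank lines where block elements used to be, so a pass like this makes the extracted text easier to read or to split into lines.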

Release history

- 0.3 (this version): Jul 30, 2020
- 0.2.1: Jul 30, 2020
- 0.2: Jul 30, 2020
- 0.1: Jul 30, 2020

Download files

Source Distribution

html_stripper-0.3.tar.gz (15.2 kB)

Uploaded Jul 30, 2020 Source

File details

Details for the file html_stripper-0.3.tar.gz.

File metadata

Download URL: html_stripper-0.3.tar.gz
Upload date: Jul 30, 2020
Size: 15.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.6.11

File hashes

Hashes for html_stripper-0.3.tar.gz
Algorithm	Hash digest
SHA256	`b9ea66bc75d00adc06447f3c3a278899c10cf12fad0c0faab39457057b4056b9`
MD5	`50dfb87e9e4fe54b52f35dfff89cca5e`
BLAKE2b-256	`21e0c6b141679eed08bb139a7a82f36ed30336b15d69c9b2c4a735549a53efad`
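A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. The following is a minimal sketch: the helper names `sha256_of_file` and `matches_expected` are made up for this example, and the local file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest listed for html_stripper-0.3.tar.gz.
EXPECTED_SHA256 = "b9ea66bc75d00adc06447f3c3a278899c10cf12fad0c0faab39457057b4056b9"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("html_stripper-0.3.tar.gz")` should return True for an intact copy of the archive saved under that name in the current directory.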
