A Nifty HTML Parser written in Python
- Pyarser is a simple, straightforward HTML parser that lets you easily harvest the text
inside an HTML document, given a link to the page. Examples:
get_site_HTML(link): returns a string of HTML content from a link
get_site_text(link): returns a string of text from a link. This string has all the HTML tags <> removed, along with their contents.
search_by_phrase(phrase, link): returns the fragments of text from a link that contain phrase as one continuous string.
search_for_words(words, link): returns the fragments of text from a link that contain ANY of the strings in words.
word_count(link): counts the number of text words from a link.
get_HTML_tags(link): returns a list of the tags used in an HTML document from a link.
HTML_to_TXT(link, name): writes a TXT file with the text content from a link. All HTML brackets and tags are removed.
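
To give a feel for what the functions above do, here is a minimal sketch built only on the Python standard library. It is not Pyarser's actual implementation: the helper names (`fetch_html`, `html_to_text`, `fragments_with_phrase`, `fragments_with_any`, `count_words`) are hypothetical, and it assumes the page is UTF-8 and that the contents of script and style elements should not count as text.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects text fragments and tag names while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.chunks = []      # visible text fragments, in document order
        self.tags = []        # every opening tag encountered
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def fetch_html(link):
    """Download a page and decode it (assumes UTF-8, an assumption)."""
    with urlopen(link) as response:
        return response.read().decode("utf-8", errors="replace")


def html_to_text(html):
    """Return (text fragments, tag names) for an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks, parser.tags


def fragments_with_phrase(chunks, phrase):
    """Fragments that contain the phrase as one continuous string."""
    return [c for c in chunks if phrase in c]


def fragments_with_any(chunks, words):
    """Fragments that contain ANY of the given strings."""
    return [c for c in chunks if any(w in c for w in words)]


def count_words(chunks):
    """Number of whitespace-separated words across all fragments."""
    return sum(len(c.split()) for c in chunks)


if __name__ == "__main__":
    sample = (
        "<html><body><h1>Hello world</h1>"
        "<p>Parsing is <b>fun</b> today</p>"
        "<script>var x = 1;</script></body></html>"
    )
    chunks, tags = html_to_text(sample)
    print(chunks)
    print(tags)
    print(fragments_with_phrase(chunks, "Hello"))
    print(count_words(chunks))
```

The sketch parses an HTML string directly so it can be tried without a network connection; passing the result of `fetch_html(link)` to `html_to_text` would cover the link-based case.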