A Python library that parses HTML into a tree structure

TreepyParser

TreepyParser is a Python library that parses HTML into a tree structure.

Installation

Install the package with pip:

pip install TreepyParser

How to use

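import TreepyParser

# Pass a string containing HTML; parse() returns the root node of the tree.
# Note: `parser` is not defined by the import above and its origin is not
# shown here; it may be a module inside the package.
html = "<div><p>Hello</p></div>"
tree = parser.parse(html)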

The returned tree is a single root node: the element that contains all other elements.
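
To make the idea of "HTML parsed into a tree with one containing node" concrete, here is a minimal sketch built on the standard library's html.parser instead of TreepyParser. It is not how TreepyParser is implemented. The names SketchNode, SketchTreeBuilder and sketch_parse are hypothetical, and the synthetic "document" root is an assumption made so that the sketch always has exactly one containing node.

```python
from html.parser import HTMLParser


class SketchNode:
    """Hypothetical stand-in node; not TreepyParser's own class."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []


class SketchTreeBuilder(HTMLParser):
    """Builds a SketchNode tree by keeping a stack of open elements."""

    # Elements that never have a closing tag (subset, for illustration)
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.root = SketchNode("document")  # assumed synthetic container
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = SketchNode(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in self.VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break


def sketch_parse(html):
    builder = SketchTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


tree = sketch_parse('<div><p>Hi</p><a href="www.example.com">x</a></div>')
div = tree.children[0]
print(div.tag, [child.tag for child in div.children])  # div ['p', 'a']
```

The stack mirrors the nesting of open tags: each start tag becomes a child of whatever element is currently on top, and each end tag pops back to its matching element.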

Nodes have the following attributes and methods:

node.tag #string naming the element, e.g. div, p, table
node.get_attribute(key) #returns the attribute value, e.g. href => www.example.com
node.add_node(n) #adds node n as a child of node
node.insert_node(n, pos) #inserts node n at position pos of node; the node previously at pos becomes a child of n
node.remove_node(pos, remove_subtree) #removes the node at pos; if remove_subtree is false, the children of the removed node are added to node
node.find(tag="", **kwargs) #returns a list of all matches in the subtree; kwargs are attribute name/value pairs
node.nodes_in_subtree() #returns the number of nodes in the subtree (root excluded)
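
The semantics of insert_node and remove_node are easier to follow with a worked example. The sketch below is a self-contained stand-in class, DemoNode (a hypothetical name), that mimics the behaviour described above. It is not TreepyParser's implementation. Several details are assumptions: find searches descendants only, an empty tag matches any tag, and children of a removed node are spliced in at the removed node's position.

```python
class DemoNode:
    """Illustrative stand-in for a tree node; not TreepyParser's code."""

    def __init__(self, tag, **attributes):
        self.tag = tag
        self.attributes = attributes
        self.children = []

    def get_attribute(self, key):
        return self.attributes.get(key)

    def add_node(self, n):
        self.children.append(n)

    def insert_node(self, n, pos):
        # n takes over position pos; the node that was there becomes n's child
        previous = self.children[pos]
        n.add_node(previous)
        self.children[pos] = n

    def remove_node(self, pos, remove_subtree):
        removed = self.children.pop(pos)
        if not remove_subtree:
            # Assumption: orphaned children go where the removed node was
            self.children[pos:pos] = removed.children

    def find(self, tag="", **kwargs):
        # Assumption: only descendants are searched; "" matches any tag
        matches = []
        for child in self.children:
            tag_ok = tag == "" or child.tag == tag
            attrs_ok = all(child.get_attribute(k) == v for k, v in kwargs.items())
            if tag_ok and attrs_ok:
                matches.append(child)
            matches.extend(child.find(tag, **kwargs))
        return matches

    def nodes_in_subtree(self):
        # Counts every descendant; the node itself is excluded
        return sum(1 + c.nodes_in_subtree() for c in self.children)


root = DemoNode("div")
link = DemoNode("a", href="www.example.com")
root.add_node(DemoNode("p"))
root.add_node(link)
print(root.nodes_in_subtree())                            # 2

root.insert_node(DemoNode("span"), 1)                     # span now wraps the link
print([c.tag for c in root.children])                     # ['p', 'span']
print(root.find("a", href="www.example.com") == [link])   # True

root.remove_node(1, False)                                # drop span, keep its child
print([c.tag for c in root.children])                     # ['p', 'a']
```

Calling remove_node(1, True) in the last step would instead discard the span together with the link inside it.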

License

MIT

Download files

Source distribution: TreepyParser-1.2.0.tar.gz (4.3 kB)
