This code can refactor your html in pretty dict

These details have not been verified by PyPI

Project description

PrettyHTML

Overview

PrettyHTML is a Python library designed to refactor HTML code into a pretty dictionary format. It allows users to easily parse and structure HTML content, making it more readable and accessible for further processing or analysis. The library utilizes the BeautifulSoup parser to extract HTML elements and their attributes, effectively organizing the code into a user-friendly format.

Features

Conversion of HTML code into a dictionary format
Recursive element path extraction for easy navigation through the document structure
Handling of HTML elements without class attributes
Customizable element searching with optional class filtering
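
To make the first two features concrete, here is a minimal sketch of what converting HTML into a path-keyed dictionary could look like. It is not PrettyHTML's implementation: it uses the standard library's html.parser instead of BeautifulSoup, and the names PathCollector and html_to_dict, the "/" path separator, and the output shape are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Hypothetical stand-in: maps each element path to the text found there."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: list[str] = []  # tags that are currently open
        self.result: dict[str, list[str]] = {}

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.result.setdefault("/".join(self.stack), []).append(text)


def html_to_dict(html: str) -> dict[str, list[str]]:
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.result


sample = "<div><h1>Hello, World!</h1><p>This is some text inside a paragraph</p></div>"
print(html_to_dict(sample))
# {'div/h1': ['Hello, World!'], 'div/p': ['This is some text inside a paragraph']}
```

The dictionary that HandlerBlock actually returns may be keyed and nested differently; inspect its output before relying on a particular shape.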

Structure

The PrettyHTML library consists of three main Python files:

html_handler.py: Contains the HandlerBlock class that takes HTML code as input and refactors it into a dictionary structure.
utillites.py: Houses the Finder class, which uses BeautifulSoup to find elements in a given HTML code and provides methods for finding elements with or without class attributes.
__init__.py: Initializes the PrettyHTML library by importing necessary functions and classes.

Usage

To use PrettyHTML in your project, you need to follow these steps:

Install PrettyHTML using pip (the distribution files on PyPI are named pretty_html): pip install pretty-html

Import the required files in your Python script:

from handler_html.PrettyHTML import HandlerBlock
from handler_html.utillites import Finder

Create an instance of the HandlerBlock class and pass the HTML code as input:

block_code = """
<div>
    <h1>Hello, World!</h1>
    <p>This is some text inside a paragraph</p>
</div>
"""
hb = HandlerBlock(block_code=block_code)

Call the Handler method of the HandlerBlock instance to obtain the refactored HTML code in dictionary format:
```
output = hb.Handler()
print(output)
```

By following these steps, users can easily reshape their HTML code into a more accessible and organized dictionary structure using the PrettyHTML library.

html_handler.py

This file provides a class for handling HTML content. The main goal is to parse HTML content and organize the elements into a dictionary structure.

Usage

To use the html_handler.py class, follow these steps:

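Import the utillites module, which provides the Finder class. The plain import works when the file is run as a script; the relative import is the fallback used when the file is loaded as part of the package.
```
try:
    import utillites
except ImportError:
    # Loaded as part of the package: fall back to a relative import.
    from . import utillites
```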

Create an instance of the HandlerBlock class initialized with the HTML content as the block code.

block_html = ""
with open("test.txt", "r") as file:
    block_html = file.read()
HB = HandlerBlock(block_code=block_html)

Invoke the Handler() method on the HandlerBlock instance to process the HTML content and get a dictionary containing the organized elements.
```
out = HB.Handler()
```

Methods

`__init__(self, block_code: str)`

Description: Initializes the HandlerBlock class with the HTML content as the block_code.
Parameters:
- block_code (str): The HTML content to be parsed.

`Handler(self) -> dict`

Description: Processes the HTML content using the __handler_element() method and returns a dictionary containing organized elements.
Parameters: None.
Returns:
- dict: A dictionary containing the organized elements.

`__handler_element(self) -> dict`

Description: Processes the HTML content, finds the elements that have no class attribute, and organizes them into a dictionary.
Parameters: None.
Returns:
- dict[str, list]: A dictionary containing the organized elements.

`__get_item_path(self, finder: utillites.Finder, item) -> str`

Description: Recursively traverses the HTML content to determine the element's path and return it.
Parameters:
- finder (utillites.Finder): A Finder object provided for traversing the HTML content.
- item (object): The current item being processed.
Returns:
- str: The item's path based on the parent/child relationships in the given HTML content.
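
The recursive path lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: Node is a hypothetical stand-in for a parsed element (BeautifulSoup tags expose similar name and parent attributes), get_item_path is written as a free function without the finder argument, and the "/" separator is a guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical parsed element: a tag name plus a link to its parent."""
    name: str
    parent: Optional["Node"] = None


def get_item_path(item: Node) -> str:
    # Base case: the root element has no parent, so its path is its own name.
    if item.parent is None:
        return item.name
    # Recursive case: the parent's path, followed by this element's name.
    return f"{get_item_path(item.parent)}/{item.name}"


root = Node("div")
paragraph = Node("p", parent=root)
link = Node("a", parent=paragraph)
print(get_item_path(link))  # div/p/a
```

Recursion depth equals the nesting depth of the element, which is small for typical HTML documents.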

Example

The __main__ section in the file demonstrates how to use the HandlerBlock class by reading HTML content from a file (test.txt), processing it, and printing its top-level keys. To use this example, replace "test.txt" with the path to your HTML file containing the content you would like to process.

if __name__ == "__main__":
    block_html = ""
    with open("test.txt", "r") as file:
        block_html = file.read()
    HB = HandlerBlock(block_code=block_html)
    out = HB.Handler()
    print(list(out.keys()))

utillites.py

This file contains a Finder class which is used to search for HTML elements within a provided HTML string. It utilizes the BeautifulSoup library to parse and extract HTML elements based on specified criteria.

Finder Class

The Finder class is initialized with an HTML code string, which is then parsed using the BeautifulSoup library.

Methods

`__init__(self, html_code: str) -> None`

Initializes the Finder class with the provided html_code parameter.
The html_code is parsed using the BeautifulSoup library with the 'html.parser'.

`find_classes(self, type_item, class_name) -> list`

Searches for HTML elements with the specified type_item and class_name.
The type_item parameter is an HTML tag name, such as 'div', 'span', 'p', or 'button'.
The class_name parameter is the class name of the desired HTML element.
Returns a list of matching HTML elements.

`find_without_class(self) -> list`

Searches for all HTML elements without a class name.
Returns a list of all matching HTML elements.
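
The behavior of the two search methods can be sketched with the standard library alone. MiniFinder below is a hypothetical stand-in, not the real Finder: it returns (tag, attributes) tuples rather than BeautifulSoup elements, and the decision to match class_name against any one of an element's space-separated classes is an assumption. With BeautifulSoup itself, the class-filtered search is usually written as `soup.find_all(type_item, class_=class_name)`; whether Finder does exactly that is not stated in this description.

```python
from html.parser import HTMLParser


class MiniFinder(HTMLParser):
    """Hypothetical stand-in for Finder that records every start tag."""

    def __init__(self, html_code: str) -> None:
        super().__init__()
        self.elements: list[tuple[str, dict[str, str]]] = []
        self.feed(html_code)
        self.close()

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; store them as empty strings.
        self.elements.append((tag, {key: value or "" for key, value in attrs}))

    def find_classes(self, type_item: str, class_name: str) -> list:
        # Keep elements with the given tag whose class list contains class_name.
        return [
            (tag, attrs)
            for tag, attrs in self.elements
            if tag == type_item and class_name in attrs.get("class", "").split()
        ]

    def find_without_class(self) -> list:
        # Keep elements that have no class attribute at all.
        return [(tag, attrs) for tag, attrs in self.elements if "class" not in attrs]


html = '<div class="card big"><p>text</p><span class="card">x</span></div>'
finder = MiniFinder(html)
print(finder.find_classes("div", "card"))  # [('div', {'class': 'card big'})]
print(finder.find_without_class())         # [('p', {})]
```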

PrettyHTML: __init__.py

This module contains the main class of the PrettyHTML library, named PrettyHTML. The class helps users produce more readable HTML by adding indentation, line breaks, and other formatting automatically, and is intended to cover a range of HTML generation and formatting scenarios.

Usage

To use the PrettyHTML module, you need to follow these simple steps:

Import the PrettyHTML class from the module.
Create an instance of the PrettyHTML class.
Add HTML content using various built-in methods and properties.
Generate formatted HTML by accessing the formatted property of the class instance.

Here's a brief example:

from PrettyHTML import PrettyHTML

# Create an instance of PrettyHTML class
pretty_html = PrettyHTML()

# Add HTML content using built-in methods and properties
pretty_html.add_head('<meta charset="utf-8">')
pretty_html.add_script('src')

# Add arbitrary HTML content
pretty_html.add_html('<h1>Welcome to PrettyHTML!</h1>')

# Generate formatted HTML content
formatted_html = pretty_html.formatted

In this example, we created an instance of the PrettyHTML class, added content (head, script, and arbitrary HTML) using its methods, and accessed the formatted HTML content through the formatted property.

Methods and Properties

Constructor: __init__(self)

The __init__ constructor initializes a new PrettyHTML instance, setting up the template for HTML content. This method ensures that the class maintains its structure and formatting settings during object creation.

add_head(self, head_content: str)

After creating an instance of the PrettyHTML class, the add_head method enables users to insert content corresponding to the HTML <head> element. The head_content argument should be a valid HTML head content string.

add_script(self, script_content: str)

The add_script method facilitates adding JavaScript <script> content to the HTML head section. The script_content argument should be a valid JS script string.

add_html(self, html_content: str)

The add_html method allows users to add arbitrary HTML content to the class instance. Its html_content argument must be a valid string containing HTML code.

formatted(self) -> str

The formatted property returns a single string containing all added content (head, script, and HTML), indented and laid out for readability. The result is ready to use in your project or to process further.
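
As a rough illustration of how a class with these four members could fit together, consider the sketch below. PageBuilder is a hypothetical name and everything inside it is an assumption: the description above does not say how PrettyHTML stores content, whether add_script wraps its argument in a script tag, or what indentation the formatted output uses.

```python
class PageBuilder:
    """Hypothetical builder mirroring the interface described above."""

    def __init__(self) -> None:
        self._head: list[str] = []  # fragments placed inside <head>
        self._body: list[str] = []  # fragments placed inside <body>

    def add_head(self, head_content: str) -> None:
        self._head.append(head_content)

    def add_script(self, script_content: str) -> None:
        # Assumption: the argument is JavaScript source, wrapped in a script tag.
        self._head.append(f"<script>{script_content}</script>")

    def add_html(self, html_content: str) -> None:
        self._body.append(html_content)

    @property
    def formatted(self) -> str:
        # Assemble the page with two-space indentation per nesting level.
        lines = ["<html>", "  <head>"]
        lines += [f"    {item}" for item in self._head]
        lines += ["  </head>", "  <body>"]
        lines += [f"    {item}" for item in self._body]
        lines += ["  </body>", "</html>"]
        return "\n".join(lines)


page = PageBuilder()
page.add_head('<meta charset="utf-8">')
page.add_script("console.log('ready');")
page.add_html("<h1>Welcome to PrettyHTML!</h1>")
print(page.formatted)
```

Exposing formatted as a property rather than a method matches the usage shown earlier, where the result is read as an attribute without parentheses.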

pyproject.toml

This file describes the project's build requirements and dependencies. Tools such as Poetry use it to manage the project's dependencies and build system.

Usage

To create a new pyproject.toml file with the basic structure, you can use the following command:

touch pyproject.toml

Then, open the pyproject.toml file and add the necessary fields. For example:

[tool.poetry]
name = "prettyhtml"
version = "0.8.5"
description = "This code can refactor your html in pretty dict"
authors = ["dima-on <sinica911@gmail.com>"]
license = "MIT"
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.12"
bs4 = "^0.0.2"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

Make sure to replace the content with the appropriate information for your project.

Methods

This file does not contain any methods, but it is essential for declaring dependencies and build requirements for your project.

[tool.poetry]: This section contains general project configuration fields like name, version, description, authors, license, and readme.
[tool.poetry.dependencies]: This section specifies the dependencies your project requires, listing the package names and their specific versions or version ranges.
[build-system]: This section describes the build system used for your project, specifying the required packages and the build-backend.
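
The version ranges in [tool.poetry.dependencies] use Poetry's caret syntax, which permits updates that leave the left-most non-zero version component unchanged. The helper below is a small sketch of that rule for plain numeric versions with up to three components; caret_range is a hypothetical name and is not part of Poetry, and pre-release tags or other constraint operators are not handled.

```python
def caret_range(constraint: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for a caret constraint."""
    parts = [int(piece) for piece in constraint.lstrip("^").split(".")]
    # Position of the left-most non-zero component; if every given
    # component is zero, the last given component is the one that bumps.
    pivot = next((i for i, part in enumerate(parts) if part != 0), len(parts) - 1)
    lower = parts + [0] * (3 - len(parts))  # pad to major.minor.patch
    upper = lower[:pivot] + [lower[pivot] + 1] + [0] * (2 - pivot)
    return ".".join(map(str, lower)), ".".join(map(str, upper))


print(caret_range("^3.12"))   # ('3.12.0', '4.0.0')
print(caret_range("^0.0.2"))  # ('0.0.2', '0.0.3')
```

Read against the example file, `python = "^3.12"` accepts any Python from 3.12 up to but not including 4.0, while `bs4 = "^0.0.2"` accepts only 0.0.2 up to but not including 0.0.3.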

Refer to the official PyPA pyproject.toml documentation for more information about the available fields and configuration options.

Project details


Release history

This version: 0.9.1 (Dec 12, 2024)

Download files

Source Distribution

pretty_html-0.9.1.tar.gz (5.3 kB view details)

Uploaded Dec 12, 2024 Source

Built Distribution

pretty_html-0.9.1-py3-none-any.whl (5.7 kB view details)

Uploaded Dec 12, 2024 Python 3

File details

Details for the file pretty_html-0.9.1.tar.gz.

File metadata

Download URL: pretty_html-0.9.1.tar.gz
Upload date: Dec 12, 2024
Size: 5.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.5 CPython/3.12.7 Windows/11

File hashes

Hashes for pretty_html-0.9.1.tar.gz
Algorithm	Hash digest
SHA256	`270e5c3d2e3e94668350c4e0d86800298086ea286b671d21b4a8f98a6876254b`
MD5	`8e95a9b2e329fdee16e17d208778b23a`
BLAKE2b-256	`e39d67adb43317e38657d282753df6233c7d77ce11d28543ea427507bdcd08c6`

File details

Details for the file pretty_html-0.9.1-py3-none-any.whl.

File metadata

Download URL: pretty_html-0.9.1-py3-none-any.whl
Upload date: Dec 12, 2024
Size: 5.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.5 CPython/3.12.7 Windows/11

File hashes

Hashes for pretty_html-0.9.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6cbe3b514a3b2caf1171480f9fe63a9be32cc93115d0c1de00cbe0681ff94702`
MD5	`a4f807e8e8997b384091cf33f820525b`
BLAKE2b-256	`753b56d12f2c9ec2d1c598b6ddf9c9655fb0019dcd13cc8c3cd2d9e6a714dacf`
