
Text manipulation with XPath, regex, and clipboard actions.


RegexParser

Installation (covers both the XPath and regex parsers)

pip install cd-parser

Update

pip install -U cd-parser

All examples are in the examples folder.

A utility class for commonly used regex operations in Python.

Features

  • Replace: Easily replace occurrences of a regex pattern with a new string.
  • Find All: Retrieve all occurrences of a regex pattern in a string.
  • Find First: Get the first occurrence of a regex pattern in a string.
  • Find Before: Extract the portion of text immediately before a given substring.
  • Find After: Fetch the portion of text immediately after a given substring.
  • Find Between: Find text between two specified substrings.
  • Is Match: Check if the input text matches a given regex pattern from the start.
  • Split: Divide the input text using a provided regex pattern.
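
Several of the operations listed above can be approximated with Python's standard `re` module. The sketch below only illustrates the described behaviour and is not cd_parser's implementation; the function names and argument order are hypothetical and may differ from the library's.

```python
import re

# Hypothetical helpers built on the standard library, for illustration only.

def find_first(pattern, text):
    """Return the first match of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None

def find_before(text, marker):
    """Return the text before the first occurrence of marker, or None."""
    head, sep, _ = text.partition(marker)
    return head if sep else None

def find_after(text, marker):
    """Return the text after the first occurrence of marker, or None."""
    _, sep, tail = text.partition(marker)
    return tail if sep else None

def find_between(text, start, end):
    """Return the shortest text between start and end, or None."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1) if match else None

def is_match(pattern, text):
    """Check whether text matches pattern starting at its first character."""
    return re.match(pattern, text) is not None

def split(pattern, text):
    """Split text wherever pattern matches."""
    return re.split(pattern, text)

sample = "name: Ada; role: engineer"
print(find_first(r"[a-z]+", sample))          # name
print(find_before(sample, ";"))               # name: Ada
print(find_after(sample, "; "))               # role: engineer
print(find_between(sample, "name: ", ";"))    # Ada
print(is_match(r"name", sample))              # True
print(is_match(r"role", sample))              # False, "role" is not at the start
print(split(r";\s*", sample))                 # ['name: Ada', 'role: engineer']
```

The two `is_match` calls show what "from the start" means: `re.match` only succeeds when the pattern matches at the first character, whereas `re.search` (used in `find_first`) scans the whole string.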

Usage

Here are some example usages of the RegexParser class:

from cd_parser import regex


# Replace text
modified_text = regex.replace("old", "new", "This is the old text.")
print(modified_text)  # Output: "This is the new text."

# Find all matches
matches = regex.find_all("[A-Za-z]+", "123 apple 456 banana")
print(matches)  # Output: ['apple', 'banana']


Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT



XpathParser

A simple and lightweight XPath parser class for extracting data from HTML/XML content. Built on top of the lxml library, it offers a variety of methods for precise element extraction based on various criteria.

Features

  • Fetch multiple elements or a single element using a custom XPath query.
  • Predefined methods for common XPath queries like selecting by tag, attribute, text, etc.
  • Simple, user-friendly, and Pythonic API.

Usage

Initialization

Create an instance of the XpathParser class with your HTML/XML content:

from cd_parser import XpathParser

doc_text = """
<html>
    <body>
        <a id="link1" href="https://example.com/page1">Link 1</a>
        <a id="link2" href="https://example.com/page2">Link 2</a>
    </body>
</html>
"""

parser = XpathParser(doc_text)

Fetch Elements

Using custom XPath:

links = parser.get_elements('//a')
print([link.text for link in links])  # Output: ['Link 1', 'Link 2']

Get a single element (the first match):

single_link = parser.get_element('//*[@id="link1"]')
if single_link is not None:  # childless lxml elements are falsy, so compare with None
    print(single_link.text)
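
To make the return values concrete, the sketch below runs the same two lookups against the sample document with `xml.etree.ElementTree` from the standard library. This is a stand-in chosen so the snippet has no dependencies; it is not how XpathParser is implemented, and ElementTree supports only a subset of XPath (note the leading `.` in the expressions).

```python
import xml.etree.ElementTree as ET

doc_text = """
<html>
    <body>
        <a id="link1" href="https://example.com/page1">Link 1</a>
        <a id="link2" href="https://example.com/page2">Link 2</a>
    </body>
</html>
"""

root = ET.fromstring(doc_text.strip())

# Counterpart of an "all matches" query such as //a
links = root.findall(".//a")
link_texts = [link.text for link in links]
print(link_texts)  # ['Link 1', 'Link 2']

# Counterpart of a "first match" query; find() returns None when nothing matches
first = root.find(".//*[@id='link1']")
first_href = first.get("href") if first is not None else None
print(first_href)  # https://example.com/page1

missing = root.find(".//*[@id='nope']")
print(missing is None)  # True
```

The explicit `is not None` check matters: in both ElementTree and lxml, an element with no child elements is falsy, so a bare `if element:` can skip a valid match.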

Predefined Queries

Select all nodes:

all_nodes = parser.select_all_nodes()

Select by tag:

anchors = parser.select_by_tag("a")

Select by class attribute:

divs_with_class = parser.select_by_class("div", "my-class")

Many more predefined queries are available. Refer to the class docstrings for details on each method.
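
Predefined queries of this kind usually wrap a fixed XPath expression. As an illustration only, the sketch below builds comparable expressions by hand and evaluates them with the standard library's `xml.etree.ElementTree`; the helper names are hypothetical, and the expressions XpathParser actually uses may differ (for example in how multi-valued class attributes are matched).

```python
import xml.etree.ElementTree as ET

# Hypothetical helpers: they only build XPath strings and are not part of cd_parser.
def xpath_for_tag(tag):
    return f".//{tag}"

def xpath_for_class(tag, class_name):
    # Exact match on the whole class attribute value.
    return f".//{tag}[@class='{class_name}']"

root = ET.fromstring(
    "<html><body>"
    "<div class='my-class'>first</div>"
    "<div class='other'>second</div>"
    "<a href='https://example.com/page1'>Link 1</a>"
    "</body></html>"
)

anchor_hrefs = [a.get("href") for a in root.findall(xpath_for_tag("a"))]
div_texts = [d.text for d in root.findall(xpath_for_class("div", "my-class"))]

print(anchor_hrefs)  # ['https://example.com/page1']
print(div_texts)     # ['first']
```

An exact comparison like this would miss an element written as `class="my-class extra"`; full XPath engines such as lxml can handle that case with `contains()`, which ElementTree does not support.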

Clipboard

from cd_parser import clipboard as cb

text = cb.copy("text you want to copy")

print(cb.paste())

Contributing

Feel free to fork the repository, make your changes, and submit pull requests. We appreciate all contributions!



License

MIT License

More documentation at: Code Docta
