Text manipulation with XPath, regex, and clipboard actions.
RegexParser
Installation (one package covers both the XPath and regex parsers)
pip install cd-parser
Update
pip install -U cd-parser
All examples are in the examples folder.
A utility class for commonly used regex operations in Python.
Features
- Replace: Easily replace occurrences of a regex pattern with a new string.
- Find All: Retrieve all occurrences of a regex pattern in a string.
- Find First: Get the first occurrence of a regex pattern in a string.
- Find Before: Extract the portion of text immediately before a given substring.
- Find After: Fetch the portion of text immediately after a given substring.
- Find Between: Find text between two specified substrings.
- Is Match: Check if the input text matches a given regex pattern from the start.
- Split: Divide the input text using a provided regex pattern.
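As a rough illustration of what the operations listed above mean, here is a minimal sketch written against Python's standard `re` module only. It does not show cd_parser's own functions or their signatures; the sample string and all variable names are assumptions made for the example.

```python
import re

text = "order=42; status=shipped; note=fragile"

# Find First: the first match of a pattern, or None when nothing matches.
first = re.search(r"\d+", text)
first_match = first.group(0) if first else None

# Find Before / Find After: the text on either side of a literal substring.
before, _, after = text.partition("; status=")

# Find Between: non-greedy capture between two escaped literal markers.
between = re.search(re.escape("status=") + r"(.*?)" + re.escape(";"), text)
between_match = between.group(1) if between else None

# Is Match: re.match only succeeds when the pattern matches at the start.
starts_with_order = re.match(r"order=\d+", text) is not None
starts_with_status = re.match(r"status", text) is not None

# Split: divide the text on a regex delimiter.
parts = re.split(r";\s*", text)

print(first_match, before, after, between_match)
print(starts_with_order, starts_with_status, parts)
```

Note the difference between `re.match`, which anchors at the start of the string, and `re.search`, which scans the whole string; the "Is Match" description above corresponds to the former.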
Usage
Here are some example usages of the RegexParser class:
from cd_parser import regex
# Replace text
modified_text = regex.replace("old", "new", "This is the old text.")
print(modified_text) # Output: This is the new text.
# Find all matches
matches = regex.find_all("[A-Za-z]+", "123 apple 456 banana")
print(matches) # Output: ['apple', 'banana']
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
XpathParser
A simple and lightweight XPath parser class for extracting data from HTML/XML content. Built on top of the lxml library, it offers a variety of methods for precise element extraction based on various criteria.
Features
- Fetch multiple elements or a single element using a custom XPath query.
- Predefined methods for common XPath queries like selecting by tag, attribute, text, etc.
- Simple, user-friendly, and Pythonic API.
Usage
Initialization
Create an instance of the XpathParser class with your HTML/XML content:
from cd_parser import XpathParser
doc_text = """
<html>
<body>
<a id="link1" href="https://example.com/page1">Link 1</a>
<a id="link2" href="https://example.com/page2">Link 2</a>
</body>
</html>
"""
parser = XpathParser(doc_text)
Fetch Elements
Using custom XPath:
links = parser.get_elements('//a')
print([link.text for link in links])
Get a single element (the first match):
single_link = parser.get_element('//*[@id="link1"]')
# Compare against None: an lxml element with no child elements is falsy.
if single_link is not None:
    print(single_link.text)
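For readers who want to see what these two queries select without installing anything, the following is a minimal sketch using the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset. It is a stand-in for illustration only, not how XpathParser is implemented (the library is built on lxml), and it works here only because the sample document is well-formed XML.

```python
import xml.etree.ElementTree as ET

doc_text = """
<html>
<body>
<a id="link1" href="https://example.com/page1">Link 1</a>
<a id="link2" href="https://example.com/page2">Link 2</a>
</body>
</html>
"""

# Parse the sample document; strip() removes the leading newline.
root = ET.fromstring(doc_text.strip())

# Equivalent of the '//a' query: every <a> element below the root.
links = root.findall(".//a")
link_texts = [link.text for link in links]
print(link_texts)

# Equivalent of '//*[@id="link1"]': the first element with that id.
single_link = root.find(".//*[@id='link1']")
if single_link is not None:
    print(single_link.text, single_link.get("href"))
```

ElementTree paths are relative to the element they are called on, which is why the queries start with `.//` rather than `//`.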
Predefined Queries
Select all nodes:
all_nodes = parser.select_all_nodes()
Select by tag:
anchors = parser.select_by_tag("a")
Select by class attribute:
divs_with_class = parser.select_by_class("div", "my-class")
... and many more. Refer to the class docstrings for details on each method.
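Predefined queries of this kind are typically thin wrappers around XPath expressions. The sketch below shows one plausible mapping; the helper names `xpath_by_tag` and `xpath_by_class` are hypothetical and are not part of cd_parser, and the expressions are evaluated with the standard library's `xml.etree.ElementTree` rather than lxml.

```python
import xml.etree.ElementTree as ET


def xpath_by_tag(tag: str) -> str:
    """Hypothetical helper: expression selecting every element with this tag."""
    return f".//{tag}"


def xpath_by_class(tag: str, class_name: str) -> str:
    """Hypothetical helper: exact match on the whole class attribute value."""
    return f".//{tag}[@class='{class_name}']"


root = ET.fromstring(
    '<body>'
    '<div class="my-class">A</div>'
    '<div class="other">B</div>'
    '<a href="#">C</a>'
    '</body>'
)

anchors = root.findall(xpath_by_tag("a"))
divs_with_class = root.findall(xpath_by_class("div", "my-class"))

print([a.text for a in anchors])
print([d.text for d in divs_with_class])
```

The class predicate here compares the full attribute value, so an element with `class="my-class extra"` would not match; whether the library's own method behaves the same way is not stated in this README.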
Clipboard
from cd_parser import clipboard as cb
text = cb.copy("text you want to copy")
print(cb.paste())
Contributing
Feel free to fork the repository, make your changes, and submit pull requests. We appreciate all contributions!
All examples are in the examples folder.
License
MIT License
More documentation at: Code Docta