
cutility

Common utility functions for development

Installation

You can install cutility using pip:

pip install cutility
# upgrade to the latest version
pip install --upgrade cutility

Variables

What is project_root?

  • The directory that holds your src folder is your project_root.

What is data_root?

  • The directory that holds all your data folders is your data_root.
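The project_root definition above can be illustrated with a short sketch. The helper `find_project_root` and its `marker` parameter are hypothetical names introduced here for illustration; they are not part of cutility. The sketch walks upward from a starting path and returns the first directory that contains a `src` folder.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "src") -> Optional[Path]:
    """Return the nearest ancestor of `start` (or `start` itself) that holds `marker`.

    Hypothetical helper, not a cutility function.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None
```

Calling `find_project_root(Path.cwd())` from anywhere inside a project would return the directory that holds `src`, or `None` if no such directory exists.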

Usage

Data folders and logger

from cutility import cutils, logger

# Point data_root and config_root at the folders you prefer
cu = cutils.Cutils(
    data_root="path/to/data/folder",
    config_root="path/to/config/folder",  # currently only supports .yml files
    verbose=True,
)


log = logger.Logger()
log.i("This is info message")
# warning, critical, and debug messages are also supported
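For readers curious how a logger with short method names like `log.i(...)` can be built, here is a minimal sketch on top of the standard `logging` module. It is not cutility's implementation: the class name `SketchLogger` and the methods `w`, `d`, and `c` are assumptions made for illustration, and only `i` appears in the usage above.

```python
import logging


class SketchLogger:
    """Thin wrapper over the standard logging module (hypothetical, not cutility's Logger)."""

    def __init__(self, name: str = "app", level: int = logging.DEBUG) -> None:
        self._log = logging.getLogger(name)
        self._log.setLevel(level)
        # Attach a console handler only once, so repeated construction does not duplicate output.
        if not self._log.handlers:
            handler = logging.StreamHandler()
            handler.setFormatter(
                logging.Formatter("%(asctime)s %(levelname)s %(message)s")
            )
            self._log.addHandler(handler)

    def i(self, msg: str) -> None:
        self._log.info(msg)

    # The method names below are assumed; the source only documents `i`.
    def w(self, msg: str) -> None:
        self._log.warning(msg)

    def d(self, msg: str) -> None:
        self._log.debug(msg)

    def c(self, msg: str) -> None:
        self._log.critical(msg)
```

Check the package itself for the actual method names before relying on anything other than `i`.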

Text cleaner

# Import the TextCleaner class
from cleaners.text_cleaner import TextCleaner

# Create an instance of TextCleaner
tc = TextCleaner()

# Sample text for demonstration
sample_text = "Check out this link: https://example.com. 😎 #Python @user1"

# Step 1: Clean web links
text_without_links = tc.clean_web_links(sample_text)

# Step 2: Clean profile handles
text_without_handles = tc.clean_profile_handle(text_without_links)

# Step 3: Clean hashtags
text_without_hashtags = tc.clean_hashtags(text_without_handles)

# Step 4: Clean emojis
text_without_emojis = tc.clean_emojis(text_without_hashtags)

# Step 5: Clean extra spaces
final_cleaned_text = tc.clean_extra_spaces(text_without_emojis)
print(repr(final_cleaned_text))
# output
'Check out this link: '
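The five cleaning steps above can be approximated with the standard `re` module. This is a rough sketch, not TextCleaner's implementation: the patterns and the function name `clean_text` are assumptions, and the emoji pattern covers only a few common Unicode blocks.

```python
import re

# All patterns below are illustrative assumptions, not the ones cutility uses.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HANDLE_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Partial emoji coverage: pictographs, misc symbols/dingbats, regional indicators.
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]"
)
SPACES_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Apply the five steps in the same order as the example above."""
    text = URL_RE.sub("", text)       # Step 1: web links
    text = HANDLE_RE.sub("", text)    # Step 2: profile handles
    text = HASHTAG_RE.sub("", text)   # Step 3: hashtags
    text = EMOJI_RE.sub("", text)     # Step 4: emojis
    return SPACES_RE.sub(" ", text)   # Step 5: collapse runs of whitespace
```

Note that a handle pattern this simple would also strip the domain part of an email address, so a real pipeline would need to treat emails before handles or use a stricter pattern.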

PII cleaner

from cleaners.pii_cleaner import PiiCleaner
pc = PiiCleaner()
text_with_pii = "John's email is john.doe@example.com, and his phone number is +1 555-1234."

# Replace names with a generic string
text_without_names = pc.replace_names(text_with_pii, names_list=["John", "Doe", "Jane", "Smith"], repl='{{PERSON_NAME}}')

# Replace emails with a generic string
text_without_emails = pc.replace_emails(text_without_names, repl='{{EMAIL}}')

# Replace phone numbers with a generic string
text_without_contacts = pc.replace_contacts(text_without_emails, repl='{{PHONE}}')

print(text_without_contacts)
# output
"{{PERSON_NAME}}'s email is {{EMAIL}}, and his phone number is {{PHONE}}."
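The same three replacements can be sketched with plain regular expressions. This is an illustration under stated assumptions, not PiiCleaner's implementation: the patterns, the function names, and the `scrub` helper are all hypothetical, and the phone pattern is deliberately loose.

```python
import re
from typing import Iterable

# Illustrative patterns only; real-world email and phone formats are more varied.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{5,}\d")


def replace_names(text: str, names: Iterable[str], repl: str = "{{PERSON_NAME}}") -> str:
    """Replace whole-word, case-sensitive occurrences of the given names."""
    names = list(names)
    if not names:
        return text
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    # A function replacement keeps `repl` literal (no backslash processing).
    return pattern.sub(lambda _match: repl, text)


def scrub(text: str, names: Iterable[str]) -> str:
    """Names first, then emails, then phone numbers, as in the example above."""
    text = replace_names(text, names)
    text = EMAIL_RE.sub(lambda _match: "{{EMAIL}}", text)
    return PHONE_RE.sub(lambda _match: "{{PHONE}}", text)
```

Because the name match here is case-sensitive, the lowercase `john.doe` inside the email address survives the first step and is then replaced as a whole by the email pattern. A case-insensitive name match would need to run after the email step instead.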


