A collection of useful functions

Development Status
- 4 - Beta
Environment
- Other Environment
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Software Development :: Libraries :: Python Modules
- Utilities

Project description

Custom Utilities (cutil)

Developed using Python 3.5 (requires Python 3.4.2 or later)

Dependencies

General

BeautifulSoup4
psycopg2
Requests
PIL - pip3 install pillow

For Ubuntu Server (to be able to install pillow)

$ sudo apt-get install python-imaging
$ sudo apt-get install libjpeg8 libjpeg62-dev libfreetype6 libfreetype6-dev
$ sudo pip install pillow

If using Selenium

Selenium

Install

$ pip3 install cutil
$ pip3 install cutil[postgres]

Usage

import cutil

fn: cprint

This keeps printing on the same line by clearing the line and printing the new message. To move down to a new line, put a \n at the end of your message. To use:

cutil.cprint("Items saved: x")
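The same-line behaviour described above can be sketched with plain terminal control characters. This is a minimal illustration that assumes an ANSI-capable terminal; `same_line_print` is a hypothetical helper and not cutil's own implementation.

```python
import io
import sys


def same_line_print(message, stream=sys.stdout):
    """Hypothetical helper: overwrite the current terminal line with message."""
    # "\r" moves the cursor to column 0, "\x1b[K" erases to the end of the line
    stream.write("\r\x1b[K" + message)
    stream.flush()


# Write to an in-memory stream so the raw characters can be inspected
demo = io.StringIO()
for count in range(3):
    same_line_print("Items saved: {}".format(count), stream=demo)
```

On a real terminal only the last message stays visible, because each call erases the line before writing.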

fn: bprint

This is what I call block printing. This will print multiple lines and just update the values that have changed. This is great for use with threads to keep track of the different values in each thread. This one requires a little bit of setup:

# Set up block printing
# { <name>: [<display text>, ''], ...}
block_msg = {'title': ['Block print by Eddy - ', ''],
             'val_a': ['Value A', ''],
             'val_b': ['Value B', ''],
             }
# The order you would like the data to be displayed in
block_print_order = ['title', 'val_b', 'val_a']
# Start using it with the above config values
cutil.enable_bprint(block_msg, block_print_order)

Then to use, all you need to do is:

cutil.bprint(<value>, <name>)

<value> can be any data you want to display, and <name> is the key of the item in the block_msg dict set up above. You should always have a title entry: it is always updated with the current time, so you can tell the script is not frozen when no data is changing. If you later decide to stop using bprint in your script, call disable_bprint() to stop it and print to the terminal normally.

fn: threads

Params:

num_threads - Type: Int - Positional argument - Number of threads to run. Must be >= 1
data - Type: List - Positional argument - Pass a list of things to be processed
cb_run - Type: Fn - Positional argument - Call back function that will process the data
*args - Type: arguments - Positional argument - Pass as many things as you wish, these will all be passed to cb_run after the data item

Parse data using x threads with just 1 line of code. This will wait until all data is done being processed before moving on. It is safe to call threads from inside other threads (threadception).
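To make the parameters concrete, here is a rough sketch of the same calling pattern built on the standard library. `run_in_threads` is a hypothetical stand-in, and the queue-based worker design is an assumption, not how cutil is known to do it.

```python
import queue
import threading


def run_in_threads(num_threads, data, cb_run, *args):
    """Hypothetical helper: process every item in data with num_threads workers."""
    if num_threads < 1:
        raise ValueError("num_threads must be >= 1")
    work = queue.Queue()
    for item in data:
        work.put(item)

    def worker():
        while True:
            try:
                item = work.get_nowait()
            except queue.Empty:
                return  # Nothing left to process
            # The data item comes first, then the extra args
            cb_run(item, *args)

    workers = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in workers:
        t.start()
    # Block until every worker has drained the queue
    for t in workers:
        t.join()


results = []
results_lock = threading.Lock()


def square(item, out):
    with results_lock:
        out.append(item * item)


run_in_threads(3, [1, 2, 3, 4], square, results)
```

The order of `results` depends on thread scheduling, so sort it if order matters.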

fn: create_path

Params:

path - Type: String - Positional argument - Path to be created
is_dir - Type: Boolean - Named argument - Default: False - If the path is a dir set to True. If the path includes the filename, set to False.

Creates the folder path so it can be used

fn: dump_json

Save data to a json file with the options sort_keys=True and indent=4. Will create the path if it does not already exist.

Params:

file_ - Type: String - Positional argument - Where to save the file to (include filename)
data - Type: List/Dict - Positional argument - Data to be dumped into a json file
**kwargs - Type: Named args - Named arguments - Args that will be passed to json.dump()
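The behaviour described above roughly corresponds to the following standard-library sketch. `dump_json_sketch` is a hypothetical name, and letting caller kwargs override the two default options is an assumption.

```python
import json
import os
import tempfile


def dump_json_sketch(file_, data, **kwargs):
    """Hypothetical helper: write data as JSON, creating parent dirs first."""
    parent = os.path.dirname(file_)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # Defaults taken from the description; assumed to be overridable
    options = {"sort_keys": True, "indent": 4}
    options.update(kwargs)
    with open(file_, "w") as f:
        json.dump(data, f, **options)


target = os.path.join(tempfile.mkdtemp(), "nested", "out.json")
dump_json_sketch(target, {"b": 1, "a": 2})
with open(target) as f:
    saved_text = f.read()
```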

fn: get_script_name

Params:

ext - Type: Boolean - Named argument - Default: False - Should the extension be returned as part of the name.

Returns the name of the script being run, without the directory path

fn: chunks_of

Yields lists of a set size from another list

Params:

max_chunk_size - Type: Int - Positional argument - The max length of each list that is yielded. The last list yielded may be smaller
list_to_chunk - Type: List - Positional argument - The list to chunk up
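A fixed-size chunking generator of this kind can be sketched as follows. `chunks_of_sketch` is a hypothetical name that only mirrors the documented parameters.

```python
def chunks_of_sketch(max_chunk_size, list_to_chunk):
    """Hypothetical helper: yield slices of at most max_chunk_size items."""
    for start in range(0, len(list_to_chunk), max_chunk_size):
        yield list_to_chunk[start:start + max_chunk_size]


chunks = list(chunks_of_sketch(2, [1, 2, 3, 4, 5]))
# chunks is [[1, 2], [3, 4], [5]]
```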

fn: split_into

Yields at most max_num_chunks lists from another list

Params:

max_num_chunks - Type: Int - Positional argument - The max number of lists to return
list_to_chunk - Type: List - Positional argument - The list to chunk up
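One way to cap the number of chunks is to derive a chunk size from the list length. The sketch below assumes that approach; `split_into_sketch` is a hypothetical name and cutil may divide the items differently.

```python
import math


def split_into_sketch(max_num_chunks, list_to_chunk):
    """Hypothetical helper: yield at most max_num_chunks slices."""
    if not list_to_chunk:
        return
    # Round up so the items never need more than max_num_chunks slices
    chunk_size = math.ceil(len(list_to_chunk) / max_num_chunks)
    for start in range(0, len(list_to_chunk), chunk_size):
        yield list_to_chunk[start:start + chunk_size]


parts = list(split_into_sketch(3, [1, 2, 3, 4, 5, 6, 7]))
# parts is [[1, 2, 3], [4, 5, 6], [7]]
```

With this approach fewer than max_num_chunks lists can come back, for example 4 items split into 3 gives 2 lists of 2.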

fn: get_file_ext

Params:

file - Type: String - Positional argument - Name or path of the file

Returns just the extension of the file, including the leading dot (.)

fn: norm_path

Returns a normalized path for the current OS with variables expanded

Params:

path - Type: String - Positional argument - Path to be fixed up

fn: create_hashed_path

Create a directory structure using the hashed filename

Returns the tuple (full_path, filename_hash). full_path does not include the filename

Params:

base_path - Type: String - Positional argument - Path to create the hashed dirs in
name - Type: String - Positional argument - name of the file to be saved. Used to create the dir hash

fn: parse_price

Parse a string to get a low and high price as a float.

Returns a dict with keys low and high. If there is just 1 price in the string, low will be set and high will be None

Params:

price - Type: String - Positional argument - Price to parse
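As an illustration of the return shape, here is a small regex-based sketch. `parse_price_sketch` is hypothetical, and the number pattern (digits, optional thousands commas, optional decimals) is an assumption about which inputs matter.

```python
import re


def parse_price_sketch(price):
    """Hypothetical helper: pull up to two prices out of a string."""
    # Matches numbers such as 10, 10.50 or 1,299.99
    found = re.findall(r"\d[\d,]*(?:\.\d+)?", price)
    values = [float(number.replace(",", "")) for number in found]
    return {
        "low": values[0] if values else None,
        "high": values[1] if len(values) > 1 else None,
    }


price_range = parse_price_sketch("$10.50 - $1,299.99")
single_price = parse_price_sketch("Only $5")
```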

fn: get_epoch

Returns int(time.time())

fn: get_datetime

Returns datetime.datetime.now()

fn: datetime_to_str

Converts a datetime to a json formatted string

Params:

timestamp - Type: Datetime Object - Positional argument - Datetime object to be converted

fn: datetime_to_utc

Converts a datetime with timezone to utc datetime

Params:

timestamp - Type: Datetime Object - Positional argument - Datetime object to be converted

fn: str_to_date

Converts a string date/time to a datetime object

Params:

timestamp - Type: String - Positional argument - String to be formatted
formats - Type: List/Tuple - Named argument - Default: ["%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z"] - The format(s) that the string being passed in might be
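Trying each format in turn, then converting to UTC, can be sketched with the standard library as below. `str_to_date_sketch` is a hypothetical name, and raising ValueError when nothing matches is an assumption about the failure behaviour.

```python
from datetime import datetime, timezone

# The default formats listed above
DEFAULT_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z")


def str_to_date_sketch(timestamp, formats=DEFAULT_FORMATS):
    """Hypothetical helper: parse timestamp with the first format that fits."""
    for fmt in formats:
        try:
            return datetime.strptime(timestamp, fmt)
        except ValueError:
            continue  # Try the next candidate format
    raise ValueError("No format matched: {!r}".format(timestamp))


parsed = str_to_date_sketch("2017-01-03T10:30:00+0200")
# Converting a timezone-aware datetime to UTC
as_utc = parsed.astimezone(timezone.utc)
```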

fn: multikey_sort

Sort a list of dicts by multiple keys

Source: https://stackoverflow.com/questions/1143671/python-sorting-list-of-dictionaries-by-multiple-keys

Params:

items - Type: List - Positional argument - List of dicts to be sorted
columns - Type: List/Tuple - Positional argument - List of keys to sort by
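A Python 3 sketch of a multi-key sort in the spirit of the linked Stack Overflow question is shown below. The leading `-` for descending order is a convention from answers there; whether cutil supports it is an assumption, and `multikey_sort_sketch` is a hypothetical name.

```python
from functools import cmp_to_key
from operator import itemgetter


def multikey_sort_sketch(items, columns):
    """Hypothetical helper: sort dicts by several keys, '-' means descending."""
    comparers = []
    for col in columns:
        if col.startswith("-"):
            comparers.append((itemgetter(col[1:]), -1))
        else:
            comparers.append((itemgetter(col), 1))

    def compare(left, right):
        # The first key that differs decides the order
        for getter, direction in comparers:
            a, b = getter(left), getter(right)
            if a != b:
                return direction * (1 if a > b else -1)
        return 0

    return sorted(items, key=cmp_to_key(compare))


rows = [
    {"name": "b", "age": 30},
    {"name": "a", "age": 30},
    {"name": "c", "age": 25},
]
ordered = multikey_sort_sketch(rows, ["-age", "name"])
```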

fn: get_internal_ip

Returns the local ip address of the computer

fn: generate_key

Returns a random string

Params:

value - Type: String/int/etc. - Named argument - Default: random int - Value to be encoded to create the return string
salt - Type: String/int/etc. - Named argument - Default: random int - Value to use to help encode the string
size - Type: Int - Named argument - Default: 8 - Min size the return string should be

fn: create_uid

Returns uuid.uuid4().hex

fn: sanitize

Replaces the following characters in a string and returns the new string:

# ['<replace this>', '<with this>']
['\\', '-'], [':', '-'], ['/', '-'],
['?', ''], ['<', '>'], ['`', '`'],
['|', '-'], ['*', '`'], ['"', '\'']
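Applying the pairs listed above one after another could look like the sketch below. `sanitize_sketch` is a hypothetical name, and applying the pairs in the listed order is an assumption.

```python
# The replacement pairs copied from the list above
REPLACEMENTS = [
    ["\\", "-"], [":", "-"], ["/", "-"],
    ["?", ""], ["<", ">"], ["`", "`"],
    ["|", "-"], ["*", "`"], ['"', "'"],
]


def sanitize_sketch(string):
    """Hypothetical helper: apply each replacement pair in order."""
    for old, new in REPLACEMENTS:
        string = string.replace(old, new)
    return string


cleaned = sanitize_sketch('a/b:c?*"d"')
```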

fn: rreplace

Params:

s - Type: String - Positional argument - String to perform the replace action on
old - Type: String - Positional argument - The string to be replaced
new - Type: String - Positional argument - The string to replace old
occurrence - Type: Int - Positional argument - From the right, how many times to replace
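A right-to-left replace is commonly written with `str.rsplit`. The sketch below uses that idiom; `rreplace_sketch` is a hypothetical name.

```python
def rreplace_sketch(s, old, new, occurrence):
    """Hypothetical helper: replace the last `occurrence` matches of old."""
    # rsplit cuts from the right at most `occurrence` times
    parts = s.rsplit(old, occurrence)
    return new.join(parts)


result = rreplace_sketch("a.b.c.d", ".", "-", 2)
# result is "a.b-c-d"
```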

fn: flatten

Params:

dict_obj - Type: Dict - Positional argument - Dict of dicts to be flattened
prev_key - Type: String - Named argument - Default: blank str - Not used by the user, used when the fn calls itself
sep - Type: String - Named argument - Default: _ - The string to separate the dict keys
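The recursion implied by prev_key can be sketched like this. `flatten_sketch` is a hypothetical name, and converting keys with `str()` is an assumption.

```python
def flatten_sketch(dict_obj, prev_key="", sep="_"):
    """Hypothetical helper: collapse nested dicts into one level of keys."""
    flat = {}
    for key, value in dict_obj.items():
        new_key = prev_key + sep + str(key) if prev_key else str(key)
        if isinstance(value, dict):
            # Recurse, carrying the joined key down as the prefix
            flat.update(flatten_sketch(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


flat = flatten_sketch({"a": {"b": 1, "c": {"d": 2}}, "e": 3})
# flat is {"a_b": 1, "a_c_d": 2, "e": 3}
```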

fn: update_dict

Update a dict with another dict with nested keys

Params:

d - Type: Dict - Positional argument - Dict to update
u - Type: Dict - Positional argument - Dict to combine with d

Returns a new dict with the combined keys
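A nested merge of this kind can be sketched recursively. `update_dict_sketch` is a hypothetical name, and leaving the input dicts unmodified is an assumption.

```python
def update_dict_sketch(d, u):
    """Hypothetical helper: merge u into a copy of d, descending into dicts."""
    merged = dict(d)
    for key, value in u.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts, so merge them instead of overwriting
            merged[key] = update_dict_sketch(merged[key], value)
        else:
            merged[key] = value
    return merged


base = {"a": {"x": 1, "y": 2}, "b": 1}
combined = update_dict_sketch(base, {"a": {"y": 3, "z": 4}})
```

A plain `dict.update` would replace the whole `"a"` value and lose `"x"`, which is the case a nested merge avoids.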

fn: make_url_safe

Params:

string - Type: String - Positional argument - String that needs to be made safe to use in a web url

Returns the string with the converted chars, uses urllib.parse.quote_plus(string)
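For reference, this is what `urllib.parse.quote_plus` does to a sample string. The input value is only an illustration.

```python
from urllib.parse import quote_plus

# Spaces become "+", reserved characters are percent-encoded
safe = quote_plus("red shoes & socks/2017")
# safe is "red+shoes+%26+socks%2F2017"
```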

fn: get_image_dimension

Params:

url - Type: String - Positional argument - URL of the image to get the width and height from

Returns a dict with the keys width and height

fn: crop_image

Returns the path of the cropped image

Params:

image_file - Type: String - Positional argument - Path to the image to be cropped
output_file - Type: String - Named argument - Default: None - Required - Path to save the cropped image to
height - Type: Int - Named argument - Default: None - Required - Height the cropped image should be
width - Type: Int - Named argument - Default: None - Required - Width the cropped image should be
x - Type: Int - Named argument - Default: None - Required - x coordinate of the top left corner of the area to crop
y - Type: Int - Named argument - Default: None - Required - y coordinate of the top left corner of the area to crop

Decorators

fn: rate_limited

Set a rate limit on a function.

Modified from https://github.com/tomasbasham/ratelimit/tree/0ca5a616fa6d184fa180b9ad0b6fd0cf54c46936

Params:

num_calls - Type: Integer/Float - Named Argument - Maximum method invocations within a period. Must be greater than 0.
every - Type: Integer/Float - Named Argument - A dampening factor (in seconds). Can be any number greater than 0.
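One simple reading of these two parameters is "at most num_calls calls every `every` seconds", enforced by spacing calls evenly. The sketch below assumes that reading; `rate_limited_sketch` is a hypothetical name and the real decorator may behave differently.

```python
import functools
import threading
import time


def rate_limited_sketch(num_calls=1, every=1.0):
    """Hypothetical decorator: space calls at least every/num_calls apart."""
    min_interval = every / float(num_calls)

    def decorator(func):
        lock = threading.Lock()
        last_called = [None]  # Mutable cell shared by all calls

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with lock:
                if last_called[0] is not None:
                    wait = min_interval - (time.monotonic() - last_called[0])
                    if wait > 0:
                        time.sleep(wait)
                last_called[0] = time.monotonic()
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rate_limited_sketch(num_calls=2, every=0.2)
def ping():
    return time.monotonic()


started = time.monotonic()
stamps = [ping() for _ in range(3)]
elapsed = time.monotonic() - started
```

The first call runs immediately and each later call waits about 0.1 seconds here, so three calls take roughly 0.2 seconds.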

fn: timeit

Pass in a function and the name of the stat.

Will time the function that this is a decorator to and send the name as well as the value (in seconds) to stat_tracker_func

Params:

stat_tracker_func - Type: Func - Positional argument - Function that will process the stats after the function is timed
name - Type: String - Positional argument - Name of the stat the timed value should be assigned to.

Use it like a regular decorator:

import time
import cutil

def save_stat(stat_name, value):
    # Receives the stat name and the elapsed time in seconds
    print(stat_name, value)

@cutil.timeit(save_stat, 'some_name')
def fn_to_time():
    time.sleep(1)

If you want to pass a func in a class as stat_tracker_func, then in the class __init__ you will have to set the decorator like so:

# self.fn_to_time - a function in the class
# self.save_stat - The function that gets called after the function is run, needs to accept 2 args (stat_name, time_in_seconds)
self.fn_to_time = cutil.timeit(self.save_stat, 'some_name')(self.fn_to_time)

Regex

fn: get_proxy_parts

Break a proxy string into a dict of its parts

Params:

proxy - Type: String - Positional argument - the proxy string

Returns:

A dict with the following parts (the keys are always present and are set to None if the part is not found)

{'schema': None,
 'user': None,
 'password': None,
 'host': None,
 'port': None  # Will default to 80 if no port is found
}
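To show how such a dict might be filled, here is a regex-based sketch. `proxy_parts_sketch` and its pattern are hypothetical, and returning the port as an int is an assumption.

```python
import re

# Optional schema://, optional user[:password]@, host, optional :port
PROXY_RE = re.compile(
    r"^(?:(?P<schema>[a-zA-Z][\w+.-]*)://)?"
    r"(?:(?P<user>[^:@/]+)(?::(?P<password>[^@/]*))?@)?"
    r"(?P<host>[^:/@]+)"
    r"(?::(?P<port>\d+))?/?$"
)


def proxy_parts_sketch(proxy):
    """Hypothetical helper: split a proxy string into its parts."""
    parts = {"schema": None, "user": None, "password": None,
             "host": None, "port": None}
    match = PROXY_RE.match(proxy.strip())
    if match:
        parts.update(match.groupdict())
        # Fall back to port 80 when the string has none
        parts["port"] = int(parts["port"]) if parts["port"] else 80
    return parts


full = proxy_parts_sketch("http://user:pass@10.0.0.1:8080")
bare = proxy_parts_sketch("10.0.0.1")
```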

fn: remove_html_tag

Returns the string with the html tag and all of its contents removed

Params:

input_str - Type: String/Soup Object - Named argument - Default: '' - Required - The html content to remove the tag data from. Can be a string or a beautiful soup object (gets converted to a string in the function)
tag - Type: String - Named argument - Default: None - Required - The tag name without the brackets. If None, the input_str is returned without change.
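Since this function sits under Regex, a regex-based sketch of the idea is shown here. `remove_html_tag_sketch` and its pattern are hypothetical, and the pattern does not handle nested tags of the same name.

```python
import re


def remove_html_tag_sketch(input_str="", tag=None):
    """Hypothetical helper: drop <tag ...>...</tag> blocks from the input."""
    if tag is None:
        return input_str
    pattern = re.compile(
        r"<{0}\b[^>]*>.*?</{0}>".format(re.escape(tag)),
        re.DOTALL | re.IGNORECASE,
    )
    # str() also covers objects such as a BeautifulSoup tree
    return pattern.sub("", str(input_str))


html = "<p>keep</p><script>var a = 1;</script><p>too</p>"
stripped = remove_html_tag_sketch(html, "script")
```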

Classes

cutil.RepeatingTimer

fn: __init__

Params:

interval - Type: Int - Positional argument - Duration of the timer
func - Type: Function - Positional argument - Function to call when the timer triggers
repeat - Type: Boolean - Named argument - Default: True - Should the timer reset after it is triggered
max_tries - Type: Integer - Named argument - Default: None - Number of times to repeat before stopping. If None it will run until you manually stop it.
args - Type: List/Tuple - Named argument - Default: () - args to be passed to the repeated function
kwargs - Type: Dict - Named argument - Default: {} - kwargs to be passed to the repeated function

*The __init__ will not start the timer.

fn: start

Starts the timer

Params: N/A

fn: cancel

Stop/disable the timer

Params: N/A

fn: reset

Stop/disable the timer and start it again

Params: N/A
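The start/cancel/reset life cycle can be sketched on top of `threading.Timer`. `RepeatingTimerSketch` is a hypothetical class that only mirrors the documented parameters; treating interval as seconds is an assumption.

```python
import threading
import time


class RepeatingTimerSketch:
    """Hypothetical class: call func every interval seconds until stopped."""

    def __init__(self, interval, func, repeat=True, max_tries=None,
                 args=(), kwargs=None):
        self.interval = interval
        self.func = func
        self.repeat = repeat
        self.max_tries = max_tries
        self.args = args
        self.kwargs = kwargs or {}
        self.tries = 0
        self._timer = None  # Nothing runs until start() is called

    def _run(self):
        self.tries += 1
        self.func(*self.args, **self.kwargs)
        if self.repeat and (self.max_tries is None
                            or self.tries < self.max_tries):
            self.start()  # Schedule the next run

    def start(self):
        self._timer = threading.Timer(self.interval, self._run)
        self._timer.daemon = True
        self._timer.start()

    def cancel(self):
        if self._timer is not None:
            self._timer.cancel()

    def reset(self):
        self.cancel()
        self.start()


ticks = []
timer = RepeatingTimerSketch(0.05, ticks.append, max_tries=3, args=("tick",))
timer.start()
time.sleep(0.5)
timer.cancel()
```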

cutil.Database

* Currently only supports postgres/redshift

fn: __init__

Params:

db_config - Type: Dict - Positional argument - Dictionary with the keys db_name, db_user, db_host, db_pass, db_port
table_raw - Type: String - Named argument - Default: None - The table that you are inserting data into
max_connections - Type: Int - Named argument - Default: 10 - The size of the db pool

fn: getcursor

Use this to get a cursor for making db calls. It commits the data on success and rolls back if there is an error. Any errors/exceptions that occur are passed back to the caller.

try:
    with db.getcursor() as cur:
        cur.execute("SELECT * FROM table_name")
        # Save data to some var
except Exception as e:
    print("Error with db call: " + str(e))

fn: close

This will close all connections that were created.

fn: insert

This builds a proper bulk insert query. Returns a list of the returned column values for all rows inserted.

Params:

table - Type: String - Positional argument - Table that data should be inserted into. Include schema.
data_list - Type: List/Dict - Positional argument - List or Dict of data to insert. If list, must be a list of dicts
return_cols - Type: String/List - Named argument - Default: id - Field or list of fields to return for the rows affected.

fn: upsert

This builds a proper bulk upsert query. Returns a list of the returned column values for all rows affected.

Params:

table - Type: String - Positional argument - Table that data should be inserted into. Include schema.
data_list - Type: List/Dict - Positional argument - List or Dict of data to insert. If list, must be a list of dicts
on_conflict_fields - Type: String/List - Positional argument - Field name or list of field names that will trigger a conflict
on_conflict_action - Type: String - Named argument - Default: update - Action to take when ON CONFLICT is triggered. By default it updates the fields given in update_fields; if nothing is passed, the action is DO NOTHING
on_conflict_where - Type: String - Named Argument - Default: None - WHERE clause for the on conflict fields, used if your table has a partial index on it. (DO NOT start with WHERE)
update_fields - Type: String/List - Named argument - Default: None - Field or list of fields to be updated when on_conflict_action is set to update. The default uses all the fields minus the fields used in on_conflict_fields.
return_cols - Type: String/List - Named argument - Default: id - Field or list of fields to return for the rows affected.
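To show how these parameters could map onto a PostgreSQL statement, here is a sketch that builds a single-row upsert string. `build_upsert_sql` is hypothetical and not the query cutil generates; the fallback to DO NOTHING when no fields are left to update is an assumption, and identifiers are interpolated directly, so they must come from trusted code.

```python
def build_upsert_sql(table, columns, on_conflict_fields,
                     update_fields=None, return_cols="id"):
    """Hypothetical helper: build an INSERT ... ON CONFLICT statement."""

    def as_list(value):
        # Accept either a single field name or a list of them
        return [value] if isinstance(value, str) else list(value)

    conflict = as_list(on_conflict_fields)
    returning = as_list(return_cols)
    if update_fields is None:
        # Default: every column that is not part of the conflict target
        update = [c for c in columns if c not in conflict]
    else:
        update = as_list(update_fields)

    if update:
        action = "DO UPDATE SET " + ", ".join(
            "{0} = EXCLUDED.{0}".format(c) for c in update)
    else:
        action = "DO NOTHING"

    return (
        "INSERT INTO {table} ({cols}) VALUES ({vals}) "
        "ON CONFLICT ({conflict}) {action} RETURNING {ret}"
    ).format(
        table=table,
        cols=", ".join(columns),
        # Named placeholders in the style psycopg2 accepts
        vals=", ".join("%({})s".format(c) for c in columns),
        conflict=", ".join(conflict),
        action=action,
        ret=", ".join(returning),
    )


sql = build_upsert_sql("public.items", ["sku", "price"], "sku")
```

The row values would then be passed separately as a dict to the cursor's execute call, so only the values go through parameter binding.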

fn: update

Returns a list of the returned column values for all rows updated (this is currently faked by using the data passed in).

Params:

table - Type: String - Positional argument - Table that data should be updated in. Include schema.
data_list - Type: List/Dict - Positional argument - List or Dict of data to update. If list, must be a list of dicts
matched_field - Type: String - Named argument - Default: id - The field used to match the row to update.
return_cols - Type: String/List - Named argument - Default: id - Field or list of fields to return for the rows affected.

Release history

- 3.0.3: Jun 12, 2023 (this version)
- 3.0.2: Oct 29, 2021
- 3.0.1: Apr 7, 2021
- 3.0.0: Jan 8, 2020
- 2.7.1: Jun 29, 2019
- 2.7.0: Jun 29, 2019
- 2.6.10: May 24, 2019
- 2.6.9: May 23, 2019
- 2.6.8: Sep 18, 2018
- 2.6.7: Jul 13, 2018
- 2.6.6: Jun 1, 2018
- 2.6.5: Mar 19, 2018
- 2.6.4: Feb 22, 2018
- 2.6.3: Feb 22, 2018
- 2.6.2: Feb 21, 2018
- 2.6.1: Feb 12, 2018
- 2.6.0: Jan 24, 2018
- 2.5.10: Jan 23, 2018
- 2.5.9: Jan 19, 2018
- 2.5.8: Jan 11, 2018
- 2.5.7: Jan 3, 2018
- 2.5.6: Nov 1, 2017
- 2.5.5: Oct 18, 2017
- 2.5.4: Sep 19, 2017
- 2.5.3: Sep 19, 2017
- 2.5.2: Sep 19, 2017
- 2.5.1: Sep 14, 2017
- 2.5.0: Aug 18, 2017
- 2.4.6: Aug 17, 2017
- 2.4.5: Jul 31, 2017
- 2.4.4: Jul 25, 2017
- 2.4.3: Feb 20, 2017
- 2.4.2: Feb 15, 2017
- 2.4.1: Feb 10, 2017
- 2.4.0: Feb 10, 2017
- 2.3.2: Jan 9, 2017
- 2.3.1: Jan 3, 2017
- 2.3.0: Jan 3, 2017
- 2.2.0: Dec 8, 2016
- 2.1.0: Nov 23, 2016
- 2.0.0: Nov 17, 2016
- 1.2.3: Sep 21, 2016
- 1.2.2: Sep 21, 2016
- 1.2.1: Sep 21, 2016
- 1.2.0: Sep 19, 2016
- 1.1.1: Sep 19, 2016

Download files

Download the file for your platform.

Source Distribution

cutil-3.0.3.tar.gz (20.3 kB)

Uploaded Jun 12, 2023 Source

Built Distribution

cutil-3.0.3-py3-none-any.whl (16.8 kB)

Uploaded Jun 12, 2023 Python 3

File details

Details for the file cutil-3.0.3.tar.gz.

File metadata

Download URL: cutil-3.0.3.tar.gz
Upload date: Jun 12, 2023
Size: 20.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for cutil-3.0.3.tar.gz
Algorithm	Hash digest
SHA256	`21bedea62b7a4b1d6a558b098dea800d8f92bdd53e91cc1ed02cec5d4ec5de0f`
MD5	`3c0d077c090cf330087cfc914b56dad6`
BLAKE2b-256	`ca555a40edc93f947106d21c14ea9b37731f8a54ba44b882b0b7eb2cd3208850`

File details

Details for the file cutil-3.0.3-py3-none-any.whl.

File metadata

Download URL: cutil-3.0.3-py3-none-any.whl
Upload date: Jun 12, 2023
Size: 16.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for cutil-3.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cf6360afb86836f26974e39f0742d1a7b9569b76103c1517bc5d6b259be113bd`
MD5	`37af6c9d83791caafe4aa142ac6c5b6e`
BLAKE2b-256	`060abd585093da4bb60676c6108682f46b893fe63fb849648c84341b368b3715`
