gjdutils

A collection of useful utility functions (strings, dates, data science/AI, web development, types, etc).

This is a smorgasbord of utility functions, patterns and convenient wrappers that I've found myself rewriting and reusing across multiple projects, gathered into one place.

Many of these probably exist in other libraries. If so, please let me know, because I'd rather use something cleaner and better maintained.

Caveat emptor: some of these are old, and haven't been tested in a while.

Highlights

Audio: convenient microphone voice recognition with Whisper, and text-to-speech using ElevenLabs

from gjdutils.outloud_text_to_speech import outloud
from gjdutils.voice_speechrecognition import recognise_speech

# Record speech and play it back in a different voice
text = recognise_speech("Say something!")  # Records from microphone until you press ENTER
outloud(text, prog="elevenlabs", mp3_filen="recording.mp3", should_play=True)  # Plays back what you said

Run shell commands clearly & conveniently

from gjdutils.cmd import run_cmd
from pathlib import Path

# Get Python version and capture the output
retcode, stdout, extra = run_cmd(
    "python --version",  # you can also provide as a list-of-strings
    before_msg="Checking Python version...",
    fatal_msg="Some problem running Python",  # will show up in red, sys.exit(1)
    verbose=0,  # Run silently unless there's an error
    **{"timeout": 5}  # Pass additional arguments to subprocess
)
print(f"Python version: {stdout}")  # e.g. "Python 3.9.7"
print(f"Ran command: {extra['cmd_str']}")  # plus lots of other stuff stored
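
For readers curious what such a wrapper involves, here is a minimal sketch built directly on `subprocess.run` from the standard library. It is not the gjdutils implementation: the name `run_and_capture` and the contents of its `extra` dict are assumptions made for illustration.

```python
import subprocess
import sys


def run_and_capture(cmd, timeout=None):
    """Hypothetical helper: run `cmd` (a list of strings) and capture its output."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    # Keep a few details around for later inspection (illustrative only)
    extra = {"cmd_str": " ".join(cmd), "stderr": result.stderr.strip()}
    return result.returncode, result.stdout.strip(), extra


# sys.executable avoids depending on which `python` is first on PATH
retcode, stdout, extra = run_and_capture([sys.executable, "--version"], timeout=5)
print(retcode, stdout)
```

Passing the command as a list avoids shell quoting issues, at the cost of not supporting pipes or redirects.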

Environment variables with type validation and helpful error messages

$ python -m gjdutils.scripts.export_all .env

from gjdutils.env import get_env_var

api_key = get_env_var("OPENAI_API_KEY")  # Ensures non-empty by default
num_workers = get_env_var("NUM_WORKERS", typ=int)  # Validates and converts to int
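
As a rough illustration of the idea (not the actual gjdutils code), a type-validating lookup can be sketched with the standard library alone. The name `read_env_var` and its `required` parameter are hypothetical.

```python
import os


def read_env_var(name, typ=str, required=True):
    """Hypothetical helper: read an environment variable and convert it to `typ`."""
    raw = os.environ.get(name, "")
    if not raw.strip():
        if required:
            raise ValueError(f"Environment variable {name!r} is missing or empty")
        return None
    try:
        return typ(raw)
    except (TypeError, ValueError) as exc:
        # Re-raise with a message that names the variable and the expected type
        raise ValueError(
            f"Environment variable {name!r}={raw!r} is not a valid {typ.__name__}"
        ) from exc


os.environ["NUM_WORKERS"] = "4"
num_workers = read_env_var("NUM_WORKERS", typ=int)
print(num_workers)  # 4
```

Failing early with the variable name in the message tends to be more helpful than a bare `KeyError` or a conversion error deep inside the program.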

Strict Jinja templating that catches both undefined and unused variables

from gjdutils.strings import jinja_render

template = "{{name}} is {{age}} years old"
context = {"name": "Bob", "unused": True}
text = jinja_render(template, context)  # will fail both because `age` is missing and `unused` is superfluous

Set random seeds across Python, NumPy, PyTorch for reproducibility

from gjdutils.rand import set_seeds

set_seeds(42)  # Sets seeds for random, numpy, torch if available
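
A minimal sketch of the "if available" pattern, assuming the optional libraries are detected with a guarded import. The name `seed_everything` is hypothetical, and the real `set_seeds` may do more than this.

```python
import random


def seed_everything(seed):
    """Hypothetical helper: seed every RNG that happens to be installed."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)  # True
```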

Call Claude/OpenAI APIs with function calling, image analysis & JSON support

from gjdutils.llms_claude import call_claude_gpt
from gjdutils.llm_utils import image_to_base64

response, extra = call_claude_gpt(
    "What's in this image?",
    image_filens=["path/to/image.jpg"],
    temperature=0.001
)

Translate text between languages with Google Translate

from gjdutils.google_translate import translate_text, detect_language

# First detect the language
text = "Bonjour le monde"
lang, confidence = detect_language(text)  # Returns ("fr", 0.98)

# Then translate to English
english_text, _ = translate_text(text, lang_src_code=lang, lang_tgt_code="en")  # Returns "Hello world"

Calculate text similarity using longest common substring analysis

from gjdutils.strings import calc_proportion_longest_common_substring
similarity = calc_proportion_longest_common_substring(["hello world", "hello there"])  # Returns ~0.45 for "hello" match
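
To make the idea concrete, here is a sketch that finds the longest common substring with `difflib` from the standard library. The function names and the normalisation (dividing by the longer string's length) are assumptions, so the value it produces need not match what `calc_proportion_longest_common_substring` returns.

```python
from difflib import SequenceMatcher


def longest_common_substring(a, b):
    """Return the longest contiguous substring shared by `a` and `b`."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a : match.a + match.size]


def proportion_lcs(a, b):
    """Hypothetical normalisation: LCS length divided by the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return len(longest_common_substring(a, b)) / longest


print(repr(longest_common_substring("hello world", "hello there")))  # 'hello '
print(proportion_lcs("hello world", "hello there"))  # 6 / 11
```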

Measure data uniformity & distribution with simple proportion analysis

from gjdutils.dsci import calc_proportion_identical
uniformity = calc_proportion_identical(['a', 'a', 'a', 'b'])  # Returns 0.75 (75% are 'a')
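
The calculation behind the 0.75 above can be sketched in a few lines, assuming "proportion identical" means the share taken by the most frequent value. The name `proportion_most_common` is hypothetical.

```python
from collections import Counter


def proportion_most_common(values):
    """Hypothetical helper: fraction of items equal to the most frequent value."""
    if not values:
        raise ValueError("values must be non-empty")
    _, count = Counter(values).most_common(1)[0]
    return count / len(values)


print(proportion_most_common(["a", "a", "a", "b"]))  # 0.75
```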

Generate deterministic cache keys for complex Python objects

from gjdutils.caching import generate_mckey
cache_key = generate_mckey("myprefix", {"a": 100, "b": "foo"})  # Creates deterministic cache key

Generate consistent hashes for caching/comparison

from gjdutils.hashing import hash_readable

# Same input always produces same hash, even across sessions
config = {"foo": "bar"}
cache_key = hash_readable(config)  # e.g. "8f4e5d3..."
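
One common way to get a hash that is stable across sessions is to serialise the object canonically and feed it to `hashlib`. This matters because Python's built-in `hash()` is randomised per process for strings. The sketch below assumes JSON-serialisable input; `stable_hash` is a hypothetical name, not the gjdutils function.

```python
import hashlib
import json


def stable_hash(obj, length=12):
    """Hypothetical helper: hash a JSON-serialisable object deterministically."""
    # sort_keys makes the serialisation independent of dict insertion order
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


same = stable_hash({"a": 100, "b": "foo"}) == stable_hash({"b": "foo", "a": 100})
print(same)  # True
```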

Pretty-print and process HTML with customizable indentation

from gjdutils.html import prettify_html

# Prettify a string of HTML (also useful for testing two HTML strings are identical without caring about whitespace)
html = "<div><p>Hello</p><p>World</p></div>"  # Also works with BeautifulSoup elements
pretty = prettify_html(html, indent=4)  # Custom indentation
print(pretty)
# <div>
#     <p>Hello</p>
#     <p>World</p>
# </div>

Debug by printing local variables, excluding noise

from gjdutils.misc import print_locals

def my_function(x, y):
    z = x + y
    some_func = lambda x: x * 2
    _internal = "temp"
    # Print all local vars except functions and _prefixed
    print_locals(locals(), ignore_functions=True, ignore_underscores=True)
    # Output when called as my_function(1, 2): {'x': 1, 'y': 2, 'z': 3}
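
The filtering itself is simple to sketch. The version below returns the filtered dict rather than printing it; `filter_locals` is a hypothetical name, and treating every callable as a "function" is an assumption.

```python
def filter_locals(variables, ignore_functions=True, ignore_underscores=True):
    """Hypothetical helper: drop callables and _prefixed names from a locals() dict."""
    kept = {}
    for name, value in variables.items():
        if ignore_underscores and name.startswith("_"):
            continue
        if ignore_functions and callable(value):
            continue
        kept[name] = value
    return kept


def demo(x, y):
    z = x + y
    double = lambda v: v * 2
    _internal = "temp"
    return filter_locals(locals())


print(demo(1, 2))  # {'x': 1, 'y': 2, 'z': 3}
```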

Generate readable random IDs (no confusing characters)

from gjdutils.rand import gen_readable_rand_id

# Generate random ID without confusing chars (0/O, 1/I/l, etc)
uid = gen_readable_rand_id(n=7)  # e.g. "k8m5p3h"
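
A minimal sketch of the approach, assuming a lowercase alphanumeric alphabet with the look-alike characters removed. The exact alphabet gjdutils uses is not stated here, so `CONFUSING`, `ALPHABET` and `readable_rand_id` are illustrative names and choices.

```python
import secrets
import string

# Lowercase letters and digits, minus characters that are easy to confuse
CONFUSING = set("0o1il")
ALPHABET = "".join(
    ch for ch in string.ascii_lowercase + string.digits if ch not in CONFUSING
)


def readable_rand_id(n=7):
    """Hypothetical helper: random ID of length `n` drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(n))


print(readable_rand_id(7))
```

`secrets.choice` draws from the operating system's secure random source, which is a reasonable default when IDs might end up in URLs.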

Installation

pip install gjdutils

For optional features:

pip install "gjdutils[dt]"   # Date/time utilities
pip install "gjdutils[llm]"  # AI/LLM integrations
pip install "gjdutils[audio_lang]"  # Speech/translation, language-related
pip install "gjdutils[html_web]"    # Web scraping

pip install "gjdutils[dev]"  # Development tools (for tweaking `gjdutils` itself, e.g. pytest)

# Install all optional dependencies at once (except `dev`, which is used for developing `gjdutils` itself)
pip install "gjdutils[all_no_dev]"

Development Setup

If you're developing gjdutils itself, install in editable mode:

# (Assumes you have already set up your virtualenv)
# from the gjdutils root directory
pip install -e ".[all_no_dev,dev]"     # Install all optional dependencies, plus the dev tools

Or if you're feeling lazy and can't remember that command, just use:

python -m gjdutils.scripts.install_all_dev_dependencies

Adding to requirements.txt

To add to your requirements.txt in editable mode, e.g. to install all optional dependencies:

-e "git+https://github.com/gregdetre/gjdutils.git#egg=gjdutils[all_no_dev]"
