*("yoots")*: utilities augmenting the Python standard library; processes, Pytest, Pandas, Plotly, …
utz
- Install
- Import: from utz import *
- Modules
  - utz.proc: subprocess wrappers; shell out commands, parse output
  - utz.collections: collection/list helpers
  - utz.env: os.environ wrapper + contextmanager
  - utz.fn: decorator/function utilities
  - utz.jsn: JsonEncoder for datetimes, dataclasses
  - utz.context: {async,}contextmanager helpers
  - utz.cli: click helpers
  - utz.mem: memray wrapper
  - utz.time: Time timer, now/today helpers
  - utz.size: humanize.naturalsize wrapper
  - utz.hash_file: hash file contents
  - utz.ym: YM (year/month) class
  - utz.cd: "change directory" contextmanagers
  - utz.gist: GitHub Gist operations
  - utz.gzip: deterministic GZip helpers
  - utz.s3: S3 utilities
  - utz.plot: Plotly helpers
  - utz.setup: setup.py helper
  - utz.version: runtime package version with git hash
  - utz.test: dataclass test cases, raises helper
  - utz.docker, utz.tmpdir, etc.
- Examples / Users
Install
pip install utz
- Requires Python 3.10+
- utz has one dependency, stdlb (wild-card standard library imports).
- "Extras" provide optional deps (e.g. Pandas, Plotly, …).
Import: from utz import *
Jupyter
I often import utz.* in Jupyter notebooks:
from utz import *
This imports most standard library modules/functions (via stdlb), as well as the utz members below.
Python REPL
You can also import utz.* during Python REPL startup:
cat >~/.pythonrc <<EOF
try:
    from utz import *
    err("Imported utz")
except ImportError:
    err("Couldn't find utz")
EOF
export PYTHONSTARTUP=~/.pythonrc
# Configure for Python REPL in new Bash shells:
echo 'export PYTHONSTARTUP=~/.pythonrc' >> ~/.bashrc
Modules
Here are a few utz modules, in rough descending order of how often I use them:
utz.proc: subprocess wrappers; shell out commands, parse output
from utz.proc import *
# Run a command
run('git', 'commit', '-m', 'message') # Commit staged changes
# Passing a single string implies `shell=True` (for all functions listed here)
# Return `list[str]` of stdout lines
lines('git log -n5 --format=%h') # Last 5 commit SHAs
# Verify exactly one line of stdout, return it
line('git log -1 --format=%h') # Current HEAD commit SHA
# Return stdout as a single string
output('git log -1 --format=%B') # Current HEAD commit message
# Pass input to stdin
line('git mktree', input=b'100644 blob abc123\tfile.txt\n') # Create git tree from stdin
# Check whether a command succeeds, suppress output
check('git diff --exit-code --quiet') # `True` iff there are no uncommitted changes
# Nested arrays are flattened (for all commands above):
check(['git', 'diff', ['--exit-code', '--quiet']])
err("This will be output to stderr")
# Execute a "pipeline" of commands
pipeline(['seq 10', 'head -n5']) # '1\n2\n3\n4\n5\n'
See also: test_proc.py.
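For readers wondering how helpers like these relate to the standard library, here is a minimal sketch built on `subprocess`. It is not utz's implementation: the `*_sketch` names and the `_flatten` and `_run` helpers are hypothetical, and details such as error handling and logging may well differ.

```python
import subprocess
import sys


def _flatten(args):
    """Flatten nested lists/tuples of arguments into a flat list of strings."""
    flat = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            flat.extend(_flatten(arg))
        else:
            flat.append(str(arg))
    return flat


def _run(args, **kwargs):
    """Run a command; a single string argument is handed to the shell."""
    if len(args) == 1 and isinstance(args[0], str):
        return subprocess.run(args[0], shell=True, **kwargs)
    return subprocess.run(_flatten(args), **kwargs)


def lines_sketch(*args):
    """Return stdout as a list of lines (hypothetical stand-in for `lines`)."""
    proc = _run(args, check=True, stdout=subprocess.PIPE, text=True)
    return proc.stdout.splitlines()


def line_sketch(*args):
    """Verify there is exactly one line of stdout, and return it."""
    [only] = lines_sketch(*args)
    return only


def check_sketch(*args):
    """Return True iff the command exits with status 0, suppressing its output."""
    proc = _run(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return proc.returncode == 0


# The current Python interpreter serves as a portable example command
two_lines = lines_sketch(sys.executable, '-c', 'print(1); print(2)')
failed = check_sketch(sys.executable, ['-c', 'raise SystemExit(1)'])
```

The nested list in the last call is flattened before the command runs, mirroring the flattening behavior described above.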
utz.proc.aio: async subprocess wrappers
Async versions of most utz.proc helpers are also available:
from utz.proc.aio import *
import asyncio
from asyncio import gather
async def test():
    _1, _2, _3, nums = await gather(*[
        run('sleep', '1'),
        run('sleep', '2'),
        run('sleep', '3'),
        lines('seq', '10'),
    ])
    return nums
asyncio.run(test())
# ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
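As a rough illustration of the underlying technique (not utz's code), an async `lines`-style helper can be sketched with `asyncio.create_subprocess_exec`. The name `lines_sketch` is hypothetical, and raising `RuntimeError` on a non-zero exit is an assumption made for this example.

```python
import asyncio
import sys


async def lines_sketch(*cmd: str) -> list[str]:
    """Run `cmd` asynchronously and return its stdout lines."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"{cmd} exited with code {proc.returncode}")
    return stdout.decode().splitlines()


async def main() -> list[list[str]]:
    # Both subprocesses run concurrently; results come back in argument order
    return await asyncio.gather(
        lines_sketch(sys.executable, '-c', 'print("a")'),
        lines_sketch(sys.executable, '-c', 'print("b"); print("c")'),
    )


results = asyncio.run(main())
```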
utz.collections: collection/list helpers
from utz import *
# Verify a collection has one element, return it:
singleton(["aaa"]) # ✅ "aaa"
singleton({'a': 1}) # ✅ ('a', 1); works on `dict`s
singleton([("aaa",), ("aaa",)]) # ✅ ("aaa",); dedupes by default (elems must be hashable)
singleton(["aaa", "bbb"]) # ❌ `raise utz.collections.Expected1FoundN("2 elems found: bbb,aaa")`
# `solo`, `one`, and `e1` are aliases for `singleton`:
solo(["aaa"]) # "aaa"
one(["aaa"]) # "aaa"
e1(["aaa"]) # "aaa"
# Filter by a predicate
one([2, 3, 4], lambda n: n % 2) # 3
one([{'a': 1}, {'b': 2}], lambda o: 'a' in o) # {'a': 1}
See also: test_collections.py.
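The semantics shown above (filter, dedupe, then require exactly one element) can be sketched in plain Python. This is a hypothetical reimplementation for illustration only: `singleton_sketch` and the local `Expected1FoundN` class are stand-ins, and skipping the dedupe step for unhashable elements is an assumption of this sketch.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar('T')


class Expected1FoundN(ValueError):
    """Local stand-in: raised when there is not exactly one (distinct) element."""


def singleton_sketch(
    elems: Iterable[T],
    pred: Optional[Callable[[T], bool]] = None,
    dedupe: bool = True,
) -> T:
    """Return the only element of `elems`, after optional filtering and deduping."""
    if isinstance(elems, dict):
        elems = elems.items()
    items = [e for e in elems if pred is None or pred(e)]
    if dedupe:
        try:
            # `dict.fromkeys` dedupes while preserving order
            items = list(dict.fromkeys(items))
        except TypeError:
            pass  # Unhashable elements: this sketch simply skips deduping
    if len(items) != 1:
        raise Expected1FoundN(f"{len(items)} elems found: {items}")
    return items[0]


only_odd = singleton_sketch([2, 3, 4], lambda n: n % 2)
```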
utz.env: os.environ wrapper + contextmanager
from utz import env, os
# Temporarily set env vars
with env(FOO='bar'):
    assert os.environ['FOO'] == 'bar'
assert 'FOO' not in os.environ
The env() contextmanager also supports configurable on_conflict and on_exit kwargs, for handling env vars that were patched, then changed while the context was active.
See also: test_env.py.
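The basic set-then-restore pattern can be sketched with `contextlib`. This is a simplified illustration, not utz's implementation: `env_sketch` is a hypothetical name, and it omits the `on_conflict` and `on_exit` handling mentioned above.

```python
import os
from contextlib import contextmanager


@contextmanager
def env_sketch(**overrides: str):
    """Temporarily set environment variables, restoring prior values on exit."""
    missing = object()  # Sentinel for "variable was not set before"
    previous = {k: os.environ.get(k, missing) for k in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for key, old in previous.items():
            if old is missing:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old


with env_sketch(UTZ_SKETCH_VAR='bar'):
    assert os.environ['UTZ_SKETCH_VAR'] == 'bar'
assert 'UTZ_SKETCH_VAR' not in os.environ
```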
utz.fn: decorator/function utilities
utz.decos: compose decorators
from utz import decos
from click import option
common_opts = decos(
    option('-n', type=int),
    option('-v', is_flag=True),
)
@common_opts
def subcmd1(n: int, v: bool):
    ...
@common_opts
def subcmd2(n: int, v: bool):
    ...
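Decorator composition of this kind can be sketched in a few lines. The version below is a hypothetical illustration (`decos_sketch` and `tag` are made-up names); it assumes the first decorator listed should end up outermost, as if the decorators were stacked with `@`, which may or may not match utz's ordering.

```python
from functools import reduce


def decos_sketch(*decorators):
    """Compose decorators so the first one listed is outermost."""
    def composed(fn):
        # Apply the last decorator first, mirroring how stacked `@` lines evaluate
        return reduce(lambda wrapped, deco: deco(wrapped), reversed(decorators), fn)
    return composed


def tag(label):
    """Example decorator factory: append `label` to the function's `tags` list."""
    def deco(fn):
        fn.tags = getattr(fn, 'tags', []) + [label]
        return fn
    return deco


@decos_sketch(tag('outer'), tag('inner'))
def f():
    ...
```

Here `f.tags` records the order in which the decorators were applied, innermost first.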
utz.call: only pass expected kwargs to functions
from utz import call, wraps
def fn1(a, b):
    ...
@wraps(fn1)
def fn2(a, c, **kwargs):
    ...
kwargs = dict(a=11, b='22', c=33, d=44)
call(fn1, **kwargs) # passes {a, b}, not {c, d}
call(fn2, **kwargs) # passes {a, b, c}, not {d}
See also: test_fn.py.
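The core idea behind filtering keyword arguments can be sketched with `inspect.signature`. This is a simplified, hypothetical version (`call_sketch` and `area` are illustrative names); it does not reproduce the `@wraps`-aware behavior shown above, and it ignores functions that accept `**kwargs`.

```python
import inspect


def call_sketch(fn, *args, **kwargs):
    """Call `fn`, dropping keyword arguments that its signature does not name."""
    params = inspect.signature(fn).parameters
    accepted = {k: v for k, v in kwargs.items() if k in params}
    return fn(*args, **accepted)


def area(width, height):
    return width * height


result = call_sketch(area, width=3, height=4, color='red')  # `color` is dropped
```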
utz.jsn: JsonEncoder for datetimes, dataclasses
from utz import dataclass, Encoder, fromtimestamp, json # Convenience imports from standard library
epoch = fromtimestamp(0)
print(json.dumps({ 'epoch': epoch }, cls=Encoder))
# {"epoch": "1969-12-31 19:00:00"}
print(json.dumps({ 'epoch': epoch }, cls=Encoder("%Y-%m-%d"), indent=2))
# {
# "epoch": "1969-12-31"
# }
@dataclass
class A:
    n: int
print(json.dumps(A(111), cls=Encoder))
# {"n": 111}
See test_jsn.py for more examples.
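For context, an encoder along these lines can be sketched by subclassing `json.JSONEncoder`. The class below is a hypothetical illustration, not utz's `Encoder`: it hard-codes one datetime format and does not support the `Encoder("%Y-%m-%d")` call form shown above.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime


class EncoderSketch(json.JSONEncoder):
    """Serialize datetimes (via `strftime`) and dataclass instances (as dicts)."""

    fmt = "%Y-%m-%d %H:%M:%S"

    def default(self, o):
        if isinstance(o, datetime):
            return o.strftime(self.fmt)
        if is_dataclass(o) and not isinstance(o, type):
            return asdict(o)
        # Defer to the base class, which raises `TypeError` for unknown types
        return super().default(o)


@dataclass
class Point:
    x: int
    y: int


encoded = json.dumps(
    {'when': datetime(2024, 1, 2, 3, 4, 5), 'pt': Point(1, 2)},
    cls=EncoderSketch,
)
```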
utz.context: {async,}contextmanager helpers
- ctxs: compose contextmanagers
- actxs: compose asynccontextmanagers
- with_exit_hook: wrap a contextmanager's __exit__ method in another contextmanager
utz.cli: click helpers
utz.cli provides wrappers around click.option for parsing common option formats:
- @count: "count" options, including optional value mappings (e.g. -v → "info", -vv → "debug")
- @multi: parse comma-delimited values (or other delimiter), with optional value-parse callback (e.g. -a1,2 -a3 → (1,2,3))
- @num: parse numeric values, including human-readable SI/IEC suffixes (e.g. 10k → 10_000)
- @obj: parse dictionaries from multi-value options (e.g. -eFOO=BAR -eBAZ=QUX → dict(FOO="BAR", BAZ="QUX"))
- @incs/@excs: construct an Includes or Excludes object for regex-filtering of string arguments
- @inc_exc: combination of @incs and @excs; constructs an Includes or Excludes for regex-filtering of strings, from two (mutually-exclusive) options
- @opt, @arg, @flag: wrappers for click.{option,argument}, option(is_flag=True)
Examples:
# cli.py
from utz.cli import cmd, count, incs, multi, num, obj
from utz import Includes, Literal
@cmd # Alias for `click.command`
@multi('-a', '--arr', parse=int, help="Comma-separated integers")
@obj('-e', '--env', help='Env vars, in the form `k=v`')
@incs('-i', '--include', 'includes', help="Only print `env` keys that match one of these regexs")
@num('-m', '--max-memory', help='Max memory size (e.g. "100m")')
@count('-v', '--verbosity', values=['warn', 'info', 'debug'], help='0x: "warn", 1x: "info", 2x: "debug"')
def main(
    arr: tuple[int, ...],
    env: dict[str, str],
    includes: Includes,
    max_memory: int,
    verbosity: Literal['warn', 'info', 'debug'],
):
    filtered_env = { k: v for k, v in env.items() if includes(k) }
    print(f"{arr} {filtered_env} {max_memory} {verbosity}")
if __name__ == '__main__':
    main()
Saving the above as cli.py and running will yield:
python cli.py -a1,2 -a3 -eAAA=111 -eBBB=222 -eccc=333 -i[A-Z] -m10k
# (1, 2, 3) {'AAA': '111', 'BBB': '222'} 10000 warn
python cli.py -m 1Gi -v
# () {} 1073741824 info
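The suffix handling seen in the `-m10k` and `-m 1Gi` outputs above can be approximated as follows. This is a hypothetical sketch, not the `@num` implementation: `parse_num_sketch` is a made-up name, and the set of accepted suffixes and the case-insensitive matching are assumptions.

```python
import re

# Decimal (SI) and binary (IEC) multipliers; the suffixes utz accepts may differ
SI = {'k': 10**3, 'm': 10**6, 'g': 10**9, 't': 10**12}
IEC = {'ki': 2**10, 'mi': 2**20, 'gi': 2**30, 'ti': 2**40}


def parse_num_sketch(text: str) -> int:
    """Parse strings like "10k" or "1Gi" into integers."""
    m = re.fullmatch(r'(\d+(?:\.\d+)?)\s*([a-zA-Z]*)', text.strip())
    if not m:
        raise ValueError(f"Unrecognized number: {text!r}")
    value, suffix = float(m.group(1)), m.group(2).lower()
    if not suffix:
        return int(value)
    multiplier = {**SI, **IEC}.get(suffix)
    if multiplier is None:
        raise ValueError(f"Unrecognized suffix: {suffix!r}")
    return int(value * multiplier)


print(parse_num_sketch('10k'))  # 10000
print(parse_num_sketch('1Gi'))  # 1073741824
```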
from utz.cli import arg, cmd, inc_exc, multi
from utz.rgx import Patterns
@cmd
@inc_exc(
    multi('-i', '--include', help="Print arguments iff they match at least one of these regexs; comma-delimited, and can be passed multiple times"),
    multi('-x', '--exclude', help="Print arguments iff they don't match any of these regexs; comma-delimited, and can be passed multiple times"),
)
@arg('vals', nargs=-1)
def main(patterns: Patterns, vals: tuple[str, ...]):
    print(' '.join([ val for val in vals if patterns(val) ]))
if __name__ == '__main__':
    main()
Saving the above as cli.py and running will yield:
python cli.py -i a.,b aa bc cb c a AA B
# aa bc cb
python cli.py -x a.,b aa bc cb c a AA B
# c a AA B
See test_cli for more examples.
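The include/exclude filtering in the second example can be mimicked with the standard `re` module. The sketch below is hypothetical (`filter_vals` is not part of utz) and assumes `re.search` semantics, which is consistent with the outputs shown above (e.g. `cb` matching the pattern `b`).

```python
import re
from typing import Sequence


def filter_vals(
    vals: Sequence[str],
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> list[str]:
    """Keep `vals` matching any `include` regex, or matching no `exclude` regex."""
    if include and exclude:
        raise ValueError("Pass `include` or `exclude`, not both")
    if include:
        return [v for v in vals if any(re.search(p, v) for p in include)]
    return [v for v in vals if not any(re.search(p, v) for p in exclude)]


vals = ['aa', 'bc', 'cb', 'c', 'a', 'AA', 'B']
print(filter_vals(vals, include=['a.', 'b']))  # ['aa', 'bc', 'cb']
print(filter_vals(vals, exclude=['a.', 'b']))  # ['c', 'a', 'AA', 'B']
```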
utz.mem: memray wrapper
Use memray to profile memory allocations, extract stats, flamegraph HTML, and peak memory use:
from utz.mem import Tracker
from utz import iec
with (tracker := Tracker()):
    nums = list(sorted(range(1_000_000, 0, -1)))
peak_mem = tracker.peak_mem
print(f'Peak memory use: {peak_mem:,} ({iec(peak_mem)})')
# Peak memory use: 48,530,432 (46.3 MiB)
utz.time: Time timer, now/today helpers
Time: minimal timer class
from utz import Time, sleep
time = Time()
time("step 1")
sleep(1)
time("step 2") # Ends "step 1" timer
sleep(1)
time() # Ends "step 2" timer
print(f'Step 1 took {time["step 1"]:.1f}s, step 2 took {time["step 2"]:.1f}s.')
# Step 1 took 1.0s, step 2 took 1.0s.
# contextmanager timers can overlap/contain others
with time("run"): # ≈2s
    time("sleep-1") # ≈1s
    sleep(1)
    time("sleep-2") # ≈1s
    sleep(1)
print(f'Run took {time["run"]:.1f}s')
# Run took 2.0s
now, today
now and today are wrappers around datetime.datetime.now that expose convenient functions:
from utz import now, today
now() # 2024-10-11T15:43:54Z
today() # 2024-10-11
now().s # 1728661583
now().ms # 1728661585952
Use in conjunction with utz.bases codecs for easy timestamp-nonces:
from utz import b62, now
b62(now().s) # A18Q1l
b62(now().ms) # dZ3fYdS
b62(now().us) # G31Cn073v
Sample values for various units and codecs:
| unit | b62 | b64 | b90 |
|---|---|---|---|
| s | A2kw7P | +aYIh1 | :Kn>H |
| ds | R7FCrj | D8oM9b | "tn_BH |
| cs | CCp7kK0 | /UpIuxG | =Fc#jK |
| ms | dj4u83i | MFSOKhy | #8;HF8g |
| us | G6cozJjWb | 385u0dp8B | D>$y/9Hr |
(generated by time-slug-grid.py)
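To make the timestamp-nonce idea concrete, here is a minimal base-62 encoder. It is a hypothetical sketch, not utz's codec: the alphabet ordering below is an assumption, so its output will generally not match the sample values above.

```python
import string
import time

# Assumed alphabet (digits, then upper-case, then lower-case); utz's may differ
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def b62_sketch(n: int) -> str:
    """Encode a non-negative integer in base 62."""
    if n < 0:
        raise ValueError("Only non-negative integers are supported")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return ''.join(reversed(digits))


nonce = b62_sketch(int(time.time()))  # Seconds-resolution timestamp nonce
```

A current Unix timestamp in seconds fits in six base-62 characters, which is why the seconds-resolution samples above are that short.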
utz.size: humanize.naturalsize wrapper
iec wraps humanize.naturalsize, printing IEC-formatted sizes by default, to 3 sigfigs:
from utz import iec
iec(2**30 + 2**29 + 2**28 + 2**27)
# '1.88 GiB'
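The 3-significant-figure IEC formatting can be illustrated without `humanize`. The function below is a hypothetical approximation (`iec_sketch` is a made-up name) and may differ from `iec` in edge cases, such as values that round up to 1024 of a unit.

```python
def iec_sketch(n: int, sigfigs: int = 3) -> str:
    """Format a byte count with binary (IEC) units, to roughly `sigfigs` digits."""
    units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB']
    value = float(n)
    idx = 0
    while value >= 1024 and idx < len(units) - 1:
        value /= 1024
        idx += 1
    if idx == 0:
        return f"{n} B"
    # Spend the remaining significant figures on decimal places
    int_digits = len(str(int(value)))
    decimals = max(sigfigs - int_digits, 0)
    return f"{value:.{decimals}f} {units[idx]}"


print(iec_sketch(2**30 + 2**29 + 2**28 + 2**27))  # 1.88 GiB
```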
utz.hash_file: hash file contents
from utz import hash_file
hash_file("path/to/file") # sha256 by default
hash_file("path/to/file", 'md5')
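A file-hashing helper of this shape is commonly written with `hashlib`, reading the file in chunks so that large files need not fit in memory. The sketch below is an illustration under that assumption, not utz's code; `hash_file_sketch` and its `chunk_size` parameter are hypothetical.

```python
import hashlib
import os
import tempfile


def hash_file_sketch(path: str, algorithm: str = 'sha256', chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file's contents, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstrate on a small temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello')
try:
    sha256_hex = hash_file_sketch(tmp.name)
    sha512_hex = hash_file_sketch(tmp.name, 'sha512')
finally:
    os.remove(tmp.name)
```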
utz.ym: YM (year/month) class
The YM class represents a year/month, e.g. 202401 for January 2024.
from utz import YM
ym = YM(202501) # Jan '25
assert ym + 1 == YM(202502) # Add one month
assert YM(202502) - YM(202406) == 8 # subtract months
YM(202401).until(YM(202501)) # 202401, 202402, ..., 202412
# `YM` constructor accepts several representations:
assert all(ym == YM(202401) for ym in [
    YM(202401),
    YM('202401'),
    YM('2024-01'),
    YM(2024, 1),
    YM(y=2024, m=1),
    YM(dict(year=2024, month=1)),
    YM(YM(202401)),
])
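The month arithmetic above reduces to integer math on a "months since year 0" index. The class below is a hypothetical stand-in for illustration (`YearMonth` and `from_int` are made-up names), covering only addition and subtraction rather than the full `YM` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class YearMonth:
    """Hypothetical stand-in for `YM`: a year/month pair with month arithmetic."""
    year: int
    month: int  # 1-12

    @classmethod
    def from_int(cls, yyyymm: int) -> 'YearMonth':
        return cls(yyyymm // 100, yyyymm % 100)

    def _index(self) -> int:
        # Months elapsed since year 0; turns arithmetic into plain integer math
        return self.year * 12 + (self.month - 1)

    def __add__(self, months: int) -> 'YearMonth':
        year, month0 = divmod(self._index() + months, 12)
        return YearMonth(year, month0 + 1)

    def __sub__(self, other: 'YearMonth') -> int:
        return self._index() - other._index()


assert YearMonth.from_int(202501) + 1 == YearMonth(2025, 2)
assert YearMonth.from_int(202412) + 1 == YearMonth(2025, 1)  # Rolls over the year
assert YearMonth.from_int(202502) - YearMonth.from_int(202406) == 8
```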
utz.cd: "change directory" contextmanagers
from utz import cd, cd_tmpdir
with cd('..'):
    # Inside parent dir
    ...
# Back in original dir
with cd('a/b/c', mk=True):
    # Moved into a/b/c (created it if it didn't exist)
    ...
with cd_tmpdir(dir='.', name='my_tmpdir') as tmpdir:
    # Inside a temporary subdirectory of previous working directory, with basename `my_tmpdir`
    ...
See also test_cd.py.
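A "change directory" contextmanager can be sketched with `os.chdir` and a `try`/`finally` that restores the previous working directory. This is a hypothetical illustration (`cd_sketch` is a made-up name) of the pattern, not utz's implementation.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def cd_sketch(path: str, mk: bool = False):
    """Change into `path` (optionally creating it), and change back on exit."""
    if mk:
        os.makedirs(path, exist_ok=True)
    prev = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(prev)


start = os.getcwd()
with tempfile.TemporaryDirectory() as root:
    with cd_sketch(os.path.join(root, 'a', 'b', 'c'), mk=True):
        inside = os.getcwd()
    after = os.getcwd()
```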
utz.gist: GitHub Gist operations
from utz.gist import create_gist, upload_files_to_gist, get_github_user
# Get current GitHub username (via `gh` CLI)
username = get_github_user()
# Create a new gist
gist_id = create_gist(description="My gist", public=True)
# Upload files to a gist
upload_files_to_gist(
    gist_id=gist_id,
    files={'hello.txt': 'Hello, world!'},
    branch='main',
    commit_message='Add hello.txt'
)
utz.gzip: deterministic GZip helpers
from utz import deterministic_gzip_open, hash_file
with deterministic_gzip_open('a.gz', 'w') as f:
    f.write('\n'.join(map(str, range(10))))
hash_file('a.gz') # dfbe03625c539cbc2a2331d806cc48652dd3e1f52fe187ac2f3420dbfb320504
See also: test_gzip.py.
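Gzip output normally embeds a modification time and filename in its header, which makes otherwise identical archives hash differently. One common way to avoid that, sketched below with the standard library, is to fix `mtime` and omit the filename. This is an assumption about the general technique, not a description of utz's internals, and `deterministic_gzip_bytes` is a hypothetical name.

```python
import gzip
import io


def deterministic_gzip_bytes(data: bytes) -> bytes:
    """Gzip `data` with no embedded filename and a fixed mtime."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename='', mode='wb', fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()


payload = '\n'.join(map(str, range(10))).encode()
first = deterministic_gzip_bytes(payload)
second = deterministic_gzip_bytes(payload)
```

Repeated calls then produce identical bytes within one environment; the exact bytes can still vary across zlib versions or compression levels.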
utz.s3: S3 utilities
- client(): cached boto3 S3 client
- parse_bkt_key(args: tuple[str, ...]) -> tuple[str, str]: parse bucket and key from s3:// URL or separate arguments
- get_etag(*args: str, err_ok: bool = False, strip: bool = True) -> str | None: get ETag of S3 object
- get_etags(*args: str) -> dict[str, str]: get ETags for all objects with the given prefix
- atomic_edit(...) -> Iterator[str]: context manager for atomically editing S3 objects
from utz import s3, pd
url = 's3://bkt/key.parquet'
# `url`'s ETag is snapshotted on initial read
with s3.atomic_edit(url) as out_path:
    df = pd.read_parquet(url)
    df.sort_index(inplace=True)
    df.to_parquet(out_path)
# On contextmanager exit, `out_path` is uploaded to `url`, iff
# `url`'s ETag hasn't changed (no concurrent update has occurred).
utz.plot: Plotly helpers
Helpers for Plotly transformations I make frequently, e.g.:
from utz import plot
import plotly.express as px
fig = px.bar(x=[1, 2, 3], y=[4, 5, 6])
plot(
    fig,
    name='my-plot', # Filename stem. will save my-plot.png, my-plot.json, optional my-plot.html
    title=['Some Title', 'Some subtitle'], # Plot title, followed by "subtitle" line(s) (smaller font, just below)
    bg='white', xgrid='#ccc', # white background, grey x-gridlines
    hoverx=True, # show x-values on hover
    x="X-axis title", # x-axis title or configs
    y=dict(title="Y-axis title", zerolines=True), # y-axis title or configs
    # ...
)
Example usages: hudcostreets/nj-crashes, ryan-williams/arrayloader-benchmarks.
utz.setup: setup.py helper
utz/setup.py provides defaults for various setuptools.setup() params:
- name: use parent directory name
- version: parse from git tag (otherwise from git describe --tags)
- install_requires: read requirements.txt
- author_{name,email}: infer from last commit
- long_description: parse README.md (and set long_description_content_type)
- description: parse first <p> under opening <h1> from README.md
- license: parse from LICENSE file (MIT and Apache v2 supported)
For an example, see gsmo==0.0.1 (and corresponding release).
This library also "self-hosts" using utz.setup; see pyproject.toml:
[build-system]
requires = ["setuptools", "utz[setup]==0.4.2", "wheel"]
build-backend = "setuptools.build_meta"
and setup.py:
from utz.setup import setup
extras_require = {
    # …
}
# Various fields auto-populated from git, README.md, requirements.txt, …
setup(
    name="utz",
    version="0.8.0",
    extras_require=extras_require,
    url="https://github.com/runsascoded/utz",
    python_requires=">=3.10",
)
The setup helper can be installed via a pip "extra":
pip install utz[setup]
utz.version: runtime package version with git hash
Get your package version with current git commit hash at runtime, useful for verifying which exact commit is installed during local development:
# In your package's __init__.py:
from utz.version import pkg_version_with_git
__version__ = "0.1.1"
def get_version(include_git=True):
    """Get version string with optional git hash."""
    return pkg_version_with_git(pkg_version=__version__, include_git=include_git)
Usage:
import mypackage
mypackage.get_version()
# "0.1.1+git.abc1234" (clean working tree)
# "0.1.1+git.abc1234.dirty" (uncommitted changes)
mypackage.get_version(include_git=False)
# "0.1.1"
mypackage.__version__
# "0.1.1"
The +git.HASH format follows PEP 440 local version identifier conventions. This helps verify which exact commit is installed when doing pip install -e . during local development, especially when working with multiple interdependent packages.
Features:
- Auto-detects git repo from caller's package directory
- Falls back to plain version if git not available (e.g., PyPI installs)
- Detects uncommitted changes (.dirty suffix)
- Supports short (7-char, default) or full (40-char) hashes
Also available: utz.git.is_dirty() to check if the working tree has uncommitted changes.
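The version-string format described above can be reproduced with a small amount of standard-library code. The sketch below is hypothetical (`format_version` and `git_short_sha` are not utz functions) and only illustrates the `+git.HASH[.dirty]` convention and the fall-back to a plain version when git is unavailable.

```python
import subprocess
from typing import Optional


def format_version(version: str, sha: Optional[str], dirty: bool = False) -> str:
    """Append a PEP 440 local version segment like `+git.abc1234[.dirty]`."""
    if not sha:
        return version  # No git info available: fall back to the plain version
    local = f"git.{sha}" + (".dirty" if dirty else "")
    return f"{version}+{local}"


def git_short_sha(cwd: str = '.') -> Optional[str]:
    """Return the short HEAD SHA, or None if `git` or a repo is unavailable."""
    try:
        out = subprocess.run(
            ['git', 'rev-parse', '--short=7', 'HEAD'],
            cwd=cwd, check=True, capture_output=True, text=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip() or None


print(format_version('0.1.1', 'abc1234'))              # 0.1.1+git.abc1234
print(format_version('0.1.1', 'abc1234', dirty=True))  # 0.1.1+git.abc1234.dirty
print(format_version('0.1.1', None))                   # 0.1.1
```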
utz.test: dataclass test cases, raises helper
utz.parametrize: pytest.mark.parametrize wrapper, accepts dataclass instances
from utz import parametrize
from dataclasses import dataclass
def fn(f: float, fmt: str) -> str:
    """Example function, to be tested with ``case``s below."""
    return f"{f:{fmt}}"
@dataclass
class case:
    """Container for a test-case; float, format, and expected output."""
    f: float
    fmt: str
    expected: str
    @property
    def id(self):
        return f"fmt-{self.f}-{self.fmt}"
@parametrize(
    case(1.23, "0.1f", "1.2"),
    case(123.456, "0.1e", "1.2e+02"),
    case(-123.456, ".0f", "-123"),
)
def test_fn(f, fmt, expected):
    """Example test, "parametrized" by several ``case``s."""
    assert fn(f, fmt) == expected
test_parametrize.py contains more examples, customizing test "ID"s, adding parameter sweeps, etc.
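One plausible way to bridge dataclass instances and `pytest.mark.parametrize` is to derive the argument names from the dataclass fields. The helper below is a hypothetical sketch of that idea using only the standard library (`Case` and `to_parametrize_args` are made-up names); it is not how `utz.parametrize` is necessarily implemented, and it ignores test IDs.

```python
from dataclasses import astuple, dataclass, fields


@dataclass
class Case:
    f: float
    fmt: str
    expected: str


def to_parametrize_args(*cases):
    """Convert dataclass instances into (argnames, argvalues)."""
    names = [field.name for field in fields(cases[0])]
    values = [astuple(case) for case in cases]
    return ','.join(names), values


argnames, argvalues = to_parametrize_args(
    Case(1.23, "0.1f", "1.2"),
    Case(-123.456, ".0f", "-123"),
)
# These could then be passed along as `pytest.mark.parametrize(argnames, argvalues)`
```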
utz.raises: pytest.raises wrapper, match a regex or multiple strings
utz.tmpdir
from utz import TmpDir, tmp_ensure_dir, TmpPath
# ``TemporaryDirectory`` wrapper that creates ``dir`` (and parents), if necessary (and removes any dirs it created, on exit)
# Also adds support for specifying exact basename, via ``name`` kwarg.
with TmpDir(dir='nested/subdir', name='basename') as tmpdir:
    ...
# Yields a path with the requested basename, inside a ``TemporaryDirectory``.
# As with ``TmpDir``, ``dir`` (and parents) will be created, if necessary (and removed on exit, leaving the filesystem in the same state it started in)
with TmpPath('basename.txt', dir='nested/subdir') as tmppath:
    ...
# Multiple right-most path components can be specified exactly.
with TmpPath('dir1/dir2/basename.txt', dir='nested/subdir') as tmppath:
    ...
# Used by ``TmpDir``/``TmpPath`` above, creates ``dir`` (and parents), if necessary (and removes any dirs it created, on exit)
with tmp_ensure_dir(dir='nested/subdir'):
    ...
See also: test_tmpdir.py.
utz.docker, utz.bases, etc.
Misc other modules:
- bases: encode/decode in various bases (62, 64, 90, …)
- escape: split/join on an arbitrary delimiter, with backslash-escaping; utz.esc escapes a specific character in a string.
- ctxs: compose contextmanagers
- o: dict wrapper exposing keys as attrs (e.g.: o({'a':1}).a == 1)
- docker: DSL for programmatically creating Dockerfiles (and building images from them)
- tmpdir: make temporary directories with a specific basename
- ssh: SSH tunnel wrapped in a context manager
- backoff: exponential-backoff utility
- git: Git helpers, wrappers around GitPython
- pnds: pandas imports and helpers
Examples / Users
Some repos that use utz: