utz
("yoots"): utilities augmenting the Python standard library; processes, Pytest, Pandas, Plotly, …
- Install
- Import: from utz import *
- Modules
  - utz.proc: subprocess wrappers; shell out commands, parse output
  - utz.collections: collection/list helpers
  - utz.env: os.environ wrapper + contextmanager
  - utz.fn: decorator/function utilities
  - utz.jsn: JsonEncoder for datetimes, dataclasses
  - utz.context: {async,}contextmanager helpers
  - utz.cli: click helpers
  - utz.mem: memray wrapper
  - utz.time: Time timer, now/today helpers
  - utz.size: humanize.naturalsize wrapper
  - utz.hash_file: hash file contents
  - utz.ym: YM (year/month) class
  - utz.cd: "change directory" contextmanagers
  - utz.gzip: deterministic GZip helpers
  - utz.s3: S3 utilities
  - utz.plot: Plotly helpers
  - utz.setup: setup.py helper
  - utz.test: dataclass test cases, raises helper
  - utz.docker, utz.tmpdir, etc.
- Examples / Users
Install
pip install utz
- utz has one dependency, stdlb (wild-card standard library imports).
- "Extras" provide optional deps (e.g. Pandas, Plotly, …).
Import: from utz import *
Jupyter
I often import utz.* in Jupyter notebooks:
from utz import *
This imports most standard library modules/functions (via stdlb), as well as the utz members below.
Python REPL
You can also import utz.* during Python REPL startup:
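cat >~/.pythonrc <<EOF
try:
    from utz import *
    err("Imported utz")
except ImportError:
    # `err` comes from utz, so it is not defined here; fall back to the stdlib
    import sys
    print("Couldn't find utz", file=sys.stderr)
EOF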
export PYTHONSTARTUP=~/.pythonrc
# Configure for Python REPL in new Bash shells:
echo 'export PYTHONSTARTUP=~/.pythonrc' >> ~/.bashrc
Modules
Here are a few utz modules, in rough descending order of how often I use them:
utz.proc: subprocess wrappers; shell out commands, parse output
from utz.proc import *
# Run a command
run('git', 'commit', '-m', 'message') # Commit staged changes
# Passing a single string implies `shell=True` (for all functions listed here)
# Return `list[str]` of stdout lines
lines('git log -n5 --format=%h') # Last 5 commit SHAs
# Verify exactly one line of stdout, return it
line('git log -1 --format=%h') # Current HEAD commit SHA
# Return stdout as a single string
output('git log -1 --format=%B') # Current HEAD commit message
# Check whether a command succeeds, suppress output
check('git diff --exit-code --quiet') # `True` iff there are no uncommitted changes
# Nested arrays are flattened (for all commands above):
check(['git', 'diff', ['--exit-code', '--quiet']])
err("This will be output to stderr")
# Execute a "pipeline" of commands
pipeline(['seq 10', 'head -n5']) # '1\n2\n3\n4\n5\n'
See also: test_proc.py.
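For readers who want to see how helpers of this kind relate to the standard library, here is a minimal sketch built on subprocess. The names flatten, prepare, lines_sketch and check_sketch are hypothetical, and this is not utz's implementation; it only mirrors the behaviour described in the comments above (a single string implies shell=True, nested lists are flattened).

```python
import subprocess
import sys

def flatten(args):
    """Flatten arbitrarily nested lists/tuples of arguments into a flat list of strings."""
    flat = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            flat.extend(flatten(arg))
        else:
            flat.append(str(arg))
    return flat

def prepare(args):
    """Return (cmd, shell): a lone string goes to the shell, anything else becomes an argv list."""
    if len(args) == 1 and isinstance(args[0], str):
        return args[0], True
    return flatten(args), False

def lines_sketch(*args):
    """Run a command, raise on non-zero exit, and return its stdout as a list of lines."""
    cmd, shell = prepare(args)
    done = subprocess.run(cmd, shell=shell, check=True, capture_output=True, text=True)
    return done.stdout.splitlines()

def check_sketch(*args):
    """Return True iff the command exits 0; its output is discarded."""
    cmd, shell = prepare(args)
    done = subprocess.run(cmd, shell=shell, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0

# The current interpreter stands in for an external command, so the demo is portable.
two_lines = lines_sketch(sys.executable, ["-c", "print(1); print(2)"])
failed = check_sketch([sys.executable, ["-c", "raise SystemExit(3)"]])
print(two_lines, failed)
# ['1', '2'] False
```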
utz.proc.aio: async subprocess wrappers
Async versions of most utz.proc helpers are also available:
from utz.proc.aio import *
import asyncio
from asyncio import gather
async def test():
    _1, _2, _3, nums = await gather(*[
        run('sleep', '1'),
        run('sleep', '2'),
        run('sleep', '3'),
        lines('seq', '10'),
    ])
    return nums
asyncio.run(test())
# ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
utz.collections: collection/list helpers
from utz import *
# Verify a collection has one element, return it:
singleton(["aaa"]) # ✅ "aaa"
singleton({'a': 1}) # ✅ ('a', 1); works on `dict`s
singleton([("aaa",), ("aaa",)]) # ✅ ("aaa",); dedupes by default (elems must be hashable)
singleton(["aaa", "bbb"]) # ❌ `raise utz.collections.Expected1FoundN("2 elems found: bbb,aaa")`
# `solo`, `one`, and `e1` are aliases for `singleton`:
solo(["aaa"]) # "aaa"
one(["aaa"]) # "aaa"
e1(["aaa"]) # "aaa"
# Filter by a predicate
one([2, 3, 4], lambda n: n % 2) # 3
one([{'a': 1}, {'b': 2}], lambda o: 'a' in o) # {'a': 1}
See also: test_collections.py.
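The semantics above (exactly one element after optional filtering and deduplication) can be sketched in a few lines of plain Python. The names only and ExpectedOne are hypothetical and this is not utz's code; in particular, the error message format and the choice to dedupe only when more than one element remains are assumptions.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")

class ExpectedOne(ValueError):
    """Raised when a collection does not hold exactly one (distinct) element."""

def only(elems: Iterable[T], pred: Optional[Callable[[T], bool]] = None, dedupe: bool = True) -> T:
    # Treat a dict as a collection of (key, value) pairs
    if isinstance(elems, dict):
        elems = elems.items()
    items = [e for e in elems if pred is None or pred(e)]
    # Deduplicate while preserving order; this step requires hashable elements
    if dedupe and len(items) > 1:
        items = list(dict.fromkeys(items))
    if len(items) != 1:
        raise ExpectedOne(f"{len(items)} elems found: {items}")
    return items[0]

print(only(["aaa"]))                                   # aaa
print(only({"a": 1}))                                  # ('a', 1)
print(only([("aaa",), ("aaa",)]))                      # ('aaa',)
print(only([2, 3, 4], lambda n: n % 2 == 1))           # 3
print(only([{"a": 1}, {"b": 2}], lambda o: "a" in o))  # {'a': 1}
```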
utz.env: os.environ wrapper + contextmanager
from utz import env, os
# Temporarily set env vars
with env(FOO='bar'):
    assert os.environ['FOO'] == 'bar'
assert 'FOO' not in os.environ
The env() contextmanager also supports configurable on_conflict and on_exit kwargs, for handling env vars that were patched, then changed while the context was active.
See also: test_env.py.
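As a rough illustration of the set-then-restore pattern, here is a stdlib-only sketch. temp_env is a hypothetical name and the variable UTZ_SKETCH_DEMO is made up for the demo; unlike utz's env(), this sketch has no on_conflict or on_exit handling and simply restores whatever was there on entry.

```python
import os
from contextlib import contextmanager

@contextmanager
def temp_env(**updates):
    """Set env vars for the duration of the block, then restore the previous state."""
    missing = object()  # sentinel: distinguishes "unset" from "set to empty string"
    previous = {key: os.environ.get(key, missing) for key in updates}
    os.environ.update(updates)
    try:
        yield
    finally:
        for key, old in previous.items():
            if old is missing:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old

os.environ.pop("UTZ_SKETCH_DEMO", None)
with temp_env(UTZ_SKETCH_DEMO="bar"):
    inside = os.environ.get("UTZ_SKETCH_DEMO")
after = os.environ.get("UTZ_SKETCH_DEMO")
print(inside, after)
# bar None
```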
utz.fn: decorator/function utilities
utz.decos: compose decorators
from utz import decos
from click import option
common_opts = decos(
    option('-n', type=int),
    option('-v', is_flag=True),
)
@common_opts
def subcmd1(n: int, v: bool):
    ...
@common_opts
def subcmd2(n: int, v: bool):
    ...
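Decorator composition itself needs nothing beyond plain functions. The sketch below uses hypothetical names (compose, tag) and assumes the first decorator listed should end up outermost, matching how stacked @-lines behave; whether utz.decos orders them the same way is not stated here.

```python
def compose(*decorators):
    """Combine several decorators into one; the first listed ends up outermost."""
    def composed(fn):
        for deco in reversed(decorators):
            fn = deco(fn)
        return fn
    return composed

def tag(label):
    """Toy decorator factory: wrap the function's string result in <label>...</label>."""
    def deco(fn):
        def wrapper(*args, **kwargs):
            return f"<{label}>{fn(*args, **kwargs)}</{label}>"
        return wrapper
    return deco

bold_italic = compose(tag("b"), tag("i"))

@bold_italic
def hello():
    return "hi"

print(hello())
# <b><i>hi</i></b>
```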
utz.call: only pass expected kwargs to functions
from utz import call, wraps
def fn1(a, b):
    ...
@wraps(fn1)
def fn2(a, c, **kwargs):
    ...
kwargs = dict(a=11, b='22', c=33, d=44)
call(fn1, **kwargs) # passes {a, b}, not {c, d}
call(fn2, **kwargs) # passes {a, b, c}, not {d}
See also: test_fn.py.
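The general idea of filtering keyword arguments against a function's signature can be sketched with inspect. call_expected is a hypothetical name; this simplified version does not reproduce the wraps-aware behaviour shown above, and it assumes that a function declaring **kwargs should receive everything.

```python
import inspect

def call_expected(fn, **kwargs):
    """Call fn with only the keyword arguments its signature names."""
    params = inspect.signature(fn).parameters
    # A **kwargs parameter accepts anything, so pass everything through
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fn(**kwargs)
    accepted = {key: value for key, value in kwargs.items() if key in params}
    return fn(**accepted)

def area(width, height):
    return width * height

def describe(name, **extra):
    return name, extra

print(call_expected(area, width=2, height=3, color="red"))
# 6
print(call_expected(describe, name="box", color="red"))
# ('box', {'color': 'red'})
```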
utz.jsn: JsonEncoder for datetimes, dataclasses
from utz import dataclass, Encoder, fromtimestamp, json # Convenience imports from standard library
epoch = fromtimestamp(0)
print(json.dumps({ 'epoch': epoch }, cls=Encoder))
# {"epoch": "1969-12-31 19:00:00"}  (fromtimestamp returns local time; this output is from a UTC-5 machine)
print(json.dumps({ 'epoch': epoch }, cls=Encoder("%Y-%m-%d"), indent=2))
# {
# "epoch": "1969-12-31"
# }
@dataclass
class A:
    n: int
print(json.dumps(A(111), cls=Encoder))
# {"n": 111}
See test_jsn.py for more examples.
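For comparison, a custom encoder along these lines can be written directly against json.JSONEncoder. SketchEncoder and Point are hypothetical names, and this is only a sketch: it emits ISO 8601 strings rather than the format shown above, and it does not accept a format string the way Encoder("%Y-%m-%d") does.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass
from datetime import date, datetime, timezone

class SketchEncoder(json.JSONEncoder):
    """Serialize datetimes/dates as ISO 8601 strings and dataclass instances as dicts."""
    def default(self, o):
        if isinstance(o, (datetime, date)):
            return o.isoformat()
        if is_dataclass(o) and not isinstance(o, type):
            return asdict(o)
        return super().default(o)  # raises TypeError for unsupported types

@dataclass
class Point:
    x: int
    y: int

# An explicit UTC timezone keeps the output independent of the local machine
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
encoded = json.dumps({"epoch": epoch, "point": Point(1, 2)}, cls=SketchEncoder)
print(encoded)
# {"epoch": "1970-01-01T00:00:00+00:00", "point": {"x": 1, "y": 2}}
```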
utz.context: {async,}contextmanager helpers
- ctxs: compose contextmanagers
- actxs: compose asynccontextmanagers
- with_exit_hook: wrap a contextmanager's __exit__ method in another contextmanager
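Composing context managers can be illustrated with contextlib.ExitStack from the standard library. compose_ctxs and step are hypothetical names; this sketch shows the general technique and makes no claim about how ctxs is implemented or what it yields.

```python
from contextlib import ExitStack, contextmanager

@contextmanager
def compose_ctxs(*cms):
    """Enter each context manager in order; exit them in reverse order on the way out."""
    with ExitStack() as stack:
        yield tuple(stack.enter_context(cm) for cm in cms)

events = []

@contextmanager
def step(name):
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        events.append(f"exit {name}")

with compose_ctxs(step("a"), step("b")) as (a, b):
    events.append(f"body {a}{b}")

print(events)
# ['enter a', 'enter b', 'body ab', 'exit b', 'exit a']
```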
utz.cli: click helpers
utz.cli provides wrappers around click.option for parsing common option formats:
- @count: "count" options, including optional value mappings (e.g. -v → "info", -vv → "debug")
- @multi: parse comma-delimited values (or other delimiter), with optional value-parse callback (e.g. -a1,2 -a3 → (1,2,3))
- @num: parse numeric values, including human-readable SI/IEC suffixes (e.g. 10k → 10_000)
- @obj: parse dictionaries from multi-value options (e.g. -eFOO=BAR -eBAZ=QUX → dict(FOO="BAR", BAZ="QUX"))
- @incs/@excs: construct an Includes or Excludes object for regex-filtering of string arguments
- @inc_exc: combination of @incs and @excs; constructs an Includes or Excludes for regex-filtering of strings, from two (mutually-exclusive) options
- @opt, @arg, @flag: wrappers for click.{option,argument}, option(is_flag=True)
Examples:
# cli.py
from utz.cli import cmd, count, incs, multi, num, obj
from utz import Includes, Literal
@cmd  # Alias for `click.command`
@multi('-a', '--arr', parse=int, help="Comma-separated integers")
@obj('-e', '--env', help='Env vars, in the form `k=v`')
@incs('-i', '--include', 'includes', help="Only print `env` keys that match one of these regexs")
@num('-m', '--max-memory', help='Max memory size (e.g. "100m")')
@count('-v', '--verbosity', values=['warn', 'info', 'debug'], help='0x: "warn", 1x: "info", 2x: "debug"')
def main(
    arr: tuple[int, ...],
    env: dict[str, str],
    includes: Includes,
    max_memory: int,
    verbosity: Literal['warn', 'info', 'debug'],
):
    filtered_env = { k: v for k, v in env.items() if includes(k) }
    print(f"{arr} {filtered_env} {max_memory} {verbosity}")
if __name__ == '__main__':
    main()
Saving the above as cli.py and running will yield:
python cli.py -a1,2 -a3 -eAAA=111 -eBBB=222 -eccc=333 -i[A-Z] -m10k
# (1, 2, 3) {'AAA': '111', 'BBB': '222'} 10000 warn
python cli.py -m 1Gi -v
# () {} 1073741824 info
from utz.cli import arg, cmd, inc_exc, multi
from utz.rgx import Patterns
@cmd
@inc_exc(
    multi('-i', '--include', help="Print arguments iff they match at least one of these regexs; comma-delimited, and can be passed multiple times"),
    multi('-x', '--exclude', help="Print arguments iff they don't match any of these regexs; comma-delimited, and can be passed multiple times"),
)
@arg('vals', nargs=-1)
def main(patterns: Patterns, vals: tuple[str, ...]):
    print(' '.join([ val for val in vals if patterns(val) ]))
if __name__ == '__main__':
    main()
Saving the above as cli.py and running will yield:
python cli.py -i a.,b aa bc cb c a AA B
# aa bc cb
python cli.py -x a.,b aa bc cb c a AA B
# c a AA B
See test_cli for more examples.
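The suffix handling that @num performs (10k → 10000, 1Gi → 1073741824 in the runs above) can be approximated with a small stdlib parser. parse_size and the factor tables are hypothetical; the exact set of suffixes utz accepts and its case sensitivity are assumptions here, not documented behaviour.

```python
import re

# Assumed suffix tables: SI uses powers of 1000, IEC uses powers of 1024
SI_FACTORS = {"k": 10**3, "m": 10**6, "g": 10**9, "t": 10**12}
IEC_FACTORS = {"ki": 2**10, "mi": 2**20, "gi": 2**30, "ti": 2**40}
FACTORS = {"": 1, **SI_FACTORS, **IEC_FACTORS}

def parse_size(text: str) -> int:
    """Parse strings like "10k", "1Gi" or "42" into an integer (suffixes are case-insensitive)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", text.strip())
    if match is None:
        raise ValueError(f"Unrecognized size: {text!r}")
    number, suffix = match.group(1), match.group(2).lower()
    if suffix not in FACTORS:
        raise ValueError(f"Unknown suffix in {text!r}")
    return int(float(number) * FACTORS[suffix])

print(parse_size("10k"), parse_size("1Gi"), parse_size("42"))
# 10000 1073741824 42
```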
utz.mem: memray wrapper
Use memray to profile memory allocations, extract stats, flamegraph HTML, and peak memory use:
from utz.mem import Tracker
from utz import iec
with (tracker := Tracker()):
    nums = list(sorted(range(1_000_000, 0, -1)))
peak_mem = tracker.peak_mem
print(f'Peak memory use: {peak_mem:,} ({iec(peak_mem)})')
# Peak memory use: 48,530,432 (46.3 MiB)
utz.time: Time timer, now/today helpers
Time: minimal timer class
from utz import Time, sleep
time = Time()
time("step 1")
sleep(1)
time("step 2") # Ends "step 1" timer
sleep(1)
time() # Ends "step 2" timer
print(f'Step 1 took {time["step 1"]:.1f}s, step 2 took {time["step 2"]:.1f}s.')
# Step 1 took 1.0s, step 2 took 1.0s.
# contextmanager timers can overlap/contain others
with time("run"):  # ≈2s
    time("sleep-1")  # ≈1s
    sleep(1)
    time("sleep-2")  # ≈1s
    sleep(1)
print(f'Run took {time["run"]:.1f}s')
# Run took 2.0s
now, today
now and today are wrappers around datetime.datetime.now that expose convenient functions:
from utz import now, today
now() # 2024-10-11T15:43:54Z
today() # 2024-10-11
now().s # 1728661583
now().ms # 1728661585952
Use in conjunction with utz.bases codecs for easy timestamp-nonces:
from utz import b62, now
b62(now().s) # A18Q1l
b62(now().ms) # dZ3fYdS
b62(now().us) # G31Cn073v
Sample values for various units and codecs:
unit | b62 | b64 | b90 |
---|---|---|---|
s | A2kw7P | +aYIh1 | :Kn>H |
ds | R7FCrj | D8oM9b | "tn_BH |
cs | CCp7kK0 | /UpIuxG | =Fc#jK |
ms | dj4u83i | MFSOKhy | #8;HF8g |
us | G6cozJjWb | 385u0dp8B | D>$y/9Hr |
(generated by time-slug-grid.py)
utz.size: humanize.naturalsize wrapper
iec wraps humanize.naturalsize, printing IEC-formatted sizes by default, to 3 sigfigs:
from utz import iec
iec(2**30 + 2**29 + 2**28 + 2**27)
# '1.88 GiB'
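To make the IEC formatting concrete without the humanize dependency, here is a stdlib-only approximation. iec_sketch is a hypothetical name; it targets roughly three significant figures by picking the number of decimals from the magnitude, which may differ from what iec prints in edge cases.

```python
def iec_sketch(num_bytes: int) -> str:
    """Format a byte count with binary (1024-based) units, to roughly 3 significant figures."""
    value = float(num_bytes)
    unit = "B"
    for next_unit in ("KiB", "MiB", "GiB", "TiB", "PiB"):
        if value < 1024:
            break
        value /= 1024
        unit = next_unit
    if unit == "B":
        return f"{int(value)} B"
    # Fewer decimals as the integer part grows: 1.88, 46.3, 512
    decimals = 2 if value < 10 else 1 if value < 100 else 0
    return f"{value:.{decimals}f} {unit}"

print(iec_sketch(2**30 + 2**29 + 2**28 + 2**27))
# 1.88 GiB
print(iec_sketch(48_530_432))
# 46.3 MiB
```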
utz.hash_file: hash file contents
from utz import hash_file
hash_file("path/to/file") # sha256 by default
hash_file("path/to/file", 'md5')
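A helper of this shape is typically a thin loop over hashlib. The sketch below uses a hypothetical hash_file_sketch with an assumed chunk_size parameter; it illustrates the technique (streaming the file in chunks so large files need not fit in memory) rather than utz's actual code.

```python
import hashlib
import os
import tempfile

def hash_file_sketch(path, algo="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file's contents, reading it in fixed-size chunks."""
    hasher = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

# Demo on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "wb") as f:
        f.write(b"hello")
    sha256_digest = hash_file_sketch(demo_path)
    sha1_digest = hash_file_sketch(demo_path, "sha1")

print(sha256_digest == hashlib.sha256(b"hello").hexdigest())
# True
```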
utz.ym: YM (year/month) class
The YM class represents a year/month, e.g. 202401 for January 2024.
from utz import YM
ym = YM(202501) # Jan '25
assert ym + 1 == YM(202502) # Add one month
assert YM(202502) - YM(202406) == 8 # subtract months
YM(202401).until(YM(202501)) # 202401, 202402, ..., 202412
# `YM` constructor accepts several representations:
assert all(ym == YM(202401) for ym in [
    YM(202401),
    YM('202401'),
    YM('2024-01'),
    YM(2024, 1),
    YM(y=2024, m=1),
    YM(dict(year=2024, month=1)),
    YM(YM(202401)),
])
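The month arithmetic above becomes simple once a YYYYMM value is mapped to a running month index. The functions below are hypothetical helpers operating on plain integers, a sketch of the underlying arithmetic rather than the YM class itself.

```python
def ym_to_index(ym: int) -> int:
    """Map YYYYMM (e.g. 202401) to a count of months since year 0."""
    year, month = divmod(ym, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"Invalid month in {ym}")
    return year * 12 + (month - 1)

def index_to_ym(index: int) -> int:
    """Inverse of ym_to_index."""
    year, month0 = divmod(index, 12)
    return year * 100 + month0 + 1

def add_months(ym: int, n: int) -> int:
    return index_to_ym(ym_to_index(ym) + n)

def months_between(later: int, earlier: int) -> int:
    return ym_to_index(later) - ym_to_index(earlier)

print(add_months(202501, 1))           # 202502
print(add_months(202412, 1))           # 202501 (rolls over the year boundary)
print(months_between(202502, 202406))  # 8
```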
utz.cd: "change directory" contextmanagers
from utz import cd, cd_tmpdir
with cd('..'):
    # Inside parent dir
    ...
# Back in original dir
with cd('a/b/c', mk=True):
    # Moved into a/b/c (created it if it didn't exist)
    ...
with cd_tmpdir(dir='.', name='my_tmpdir') as tmpdir:
    # Inside a temporary subdirectory of previous working directory, with basename `my_tmpdir`
    ...
See also: test_cd.py.
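The core of a "change directory" context manager is a chdir paired with a guaranteed restore. cd_sketch below is a hypothetical stdlib-only version; the mk flag mirrors the usage shown above, but the implementation details are assumptions.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def cd_sketch(path, mk=False):
    """Change into `path` for the duration of the block, then return to the original directory."""
    if mk:
        os.makedirs(path, exist_ok=True)
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # restore even if the block raised

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with cd_sketch(os.path.join(tmp, "a", "b"), mk=True):
        inside = os.getcwd()
back = os.getcwd()
print(back == start)
# True
```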
utz.gzip: deterministic GZip helpers
from utz import deterministic_gzip_open, hash_file
with deterministic_gzip_open('a.gz', 'w') as f:
    f.write('\n'.join(map(str, range(10))))
hash_file('a.gz') # dfbe03625c539cbc2a2331d806cc48652dd3e1f52fe187ac2f3420dbfb320504
See also: test_gzip.py.
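Gzip output normally varies between runs because the header embeds a modification time (and often the file name). One common way to get reproducible bytes with the standard library is to pin both, as sketched below; deterministic_gzip_bytes is a hypothetical name, and whether utz uses this exact approach is an assumption.

```python
import gzip
import io

def deterministic_gzip_bytes(data: bytes) -> bytes:
    """Compress `data` so that identical input always yields identical output bytes."""
    buf = io.BytesIO()
    # mtime=0 pins the header timestamp; filename="" keeps any file name out of the header
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()

payload = "\n".join(map(str, range(10))).encode()
first = deterministic_gzip_bytes(payload)
second = deterministic_gzip_bytes(payload)
print(first == second)
# True
```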
utz.s3: S3 utilities
- client(): cached boto3 S3 client
- parse_bkt_key(args: tuple[str, ...]) -> tuple[str, str]: parse bucket and key from s3:// URL or separate arguments
- get_etag(*args: str, err_ok: bool = False, strip: bool = True) -> str | None: get ETag of S3 object
- get_etags(*args: str) -> dict[str, str]: get ETags for all objects with the given prefix
- atomic_edit(...) -> Iterator[str]: context manager for atomically editing S3 objects
from utz import s3, pd
url = 's3://bkt/key.parquet'
# `url`'s ETag is snapshotted on initial read
with s3.atomic_edit(url) as out_path:
    df = pd.read_parquet(url)
    df.sort_index(inplace=True)
    df.to_parquet(out_path)
# On contextmanager exit, `out_path` is uploaded to `url`, iff
# `url`'s ETag hasn't changed (no concurrent update has occurred).
utz.plot: Plotly helpers
Helpers for Plotly transformations I make frequently, e.g.:
from utz import plot
import plotly.express as px
fig = px.bar(x=[1, 2, 3], y=[4, 5, 6])
plot(
    fig,
    name='my-plot',  # Filename stem; saves my-plot.png, my-plot.json, and optionally my-plot.html
    title=['Some Title', 'Some subtitle'],  # Plot title, followed by "subtitle" line(s) (smaller font, just below)
    bg='white', xgrid='#ccc',  # white background, grey x-gridlines
    hoverx=True,  # show x-values on hover
    x="X-axis title",  # x-axis title or configs
    y=dict(title="Y-axis title", zerolines=True),  # y-axis title or configs
    # ...
)
Example usages: hudcostreets/nj-crashes, ryan-williams/arrayloader-benchmarks.
utz.setup: setup.py helper
utz/setup.py provides defaults for various setuptools.setup() params:
- name: use parent directory name
- version: parse from git tag (otherwise from git describe --tags)
- install_requires: read requirements.txt
- author_{name,email}: infer from last commit
- long_description: parse README.md (and set long_description_content_type)
- description: parse first <p> under opening <h1> from README.md
- license: parse from LICENSE file (MIT and Apache v2 supported)
For an example, see gsmo==0.0.1 (and corresponding release).
This library also "self-hosts" using utz.setup; see pyproject.toml:
[build-system]
requires = ["setuptools", "utz[setup]==0.4.2", "wheel"]
build-backend = "setuptools.build_meta"
and setup.py:
from utz.setup import setup
extras_require = {
    # …
}
# Various fields auto-populated from git, README.md, requirements.txt, …
setup(
    name="utz",
    version="0.8.0",
    extras_require=extras_require,
    url="https://github.com/runsascoded/utz",
    python_requires=">=3.10",
)
The setup helper can be installed via a pip "extra":
pip install utz[setup]
utz.test: dataclass test cases, raises helper
utz.parametrize: pytest.mark.parametrize wrapper, accepts dataclass instances
from utz import parametrize
from dataclasses import dataclass
def fn(f: float, fmt: str) -> str:
    """Example function, to be tested with ``case``s below."""
    return f"{f:{fmt}}"
@dataclass
class case:
    """Container for a test-case; float, format, and expected output."""
    f: float
    fmt: str
    expected: str
    @property
    def id(self):
        return f"fmt-{self.f}-{self.fmt}"
@parametrize(
    case(1.23, "0.1f", "1.2"),
    case(123.456, "0.1e", "1.2e+02"),
    case(-123.456, ".0f", "-123"),
)
def test_fn(f, fmt, expected):
    """Example test, "parametrized" by several ``case``s."""
    assert fn(f, fmt) == expected
test_parametrize.py contains more examples, customizing test "ID"s, adding parameter sweeps, etc.
utz.raises: pytest.raises wrapper, match a regex or multiple strings
utz.tmpdir
from utz import TmpDir, tmp_ensure_dir, TmpPath
# ``TemporaryDirectory`` wrapper that creates ``dir`` (and parents), if necessary (and removes any dirs it created, on exit)
# Also adds support for specifying exact basename, via ``name`` kwarg.
with TmpDir(dir='nested/subdir', name='basename') as tmpdir:
    ...
# Yields a path with the requested basename, inside a ``TemporaryDirectory``.
# As with ``TmpDir``, ``dir`` (and parents) will be created, if necessary (and removed on exit, leaving the filesystem in the same state it started in)
with TmpPath('basename.txt', dir='nested/subdir') as tmppath:
    ...
# Multiple right-most path components can be specified exactly.
with TmpPath('dir1/dir2/basename.txt', dir='nested/subdir') as tmppath:
    ...
# Used by ``TmpDir``/``TmpPath`` above, creates ``dir`` (and parents), if necessary (and removes any dirs it created, on exit)
with tmp_ensure_dir(dir='nested/subdir'):
    ...
See also: test_tmpdir.py.
utz.docker, utz.bases, etc.
Misc other modules:
- bases: encode/decode in various bases (62, 64, 90, …)
- escape: split/join on an arbitrary delimiter, with backslash-escaping; utz.esc escapes a specific character in a string.
- ctxs: compose contextmanagers
- o: dict wrapper exposing keys as attrs (e.g.: o({'a':1}).a == 1)
) - docker: DSL for programmatically creating Dockerfiles (and building images from them)
- tmpdir: make temporary directories with a specific basename
- ssh: SSH tunnel wrapped in a context manager
- backoff: exponential-backoff utility
- git: Git helpers, wrappers around GitPython
- pnds: pandas imports and helpers
Examples / Users
Some repos that use utz: