Layered Python utility package — datetime parsing, logging, safe filesystem ops, mojibake fixing, lint runners, media tools, LLM wrappers. Stdlib-only base.
emmykit
Personal Python utility kit: 184 importable functions, classes, and constants across 32 submodules
in 9 dependency layers (README highlights the user-facing surface — internal punctuation, frozenset
aliases, probe-target lists, and translation tables are referenced by section rather than enumerated).
Base install is stdlib-only; heavier helpers
(datetime parsing via numpy/pandas/dateutil, mojibake fixing via ftfy, lint runners,
LLM wrappers, ffmpeg/VLC controls) are gated behind optional extras so a bare
import emmykit is fast and side-effect-free.
Install
pip install emmykit # base — stdlib only
pip install 'emmykit[all]' # all optional extras
pip install 'emmykit[datetime]' # pick a single extra group
uv add emmykit # base
uv add 'emmykit[all]' # all optional extras
uv add 'emmykit[datetime]' # pick a single extra group
Available extras groups: datetime, text, lint, llm, media, files, inflection, html, all.
Quick start
import emmykit as ek
print(ek.human_bytesize(1024**3)) # "1.0 GiB"
ts = ek.parse_datetime("2026-06-06T12:34:56Z")
print(ek.my_capitalize("hello world")) # "Hello world"
Table of contents
constants — ANSI colors, unicode punctuation, default encoding, ignore-lists
extensions — File-extension lookup tables (audio / video / image / book / text / html / playlist / archive / subtitle)
embedded_scripts — Pre-packaged helper-script source-strings
_version — Package and Python version constants
options — Options dataclasses for configuration
inflect_utils — Grammar + pluralization helpers
logging_utils — Logging configuration and custom handlers
paths_ensure — Path normalization
safe_paths — Exception-swallowing filesystem queries
file_io — Atomic file-write helper
io_subprocess — Subprocess wrappers + critical-error reporter
prompts — Interactive Y/N + multi-choice prompts
introspection — AST + source-code reflection
humanize — Human-readable number formatting
numeric_helpers — Numeric parsing + unit-to-seconds conversion
datetime_utils — Date / time parsing, formatting, timezone handling
json_io — JSON serialization + dataclass conversion
diff_view — Diff rendering with visible whitespace
text — Mojibake fixing, encoding detection, casing helpers
hosts — Hostname + computer-name detection
network — Internet-connectivity probes
python_env — Python version + shell-environment detection
files — Checksums, downloads, filename formatting, free-space queries
lint — flake8 / autopep8 / mypy interactive runners + multireplace
treeview — Directory tree with new-file highlighting
docker_utils — Docker daemon + image lifecycle helpers
system — OS-level process + resource helpers
media — Video / audio helpers (ffmpeg, VLC, system volume)
html_files — HTML filename munging + multi-file combination
llm — LLM wrapper, config dataclasses, model selection
API reference
constants — ANSI colors, unicode punctuation, default encoding, ignore-lists
Layer 0. from emmykit.constants import …
Terminal escape codes, curly quotes, the em-dash, the package's UTF-8 default, the set of errno codes treated as benign by safe_*, and the flake8/autopep8 codes Emmy deliberately ignores.
DEFAULT_EXCLUDE_DIRS — set[str] (6 items)
DEFAULT_EXCLUDE_DIRS: set[str] = {'.git', '.venv', '__pycache__', 'build', 'dist', 'venv'}
ANSI color escapes — 5 terminal-escape strings: ANSI_CYAN / GREEN / RED / RESET / YELLOW.
Includes: ANSI_CYAN, ANSI_GREEN, ANSI_RED, ANSI_RESET, ANSI_YELLOW.
IGNORED_CODES — flake8 + autopep8 codes Emmy deliberately ignores.
IGNORE_THESE_ERRORS — errno codes treated as benign by safe_* helpers.
extensions — File-extension lookup tables (audio / video / image / book / text / html / playlist / archive / subtitle)
Layer 0. from emmykit.extensions import …
Lists and frozensets of common file extensions per media kind, plus an ALL_KNOWN_EXTENSIONS umbrella and the canonical TEXT_ENCODINGS ordering used by my_fopen when sniffing.
Each *_EXTENSIONS list has a *_EXTENSIONS_SET frozenset alias for fast membership tests.
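As a rough illustration of the tuple-plus-frozenset pairing described above, here is a stdlib-only sketch. The names DEMO_EXTENSIONS, DEMO_EXTENSIONS_SET, and is_demo_file are hypothetical and not part of emmykit; the real tables are far larger.

```python
from pathlib import Path
from typing import Final

# Hypothetical miniature table; the ordered tuple is the canonical listing.
DEMO_EXTENSIONS: Final[tuple[str, ...]] = (".html", ".htm", ".xhtml")
# Frozenset alias: average O(1) membership tests instead of a linear scan.
DEMO_EXTENSIONS_SET: Final[frozenset[str]] = frozenset(DEMO_EXTENSIONS)


def is_demo_file(name: str) -> bool:
    """Return True if the file name ends in one of the demo extensions."""
    return Path(name).suffix.lower() in DEMO_EXTENSIONS_SET


print(is_demo_file("index.HTML"))  # True
print(is_demo_file("notes.txt"))   # False
```

The tuple keeps a stable, documented order, while the frozenset is the form to use inside hot loops such as directory scans.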
ALL_KNOWN_EXTENSIONS — Final[tuple[str, ...]] (979 items)
ALL_KNOWN_EXTENSIONS: Final[tuple[str, ...]] = ('.py', '.pyw', '.html', '.htm', '.xhtml', '.txt', '.csv', '.json', '.xml', '.adoc',
'.asciidoc', '.bib', '.cfg', '.conf', '.ini', '.log', '.md', '.markdown',
'.properties', '.rtf', '.rst', ...)
ARCHIVE_EXTENSIONS — Final[tuple[str, ...]] (376 items)
ARCHIVE_EXTENSIONS: Final[tuple[str, ...]] = ('.zip', '.rar', '.7z', '.tar', '.gz', '.tgz', '.bz2', '.xz', '.tbz2', '.tz2', '.lzma',
'.lz', '.xpi', '.crx', '.zst', '.cab', '.arj', '.ace', '.uue', '.zoo', '.jar', '.war',
'.ear', '.iso', ...)
AUDIO_EXTENSIONS — Final[tuple[str, ...]] (112 items)
AUDIO_EXTENSIONS: Final[tuple[str, ...]] = ('.mp3', '.wav', '.flac', '.aac', '.ogg', '.wma', '.m4a', '.alac', '.aiff', '.opus',
'.amr', '.pcm', '.au', '.raw', '.dts', '.ac3', '.mka', '.mpc', '.vqf', '.ape', '.shn',
'.ra', '.rm', '.oga', ...)
BOOK_EXTENSIONS — Final[tuple[str, ...]] (63 items)
BOOK_EXTENSIONS: Final[tuple[str, ...]] = ('.epub', '.pdf', '.txt', '.rtf', '.html', '.htm', '.xhtml', '.doc', '.docx', '.odt',
'.azw', '.azw1', '.azw3', '.azw4', '.azw6', '.kfx', '.mobi', '.prc', '.tpz', '.ibooks',
'.fb2', '.fbz', ...)
HTML_EXTENSIONS — Final[tuple[str, ...]] (3 items)
HTML_EXTENSIONS: Final[tuple[str, ...]] = ('.html', '.htm', '.xhtml')
IMAGE_EXTENSIONS — Final[tuple[str, ...]] (127 items)
IMAGE_EXTENSIONS: Final[tuple[str, ...]] = ('.bmp', '.dib', '.gif', '.jpeg', '.jpg', '.jpe', '.jfif', '.pjpeg', '.pjp', '.png',
'.pbm', '.pgm', '.ppm', '.pnm', '.pam', '.tif', '.tiff', '.sgi', '.rgb', '.tga',
'.hdr', '.exr', '.webp', ...)
PLAYLIST_EXTENSIONS — Final[tuple[str, ...]] (28 items)
PLAYLIST_EXTENSIONS: Final[tuple[str, ...]] = ('.m3u', '.m3u8', '.pls', '.xspf', '.asx', '.wpl', '.zpl', '.b4s', '.cue', '.smil',
'.smi', '.ram', '.wax', '.wmx', '.wvx', '.fpl', '.mpcpl', '.dpl', '.aimppl',
'.aimppl4', '.pla', '.xml', ...)
PYTHON_EXTENSIONS — Final[tuple[str, ...]] (2 items)
PYTHON_EXTENSIONS: Final[tuple[str, ...]] = ('.py', '.pyw')
SUBTITLE_EXTENSIONS — Final[tuple[str, ...]] (47 items)
SUBTITLE_EXTENSIONS: Final[tuple[str, ...]] = ('.srt', '.sub', '.idx', '.ass', '.ssa', '.vtt', '.ttml', '.dfxp', '.smi', '.smil',
'.usf', '.psb', '.mks', '.lrc', '.stl', '.pjs', '.rt', '.aqt', '.gsub', '.jss', '.dks',
'.mpl2', '.sbt', ...)
TEXT_ENCODINGS — Final[tuple[str, ...]] (146 items)
TEXT_ENCODINGS: Final[tuple[str, ...]] = ('utf-8', 'latin-1', 'ascii', 'iso-8859-1', 'big5', 'utf-8-sig', 'utf-16', 'utf-16-be',
'utf-16-le', 'utf-32', 'utf-32-be', 'utf-32-le', 'cp1252', 'cp1251', 'cp1250',
'cp1253', 'cp1254', ...)
TEXT_EXTENSIONS — Final[tuple[str, ...]] (143 items)
TEXT_EXTENSIONS: Final[tuple[str, ...]] = ('.txt', '.html', '.htm', '.csv', '.json', '.xml', '.adoc', '.asciidoc', '.bib', '.cfg',
'.conf', '.ini', '.log', '.md', '.markdown', '.properties', '.rtf', '.rst', '.sgm',
'.sgml', '.tex', ...)
VIDEO_EXTENSIONS — Final[tuple[str, ...]] (133 items)
VIDEO_EXTENSIONS: Final[tuple[str, ...]] = ('.mp4', '.mkv', '.mov', '.avi', '.mpg', '.mpeg', '.wmv', '.m4v', '.flv', '.divx',
'.vob', '.iso', '.3gp', '.webm', '.mts', '.m2ts', '.ts', '.ogv', '.rm', '.rmvb',
'.asf', '.f4v', '.mxf', ...)
embedded_scripts — Pre-packaged helper-script source-strings
Layer 0. from emmykit.embedded_scripts import …
Multi-kilobyte Python script literals shipped as importable strings, used by Emmy's external automation to install drop-in helpers into other projects.
7 embedded helper scripts — Multi-KB Python script source strings shipped as importable constants.
Includes: MULTIREPLACE_SCRIPT, MYAUDIT_SCRIPT, MYDIFF_SCRIPT, PRINTALL_SCRIPT, SETUP_CARTOPY_SCRIPT, TREEVIEW_SCRIPT, UNIV_DEFS_SYS_PATH_SCRIPT.
_version — Package and Python version constants
Layer 1. from emmykit._version import …
Single source of truth for emmykit.__version__ (read by hatchling at build-time) and the supported PY_VERSION floor.
options — Options dataclasses for configuration
Layer 1. from emmykit.options import …
Aggregated runtime/plot-time settings passed through downstream APIs as a single object.
Options — Class that has all global options in one place.
Options() -> 'None'
Class that has all global options in one place.
inflect_utils — Grammar + pluralization helpers
Layer 1. from emmykit.inflect_utils import …
Lazy wrapper over the inflect package with a stdlib-only fallback table for common nouns when the extra isn't installed.
InflectEngine — Protocol for the 'inflect' library's engine interface.
InflectEngine(*args, **kwargs)
Protocol for the 'inflect' library's engine interface.
Public methods: plural, plural_noun.
my_plural — Return a pluralized version of 'word' preceded by 'n'.
my_plural(n: 'int', word: 'str') -> 'str'
Return a pluralized version of 'word' preceded by 'n'.
Behavior:
- If the open-source 'inflect' library is available, use it for pluralization.
- Otherwise, fall back to a casefold()-based irregulars table, some uncountables,
and a small set of morphological rules.
Examples (fallback behavior):
1 millennium -> "1 millennium"
2 millennium -> "2 millennia"
2 millenium -> "2 millennia" # (handles the common misspelling too)
Args:
n: The quantity of the item.
word: The singular form of the item.
Returns:
A string in the format "{n} {pluralized_word}".
Raises:
None.
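The fallback path described above might look roughly like the following sketch. It is not the package's implementation: plural_fallback and its two tables are hypothetical, heavily reduced stand-ins, and the morphological rules are only a small sample.

```python
# Hypothetical, heavily reduced tables; the package's own tables are larger.
_IRREGULARS = {
    "millennium": "millennia",
    "millenium": "millennia",  # common misspelling
    "child": "children",
}
_UNCOUNTABLE = {"sheep", "fish", "information"}


def plural_fallback(n: int, word: str) -> str:
    """Return "{n} {word}" with the word pluralized unless n == 1."""
    if n == 1:
        return f"{n} {word}"
    key = word.casefold()  # case-insensitive table lookups
    if key in _UNCOUNTABLE:
        return f"{n} {word}"
    if key in _IRREGULARS:
        return f"{n} {_IRREGULARS[key]}"
    # A few simple morphological rules, checked in order.
    if key.endswith(("s", "x", "z", "ch", "sh")):
        return f"{n} {word}es"
    if key.endswith("y") and len(key) > 1 and key[-2] not in "aeiou":
        return f"{n} {word[:-1]}ies"
    return f"{n} {word}s"


print(plural_fallback(1, "millennium"))  # 1 millennium
print(plural_fallback(2, "millenium"))   # 2 millennia
print(plural_fallback(3, "box"))         # 3 boxes
```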
logging_utils — Logging configuration and custom handlers
Layer 1. from emmykit.logging_utils import …
Drop-in configure_logging, level-filtering handlers, an in-memory ring-buffer handler, and an introspection helper (return_method_name) used throughout the package for self-naming log messages.
configure_logging — Configure logging to write to files and stdout/stderr, and return a MemoryHandler to capture ERROR logs for later (dupl…
configure_logging(basename: 'str', log_level: 'int | str' = 20, rawlog: 'bool' = False, logdir: 'str | os.PathLike[str]' = '') -> 'MemoryHandler | None'
Configure logging to write to files and stdout/stderr, and return a MemoryHandler to capture ERROR logs for later (duplicate) printing.
Args:
basename : Base name for the log files.
log_level: Logging level (default: logging.INFO).
rawlog : If True, use a simple log format without timestamps or levels.
logdir : Directory to store log files. Defaults to './logs'.
Returns:
MemoryHandler instance capturing ERROR logs, or None if log files couldn't be created.
Raises:
None (file creation errors are caught and logged to stdout).
fallback_logging_config — Configure the root logger with a basic configuration if no handlers are set.
fallback_logging_config(log_level: 'int | str' = 20, rawlog: 'bool' = False) -> 'None'
Configure the root logger with a basic configuration if no handlers are set.
Run this at the start of functions which might be run without first configuring logging.
Args:
log_level: The logging level to set. Defaults to logging.INFO.
rawlog : If True, use a simple log format without timestamps or levels.
FlushingStreamHandler — A logging handler that flushes the stream after emitting each log so the logs are immediately visible.
FlushingStreamHandler(stream=None)
A logging handler that flushes the stream after emitting each log so the logs are immediately visible.
Public methods: acquire, addFilter, close, createLock, emit, filter, flush, format, get_name, handle, handleError, release, removeFilter, setFormatter, setLevel, setStream, set_name.
MaxLevelFilter — A logging filter that only allows logs up to a certain level to pass through, so that error messages aren't printed mul…
MaxLevelFilter(max_level: 'int') -> 'None'
A logging filter that only allows logs up to a certain level to pass through, so that error messages aren't printed multiple times.
Public methods: filter.
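A minimal sketch of how a max-level filter can keep errors from being printed twice, using only the stdlib logging module. MaxLevelSketch is a hypothetical stand-in, and treating max_level as inclusive is an assumption; the package's class may differ in detail.

```python
import io
import logging


class MaxLevelSketch(logging.Filter):
    """Pass only records at or below max_level (hypothetical stand-in)."""

    def __init__(self, max_level: int) -> None:
        super().__init__()
        self.max_level = max_level

    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno <= self.max_level


out, err = io.StringIO(), io.StringIO()
low = logging.StreamHandler(out)
low.addFilter(MaxLevelSketch(logging.WARNING))  # INFO and WARNING only
high = logging.StreamHandler(err)
high.setLevel(logging.ERROR)  # ERROR and above only

demo_log = logging.getLogger("maxlevel_demo")
demo_log.setLevel(logging.INFO)
demo_log.propagate = False  # keep the demo isolated from the root logger
demo_log.addHandler(low)
demo_log.addHandler(high)

demo_log.info("starting")
demo_log.error("boom")
print(repr(out.getvalue()))  # 'starting\n'
print(repr(err.getvalue()))  # 'boom\n'
```

Without the filter, the error record would reach both handlers and appear in both streams.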
MemoryHandler — A logging handler that stores logs in memory so the errors can be printed at the end.
MemoryHandler(level: 'int' = 40) -> 'None'
A logging handler that stores logs in memory so the errors can be printed at the end.
Public methods: acquire, addFilter, close, createLock, emit, filter, flush, format, get_name, handle, handleError, release, removeFilter, setFormatter, setLevel, set_name.
print_all_errors — Print all the captured error messages.
print_all_errors(memory_handler: 'MemoryHandler', rawlog: 'bool' = False) -> 'None'
Print all the captured error messages.
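The capture-then-replay pattern behind MemoryHandler and print_all_errors can be sketched with a plain logging.Handler subclass. ErrorBufferSketch and print_buffered_errors are hypothetical names for illustration only; the real classes may store records differently.

```python
import logging


class ErrorBufferSketch(logging.Handler):
    """Keep formatted records in a list so they can be replayed later."""

    def __init__(self, level: int = logging.ERROR) -> None:
        super().__init__(level)
        self.messages: list = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(self.format(record))


def print_buffered_errors(handler: ErrorBufferSketch) -> None:
    """Print every captured message in the order it was logged."""
    for message in handler.messages:
        print(message)


buffer = ErrorBufferSketch()
memory_log = logging.getLogger("memory_demo")
memory_log.setLevel(logging.INFO)
memory_log.propagate = False  # keep the demo isolated from the root logger
memory_log.addHandler(buffer)

memory_log.info("not captured")      # below the handler's ERROR threshold
memory_log.error("disk full")
memory_log.critical("giving up")
print_buffered_errors(buffer)
```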
return_method_name — Return the caller's qualified method/function name.
return_method_name(levels_up: 'int' = 1) -> 'str'
Return the caller's qualified method/function name.
- For instance methods: ClassName.method
- For classmethods: ClassName.method
- For staticmethods: ClassName.method on Python >= 3.11 (via co_qualname),
otherwise just 'method' (class is not recoverable without heuristics)
- For functions: function
Args:
levels_up: How many frames up to inspect (1 = caller). If greater than
the stack depth, the highest available frame is used.
Returns:
The current method name as a string, formatted as 'ClassName.method' or 'function'.
Raises:
None: This function does not raise exceptions, but it may log warnings
if sys._getframe or inspect fails.
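One way to obtain a caller's qualified name, loosely following the rules listed above, is sketched below. caller_name_sketch is a hypothetical helper, not the package's code; it relies on sys._getframe, which is CPython-specific, and on co_qualname, which only exists on Python 3.11 and later.

```python
import sys


def caller_name_sketch(levels_up: int = 1) -> str:
    """Return the qualified name of the function `levels_up` frames above."""
    frame = sys._getframe(1)  # the direct caller
    for _ in range(levels_up - 1):
        if frame.f_back is None:  # stack exhausted: use the highest frame
            break
        frame = frame.f_back
    code = frame.f_code
    # Python >= 3.11: co_qualname already includes the class name.
    qualname = getattr(code, "co_qualname", None)
    if qualname is not None:
        return qualname
    # Older Pythons: recover the class from `self` or `cls` when present.
    owner = frame.f_locals.get("self")
    if owner is not None:
        return f"{type(owner).__name__}.{code.co_name}"
    owner = frame.f_locals.get("cls")
    if isinstance(owner, type):
        return f"{owner.__name__}.{code.co_name}"
    return code.co_name


class Greeter:
    def hello(self) -> str:
        return caller_name_sketch()


def plain() -> str:
    return caller_name_sketch()


print(Greeter().hello())
print(plain())
```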
paths_ensure — Path normalization
Layer 1. from emmykit.paths_ensure import …
Leaf helper that coerces os.PathLike / str arguments into Path objects, absolute by default and without resolving symlinks.
ensure_path — Ensure that the path is a Path. If not, make it a Path.
ensure_path(path: 'str | os.PathLike[str]', absolute: 'bool' = True) -> 'Path'
Ensure that the path is a Path. If not, make it a Path.
Args:
path: The path to ensure is a Path object.
absolute: If True (default), return an absolute path without resolving symlinks.
Returns:
A Path object (expanded for "~"). If absolute=True, it's absolute; otherwise it may be relative.
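A stdlib-only sketch of the coercion described above, assuming os.path.abspath is an acceptable way to make a path absolute without resolving symlinks. ensure_path_sketch is a hypothetical name; the package may use a different mechanism.

```python
import os
from pathlib import Path


def ensure_path_sketch(path, absolute: bool = True) -> Path:
    """Coerce str / PathLike to Path, expand "~", optionally make absolute."""
    result = Path(os.fspath(path)).expanduser()
    if absolute:
        # abspath anchors the path at the current directory and collapses
        # ".." segments, but unlike Path.resolve() it never follows symlinks.
        result = Path(os.path.abspath(result))
    return result


print(ensure_path_sketch("docs/readme.md", absolute=False))
print(ensure_path_sketch("docs/readme.md").is_absolute())  # True
```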
safe_paths — Exception-swallowing filesystem queries
Layer 2. from emmykit.safe_paths import …
safe_* wrappers around os.stat/Path.exists/is_file/is_dir that swallow common filesystem errors (permission, missing path, symlink loops) instead of raising, plus ensure_file/ensure_dir builders that compose on top of them.
ensure_dir — Ensure that the given path is an existing directory and return it as a Path object.
ensure_dir(path: 'str | os.PathLike[str]', allow_symlink: 'bool' = True, follow_symlinks: 'bool' = True) -> 'Path'
Ensure that the given path is an existing directory and return it as a Path object.
Args:
path: The path to check.
allow_symlink: If False, raise an exception if the path is a symlink.
follow_symlinks: If False, do not follow symlinks when checking if it's a directory.
If False, symlinks aren't considered directories (even if allow_symlink=True).
With follow_symlinks=False, attribute reads (size/mtime/etc.) also don't
follow symlinks.
Returns:
A Path object representing the directory.
Raises:
FileNotFoundError: If the directory does not exist.
NotADirectoryError: If the path exists but is not a directory.
ensure_file — Ensure that the given path is an existing file and return it as a Path object.
ensure_file(path: 'str | os.PathLike[str]', raise_on_empty: 'bool' = False, allow_symlink: 'bool' = True, follow_symlinks: 'bool' = True, verbose: 'bool' = True) -> 'Path'
Ensure that the given path is an existing file and return it as a Path object.
Args:
path: The path to check.
raise_on_empty: If True, raise an exception if the file is empty.
allow_symlink: If False, raise an exception if the path is a symlink.
follow_symlinks: If False, do not follow symlinks when checking if it's a file.
If False, symlinks aren't considered files (even if allow_symlink=True).
With follow_symlinks=False, attribute reads (size/mtime/etc.) also don't
follow symlinks.
verbose: If True (default), log a warning if the file is empty or size is unknown.
Returns:
A Path object representing the file.
Raises:
FileNotFoundError: If the file does not exist.
IsADirectoryError: If the path exists but is a directory.
ValueError: If the path exists but is not a regular file, or if symlinks are not allowed.
ValueError: If raise_on_empty is True and the file is empty (or its size cannot be read, e.g. because of bad permissions).
safe_ctime — Return ctime (seconds float or ns int) or None on errors.
safe_ctime(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True, ns: 'bool' = False) -> 'int | float | None'
Return ctime (seconds float or ns int) or None on errors.
Note: On POSIX, ctime == inode *change* time, not creation time.
On Windows, ctime is the file *creation* time.
Args:
path: The file or directory path to stat().st_ctime
follow_symlinks: Whether to follow symlinks (default: True).
If true, uses Path.stat() else Path.lstat().
ns: Whether to return the result in nanoseconds (default: False).
Returns:
The ctime of the file in seconds or nanoseconds, or None if an error occurred.
safe_exists — Like Path.exists()/os.path.lexists(), but doesn't raise on permission/loop errors.
safe_exists(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True) -> 'bool'
Like Path.exists()/os.path.lexists(), but doesn't raise on permission/loop errors.
Args:
path: The path to check.
follow_symlinks: If False, do not follow symlinks when checking if it exists.
Returns:
True if the path appears to exist (respecting follow_symlinks), False if it doesn't.
For certain access/loop issues, returns True to avoid misclassifying as 'missing'.
safe_is_dir — Like Path.is_dir(), but returns False on permission errors instead of raising.
safe_is_dir(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True) -> 'bool'
Like Path.is_dir(), but returns False on permission errors instead of raising.
Uses _is_dir() for pre-3.13 compatibility and no-follow mode.
Args:
path: The file or directory path to check.
follow_symlinks: Whether to follow symlinks (default: True).
Returns:
True if the path is a directory, False otherwise.
Raises:
Nothing for PermissionError, FileNotFoundError, and some other OSError variants, which are caught by design.
Other exceptions may still propagate.
safe_is_file — Like Path.is_file(), but returns False on permission errors instead of raising.
safe_is_file(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True) -> 'bool'
Like Path.is_file(), but returns False on permission errors instead of raising.
Uses _is_file() for pre-3.13 compatibility and no-follow mode.
Args:
path: The file or directory path to check.
follow_symlinks: Whether to follow symlinks (default: True).
Returns:
True if the path is a file, False otherwise.
Raises:
Nothing for PermissionError, FileNotFoundError, and some other OSError variants, which are caught by design.
Other exceptions may still propagate.
safe_mtime — Return mtime (seconds float or ns int) or None on errors.
safe_mtime(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True, ns: 'bool' = False) -> 'int | float | None'
Return mtime (seconds float or ns int) or None on errors.
Args:
path: The file or directory path to stat().st_mtime
follow_symlinks: Whether to follow symlinks (default: True).
If true, uses Path.stat() else Path.lstat().
ns: Whether to return the result in nanoseconds (default: False).
Returns:
The mtime of the file in seconds or nanoseconds, or None if an error occurred.
safe_size — Like Path.stat().st_size, but returns None on permission/missing/loop errors.
safe_size(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True) -> 'int | None'
Like Path.stat().st_size, but returns None on permission/missing/loop errors.
Args:
path: The file or directory path to stat().st_size
follow_symlinks: Whether to follow symlinks (default: True).
If true, uses Path.stat() else Path.lstat().
Returns:
The size of the file in bytes or None if an error occurred.
safe_stat — Like Path.stat()/lstat(), but returns None on permission/missing/loop errors.
safe_stat(path: 'str | os.PathLike[str]', follow_symlinks: 'bool' = True) -> 'os.stat_result | None'
Like Path.stat()/lstat(), but returns None on permission/missing/loop errors.
Args:
path: The file or directory path to stat.
follow_symlinks: Whether to follow symlinks (default: True).
If true, uses Path.stat() else Path.lstat().
Returns:
An os.stat_result object or None if an error occurred.
Raises:
Nothing for PermissionError, FileNotFoundError, and some other OSError variants, which are caught by design.
Other exceptions may still propagate.
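The general shape of these wrappers can be sketched as follows. This is an assumption-laden simplification: safe_stat_sketch and safe_size_sketch are hypothetical names, and the sketch swallows every OSError, which is broader than the selective errno handling the package describes.

```python
import tempfile
from pathlib import Path


def safe_stat_sketch(path, follow_symlinks: bool = True):
    """Return os.stat_result, or None when the path cannot be stat'ed."""
    try:
        p = Path(path)
        return p.stat() if follow_symlinks else p.lstat()
    except OSError:
        # Covers FileNotFoundError, PermissionError, symlink loops, etc.
        return None
    except ValueError:
        # For example, an embedded NUL byte in the path.
        return None


def safe_size_sketch(path, follow_symlinks: bool = True):
    """Return the size in bytes, or None on errors."""
    st = safe_stat_sketch(path, follow_symlinks)
    return None if st is None else st.st_size


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.bin"
    target.write_bytes(b"12345")
    size_existing = safe_size_sketch(target)
    size_missing = safe_size_sketch(Path(tmp) / "missing.bin")
print(size_existing, size_missing)  # 5 None
```

Building safe_size on top of safe_stat mirrors the layering the section describes: one function owns the error handling and the others compose on it.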
file_io — Atomic file-write helper
Layer 2. from emmykit.file_io import …
my_atomic_write — lazy atomicwrites + filelock wrapper that lets you write text to a path with cross-process safety.
my_atomic_write — Atomically write 'data' to 'filepath' with an advisory lock.
my_atomic_write(filepath: 'str | Path | os.PathLike[str]', data: 'str | bytes | bytearray', write_mode: "Literal['w', 'a']", encoding: 'str' = 'utf-8', lock_timeout: 'float | None' = None) -> 'None'
Atomically write 'data' to 'filepath' with an advisory lock.
- If write_mode="a" and file exists, data is appended.
- If write_mode="a" and file does *not* exist, file is created.
- A '.lock' file beside 'filepath' prevents concurrent writers.
Args:
filepath: Path to the file to write.
data: Data to write (str or bytes).
write_mode: "w" for overwrite, "a" for append.
encoding: Encoding to use for text data (default: DEFAULT_ENCODING).
lock_timeout: Maximum time to wait for the lock (default: None, meaning wait indefinitely).
Returns:
None: The file is written atomically.
Raises:
RuntimeError: If the lock cannot be acquired within the specified timeout.
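For readers unfamiliar with atomic writes, the core idea can be sketched with the stdlib alone: write to a temporary file in the same directory, then swap it into place with os.replace. This sketch deliberately omits the advisory lock and does not use atomicwrites or filelock, so it is not equivalent to my_atomic_write; atomic_write_sketch is a hypothetical name and handles text only.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_sketch(filepath, data: str, write_mode: str = "w",
                        encoding: str = "utf-8") -> None:
    """Write text so readers see the old file or the new one, never a mix."""
    if write_mode not in ("w", "a"):
        raise ValueError("write_mode must be 'w' or 'a'")
    target = Path(filepath)
    if write_mode == "a" and target.exists():
        # Append is emulated: old content plus new data, swapped in at once.
        data = target.read_text(encoding=encoding) + data
    # Same directory as the target, so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the swap
        os.replace(tmp_name, target)   # atomic rename
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)        # do not leave temp files behind
        raise


with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "notes.txt"
    atomic_write_sketch(demo_file, "one\n", "w")
    atomic_write_sketch(demo_file, "two\n", "a")
    final_text = demo_file.read_text(encoding="utf-8")
    leftovers = sorted(p.name for p in Path(tmp).iterdir())
print(repr(final_text))  # 'one\ntwo\n'
```

Without a lock, two concurrent appenders could still lose an update; that gap is what the '.lock' file in the documented function is for.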
io_subprocess — Subprocess wrappers + critical-error reporter
Layer 3. from emmykit.io_subprocess import …
my_fopen (smart-encoding text-file opener), my_popen (subprocess with timeout + capture), my_critical_error (logging + traceback + raise pattern), and the MyPopenResult data carrier.
my_critical_error — Log a critical error message and either exit the program or enter a breakpoint.
my_critical_error(message: 'str' = 'A critical error occurred.', choose_breakpoint: 'bool' = False, exit_code: 'int' = 1) -> 'None'
Log a critical error message and either exit the program or enter a breakpoint.
my_fopen — Attempt to read a text file with various encodings and return the file content if successful. Optionally, specify numli…
my_fopen(file_path: 'str | os.PathLike[str]', suppress_errors: 'bool' = False, rawlog: 'bool' = False, numlines: 'int | None' = None, verbose: 'bool' = True) -> 'str | None'
Attempt to read a text file with various encodings and return the file content if successful. Optionally, specify numlines to limit the number of lines read.
Args:
file_path: Path to the file to read.
suppress_errors: If True, suppress error messages by logging them as info instead of as error.
rawlog: If True, use a simple log format without timestamps or levels.
numlines: If specified, read only this many lines from the file and return them as a string.
verbose: If True, log messages about the file reading process (default: True).
Returns:
The content of the file as a string, or None if the file:
- does not exist
- is empty
- is a non-text file (video, audio, image, archive)
- cannot be read with any of the specified encodings
my_popen — Execute a command using subprocess.Popen and capture the output line by line using threads.
my_popen(command_list: 'list', suppress_info: 'bool' = False, suppress_error: 'bool' = False) -> 'MyPopenResult'
Execute a command using subprocess.Popen and capture the output line by line using threads.
MyPopenResult — A class to store the results of a customized subprocess.Popen call.
MyPopenResult(stdout: 'str', stderr: 'str', returncode: 'int') -> 'None'
A class to store the results of a customized subprocess.Popen call.
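The thread-per-pipe capture that my_popen describes can be sketched as below. PopenResultSketch and popen_sketch are hypothetical; whether MyPopenResult is a dataclass is an assumption, and the logging controlled by suppress_info / suppress_error is left out.

```python
import subprocess
import sys
import threading
from dataclasses import dataclass


@dataclass
class PopenResultSketch:
    stdout: str
    stderr: str
    returncode: int


def _drain(stream, sink: list) -> None:
    """Read a pipe line by line until EOF, then close it."""
    for line in stream:
        sink.append(line)
    stream.close()


def popen_sketch(command_list: list) -> PopenResultSketch:
    """Run a command, collecting stdout and stderr concurrently."""
    proc = subprocess.Popen(
        command_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out_lines: list = []
    err_lines: list = []
    # One reader thread per pipe avoids the deadlock that can occur when the
    # child fills one pipe while the parent is blocked reading the other.
    threads = [
        threading.Thread(target=_drain, args=(proc.stdout, out_lines)),
        threading.Thread(target=_drain, args=(proc.stderr, err_lines)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    returncode = proc.wait()
    return PopenResultSketch("".join(out_lines), "".join(err_lines), returncode)


result = popen_sketch(
    [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"]
)
print(result.returncode, result.stdout.strip(), result.stderr.strip())
```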
prompts — Interactive Y/N + multi-choice prompts
Layer 4. from emmykit.prompts import …
Two small helpers that funnel through input() with consistent retry behavior.
prompt_then_choose — Show a numbered list of choices and prompt the user to select one.
prompt_then_choose(prompt: 'str', choices: 'list[str]', default: 'str | None' = None) -> 'str'
Show a numbered list of choices and prompt the user to select one.
Args:
prompt: The message to display before the choices.
choices: A list of choices to present to the user.
default: The default choice to return if the user presses Enter without inputting a choice.
Returns:
str : The selected choice from the list (or the default if provided).
Raises:
None: If the user input is invalid, it will keep prompting until a valid choice is made.
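The keep-prompting-until-valid behavior could be implemented along these lines. choose_sketch is a hypothetical stand-in, and the injectable read parameter is an addition made here so the loop can be exercised without a terminal; the documented function has no such parameter.

```python
from typing import Callable, Optional


def choose_sketch(prompt: str, choices: list, default: Optional[str] = None,
                  read: Callable[[str], str] = input) -> str:
    """Show numbered choices and loop until a valid number is entered."""
    print(prompt)
    for index, choice in enumerate(choices, start=1):
        print(f"  {index}. {choice}")
    while True:
        answer = read("Choice: ").strip()
        if not answer and default is not None:
            return default  # bare Enter selects the default
        if answer.isdecimal() and 1 <= int(answer) <= len(choices):
            return choices[int(answer) - 1]
        print("Invalid selection, please try again.")


# Scripted answers stand in for keyboard input: two bad tries, then "2".
scripted = iter(["9", "abc", "2"])
picked = choose_sketch("Pick a fruit:", ["apple", "pear"],
                       read=lambda _prompt: next(scripted))
print(picked)  # pear
```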
prompt_then_confirm — Prompt the user with the given message and return True if the user enters 'yes', False otherwise.
prompt_then_confirm(prompt: 'str') -> 'bool'
Prompt the user with the given message and return True if the user enters 'yes', False otherwise.
introspection — AST + source-code reflection
Layer 4. from emmykit.introspection import …
Render a function's source with original whitespace, parse module-level constants out of a file, normalize objects into dicts, compile source snippets in-memory, and conditionally read-and-eval embedded scripts.
compile_code — Attempt to compile the given source code in 'exec' mode.
compile_code(source_or_filepath: 'str | os.PathLike[str]', force_source: 'bool' = False) -> 'bool'
Attempt to compile the given source code in 'exec' mode.
If 'source_or_filepath' is a file path, read its contents first.
Args:
source_or_filepath: The source code string or file path to compile.
force_source: If True, treat 'source_or_filepath' as a source code string even if it looks like a file path.
Returns:
True if compilation succeeds, False if it fails with a SyntaxError or other exception
Raises:
TypeError: If 'source_or_filepath' is not a string or a file path.
Note: A SyntaxError is not raised; it is logged and False is returned.
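The compile-and-report pattern is small enough to sketch with the built-in compile(). compiles_sketch is a hypothetical name, it accepts source strings only (no file-path handling), and it prints rather than logs.

```python
def compiles_sketch(source: str) -> bool:
    """Return True if `source` compiles in 'exec' mode, False otherwise."""
    try:
        # compile() parses and byte-compiles but never runs the code.
        compile(source, "<string>", "exec")
    except SyntaxError as exc:
        print(f"syntax error on line {exc.lineno}: {exc.msg}")
        return False
    except ValueError as exc:
        # Some Python versions raise ValueError for source with NUL bytes.
        print(f"cannot compile: {exc}")
        return False
    return True


print(compiles_sketch("x = 1\nprint(x)"))  # True
print(compiles_sketch("def broken(:"))      # False
```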
if_filepath_then_read — If given a path, return the file's text; otherwise return the string itself.
if_filepath_then_read(input_string_or_filepath: 'str | os.PathLike[str]', force_string: 'bool' = False) -> 'str'
If given a path, return the file's text; otherwise return the string itself.
Behavior:
- If a real PathLike is passed and 'force_string' is False:
* If the path does not exist → raise FileNotFoundError.
* If the path exists but is not a regular file → raise IsADirectoryError.
* If it is a file → return its contents. On permission or decoding errors,
log and return "".
* A race after the existence check may still raise FileNotFoundError (re-raised).
- If a string is passed:
* If it contains a newline or is longer than 4096 chars → treat as literal and return as-is.
* Else, if it names an existing file and 'force_string' is False → read and return
contents; on read errors (not found/permission/decoding), log and return "".
* Else → return the string as-is.
- If 'force_string' is True and a PathLike is passed → TypeError.
Args:
input_string_or_filepath: A string to return as-is, or a path to read.
force_string: If True, always treat the input as a string literal (PathLike
inputs are rejected with TypeError).
Returns:
The file contents (when reading a file) or the input string/literal path.
Raises:
TypeError: If a PathLike is given with 'force_string=True', or if the input
is neither str nor PathLike.
FileNotFoundError: When a PathLike is given and the path does not exist.
IsADirectoryError: When a PathLike is given and the path is not a regular file.
Notes:
For string inputs that look like paths, missing files do not raise; the
string is returned unchanged. Permission/decoding errors are logged and
result in an empty string.
load_ast_var — Load a top-level literal Python variable from a module without executing it.
load_ast_var(var_name: 'str', script_path: 'str | os.PathLike[str]', rawlog: 'bool' = False) -> 'Any | None'
Load a top-level literal Python variable from a module without executing it.
Args:
var_name: The name of the global variable to extract from the script.
script_path: The path to the Python script file from which to extract the variable.
rawlog: If True, use a simple log format without timestamps or levels.
Returns:
The value of the variable if found, or None.
Raises:
FileNotFoundError: If the script file does not exist.
AttributeError: If the variable is not found at the top level of the script.
ValueError: If the value of the variable cannot be evaluated as a literal expression.
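Reading a literal without executing the module can be done with ast.parse and ast.literal_eval, as in the following sketch. literal_from_source is a hypothetical helper that takes source text rather than a script path, and it raises AttributeError for a missing name; how load_ast_var reconciles its documented None return with its Raises list is not shown here.

```python
import ast


def literal_from_source(var_name: str, source: str):
    """Return the literal bound to a top-level name, without running code."""
    tree = ast.parse(source)
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Assign):
            targets = node.targets
        elif isinstance(node, ast.AnnAssign) and node.value is not None:
            targets = [node.target]
        else:
            continue
        for target in targets:
            if isinstance(target, ast.Name) and target.id == var_name:
                # literal_eval accepts only literals; it raises ValueError
                # for calls, names, and other non-literal expressions.
                return ast.literal_eval(node.value)
    raise AttributeError(f"{var_name!r} not found at top level")


SOURCE = '''
import os
LIMITS: dict = {"retries": 3, "hosts": ["a", "b"]}
NAME = "demo"
HOME = os.getcwd()
'''
print(literal_from_source("LIMITS", SOURCE))  # {'retries': 3, 'hosts': ['a', 'b']}
print(literal_from_source("NAME", SOURCE))    # demo
```

Because nothing is executed, the import and the os.getcwd() call in SOURCE have no effect; asking for HOME would raise ValueError since its value is not a literal.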
normalize_to_dict — Ensure that 'value' is a dict. If it's a JSON-style string, try to parse it. Otherwise, log a warning and return an emp…
normalize_to_dict(value: 'Any', var_name: 'str', script_path: 'str | os.PathLike[str]') -> 'dict'
Ensure that 'value' is a dict. If it's a JSON-style string, try to parse it. Otherwise, log a warning and return an empty dict.
show_function_source — Print the full source text of a Python function (including comments,
show_function_source(target: 'object | str', *, unwrap: 'bool' = True, output: 'str | os.PathLike[str] | TextIO | None' = None) -> 'str'
Print the full source text of a Python function (including comments,
docstrings, decorators, and type hints).
Args:
target: A function *name* (string) or a function object.
If a string is given, it's resolved in the caller's scope, then
in builtins, then as a dotted path via pydoc.locate (e.g. 'pkg.mod.func').
unwrap: If True, attempt to unwrap decorated functions to show
the original implementation. Defaults to True.
output: Where to write the source: a text stream or a file path (optional, defaults to sys.stdout).
Details on the "output" argument:
- None -> sys.stdout
- TextIO (e.g., sys.stdout, an open text file, StringIO) -> used as-is
(must be opened in *text* mode; binary streams are rejected)
- str | os.PathLike[str] -> treated as a path:
* '~' is expanded
* parent directories are created (parents=True, exist_ok=True)
* file is opened in append mode ('a', UTF-8)
* a one-line note is written indicating whether we created or appended
Notes:
- A trailing newline is added if the source text doesn't already end with one.
- If you pass the string "-" as the output path, it is treated as stdout.
- If the given path is an existing directory, an IsADirectoryError is raised.
- If you pass a binary stream, a TypeError is raised.
Returns:
str: The source text that was printed.
Raises:
NameError: If a string cannot be resolved to an object.
OSError: If source is unavailable (e.g., built-in/C extension or optimized away).
TypeError: If the resolved object isn't suitable for source extraction.
humanize — Human-readable number formatting
Layer 4. from emmykit.humanize import …
Byte sizes (1.0 GiB), scientific-notation exponents, and away-from-zero rounding (lazy numpy).
human_bytesize — Formats a byte count into a human-readable string.
human_bytesize(num: 'float | int | None', *, suffix: 'str' = 'B', si: 'bool' = False, precision: 'int' = 1, space: 'bool' = True, trim_trailing_zeros: 'bool' = False, long_units: 'bool' = False) -> 'str'
Formats a byte count into a human-readable string.
Args:
num: Size in bytes. Negative values are preserved with a leading minus.
If None, returns "None".
suffix: Unit suffix appended after the prefix (defaults to "B"). If long_units is True and
suffix is "B", "bytes" is appended in the output. Otherwise, the suffix is appended to the long name.
si: If True, use powers of 1000 with SI prefixes (k, M, G, ... up to R, Q).
If False, use powers of 1024 with IEC prefixes (Ki, Mi, Gi, ... up to Ri, Qi).
precision: If >= 0, digits to show after the decimal point.
If < 0, constrains the total returned string length to `-precision`
(width-constrained mode; `long_units` is forced False).
space: If True, inserts a space between the number and the unit (ignored when long_units is True).
trim_trailing_zeros: If True, removes trailing zeros and any dangling decimal point.
long_units: If True, spell out unit names ("bytes", "kibibytes", ... "quebibytes"/"quettabytes").
Returns:
A concise string such as "1.5KiB", "1.5 kB", or "1.5 megabytes" depending on options.
If num is None, returns "None".
Handles negative values with a leading minus sign and units up to "quebibytes" (2^100 = 1024^10 bytes) for IEC,
or "quettabytes" (10^30 bytes) for SI.
Raises:
None.
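The core of the si / IEC distinction can be sketched in a few lines. bytesize_sketch is a hypothetical, stripped-down stand-in covering only num, si, and precision; it ignores long_units, trimming, width-constrained mode, and the edge case where rounding pushes a value up to the next unit.

```python
def bytesize_sketch(num: float, si: bool = False, precision: int = 1) -> str:
    """Format a byte count with SI (powers of 1000) or IEC (1024) prefixes."""
    if si:
        base = 1000.0
        prefixes = ["", "k", "M", "G", "T", "P", "E", "Z", "Y", "R", "Q"]
    else:
        base = 1024.0
        prefixes = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi", "Ri", "Qi"]
    sign = "-" if num < 0 else ""
    value = abs(float(num))
    index = 0
    # Divide down until the value fits the current unit or units run out.
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    return f"{sign}{value:.{precision}f} {prefixes[index]}B"


print(bytesize_sketch(1024**3))        # 1.0 GiB
print(bytesize_sketch(1500, si=True))  # 1.5 kB
print(bytesize_sketch(-2048))          # -2.0 KiB
```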
round_out — Round a number away from zero (i.e. rounds up for x>0 and down for x<0) to
round_out(x: 'float', round_digits: 'int' = 3, max_digits: 'int' = 15) -> 'float'
Round a number away from zero (i.e. rounds up for x>0 and down for x<0) to
the specified number of significant figures (defaults to 3).
If |x| is smaller than 10^(-max_digits), x is returned as is.
The max_digits parameter defaults to 15, but can be changed to a different value if needed.
Args:
x: The number to round.
round_digits: The number of significant figures to round to (default is 3).
max_digits: The maximum number of digits to consider for very small numbers (default is 15).
Returns:
float: The rounded number, or the original number if it is smaller than 10^(-max_digits).
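Away-from-zero rounding to significant figures can be sketched with the standard decimal module, whose ROUND_UP mode rounds away from zero. This is an illustration under stated assumptions, not emmykit's implementation (the module description above mentions lazy numpy); round_out_sketch is a hypothetical name.

```python
import math
from decimal import ROUND_UP, Decimal


def round_out_sketch(x, round_digits=3, max_digits=15):
    """Hypothetical sketch: round away from zero to round_digits significant figures."""
    if abs(x) < 10.0 ** (-max_digits):
        return x  # too small (or zero): returned unchanged
    exponent = math.floor(math.log10(abs(x)))
    # Quantum of the last kept digit, e.g. 0.01 for x ~ 1.23 and 3 figures.
    quantum = Decimal(1).scaleb(exponent - round_digits + 1)
    return float(Decimal(str(x)).quantize(quantum, rounding=ROUND_UP))


print(round_out_sketch(1.2301))   # 1.24
print(round_out_sketch(-1.2301))  # -1.24
```

Going through Decimal(str(x)) avoids rounding artifacts from binary floating-point representation at the cut-off digit.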
sci_exp — Return floor(log10(|x|)), clamped to -max_digits for very small |x|.
sci_exp(x: 'float | int', max_digits: 'int' = 15) -> 'int'
Return floor(log10(|x|)), clamped to -max_digits for very small |x|.
For x == 0, returns -max_digits.
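The clamping rule reads naturally as code. The following is a sketch of the documented behavior only; sci_exp_sketch is a hypothetical name and the real function may differ in edge cases.

```python
import math


def sci_exp_sketch(x, max_digits=15):
    """Hypothetical sketch: floor(log10(|x|)), never below -max_digits."""
    if x == 0:
        return -max_digits  # log10(0) is undefined, so use the clamp value
    return max(math.floor(math.log10(abs(x))), -max_digits)


print(sci_exp_sketch(12345))    # 4
print(sci_exp_sketch(0.00123))  # -3
```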
numeric_helpers — Numeric parsing + unit-to-seconds conversion
Layer 1. from emmykit.numeric_helpers import …
Tiny helpers shared by humanize and datetime_utils so neither has to pull in the other.
is_float — Check if a string can be parsed as a float.
is_float(s: 'str') -> 'bool'
Check if a string can be parsed as a float.
seconds_in_unit — Return the number of seconds in a given time unit.
seconds_in_unit(unit: 'str') -> 'float'
Return the number of seconds in a given time unit.
datetime_utils — Date / time parsing, formatting, timezone handling
Layer 4. from emmykit.datetime_utils import …
parse_datetime is the load-bearing dispatcher (handles ISO, JD/MJD, decimal years, dateutil fallbacks); supplemented by AdaptiveDateFormatter for matplotlib, human_timespan for durations, and a small zoo of tz/JD helpers.
adaptive_date_labels — Format dates at the coarsest precision that produces unique labels.
adaptive_date_labels(dates: "'Sequence[AnyDateTimeType]'", *, min_precision: 'int' = 0, max_precision: 'int' = 4, format_levels: "'list[str] | None'" = None) -> 'list[str]'
Format dates at the coarsest precision that produces unique labels.
Given a sequence of dates, starts formatting at the coarsest level and
refines until all labels are unique or max_precision is reached.
Args:
dates: Sequence of date values (datetime.datetime, numpy.datetime64,
or matplotlib date floats).
min_precision: Minimum precision level (default: Precision.YEAR).
The formatter will never produce labels coarser than this.
max_precision: Maximum precision level (default: Precision.SECOND).
The formatter stops refining at this level even if labels collide.
format_levels: Custom format strings for each level. Must have length
>= max_precision + 1. Defaults to ADAPTIVE_FORMAT_LEVELS.
Returns:
List of formatted date strings, one per input date. Empty strings
for NaT/NaN values.
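The coarsest-unique-label idea can be sketched as a loop over format levels. The level list below is an assumption made for illustration (the real ADAPTIVE_FORMAT_LEVELS may differ), adaptive_labels_sketch is a hypothetical name, and only datetime.datetime inputs are handled.

```python
import datetime as dt

# Assumed format levels, coarsest (index 0) to finest (index 4).
SKETCH_LEVELS = ["%Y", "%Y-%m", "%Y-%m-%d", "%Y-%m-%d %H:%M", "%Y-%m-%d %H:%M:%S"]


def adaptive_labels_sketch(dates, min_precision=0, max_precision=4, levels=SKETCH_LEVELS):
    """Refine the format until every label is unique or max_precision is hit."""
    labels = []
    for level in range(min_precision, max_precision + 1):
        labels = [d.strftime(levels[level]) for d in dates]
        if len(set(labels)) == len(labels):
            break  # all labels distinct at this level
    return labels


dates = [dt.datetime(2025, 1, 7), dt.datetime(2025, 2, 3)]
print(adaptive_labels_sketch(dates))  # ['2025-01', '2025-02']
```

Year-only labels collide for these two dates, so the loop moves one level finer and stops there.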
AdaptiveDateFormatter — Matplotlib Formatter that auto-selects date label precision.
AdaptiveDateFormatter(*, min_precision: 'int' = 0, max_precision: 'int' = 4, format_levels: "'list[str] | None'" = None) -> 'None'
Matplotlib Formatter that auto-selects date label precision.
Uses adaptive disambiguation: labels start at the coarsest level and
refine until all tick labels are unique. Drop-in replacement for any
matplotlib axis formatter or colorbar formatter.
Args:
min_precision: Minimum precision level (default: Precision.YEAR).
max_precision: Maximum precision level (default: Precision.SECOND).
format_levels: Custom format strings per level.
Example:
>>> ax.xaxis.set_major_formatter(AdaptiveDateFormatter())
>>> cbar.ax.yaxis.set_major_formatter(AdaptiveDateFormatter())
Public methods: format_ticks.
AnyDateTimeType — TypeAlias (62 chars)
AnyDateTimeType: TypeAlias = 'str | float | int | np.datetime64 | pd.Timestamp | dt.datetime'
decimal_year_to_datetime — Convert a decimal year to a datetime object.
decimal_year_to_datetime(dec: 'float', use_astropy: 'bool' = False) -> 'dt.datetime'
Convert a decimal year to a datetime object.
If use_astropy is True, astropy.time is used for sub-second and leap-second–aware conversion.
Usage: new_datetime_datetime_object = decimal_year_to_datetime(2002.291)
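For intuition, the non-astropy path can be approximated by interpolating between 1 January of the year and 1 January of the next. This is a sketch under that assumption; decimal_year_sketch is a hypothetical name, it returns a naive datetime, and the package's actual conversion may treat time zones and leap seconds differently.

```python
import datetime as dt


def decimal_year_sketch(dec):
    """Hypothetical sketch: linear interpolation within the calendar year."""
    year = int(dec)
    start = dt.datetime(year, 1, 1)
    end = dt.datetime(year + 1, 1, 1)
    # (end - start) is 365 or 366 days, so leap years are handled implicitly.
    return start + (end - start) * (dec - year)


print(decimal_year_sketch(2000.5))  # 2000-07-02 00:00:00
```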
extract_timestamp — Extract timestamp string (in format YYYYMMDD-HHMMSS) from the_string, or None if not found.
extract_timestamp(the_string: 'str') -> 'str | None'
Extract timestamp string (in format YYYYMMDD-HHMMSS) from the_string, or None if not found.
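A plausible way to implement this lookup is a single regular expression. The pattern below is an assumption (eight digits, a dash, six digits) and extract_timestamp_sketch is a hypothetical name; the real function may validate the digits more strictly.

```python
import re

_TIMESTAMP = re.compile(r"\d{8}-\d{6}")  # YYYYMMDD-HHMMSS


def extract_timestamp_sketch(the_string):
    """Hypothetical sketch: return the first YYYYMMDD-HHMMSS match, or None."""
    match = _TIMESTAMP.search(the_string)
    return match.group(0) if match else None


print(extract_timestamp_sketch("backup_20250107-060402.json"))  # 20250107-060402
```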
format_date_range — Process a pair of datetime.datetime dates and produce a formatted date range string
format_date_range(date1: 'dt.datetime', date2: 'dt.datetime | None' = None) -> 'str'
Process a pair of datetime.datetime dates and produce a formatted date range string
where each date looks like 'Jan 7, 2025'. If date2 is not provided, it is set to date1.
Args:
date1: The first date as a datetime.datetime object.
date2: The second date as a datetime.datetime object. If None, defaults to date1.
Returns:
A formatted string representing the date range, such as 'Jan 7, 2025' or 'Jan 7 - Feb 3, 2025' or 'Jan 7 - 15, 2025'.
If both dates and times are the same, it returns just one date like 'Jan 7, 2025'.
If both dates are the same but times are different, it returns a string like '06:04:02 - 19:05:39 on Jan 7, 2025'
Raises:
ValueError: If either date1 or date2 is not a datetime.datetime object.
human_timespan — Format a time span in seconds into a human-readable string.
human_timespan(timespan: 'int | float') -> 'str'
Format a time span in seconds into a human-readable string.
Negative values are treated as absolute.
Args:
timespan: A float or int representing the time span in seconds.
Returns:
A human-readable string describing the time span, such as
"1 year, 2 weeks, 3 days, 4 hours, 5 minutes and 6.789 seconds".
If the timespan is zero, returns "0 seconds".
Raises:
None.
parse_datetime — Try parsing the given_date string or number into a datetime.datetime object in the specified timezone.
parse_datetime(given_date: 'AnyDateTimeType', timezone: 'str | dt.tzinfo | None' = None, format_str: 'str | None' = None, should_convert: 'bool | None' = None) -> 'dt.datetime'
Try parsing the given_date string or number into a datetime.datetime object in the specified timezone.
If "format_str" is provided, it will be used to parse the date string. These format types are accepted:
- "seconds" or "milliseconds" indicating the number of seconds or milliseconds since an epoch (Unix epoch by default).
- "YYYY-MM-DD" or similar ISO8601 formats such as "YYYY-MM-DDTHH:MM:SS", "MM/DD/YYYY", etc.
- A custom string following this pattern: "units (optional: since/after epoch)", where "units" can be anything that the function seconds_in_unit() accepts (e.g. "days", "weeks", "months", etc.). The optional epoch time can be a string, float, int, numpy.datetime64, pandas.Timestamp, or datetime.datetime object. Example: "days since 1990", "milliseconds after J2000", "sidereal days since 2000-01-01", etc. If the epoch is not specified, it defaults to the Unix epoch (1970-01-01T00:00:00Z)
If a boolean "should_convert" is provided, it will override the default behavior of whether to convert the datetime to the specified timezone by shifting the clock or just attaching the timezone without shifting. If None, the function will determine this based on the type of given_date and format_str.
If a given_date starts with "JD" or "MJD", it will be treated as a Julian Date or Modified Julian Date, respectively.
Otherwise, if given_date is a float or int, treat it as a decimal year by default if format_str is not provided.
Any call that doesn't provide a timezone argument will default to UTC.
The timezone can be a datetime.tzinfo object or a string that can be converted to a ZoneInfo object (e.g. 'America/New_York').
If the given_date is an "aware" datetime.datetime object which already has a timezone attached, it will be converted to the specified timezone (which may involve changing its date and time if the specified timezone is different).
The timezone can also be a fixed‐offset like "+05:30" or "-04:00", or the string "Naive" to indicate that the datetime should be treated as a naive datetime (i.e. without any timezone information).
Accepts:
'NOW' (case-insensitive) → current datetime
strings in YYYY, YYYY-MM, YYYY-MM-DD, YYYY-MM-DDTHH:MM:SS, or other ISO8601 formats (e.g. '2002-10-18T07:00:00Z', '2002-10-18 07:00:00+00:00').
If YYYY is provided, it will default to January 1st of that year at midnight.
If YYYY-MM is provided, it will default to the first day of that month at midnight.
If YYYY-MM-DD is provided, it will default to midnight on that day.
fallback to dateutil.parser.parse for free-form strings ("18 Oct 2002", "March 5th, 2020", etc.)
floats (e.g. 2002.29178082191777) or integer (e.g. 2002) → decimal year
numpy.datetime64 objects (e.g. np.datetime64('2002-10-18T07:00:00'))
pandas.Timestamp objects (e.g. pd.Timestamp('2002-10-18 07:00:00'))
datetime.datetime objects (e.g. datetime.datetime(2002, 10, 18, 7, 0, 0))
Args:
given_date: The date to parse, which can be a string, float, int, numpy.datetime64,
pandas.Timestamp, or datetime.datetime object.
timezone: A string or datetime.tzinfo object representing the timezone to convert
the datetime to. If None, defaults to UTC.
format_str: A string indicating the format of the date. If None, the function will
try to infer the format from the given_date.
should_convert: A boolean indicating whether to convert the datetime to the specified
timezone by shifting the clock (True) or just attaching the timezone
without shifting (False). If None, the function will determine this
based on the type of given_date and format_str.
Returns:
datetime.datetime object in the specified timezone.
Note that datetime.datetime objects cannot represent dates before 1 January 0001 or after 31 December 9999.
So dates outside this range will raise a ValueError. Future versions of this code may support a wider range of dates (like 44 BC, 44 BCE, etc.) using libraries like 'astropy.time': https://chatgpt.com/share/685c5157-5cac-8006-b68c-4a0731927a50
However, this will require the function to return an 'astropy.time.Time' object instead of a 'datetime.datetime' object.
Raises:
ValueError: If the given_date cannot be parsed into a datetime object, or if the timezone is invalid.
TypeError: If the given_date is not a string, float, int, numpy.datetime64, pandas.Timestamp, or datetime.datetime object.
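The difference between converting (shifting the clock) and attaching (keeping the clock) that should_convert controls can be shown with the standard library alone. This sketch does not call parse_datetime; it only illustrates the two behaviors with datetime.replace and datetime.astimezone.

```python
import datetime as dt

ist = dt.timezone(dt.timedelta(hours=5, minutes=30))  # fixed offset +05:30

naive = dt.datetime(2002, 10, 18, 7, 0, 0)
aware_utc = naive.replace(tzinfo=dt.timezone.utc)

# Attach: the wall-clock reading is kept, only the tzinfo changes.
attached = naive.replace(tzinfo=ist)

# Convert: the instant is kept, so the wall-clock reading shifts.
converted = aware_utc.astimezone(ist)

print(attached.isoformat())   # 2002-10-18T07:00:00+05:30
print(converted.isoformat())  # 2002-10-18T12:30:00+05:30
```

The two results are different instants, which is why the choice matters for inputs that carry no timezone of their own.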
parse_timezone — Parse the given timezone string or tzinfo object into a datetime.tzinfo object.
parse_timezone(tz_arg: 'str | dt.tzinfo | None' = None) -> 'dt.tzinfo | str'
Parse the given timezone string or tzinfo object into a datetime.tzinfo object.
If tz_arg is None, return UTC timezone.
If tz_arg is a string, it can be in one of the following formats:
- A fixed‐offset like: "+HH:MM", "+HHMM", "+H", "+Hh", "+HhMMm" (or minus variants).
Examples: "+05:30", "-0530", "+5h", "-5h30m".
- A string that can be converted to a ZoneInfo object (e.g. 'America/New_York').
- A timezone abbreviation that maps to a known IANA zone name (e.g. 'EST', 'CET').
- "Z", "UTC", or "GMT" (case‐insensitive) to represent UTC.
- A string "Naive" to represent a naive datetime (no timezone).
If tz_arg is already a tzinfo object, return it as is.
Args:
tz_arg : A timezone string, a datetime.tzinfo object, or None.
Returns:
A datetime.tzinfo object representing the parsed timezone, or a string "Naive"
if the input was "Naive".
Raises:
ValueError if the string cannot be converted to a valid timezone.
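The fixed-offset spellings listed above can be recognized with one regular expression. The pattern and the name fixed_offset_sketch are assumptions for illustration; zone names, abbreviations, "Z"/"UTC"/"GMT", and "Naive" are not handled here.

```python
import datetime as dt
import re

# Sign, 1-2 hour digits, then either [:]MM or h[MMm] (both optional).
_OFFSET = re.compile(r"^([+-])(\d{1,2})(?::?(\d{2})|h(?:(\d{2})m)?)?$")


def fixed_offset_sketch(text):
    """Hypothetical sketch: parse '+05:30', '-0530', '+5h', '-5h30m' into a tzinfo."""
    match = _OFFSET.match(text.strip())
    if match is None:
        raise ValueError(f"not a fixed offset: {text!r}")
    sign = -1 if match.group(1) == "-" else 1
    hours = int(match.group(2))
    minutes = int(match.group(3) or match.group(4) or 0)
    return dt.timezone(sign * dt.timedelta(hours=hours, minutes=minutes))


print(fixed_offset_sketch("-5h30m"))  # UTC-05:30
```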
Precision — Integer constants representing date-formatting precision levels.
Precision()
Integer constants representing date-formatting precision levels.
Levels are ordered from coarsest (YEAR=0) to finest (SECOND=4).
json_io — JSON serialization + dataclass conversion
Layer 4. from emmykit.json_io import …
to_jsonable/from_jsonable round-trip recursively-typed structures including Path, datetime, and dataclasses, with paired save_options_to_json/load_options_from_json helpers for Options objects.
from_jsonable — Reconstruct objects encoded with to_jsonable(..., roundtrip=True).
from_jsonable(obj: 'Any') -> 'Any'
Reconstruct objects encoded with to_jsonable(..., roundtrip=True).
If input was produced with roundtrip=False, this mostly passes values through.
load_options_from_json — Load the options object from a JSON file.
load_options_from_json(options: 'Options', json_file: 'str | os.PathLike[str]') -> 'Options | None'
Load the options object from a JSON file.
Args:
options: An existing Options object (used for logging purposes).
json_file: Path to the JSON file to load.
Returns:
Options object loaded from the JSON file, or None if the file does not exist or cannot be read.
Raises:
IOError: If there is an error reading the file.
ValueError: If the JSON file is invalid or cannot be parsed.
save_options_to_json — Save the options object to a JSON file.
save_options_to_json(options: 'Options') -> 'None'
Save the options object to a JSON file.
Args:
options: Options object containing:
- script_dir: Directory where the JSON file will be saved.
- python_script: Name of the Python script (used in the JSON filename).
- my_name: Name of the current script (used in the JSON filename).
- timestamp: Current timestamp (used in the JSON filename).
Returns:
None - writes the options to a JSON file.
Raises:
IOError: If there is an error writing to the file.
ValueError: If the options object is invalid.
to_jsonable — Convert arbitrary Python objects into JSON-serializable primitives.
to_jsonable(obj: 'Any', *, roundtrip: 'bool' = True) -> 'Any'
Convert arbitrary Python objects into JSON-serializable primitives.
If roundtrip=True, non-JSON types are wrapped with a small type tag so they can be reconstructed.
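The type-tag idea behind roundtrip=True can be sketched as a pair of recursive functions. The tag key "__type__", the payload layout, and the names encode_sketch/decode_sketch are all assumptions for illustration; the actual tag format used by to_jsonable is not documented here.

```python
import datetime as dt
import json
from pathlib import Path

TAG = "__type__"  # hypothetical tag key


def encode_sketch(obj):
    """Wrap non-JSON types in a small tagged dict; recurse into containers."""
    if isinstance(obj, Path):
        return {TAG: "Path", "value": str(obj)}
    if isinstance(obj, dt.datetime):
        return {TAG: "datetime", "value": obj.isoformat()}
    if isinstance(obj, dict):
        return {str(key): encode_sketch(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [encode_sketch(value) for value in obj]
    return obj


def decode_sketch(obj):
    """Reverse encode_sketch by inspecting the tag."""
    if isinstance(obj, dict):
        if obj.get(TAG) == "Path":
            return Path(obj["value"])
        if obj.get(TAG) == "datetime":
            return dt.datetime.fromisoformat(obj["value"])
        return {key: decode_sketch(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [decode_sketch(value) for value in obj]
    return obj


original = {"out": Path("results/run1"), "when": dt.datetime(2026, 6, 6, 12, 34, 56), "n": [1, 2]}
restored = decode_sketch(json.loads(json.dumps(encode_sketch(original))))
print(restored == original)  # True
```

Without the tag, a Path and a plain string would be indistinguishable after json.loads, which is the situation the roundtrip=False note describes.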
diff_view — Diff rendering with visible whitespace
Layer 4. from emmykit.diff_view import …
my_diff (color unified diff), diff_and_confirm (interactive accept/reject loop), highlight_changes (per-line intra-word emphasis), and is_python_script (the heuristic that decides whether a path holds Python source).
diff_and_confirm — Show a unified diff of orig_text → changed_text with a number of context lines
diff_and_confirm(orig_text: 'str', changed_text: 'str', path: 'str | os.PathLike[str]', label: 'str' = '', skip_compile: 'bool' = False, diff_choice: 'int' = 1, changed_color: 'str' = '\x1b[94m', deleted_color: 'str' = '\x1b[91m', added_color: 'str' = '\x1b[93m', the_fix: 'str' = '', description: 'str' = '') -> 'bool'
Show a unified diff of orig_text → changed_text with a number of context lines
(determined by 'diff_choice') around each hunk, log using 'label' and 'description', then prompt.
If the user confirms, overwrite 'path' with changed_text and return True.
If the user chooses to quit, log a message and return False.
Args:
orig_text: Original text to compare against.
changed_text: Proposed changes to the original text.
path: Path to the file being modified.
label: A short label for the issue being fixed (default "").
skip_compile: If True, do not try to compile the changed text before writing (default False).
diff_choice: How many context lines to show in the diff (0 = old-style diff, 1 = unified diff with 0 context lines, 2+ = unified diff with 'diff_choice - 1' context lines) (default 1).
changed_color: Color to use for unchanged characters in the changed lines in the diff (default '\x1b[94m', ANSI bright blue).
deleted_color: Color to use for the deleted characters in orig lines (default '\x1b[91m', ANSI bright red).
added_color: Color to use for the added characters in changed lines (default '\x1b[93m', ANSI bright yellow).
the_fix: A string describing the fix being applied (e.g. "autopep8", "manual edit") (default "").
description: A longer description of the issue being fixed (default "").
Returns:
False if the user chose to quit; True otherwise.
Raises:
FileNotFoundError: If the specified file does not exist.
ValueError: If the specified path is not a file. The function which raises this exception is my_fopen().
highlight_changes — Compare 'orig' and 'new' strings and return a tuple
highlight_changes(orig: 'str', new: 'str', unchanged_color: 'str', added_color: 'str', deleted_color: 'str') -> 'tuple[str, str]'
Compare 'orig' and 'new' strings and return a tuple
(old_highlighted, new_highlighted), where:
- old_highlighted has parts present only in 'orig' wrapped in deleted_color.
- new_highlighted has parts present only in 'new' wrapped in added_color
and unchanged parts in unchanged_color.
Args:
orig: The original string.
new: The modified string.
unchanged_color: The color to use for unchanged parts.
added_color: The color to use for added parts.
deleted_color: The color to use for deleted parts.
Returns:
A tuple of (old_highlighted, new_highlighted) strings.
Raises:
None.
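One way to produce this kind of intra-line highlighting is difflib.SequenceMatcher from the standard library. The sketch below assumes that approach and a reset code after each colored run; highlight_sketch is a hypothetical name and the real function may segment or color the strings differently.

```python
import difflib

RESET = "\x1b[0m"  # assumed reset sequence after each colored run


def highlight_sketch(orig, new, unchanged_color, added_color, deleted_color):
    """Hypothetical sketch: color deleted parts of orig and added parts of new."""
    old_parts, new_parts = [], []
    matcher = difflib.SequenceMatcher(a=orig, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            old_parts.append(orig[i1:i2])
            new_parts.append(f"{unchanged_color}{new[j1:j2]}{RESET}")
        else:
            if i2 > i1:  # text present only in orig
                old_parts.append(f"{deleted_color}{orig[i1:i2]}{RESET}")
            if j2 > j1:  # text present only in new
                new_parts.append(f"{added_color}{new[j1:j2]}{RESET}")
    return "".join(old_parts), "".join(new_parts)


old_line, new_line = highlight_sketch("abc", "abd", "\x1b[94m", "\x1b[93m", "\x1b[91m")
print(old_line)
print(new_line)
```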
is_python_script — Return True if 'path' looks like a Python script (.py/.pyw suffix, or executable with a python shebang).
is_python_script(path: 'str | os.PathLike[str]') -> 'bool'
Return True if 'path' looks like a Python script:
1. It's a file which ends in .py or .pyw
2. Or it is executable AND its first line is a python shebang
Args:
path: The file path to check.
Returns:
True if the path is a Python script, False otherwise.
Raises:
IsADirectoryError: If the path is a directory.
FileNotFoundError: If the file is not found.
PermissionError: If the file is not accessible due to permission issues.
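The two-rule heuristic above might be sketched as follows. looks_like_python_sketch is a hypothetical name, and the shebang test (a "#!" line containing "python") is an assumption about what counts as a python shebang.

```python
import os
from pathlib import Path


def looks_like_python_sketch(path):
    """Hypothetical sketch of the suffix-or-shebang heuristic."""
    p = Path(path)
    if p.is_dir():
        raise IsADirectoryError(str(p))
    if not p.is_file():
        raise FileNotFoundError(str(p))
    # Rule 1: a file ending in .py or .pyw.
    if p.suffix.lower() in {".py", ".pyw"}:
        return True
    # Rule 2: executable AND first line is a python shebang.
    if not os.access(p, os.X_OK):
        return False
    with p.open("rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and b"python" in first_line
```

Reading only the first line in binary mode keeps the check cheap and avoids decoding errors on non-text executables.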
my_diff — Show a diff between 'orig_text' and 'changed_text' in the console, highlighting character-level changes within changed lines.
my_diff(orig_text: 'str', changed_text: 'str', orig_path: 'str | os.PathLike[str]', changed_path: 'str | os.PathLike[str] | None' = None, diff_choice: 'int' = 1, changed_color: 'str' = '\x1b[94m', deleted_color: 'str' = '\x1b[91m', added_color: 'str' = '\x1b[93m') -> 'None'
Show a diff between 'orig_text' and 'changed_text' in the console,
highlighting character-level changes within changed lines.
Args:
orig_text: Original text to compare against.
changed_text: Proposed changes to the original text.
orig_path: Path to the original file.
changed_path: Optional path to the changed file (if different).
diff_choice: How many context lines to show in the diff (0 = old-style diff, 1 = unified diff with 0 context lines,
2+ = unified diff with 'diff_choice - 1' context lines).
changed_color: Color to use for unchanged characters in the changed lines in the diff (default '\x1b[94m', ANSI bright blue).
deleted_color: Color to use for the deleted characters in orig lines (default '\x1b[91m', ANSI bright red).
added_color: Color to use for the added characters in changed lines (default '\x1b[93m', ANSI bright yellow).
Returns:
None: Prints the diff to the console.
Raises:
None.
text — Mojibake fixing, encoding detection, casing helpers
Layer 5. from emmykit.text import …
ftfy-based fix_text/fix_mojibake (with an atomic write-back), explicit UTF-8 / CP-1252 decoders, sentence-aware my_capitalize/my_title_case, and normalize_for_search for diacritic-folded comparisons.
Translation tables live in emmykit.text_constants (CHARACTERS_TO_SPACE / QUOTES_TO_DELETE / REPLACE_WITH_SPACE / TRANSLATION_TABLE) and feed normalize_for_search.
contains_mojibake — Use ftfy.badness.is_bad() to detect any likely mojibake in the text.
contains_mojibake(text: 'str') -> 'bool'
Use ftfy.badness.is_bad() to detect any likely mojibake in the text.
decode_cp1252 — Attempt to decode CP1252 bytes and return as a string.
decode_cp1252(raw_bytes: 'bytes', path_str: 'str' = 'input string') -> 'str | None'
Attempt to decode CP1252 bytes and return as a string.
If it fails, return None.
decode_utf8 — Decode raw bytes as UTF-8, returning None if they are invalid or contain lone C1 controls.
decode_utf8(raw_bytes: 'bytes', path_str: 'str' = 'input string') -> 'str | None'
If 'raw_bytes' (read from the source named by 'path_str') is valid UTF-8 without lone C1 controls,
return the decoded string. Otherwise, return None.
ensure_utf8_meta — Ensure the HTML text has a <meta charset="utf-8"> tag.
ensure_utf8_meta(html: 'str') -> 'str'
Ensure the HTML text has a <meta charset="utf-8"> tag.
If one already exists—either as a charset attribute or
as an http-equiv Content-Type declaration—normalize it to
<meta charset="utf-8">. Otherwise, insert that tag right
after the opening <head> tag.
fix_mojibake — Fix mojibake in a text file, recoding from CP1252 to UTF-8 if necessary.
fix_mojibake(filepath: 'str | os.PathLike[str]', make_backup: 'bool' = True, dry_run: 'bool' = False) -> 'None'
Fix mojibake in a text file, recoding from CP1252 to UTF-8 if necessary.
If the file is already valid UTF-8, it will only fix mojibake.
fix_text — Fix mojibake in a string using ftfy.fix_encoding().
fix_text(current_text: 'str', path: 'str | os.PathLike[str]', raw_bytes: 'bytes') -> 'str | None'
Fix mojibake in a string using ftfy.fix_encoding().
my_capitalize — Capitalize ONLY the first letter of a string and DON'T modify the rest of it.
my_capitalize(string_to_capitalize: 'str') -> 'str'
Capitalize ONLY the first letter of a string and DON'T modify the rest of it.
my_title_case — Capitalize the first letter of each word, but if a word already has ANY uppercase letters, leave it as is. This way, wo…
my_title_case(the_title: 'str') -> 'str'
Capitalize the first letter of each word, but if a word already has ANY uppercase letters, leave it as is. This way, words like "WW2" or "iZombie" won't be modified.
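Both casing rules are small enough to sketch directly. The names capitalize_sketch and title_case_sketch are hypothetical, and splitting on single spaces is an assumption; the real helpers may treat other whitespace or punctuation differently.

```python
def capitalize_sketch(text):
    """Uppercase only the first character; leave the rest untouched."""
    return text[:1].upper() + text[1:]


def title_case_sketch(title):
    """Capitalize each word unless it already contains an uppercase letter."""
    words = []
    for word in title.split(" "):
        if any(ch.isupper() for ch in word):
            words.append(word)  # e.g. "WW2" or "iZombie" stay as written
        else:
            words.append(capitalize_sketch(word))
    return " ".join(words)


print(capitalize_sketch("hello world"))               # Hello world
print(title_case_sketch("the iZombie and WW2 show"))  # The iZombie And WW2 Show
```

Unlike str.capitalize and str.title, neither sketch lowercases anything, which is the property the descriptions above emphasize.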
normalize_for_search — Convert text to ASCII and lowercase for case- and diacritic-insensitive comparison. Also treat some characters such as …
normalize_for_search(text: 'str') -> 'str'
Convert text to ASCII and lowercase for case- and diacritic-insensitive comparison. Also treat some characters such as ._- the same as spaces. Remove quotes (', ", ' and their unicode variants).
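A rough equivalent can be built from unicodedata and str.translate. The character sets below are assumptions standing in for the tables in emmykit.text_constants, and normalize_sketch is a hypothetical name.

```python
import unicodedata

# Assumed stand-ins for the package's translation tables.
_TO_SPACE = str.maketrans({ch: " " for ch in "._-"})
_DROP_QUOTES = str.maketrans("", "", "'\"`\u2018\u2019\u201c\u201d")


def normalize_sketch(text):
    """Hypothetical sketch: fold diacritics, drop quotes, lowercase, space-out ._-"""
    without_quotes = text.translate(_DROP_QUOTES)
    # NFKD splits accented letters into base letter + combining mark;
    # encoding to ASCII with errors="ignore" then drops the marks.
    decomposed = unicodedata.normalize("NFKD", without_quotes)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    return ascii_text.lower().translate(_TO_SPACE)


print(normalize_sketch("Café_del-Mar"))  # cafe del mar
```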
hosts — Hostname + computer-name detection
Layer 5. from emmykit.hosts import …
Five strategies for retrieving a hostname (socket / platform / uname / hostname / scutil), aggregated by get_computer_name with a NASA-prefix detector.
analyze_computer_name_results — Analyzes the retrieved computer names.
analyze_computer_name_results(results: 'dict[str, str]', rawlog: 'bool' = False) -> 'str'
Analyzes the retrieved computer names.
Args:
results: Dictionary with method names as keys and computer names as values.
rawlog: If True, print statements are disabled.
Returns:
A string representing the most common computer name obtained from the methods.
If no names were retrieved, returns "ERROR-NO-NAME".
Raises:
None: This function does not raise exceptions, but it may log errors or warnings if
no names (or differing names) are retrieved.
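Picking the most common name is a natural fit for collections.Counter. This sketch assumes empty or missing values are skipped; most_common_name_sketch is a hypothetical name and the logging described above is omitted.

```python
from collections import Counter


def most_common_name_sketch(results):
    """Hypothetical sketch: majority vote over the retrieved names."""
    names = [name for name in results.values() if name]
    if not names:
        return "ERROR-NO-NAME"
    return Counter(names).most_common(1)[0][0]


print(most_common_name_sketch({"socket": "lab-mac", "platform": "lab-mac", "scutil": "Lab Mac"}))  # lab-mac
```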
get_computer_name — Attempts multiple methods to retrieve the computer's name and returns the most common one.
get_computer_name(rawlog: 'bool' = False) -> 'str'
Attempts multiple methods to retrieve the computer's name and returns the most common one.
Args:
rawlog: If True, print statements are disabled.
Returns:
A string representing the most common computer name obtained from the methods.
If no names were retrieved, returns "ERROR-NO-NAME".
Raises:
None: This function does not raise exceptions, but it may log warnings if no names are retrieved.
get_hostname_os_uname — Retrieves the hostname using os.uname().nodename.
get_hostname_os_uname(rawlog: 'bool' = False) -> 'str | None'
Retrieves the hostname using os.uname().nodename.
get_hostname_platform — Retrieves the hostname using platform.node().
get_hostname_platform(rawlog: 'bool' = False) -> 'str | None'
Retrieves the hostname using platform.node().
get_hostname_socket — Retrieves the hostname using socket.gethostname().
get_hostname_socket(rawlog: 'bool' = False) -> 'str | None'
Retrieves the hostname using socket.gethostname().
get_hostname_subprocess_hostname — Retrieves the hostname using the 'hostname' system command via subprocess.
get_hostname_subprocess_hostname(rawlog: 'bool' = False) -> 'str | None'
Retrieves the hostname using the 'hostname' system command via subprocess.
get_hostname_subprocess_scutil — Retrieves the hostname using the 'scutil --get ComputerName' command on macOS via subprocess.
get_hostname_subprocess_scutil(rawlog: 'bool' = False) -> 'str | None'
Retrieves the hostname using the 'scutil --get ComputerName' command on macOS via subprocess.
NASA computer-name prefixes — Prefix lists feeding `IS_NASA_COMPUTER` detection.
Includes: NASA_CASEFOLDED_COMPUTER_NAME_PREFIXES, NASA_COMPUTER_NAME_PREFIXES.
network — Internet-connectivity probes
Layer 5. from emmykit.network import …
is_internet_available runs a multi-strategy DNS + HTTP + TCP check against net_targets with a captive-portal sniff and a shared ThreadPoolExecutor.
Probe targets live in emmykit.net_targets (IPV4_TARGETS / IPV6_TARGETS / HTTP_PROBES / DNS_TEST_NAMES) and feed is_internet_available.
CheckResult — Aggregate results from the multi-strategy connectivity check.
CheckResult(tcp_ok: 'bool', dns_ok: 'bool', http_ok: 'bool', captive_detected: 'bool') -> None
Aggregate results from the multi-strategy connectivity check.
Fields: tcp_ok, dns_ok, http_ok, captive_detected.
is_internet_available — Determine if the internet is available using multiple methods.
is_internet_available(timeout_per_step: 'float' = 2.5, retries: 'int' = 1, workers: 'int' = 6, include_ipv6: 'bool' = False, strict: 'bool' = False, ignore_proxies: 'bool' = False) -> 'bool'
Determine if the internet is available using multiple methods.
Strategy (per attempt):
1) TCP to multiple well-known numeric IPs (no DNS).
2) DNS resolution of common hostnames.
3) HTTP(S) probes with expectations and captive-portal detection.
Aggregation logic:
- If captive portal is detected -> return False immediately.
- If any HTTP probe passes expectations -> return True.
- Else if TCP OK and DNS OK -> return True.
- Else:
* If strict is False and TCP OK alone -> return False
(raw TCP alone is not considered sufficient for "internet usable").
* If strict is True -> still False.
Args:
timeout_per_step: Timeout (seconds) per individual network attempt.
retries: Number of times to repeat the full check if the result is False.
workers: Thread pool size for TCP checks.
include_ipv6: Whether to include IPv6 targets.
strict: Require stronger evidence of connectivity.
ignore_proxies: Disable env proxies for HTTP probes.
Returns:
True if the internet appears reachable and usable, else False.
Raises:
None.
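The aggregation logic listed above reduces to a short decision function over the four CheckResult fields. The dataclass below mirrors those fields for illustration only; CheckResultSketch and aggregate_sketch are hypothetical names, and no network probing is performed.

```python
from dataclasses import dataclass


@dataclass
class CheckResultSketch:
    """Local stand-in mirroring the documented CheckResult fields."""
    tcp_ok: bool
    dns_ok: bool
    http_ok: bool
    captive_detected: bool


def aggregate_sketch(result):
    """Apply the documented aggregation rules in order."""
    if result.captive_detected:
        return False  # a captive portal overrides everything else
    if result.http_ok:
        return True   # any HTTP probe meeting expectations is enough
    if result.tcp_ok and result.dns_ok:
        return True
    return False      # raw TCP alone is not treated as usable internet


print(aggregate_sketch(CheckResultSketch(True, False, False, False)))  # False
```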
python_env — Python version + shell-environment detection
Layer 5. from emmykit.python_env import …
Helpers for picking a Python interpreter, locating the user's shell rc file, and finding alias-source files.
check_python_version — Check if the given Python command is available and has a version of PY_VERSION or higher.
check_python_version(command: 'str') -> 'bool'
Check if the given Python command is available and has a version of PY_VERSION or higher.
detect_shell — Detect the current interactive shell, falling back to parent process name if needed.
detect_shell(options: 'Options') -> 'None'
Detect the current interactive shell, falling back to parent process name if needed.
Args:
options: Options object to store the detected shell information.
Returns:
None, but updates options.shell with the detected shell name.
Raises:
None, but logs an error if the shell cannot be detected via
subprocess.CalledProcessError or FileNotFoundError.
find_additional_alias_files — Find additional alias files for the shell.
find_additional_alias_files(options: 'Options') -> 'None'
Find additional alias files for the shell.
find_preferred_python_version — Find the command for the preferred version of python (stored here as PY_VERSION).
find_preferred_python_version() -> 'str | None'
Find the command for the preferred version of python (stored here as PY_VERSION).
find_shell_rc_file — Find the shell configuration file for the current user, store in options.rc_file.
find_shell_rc_file(options: 'Options') -> 'None'
Find the shell configuration file for the current user, store in options.rc_file.
For bash/zsh, also consider login‐shell files if the usual rc isn't present.
Args:
options: Options object containing the shell type and rc_file attribute.
Returns:
None, but updates options.rc_file with the path to the shell configuration file.
Raises:
None, but logs an error if the shell is unsupported or if no rc file is found
for the specified shell.
files — Checksums, downloads, filename formatting, free-space queries
Layer 5. from emmykit.files import …
download_file with progress, calculate_checksum, query_free_space via shutil, filename_format for legal-on-most-OSes name munging, and verify_script to ensure a script file exists with exactly the expected contents.
calculate_checksum — Calculate the SHA256 checksum of a file.
calculate_checksum(file_path: 'str | os.PathLike[str]') -> 'str'
Calculate the SHA256 checksum of a file.
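A typical chunked SHA-256 computation with hashlib looks like the sketch below. The chunk size, the hexadecimal return format, and the name sha256_sketch are assumptions; the documented signature only promises a str.

```python
import hashlib


def sha256_sketch(file_path, chunk_size=1 << 20):
    """Hypothetical sketch: hash a file in 1 MiB chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks means the file never has to fit in memory, which matters for the large downloads this module also handles.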
download_file — Download a file to 'dest' with retry + exponential backoff.
download_file(url: 'str', dest: 'str | os.PathLike[str]', retries: 'int' = 5, chunk_size: 'int' = 1048576, timeout: 'int' = 30, headers: 'dict[str, str] | None' = None) -> 'None'
Download a file to 'dest' with retry + exponential backoff.
Writes to a temporary .part file and renames atomically on success.
Verifies Content-Length if provided.
Logs progress by bytes (rough).
Also checks free disk space (if size is known) before downloading.
Args:
url: The source URL to download from.
dest: Destination file path.
retries: Number of attempts (default 5 is a good balance for transient errors).
chunk_size: Bytes per read chunk (default 1MiB).
timeout: Per-attempt socket timeout (seconds).
headers: Optional dict of HTTP headers to include in the request.
Returns:
None. Writes the file to 'dest'.
Raises:
SystemExit on failure after retries or if insufficient free space is detected.
filename_format — Turn arbitrary text into an ASCII-only, filesystem‐safe base filename.
filename_format(text: 'str', sep: 'str' = '_', max_length: 'int | None' = None) -> 'str'
Turn arbitrary text into an ASCII-only, filesystem‐safe base filename.
WARNING: Do not include an extension in the text, because this function
might remove the dot which separates the filename from the extension.
It attempts to recognize and remove extensions listed in ALL_KNOWN_EXTENSIONS
but this list (actually, ordered tuple) is not exhaustive.
Steps:
1. Unicode → ASCII
2. Recognize & remove common extensions (e.g. .txt, .fits, .tar.gz)
3. Treat dots, underscores & whitespace as word separators
4. Remove any character that isn't A–Z, a–z, 0–9, dashes, or the separator
5. Collapse runs of separators into a single one
6. Trim separators from ends
7. Optionally truncate to max_length (preserving word boundaries)
8. If an extension was removed, append it back as the last step.
Args:
text: Original filename or title
sep: Single-character separator (default: "_")
max_length: If set, best-effort truncation to this many characters
Returns:
A clean, filename-safe string.
Raises:
None: If the input text is None, it will return an empty string.
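Steps 1 and 3 through 7 can be sketched with unicodedata and re. Extension handling (steps 2 and 8) is deliberately left out because it depends on ALL_KNOWN_EXTENSIONS; filename_sketch is a hypothetical name and its truncation rule is only one reading of "preserving word boundaries".

```python
import re
import unicodedata


def filename_sketch(text, sep="_", max_length=None):
    """Hypothetical sketch of steps 1 and 3-7 (no extension handling)."""
    if text is None:
        return ""
    esc = re.escape(sep)
    # Step 1: Unicode -> ASCII by dropping combining marks and non-ASCII.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Step 3: dots, underscores, and whitespace become separators.
    joined = re.sub(r"[._\s]+", lambda _m: sep, ascii_text)
    # Step 4: keep only letters, digits, dashes, and the separator.
    allowed = re.sub(rf"[^A-Za-z0-9\-{esc}]", "", joined)
    # Steps 5-6: collapse separator runs, then trim the ends.
    cleaned = re.sub(rf"(?:{esc})+", lambda _m: sep, allowed).strip(sep)
    # Step 7: truncate, backing up to the previous separator if mid-word.
    if max_length is not None and len(cleaned) > max_length:
        cut = cleaned[:max_length]
        if sep in cut and cleaned[max_length] != sep:
            cut = cut.rsplit(sep, 1)[0]
        cleaned = cut.strip(sep)
    return cleaned


print(filename_sketch("Café: été 2025 (final).v2"))  # Cafe_ete_2025_final_v2
```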
query_free_space — Return the free space (in bytes) available to the current user on the filesystem that contains 'path'.
query_free_space(path: 'str | os.PathLike[str]') -> 'int'
Return the free space (in bytes) available to the current user on the
filesystem that contains 'path'. Works for files or directories, and
for paths that don't yet exist (it climbs to the nearest existing parent).
Args:
path: A file or directory path.
Returns:
Free space in bytes available to the current user on the filesystem.
Raises:
FileNotFoundError: If no existing parent directory is found.
OSError: If the filesystem information cannot be retrieved.
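The climb to the nearest existing parent can be sketched with pathlib and shutil.disk_usage. free_space_sketch is a hypothetical name, and stopping at the first existing directory is an assumption about how the climb works.

```python
import shutil
from pathlib import Path


def free_space_sketch(path):
    """Hypothetical sketch: free bytes on the filesystem holding path."""
    candidate = Path(path).expanduser().resolve()
    # Walk from the path itself up through its parents.
    for location in (candidate, *candidate.parents):
        if location.is_dir():
            return shutil.disk_usage(location).free
    raise FileNotFoundError(f"no existing parent directory for {path!r}")
```

For a file that exists, the first directory found is its parent, so files, directories, and not-yet-created paths all go through the same loop.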
verify_script — Ensure that 'thepath' exists and contains exactly 'thescript'.
verify_script(options: 'Options', thepath: 'str | os.PathLike[str]', thescript: 'str') -> 'None'
Ensure that 'thepath' exists and contains exactly 'thescript'.
- If 'thepath' does not exist or is not a file, it will be created and populated.
- If it exists but its contents differ, it will be overwritten.
- Otherwise, nothing happens.
lint — flake8 / autopep8 / mypy interactive runners + multireplace
Layer 6. from emmykit.lint import …
Run linters, gather + display findings with color, prompt-and-apply autopep8 fixes, and the multireplace regex-driven search-and-replace tool that shares lint internals.
ask_and_autopep8 — Prompt the user about fixing ALL occurrences of 'code' in 'path', and if yes, apply autopep8.fix_file with --select=code.
ask_and_autopep8(path: 'str | os.PathLike[str]', code: 'str', description: 'str' = '', diff_choice: 'int' = 1, changed_color: 'str' = '\x1b[94m', deleted_color: 'str' = '\x1b[91m', added_color: 'str' = '\x1b[93m') -> 'bool'
Prompt the user about fixing ALL occurrences of 'code' in 'path',
and if yes, apply autopep8.fix_file with --select=code.
The fix will be applied without saving, and the user will be shown a diff
of the changes before saving to the file.
Args:
path: The path to the file to modify.
code: The specific PEP 8 violation code to fix.
description: A description of the issue being fixed (default "").
diff_choice: How many context lines to show in the diff (0 = old-style diff, 1 = unified diff with 0 context lines, 2+ = unified diff with 'diff_choice - 1' context lines) (default 1).
changed_color: Color to use for unchanged characters in the changed lines in the diff (default ANSI_CYAN).
deleted_color: Color to use for the deleted characters in orig lines (default ANSI_RED).
added_color: Color to use for the added characters in changed lines (default ANSI_YELLOW).
Returns:
True if the user wants to continue, False if they want to quit.
Raises:
FileNotFoundError: If the specified file does not exist.
ValueError: If the specified path is not a file. The function which raises this exception is autopep8.fix_file().
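The diff_choice convention (shared by several functions in this module) maps naturally onto difflib. The sketch below is an illustration only: render_diff is a hypothetical name, and treating "old-style diff" as difflib.ndiff is an assumption, since the README does not say which format is meant.

```python
import difflib


def render_diff(old: str, new: str, diff_choice: int = 1) -> list[str]:
    """Hypothetical helper mirroring the documented diff_choice values."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    if diff_choice <= 0:
        # Assumption: "old-style" is interpreted here as an ndiff listing.
        return list(difflib.ndiff(old_lines, new_lines))
    # 1 -> 0 context lines, 2 -> 1 context line, and so on.
    return list(
        difflib.unified_diff(
            old_lines, new_lines, "before", "after",
            n=diff_choice - 1, lineterm="",
        )
    )
```

With diff_choice=1 only the changed lines and hunk headers appear; each increment adds one line of surrounding context on either side of a change.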
ask_and_replace — Read 'path', do orig.replace(old, new), then show a diff and ask to confirm.
ask_and_replace(old_str: 'str', new_str: 'str', path: 'str | os.PathLike[str]', label: 'str' = '', diff_choice: 'int' = 1, description: 'str' = '', changed_color: 'str' = '\x1b[94m', deleted_color: 'str' = '\x1b[91m', added_color: 'str' = '\x1b[93m', skip_compile: 'bool' = False, verbose: 'bool' = True) -> 'bool'
Read 'path', do orig.replace(old, new), then show a diff and ask to confirm.
Args:
old_str: Old string to search for.
new_str: New string to replace the old string.
path: Path to the file being modified.
label: A short label for the issue being fixed (default "").
skip_compile: If True, do not try to compile the changed text before writing (default False).
diff_choice: How many context lines to show in the diff (0 = old-style diff, 1 = unified diff with 0 context lines, 2+ = unified diff with 'diff_choice - 1' context lines) (default 1).
changed_color: Color to use for unchanged characters in the changed lines in the diff (default ANSI_CYAN).
deleted_color: Color to use for the deleted characters in orig lines (default ANSI_RED).
added_color: Color to use for the added characters in changed lines (default ANSI_YELLOW).
the_fix: A string describing the fix being applied (e.g. "autopep8", "manual edit") (default "").
description: A longer description of the issue being fixed (default "").
Returns:
False if the user chose to quit; True otherwise.
Raises:
IsADirectoryError: If the path is a directory.
FileNotFoundError: If the file is not found.
PermissionError: If the file is not accessible due to permission issues.
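The replace-then-compile step that skip_compile controls can be sketched as a pure function. This is an assumption about the general shape, not the library's code; replace_and_check is a hypothetical name and the interactive prompt and diff display are left out.

```python
def replace_and_check(
    source: str, old: str, new: str, skip_compile: bool = False
) -> "str | None":
    """Return the replaced text, or None if nothing changed or it fails to compile."""
    changed = source.replace(old, new)
    if changed == source:
        return None  # no occurrences of `old`
    if not skip_compile:
        try:
            # Compile only; the candidate text is never executed.
            compile(changed, "<candidate>", "exec")
        except SyntaxError:
            return None
    return changed
```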
check_python_formatting — Reads a .py file at 'path' via my_fopen, makes sure it compiles, parses it with AST,
check_python_formatting(path: 'str | os.PathLike[str]', diff_choice: 'int' = 1) -> 'bool'
Reads a .py file at 'path' via my_fopen, makes sure it compiles, parses it with AST,
prints any custom formatting violations to stdout,
and asks the user to fix any backticks or curly quotes in the file. If the user quits, it returns False.
Args:
path: The path to the Python file to check.
diff_choice: How many context lines to show in the diff (0 = old-style diff, 1 = unified diff with 0 context lines, 2+ = unified diff with 'diff_choice - 1' context lines).
Returns:
False if the user chose to quit during any replacement prompts or if there was an error,
True otherwise.
Raises:
FileNotFoundError: If the specified file does not exist.
FormatChecker — Walks a module AST and collects formatting violations:
FormatChecker(source: 'str', doc_style: 'str' = 'None') -> 'None'
Walks a module AST and collects formatting violations:
- missing type hints on params / return
- missing docstring or incorrect docstring quote style
Public methods: generic_visit, visit, visit_AsyncFunctionDef, visit_ClassDef, visit_Constant, visit_FunctionDef.
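An AST walk of this kind can be sketched with ast.NodeVisitor. The class below is a hypothetical, reduced stand-in: it covers missing parameter hints, missing return hints, and missing docstrings, and does not attempt the docstring quote-style check.

```python
import ast


class HintAndDocChecker(ast.NodeVisitor):
    """Hypothetical reduced checker; collects (line, message) pairs."""

    def __init__(self) -> None:
        self.violations: list[tuple[int, str]] = []

    def _check_function(self, node: "ast.FunctionDef | ast.AsyncFunctionDef") -> None:
        args = node.args
        for arg in args.posonlyargs + args.args + args.kwonlyargs:
            if arg.annotation is None and arg.arg not in ("self", "cls"):
                self.violations.append(
                    (node.lineno, f"{node.name}: parameter '{arg.arg}' has no type hint")
                )
        if node.returns is None:
            self.violations.append((node.lineno, f"{node.name}: no return type hint"))
        if ast.get_docstring(node) is None:
            self.violations.append((node.lineno, f"{node.name}: no docstring"))
        self.generic_visit(node)  # keep walking into nested definitions

    visit_FunctionDef = _check_function
    visit_AsyncFunctionDef = _check_function
```

Usage follows the usual visitor pattern: parse the source with ast.parse, call visit on the resulting tree, then read the violations list.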
get_autopep8_fixable_codes — Run 'autopep8 --list-fixes' (via subprocess) to discover exactly
get_autopep8_fixable_codes() -> 'set[str]'
Run 'autopep8 --list-fixes' (via subprocess) to discover exactly
which Flake8 error codes autopep8 knows how to fix.
Returns a set like {"E101","E111", ...}.
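Turning the listing into a set is mostly a parsing exercise. The sketch below assumes each line of the 'autopep8 --list-fixes' output starts with a code followed by a description; parse_fixable_codes is a hypothetical name and the sample text in the test is made up for illustration.

```python
import re


def parse_fixable_codes(list_fixes_output: str) -> set[str]:
    """Hypothetical parser: collect the leading E/W codes from each line."""
    codes: set[str] = set()
    for line in list_fixes_output.splitlines():
        match = re.match(r"\s*([EW]\d{2,4})\b", line)
        if match:
            codes.add(match.group(1))
    return codes
```

The text to parse would typically come from something like subprocess.run(["autopep8", "--list-fixes"], capture_output=True, text=True).stdout.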
interactive_flake8 — 1) Run the flake8 API for summary counts.
interactive_flake8(options: 'Options', path: 'str | os.PathLike[str]', ignore_codes: 'list[str] | None' = None, diff_choice: 'int' = 1, max_line_length: 'int' = 100, changed_color: 'str' = '\x1b[94m', deleted_color: 'str' = '\x1b[91m', added_color: 'str' = '\x1b[93m') -> 'bool'
1) Run the flake8 API for summary counts.
2) Shell out to flake8 CLI once to harvest one description per code.
3) For each code, ask the user; on "yes", call autopep8 to fix only that code.
Args:
options: The parsed command-line options. Contains:
- bugbear_choice: Whether to include flake8-bugbear checks.
path: Path to the Python file to check.
diff_choice: How many context lines to show in the diff (0 = old-style diff,
1 = unified diff with 0 context lines,
2+ = unified diff with 'diff_choice - 1' context lines).
ignore_codes: List of Flake8 codes to ignore (default: empty list).
max_line_length: Maximum line length for E501 (default: 100).
changed_color: Color for unchanged characters in changed lines (default: ANSI_CYAN).
deleted_color: Color for deleted characters in original lines (default: ANSI_RED).
added_color: Color for added characters in changed lines (default: ANSI_YELLOW).
Returns:
False if the user chose to quit during any replacement prompts, True otherwise.
multireplace — Perform a multi-file replace operation.
multireplace(options: 'Options', verbose: 'bool' = True) -> 'None'
Perform a multi-file replace operation.
Args:
options: The parsed command-line options. Contains:
- old_str: The text to be replaced in the files.
- new_str: The text to replace the old_str.
- glob_pattern: Glob pattern of files to edit.
- dir: Directory to search in.
- recursive: Whether to search recursively in subdirectories.
verbose: If True, log messages about files with no occurrences found (default: True).
Returns:
None. Modifies files in place if the user confirms the changes.
Raises:
ValueError: If the glob pattern is invalid.
FileNotFoundError: If the specified directory does not exist.
NotADirectoryError: If the specified path is not a directory.
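The file-discovery half of such a tool can be sketched with pathlib. This is an illustration under stated assumptions (UTF-8 files, plain substring matching); plan_replacements is a hypothetical name, and it only computes the new contents so that a confirmation step can decide whether to write them.

```python
from pathlib import Path


def plan_replacements(
    directory: str, glob_pattern: str, old_str: str, new_str: str,
    recursive: bool = False,
) -> dict[Path, str]:
    """Hypothetical helper: map each matching file to its would-be new text."""
    root = Path(directory)
    if not root.exists():
        raise FileNotFoundError(root)
    if not root.is_dir():
        raise NotADirectoryError(root)
    matches = root.rglob(glob_pattern) if recursive else root.glob(glob_pattern)
    planned: dict[Path, str] = {}
    for file_path in sorted(matches):
        if not file_path.is_file():
            continue
        text = file_path.read_text(encoding="utf-8")
        if old_str in text:
            planned[file_path] = text.replace(old_str, new_str)
    return planned  # nothing has been written to disk
```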
run_flake8 — Run Flake8 on 'path', but:
run_flake8(options: 'Options', path: 'str | os.PathLike[str]', ignore_codes: 'list[str] | None' = None, max_line_length: 'int' = 100) -> 'flake8.Report'
Run Flake8 on 'path', but:
- only flag E501 if a line exceeds 'max_line_length',
- ignore whatever codes are in 'ignore_codes'.
Args:
options: Options instance containing various settings.
path: The path to the Python file to check.
ignore_codes: A list of Flake8 error/warning codes to ignore.
max_line_length: The (custom) maximum allowed line length for E501 checks.
Returns:
flake8.Report: The Flake8 report object containing the results.
Raises:
FileNotFoundError: If the specified file does not exist.
run_mypy — Run basic mypy static analysis on the specified file.
run_mypy(options: 'Options', path: 'str | os.PathLike[str]') -> 'None'
Run basic mypy static analysis on the specified file.
Args:
options: The parsed command-line options. (Currently unused but included for consistency.)
path: Path to the Python file to analyze.
Returns:
None.
treeview — Directory tree with new-file highlighting
Layer 7. from emmykit.treeview import …
Renders a colored ASCII tree starting at a directory, marking files newer than a cutoff.
treeview_new_files — Recursively scan the directory, print the contents of files newer than last_file_path (if provided- if so store its mod…
treeview_new_files(directory: 'str | os.PathLike[str]', last_file_path: 'str | os.PathLike[str] | None' = None, last_mtime: 'float | None' = None, maxlines: 'int' = 0, use_colors: 'bool' = True, print_root: 'bool' = True, prefix: 'str' = '', is_last: 'bool' = True, level: 'int' = 0, state: 'dict[str, Any] | None' = None, probe_only: 'bool' = False) -> 'bool'
Recursively scan the directory and print the contents of files newer than last_file_path. If last_file_path is provided, its modification time is stored in last_mtime. Return True if any relevant files are found.
Args:
directory: The directory to scan.
last_file_path: The optional path to a chosen file. Only files newer than this will be printed.
last_mtime: The modification time of the last_file_path. If None, all files will be
considered.
maxlines: The maximum number of lines to read from each file. 0 means don't read at all,
-1 means read all lines, otherwise read up to maxlines (default 0).
use_colors: Whether to use ANSI color codes in the output (default True).
print_root: If True, print the root directory name (default True).
prefix: The prefix to use for logging output (default '').
is_last: Whether this is the last item in the current level (default True).
level: The current recursion level (default 0).
state: A dictionary to maintain state across recursive calls (default None).
probe_only: If True, do not print file contents, just check for existence of relevant
files (default False).
Returns:
True if any relevant files are found or the directory itself is newer than last_mtime,
False otherwise.
Raises:
None: Catches exceptions, logs an error and returns False if the directory is not a valid
directory or does not exist.
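The core "newer than a cutoff" test can be sketched without any of the tree drawing. files_newer_than is a hypothetical name; it returns the matching paths rather than printing them, and treats a missing cutoff as "everything qualifies", as described for last_mtime above.

```python
import os
from pathlib import Path


def files_newer_than(directory: str, cutoff_mtime: "float | None" = None) -> list[Path]:
    """Hypothetical helper: files under `directory` modified after the cutoff."""
    newer: list[Path] = []
    for dirpath, dirnames, filenames in os.walk(directory):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            file_path = Path(dirpath) / name
            if cutoff_mtime is None or file_path.stat().st_mtime > cutoff_mtime:
                newer.append(file_path)
    return newer
```

The cutoff would usually be taken from the reference file, for example Path(last_file_path).stat().st_mtime.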
docker_utils — Docker daemon + image lifecycle helpers
Layer 7. from emmykit.docker_utils import …
Ensure the docker daemon is running, the requested image is built, and rerun a command with auto-fixes when daemon or image is missing.
ensure_daemon_running — Check if the Docker daemon is running; if not, attempt to start it.
ensure_daemon_running() -> 'None'
Check if the Docker daemon is running; if not, attempt to start it.
ensure_docker_installed — Check if the Docker CLI is installed; if not, raise an error.
ensure_docker_installed() -> 'None'
Check if the Docker CLI is installed; if not, raise an error.
ensure_image_built — Ensure that a Docker image with the given name exists; if not, build it.
ensure_image_built(image: 'str', *, dockerfile: 'Path | None' = None, build_dir: 'Path | None' = None, build_cmd: 'str | None' = None) -> 'None'
Ensure that a Docker image with the given name exists; if not, build it.
You can specify either a dockerfile (whose first line is a comment with the build command)
or a build_cmd (and optionally a build_dir). If both dockerfile and build_cmd are None,
the function will raise an error.
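The "first line is a comment with the build command" convention can be sketched as below. This is an assumption about how such a line might be read; build_command_from_dockerfile is a hypothetical name, and the example Dockerfile in the test is invented.

```python
import shlex
from pathlib import Path


def build_command_from_dockerfile(dockerfile: Path) -> list[str]:
    """Hypothetical helper: read the build command from a leading '#' comment."""
    with dockerfile.open(encoding="utf-8") as handle:
        first_line = handle.readline().strip()
    if not first_line.startswith("#"):
        raise ValueError(f"{dockerfile}: first line is not a comment")
    command = first_line.lstrip("#").strip()
    if not command:
        raise ValueError(f"{dockerfile}: comment holds no build command")
    # Split like a shell would, so the result can go to subprocess.run(...).
    return shlex.split(command)
```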
run_with_docker_fixes — Run a command (typically 'docker run ...') and if it fails, attempt to fix
run_with_docker_fixes(base_args: 'list[str]', *, ensure_build: 'Callable[[], None] | None' = None, extra_fixes: 'Iterable[Callable[[], None]] | None' = None) -> 'MyPopenResult'
Run a command (typically 'docker run ...') and if it fails, attempt to fix
common Docker issues (like Docker not installed or daemon not running) and retry.
Args:
base_args: The command and its arguments to run (e.g., ['docker', 'run', ...]).
ensure_build: An optional function to ensure a Docker image is built.
If provided, it will be called if the initial command fails.
extra_fixes: An optional iterable of additional fix functions to try if the command fails.
Returns:
The result of the successful command.
Raises:
RuntimeError: If all fixes fail and the command still does not succeed.
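The run, fix, retry loop can be sketched generically. run_with_fixes is a hypothetical name and the sketch returns a plain subprocess.CompletedProcess rather than MyPopenResult; the order in which emmykit applies its built-in fixes is not shown in this README, so here the fixes are simply tried in the order given.

```python
import subprocess
from collections.abc import Callable, Iterable, Sequence


def run_with_fixes(
    args: Sequence[str], fixes: Iterable[Callable[[], None]]
) -> "subprocess.CompletedProcess[str]":
    """Hypothetical helper: retry `args` after each fix until it succeeds."""
    result = subprocess.run(list(args), capture_output=True, text=True)
    if result.returncode == 0:
        return result
    for fix in fixes:
        fix()  # e.g. start a daemon or build an image
        result = subprocess.run(list(args), capture_output=True, text=True)
        if result.returncode == 0:
            return result
    raise RuntimeError(f"command still failing after all fixes: {list(args)}")
```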
system — OS-level process + resource helpers
Layer 7. from emmykit.system import …
kill_process, is_process_running, start_only_one_instance (PID-lock idempotency), detect_country (IP geoloc), and file-manager / terminal-launcher entry points.
check_if_command_exists — Check if a command exists on the system.
check_if_command_exists(command: 'str') -> 'bool'
Check if a command exists on the system.
Args:
command: The command to check.
Returns:
True if the command exists, False otherwise.
detect_country — Detect the country of the IP address using ipinfo.io service.
detect_country(force_wtfismyip: 'bool' = False) -> 'str | None'
Detect the country of the IP address using ipinfo.io service.
If the request fails, it falls back to wtfismyip.com service.
Args:
force_wtfismyip: If True, always use wtfismyip.com
Returns:
The country name as a string, or None if detection fails.
Raises:
ValueError: If the IPINFO_API_TOKEN environment variable is not set.
get_effective_free_memory — Return the "effective" free memory in bytes: free memory plus buffers plus cache.
get_effective_free_memory() -> 'float'
Return the "effective" free memory in bytes: free memory plus buffers plus cache.
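On Linux, "free plus buffers plus cache" can be read from /proc/meminfo. The sketch below assumes that source and parses text in that format; effective_free_bytes is a hypothetical name, and how emmykit obtains the figures on other platforms is not covered here.

```python
def effective_free_bytes(meminfo_text: str) -> int:
    """Hypothetical helper: MemFree + Buffers + Cached from /proc/meminfo text."""
    fields: dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            # The three fields used below are reported in kB.
            fields[key.strip()] = int(parts[0]) * 1024
    return fields["MemFree"] + fields["Buffers"] + fields["Cached"]
```

On a Linux machine the input would come from open("/proc/meminfo").read().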
is_process_running — Check if a process with the given name is running.
is_process_running(process_name: 'str') -> 'bool'
Check if a process with the given name is running.
kill_process — Kill a process by its name, then check if it is still running and retry if needed. Make sure the process name is unique…
kill_process(pname: 'str') -> 'None'
Kill a process by its name, then check if it is still running and retry if needed. Make sure the process name is unique to avoid killing unintended processes.
open_filemanager_with_dirs — Open the file manager with the specified directories.
open_filemanager_with_dirs(directories: 'list[str | os.PathLike[str]]') -> 'None'
Open the file manager with the specified directories.
Note: Most file managers don't support multiple tabs via command line, so open separate windows.
open_terminal_and_run_command — Open a GNOME terminal, source ~/.bashrc (via bash -i), run the_command,
open_terminal_and_run_command(the_command: 'str', close_after: 'bool' = False, maximize_window: 'bool' = False) -> 'None'
Open a GNOME terminal, source ~/.bashrc (via bash -i), run the_command,
and optionally close or keep the window open. Optionally, maximize it.
start_only_one_instance — Start a process, but only if it's not already running.
start_only_one_instance(process_name: 'str') -> 'None'
Start a process, but only if it's not already running.
media — Video / audio helpers (ffmpeg, VLC, system volume)
Layer 7. from emmykit.media import …
Open paths in VLC, find the bundled ffmpeg, query video duration, slice + concatenate video segments, and set the system volume via pulsectl.
ensure_even_dimensions — Ensure the image at 'image_path' has dimensions divisible by 2, by resizing if necessary.
ensure_even_dimensions(image_path: 'str | os.PathLike[str]') -> 'None'
Ensure the image at 'image_path' has dimensions divisible by 2, by resizing if necessary.
extract_and_concatenate_segments — Extracts segments from a video file and concatenates them into a new file.
extract_and_concatenate_segments(input_file: 'str | os.PathLike[str]', timestamps: 'list', output_name_or_path: 'str | os.PathLike[str]', subtitle_file: 'str | os.PathLike[str]') -> 'None'
Extracts segments from a video file and concatenates them into a new file.
find_ffmpeg — Return a full path string to an ffmpeg executable if found, else None.
find_ffmpeg() -> 'str | None'
Return a full path string to an ffmpeg executable if found, else None.
Tries: env vars, PATH, common Conda and Windows/Cygwin/MSYS installs,
and (optionally) imageio-ffmpeg if available.
Args:
None
Returns:
A string containing the path to the ffmpeg executable or None if not found.
Raises:
None
get_video_duration_seconds — Return the duration of a video file in seconds, using fast and reliable probes.
get_video_duration_seconds(path: 'str | os.PathLike[str]', timeout: 'float' = 10.0) -> 'float'
Return the duration of a video file in seconds, using fast and reliable probes.
The function prefers `ffprobe` (from FFmpeg) for speed and accuracy, falls back to
`mediainfo` if available, and finally attempts an OpenCV-based estimate if neither
CLI is present. All filesystem paths are handled via `pathlib.Path`.
Args:
path: Path to the video file (string path or os.PathLike). Converted to Path.
timeout: Per-process timeout (in seconds) for external probes.
Returns:
The duration of the video in seconds as a float.
Raises:
FileNotFoundError: If the given path does not exist or is not a file.
RuntimeError: If duration could not be determined by any available method.
ValueError: If a probe returns an invalid or non-positive duration.
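The preferred ffprobe path can be sketched as follows. This covers only the first probe, with no mediainfo or OpenCV fallback; parse_duration and ffprobe_duration are hypothetical names, and the ffprobe binary is assumed to be on PATH.

```python
import math
import subprocess
from pathlib import Path


def parse_duration(raw: str) -> float:
    """Validate a probe's text output as a positive, finite number of seconds."""
    try:
        seconds = float(raw.strip())
    except ValueError as exc:
        raise ValueError(f"invalid duration: {raw!r}") from exc
    if not math.isfinite(seconds) or seconds <= 0:
        raise ValueError(f"non-positive or non-finite duration: {raw!r}")
    return seconds


def ffprobe_duration(path: str, timeout: float = 10.0) -> float:
    """Hypothetical helper: ask ffprobe for the container duration."""
    video = Path(path)
    if not video.is_file():
        raise FileNotFoundError(video)
    completed = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", str(video)],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_duration(completed.stdout)
```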
open_dir_in_VLC — Create a playlist of the files in the specified directory, then play that playlist in VLC. By default, don't search the…
open_dir_in_VLC(the_dir: 'str | os.PathLike[str]', sort_choice: 'str' = 'sort_by_name', recursive: 'bool' = False, no_start: 'bool' = False) -> 'None'
Create a playlist of the files in the specified directory, then play that playlist in VLC. By default, don't search the directory recursively and sort the files by name. Optional arguments allow recursive loading or sorting by modification time. If no_start is True, don't start playback in VLC.
open_in_vlc — Open a file or directory in VLC. If it's a directory, create a playlist of its contents first. If no_start is True, don…
open_in_vlc(path: 'str | os.PathLike[str]', no_start: 'bool' = False) -> 'None'
Open a file or directory in VLC. If it's a directory, create a playlist of its contents first. If no_start is True, don't start playback in VLC.
Args:
path: The file or directory path to open in VLC.
no_start: If True, VLC will open the file or playlist but not start playback automatically
(default: False).
Returns:
None: The function performs the action of opening VLC and does not return any value.
Raises:
FileNotFoundError: If the specified path does not exist.
open_playlist_in_VLC — Open a playlist in VLC. If no_start is True, don't start playback in VLC.
open_playlist_in_VLC(playlist: 'str | os.PathLike[str]', no_start: 'bool' = False) -> 'None'
Open a playlist in VLC. If no_start is True, don't start playback in VLC.
set_system_volume — Set the system volume to a specific level.
set_system_volume(percent: 'int', tolerance: 'int' = 1, change_mute: "Literal['mute', 'unmute'] | None" = None, force_pactl: 'bool' = False) -> 'None'
Set the system volume to a specific level.
On Linux, this function will:
Try to set the PulseAudio default sink volume to 'percent'% via pulsectl,
verify it, and if that fails, fall back to pactl.
Args:
percent: Desired volume level (0–100).
tolerance: Allowed percent difference when verifying (default: 1%).
change_mute: If set to "mute", the function will mute the audio instead of
setting a specific volume. If set to "unmute", it will unmute
the audio. If None, it will not change the mute state.
force_pactl: If True, always use pactl even if pulsectl is available (default: False).
Returns:
None
Raises:
RuntimeError: If the volume could not be set or verified.
html_files — HTML filename munging + multi-file combination
Layer 7. from emmykit.html_files import …
Strip a leading prefix from filenames or <title> tags, and concatenate multiple HTML files into one.
combine_html_files — Combine multiple HTML files into a single HTML file.
combine_html_files(file_paths: 'list[str | os.PathLike[str]]', output_file_path: 'str | os.PathLike[str]') -> 'None'
Combine multiple HTML files into a single HTML file.
The first file's <head> is preserved, and all <body> contents are concatenated.
Args:
file_paths: List of (presorted) file paths to the HTML files to combine.
output_file_path: Path to save the combined HTML file.
Returns:
None: the combined HTML is saved to the specified output file path.
Raises:
Exception: If there is an error reading any of the HTML files or writing the output file.
FileNotFoundError: If any of the input files do not exist.
ValueError: If the output file path is not valid.
ImportError: If BeautifulSoup is not installed.
RuntimeError: If the output file cannot be written.
OSError: If there is an error during file operations.
remove_prefix_from_filename — If the given filepath's base filename starts with the given prefix:
remove_prefix_from_filename(filepath: 'str | os.PathLike[str]', prefix: 'str') -> 'bool'
If the given filepath's base filename starts with the given prefix:
1. Remove the prefix (and any " _-" immediately following it).
2. Move the file (but only if that doesn't cause errors).
Args:
filepath: The path to the file whose name may need to be changed.
prefix: The prefix to remove from the filename.
Returns:
True: If the file was successfully renamed, or if it didn't need renaming.
False: If the file was not renamed because it didn't start with the prefix,
or if the new filename already exists.
Raises:
OSError: If the rename operation fails due to an OS error (e.g., permission denied).
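Step 1 above, the name computation, can be sketched as a pure string function. strip_prefix is a hypothetical name; the rename itself and the collision check are left out.

```python
def strip_prefix(filename: str, prefix: str) -> str:
    """Hypothetical helper: drop `prefix` and any ' ', '_' or '-' right after it."""
    if not prefix or not filename.startswith(prefix):
        return filename  # nothing to do
    # Note: the result can be empty if the filename is nothing but the prefix.
    return filename[len(prefix):].lstrip(" _-")
```

A caller would then compare the result with the original name and only attempt a move when they differ and the target does not already exist.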
remove_prefix_from_html_title — If the given filepath is an HTML file and its title starts with the given prefix, remove the prefix from the title and …
remove_prefix_from_html_title(filepath: 'str | os.PathLike[str]', prefix: 'str') -> 'bool'
If the given filepath is an HTML file and its title starts with the given prefix, remove the prefix from the title and save the file, then return True. Otherwise, return False.
llm — LLM wrapper, config dataclasses, model selection
Layer 8. from emmykit.llm import …
The 2 000-LOC LLMs class wraps litellm/tiktoken with LLMConfig/ModelInfo dataclasses, a SelectionStrategy enum, and lazy backoff via tenacity.
LLMConfig — Configuration for LLM selection and usage. Data only.
Configuration for LLM selection and usage. Data only.
Fields: only_cleared_models, only_local_models, allow_local_models, ollama_base_url, vllm_base_url, rate_throttle, rate_headroom, rate_retry_max_attempts, rate_retry_max_wait, rate_db_path, availability_probe, availability_probe_ttl_sec, availability_probe_timeout, availability_probe_allow_costly, selection_strategy, min_context_tokens, assumed_prompt_tokens, assumed_output_tokens, candidate_models, default_temperature, max_tokens, model_scores, prefer_code, prefer_low_TTFT, prefer_local, max_estimated_cost, speed_floor, model_filter, provider_filter, weight_price, weight_code_skill, weight_general_skill, weight_TTFT, weight_speed, weight_nonlocal_penalty.
LLMs — - Routes via LiteLLM
LLMs() -> 'None'
- Routes via LiteLLM
- Builds ModelInfo list, filters by availability/context
- Applies registered/built-in selection strategy
- Exposes stable send_prompt(...)
Public methods: alternative_model, apply_config, describe_selection, get_config, list_candidates, refresh_selection, register_strategy, send_prompt, tokenize.
ModelInfo — Information about a candidate Large Language Model (LLM).
Information about a candidate Large Language Model (LLM).
Fields: name, provider, context_window, input_cost_per_token, output_cost_per_token, available, is_local, cleared, runtime, parameters, code_skill, general_skill, TTFT, speed, meta.
Public methods: estimate_cost.
SelectionContext — Context passed to strategy functions.
SelectionContext(tokens_in: 'int', tokens_out: 'int', min_context_tokens: 'int', require_local: 'bool' = False, require_cleared: 'bool' = False, extras: 'dict[str, Any]' = <factory>) -> None
Context passed to strategy functions.
Fields: tokens_in, tokens_out, min_context_tokens, require_local, require_cleared, extras.
SelectionStrategy — Enumeration of selection strategies for model selection.
SelectionStrategy(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Enumeration of selection strategies for model selection.
Public methods: none specific to the enum; the exposed methods (capitalize through zfill) are the standard string methods inherited from str.
StrategyFn — TypeAlias
StrategyFn: TypeAlias = collections.abc.Callable[[collections.abc.Sequence[emmykit.llm.ModelInfo], emmykit.llm.SelectionContext], emmykit.llm.ModelInfo]
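A function matching this alias takes the candidate list plus a context and returns one model. The sketch below uses hypothetical stub dataclasses with a subset of the documented fields so it runs without emmykit installed; cheapest_fitting is an invented example strategy, and how a strategy is registered with LLMs is not shown here.

```python
from collections.abc import Callable, Sequence
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ModelInfoStub:  # hypothetical stand-in for emmykit.llm.ModelInfo
    name: str
    context_window: int
    input_cost_per_token: float
    output_cost_per_token: float
    is_local: bool = False


@dataclass
class SelectionContextStub:  # hypothetical stand-in for SelectionContext
    tokens_in: int
    tokens_out: int
    min_context_tokens: int
    require_local: bool = False
    extras: dict[str, Any] = field(default_factory=dict)


StrategyFnStub = Callable[[Sequence[ModelInfoStub], SelectionContextStub], ModelInfoStub]


def cheapest_fitting(
    models: Sequence[ModelInfoStub], ctx: SelectionContextStub
) -> ModelInfoStub:
    """Pick the lowest estimated-cost model that satisfies the context."""
    eligible = [
        m for m in models
        if m.context_window >= ctx.min_context_tokens
        and (m.is_local or not ctx.require_local)
    ]
    if not eligible:
        raise LookupError("no candidate model satisfies the selection context")
    return min(
        eligible,
        key=lambda m: ctx.tokens_in * m.input_cost_per_token
        + ctx.tokens_out * m.output_cost_per_token,
    )
```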
License
Apache 2.0 — see LICENSE. Changelog at CHANGELOG.md.
File details
Details for the file emmykit-0.3.4.tar.gz.
File metadata
- Download URL: emmykit-0.3.4.tar.gz
- Upload date:
- Size: 218.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dd04dd2f22c878b3ba7faf10c6eceb9222b4d9278c5ec260d60df997857863f5 |
| MD5 | de081f2bceca756a862b7694b486c14a |
| BLAKE2b-256 | e908e83be6a6aeeb0635372fb0484ad8f301983611d95c458d0d57f89bbf90fe |
File details
Details for the file emmykit-0.3.4-py3-none-any.whl.
File metadata
- Download URL: emmykit-0.3.4-py3-none-any.whl
- Upload date:
- Size: 174.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 90777304e5b898585c8538f0cdf584352f741531c8bf60a0466a8f467041bf62 |
| MD5 | 16460452fcc54b2c9212c78831bdc592 |
| BLAKE2b-256 | ab9f5365f36d532a05ac256be49a1258c867797088a330d03bb6172592b7bcd1 |