Skip to main content

Tiny, useful Python lib for strings and files

Project description

strif

Strif is a tiny (~1000 loc) library of a few basic string, file, and object utilities for modern Python.

It has zero dependencies.

It is simply a few functions and tricks that have repeatedly shown value in various projects. The goal is to complement the standard libs and fill in a few gaps, but not replace or wrap standard libraries.

NEW: Version 3.0 is out and has additions and updates for Python 3.10-3.13! ✨

A quick overview is here in the readme. The libs are all small so see pydoc strings or code for full docs.

Installation

# Use uv
uv add strif
# Or poetry
poetry add strif
# Or pip
pip install strif

Text Abbreviations and Formatting

  • abbrev_str(string: str, max_len: Optional[int] = 80, indicator: str = '…')

    Abbreviates a string and appends an indicator if the content exceeds the allowed length.

  • abbrev_list(items: List[Any], max_items: int = 10, item_max_len: Optional[int] = 40, joiner: str = ', ', indicator: str = '…')

    Shortens each element of a list and appends an ellipsis if the list is truncated.

  • single_line(text: str)

    Converts multi-line text into a single line by replacing extra whitespace with spaces.

  • quote_if_needed(arg: Any)

    Returns a string with quotes if needed for proper display (for example, for filenames with spaces).

String Identifiers, Timestamps, and Hashing

[!TIP]

These functions use base 36. In my humble opinion, base 36 ids are underrated and should be used more often.

Base 36 is briefer than hex and avoids ugly non-alphanumeric characters like base 64. And they are case insensitive, which is generally wise: If you use identifiers for filenames, you definitely should prefer case insensitive identifiers because of case-insensitive filesystems (like macOS). And they're easier to read aloud over the phone for an auth code or to type manually.

If you need a readable, concise identifier, api key format, or hash format, consider

  • new_uid(bits: int = 64)

    Generates a random base36 alphanumeric string with at least the specified bits of randomness. Suitable for filenames (especially on case-insensitive filesystems).

  • new_timestamped_uid(bits: int = 32)

    Creates a unique ID starting with an ISO timestamp, then fractions of seconds and bits of randomness. Example: 20150912T084555Z-378465-43vtwbx

  • iso_timestamp(microseconds: bool = True)

    Returns an ISO 8601 timestamp in UTC, e.g. 2015-09-12T08:41:12.397217Z (with microseconds) or 2015-09-12T08:41:12Z (without).

  • format_iso_timestamp(datetime_obj: datetime, microseconds: bool = True)

    Formats a given datetime object as an ISO 8601 timestamp, ensuring UTC formatting with a trailing Z.

  • clean_alphanum(string: str, max_length: Optional[int] = None)

    Converts a string to a clean identifier by keeping only the first alphanumeric characters and replacing others with underscores.

  • clean_alphanum_hash(string: str, max_length: int = 64, max_hash_len: Optional[int] = None)

    Combines the cleaned version of a string with a base36 SHA1 hash to minimize collisions.

File Hashing

  • file_mtime_hash(path: str | Path)

    Computes a fast hash using a file's name, size, and high-resolution modification time, without looking at file contents. A useful key for fast caching of file contents.

  • hash_string(string: str, algorithm: str = 'sha1') -> Hash and hash_file(file_path: str | Path, algorithm: str = 'sha1') -> Hash

    Provide flexible hashing mechanisms. The returned Hash object has properties to output the digest in hexadecimal, base36, or with a prefixed algorithm name.

Atomic File Operations with Optional Backups

[!TIP]

It’s generally good practice when creating files to write to a file with a temporary name, and move it to a final location once the file is complete. This way, you never leave partial, incorrect versions of files in a directory due to interruptions or failures.

  • atomic_output_file(dest_path: str | Path, make_parents: bool = False, backup_suffix: Optional[str] = None, tmp_suffix: str = '.partial')

    A context manager for writing files or directories atomically. A temporary file is created and, upon successful completion, renamed to the target location.

  • copyfile_atomic(source_path: str | Path, dest_path: str | Path, make_parents: bool = False, backup_suffix: Optional[str] = None)

    Atomically copies a file while preserving its timestamps.

  • copytree_atomic(source_path: str | Path, dest_path: str | Path, make_parents: bool = False, backup_suffix: Optional[str] = None, symlinks: bool = False)

    Recursively copies a directory or file atomically.

  • move_to_backup(path: str | Path, backup_suffix: str = '{timestamp}.bak') and copy_to_backup(path: str | Path, backup_suffix: str = '{timestamp}.bak')

    Functions to move or copy an existing file or directory to a backup destination.

  • move_file(src_path: Path, dest_path: Path, keep_backup: bool = True, backup_suffix: str = '{timestamp}.bak')

    Moves a file to a new location, automatically creating parent directories and optionally keeping a backup of the destination if it already exists.

For example, it is generally a good idea to wrap an open() call with atomic_output_file():

with atomic_output_file("some-dir/my-final-output.txt") as temp_target:
    with open(temp_target, "w") as f:
        f.write("some contents")

And this can (and in most cases should) be used in place of shutil.copyfile:

copyfile_atomic(source_path, dest_path, make_parents=True, backup_suffix=None)

Now if there is some issue during write, the output will instead be at a temporary location in the same directory (with a name like some-dir/my-final-output.txt.partial.XXXXX.) This ensures integrity of the file appearing in the final location.

There are also some handy additional options:

with atomic_output_file("some-dir/my-final-output.txt",
                        make_parents=True, backup_suffix=".old.{timestamp}") as temp_target:
    with open(temp_target, "w") as f:
        sf.write("some contents")

This creates parent folders as needed (a major convenience). And if you would have clobbered a previous output, it keeps a backup with a (fixed or uniquely timestamped) suffix.

Used judiciously, these options can save boilerplate coding and avoid debugging ugly corner case failures with zero-length or truncated files.

Syntax Sugar for Temporary Files

Syntax sugar for auto-deleting temporary files or directories using with:

with temp_output_file("my-scratch.") as (fd, path):
    # Do a bunch of stuff with the opened file descriptor or path, knowing
    # it will be removed assuming successful termination.


with temp_output_dir("work-dir.", dir="/var/tmp") as work_dir:
    # Create some files in the now-existing path work_dir, and it will be
    # deleted afterwards.

Note these don’t delete files in case of error, which is usually what you want. Add always_clean=True if you want the temporary file or directory to be removed no matter what.

Multiple String Replacements

  • insert_multiple(text: str, insertions: list[Insertion]) -> str

    Insert multiple strings into text at the given offsets, at once.

  • replace_multiple(text: str, replacements: list[Replacement]) -> str

    Replace multiple substrings in text with new strings, simultaneously. The replacements are a list of tuples (start_offset, end_offset, new_string).

Simple String Template

A validated template string that supports only specified fields. Can subclass to have a type with a given set of allowed_fields. Provide a type with a field name to allow validation of int/float format strings.

Examples:

>>> t = StringTemplate("{name} is {age} years old", ["name", "age"])
>>> t.format(name="Alice", age=30)
'Alice is 30 years old'

>>> t = StringTemplate("{count:3d}@{price:.2f}", [("count", int), ("price", float)])
>>> t.format(count=10, price=19.99)
' 10@19.99'

Atomic Vars

AtomicVar is a simple zero-dependency thread-safe variable that works for any type.

Often the standard "Pythonic" approach is to use locks directly, but for some common use cases, AtomicVar may be simpler and more readable. Works on any type, including lists and dicts.

Other options include threading.Event (for shared booleans), threading.Queue (for producer-consumer queues), and multiprocessing.Value (for process-safe primitives).

Examples:

# Immutable types are always safe:
count = AtomicVar(0)
count.update(lambda x: x + 5)  # In any thread.
count.set(0)  # In any thread.
current_count = count.value  # In any thread.

# Useful for flags:
global_flag = AtomicVar(False)
global_flag.set(True)  # In any thread.
if global_flag:  # In any thread.
    print("Flag is set")


# For mutable types,consider using `copy` or `deepcopy` to access the value:
my_list = AtomicVar([1, 2, 3])
my_list_copy = my_list.copy()  # In any thread.
my_list_deepcopy = my_list.deepcopy()  # In any thread.

# For mutable types, the `updates()` context manager gives a simple way to
# lock on updates:
with my_list.updates() as value:
    value.append(5)

# Or if you prefer, via a function:
my_list.update(lambda x: x.append(4))  # In any thread.

# You can also use the var's lock directly. In particular, this encapsulates
# locked one-time initialization:
initialized = AtomicVar(False)
with initialized.lock:
    if not initialized:  # checks truthiness of underlying value
        expensive_setup()
        initialized.set(True)

# Or:
lazy_var: AtomicVar[list[str] | None] = AtomicVar(None)
with lazy_var.lock:
    if not lazy_var:
            lazy_var.set(expensive_calculation())

FAQ

Why bother, if it’s so short?

Because it saves time, saves you stupid bugs and clumsy repetition, and has zero (yes zero) dependencies.

Aren't there other libraries that offer these utilities?

A few yes, but most libs I've seen that offer general-purpose tools are larger or more opinionated. And some like atomic_output_file() or the base36 tools I find useful but haven't seen equivalents elsewhere.

If you don't want the dependency on strif, also feel free to just copy the bit you want! They're each short.

Is it mature?

I’ve used many of these functions in production situations for years. It doesn't have comprehensive tests, however.


Development

For how to install uv and Python, see installation.md.

For development workflows, see development.md.

For instructions on publishing to PyPI, see publishing.md.


This project was built from simple-modern-uv.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

strif-3.0.1.tar.gz (34.0 kB view details)

Uploaded Source

Built Distribution

strif-3.0.1-py3-none-any.whl (18.4 kB view details)

Uploaded Python 3

File details

Details for the file strif-3.0.1.tar.gz.

File metadata

  • Download URL: strif-3.0.1.tar.gz
  • Upload date:
  • Size: 34.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for strif-3.0.1.tar.gz
Algorithm Hash digest
SHA256 a312ea9540062ccd386132ac2eece93c0c4adaeb25a5f5da11eed7867b1ff31e
MD5 0be2b66c8f327e7c423bf52e13241c92
BLAKE2b-256 5c63bf90e5632aea866c1bbd2567f04a7026191728c648ecf44ff489829898a3

See more details on using hashes here.

Provenance

The following attestation bundles were made for strif-3.0.1.tar.gz:

Publisher: publish.yml on jlevy/strif

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file strif-3.0.1-py3-none-any.whl.

File metadata

  • Download URL: strif-3.0.1-py3-none-any.whl
  • Upload date:
  • Size: 18.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for strif-3.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5dca44af3ff2f8a947a0398b619d138014f4d29a5aaf6c7e73fe7bec5f5973da
MD5 17814e6e9102c7a9470bf2e7d43a32b6
BLAKE2b-256 a4fd94ebfa518f46b473d238a8ad5837ce75cfe4ea0ce5a91544a88198d8a8d3

See more details on using hashes here.

Provenance

The following attestation bundles were made for strif-3.0.1-py3-none-any.whl:

Publisher: publish.yml on jlevy/strif

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page