convenience functions and classes for files and filenames/pathnames

Project description

Assorted convenience functions for files and filenames/pathnames.

Function abspath_from_file(path, from_file)

Return the absolute path of path with respect to from_file, as one might do for an include file.
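The include-file behaviour described above can be sketched with os.path alone. This is an illustration, not the library's code: the name abspath_from_file_sketch is hypothetical, and it assumes a relative path is resolved against the directory containing from_file.

```python
import os.path


def abspath_from_file_sketch(path, from_file):
    """Resolve `path` the way an include directive inside `from_file` might."""
    if not os.path.isabs(path):
        # Assumption: relative paths are relative to from_file's directory.
        base_dir = os.path.dirname(os.path.abspath(from_file))
        path = os.path.join(base_dir, path)
    return os.path.normpath(path)


included = abspath_from_file_sketch("extra/inc.conf", "/etc/app/main.conf")
```

An already absolute path is returned unchanged apart from normalisation.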

Class BackedFile

MRO: ReadMixin
A RawIOBase duck type which uses a backing file for initial data and writes new data to a front scratch file.

Method BackedFile.__init__(self, back_file, dirpath=None)

Initialise the BackedFile using back_file for the backing data.

Class BackedFile_TestMethods

Mixin for testing subclasses of BackedFile. Tests self.backed_fp.

Function compare(f1, f2, mode='rb')

Compare the contents of two file-like objects f1 and f2 for equality.

If f1 or f2 is a string, open the named file using mode (default: "rb").
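A chunked comparison along these lines might look like the following sketch. The name compare_sketch and the chunk_size parameter are assumptions for illustration; it also assumes both objects return full-sized reads until EOF, which holds for regular files and in-memory buffers.

```python
from io import BytesIO


def compare_sketch(f1, f2, mode="rb", chunk_size=8192):
    """Return True if the two files have identical content."""
    # A string argument names a file to open with `mode`.
    if isinstance(f1, str):
        with open(f1, mode) as fp1:
            return compare_sketch(fp1, f2, mode, chunk_size)
    if isinstance(f2, str):
        with open(f2, mode) as fp2:
            return compare_sketch(f1, fp2, mode, chunk_size)
    while True:
        chunk1 = f1.read(chunk_size)
        chunk2 = f2.read(chunk_size)
        if chunk1 != chunk2:
            return False
        if not chunk1:
            # Both sides reached EOF together.
            return True


same = compare_sketch(BytesIO(b"abc"), BytesIO(b"abc"))
different = compare_sketch(BytesIO(b"abc"), BytesIO(b"abd"))
```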

Function copy_data(fpin, fpout, nbytes, rsize=None)

Copy nbytes of data from fpin to fpout, return the number of bytes copied.

Parameters:

  • nbytes: number of bytes to copy. If None, copy until EOF.
  • rsize: read size, default DEFAULT_READSIZE.
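The copy loop implied by these parameters can be sketched as below. The names copy_data_sketch and DEFAULT_READSIZE_SKETCH are hypothetical, and the value 8192 is an assumption rather than the library's actual DEFAULT_READSIZE.

```python
from io import BytesIO

DEFAULT_READSIZE_SKETCH = 8192  # assumed value, for illustration only


def copy_data_sketch(fpin, fpout, nbytes, rsize=None):
    """Copy up to nbytes from fpin to fpout; return the count copied."""
    if rsize is None:
        rsize = DEFAULT_READSIZE_SKETCH
    copied = 0
    while nbytes is None or copied < nbytes:
        # Never request more than the remaining byte budget.
        want = rsize if nbytes is None else min(rsize, nbytes - copied)
        chunk = fpin.read(want)
        if not chunk:
            break  # EOF before nbytes were available
        fpout.write(chunk)
        copied += len(chunk)
    return copied


src, dst = BytesIO(b"0123456789"), BytesIO()
first = copy_data_sketch(src, dst, 4, rsize=3)
rest = copy_data_sketch(src, dst, None)
```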

Function datafrom_fd(fd, offset, readsize=None, aligned=True, maxlength=None)

General purpose reader for file descriptors yielding data from offset. This does not move the file offset.
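One way to read from a descriptor without moving its offset is os.pread (POSIX only). The sketch below takes that approach; whether the library does the same is an assumption, the name datafrom_fd_sketch is hypothetical, and the aligned parameter is left out.

```python
import os
import tempfile


def datafrom_fd_sketch(fd, offset, readsize=8192, maxlength=None):
    """Yield chunks read from fd starting at offset, using positional reads."""
    remaining = maxlength
    while remaining is None or remaining > 0:
        size = readsize if remaining is None else min(readsize, remaining)
        # os.pread reads at an explicit offset and leaves the fd's own offset alone.
        chunk = os.pread(fd, size, offset)
        if not chunk:
            break
        yield chunk
        offset += len(chunk)
        if remaining is not None:
            remaining -= len(chunk)


with tempfile.TemporaryFile() as tf:
    tf.write(b"hello world")
    tf.flush()
    chunks = list(datafrom_fd_sketch(tf.fileno(), 6, readsize=2))
    # The file position is still where the write left it.
    position = tf.tell()
```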

Function file_data(fp, nbytes=None, rsize=None)

Read nbytes of data from fp and yield the chunks as read.

Parameters:

  • nbytes: number of bytes to read; if None read until EOF.
  • rsize: read size, default DEFAULT_READSIZE.

Function files_property(func)

A property whose value reloads if any of a list of files changes.

This is just the default mode for make_files_property(). func accepts the file path and returns the new value. The underlying attribute name is '_' + func.__name__, the default from make_files_property(). The attribute {attr_name}_lock controls access to the property. The attributes {attr_name}_filestates and {attr_name}_paths track the associated file states. The attribute {attr_name}_lastpoll tracks the last poll time.

The decorated function is passed the current list of files and returns the new list of files and the associated value. One example use would be a configuration file with recursive include operations: the inner function would parse the first file in the list, and the parse would accumulate this filename and those of any included files so that they can be monitored, triggering a fresh parse if one changes. Example:

class C(object):
  def __init__(self):
    self._foo_path = '.foorc'
  @files_property
  def foo(self, paths):
    # parse() stands in for the caller's own parser
    new_paths, result = parse(paths[0])
    return new_paths, result

The load function is called on the first access and on every access thereafter where an associated file's FileState() has changed and the time since the last successful load exceeds the poll_rate (1s). An attempt at avoiding races is made by ignoring reloads that raise exceptions and ignoring reloads where files that were stat()ed during the change check have changed state after the load.

Function lines_of(fp, partials=None)

Generator yielding lines from a file until EOF. Intended for file-like objects that lack a line iteration API.
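For an object that offers readline() but no iteration, such a generator might be sketched as follows. The name lines_of_sketch is hypothetical and the partials parameter is not modelled.

```python
from io import StringIO


def lines_of_sketch(fp):
    """Yield lines from fp using only its readline() method."""
    while True:
        line = fp.readline()
        if not line:
            break  # an empty result signals EOF
        yield line


lines = list(lines_of_sketch(StringIO("a\nb\nc")))
```

The final line is yielded even when it lacks a trailing newline.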

Function lockfile(path, ext=None, poll_interval=None, timeout=None, runstate=None)

A context manager which takes and holds a lock file.

Parameters:

  • path: the base associated with the lock file.
  • ext: the extension to the base used to construct the lock file name. Default: ".lock"
  • timeout: maximum time to wait before failing. Default: None (wait forever).
  • poll_interval: polling frequency when timeout is not 0.
  • runstate: optional RunState duck instance supporting cancellation.
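A lock file of this kind is commonly taken by creating the file exclusively and polling until that succeeds. The sketch below assumes that technique; it is not the library's implementation, the name lockfile_sketch and the 0.1s default poll interval are hypothetical, and runstate is not modelled.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def lockfile_sketch(path, ext=".lock", poll_interval=0.1, timeout=None):
    """Hold path+ext as a lock file for the duration of the with block."""
    lockpath = path + ext
    start = time.time()
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if timeout is not None and time.time() - start >= timeout:
                raise TimeoutError("timed out waiting for " + lockpath)
            time.sleep(poll_interval)
        else:
            os.close(fd)
            break
    try:
        yield lockpath
    finally:
        os.remove(lockpath)


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "data.db")
    with lockfile_sketch(base) as lockpath:
        held = os.path.exists(lockpath)
        try:
            # timeout=0 requires success on the first attempt.
            with lockfile_sketch(base, timeout=0):
                contended = False
        except TimeoutError:
            contended = True
    released = not os.path.exists(lockpath)
```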

Function longpath(path, environ=None, prefixes=None)

Return path with prefixes and environment variables substituted. The converse of shortpath().

Function make_files_property(attr_name=None, unset_object=None, poll_rate=1.0)

Construct a decorator that watches multiple associated files.

Parameters:

  • attr_name: the underlying attribute, default: '_' + func.__name__
  • unset_object: the sentinel value for "uninitialised", default: None
  • poll_rate: how often in seconds to poll the file for changes, default: 1

The attribute {attr_name}_lock controls access to the property. The attributes {attr_name}_filestates and {attr_name}_paths track the associated files' state. The attribute {attr_name}_lastpoll tracks the last poll time.

The decorated function is passed the current list of files and returns the new list of files and the associated value. One example use would be a configuration file with recursive include operations: the inner function would parse the first file in the list, and the parse would accumulate this filename and those of any included files so that they can be monitored, triggering a fresh parse if one changes. Example:

class C(object):
  def __init__(self):
    self._foo_path = '.foorc'
  @files_property
  def foo(self, paths):
    # parse() stands in for the caller's own parser
    new_paths, result = parse(paths[0])
    return new_paths, result

The load function is called on the first access and on every access thereafter where an associated file's FileState() has changed and the time since the last successful load exceeds the poll_rate (default 1s). An attempt at avoiding races is made by ignoring reloads that raise exceptions and ignoring reloads where files that were stat()ed during the change check have changed state after the load.

Function makelockfile(path, ext=None, poll_interval=None, timeout=None, runstate=None)

Create a lockfile and return its path.

The lockfile can be removed with os.remove. This is the core functionality supporting the lockfile() context manager.

Parameters:

  • path: the base associated with the lock file, often the filesystem object whose access is being managed.
  • ext: the extension to the base used to construct the lockfile name. Default: ".lock"
  • timeout: maximum time to wait before failing. Default: None (wait forever). Note that zero is an accepted value and requires the lock to succeed on the first attempt.
  • poll_interval: polling frequency when timeout is not 0.
  • runstate: optional RunState duck instance supporting cancellation. Note that if a cancelled RunState is provided no attempt will be made to make the lockfile.

Function max_suffix(dirpath, pfx)

Compute the highest existing numeric suffix for names starting with the prefix pfx.

This is generally used as a starting point for picking a new numeric suffix.

Function mkdirn(path, sep='')

Create a new directory named path+sep+n, where n exceeds any name already present.

Parameters:

  • path: the basic directory path.
  • sep: a separator between path and n. Default: ""
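The interplay of max_suffix() and mkdirn() can be sketched as below. The names max_suffix_sketch and mkdirn_sketch are hypothetical, and starting the numbering at 1 when no suffixed name exists is an assumption.

```python
import os
import tempfile


def max_suffix_sketch(dirpath, pfx):
    """Return the largest n for which pfx+str(n) exists in dirpath, else None."""
    maxn = None
    for name in os.listdir(dirpath):
        tail = name[len(pfx):]
        if name.startswith(pfx) and tail.isdecimal():
            n = int(tail)
            if maxn is None or n > maxn:
                maxn = n
    return maxn


def mkdirn_sketch(path, sep=""):
    """Create and return a new directory named path+sep+n."""
    dirpath, base = os.path.split(path)
    maxn = max_suffix_sketch(dirpath or ".", base + sep)
    n = 1 if maxn is None else maxn + 1  # assumed starting point
    while True:
        newpath = path + sep + str(n)
        try:
            os.mkdir(newpath)
        except FileExistsError:
            n += 1  # someone else took this name; try the next one
        else:
            return newpath


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.basename(mkdirn_sketch(os.path.join(tmp, "run"), "-"))
    second = os.path.basename(mkdirn_sketch(os.path.join(tmp, "run"), "-"))
    highest = max_suffix_sketch(tmp, "run-")
```

Retrying on FileExistsError lets os.mkdir() arbitrate between concurrent callers.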

Class NullFile

Writable file that discards its input.

Note that this is not an open of os.devnull; it just discards writes and is not the underlying file descriptor.

Method NullFile.__init__(self)

Initialise the file offset to 0.
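A discard-only writable object with an offset might be sketched like this. The class name NullFileSketch, the offset attribute name and the tell() method are assumptions, not the library's interface.

```python
class NullFileSketch:
    """Writable file-like object that throws its input away."""

    def __init__(self):
        self.offset = 0

    def write(self, data):
        # Track how much was "written" so the offset behaves like a file's.
        dlen = len(data)
        self.offset += dlen
        return dlen

    def flush(self):
        pass  # nothing is buffered

    def tell(self):
        return self.offset


sink = NullFileSketch()
written = sink.write(b"discarded")
```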

Class Pathname

MRO: builtins.str
Subclass of str presenting convenience properties useful for format strings related to file paths.

Function poll_file(path, old_state, reload_file, missing_ok=False)

Watch a file for modification by polling its state as obtained by FileState(). Call reload_file(path) if the state changes. Return (new_state, reload_file(path)) if the file was modified and its state was unchanged (stable) before and after the reload_file() call. Otherwise return (None, None).

This may raise an OSError if the path cannot be os.stat()ed and of course for any exceptions that occur calling reload_file.

If missing_ok is true then a failure to os.stat() which raises OSError with ENOENT will just return (None, None).
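The stat-reload-stat sequence described above can be sketched as follows. The helper file_state_sketch is a hypothetical stand-in for FileState() that compares a few os.stat() fields; the fields chosen are an assumption.

```python
import os
import tempfile


def file_state_sketch(path):
    """Hypothetical stand-in for FileState(): a comparable stat summary."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size, st.st_dev, st.st_ino)


def poll_file_sketch(path, old_state, reload_file, missing_ok=False):
    try:
        new_state = file_state_sketch(path)
    except FileNotFoundError:
        if missing_ok:
            return None, None
        raise
    if new_state == old_state:
        return None, None  # no modification seen
    value = reload_file(path)
    # Discard the result if the file changed while it was being reloaded.
    if file_state_sketch(path) != new_state:
        return None, None
    return new_state, value


def read_text(path):
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    cfg = os.path.join(tmp, "app.conf")
    with open(cfg, "w") as f:
        f.write("v1")
    state, value = poll_file_sketch(cfg, None, read_text)
    unchanged = poll_file_sketch(cfg, state, read_text)
    missing = poll_file_sketch(os.path.join(tmp, "nope"), None, read_text,
                               missing_ok=True)
```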

Function read_data(fp, nbytes, rsize=None)

Read nbytes of data from fp, return the data.

Parameters:

  • nbytes: number of bytes to read. If None, read until EOF.
  • rsize: read size, default DEFAULT_READSIZE.

Function read_from(fp, rsize=None, tail_mode=False, tail_delay=None)

Generator to present text or data from an open file until EOF.

Parameters:

  • rsize: read size, default: DEFAULT_READSIZE
  • tail_mode: if true, yield an empty chunk at EOF, allowing resumption if the file grows.

Class ReadMixin

Useful read methods to accommodate modes not necessarily available in a class.

Note that this mixin presumes that the attribute self._lock is a threading.RLock like context manager.

Classes using this mixin should consider overriding the default .datafrom method with something more efficient or direct.

Function rewrite(filepath, data, mode='w', backup_ext=None, do_rename=False, do_diff=None, empty_ok=False, overwrite_anyway=False)

Rewrite the file filepath with data from the file object data.

Parameters:

  • empty_ok: if not true, raise ValueError if the new data are empty. Default: False.
  • overwrite_anyway: if true (default False), skip the content check and overwrite unconditionally.
  • backup_ext: if a nonempty string, take a backup of the original at filepath + backup_ext.
  • do_diff: if not None, call do_diff(filepath, tempfile).
  • do_rename: if true (default False), rename the temp file to filepath after copying the permission bits. Otherwise (default), copy the tempfile to filepath.
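The rename-based path (do_rename true) can be sketched roughly as below for text data. This is a simplified illustration under stated assumptions, not the library's code: the name rewrite_sketch is hypothetical, the target file is assumed to exist already, and do_diff, overwrite_anyway and the copy-back mode are omitted.

```python
import os
import shutil
import tempfile
from io import StringIO


def rewrite_sketch(filepath, data, backup_ext=None, empty_ok=False):
    """Replace the text file filepath with the content of file object data."""
    new_text = data.read()
    if not new_text and not empty_ok:
        raise ValueError("no data to write to %r" % (filepath,))
    # Use a temp file in the same directory so the rename stays on one filesystem.
    dirpath = os.path.dirname(filepath) or "."
    with tempfile.NamedTemporaryFile("w", dir=dirpath, delete=False) as tf:
        tf.write(new_text)
        tmppath = tf.name
    try:
        if backup_ext:
            shutil.copy2(filepath, filepath + backup_ext)
        shutil.copymode(filepath, tmppath)  # carry over the permission bits
        os.replace(tmppath, filepath)
    except OSError:
        os.unlink(tmppath)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "settings.txt")
    with open(target, "w") as f:
        f.write("old\n")
    rewrite_sketch(target, StringIO("new\n"), backup_ext=".bak")
    with open(target) as f:
        current = f.read()
    with open(target + ".bak") as f:
        backup = f.read()
```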

Function rewrite_cmgr(pathname, mode='w', backup_ext=None, keep_backup=False, do_rename=False, do_diff=None, empty_ok=False, overwrite_anyway=False)

Rewrite a file, presented as a context manager.

Parameters:

  • mode: file write mode, defaulting to "w" for text.
  • backup_ext: backup extension. None means no backup. An empty string generates an extension based on the current time.
  • keep_backup: keep the backup file even if everything works.
  • do_rename: rename the temporary file to the original to update.
  • do_diff: call do_diff(pathname, tempfile) before committing.
  • empty_ok: do not consider empty output an error.
  • overwrite_anyway: update the original even if the new data are identical.

Class RWFileBlockCache

A scratch file for storing data.

Method RWFileBlockCache.__init__(self, pathname=None, dirpath=None, suffix=None, lock=None)

Initialise the file.

Parameters:

  • pathname: path of file. If None, create a new file with tempfile.mkstemp using dir=dirpath and unlink that file once opened.
  • dirpath: location for the file if made by mkstemp as above.
  • lock: an object to use as a mutex, allowing sharing with some outer system. A Lock will be allocated if omitted.

Function saferename(oldpath, newpath)

Rename a path using os.rename(), but raise an exception if the target path already exists. Note: slightly racy, since the target may be created between the existence check and the rename.
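A check-then-rename of this kind might be sketched as follows. The names saferename_sketch and trysaferename_sketch are hypothetical, and raising FileExistsError is an assumption about which exception is used.

```python
import errno
import os
import tempfile


def saferename_sketch(oldpath, newpath):
    """Rename oldpath to newpath unless newpath already exists."""
    # There is a window between this check and os.rename().
    if os.path.lexists(newpath):
        raise FileExistsError(errno.EEXIST, os.strerror(errno.EEXIST), newpath)
    os.rename(oldpath, newpath)


def trysaferename_sketch(oldpath, newpath):
    """Like saferename_sketch, but report success as a boolean."""
    try:
        saferename_sketch(oldpath, newpath)
    except OSError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    a, b, c = (os.path.join(tmp, name) for name in "abc")
    for p in (a, b):
        open(p, "w").close()
    blocked = trysaferename_sketch(a, b)  # b exists, so this is refused
    moved = trysaferename_sketch(a, c)    # c is free, so this succeeds
```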

Function seekable(fp)

Try to test if a filelike object is seekable.

First try the .seekable method from IOBase, otherwise try getting a file descriptor from fp.fileno and stat()ing that, otherwise return False.
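The fallback chain above can be sketched like this. The name seekable_sketch is hypothetical, and treating a regular file behind the descriptor as seekable is an assumption about what the stat() check looks for.

```python
import os
import stat
from io import BytesIO


def seekable_sketch(fp):
    """Best-effort test of whether fp supports seeking."""
    try:
        test = fp.seekable
    except AttributeError:
        try:
            getfd = fp.fileno
        except AttributeError:
            return False
        # Assumption: a regular file behind the descriptor counts as seekable.
        return stat.S_ISREG(os.fstat(getfd()).st_mode)
    return test()


in_memory = seekable_sketch(BytesIO(b"data"))
plain_object = seekable_sketch(object())
```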

Function shortpath(path, environ=None, prefixes=None)

Return path with the first matching leading prefix replaced.

Parameters:

  • environ: environment mapping if not os.environ
  • prefixes: iterable of (prefix, subst) to consider for replacement; each prefix is subject to environment variable substitution before consideration. The default considers "$HOME/" for replacement by "~/".
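The prefix replacement can be sketched as below. The name shortpath_sketch is hypothetical, and the substitution handles only simple $NAME references, which is an assumption about the supported syntax.

```python
import os
import re


def shortpath_sketch(path, environ=None, prefixes=None):
    """Return path with the first matching leading prefix replaced."""
    if environ is None:
        environ = os.environ
    if prefixes is None:
        prefixes = (("$HOME/", "~/"),)
    for prefix, subst in prefixes:
        # Expand $NAME from environ; unknown names are left untouched.
        expanded = re.sub(
            r"\$(\w+)",
            lambda m: environ.get(m.group(1), m.group(0)),
            prefix,
        )
        if path.startswith(expanded):
            return subst + path[len(expanded):]
    return path


env = {"HOME": "/home/me"}
short = shortpath_sketch("/home/me/src/x.py", environ=env)
untouched = shortpath_sketch("/etc/hosts", environ=env)
```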

Class Tee

An object with .write, .flush and .close methods which copies data to multiple output files.

Method Tee.__init__(self, *fps)

Initialise the Tee; any arguments are taken to be output file objects.
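A minimal sketch of such an object follows. The class name TeeSketch is hypothetical, and having close() close every underlying file is an assumption.

```python
from io import StringIO


class TeeSketch:
    """Copy every write to each of several output files."""

    def __init__(self, *fps):
        self._fps = list(fps)

    def write(self, data):
        for fp in self._fps:
            fp.write(data)

    def flush(self):
        for fp in self._fps:
            fp.flush()

    def close(self):
        # Assumption: closing the tee closes the underlying files too.
        for fp in self._fps:
            fp.close()


out1, out2 = StringIO(), StringIO()
both = TeeSketch(out1, out2)
both.write("hello\n")
both.flush()
```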

Function tee(fp, fp2)

Context manager duplicating .write and .flush from fp to fp2.

Function tmpdir()

Return the pathname of the default temporary directory for scratch data, $TMPDIR or '/tmp'.

Function tmpdirn(tmp=None)

Make a new temporary directory with a numeric suffix.

Function trysaferename(oldpath, newpath)

A saferename() that returns True on success, False on failure.

Source distribution: cs.fileutils-20190617.tar.gz (21.4 kB)
