Assorted filesystem related utility functions, some of which have been bloating cs.fileutils for too long.

Project description

Latest release 20260610: HasFSPath: make such objects PathLike with an __fspath__ method returning self.fspath.

Short summary:

  • atomic_directory: Run code in a temporary directory, which will be renamed to the target directory if no exception occurs.
  • findup: Walk up the filesystem tree looking for a directory where criterion(fspath) is not None, where fspath starts at dirpath. Return the result of criterion(fspath). Return None if no such path is found.
  • fnmatchdir: Return a list of the names in dirpath matching the glob fnglob.
  • FSPathBasedSingleton: The basis for a SingletonMixin based on realpath(self.fspath).
  • HasFSPath: A mixin for an object with a .fspath attribute representing a filesystem location.
  • is_valid_rpath: Test that rpath is a clean relative path with no funny business.
  • longpath: Return path with prefixes and environment variables substituted. The converse of shortpath().
  • needdir: Create the directory dirpath if missing. Return True if the directory was made, False otherwise.
  • RemotePath: A representation of a remote filesystem path (local if host is None).
  • remove_protecting: Remove the file at rmpath while protecting safepath from destruction.
  • rpaths: A shim for scandirtree to yield relative file paths from a directory.
  • scandirpaths: A shim for scandirtree to yield filesystem paths from a directory.
  • scandirtree: Generator to recurse over dirpath, yielding (is_dir,subpath) for all selected subpaths.
  • shortpath: Return fspath with the first matching leading prefix replaced.
  • update_linkdir: Update linkdirpath with symlinks to paths. Remove unused names if trim. Return a mapping of names in linkdirpath to absolute forms of paths.
  • validate_rpath: Test that rpath is a clean relative path with no funny business; raise ValueError if the test fails.

Module contents:

  • atomic_directory(dirpath_or_func: Union[str, Callable], *, make_placeholder=False): Run code in a temporary directory, which will be renamed to the target directory if no exception occurs.

    Parameters:

    • make_placeholder: optional flag, default False: if true, an empty directory will be made at the target name; after completion it will be removed and the completed directory renamed to the target name

    This may be used as a context manager or as a decorator.

    As a context manager:

    with atomic_directory(target_directory) as tmpdirpath:
        ...  # do work inside tmpdirpath
    

    This will rename the tmpdirpath to target_directory on exit.

    As a decorator:

    @atomic_directory
    def produce_dir_content(tmpdirpath, *args, **kwargs):
        ...  # populate tmpdirpath
    

    This produces a function which will accept a target directory path as its first argument and calls produce_dir_content with the temporary directory. On return the temporary directory will be renamed to the target directory.
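
The context manager behaviour described above can be pictured with the standard library alone. This is a minimal sketch and not the cs.fs implementation: the name atomic_directory_sketch is hypothetical, and it assumes the target does not exist yet and that the scratch directory is created beside the target so that os.rename stays within one filesystem.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_directory_sketch(target_dirpath):
    """Yield a scratch directory; rename it to target_dirpath on success.

    Hypothetical stand-in for illustration, not cs.fs.atomic_directory.
    """
    parent = os.path.dirname(os.path.abspath(target_dirpath))
    # Create the scratch directory beside the target so that the final
    # rename does not cross a filesystem boundary.
    tmpdirpath = tempfile.mkdtemp(prefix='.tmp-', dir=parent)
    try:
        yield tmpdirpath
    except BaseException:
        # The body failed: discard the partial work, create no target.
        shutil.rmtree(tmpdirpath, ignore_errors=True)
        raise
    else:
        # The body succeeded: publish the directory under its real name.
        os.rename(tmpdirpath, target_dirpath)
```

Because the target name only appears at the final rename, other processes never observe a half-populated directory.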

  • findup(dirpath: str, criterion: Union[str, Callable[[str], Any]]) -> str: Walk up the filesystem tree looking for a directory where criterion(fspath) is not None, where fspath starts at dirpath. Return the result of criterion(fspath). Return None if no such path is found.

    Parameters:

    • dirpath: the starting directory
    • criterion: a str or a callable accepting a str

    If criterion is a str, look for the existence of os.path.join(fspath,criterion).

    Example:

    # find a directory containing a `.envrc` file
    envrc_path = findup('.', '.envrc')
    
    # find a Tagger rules file for the Downloads directory
    rules_path = findup(expanduser('~/Downloads'), '.taggerrc')
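
The upward walk itself is short enough to sketch with os.path. The function below is a hypothetical findup_sketch, not the cs.fs code. It assumes that a str criterion yields the joined path when that path exists, which the variable names in the example suggest but the text does not state.

```python
from os.path import abspath, dirname, exists, join


def findup_sketch(dirpath, criterion):
    """Walk up from dirpath; return the first non-None criterion(fspath).

    Hypothetical stand-in for illustration, not cs.fs.findup.
    """
    if isinstance(criterion, str):
        basename = criterion

        # Assumption: a str criterion means "this name exists here",
        # and the match result is the joined path.
        def test(fspath):
            path = join(fspath, basename)
            return path if exists(path) else None
    else:
        test = criterion
    fspath = abspath(dirpath)
    while True:
        result = test(fspath)
        if result is not None:
            return result
        parent = dirname(fspath)
        if parent == fspath:
            # Reached the filesystem root without a match.
            return None
        fspath = parent
```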
    
  • fnmatchdir(dirpath, fnglob): Return a list of the names in dirpath matching the glob fnglob.
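
The release log notes that fnmatchdir is built on fnmatch.filter. A minimal equivalent, under the assumption that it simply filters os.listdir, looks like this (fnmatchdir_sketch is a hypothetical name):

```python
import fnmatch
import os


def fnmatchdir_sketch(dirpath, fnglob):
    """Return the names in dirpath which match the glob fnglob.

    Hypothetical stand-in for illustration, not cs.fs.fnmatchdir.
    """
    # os.listdir returns bare names, so the glob is matched against
    # names only, never against full paths.
    return fnmatch.filter(os.listdir(dirpath), fnglob)
```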

  • class FSPathBasedSingleton(cs.obj.SingletonMixin, HasFSPath, cs.deco.Promotable): The basis for a SingletonMixin based on realpath(self.fspath).

FSPathBasedSingleton.__init__(self, fspath: Optional[str] = None, lock=None): Initialise the singleton:

On the first call:

  • set .fspath to self._resolve_fspath(fspath)
  • set ._lock to lock (or cs.threads.NRLock() if not specified)

FSPathBasedSingleton.fspath_normalised(fspath: str): Return the normalised form of the filesystem path fspath, used as the key for the singleton registry.

This default returns realpath(fspath).

As a contrasting example, the cs.ebooks.kindle.classic.KindleTree class tries to locate the directory containing the book database, and returns its realpath, allowing some imprecision.

FSPathBasedSingleton.promote(obj): Promote None or str to an instance of the class.

  • class HasFSPath(os.PathLike): A mixin for an object with a .fspath attribute representing a filesystem location.

    The __init__ method just sets the .fspath attribute, and need not be called if the main class takes care of that itself.

HasFSPath.__init__(self, fspath): Save fspath as .fspath; often done by the parent class.

HasFSPath.__fspath__(self): Return the filesystem path string, for os.PathLike.

HasFSPath.fnmatch(self, fnglob): Return a list of the names in self.fspath matching the glob fnglob.

HasFSPath.listdir(self): Return os.listdir(self.fspath).

HasFSPath.pathto(self, *subpaths): The full path to subpaths, comprising a relative path below self.fspath. This is a shim for os.path.join which requires that all the subpaths be relative paths.

HasFSPath.shortpath: The short version of self.fspath.
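
To show what the os.PathLike support buys, here is a small self-contained mixin in the same spirit. HasFSPathSketch is a hypothetical class, not the cs.fs one, and raising ValueError from pathto for an absolute subpath is an assumption about how the "relative paths only" requirement might be enforced.

```python
import os


class HasFSPathSketch(os.PathLike):
    """A minimal object with a .fspath attribute usable as a path.

    Hypothetical stand-in for illustration, not cs.fs.HasFSPath.
    """

    def __init__(self, fspath):
        self.fspath = fspath

    def __fspath__(self):
        # This is what os.fspath(), open(), os.path.join() etc consult.
        return self.fspath

    def pathto(self, *subpaths):
        """Join relative subpaths below self.fspath."""
        for subpath in subpaths:
            if os.path.isabs(subpath):
                # Assumption: reject absolute subpaths with ValueError.
                raise ValueError(f'subpath is not relative: {subpath!r}')
        return os.path.join(self.fspath, *subpaths)
```

With __fspath__ in place, an instance can be passed directly to functions such as os.listdir or open without first extracting .fspath.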

  • is_valid_rpath(rpath, log=None) -> bool: Test that rpath is a clean relative path with no funny business.

    This is a Boolean wrapper for validate_rpath().

  • longpath(path, prefixes=None): Return path with prefixes and environment variables substituted. The converse of shortpath().

  • needdir(dirpath, mode=511, *, use_makedirs=False, log=None) -> bool: Create the directory dirpath if missing. Return True if the directory was made, False otherwise.

    Parameters:

    • dirpath: the required directory path
    • mode: the permissions mode, default 0o777
    • log: optional; used to log the makedirs or mkdir call
    • use_makedirs: optional creation mode, default False; if true, use os.makedirs, otherwise os.mkdir
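
The return value convention is easy to see in a sketch. The function needdir_sketch below is hypothetical and omits the log parameter; it assumes "missing" is decided with os.path.isdir.

```python
import os


def needdir_sketch(dirpath, mode=0o777, *, use_makedirs=False):
    """Create dirpath if missing; return True only if it was made.

    Hypothetical stand-in for illustration, not cs.fs.needdir.
    """
    if os.path.isdir(dirpath):
        return False
    if use_makedirs:
        # Also creates any missing intermediate directories.
        os.makedirs(dirpath, mode)
    else:
        # Requires the parent directory to exist already.
        os.mkdir(dirpath, mode)
    return True
```

Note that the default mode prints as 511 in the signature above because that is the decimal form of 0o777.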
  • class RemotePath(RemotePath, HasFSPath, cs.deco.Promotable): A representation of a remote filesystem path (local if host is None).

    This is useful for things like rsync targets.

RemotePath.__init__(self, host, fspath): Dummy init, since namedtuple does not have one.

RemotePath.__str__(self): Return the string form of this path.

RemotePath.from_str(pathspec: str): Produce a RemotePath from pathspec, a path with an optional leading [user@]rhost: prefix.

RemotePath.from_tuple(cls, host_fspath: tuple): Produce a RemotePath from host_fspath, a (host,fspath) 2-tuple.

RemotePath.str(host, fspath): Return the string form of a remote path.
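
A namedtuple subclass along these lines can be sketched as follows. RemotePathSketch is hypothetical, and the parsing rule is an assumption: a prefix counts as a host only when a ':' appears before any '/', which is the convention rsync-style tools commonly use.

```python
from collections import namedtuple


class RemotePathSketch(namedtuple('RemotePathSketch', 'host fspath')):
    """A (host, fspath) pair; the path is local if host is None.

    Hypothetical stand-in for illustration, not cs.fs.RemotePath.
    """

    def __str__(self):
        if self.host is None:
            return self.fspath
        return f'{self.host}:{self.fspath}'

    @classmethod
    def from_str(cls, pathspec):
        """Parse an optional leading [user@]rhost: prefix."""
        prefix, sep, rest = pathspec.partition(':')
        # Assumption: a colon after a slash belongs to the path itself.
        if sep and prefix and '/' not in prefix:
            return cls(prefix, rest)
        return cls(None, pathspec)
```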

  • remove_protecting(rmpath, safepath): Remove the file at rmpath while protecting safepath from destruction.

    This is for situations such as "merging" two equivalent files where the "source" file (rmpath) is to be removed, leaving the destination file (safepath). It can be that these are the same file (not merely links to the same file, but the same link/name); this is surprisingly hard to detect, and removing the source will then destroy the destination.

    Instead of checking carefully and unreliably, we make a "safe" hard link of the destination, remove the source, then try to link the safe link back to the destination. If that succeeds, we have recovered from the destruction. If that fails with FileExistsError then the destruction did not occur. Both are good. Other exceptions are reraised with an accompanying note about the path to the "safe" link.

    Example use:

    if srcpath != dstpath and same_content(srcpath, dstpath):
        remove_protecting(srcpath, dstpath)
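
The link, remove, relink sequence described above can be sketched with os.link. This is an illustration under assumptions, not the cs.fs code: remove_protecting_sketch and the '.safe-' name prefix are hypothetical, the filesystem must support hard links, and the note attached to other exceptions is reduced to a comment.

```python
import os
import uuid


def remove_protecting_sketch(rmpath, safepath):
    """Remove rmpath while keeping safepath, even if they are one file.

    Hypothetical stand-in for illustration, not cs.fs.remove_protecting.
    """
    safedir = os.path.dirname(os.path.abspath(safepath))
    # A fresh name beside safepath so the hard link stays on one filesystem.
    linkpath = os.path.join(safedir, f'.safe-{uuid.uuid4().hex}')
    os.link(safepath, linkpath)
    try:
        os.remove(rmpath)
    except OSError:
        # Nothing was destroyed, so the safe link is not needed.
        os.remove(linkpath)
        raise
    try:
        # If safepath vanished with rmpath, this restores it.
        os.link(linkpath, safepath)
    except FileExistsError:
        # safepath survived: rmpath was a different name.
        pass
    # Any other exception propagates and leaves linkpath as the rescue copy.
    os.remove(linkpath)
```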
    
  • rpaths(dirpath='.', **scan_kw): A shim for scandirtree to yield relative file paths from a directory.

    Parameters:

    • dirpath: optional top directory, default '.'

    Other keyword arguments are passed to scandirtree.

  • scandirpaths(dirpath='.', **scan_kw): A shim for scandirtree to yield filesystem paths from a directory.

    Parameters:

    • dirpath: optional top directory, default '.'

    Other keyword arguments are passed to scandirtree.

  • scandirtree(dirpath='.', *, include_dirs=False, name_selector=None, only_suffixes=None, skip_suffixes=None, sort_names=False, follow_symlinks=False, recurse=True): Generator to recurse over dirpath, yielding (is_dir,subpath) for all selected subpaths.

    Parameters:

    • dirpath: the directory to scan, default '.'
    • include_dirs: if true yield directories; default False
    • name_selector: optional callable to select particular names; the default is to select names not starting with a dot ('.')
    • only_suffixes: if supplied, skip entries whose extension is not in only_suffixes
    • skip_suffixes: if supplied, skip entries whose extension is in skip_suffixes
    • sort_names: optional flag, default False; yield entries in lexical order if true
    • follow_symlinks: optional flag, default False; passed to scandir
    • recurse: optional flag, default True; if true, recurse into subdirectories
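
For orientation, a reduced version of such a generator can be written directly on os.scandir. The sketch below is hypothetical (scandirtree_sketch), supports only a subset of the parameters, and assumes that the default dot-name exclusion applies to directories as well as files.

```python
import os


def scandirtree_sketch(
    dirpath='.', *, include_dirs=False, sort_names=False, recurse=True
):
    """Yield (is_dir, subpath) for entries below dirpath.

    Hypothetical stand-in for illustration, not cs.fs.scandirtree.
    """
    pending = [dirpath]
    while pending:
        path = pending.pop()
        with os.scandir(path) as scan:
            entries = list(scan)
        if sort_names:
            entries.sort(key=lambda entry: entry.name)
        for entry in entries:
            # Assumption: skip dot names, mirroring the default selector.
            if entry.name.startswith('.'):
                continue
            if entry.is_dir(follow_symlinks=False):
                if include_dirs:
                    yield True, entry.path
                if recurse:
                    pending.append(entry.path)
            else:
                yield False, entry.path
```

The rpaths and scandirpaths shims can then be thought of as thin filters over this stream of (is_dir, subpath) pairs.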
  • shortpath(fspath, prefixes=None, *, collapseuser=False, foldsymlinks=False): Return fspath with the first matching leading prefix replaced.

    Parameters:

    • prefixes: optional list of (prefix,subst) pairs
    • collapseuser: optional flag to enable detection of user home directory paths; default False
    • foldsymlinks: optional flag to enable detection of convenience symlinks which point deeper into the path; default False

    The prefixes parameter is an optional iterable of (prefix,subst) pairs to consider for replacement. Each prefix is subject to environment variable substitution before consideration. The default prefixes value is SHORTPATH_PREFIXES_DEFAULT: (('$HOME/', '~/'),).
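
The prefix replacement can be illustrated in a few lines. This hypothetical shortpath_sketch covers only the prefixes behaviour (no collapseuser or foldsymlinks) and assumes os.path.expandvars performs the environment substitution, as the release log indicates.

```python
import os


def shortpath_sketch(fspath, prefixes=(('$HOME/', '~/'),)):
    """Return fspath with the first matching leading prefix replaced.

    Hypothetical stand-in for illustration, not cs.fs.shortpath.
    """
    for prefix, subst in prefixes:
        # Expand $VAR references in the prefix before comparing.
        expanded = os.path.expandvars(prefix)
        if fspath.startswith(expanded):
            return subst + fspath[len(expanded):]
    # No prefix matched: return the path unchanged.
    return fspath
```

Because only the first matching pair is applied, the order of the prefixes matters when one prefix is a leading part of another.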

  • update_linkdir(linkdirpath: str, paths: Iterable[str], trim=False): Update linkdirpath with symlinks to paths. Remove unused names if trim. Return a mapping of names in linkdirpath to absolute forms of paths.

    My example use is maintaining a small directory of wallpapers to shuffle, selected from a reference image tree.

  • validate_rpath(rpath: str): Test that rpath is a clean relative path with no funny business; raise ValueError if the test fails.

    Tests:

    • not empty or '.' or '..'
    • not an absolute path
    • normalised
    • does not walk up out of its parent directory

    Examples:

    >>> is_valid_rpath('')
    False
    >>> is_valid_rpath('.')
    False
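
The listed tests translate fairly directly into os.path calls. The pair below is a sketch with hypothetical names (validate_rpath_sketch, is_valid_rpath_sketch). The exception messages and the use of os.path.normpath for the "normalised" test are assumptions.

```python
import os


def validate_rpath_sketch(rpath):
    """Raise ValueError unless rpath is a clean relative path.

    Hypothetical stand-in for illustration, not cs.fs.validate_rpath.
    """
    if rpath in ('', '.', '..'):
        raise ValueError(f'may not be empty, "." or "..": {rpath!r}')
    if os.path.isabs(rpath):
        raise ValueError(f'may not be an absolute path: {rpath!r}')
    normalised = os.path.normpath(rpath)
    if rpath != normalised:
        raise ValueError(f'not normalised: {rpath!r} vs {normalised!r}')
    if rpath.startswith('..' + os.sep):
        raise ValueError(f'walks up out of its parent: {rpath!r}')


def is_valid_rpath_sketch(rpath):
    """Boolean wrapper: True if validate_rpath_sketch accepts rpath."""
    try:
        validate_rpath_sketch(rpath)
    except ValueError:
        return False
    return True
```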
    

Release Log

Release 20260610: HasFSPath: make such objects PathLike with an __fspath__ method returning self.fspath.

Release 20260526: atomic_directory: should now be usable as a decorator and also as a contextmanager.

Release 20250728:

  • remove_protecting: truncate the portion of the basename in the temp file prefix, which could otherwise exceed the filesystem name length limit.
  • shortpath: cope if there is no pwd.getpwuid function (nonUNIX).

Release 20250528: New remove_protecting(rmpath,safepath): remove rmpath while protecting safepath from destruction.

Release 20250414:

  • HasFSPath: provide an __lt__ method which compares the .fspath attributes to facilitate sorting.
  • New RemotePath(host,fspath) namedtuple subclass for [host:]fspath path specifications.

Release 20250325: New update_linkdir() imported from my wpr script.

Release 20241122:

  • FSPathBasedSingleton: add a .promote method to promote a filesystem path to an instance.
  • FSPathBasedSingleton: new fspath_normalised(fspath) class method to produce a normalised form of the fspath for use as the key to the singleton registry.

Release 20241007: FSPathBasedSingleton.__init__: use an NRLock for the default lock, using a late import with fallback to Lock.

Release 20241005: needdir: now returns True if the directory was made, False if it already existed.

Release 20240630: FSPathBasedSingleton: recent Pythons seem to check that __init__ returns None, subclasses must test another way.

Release 20240623:

  • shortpath(foldsymlinks=True): only examine symlinks which have clean subpaths in their link text - this avoids junk and also avoids stat()ing links which might be symlinks to mount points which might be offline.
  • scandirtree: clean up the logic, possibly fix repeated mention of directories.

Release 20240522: shortpath: new collapseuser=False, foldsymlinks=False parameters, rename DEFAULT_SHORTEN_PREFIXES to SHORTPATH_PREFIXES_DEFAULT.

Release 20240422: New scandirtree scandir based version of os.walk, yielding (is_dir,fspath). New shim scandirpaths.

Release 20240412: HasFSPath: explain that the __init__ is optional in the docstring.

Release 20240316: Fixed release upload artifacts.

Release 20240201:

  • FSPathBasedSingleton: drop the default_factory parameter/attribute, let default_attr specify a callable.
  • Singleton._resolve_fspath: fix reference to class name.

Release 20231129:

  • HasFSPath: new listdir method.
  • HasFSPath.pathto: accept multiple relative subpaths.
  • FSPathBasedSingleton: accept cls.FSPATH_FACTORY as a factory function for the default fspath, makes it possible to defer the path lookup.
  • Replace is_clean_subpath with validate_rpath/is_valid_rpath pair.

Release 20230806:

  • Reimplement fnmatchdir using fnmatch.filter.
  • No longer claim Python 2 compatibility.

Release 20230401: HasFSPath.shortpath: handle a call before .fspath is set.

Release 20221221: Replace use of cs.env.envsub with os.path.expandvars and drop unused environ parameter.

Release 20220918:

  • FSPathBasedSingleton.__init__: return True on the first call, False on subsequent calls.
  • FSPathBasedSingleton.__init__: probe __dict__ for '_lock' instead of using hasattr (which plays poorly this early on with classes with their own __getattr__).
  • needdir: accept optional log parameter to log mkdir or makedirs.
  • HasFSPath: add a default __str__.

Release 20220805: Doc update.

Release 20220530: FSPathBasedSingleton._resolve_fspath: new envvar and default_attr parameters.

Release 20220429:

  • New HasFSPath and FSPathBasedSingleton.
  • Add longpath and shortpath from cs.fileutils.
  • New is_clean_subpath(subpath).
  • New needdir(path).
  • New fnmatchdir(dirpath,fnglob) pulled out from HasFSPath.fnmatch(fnglob).

Release 20220327: New module cs.fs to contain more filesystem focussed functions than cs.fileutils, which is feeling a bit bloated.
