Flexible recursive directory iterator: scandir meets glob("**", recursive=True)
Project description
scantree
Recursive directory iterator supporting:
- flexible filtering including wildcard path matching
- in-memory representation of the file tree (for repeated access)
- efficient access to directory entry properties (posix.DirEntry interface), extended with the real path and the path relative to the recursion root directory
- detection and handling of cyclic symlinks
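To make the idea of a filtered recursive scan concrete, here is a minimal standard-library sketch. It is not scantree's implementation: the helper names `iter_files` and `demo` are hypothetical, and the sketch deliberately skips symlinked directories rather than handling them. It shows only how `os.scandir` entries can be combined with wildcard matching while tracking the path relative to the root and the real path.

```python
import fnmatch
import os
import tempfile


def iter_files(root, pattern="*"):
    """Yield (relative_path, real_path) for files under root whose name matches pattern.

    Hypothetical helper for illustration; symlinked directories are not followed.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_dir(follow_symlinks=False):
                    # Real (non-symlinked) directory: descend into it later.
                    stack.append(entry.path)
                elif entry.is_file() and fnmatch.fnmatch(entry.name, pattern):
                    yield os.path.relpath(entry.path, root), os.path.realpath(entry.path)


def demo():
    """Build a small temporary tree and list the relative paths of its .txt files."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "d1"))
        for rel in ("file1.txt", "notes.md", os.path.join("d1", "file2.txt")):
            with open(os.path.join(root, rel), "w") as fh:
                fh.write("hello")
        return sorted(rel for rel, _ in iter_files(root, "*.txt"))


if __name__ == "__main__":
    print(demo())
```

Unlike this generator, scantree keeps the scanned tree in memory, so repeated access does not touch the filesystem again.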
Installation
pip install scantree
Usage
See the source code for full documentation; some generic examples follow.
Get matching file paths:
from scantree import scantree, RecursionFilter

tree = scantree('/path/to/dir', RecursionFilter(match=['*.txt']))
print([path.relative for path in tree.filepaths()])
print([path.real for path in tree.filepaths()])
['d1/d2/file3.txt', 'd1/file2.txt', 'file1.txt']
['/path/to/other_dir/file3.txt', '/path/to/dir/d1/file2.txt', '/path/to/dir/file1.txt']
Access metadata of directory entries in file tree:
d2 = tree.directories[0].directories[0]
print(type(d2))
print(d2.path.absolute)
print(d2.path.real)
print(d2.path.is_symlink())
print(d2.files[0].relative)
scantree._node.DirNode
/path/to/dir/d1/d2
/path/to/other_dir
True
d1/d2/file3.txt
Aggregate information by operating on tree:
hello_count = tree.apply(
    file_apply=lambda path: sum([
        w.lower() == 'hello' for w in
        path.as_pathlib().read_text().split()
    ]),
    dir_apply=lambda dir_: sum(dir_.entries),
)
print(hello_count)
3
hello_count_tree = tree.apply(
    file_apply=lambda path: {
        'name': path.name,
        'count': sum([
            w.lower() == 'hello'
            for w in path.as_pathlib().read_text().split()
        ])
    },
    dir_apply=lambda dir_: {
        'name': dir_.path.name,
        'count': sum(e['count'] for e in dir_.entries),
        'sub_counts': [e for e in dir_.entries]
    },
)
from pprint import pprint
pprint(hello_count_tree)
{'count': 3,
 'name': 'dir',
 'sub_counts': [{'count': 2, 'name': 'file1.txt'},
                {'count': 1,
                 'name': 'd1',
                 'sub_counts': [{'count': 1, 'name': 'file2.txt'},
                                {'count': 0,
                                 'name': 'd2',
                                 'sub_counts': [{'count': 0,
                                                 'name': 'file3.txt'}]}]}]}
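The aggregation above is a bottom-up fold: each file is mapped to a value, and each directory combines the values of its entries. The following standard-library sketch illustrates that pattern under stated assumptions. `apply_tree`, `count_hello`, and `demo` are hypothetical names, not part of scantree, and the sketch ignores symlinked directories.

```python
import os
import tempfile


def apply_tree(directory, file_apply, dir_apply):
    """Fold a directory bottom-up: map files first, then combine per directory."""
    results = []
    with os.scandir(directory) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir(follow_symlinks=False):
                # Recurse first so the directory sees its children's results.
                results.append(apply_tree(entry.path, file_apply, dir_apply))
            elif entry.is_file():
                results.append(file_apply(entry.path))
    return dir_apply(results)


def count_hello(path):
    """Count case-insensitive occurrences of the word 'hello' in a text file."""
    with open(path, encoding="utf-8") as fh:
        return sum(word.lower() == "hello" for word in fh.read().split())


def demo():
    """Create a tiny temporary tree and count 'hello' across all of its files."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "d1"))
        contents = {
            "file1.txt": "Hello hello world",
            os.path.join("d1", "file2.txt"): "hello there",
        }
        for rel, text in contents.items():
            with open(os.path.join(root, rel), "w", encoding="utf-8") as fh:
                fh.write(text)
        return apply_tree(root, count_hello, sum)


if __name__ == "__main__":
    print(demo())
```

Passing `sum` as the directory step mirrors `dir_apply=lambda dir_: sum(dir_.entries)` in the example above; swapping in a function that builds a dict would give the nested variant.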
Flexible filtering:
without_hidden_files = scantree('.', RecursionFilter(match=['*', '!.*']))

without_palindrome_linked_dirs = scantree(
    '.',
    lambda paths: [
        p for p in paths if not (
            p.is_dir() and
            p.is_symlink() and
            p.name == p.name[::-1]
        )
    ]
)
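One plausible reading of a match list such as `['*', '!.*']` is "include everything, then exclude names starting with a dot". The sketch below implements that assumed reading with the standard library's `fnmatch` on bare names. `match_names` is a hypothetical helper, and scantree's actual matching rules (for example, order sensitivity or full-path matching) may differ.

```python
import fnmatch


def match_names(names, patterns):
    """Keep names matching any plain pattern and no '!'-prefixed pattern.

    Assumed semantics for illustration only; not scantree's matcher.
    """
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    return [
        name
        for name in names
        if any(fnmatch.fnmatchcase(name, p) for p in includes)
        and not any(fnmatch.fnmatchcase(name, p) for p in excludes)
    ]


if __name__ == "__main__":
    print(match_names(["a.txt", ".hidden", "b.py"], ["*", "!.*"]))
```

Note that `fnmatch` does not treat a leading dot specially, so `'*'` alone would also match `.hidden`; the `'!.*'` entry is what removes it here.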
Comparison:
tree = scantree('path/to/dir')
# make some operations on filesystem, make sure file tree is the same:
assert tree == scantree('path/to/dir')

# tree contains absolute/real path info:
import shutil
shutil.copytree('path/to/dir', 'path/to/other_dir')
new_tree = scantree('path/to/other_dir')
assert tree != new_tree
assert (
    [p.relative for p in tree.leafpaths()] ==
    [p.relative for p in new_tree.leafpaths()]
)
Inspect symlinks:
from scantree import CyclicLinkedDir

file_links = []
dir_links = []
cyclic_links = []

def file_apply(path):
    if path.is_symlink():
        file_links.append(path)

def dir_apply(dir_node):
    if dir_node.path.is_symlink():
        dir_links.append(dir_node.path)
    if isinstance(dir_node, CyclicLinkedDir):
        cyclic_links.append((dir_node.path, dir_node.target_path))

scantree('.', file_apply=file_apply, dir_apply=dir_apply)
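For intuition about what a cyclic link is, the sketch below detects symlinked directories whose real target is one of the directories currently being descended through. It uses only the standard library and is an illustration under that definition, not scantree's detection logic. `find_cyclic_links` and `demo` are hypothetical names.

```python
import os
import tempfile


def find_cyclic_links(root):
    """Return (link_path, target_path) pairs for symlinked dirs pointing at an ancestor."""
    cyclic = []

    def walk(directory, ancestors):
        real = os.path.realpath(directory)
        on_path = ancestors | {real}  # real paths from the root down to this directory
        with os.scandir(directory) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if not entry.is_dir():  # follows symlinks; False for broken links
                    continue
                target = os.path.realpath(entry.path)
                if entry.is_symlink() and target in on_path:
                    # Following this link would re-enter a directory already on the path.
                    cyclic.append((entry.path, target))
                else:
                    walk(entry.path, on_path)

    walk(root, frozenset())
    return cyclic


def demo():
    """Create root/d1/loop -> root and report the detected cycle relative to root."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "d1"))
        os.symlink(root, os.path.join(root, "d1", "loop"), target_is_directory=True)
        found = find_cyclic_links(root)
        real_root = os.path.realpath(root)
        return [(os.path.relpath(link, root), target == real_root) for link, target in found]


if __name__ == "__main__":
    print(demo())
```

Recording the link together with its target, rather than raising, matches the shape of the `(dir_node.path, dir_node.target_path)` pairs collected in the example above.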