Flexible recursive directory iterator: scandir meets glob("**", recursive=True)
scantree
Recursive directory iterator supporting:
- flexible filtering, including wildcard path matching
- in-memory representation of the file tree (for repeated access)
- efficient access to directory entry properties (os.DirEntry interface), extended with the real path and the path relative to the recursion root directory
- detection and handling of cyclic symlinks
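The cyclic symlink case in the list above can be illustrated with the standard library alone. The following is a minimal sketch, not scantree's implementation: the helper name `find_cyclic_links` is hypothetical, and the approach (tracking the real paths of the directories on the current recursion branch) is an assumption about one reasonable way to detect such links.

```python
import os
import tempfile


def find_cyclic_links(root):
    """Return (link_path, target_real_path) pairs for directory entries
    whose real path is one of their own ancestors (hypothetical helper)."""
    cyclic = []

    def walk(path, ancestors):
        real = os.path.realpath(path)
        if real in ancestors:
            # Descending here would revisit a directory that is already on
            # the current branch, so record the link and stop.
            cyclic.append((path, real))
            return
        with os.scandir(path) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_dir(follow_symlinks=True):
                    walk(entry.path, ancestors | {real})

    walk(root, frozenset())
    return cyclic


# Demo on a throwaway tree: root/d1/loop is a symlink back to root.
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(os.path.realpath(tmp), "root")
    os.makedirs(os.path.join(root, "d1"))
    os.symlink(root, os.path.join(root, "d1", "loop"))
    cyclic = find_cyclic_links(root)

print(cyclic)
```

A symlink that points to a directory outside its own ancestry is followed as usual; only links that would lead back into the current branch are reported.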
Installation
pip install scantree
Usage
See the source code for full documentation; some generic examples follow.
Get matching file paths:
from scantree import scantree, RecursionFilter
tree = scantree('/path/to/dir', RecursionFilter(match=['*.txt']))
print([path.relative for path in tree.filepaths()])
print([path.real for path in tree.filepaths()])
['d1/d2/file3.txt', 'd1/file2.txt', 'file1.txt']
['/path/to/other_dir/file3.txt', '/path/to/dir/d1/file2.txt', '/path/to/dir/file1.txt']
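In the output above, `d1/d2` is a symlink to `/path/to/other_dir`, which is why the relative and real paths of `file3.txt` differ. The distinction can be reproduced with the standard library. This is only a sketch: `relative_and_real` is a hypothetical helper, it matches on file names with `fnmatch` rather than using scantree's own matching, and unlike scantree it does not guard against cyclic links.

```python
import fnmatch
import os
import tempfile


def relative_and_real(root, pattern):
    """Yield (relative, real) path pairs for files whose name matches
    the wildcard pattern (hypothetical helper, stdlib only)."""
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                full = os.path.join(dirpath, name)
                # relative: as seen from the recursion root;
                # real: with all symlinks resolved.
                yield os.path.relpath(full, root), os.path.realpath(full)


# Demo: dir/d1/d2 is a symlink to other_dir, which holds file3.txt.
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.realpath(tmp)
    os.makedirs(os.path.join(base, "dir", "d1"))
    os.makedirs(os.path.join(base, "other_dir"))
    open(os.path.join(base, "other_dir", "file3.txt"), "w").close()
    os.symlink(
        os.path.join(base, "other_dir"),
        os.path.join(base, "dir", "d1", "d2"),
    )
    pairs = dict(relative_and_real(os.path.join(base, "dir"), "*.txt"))

print(pairs)
```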
Access metadata of directory entries in the file tree:
d2 = tree.directories[0].directories[0]
print(type(d2))
print(d2.path.absolute)
print(d2.path.real)
print(d2.path.is_symlink())
print(d2.files[0].relative)
scantree._node.DirNode
/path/to/dir/d1/d2
/path/to/other_dir
True
d1/d2/file3.txt
Aggregate information by operating on the tree:
hello_count = tree.apply(
    file_apply=lambda path: sum([
        w.lower() == 'hello' for w in
        path.as_pathlib().read_text().split()
    ]),
    dir_apply=lambda dir_: sum(dir_.entries),
)
print(hello_count)
3
hello_count_tree = tree.apply(
    file_apply=lambda path: {
        'name': path.name,
        'count': sum([
            w.lower() == 'hello'
            for w in path.as_pathlib().read_text().split()
        ])
    },
    dir_apply=lambda dir_: {
        'name': dir_.path.name,
        'count': sum(e['count'] for e in dir_.entries),
        'sub_counts': [e for e in dir_.entries]
    },
)
from pprint import pprint
pprint(hello_count_tree)
{'count': 3,
 'name': 'dir',
 'sub_counts': [{'count': 2, 'name': 'file1.txt'},
                {'count': 1,
                 'name': 'd1',
                 'sub_counts': [{'count': 1, 'name': 'file2.txt'},
                                {'count': 0,
                                 'name': 'd2',
                                 'sub_counts': [{'count': 0,
                                                 'name': 'file3.txt'}]}]}]}
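The `tree.apply` examples above amount to a bottom-up fold: each file is mapped to a value, and each directory combines the values of its entries. A rough standard-library sketch of that idea follows. It is not how scantree implements `apply`; `fold_tree` and `count_hello` are hypothetical names, the directory callback receives a plain list instead of a node with `.entries`, and symlinked directories are skipped for simplicity.

```python
import os
import tempfile


def fold_tree(path, file_apply, dir_apply):
    """Apply file_apply to every file, then dir_apply to the list of child
    results of every directory, working bottom-up."""
    results = []
    with os.scandir(path) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir(follow_symlinks=False):
                results.append(fold_tree(entry.path, file_apply, dir_apply))
            elif entry.is_file():
                results.append(file_apply(entry.path))
    return dir_apply(results)


def count_hello(file_path):
    """Count case-insensitive occurrences of the word 'hello' in a file."""
    with open(file_path, encoding="utf-8") as fh:
        return sum(word.lower() == "hello" for word in fh.read().split())


# Demo on a throwaway tree with two files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "d1"))
    with open(os.path.join(root, "file1.txt"), "w", encoding="utf-8") as fh:
        fh.write("Hello hello world")
    with open(os.path.join(root, "d1", "file2.txt"), "w", encoding="utf-8") as fh:
        fh.write("say HELLO")
    hello_count = fold_tree(root, count_hello, sum)

print(hello_count)  # 3
```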
Flexible filtering:
without_hidden_files = scantree('.', RecursionFilter(match=['*', '!.*']))
without_palindrome_linked_dirs = scantree(
    '.',
    lambda paths: [
        p for p in paths if not (
            p.is_dir() and
            p.is_symlink() and
            p.name == p.name[::-1]
        )
    ]
)
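The `match=['*', '!.*']` filter above combines an include pattern with a `!`-prefixed exclude pattern. The sketch below approximates that idea on bare names with `fnmatch`. It is an assumption-laden illustration: `keep` is a hypothetical helper, and scantree's actual matching works on paths and may treat pattern order and path separators differently.

```python
from fnmatch import fnmatchcase


def keep(name, patterns):
    """Keep a name if it matches at least one include pattern and no
    exclude pattern (patterns starting with '!')."""
    include = [p for p in patterns if not p.startswith("!")]
    exclude = [p[1:] for p in patterns if p.startswith("!")]
    return (
        any(fnmatchcase(name, p) for p in include)
        and not any(fnmatchcase(name, p) for p in exclude)
    )


names = [".git", ".hidden.txt", "README.md", "notes.txt"]
visible = [n for n in names if keep(n, ["*", "!.*"])]
print(visible)  # ['README.md', 'notes.txt']
```

Note that `fnmatch` lets `*` match names with a leading dot, so the explicit `!.*` pattern is what removes the hidden entries here.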
Comparison:
tree = scantree('path/to/dir')
# perform some operations on the filesystem, then check that the file tree is unchanged:
assert tree == scantree('path/to/dir')
# the tree contains absolute/real path info, so a copy elsewhere compares unequal:
import shutil
shutil.copytree('path/to/dir', 'path/to/other_dir')
new_tree = scantree('path/to/other_dir')
assert tree != new_tree
# the relative paths still agree:
assert (
    [p.relative for p in tree.leafpaths()] ==
    [p.relative for p in new_tree.leafpaths()]
)
Inspect symlinks:
from scantree import CyclicLinkedDir
file_links = []
dir_links = []
cyclic_links = []
def file_apply(path):
    if path.is_symlink():
        file_links.append(path)
def dir_apply(dir_node):
    if dir_node.path.is_symlink():
        dir_links.append(dir_node.path)
    if isinstance(dir_node, CyclicLinkedDir):
        cyclic_links.append((dir_node.path, dir_node.target_path))
scantree('.', file_apply=file_apply, dir_apply=dir_apply)
File details
Details for the file scantree-0.0.4.tar.gz.
File metadata
- Download URL: scantree-0.0.4.tar.gz
- Upload date:
- Size: 24.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 15bd5cb24483b04db2c70653604e8ea3522e98087db7e38ab8482f053984c0ac |
| MD5 | ab01b6a5f7fa8d372e55315f6a5e3973 |
| BLAKE2b-256 | b3e440998faefc72ba1ddeb640a44fba92935353525dba110488806da8339c0b |
File details
Details for the file scantree-0.0.4-py3-none-any.whl.
File metadata
- Download URL: scantree-0.0.4-py3-none-any.whl
- Upload date:
- Size: 20.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7616ab65aa6b7f16fcf8e6fa1d9afaa99a27ab72bba05c61b691853b96763174 |
| MD5 | e2911ec7e92aa6b274eff65c0417c8ee |
| BLAKE2b-256 | 93ce828467ddfa0d2fe473673026442d2032d552a168e42cfbf25fd0e5264e0c |