Multiprocess directory iteration via os.scandir() with progress indicator and resume function.
IterFilesystem
Multiprocess directory iteration via os.scandir():
“stats” processes:
one only counts all directories and files.
one accumulates the sizes of all files.
“worker” process:
walks the filesystem and performs the actual work on the directories/files.
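The split between the “stats” processes and the “worker” can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the package's actual implementation: the names `iter_entries`, `count_items`, `sum_file_sizes` and `scan` are hypothetical, and in this sketch the "worker" part simply runs in the parent process and collects file paths.

```python
import multiprocessing
import os


def iter_entries(top, skip_dirs=()):
    """Yield every os.DirEntry below `top`, skipping directories named in `skip_dirs`."""
    stack = [top]
    while stack:
        with os.scandir(stack.pop()) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    if entry.name in skip_dirs:
                        continue
                    stack.append(entry.path)
                yield entry


def count_items(top, skip_dirs=()):
    """First "stats" job: count all directories and files."""
    return sum(1 for _ in iter_entries(top, skip_dirs))


def sum_file_sizes(top, skip_dirs=()):
    """Second "stats" job: accumulate the sizes of all files."""
    return sum(
        entry.stat(follow_symlinks=False).st_size
        for entry in iter_entries(top, skip_dirs)
        if entry.is_file(follow_symlinks=False)
    )


def _run_stats_job(job, top, skip_dirs, results):
    # Runs inside a child process and reports back through the queue.
    results.put((job.__name__, job(top, skip_dirs)))


def scan(top, skip_dirs=()):
    """Start both stats jobs as processes, then do the "real" walk here."""
    results = multiprocessing.Queue()
    jobs = [
        multiprocessing.Process(
            target=_run_stats_job, args=(job, top, skip_dirs, results)
        )
        for job in (count_items, sum_file_sizes)
    ]
    for process in jobs:
        process.start()

    # Placeholder for the worker's action (assumption): collect the file paths.
    processed = [
        entry.path
        for entry in iter_entries(top, skip_dirs)
        if entry.is_file(follow_symlinks=False)
    ]

    stats = dict(results.get() for _ in jobs)  # drain the queue before joining
    for process in jobs:
        process.join()
    return stats, processed
```

When trying this out, call `scan(...)` from under an `if __name__ == "__main__":` guard so that it also works with the "spawn" start method. The totals from the stats jobs are what a progress display would use as its 100% mark while the slower worker is still running.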
Among others, the following packages are used:
tqdm (progress bar)
Requirements:
Python 3.6 or newer.
Pipenv (package and virtual environment manager).
Please: try, fork and contribute! ;)
Example
Use the example CLI, e.g.:
    ~$ git clone https://github.com/jedie/IterFilesystem.git
    ~$ cd IterFilesystem
    ~/IterFilesystem$ pipenv install
    ~/IterFilesystem$ pipenv shell
    (IterFilesystem) ~/IterFilesystem$ pip install -e .
    ...
    Successfully installed iterfilesystem
    (IterFilesystem) ~/IterFilesystem$ print_fs_stats --help
    usage: print_fs_stats.py [-h] [-v] [--path PATH]
                             [--skip_dirs [SKIP_DIRS [SKIP_DIRS ...]]]
                             [--skip_filenames [SKIP_FILENAMES [SKIP_FILENAMES ...]]]

    Scan filesystem and print some information

    optional arguments:
      -h, --help            show this help message and exit
      -v, --version         show program's version number and exit
      --path PATH           The file path that should be scanned e.g.: "~/foobar/"
                            default is "~"
      --skip_dirs [SKIP_DIRS [SKIP_DIRS ...]]
                            Directory names to exclude from scan.
      --skip_filenames [SKIP_FILENAMES [SKIP_FILENAMES ...]]
                            File names to ignore.
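For readers who want to build a similar CLI, the options in the help text above could be declared with argparse roughly like this. It is a sketch reconstructed from the help output only, not the project's source: the function name `build_parser`, the empty-list defaults for the skip options and the version string are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options shown in the help output."""
    parser = argparse.ArgumentParser(
        prog="print_fs_stats.py",
        description="Scan filesystem and print some information",
    )
    parser.add_argument(
        "-v", "--version",
        action="version",
        version="%(prog)s (version placeholder)",  # assumption: real value unknown
    )
    parser.add_argument(
        "--path",
        default="~",
        help='The file path that should be scanned e.g.: "~/foobar/" default is "~"',
    )
    # nargs="*" accepts zero or more names after the flag.
    parser.add_argument(
        "--skip_dirs", nargs="*", default=[],
        help="Directory names to exclude from scan.",
    )
    parser.add_argument(
        "--skip_filenames", nargs="*", default=[],
        help="File names to ignore.",
    )
    return parser
```

For example, `build_parser().parse_args(["--skip_dirs", ".tox", ".pytest_cache"])` yields `skip_dirs == [".tox", ".pytest_cache"]` and leaves `path` at its default of `"~"`.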
Example output looks like this:
    (IterFilesystem) ~/IterFilesystem$ print_fs_stats --path ~/repos/IterFilesystem --skip_dirs .tox .pytest_cache
    Read/process: '~/repos/IterFilesystem'...
    Skip directories:
     * .tox
     * .pytest_cache
    No files will be skipped.
    ...
    Filesystem items..: 100%|██████████████████████████████████████████|633/633 13185.18entries/s [00:00<00:00, 13185.18entries/s]
    File sizes........: 100%|█████████████████████████████████████████████████████████████|2.22M/2.22M [00:00<00:00, 48.6MBytes/s]
    Average progress..: 100%|████████████████████████████████████████████████████████████████████████████████████████|00:00<00:00
    Current File......:, ~/repos/IterFilesystem/Pipfile

    Processed 633 filesystem items in 0.06 sec
    SHA512 hash calculated over all file content:
    79f2b0587e147b1c7d8581ea3597039a9e6d0c79ff10ea3bfd499cc60bc48892507437dd00da3c311280b4305c75459dbb122ebbec6b3b0445ce595b47c9f4a8
    File count: 428
    Total file size: 2.2 MB
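A single hash over all file content, together with the file count and total size, could be computed along these lines. This is a sketch under assumptions, not the code behind the output above: it assumes the digest is SHA-512 (the 128 hex characters fit a 512-bit digest), the name `hash_all_files` is hypothetical, and entries are sorted by name so the result is reproducible, which the real tool may or may not do.

```python
import hashlib
import os


def hash_all_files(top, skip_dirs=(), chunk_size=1024 * 1024):
    """Return (hex digest, file count, total bytes) for all files below `top`."""
    digest = hashlib.sha512()
    file_count = 0
    total_size = 0
    stack = [top]
    while stack:
        with os.scandir(stack.pop()) as entries:
            # Sort by name so the combined digest does not depend on listing order.
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_dir(follow_symlinks=False):
                    if entry.name not in skip_dirs:
                        stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # Read in chunks so large files do not have to fit in memory.
                    with open(entry.path, "rb") as f:
                        for chunk in iter(lambda: f.read(chunk_size), b""):
                            digest.update(chunk)
                            total_size += len(chunk)
                    file_count += 1
    return digest.hexdigest(), file_count, total_size
```

Because every file feeds the same hash object, the digest changes if any file's content changes, or if a file is added or removed in a way that alters the byte stream.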
History
dev - compare v1.0.0…master
TBC
12.10.2019 - compare v0.2.0…v1.0.0
refactoring:
don’t use persist-queue
switch from threading to multiprocessing
enhance progress display with multiple tqdm progress bars
15.09.2019 - compare v0.1.0…v0.2.0
store persist queue in temp directory
Don’t catch process_path_item errors; this should be done in the child class
15.09.2019 - compare v0.0.1…v0.1.0
add some project meta files and tests
setup CI
fix tests
15.09.2019 - v0.0.1
first release on PyPI