A fast directory scanner.
scandir-rs
scandir_rs is a directory iteration module like os.walk(), but with more features and higher speed. Depending on the function call, it yields a list of paths, a tuple of lists grouped by entry type, or DirEntry objects that include file type and stat information along with the name. Using scandir_rs is about 2-17 times faster than os.walk() (depending on the platform, file system and file tree structure) because it parallelizes the iteration in the background.
If you are only interested in directory statistics, you can use the submodule count.
scandir_rs contains the following submodules:
- count for determining statistics of a directory.
- walk for getting names of directory entries.
- scandir for getting detailed stats of directory entries.
For the API see:
- Submodule count: doc/count.md
- Submodule walk: doc/walk.md
- Submodule scandir: doc/scandir.md
Installation
To build this wheel from source you need Rust with channel nightly and the tool maturin.
Switch to channel nightly:
rustup default nightly
Install maturin:
cargo install maturin
Build wheel (not on Windows):
maturin build --release --strip
Build wheel on Windows:
maturin build --release --strip --no-sdist
maturin will build the wheels for all Python versions installed on your system.
Building and running tests for different Python versions
To make it easier to build wheels for several Python versions, the script build_wheels.sh has been added.
It creates wheels for Python versions 3.6, 3.7, 3.8 and 3.9. In addition, it runs pytest after the successful creation of each wheel.
To run the script, pyenv needs to be installed first, including all Python interpreter versions mentioned above.
Instructions on how to install pyenv can be found here.
Examples
Get statistics of a directory:
import scandir_rs as scandir
print(scandir.count.count("~/workspace", extended=True))
The same, but asynchronously in the background using a class instance:
import scandir_rs as scandir
scanner = scandir.count.Count("~/workspace", extended=True)
scanner.start()  # Start background thread pool
...
value = scanner.statistics  # Can be read at any time
...
scanner.stop()  # If you want to cancel the scanner
And with a context manager:
import scandir_rs as scandir
C = scandir.count.Count("~/workspace", extended=True)
with C:
    while C.busy():
        statistics = C.statistics
        # Do something
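To cross-check the numbers returned by count, a rough stdlib-only equivalent can be sketched with os.walk and os.lstat. This is an illustration only and not how scandir_rs is implemented. The function name count_tree and the keys of the returned dict are made up for this sketch and do not mirror the fields of the scandir_rs statistics object.

```python
import os
import stat


def count_tree(root):
    """Count directories, files and symlinks below root (hypothetical helper)."""
    stats = {"dirs": 0, "files": 0, "symlinks": 0, "size": 0}
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(os.path.expanduser(root)):
        for name in dirnames + filenames:
            try:
                # lstat inspects the entry itself instead of a symlink target.
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is not accessible
            if stat.S_ISLNK(st.st_mode):
                stats["symlinks"] += 1
            elif stat.S_ISDIR(st.st_mode):
                stats["dirs"] += 1
            elif stat.S_ISREG(st.st_mode):
                stats["files"] += 1
                stats["size"] += st.st_size  # apparent size, not disk usage
    return stats
```

Calling count_tree("~/workspace") returns a plain dict. It runs single-threaded, so it is only meant as a slow reference for comparing totals.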
os.walk() example:
import scandir_rs as scandir
for root, dirs, files in scandir.walk.Walk("~/workspace"):
    pass  # Do something
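Because the default Walk iterator yields the same (root, dirs, files) shape as os.walk(), code can fall back to the standard library when the wheel is not installed. The following is a minimal sketch of that pattern. The wrapper name walk is hypothetical, and it assumes scandir.walk.Walk accepts the path as its only required argument, as in the example above.

```python
import os

try:
    import scandir_rs as scandir  # optional dependency
    _walk_impl = scandir.walk.Walk  # API as used in the example above
except (ImportError, AttributeError):
    _walk_impl = None  # not installed, or a release with a different layout


def walk(path):
    """Yield (root, dirs, files) tuples (hypothetical wrapper)."""
    if _walk_impl is not None:
        return iter(_walk_impl(path))
    # os.walk does not expand "~" on its own, so do it explicitly here.
    return os.walk(os.path.expanduser(path))
```

Callers then write for root, dirs, files in walk("~/workspace"): and get the faster implementation whenever it is available.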
With extended data:
import scandir_rs as scandir
for root, dirs, files, symlinks, other, errors in scandir.walk.Walk(
        "~/workspace", return_type=scandir.RETURN_TYPE_EXT):
    pass  # Do something
os.scandir() example:
import scandir_rs as scandir
for path, entry in scandir.scandir.Scandir(
        "~/workspace", return_type=scandir.RETURN_TYPE_EXT):
    pass  # entry is a custom DirEntry object
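For comparison, the same loop shape can be reproduced with the standard library alone. The sketch below yields (relative path, os.DirEntry) pairs recursively. It uses the stdlib os.DirEntry, not the custom DirEntry object of scandir_rs, whose attributes are documented in doc/scandir.md. The name iter_entries is hypothetical.

```python
import os


def iter_entries(root):
    """Recursively yield (relative_path, os.DirEntry) pairs (hypothetical helper)."""
    root = os.path.expanduser(root)
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable directory: skip it
        for entry in entries:
            yield os.path.relpath(entry.path, root), entry
            # Do not descend into symlinked directories.
            if entry.is_dir(follow_symlinks=False):
                stack.append(entry.path)
```

Each yielded entry offers name, path, is_dir(), is_file() and stat(), which is the kind of per-entry information the scandir submodule is described as returning in bulk.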
Benchmarks
In the tables below, the scandir_rs.walk.Walk rows return results comparable to os.walk.
Linux with Ryzen 5 2400G and SSD
Directory /usr with
- 83790 directories
- 671847 files
- 48480 symlinks
- 1278 hardlinks
- 0 devices
- 0 pipes
- 30.3GB size and 31.9GB usage on disk
Time [s] | Method |
---|---|
5.319 | os.walk (Python 3.8) |
13.351 | os.walk+os.stat (Python 3.8) |
0.918 | scandir_rs.count.count |
1.340 | scandir_rs.count.count(extended=True) |
0.812 | scandir_rs.count.Count |
1.663 | scandir_rs.walk.toc |
1.107 | scandir_rs.walk.Walk (iter) |
1.775 | scandir_rs.walk.Walk (collect) |
2.511 | scandir_rs.scandir.entries (RETURN_TYPE_FAST) |
2.561 | scandir_rs.scandir.entries (RETURN_TYPE_BASE) |
2.496 | scandir_rs.scandir.entries (RETURN_TYPE_EXT) |
2.881 | scandir_rs.scandir.entries (RETURN_TYPE_FULL) |
2.437 | scandir_rs.scandir.entries (iter, RETURN_TYPE_FULL) |
Directory linux-5.5.5 with
- 4391 directories
- 66459 files
- 35 symlinks
- 13 hardlinks
- 0 devices
- 0 pipes
- 870.7MB size and 1021.5MB usage on disk
Time [s] | Method |
---|---|
0.343 | os.walk (Python 3.8) |
0.966 | os.walk+os.stat (Python 3.8) |
0.067 | scandir_rs.count.count |
0.116 | scandir_rs.count.count(extended=True) |
0.067 | scandir_rs.count.Count |
0.155 | scandir_rs.walk.toc |
0.081 | scandir_rs.walk.Walk (iter) |
0.150 | scandir_rs.walk.Walk (collect) |
0.186 | scandir_rs.scandir.entries (RETURN_TYPE_FAST) |
0.201 | scandir_rs.scandir.entries (RETURN_TYPE_BASE) |
0.202 | scandir_rs.scandir.entries (RETURN_TYPE_EXT) |
0.260 | scandir_rs.scandir.entries (RETURN_TYPE_FULL) |
0.210 | scandir_rs.scandir.entries (iter, RETURN_TYPE_FULL) |
Up to ~5 times faster on Linux.
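The speedup factor follows directly from dividing the timings. Which rows the summary line is based on is not stated, so this sketch simply compares os.walk with scandir_rs.walk.Walk (iter), the rows described as returning comparable results. The helper name speedup is made up for this example.

```python
# Timings in seconds, copied from the two Linux tables above.
timings = {
    "/usr": {"os.walk": 5.319, "scandir_rs.walk.Walk (iter)": 1.107},
    "linux-5.5.5": {"os.walk": 0.343, "scandir_rs.walk.Walk (iter)": 0.081},
}


def speedup(baseline, candidate):
    """Return how many times faster candidate is than baseline."""
    return baseline / candidate


for tree, row in timings.items():
    factor = speedup(row["os.walk"], row["scandir_rs.walk.Walk (iter)"])
    print(f"{tree}: {factor:.1f}x")
# prints "/usr: 4.8x" then "linux-5.5.5: 4.2x"
```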
Windows 10 with laptop Core i7-4810MQ @ 2.8GHz, MTF SSD
Directory C:\Windows with
- 130429 directories
- 426588 files
- 0 symlinks
- 53563 hardlinks
- 0 devices
- 0 pipes
- 49.8GB size and 50.9GB usage on disk
Time [s] | Method |
---|---|
96.544 | os.walk (Python 3.8) |
328.965 | os.walk+os.stat (Python 3.8) |
17.133 | scandir_rs.count.count |
90.272 | scandir_rs.count.count(extended=True) |
19.607 | scandir_rs.count.Count |
19.654 | scandir_rs.walk.toc |
18.203 | scandir_rs.walk.Walk (iter) |
19.704 | scandir_rs.walk.Walk (collect) |
88.183 | scandir_rs.scandir.entries (RETURN_TYPE_FAST) |
90.077 | scandir_rs.scandir.entries (RETURN_TYPE_BASE) |
90.704 | scandir_rs.scandir.entries (RETURN_TYPE_EXT) |
93.704 | scandir_rs.scandir.entries (RETURN_TYPE_FULL) |
90.340 | scandir_rs.scandir.entries (iter, RETURN_TYPE_FULL) |
Directory linux-5.5.5 with
- 4391 directories
- 66459 files
- 35 symlinks
- 13 hardlinks
- 0 devices
- 0 pipes
- 870.7MB size and 1021.5MB usage on disk
Time [s] | Method |
---|---|
0.343 | os.walk (Python 3.8) |
0.966 | os.walk+os.stat (Python 3.8) |
0.067 | scandir_rs.count.count |
0.116 | scandir_rs.count.count(extended=True) |
0.067 | scandir_rs.count.Count |
0.155 | scandir_rs.walk.toc |
0.081 | scandir_rs.walk.Walk (iter) |
0.150 | scandir_rs.walk.Walk (collect) |
0.186 | scandir_rs.scandir.entries (RETURN_TYPE_FAST) |
0.201 | scandir_rs.scandir.entries (RETURN_TYPE_BASE) |
0.202 | scandir_rs.scandir.entries (RETURN_TYPE_EXT) |
0.260 | scandir_rs.scandir.entries (RETURN_TYPE_FULL) |
0.210 | scandir_rs.scandir.entries (iter, RETURN_TYPE_FULL) |
Up to 6.7 times faster on Windows 10.
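The benchmark script itself is not shown here. As a rough sketch of how a single row could be timed, the helper below measures one full traversal with time.perf_counter. The name time_walk and its walker parameter are assumptions made for illustration, not part of scandir_rs.

```python
import os
import time


def time_walk(root, walker=os.walk):
    """Return (seconds, n_dirs, n_files) for one full traversal (hypothetical helper)."""
    n_dirs = n_files = 0
    start = time.perf_counter()
    # The iterator is consumed completely so the whole tree is visited.
    for _root, dirs, files in walker(root):
        n_dirs += len(dirs)
        n_files += len(files)
    return time.perf_counter() - start, n_dirs, n_files
```

Passing walker=scandir_rs.walk.Walk should time the corresponding scandir_rs row, assuming it takes the path as its only required argument as in the examples. Repeated runs on the same tree are usually faster because the operating system caches directory metadata, so timings are only comparable under the same cache conditions.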