A fast file tree scanner written in Rust

Project description

scandir-rs

The Python module is called scandir_rs and is installable via pip. It is an alternative to os.walk() and os.scandir() with more features and higher speed. On Linux it is 3 to 11 times faster and on Windows 6 to 70 times faster (see benchmarks).
It releases the GIL and performs the scan in a background thread. Intermediate results can be read while the scan is still running.

If you are only interested in directory statistics, you can use the Count class.

scandir_rs contains the following classes:

  • Count for determining statistics of a directory.
  • Walk for getting names of directory entries.
  • Scandir for getting detailed stats of directory entries.
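For orientation, the kind of statistics Count gathers can be approximated with the standard library alone. The following is a minimal sketch of such a baseline, not the scandir_rs implementation: the function name count_entries and the three keys it returns are hypothetical, and the fields actually reported by Count may differ.

```python
import os
import tempfile


def count_entries(root):
    """Count directories, files and symlinks below root using os.walk()."""
    stats = {"dirs": 0, "files": 0, "symlinks": 0}
    # os.walk() does not follow symlinked directories by default, so every
    # entry below root is visited exactly once.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            if os.path.islink(os.path.join(dirpath, name)):
                stats["symlinks"] += 1
            else:
                stats["dirs"] += 1
        for name in filenames:
            if os.path.islink(os.path.join(dirpath, name)):
                stats["symlinks"] += 1
            else:
                stats["files"] += 1
    return stats


if __name__ == "__main__":
    # Small throwaway tree so the demo does not depend on the host system.
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        open(os.path.join(tmp, "sub", "data.txt"), "w").close()
        print(count_entries(tmp))
```

A pure-Python loop like this holds the GIL while it runs, which is the limitation the background scanning in scandir_rs is meant to avoid.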

For the API see:

Installation

For building this wheel from source you need the tool maturin.

Install maturin:

cargo install maturin

IMPORTANT: Building this project requires Rust version 1.61 or later!

Build wheel:

Change to directory pyscandir.

Build wheel (on Linux):

maturin build --release --strip

Build wheel on Windows:

maturin build --release --strip --no-sdist

maturin will build the wheels for all Python versions installed on your system.

Alternatively, you can use the build script build_wheels.py. This script requires pyenv to be installed. It can build the wheel for specific Python versions or for all Python versions installed by pyenv. In addition, it runs pytest after each wheel has been built successfully.

python build_wheels.py

By default, the script builds the wheel for the current Python interpreter. To build the wheel for specific Python version(s), pass them with the argument --versions.

python build_wheels.py --versions 3.11.8,3.12.2

To build the wheel for all installed Python versions:

python build_wheels.py --versions *

Instructions on how to install pyenv can be found here.

Examples

Get statistics of a directory:

from scandir_rs import Count, ReturnType

print(Count("/usr", return_type=ReturnType.Ext).collect())

The collect method releases the GIL, so other Python threads can run in parallel.

The same, but asynchronously in the background using a class instance:
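Because the blocking call releases the GIL, it can be handed to a worker thread while the main thread keeps running. The sketch below shows one way to do that with the standard library. The helper run_in_background and its result dictionary are hypothetical names, not part of scandir_rs, and the scandir_rs call appears only in a comment so that the block runs without the package installed.

```python
import threading


def run_in_background(scan):
    """Start scan() in a worker thread and return (thread, result_box)."""
    box = {}

    def worker():
        # Store either the return value or the exception for the caller.
        try:
            box["value"] = scan()
        except Exception as exc:
            box["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, box


# Usage with scandir_rs (requires the package to be installed):
#   from scandir_rs import Count, ReturnType
#   thread, box = run_in_background(
#       lambda: Count("/usr", return_type=ReturnType.Ext).collect())
#   ...  # the main thread keeps working while the scan runs
#   thread.join()
#   print(box["value"])
```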

from scandir_rs import Count, ReturnType

instance = Count("/usr", return_type=ReturnType.Ext)
instance.start()  # Start scanning the directory in the background
...
values = instance.results()  # Returns the current statistics; can be read at any time
...
if instance.busy():  # Check whether the task is still running
    ...
instance.stop()  # Cancel the task if needed
...
instance.join()  # Wait for the instance to finish

and with a context manager:

import time

from scandir_rs import Count, ReturnType

with Count("/usr", return_type=ReturnType.Ext) as instance:
    while instance.busy():
        statistics = instance.results()
        # Do something
        time.sleep(0.01)
    print(instance.results())
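The polling loop above can be wrapped in a small reusable helper. This is a sketch that relies only on the busy() and results() methods shown in the examples. The names poll_until_done, interval, timeout and on_update are hypothetical, and FakeTask is a stand-in used to keep the block runnable without scandir_rs.

```python
import time


def poll_until_done(task, interval=0.01, timeout=None, on_update=None):
    """Poll task.busy() until it is False or timeout elapses; return task.results()."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while task.busy():
        if on_update is not None:
            on_update(task.results())  # Hand intermediate results to the caller
        if deadline is not None and time.monotonic() >= deadline:
            break  # Give up waiting; the task itself is left untouched
        time.sleep(interval)
    return task.results()


class FakeTask:
    """Stand-in with the busy()/results() shape used above; not part of scandir_rs."""

    def __init__(self, steps):
        self._remaining = steps
        self._done = 0

    def busy(self):
        # Each call simulates one unit of finished work until none remain.
        if self._remaining > 0:
            self._remaining -= 1
            self._done += 1
            return True
        return False

    def results(self):
        return self._done


if __name__ == "__main__":
    print(poll_until_done(FakeTask(3), interval=0, on_update=print))
```

With a real instance, the helper would be called inside the with block, for example poll_until_done(instance, on_update=print). If the timeout is reached, the caller decides whether to call instance.stop().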

os.walk() example:

from scandir_rs import Walk

for root, dirs, files in Walk("/usr"):
    pass  # Do something with root, dirs and files
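Since the loop above unpacks the same (root, dirs, files) tuples as os.walk(), code written against that shape can be fed from either source. The sketch below uses os.walk() so that it runs without scandir_rs. The function name extension_histogram is hypothetical, and passing a Walk instance in place of os.walk() is an assumption based on the example above, not something verified here.

```python
import os
from collections import Counter


def extension_histogram(walker):
    """Count file name extensions over an iterable of (root, dirs, files) tuples."""
    histogram = Counter()
    for _root, _dirs, files in walker:
        for name in files:
            # splitext() returns "" for names without an extension.
            histogram[os.path.splitext(name)[1]] += 1
    return histogram


if __name__ == "__main__":
    # Standard-library baseline; Walk("/usr") could presumably be passed instead.
    print(extension_histogram(os.walk(os.curdir)).most_common(5))
```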

with extended data:

from scandir_rs import Walk, ReturnType

for root, dirs, files, symlinks, other, errors in Walk("/usr", return_type=ReturnType.Ext):
    pass  # Do something with the extended data

os.scandir() example:

from scandir_rs import Scandir, ReturnType

for path, entry in Scandir("~/workspace", return_type=ReturnType.Ext):
    pass  # entry is a custom DirEntry object

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • scandir_rs-2.7.1.tar.gz (39.0 kB), source

Built Distributions

  • scandir_rs-2.7.1-cp312-none-win_amd64.whl (441.6 kB), CPython 3.12, Windows x86-64
  • scandir_rs-2.7.1-cp312-cp312-manylinux_2_34_x86_64.whl (578.2 kB), CPython 3.12, manylinux (glibc 2.34+) x86-64
  • scandir_rs-2.7.1-cp311-none-win_amd64.whl (436.8 kB), CPython 3.11, Windows x86-64
  • scandir_rs-2.7.1-cp311-cp311-manylinux_2_34_x86_64.whl (576.8 kB), CPython 3.11, manylinux (glibc 2.34+) x86-64
  • scandir_rs-2.7.1-cp310-none-win_amd64.whl (436.7 kB), CPython 3.10, Windows x86-64
  • scandir_rs-2.7.1-cp310-cp310-manylinux_2_34_x86_64.whl (576.8 kB), CPython 3.10, manylinux (glibc 2.34+) x86-64
  • scandir_rs-2.7.1-cp39-none-win_amd64.whl (437.0 kB), CPython 3.9, Windows x86-64
  • scandir_rs-2.7.1-cp39-cp39-manylinux_2_34_x86_64.whl (577.3 kB), CPython 3.9, manylinux (glibc 2.34+) x86-64
  • scandir_rs-2.7.1-cp38-none-win_amd64.whl (437.3 kB), CPython 3.8, Windows x86-64
  • scandir_rs-2.7.1-cp38-cp38-manylinux_2_34_x86_64.whl (577.5 kB), CPython 3.8, manylinux (glibc 2.34+) x86-64
  • scandir_rs-2.7.1-cp37-none-win_amd64.whl (437.3 kB), CPython 3.7, Windows x86-64
  • scandir_rs-2.7.1-cp37-cp37m-manylinux_2_34_x86_64.whl (577.6 kB), CPython 3.7m, manylinux (glibc 2.34+) x86-64