Python binding to the Linux syscall getdents64.

Iterate over large directories efficiently with Python.

About

python-getdents is a simple wrapper around the Linux system call getdents64 (see man getdents for details).

The implementation is based on the solution described in the article "You can list a directory containing 8 million files! But not with ls." by Ben Congleton.

Install

pip install getdents

For development

python3 -m venv env
. env/bin/activate
pip install -e .[test]

Building Wheels

pip install cibuildwheel
cibuildwheel --platform linux --output-dir wheelhouse

Run tests

ulimit -v 33554432 && py.test tests/

Usage

from getdents import getdents

for inode, type_, name in getdents("/tmp"):
    print(name)

Advanced

While getdents provides a convenient wrapper with ls-like filtering, you can use getdents_raw for more control:

import os
from getdents import DT_LNK, O_GETDENTS, getdents_raw

fd = os.open("/tmp", O_GETDENTS)

for inode, type_, name in getdents_raw(fd, 2**20):
    if type_ == DT_LNK and inode != 0:
        print("found symlink:", name, "->", os.readlink(name, dir_fd=fd))

os.close(fd)

Batching

If you need more control over syscalls, you can call the getdents_raw instance directly instead of iterating over it. Each call corresponds to a single getdents64 syscall and returns a list of however many entries fit in the buffer. The call returns None when there are no more entries to read.

it = getdents_raw(fd, 2**20)

for batch in iter(it, None):
    for inode, type_, name in batch:
        ...
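The loop above relies on the two-argument form of the built-in iter, which keeps calling its first argument until the sentinel (here None) comes back. The sketch below illustrates only that pattern. FakeBatchReader is a hypothetical stand-in that mimics the behaviour described above (a list per call, then None); it is not part of getdents, and the entry tuples are made-up sample data.

```python
class FakeBatchReader:
    """Hypothetical stand-in: each call returns one batch, then None when exhausted."""

    def __init__(self, batches):
        self._batches = list(batches)

    def __call__(self):
        # Mimic the contract described above: a list of entries, or None at the end.
        return self._batches.pop(0) if self._batches else None


def flatten_batches(reader):
    """Collect the names from every batch until the reader returns None."""
    names = []
    # iter(callable, sentinel) calls reader() repeatedly and stops on None.
    for batch in iter(reader, None):
        for inode, type_, name in batch:
            names.append(name)
    return names


reader = FakeBatchReader([
    [(11, 8, "a.txt"), (12, 8, "b.txt")],  # what a first call might return
    [(13, 4, "subdir")],                   # what a second call might return
])
print(flatten_batches(reader))
```

Because the sentinel is compared by equality, an empty batch would not end the loop; only None does.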

Free-threading

Although doing I/O from multiple threads on a single file descriptor is rarely a wise idea, you can do it if you need to. This package supports free-threaded (no-GIL) builds of Python.

CLI

Usage

python-getdents [-h] [-b N] [-o NAME] PATH

Options

-b N, --buffer-size N

Buffer size (in bytes) to allocate when iterating over the directory. The default is 32768, the same value used by glibc; you will probably want to increase it. Try starting with 16777216 (16 MiB). Best performance is achieved when the buffer size is a multiple of the file system block size.

-o NAME, --output-format NAME

Output format:

  • plain (default): print only names.

  • csv: print comma-separated values in the order inode, type, name.

  • csv-headers: same as csv, but also print headers on the first line.

  • json: output a JSON array.

  • json-stream: output each directory entry as a single JSON object, one per line.
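One way to pick a value for -b is to round the size you want up to a multiple of the block size the file system reports. The sketch below does that with os.statvfs from the standard library. Both helper names are hypothetical, and reading the advice above as "a multiple of the block size" is an assumption.

```python
import os


def round_up_to_block(size, block_size):
    """Round size up to the nearest multiple of block_size."""
    if size <= 0 or block_size <= 0:
        raise ValueError("size and block_size must be positive")
    # Ceiling division, then scale back up to bytes.
    return -(-size // block_size) * block_size


def suggested_buffer_size(path, wanted=16 * 1024 * 1024):
    """Hypothetical helper: align `wanted` to the block size reported for `path`."""
    block_size = os.statvfs(path).f_bsize  # block size reported by the file system
    return round_up_to_block(wanted, block_size)
```

For example, suggested_buffer_size("/path/to/large/dir") returns a number that can be passed to -b. With a 4096-byte block size, the 16 MiB starting point is already aligned and comes back unchanged.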

Exit codes

  • 3 - Requested buffer is too large.

  • 4 - PATH not found.

  • 5 - PATH is not a directory.

  • 6 - Not enough permissions to read contents of the PATH.
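When driving the CLI from a script, these exit codes can be turned into readable errors. The following is a minimal sketch using subprocess. The helper names are hypothetical, treating exit code 0 as success is the usual convention rather than something stated above, and the code assumes file names decode as text and contain no newlines.

```python
import subprocess

# Exit codes as listed above; any other non-zero code is not documented here.
EXIT_CODES = {
    3: "Requested buffer is too large",
    4: "PATH not found",
    5: "PATH is not a directory",
    6: "Not enough permissions to read contents of the PATH",
}


def describe_exit(code):
    """Translate an exit code into a short message."""
    if code == 0:
        return "success"  # assumption: 0 means success, as is conventional
    return EXIT_CODES.get(code, f"undocumented exit code {code}")


def list_names(path, buffer_size=32768):
    """Run the CLI in plain mode and return the names it prints."""
    result = subprocess.run(
        ["python-getdents", "-b", str(buffer_size), path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(describe_exit(result.returncode))
    return result.stdout.splitlines()
```

Calling list_names("/path/to/large/dir") would then either return the names or raise RuntimeError with one of the messages above.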

Examples

python-getdents /path/to/large/dir
python -m getdents /path/to/large/dir
python-getdents /path/to/large/dir -o csv -b 16777216 > dir.csv
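A file such as dir.csv from the last example can be processed with the standard csv module. The sketch below counts entries per type value, relying only on the documented column order (inode, type, name). The function name and the sample rows are hypothetical, and how the CLI actually renders the type column is not specified above.

```python
import csv
import io
from collections import Counter


def count_types(csv_file):
    """Count rows per type value in csv output (columns: inode, type, name)."""
    counts = Counter()
    for row in csv.reader(csv_file):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        inode, type_, name = row
        counts[type_] += 1
    return counts


# Hypothetical sample standing in for dir.csv; real type values may look different.
sample = io.StringIO("101,file,a.txt\n102,file,b.txt\n103,dir,sub\n")
print(count_types(sample))
```

For a real file, open it with open("dir.csv", newline="") and pass the file object to count_types. Reading row by row keeps memory use flat even for listings with millions of entries.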

Download files

Source Distribution

  • getdents-1.0.0.tar.gz (13.8 kB)

Built Distributions

  • getdents-1.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl (16.2 kB): CPython 3.14t, musllinux (musl 1.2+), x86-64

  • getdents-1.0.0-cp314-cp314t-musllinux_1_2_aarch64.whl (17.0 kB): CPython 3.14t, musllinux (musl 1.2+), ARM64

  • getdents-1.0.0-cp314-cp314t-manylinux_2_28_x86_64.whl (15.9 kB): CPython 3.14t, manylinux (glibc 2.28+), x86-64

  • getdents-1.0.0-cp314-cp314t-manylinux_2_28_aarch64.whl (17.2 kB): CPython 3.14t, manylinux (glibc 2.28+), ARM64

  • getdents-1.0.0-cp310-abi3-musllinux_1_2_x86_64.whl (15.5 kB): CPython 3.10+, musllinux (musl 1.2+), x86-64

  • getdents-1.0.0-cp310-abi3-musllinux_1_2_aarch64.whl (16.2 kB): CPython 3.10+, musllinux (musl 1.2+), ARM64

  • getdents-1.0.0-cp310-abi3-manylinux_2_28_x86_64.whl (15.2 kB): CPython 3.10+, manylinux (glibc 2.28+), x86-64

  • getdents-1.0.0-cp310-abi3-manylinux_2_28_aarch64.whl (16.3 kB): CPython 3.10+, manylinux (glibc 2.28+), ARM64

File details

Details for the file getdents-1.0.0.tar.gz.

File metadata

  • Download URL: getdents-1.0.0.tar.gz
  • Size: 13.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for getdents-1.0.0.tar.gz
Algorithm Hash digest
SHA256 80ab2825a09e5b1107fe3d166458d01d4a7cedfe255ee9762d12c68c9f890d24
MD5 c1d657d70c3245cde663d587b5b793ae
BLAKE2b-256 10aacbdc87f71e8659f579557beb5d719e82459f70cdac6c089f948bce6cd76a

Provenance

The following attestation bundles were made for getdents-1.0.0.tar.gz:

Publisher: publish.yml on ZipFile/python-getdents

