
Python binding to the Linux syscall getdents64.

Project description

Iterate large directories efficiently with Python.

About

python-getdents is a simple wrapper around the Linux system call getdents64 (see man getdents for details). More details on the approach.
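
To make the system call's output concrete, here is a minimal sketch of how a buffer filled by getdents64 could be decoded in pure Python. It follows the struct linux_dirent64 layout documented in man getdents; it is not the package's own implementation, and parse_dirents and pack_dirent are hypothetical names used only for illustration. The buffer is built synthetically so the example runs without issuing the system call.

```python
import struct

# Fixed-size header of struct linux_dirent64 (see man getdents):
# d_ino (u64), d_off (s64), d_reclen (u16), d_type (u8); d_name follows.
HEADER = struct.Struct("=QqHB")


def parse_dirents(buf):
    """Yield (inode, type, name) for each record in a raw getdents64 buffer."""
    offset = 0
    while offset < len(buf):
        d_ino, _d_off, d_reclen, d_type = HEADER.unpack_from(buf, offset)
        if d_reclen < HEADER.size:
            raise ValueError("corrupt record length")
        start = offset + HEADER.size
        # The name is NUL-terminated and lies inside the current record.
        end = buf.index(b"\0", start, offset + d_reclen)
        yield d_ino, d_type, buf[start:end].decode("utf-8", "surrogateescape")
        # Records are variable-length; d_reclen also covers the padding.
        offset += d_reclen


def pack_dirent(inode, d_type, name):
    """Hypothetical helper: build one synthetic record padded to 8 bytes."""
    raw = name.encode() + b"\0"
    reclen = (HEADER.size + len(raw) + 7) // 8 * 8
    padded_name = raw.ljust(reclen - HEADER.size, b"\0")
    return HEADER.pack(inode, 0, reclen, d_type) + padded_name


# Two fake records: a regular file (d_type 8) and a directory (d_type 4).
buf = pack_dirent(12, 8, "a.txt") + pack_dirent(34, 4, "subdir")
print(list(parse_dirents(buf)))
```

Because every record carries its own length, a larger buffer lets one call return more entries, which is why the buffer size matters for large directories.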

TODO

  • Verify that the implementation works on platforms other than x86_64.

Install

pip install getdents

For development

python3 -m venv env
. env/bin/activate
pip install -e '.[test]'

Building Wheels

pip install cibuildwheel
cibuildwheel --platform linux --output-dir wheelhouse

Run tests

ulimit -v 33554432 && py.test tests/

Or

ulimit -v 33554432 && ./setup.py test

Usage

from getdents import getdents

# Each entry is an (inode, type, name) tuple; 32768 is the buffer size in bytes.
for inode, entry_type, name in getdents('/tmp', 32768):
    print(name)
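
When a directory holds millions of entries, it can help to process them in fixed-size batches rather than collecting them all into one list. The sketch below assumes only that the entries arrive as an iterable of (inode, type, name) tuples, as in the loop above. The batched helper and the fake_entries stand-in are hypothetical, so the example runs without the package installed.

```python
from itertools import islice


def batched(entries, size):
    """Yield lists of at most `size` items from any iterable of entries."""
    it = iter(entries)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Stand-in for getdents('/tmp', 32768): any iterable of (inode, type, name).
fake_entries = ((i, 8, f"file{i}.txt") for i in range(1, 6))

for batch in batched(fake_entries, 2):
    print([name for _inode, _type, name in batch])
```

With the package installed, the stand-in could be replaced by a call such as getdents('/tmp', 32768).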

Advanced

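import os
from getdents import (
    DT_BLK, DT_CHR, DT_DIR, DT_FIFO, DT_LNK, DT_REG, DT_SOCK, DT_UNKNOWN,
    O_GETDENTS, getdents_raw,
)

# Human-readable labels for the entry type constants.
TYPE_LABELS = {
    DT_BLK:     'blockdev',
    DT_CHR:     'chardev ',
    DT_DIR:     'dir     ',
    DT_FIFO:    'pipe    ',
    DT_LNK:     'symlink ',
    DT_REG:     'file    ',
    DT_SOCK:    'socket  ',
    DT_UNKNOWN: 'unknown ',
}

fd = os.open('/tmp', O_GETDENTS)
try:
    # getdents_raw reads from an open directory descriptor; 2**20 is a 1 MiB buffer.
    for inode, entry_type, name in getdents_raw(fd, 2**20):
        # Entries whose inode is 0 are marked with 'd'.
        print(TYPE_LABELS[entry_type], 'd' if inode == 0 else ' ', name)
finally:
    # Close the descriptor even if iteration fails.
    os.close(fd)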
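
The raw tuples can also be summarized instead of printed. The following sketch counts entries per type and skips entries with inode 0 as well as the "." and ".." names. It makes two assumptions: that the DT_* constants exported by the package carry the numeric values from <dirent.h>, and that raw iteration may include "." and "..". TYPE_NAMES and count_by_type are hypothetical names, and the input is synthetic so the example is self-contained.

```python
from collections import Counter

# Numeric d_type values from <dirent.h>; assumed to match the package's
# DT_* constants.
TYPE_NAMES = {
    0: "unknown",
    1: "pipe",
    2: "chardev",
    4: "dir",
    6: "blockdev",
    8: "file",
    10: "symlink",
    12: "socket",
}


def count_by_type(entries):
    """Count (inode, type, name) entries per type, skipping placeholders."""
    counts = Counter()
    for inode, d_type, name in entries:
        if inode == 0 or name in (".", ".."):
            continue  # inode-0 slots and the self/parent links are not counted
        counts[TYPE_NAMES.get(d_type, "unknown")] += 1
    return counts


# Synthetic stand-in for getdents_raw(fd, 2**20).
sample = [
    (2, 4, "."),
    (1, 4, ".."),
    (10, 8, "a.txt"),
    (11, 8, "b.txt"),
    (0, 8, "stale"),
    (12, 4, "subdir"),
    (13, 10, "link"),
]
print(count_by_type(sample))
```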

CLI

Usage

python-getdents [-h] [-b N] [-o NAME] PATH

Options

-b N, --buffer-size N

Buffer size (in bytes) to allocate when iterating over a directory. The default is 32768, the same value used by glibc; you will probably want to increase it. Try starting with 16777216 (16 MiB). Best performance is achieved when the buffer size is a multiple of the file system block size.

-o NAME, --output-format NAME

Output format:

  • plain (default): print only names.

  • csv: print comma-separated values in the order inode, type, name.

  • csv-headers: same as csv, but also print headers on the first line.

  • json: output a JSON array.

  • json-stream: output each directory entry as a single JSON object, one per line.
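
One way to pick a buffer size that lines up with the file system block size is to query the block size and round up. This is a sketch that assumes os.statvfs(path).f_bsize is a suitable stand-in for "the file system block"; round_up and aligned_buffer_size are hypothetical helper names.

```python
import os


def round_up(size, block):
    """Round `size` up to the nearest multiple of `block`."""
    return -(-size // block) * block


def aligned_buffer_size(path, requested=16 * 1024 * 1024):
    """Return `requested` rounded up to the block size reported for `path`."""
    block = os.statvfs(path).f_bsize  # preferred I/O block size
    return round_up(requested, block)


print(round_up(32769, 4096))  # 36864
print(aligned_buffer_size("."))
```

The result can be passed to -b on the command line or used as the buffer size argument in the Python examples.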

Exit codes

  • 3 - Requested buffer is too large.

  • 4 - PATH not found.

  • 5 - PATH is not a directory.

  • 6 - Not enough permissions to read contents of the PATH.
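
A script that shells out to the CLI can turn these exit codes into readable errors. This is a minimal sketch: it assumes the python-getdents executable is on PATH, that exit code 0 means success, and that the plain format prints one name per line. explain_exit and list_directory are hypothetical names.

```python
import subprocess

# Exit codes as listed above.
EXIT_MESSAGES = {
    3: "Requested buffer is too large",
    4: "PATH not found",
    5: "PATH is not a directory",
    6: "Not enough permissions to read contents of the PATH",
}


def explain_exit(code):
    """Map a python-getdents exit code to a human-readable message."""
    if code == 0:
        return "success"  # assumption: 0 is the usual success status
    return EXIT_MESSAGES.get(code, f"unexpected exit code {code}")


def list_directory(path):
    """Run the CLI in plain mode and return the printed names."""
    proc = subprocess.run(
        ["python-getdents", path],
        capture_output=True,
        text=True,
    )
    if proc.returncode != 0:
        raise RuntimeError(explain_exit(proc.returncode))
    return proc.stdout.splitlines()


print(explain_exit(5))
```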

Examples

python-getdents /path/to/large/dir
python -m getdents /path/to/large/dir
python-getdents /path/to/large/dir -o csv -b 16777216 > dir.csv
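
A file such as dir.csv from the last command can be read back with the standard csv module. The sketch below assumes the columns follow the documented order (inode, type, name) and that names containing commas are quoted in the usual CSV way. The type column is kept as an opaque string because its exact representation is not specified here. read_entries is a hypothetical name and the sample rows are made up.

```python
import csv
import io


def read_entries(lines):
    """Yield (inode, type, name) tuples from csv-formatted lines."""
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        inode, entry_type, name = row
        yield int(inode), entry_type, name


# Made-up sample standing in for open('dir.csv', newline='').
sample = io.StringIO('12,8,a.txt\n34,4,"b,c"\n')

for inode, entry_type, name in read_entries(sample):
    print(inode, entry_type, name)
```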
