Python binding to the Linux syscall getdents64.

Iterate large directories efficiently with Python.

About

python-getdents is a simple wrapper around the Linux system call getdents64 (see man getdents for details). More details on the approach.
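To make the system call's output concrete, here is a minimal sketch of how a buffer filled by getdents64 can be decoded. It relies only on the linux_dirent64 record layout documented in man getdents; it is not the library's own implementation, and `parse_dirents` and `pack_dirent` are hypothetical helper names used for illustration.

```python
import struct

# Fixed-size header of struct linux_dirent64 (see man getdents):
# d_ino (u64), d_off (s64), d_reclen (u16), d_type (u8), followed by d_name.
DIRENT64_HEADER = struct.Struct("=QqHB")


def parse_dirents(buf):
    """Yield (inode, type, name) for each record packed in buf."""
    offset = 0
    while offset < len(buf):
        d_ino, _d_off, d_reclen, d_type = DIRENT64_HEADER.unpack_from(buf, offset)
        if d_reclen < DIRENT64_HEADER.size:
            raise ValueError("corrupt record length")
        name_start = offset + DIRENT64_HEADER.size
        # d_name is NUL-terminated and padded up to d_reclen.
        raw_name = buf[name_start:offset + d_reclen].split(b"\0", 1)[0]
        yield d_ino, d_type, raw_name.decode("utf-8", "surrogateescape")
        offset += d_reclen


def pack_dirent(inode, d_type, name):
    """Build one synthetic record for demonstration (hypothetical helper)."""
    raw = name.encode() + b"\0"
    # Round the record length up to a multiple of 8 bytes.
    reclen = (DIRENT64_HEADER.size + len(raw) + 7) & ~7
    padded = raw.ljust(reclen - DIRENT64_HEADER.size, b"\0")
    return DIRENT64_HEADER.pack(inode, 0, reclen, d_type) + padded
```

Because every record carries its own length, a larger buffer lets one system call return more entries, which is why the buffer size matters for big directories.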

TODO

  • Verify that the implementation works on platforms other than x86_64.

Install

pip install getdents

For development

python3 -m venv env
. env/bin/activate
pip install -e .[test]

Building Wheels

pip install cibuildwheel
cibuildwheel --platform linux --output-dir wheelhouse

Run tests

ulimit -v 33554432 && py.test tests/

Or

ulimit -v 33554432 && ./setup.py test

Usage

from getdents import getdents

# Each entry is an (inode, type, name) tuple; 32768 is the buffer size in bytes.
for inode, entry_type, name in getdents('/tmp', 32768):
    print(name)
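The tuples yielded by the loop above can be passed to ordinary Python code. As a sketch, the helper below tallies entries by type. It assumes the second tuple element carries the numeric DT_* values from dirent.h; `count_by_type` and `TYPE_NAMES` are hypothetical names, not part of the package.

```python
from collections import Counter

# Numeric DT_* values from <dirent.h>. Assumption: getdents yields these
# same integers as the second element of each tuple.
TYPE_NAMES = {
    0: "unknown",
    1: "pipe",
    2: "chardev",
    4: "dir",
    6: "blockdev",
    8: "file",
    10: "symlink",
    12: "socket",
}


def count_by_type(entries):
    """Count (inode, type, name) tuples per entry type."""
    counts = Counter()
    for _inode, entry_type, _name in entries:
        counts[TYPE_NAMES.get(entry_type, "unknown")] += 1
    return counts
```

With the package installed, `count_by_type(getdents('/tmp', 32768))` would summarise a directory in a single pass without building a list of names first.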

Advanced

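import os
from getdents import *

# Type codes mapped to fixed-width labels so the output columns line up.
TYPE_LABELS = {
    DT_BLK:     'blockdev',
    DT_CHR:     'chardev ',
    DT_DIR:     'dir     ',
    DT_FIFO:    'pipe    ',
    DT_LNK:     'symlink ',
    DT_REG:     'file    ',
    DT_SOCK:    'socket  ',
    DT_UNKNOWN: 'unknown ',
}

fd = os.open('/tmp', O_GETDENTS)
try:
    for inode, entry_type, name in getdents_raw(fd, 2**20):
        # Flag entries whose inode is 0 with 'd'.
        print(TYPE_LABELS[entry_type], 'd' if inode == 0 else ' ', name)
finally:
    # Close the descriptor even if iteration fails.
    os.close(fd)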
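Some file systems do not fill in the entry type and report DT_UNKNOWN instead (see man getdents). A possible fallback, sketched here with only the standard library, is to call lstat for those entries. `resolve_type` is a hypothetical helper, and the shift by 12 mirrors glibc's IFTODT macro rather than anything the package documents.

```python
import os
import stat

DT_UNKNOWN = 0  # value from <dirent.h>; the package exports a constant of this name


def resolve_type(dir_path, name, entry_type):
    """Return entry_type, falling back to lstat when it is DT_UNKNOWN."""
    if entry_type != DT_UNKNOWN:
        return entry_type
    mode = os.lstat(os.path.join(dir_path, name)).st_mode
    # The DT_* values equal the file-type bits of st_mode shifted right by 12.
    return stat.S_IFMT(mode) >> 12
```

The extra lstat costs one more system call per entry, so it is worth doing only for entries that actually come back as DT_UNKNOWN.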

CLI

Usage

python-getdents [-h] [-b N] [-o NAME] PATH

Options

Option

Description

-b N, --buffer-size N

Buffer size (in bytes) to allocate when iterating over a directory. The default is 32768, the same value glibc uses; you will probably want to increase it. Try starting with 16777216 (16 MiB). Performance is best when the buffer size is a multiple of the file system block size.

-o NAME, --output-format NAME

Output format:

  • plain (default): print only names.

  • csv: print comma-separated values in the order inode, type, name.

  • csv-headers: same as csv, but also print headers on the first line.

  • json: print a JSON array.

  • json-stream: print each directory entry as a single JSON object, one per line.
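The buffer size advice can be turned into a small calculation. The sketch below reads the advice as "use a multiple of the file system block size", takes the block size from os.statvfs, and rounds a target size up to it. `round_up_to_block` and `suggested_buffer_size` are hypothetical helper names, and the 16 MiB default is simply the starting point suggested above.

```python
import os


def round_up_to_block(size, block_size):
    """Round size up to the nearest multiple of block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return -(-size // block_size) * block_size


def suggested_buffer_size(path, target=16 * 1024 * 1024):
    """Pick a buffer size near target, aligned to the block size of path."""
    block_size = os.statvfs(path).f_bsize
    return round_up_to_block(target, block_size)
```

The result could then be passed as the value of -b, or as the second argument to getdents.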

Exit codes

  • 3 - Requested buffer is too large.

  • 4 - PATH not found.

  • 5 - PATH is not a directory.

  • 6 - Not enough permissions to read contents of the PATH.
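When calling the CLI from a script, the exit codes above can be mapped back to readable errors. This is a minimal sketch, assuming that exit code 0 means success and that the python-getdents executable is on PATH; `describe_exit_code` and `list_names` are hypothetical helpers.

```python
import subprocess

# Messages taken from the exit code list above.
EXIT_MESSAGES = {
    3: "Requested buffer is too large",
    4: "PATH not found",
    5: "PATH is not a directory",
    6: "Not enough permissions to read contents of the PATH",
}


def describe_exit_code(code):
    """Translate a python-getdents exit code into a short message."""
    if code == 0:
        return "ok"  # assumption: 0 signals success, as is conventional
    return EXIT_MESSAGES.get(code, f"unexpected exit code {code}")


def list_names(path):
    """Run the CLI in plain mode and return the printed names."""
    result = subprocess.run(
        ["python-getdents", path], capture_output=True, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(describe_exit_code(result.returncode))
    return result.stdout.splitlines()
```

Splitting on newlines is only safe when file names contain no newline characters; for untrusted directories one of the structured output formats is a better fit.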

Examples

python-getdents /path/to/large/dir
python -m getdents /path/to/large/dir
python-getdents /path/to/large/dir -o csv -b 16777216 > dir.csv
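A file such as dir.csv from the last example can be read back with the standard csv module. The sketch below assumes the plain csv format (no header line), the column order inode, type, name stated in the options, and a decimal inode number. The type column is kept as a string because its exact encoding is not specified here. `read_entries` is a hypothetical helper.

```python
import csv


def read_entries(lines):
    """Yield (inode, type, name) tuples from python-getdents csv output."""
    for inode, entry_type, name in csv.reader(lines):
        # Assumption: the inode column is a decimal integer.
        yield int(inode), entry_type, name
```

Typical use would be `with open('dir.csv', newline='') as f:` followed by a loop over `read_entries(f)`, which streams entries instead of loading the whole file.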
