Python binding to the Linux syscall getdents64.

Project description

Iterate large directories efficiently with Python.

About

python-getdents is a simple wrapper around the Linux system call getdents64 (see man getdents for details). More details on the approach.

TODO

  • Verify that the implementation works on platforms other than x86_64.

Install

pip install getdents

For development

python3 -m venv env
. env/bin/activate
pip install -e .[test]

Run tests

ulimit -v 33554432 && py.test tests/

Or

ulimit -v 33554432 && ./setup.py test

Usage

from getdents import getdents

for inode, type, name in getdents('/tmp', 32768):
    print(name)
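
As a rough sketch of what can be done with the (inode, type, name) tuples shown above, the helper below tallies entries by type. It accepts any iterable of such tuples, so it can be tried without the package installed. The DT_* values are copied from Linux <dirent.h>; that the package's own DT_* constants carry the same values is an assumption, and count_by_type and summarize are hypothetical names, not part of getdents.

```python
from collections import Counter

# d_type values as defined in Linux <dirent.h>. Assumption: the DT_*
# constants exported by the getdents package carry the same values.
DT_UNKNOWN, DT_FIFO, DT_CHR, DT_DIR = 0, 1, 2, 4
DT_BLK, DT_REG, DT_LNK, DT_SOCK = 6, 8, 10, 12

TYPE_NAMES = {
    DT_BLK: 'blockdev',
    DT_CHR: 'chardev',
    DT_DIR: 'dir',
    DT_FIFO: 'pipe',
    DT_LNK: 'symlink',
    DT_REG: 'file',
    DT_SOCK: 'socket',
    DT_UNKNOWN: 'unknown',
}


def count_by_type(entries):
    """Count (inode, type, name) tuples per entry type (hypothetical helper)."""
    counts = Counter()
    for _inode, d_type, _name in entries:
        # Types missing from the table are counted as 'unknown'.
        counts[TYPE_NAMES.get(d_type, 'unknown')] += 1
    return counts


def summarize(path, buffer_size=32768):
    """Tally the entries of a real directory; needs the getdents package."""
    from getdents import getdents  # imported lazily so the helper above works without it
    return count_by_type(getdents(path, buffer_size))
```

Calling summarize('/tmp') would then return a Counter keyed by the type names above.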

Advanced

import os
from getdents import *

fd = os.open('/tmp', O_GETDENTS)

for inode, type, name in getdents_raw(fd, 2**20):
    print({
            DT_BLK:     'blockdev',
            DT_CHR:     'chardev ',
            DT_DIR:     'dir     ',
            DT_FIFO:    'pipe    ',
            DT_LNK:     'symlink ',
            DT_REG:     'file    ',
            DT_SOCK:    'socket  ',
            DT_UNKNOWN: 'unknown ',
        }[type], {
            True:  'd',
            False: ' ',
        }[inode == 0],
        name,
    )

os.close(fd)
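
In the advanced example the descriptor is closed only if the loop finishes without raising. One way to guarantee the close is a small context manager, sketched below. open_directory is a hypothetical helper, not part of getdents, and its default flags (os.O_RDONLY | os.O_DIRECTORY) are only a stand-in so the sketch runs without the package; pass O_GETDENTS from getdents in real use.

```python
import os
from contextlib import contextmanager


@contextmanager
def open_directory(path, flags=os.O_RDONLY | os.O_DIRECTORY):
    """Open a directory and always close the descriptor (hypothetical helper).

    The default flags are a stand-in; with the getdents package installed,
    pass its O_GETDENTS constant instead.
    """
    fd = os.open(path, flags)
    try:
        yield fd
    finally:
        # Runs even if the body of the with-block raises.
        os.close(fd)
```

With the package installed, the loop above becomes `with open_directory('/tmp', O_GETDENTS) as fd:` followed by the same `for inode, type, name in getdents_raw(fd, 2**20):` body, and the explicit os.close(fd) is no longer needed.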

CLI

Usage

python-getdents [-h] [-b N] [-o NAME] PATH

Options

-b N, --buffer-size N
    Buffer size (in bytes) to allocate when iterating over a directory. The default is 32768, the same value used by glibc; you probably want to increase it. Try starting with 16777216 (16 MiB). Best performance is achieved when the buffer size is a multiple of the file system block size.

-o NAME, --output-format NAME
    Output format:

  • plain (default): print only names.
  • csv: print comma-separated values in the order inode, type, name.
  • csv-headers: same as csv, but also print headers on the first line.
  • json: output a JSON array.
  • json-stream: output each directory entry as a single JSON object, one per line.
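
The buffer-size advice for -b can be sketched as follows, reading it as "use a whole multiple of the file system block size" (an interpretation, not something the project states in those words). The block size comes from os.statvfs; round_up and aligned_buffer_size are hypothetical helper names.

```python
import os


def round_up(size, block):
    """Round size up to the next multiple of block."""
    return -(-size // block) * block  # ceiling division, then scale back


def aligned_buffer_size(path, requested=16 * 1024 * 1024):
    """Return requested rounded up to the block size of path's file system."""
    block = os.statvfs(path).f_bsize  # preferred I/O block size reported by the kernel
    return round_up(requested, block)
```

The result can be passed as the second argument of getdents() or as the value of -b on the command line.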

Exit codes

  • 3 - Requested buffer is too large.
  • 4 - PATH not found.
  • 5 - PATH is not a directory.
  • 6 - Insufficient permissions to read the contents of PATH.
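
A script that calls the CLI might translate these exit codes into messages along the following lines. This is a sketch only: describe_exit and list_names are hypothetical helpers, treating exit code 0 as success is an assumption, and the subprocess call needs python-getdents on PATH and Python 3.7 or newer.

```python
import subprocess

# Exit codes as documented in the list above.
EXIT_MESSAGES = {
    3: 'Requested buffer is too large',
    4: 'PATH not found',
    5: 'PATH is not a directory',
    6: 'Not enough permissions to read contents of PATH',
}


def describe_exit(code):
    """Map a python-getdents exit code to a message (0 assumed to mean success)."""
    if code == 0:
        return 'success'
    return EXIT_MESSAGES.get(code, 'unexpected exit code %d' % code)


def list_names(path, buffer_size=32768):
    """Run the CLI in plain mode and return the names it prints."""
    proc = subprocess.run(
        ['python-getdents', '-b', str(buffer_size), path],
        capture_output=True, text=True,
    )
    if proc.returncode != 0:
        raise RuntimeError(describe_exit(proc.returncode))
    # Plain output is one name per line; names containing newlines would be split.
    return proc.stdout.splitlines()
```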

Examples

python-getdents /path/to/large/dir
python -m getdents /path/to/large/dir
python-getdents /path/to/large/dir -o csv -b 16777216 > dir.csv
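
A file such as dir.csv from the last example could be read back with the standard csv module, roughly as below. The column order (inode, type, name) is taken from the option description; that the inode is written as a decimal integer is an assumption, and the type column is left as a string because its exact representation is not documented here. read_entries is a hypothetical helper.

```python
import csv
import io


def read_entries(stream, has_headers=False):
    """Yield (inode, type, name) tuples from csv or csv-headers output."""
    reader = csv.reader(stream)
    if has_headers:
        next(reader, None)  # skip the header row written by csv-headers
    for inode, d_type, name in reader:
        # Assumption: the inode column is a decimal integer.
        yield int(inode), d_type, name


# Stand-in for open('dir.csv', newline=''), using made-up rows.
sample = io.StringIO('12,8,a.txt\n13,4,sub\n')
entries = list(read_entries(sample))
```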


Files for getdents, version 0.3

Filename                                        Size     File type  Python version
getdents-0.3-cp35-cp35m-manylinux1_x86_64.whl   22.7 kB  Wheel      cp35
getdents-0.3-cp36-cp36m-manylinux1_x86_64.whl   23.1 kB  Wheel      cp36
getdents-0.3-cp37-cp37m-manylinux1_x86_64.whl   24.0 kB  Wheel      cp37
getdents-0.3-cp38-cp38-manylinux1_x86_64.whl    23.0 kB  Wheel      cp38
getdents-0.3.tar.gz                             6.3 kB   Source     None
