A Python interface to splice(2)

A Python interface to the splice(2) system call.

About

splice(2) moves data between two file descriptors without copying between kernel address space and user address space. It transfers up to nbytes bytes of data from the file descriptor in to the file descriptor out.

zero-copy

Normally, when you copy data from one data stream to another, the data is first read into a buffer in user space and then written from that buffer to the target stream, which introduces overhead.

Zero-copy lets us operate on data without copying it to user space. It essentially transfers the data by remapping pages rather than actually copying it, resulting in improved performance.

Illustrated below is a simple example of copying data from one file to another using the splice(2) system call. For the complete documentation see API Documentation.

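# copy data from one file to another using splice

from splice import splice

# splice works on file descriptors, so pass the fileno() of each open file;
# the with block closes both files once the copy is done
with open("read.txt") as to_read, open("write.txt", "w+") as to_write:
    splice(to_read.fileno(), to_write.fileno())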

Copying the data twice (once into the userland buffer and once out of it) imposes performance and resource penalties. The splice(2) syscall avoids these penalties by not using userland buffers at all. It also needs a single system call (and thus only one context switch), rather than the series of read(2) / write(2) system calls, each requiring a context switch, that a conventional copy uses internally.
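For contrast, the conventional copy path described above can be sketched with the standard library alone. This is only an illustration of the read(2) / write(2) loop that splice(2) is meant to replace; the function name copy_with_read_write and the chunk_size default are hypothetical and not part of this package.

```python
import os


def copy_with_read_write(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path through a userspace buffer.

    Returns the number of bytes copied. Every chunk crosses the
    kernel/user boundary twice, which is the overhead splice(2) avoids.
    """
    total = 0
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src_fd, chunk_size)  # copy 1: kernel -> user space
                if not chunk:  # empty bytes object means end of file
                    break
                view = memoryview(chunk)
                while view:  # os.write may write fewer bytes than requested
                    written = os.write(dst_fd, view)  # copy 2: user space -> kernel
                    view = view[written:]
                    total += written
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return total
```

Each loop iteration costs at least two system calls, so a large file needs many of them, whereas the splice-based copy hands the whole transfer to the kernel.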

API Documentation

The splice module provides a single function: splice().

  • splice.splice(in, out, offset, nbytes, flags)

    Copy nbytes bytes from file descriptor in (a regular file) to file descriptor out (a regular file), starting at offset. Return the number of bytes copied. When the end of file is reached, return 0. If offset is not specified, the bytes are read from the current position of in and the position of in is updated. If nbytes is not specified, the whole of in is copied over to out.

    Required arguments

    • in: file descriptor of the file from which data is to be read.

    • out: file descriptor of the file to which data is to be transferred.

    Positional optional arguments

    • offset: position in the input file from which reading starts.

    • nbytes: total number of bytes to be copied; by default, the whole of in is copied.

    • flags: a bit mask which can be composed by ORing together the following.

      • splice.SPLICE_F_MOVE

      • splice.SPLICE_F_NONBLOCK

      • splice.SPLICE_F_MORE

      • splice.SPLICE_F_GIFT

    More information on what each flag means can be found on the splice(2) man page.
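As a small illustration of how such a bit mask is composed and inspected, the sketch below uses the numeric values that Linux assigns to these flags in its headers. The constants are redefined locally so the snippet runs without the package; it is an assumption that splice.SPLICE_F_* carry the same values, and has_flag is a hypothetical helper.

```python
# Values as defined by Linux for splice(2); assumed (not verified) to
# match the splice.SPLICE_F_* constants exposed by this package.
SPLICE_F_MOVE = 0x01
SPLICE_F_NONBLOCK = 0x02
SPLICE_F_MORE = 0x04
SPLICE_F_GIFT = 0x08


def has_flag(mask, flag):
    """Return True if every bit of flag is set in mask."""
    return mask & flag == flag


# OR the individual flags together to build the mask passed as `flags`.
flags = SPLICE_F_NONBLOCK | SPLICE_F_MORE

print(has_flag(flags, SPLICE_F_MORE))  # True
print(has_flag(flags, SPLICE_F_GIFT))  # False
```

With the package installed, the equivalent mask would be written as splice.SPLICE_F_NONBLOCK | splice.SPLICE_F_MORE.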

Usage

>>> from splice import splice

# init file objects
>>> to_read = open("read.txt") # file to read from
>>> to_write = open("write.txt", "w+") # file to write to

>>> import os
>>> os.path.getsize("read.txt")  # size in bytes; does not move the read position
50

# copying whole file
>>> splice(to_read.fileno(), to_write.fileno())
50  # bytes copied

# copying file starting from an offset
>>> splice(to_read.fileno(), to_write.fileno(), offset=10)
40

# copying certain amount of bytes
>>> splice(to_read.fileno(), to_write.fileno(), nbytes=20)
20

# copying certain amount of bytes beginning from an offset
>>> splice(to_read.fileno(), to_write.fileno(), offset=10, nbytes=20)
20

# specifying flags
>>> import splice  # rebinds the name to the module, so call splice.splice()
>>> splice.splice(to_read.fileno(), to_write.fileno(), flags=splice.SPLICE_F_MORE)
50
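
The byte counts in the session above follow from simple arithmetic on the file size, offset and nbytes. The sketch below models that arithmetic only; expected_bytes_copied is a hypothetical helper, not part of the package, and it assumes the read position of the input file is at the start when no offset is given.

```python
def expected_bytes_copied(file_size, offset=0, nbytes=None):
    """Model how many bytes a copy should report for the given arguments."""
    remaining = max(file_size - offset, 0)  # bytes available after the offset
    if nbytes is None:
        return remaining  # no limit: copy everything that is left
    return min(nbytes, remaining)  # never more than what is left


print(expected_bytes_copied(50))                        # 50
print(expected_bytes_copied(50, offset=10))             # 40
print(expected_bytes_copied(50, nbytes=20))             # 20
print(expected_bytes_copied(50, offset=10, nbytes=20))  # 20
```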

Why would I use this?

splice(2) is expected to perform better than traditional read/write methods because it avoids the overhead of copying the data to user address space and instead does the transfer by remapping pages in kernel address space.

Supported platforms

The splice(2) system call is Linux-specific.
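
Code that also has to run on other platforms may want to guard the call. The following is a minimal sketch of one way to do that, assuming the package is importable as splice on Linux; splice_available and copy_file are hypothetical names, and the fallback uses shutil.copyfileobj from the standard library.

```python
import shutil
import sys


def splice_available(platform=None):
    """Return True on Linux, the only platform where splice(2) exists."""
    platform = sys.platform if platform is None else platform
    return platform.startswith("linux")


def copy_file(src_path, dst_path):
    """Copy src_path to dst_path; return which code path did the work."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if splice_available():
            try:
                from splice import splice  # the package described here
            except ImportError:
                pass  # package not installed: use the fallback below
            else:
                splice(src.fileno(), dst.fileno())
                return "splice"
        # Portable fallback: an ordinary buffered copy through user space.
        shutil.copyfileobj(src, dst)
        return "fallback"
```

Returning the name of the path taken makes it easy to log or assert which branch ran on a given machine.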

Support

Feel free to add improvements, report issues or contact me about anything related to the project.

LICENSE

GNU GPL
