A replacement or alternative for Python's shutil.copyfile() that uses server-side copy on network shares for faster copying.

speedcopy

A patched Python shutil.copyfile that uses the native CopyFile2 call on Windows to accelerate transfers on Windows shares. On Linux, it issues the CIFS_IOC_COPYCHUNK_FILE ioctl to request a server-side copy.

This works only when both the source and destination files are on the same SMB1 (CIFS), SMB2, or SMB3 filesystem.

See https://wiki.samba.org/index.php/Server-Side_Copy
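
Because the speedup applies only when both files are on the same filesystem, a caller may want a cheap pre-check. The sketch below is one possible heuristic for POSIX systems and is not part of speedcopy: it compares the device IDs that os.stat() reports for the source file and for the destination directory. The helper name on_same_filesystem is hypothetical, and a matching device ID says nothing about whether the mount is SMB or whether the server has server-side copy enabled.

```python
import os


def on_same_filesystem(src, dst):
    """Return True if src and the directory that will hold dst share a device ID.

    Hypothetical helper, not part of speedcopy. It is only a heuristic:
    it cannot tell whether the filesystem is an SMB share.
    """
    # dst may not exist yet, so inspect the directory it will be created in.
    dst_dir = os.path.dirname(os.path.abspath(dst))
    return os.stat(src).st_dev == os.stat(dst_dir).st_dev
```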

Installation

Add speedcopy to PYTHONPATH or:

pip install speedcopy

Usage

If you want to monkeypatch shutil.copyfile() then:

import shutil
import speedcopy

speedcopy.patch_copyfile()

# your code ...
shutil.copyfile(src, dst)

After patching, the shutil.copyfile() call above is handled by speedcopy.

Direct use:

import speedcopy

# some code ...

speedcopy.copyfile(src, dst)
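
If speedcopy is an optional dependency, direct use can be wrapped so that the program still works without it. This is a minimal sketch under that assumption; copy_file and copy_impl are hypothetical names, and the only speedcopy call used is the speedcopy.copyfile(src, dst) shown above.

```python
import shutil

try:
    # speedcopy.copyfile(src, dst) is the direct entry point shown above.
    import speedcopy
    copy_impl = speedcopy.copyfile
except ImportError:
    # Assumption: falling back to the standard library is acceptable
    # when speedcopy is not installed.
    copy_impl = shutil.copyfile


def copy_file(src, dst):
    """Hypothetical wrapper: copy src to dst with whichever backend was found."""
    return copy_impl(src, dst)
```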

There is also a debug mode, enabled by setting speedcopy.SPEEDCOPY_DEBUG = True, which prints more information at runtime.

Benchmark

You can run a benchmark using the benchmark.py script. It runs copy operations with different file sizes and prints the results as a table.

Usage

The benchmark can run in two modes: multithreaded and single-threaded. In multithreaded mode, it runs multiple copy operations in parallel using multiple workers. In single-threaded mode, it runs copy operations sequentially.

Arguments:

python benchmark.py PATH [--sizes-mb SIZES_MB] [--repeats REPEATS] [--copies-per-worker COPIES_PER_WORKER] [--workers WORKERS]

  • PATH: Path to the directory where the benchmark files will be created and copied. This should be a path on an SMB/CIFS share for accurate results.
  • --sizes-mb: Comma-separated list of file sizes in MB to test (default: 1,2,4,8,16,32).
  • --repeats: Number of times to repeat each copy operation (default: 3).
  • --copies-per-worker: Number of copy operations each worker should perform in multithreaded mode (default: 2).
  • --workers: Number of worker threads to use in multithreaded mode (default: 4).

If --workers is not set or is set to 1, the benchmark runs in single-threaded mode.
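
When the benchmark is driven from another script, the command line can be assembled from the arguments listed above. The helper below is a hypothetical sketch, not part of the project: build_benchmark_command is an invented name, and it only emits flags that the caller sets explicitly, so it makes no assumption about the script's defaults.

```python
import sys


def build_benchmark_command(path, sizes_mb=None, repeats=None,
                            copies_per_worker=None, workers=None,
                            script="benchmark.py"):
    """Return an argument list for running the benchmark script.

    Hypothetical helper. Only the flags documented above are used, and a
    flag is added only when its value is given.
    """
    cmd = [sys.executable, script, str(path)]
    if sizes_mb is not None:
        # --sizes-mb takes a comma-separated list of sizes in MB.
        cmd += ["--sizes-mb", ",".join(str(size) for size in sizes_mb)]
    if repeats is not None:
        cmd += ["--repeats", str(repeats)]
    if copies_per_worker is not None:
        cmd += ["--copies-per-worker", str(copies_per_worker)]
    if workers is not None:
        cmd += ["--workers", str(workers)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True). Leaving workers unset corresponds to the single-threaded runs shown in the results.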

Windows

running with --sizes-mb 1,2,4,8,16,32 --repeats 3 --copies-per-worker 2 --workers 4

Multithreaded mode

size(MB) shutil(s) speedcopy(s) shutil(MB/s) speedcopy(MB/s) gain
1 0.202 0.087 39.6 92.2 2.33x
2 0.289 0.099 55.4 161.8 2.92x
4 0.430 0.121 74.3 263.8 3.55x
8 0.780 0.164 82.1 389.4 4.74x
16 1.476 0.247 86.7 517.3 5.97x
32 2.824 0.390 90.7 655.8 7.23x

overall gain was 5.41x
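
The gain column appears to be the ratio of the shutil time to the speedcopy time, and the overall figure is consistent with the ratio of the summed times. This reading is inferred from the numbers in the table, not confirmed from benchmark.py. The sketch below recomputes both from the rows above; small differences from the printed values are expected because the table shows times rounded to three decimals.

```python
# Rows copied from the Windows multithreaded table above:
# (size in MB, shutil time in s, speedcopy time in s).
rows = [
    (1, 0.202, 0.087),
    (2, 0.289, 0.099),
    (4, 0.430, 0.121),
    (8, 0.780, 0.164),
    (16, 1.476, 0.247),
    (32, 2.824, 0.390),
]


def gain(shutil_s, speedcopy_s):
    """Speedup factor for one row: how many times faster speedcopy was."""
    return shutil_s / speedcopy_s


def overall_gain(table):
    """Assumed definition: total shutil time divided by total speedcopy time."""
    total_shutil = sum(row[1] for row in table)
    total_speedcopy = sum(row[2] for row in table)
    return total_shutil / total_speedcopy


for size_mb, shutil_s, speedcopy_s in rows:
    print(f"{size_mb:>3} MB  gain {gain(shutil_s, speedcopy_s):.2f}x")
print(f"overall gain {overall_gain(rows):.2f}x")
```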


running with --sizes-mb 1,2,4,8,16,32 --repeats 3

Single-threaded mode

size(MB) shutil(s) speedcopy(s) shutil(MB/s) speedcopy(MB/s) gain
1 0.160 0.052 18.7 57.5 3.07x
2 0.220 0.062 27.3 97.1 3.56x
4 0.317 0.073 37.8 165.2 4.37x
8 0.554 0.121 43.3 198.6 4.58x
16 1.426 0.151 33.6 318.2 9.46x
32 2.059 0.193 46.6 497.6 10.67x

overall gain was 7.27x

Linux

running with --sizes-mb 1,2,4,8,16,32 --repeats 3 --copies-per-worker 2 --workers 4

Multithreaded mode

size(MB) shutil(s) speedcopy(s) shutil(MB/s) speedcopy(MB/s) gain
1 0.095 0.025 84.6 317.6 3.75x
2 0.172 0.025 93.0 643.8 6.92x
4 0.326 0.027 98.2 1204.7 12.27x
8 0.628 0.035 101.9 1822.1 17.88x
16 1.224 0.045 104.6 2830.2 27.07x
32 2.430 0.063 105.3 4037.1 38.32x

running with --sizes-mb 1,2,4,8,16,32 --repeats 3

Single-threaded mode

size(MB) shutil(s) speedcopy(s) shutil(MB/s) speedcopy(MB/s) gain
1 0.047 0.011 64.2 272.8 4.25x
2 0.084 0.012 71.7 496.4 6.93x
4 0.151 0.013 79.7 925.0 11.61x
8 0.281 0.014 85.3 1674.8 19.62x
16 0.529 0.018 90.7 2725.3 30.04x
32 1.029 0.025 93.3 3793.4 40.64x

macOS

Based on the measured values, there is no significant gain on macOS: about 1.05x in multithreaded mode and about 1.5x in single-threaded mode. It is possible that the file server was not configured to support server-side copy for macOS clients (Samba requires specific options for this). I tested the configuration and it should be working, but there may still be an issue with the setup.


running with --sizes-mb 1,2,4,8,16,32 --repeats 3 --copies-per-worker 2 --workers 4

Multithreaded mode

size(MB) shutil(s) speedcopy(s) shutil(MB/s) speedcopy(MB/s) gain
1 0.343 0.309 23.4 25.9 1.11x
2 0.432 0.424 37.0 37.7 1.02x
4 0.606 0.621 52.8 51.5 0.97x
8 0.940 0.940 68.0 68.1 1.00x
16 1.663 1.585 77.0 80.8 1.05x
32 3.077 2.941 83.2 87.0 1.05x

running with --sizes-mb 1,2,4,8,16,32 --repeats 3

Single-threaded mode

size(MB) shutil(s) speedcopy(s) shutil(MB/s) speedcopy(MB/s) gain
1 0.263 0.146 11.4 20.6 1.81x
2 0.301 0.182 19.9 32.9 1.65x
4 0.383 0.266 31.3 45.1 1.44x
8 0.593 0.404 40.5 59.4 1.47x
16 1.090 0.650 44.0 73.8 1.68x
32 1.910 1.225 50.2 78.4 1.56x

Note that the Windows, Linux, and macOS timings cannot be compared with each other because they were taken on different systems. Also note that these figures do not come from production-grade hardware or setups and may differ completely elsewhere.
