Skip to main content

eXPeditious Data Transfer

Project description

xpdt: eXPeditious Data Transfer

PyPI version

About

xpdt is (yet another) language for defining data-types and generating code for serializing and deserializing them. It aims to produce code with little or no overhead and is based on fixed-length representations which allows for zero-copy deserialization and (at-most-)one-copy writes (source to buffer).

The generated C code, in particular, is highly optimized and often permits the elimination of data-copying for writes and enables optimizations such as loop-unrolling for fixed-length objects. This can lead to read speeds in excess of 500 million objects per second (~1.8 nsec per object).

Examples

The xpdt source language looks similar to C struct definitions:

struct timestamp {
	u32	tv_sec;
	u32	tv_nsec;
};

struct point {
	i32	x;
	i32	y;
	i32	z;
};

struct line {
	timestamp	time;
	point		line_start;
	point		line_end;
	bytes		comment;
};

Fixed width integer types from 8 to 128 bit are supported, along with the bytes type, which is a variable-length sequence of bytes.

Target Languages

The following target languages are currently supported:

  • C
  • Python

The C code is very highly optimized.

The Python code is about as well optimized for CPython as I can make it. It uses typed NamedTuple for objects, which has some small overhead over regular tuples, and it uses struct.Struct to do the packing/unpacking. I have also code-golfed the generated bytecodes down to what I think is minimal given the design constraints. As a result, performance of the pure Python code is comparable to a JSON library implemented in C or Rust.

For better performance in Python, it may be desirable to develop a Cython target. In some instances CFFI structs may be more performant since they can avoid the creation/destruction of an object for each record.

Target languages are implemented purely as jinja2 templates.

Serialization format

The serialization format for fixed-length objects is simply a packed C struct.

For any object which contains bytes type fields:

  • a 32bit unsigned record length is prepended to the struct
  • all bytes type fields are converted to u32 and contain the length of the bytes
  • all bytes contents are appended after the struct in the order in which they appear

For example, following the example above, the serialization would be:

u32 tot_len # = 41
u32 time.tv_sec
u32 time.tv_usec
i32 line_start.x
i32 line_start.y
i32 line_start.z
i32 line_end.x
i32 line_end.y
i32 line_end.z
u32 comment # = 5
u8 'H'
u8 'e'
u8 'l'
u8 'l'
u8 'o'

Features

The feature-set is, as of now, pretty slim.

There are no array / sequence / map types, and no keyed unions.

Support for such things may be added in future provided that suitable implementations exist. An implementation is suitable if:

  • It admits a zero (or close to zero) overhead implementation
  • it causes no overhead when the feature isn't being used

License

The compiler is released under the GPLv3.

The C support code/headers are released under the MIT license.

The generated code is yours.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xpdt-0.4.0.tar.gz (37.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

xpdt-0.4.0-py3-none-any.whl (42.1 kB view details)

Uploaded Python 3

File details

Details for the file xpdt-0.4.0.tar.gz.

File metadata

  • Download URL: xpdt-0.4.0.tar.gz
  • Upload date:
  • Size: 37.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for xpdt-0.4.0.tar.gz
Algorithm Hash digest
SHA256 e5995cd828b5c343439ad5b82ef771f3e9e5a5add33b408696088d996ec240da
MD5 5d6cfcc463337bc6b6f57018d73374e8
BLAKE2b-256 8bbd296cf1443ea7d2c90f77ced5a9492fb7c24202161df45c312a948a3d1222

See more details on using hashes here.

File details

Details for the file xpdt-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: xpdt-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 42.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for xpdt-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fdd5f8e8279b25b6c419f0cd6957a771ab8d0190c0b0c2fba1c2c4c823062aef
MD5 d936f8118762bc83e171c23b4c53687c
BLAKE2b-256 3cc3b5663bde39314830478e03baba3df40135e133d7b56b5722e501481223ca

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page