The FFmpeg for proprietary data file formats

Project description

Jeff the Sipper

The infamous brother of John the Ripper. The man we all need but don't deserve. Father to murdered son, husband to a murdered wife. And he will have his vengeance, in this life or the next.

I'm not sure how well this will turn out, but the package is intended to be the FFmpeg analog for the proprietary data file formats of highly specialized software systems, especially those on MS Windows that commit an extraordinary amount of sacrilege against the UNIX philosophy.

Contributing

I would love for anyone to contribute their implementation of a proprietary data file parser, giving users the option of using standardized, powerful data processing tools that may well overshadow the proprietary counterpart completely. If you feel inclined, fork the repository, implement your parser, and submit a pull request.

If you would like to implement the data parser in a different programming language, that is completely fine, but it would be best to make it usable from Python if at all possible.

AVANTES AvaSoft 8 AVS84

This is my attempt to reverse-engineer the AVS84 (i.e. ".raw8" or RAW 8) file format generated by the AVANTES AvaSoft 8 spectrometer analyzer.

My internet searches yielded almost nothing regarding the structure of the AVS84 file, so the format is hypothesized by comparing several such files against their exports to Excel, CSV, and plaintext produced by AvaSoft itself. The sole resource found regarding this format was an implementation by another GitHub user, @padmer, who extracted only the parts they cared about.

Format

As of this commit, the known format is as follows:

HEADER (328 bytes)
	SIGNATURE "AVS84" (5 bytes)
	BLOCK ??? (125 bytes)
	MYSTERY ??? (6 bytes) *(FILE UNIQUE ???)
	BLOCK ??? (188 bytes)
	MYSTERY ??? (4 bytes)
SERIES (27200 bytes ???)
	BLOCK Float32 IEEE-754 LE, "X-axis" (13600 bytes)
	BLOCK Float32 IEEE-754 LE, "Y-axis" (13600 bytes)
BLOCK ??? (13600 bytes)
NULL ??? (13614 bytes)
MYSTERY ??? (10 bytes)
CONSTANT Int32 LE = 0xc8000042 (1882 bytes)
NULL ??? (1882 bytes)
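
As a rough illustration of how the layout above could be read, here is a minimal sketch using only the standard library. It assumes the fixed sizes listed above (a 328-byte HEADER followed by two 13600-byte Float32 blocks, i.e. 3400 values per axis), which may not hold for every file. The function name `parse_avs84` and the constants are hypothetical and are not part of this package's API.

```python
import struct

# Sizes taken from the layout above; they are observations, not a specification.
HEADER_SIZE = 328
SIGNATURE = b"AVS84"
SERIES_BYTES = 13600               # bytes per axis in the files examined so far
N_POINTS = SERIES_BYTES // 4       # 3400 Float32 values per axis


def parse_avs84(data: bytes):
    """Return (x, y) tuples of floats, assuming the fixed layout above."""
    if data[:len(SIGNATURE)] != SIGNATURE:
        raise ValueError("missing AVS84 signature")
    end_of_series = HEADER_SIZE + 2 * SERIES_BYTES
    if len(data) < end_of_series:
        raise ValueError("file too short for the assumed layout")
    fmt = f"<{N_POINTS}f"          # '<' = little endian, 'f' = IEEE-754 Float32
    x = struct.unpack_from(fmt, data, HEADER_SIZE)
    y = struct.unpack_from(fmt, data, HEADER_SIZE + SERIES_BYTES)
    return x, y
```

Reading a file opened in binary mode and passing its contents to `parse_avs84` should yield the "X-axis" and "Y-axis" series, which can then be checked against a CSV export from AvaSoft.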

Observations

Data is stored in little-endian (LE) byte order, and floating-point values follow the IEEE-754 standard.


Per @padmer's implementation, the size of the data series is evidently not fixed, but my attempts to find the address that stores the series size have been unsuccessful. My presumption is that this size is an Int32 located somewhere in the HEADER.
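
One way to hunt for the size field is to scan every byte offset of the HEADER for a little-endian Int32 that matches a plausible size. The sketch below does that. The helper name `find_int32_candidates` is hypothetical, and the candidate values (3400 samples, 13600 bytes, 27200 bytes) are only guesses derived from the layout above.

```python
import struct


def find_int32_candidates(header: bytes, targets):
    """Return (offset, value) pairs where a little-endian Int32 equals a target."""
    wanted = set(targets)
    hits = []
    # Try every byte offset, since the field may not be 4-byte aligned.
    for offset in range(len(header) - 3):
        (value,) = struct.unpack_from("<i", header, offset)
        if value in wanted:
            hits.append((offset, value))
    return hits
```

Running this on the first 328 bytes of several files with different series lengths, and keeping only offsets that match in every file, would narrow down where the size might live. An empty result would suggest the size is stored in a different type or encoding.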


Most MYSTERY chunks are consistent across different data files, but the 6-byte MYSTERY chunk in the HEADER appears to change with each file. My presumption is that it is either a hash of the data series or a timestamp in some encoding. A length of 6 bytes is rather peculiar for either.
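
To check which HEADER bytes really vary between files, the headers can be compared byte by byte. The sketch below assumes the 6-byte chunk starts at offset 130 (5 bytes of SIGNATURE plus the 125-byte BLOCK, per the layout above). The names `mystery_chunk` and `varying_offsets` are hypothetical helpers, not part of this package.

```python
MYSTERY_OFFSET = 5 + 125   # after SIGNATURE and the first unknown BLOCK
MYSTERY_SIZE = 6


def mystery_chunk(data: bytes) -> bytes:
    """Return the 6-byte file-unique chunk, assuming the layout above."""
    return data[MYSTERY_OFFSET:MYSTERY_OFFSET + MYSTERY_SIZE]


def varying_offsets(headers):
    """Return the byte offsets at which the given headers are not all equal."""
    length = min(len(h) for h in headers)
    first = headers[0]
    return [
        i for i in range(length)
        if any(h[i] != first[i] for h in headers[1:])
    ]
```

If only offsets 130 through 135 show up across many files, that supports the 6-byte boundary. Printing `mystery_chunk(data).hex()` for files saved a known time apart may also hint at whether the value behaves like a counter or timestamp rather than a hash.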

Download files

Source Distribution: sipper-0.0.2.tar.gz (15.2 kB)

Built Distribution: sipper-0.0.2-py3-none-any.whl (15.4 kB, Python 3)
