==========================================
fmio - Fast stream I/O for numeric matrices
==========================================
If you have ever piped large CSV or TSV numeric data between
scripts, you may have noticed how much time is spent parsing
strings rather than performing actual computation.

``fmio`` is a simple, compressed binary format and Python library
for reading and writing matrices, defined here as 2D numeric data
with row and column names, analogous to pandas DataFrames that
accept only numeric data.
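
As a rough illustration of that definition, the sketch below builds a small matrix as a pandas DataFrame holding only numeric values and writes it out as TSV text. The row and column names and the TSV layout (a header row, with row names in the first column) are assumptions made for illustration; they are not taken from the fmio documentation.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical row and column names; any unique labels would do.
matrix = pd.DataFrame(
    np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
    index=["row_a", "row_b"],
    columns=["x", "y", "z"],
)

# Every column is numeric, which is what separates a matrix from a
# general DataFrame in the sense used above.
all_numeric = all(pd.api.types.is_numeric_dtype(t) for t in matrix.dtypes)

# Serialize to TSV text (assumed layout: header row, row names first).
buffer = io.StringIO()
matrix.to_csv(buffer, sep="\t")
tsv_text = buffer.getvalue()
print(tsv_text)
```

Text like this is what every downstream script has to re-parse value by value, which is the overhead a binary format avoids.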
Installation
============
The only dependencies are numpy and pandas.

.. code-block:: bash

    $ pip install fmio
Usage
=====
From the command line, you can serialize and deserialize fmio
matrices.

.. code-block:: bash

    $ fmio < in.tsv > out.fmio
    $ fmio -dc < out.fmio
    <same as input>
The real purpose is to perform fast reads from within Python:

``run.py``

.. code-block:: python

    import fmio, sys

    # Print each row's name and sum as rows stream in from standard input.
    with fmio.Reader(sys.stdin) as h:
        for r in h:
            print(r.name, r.sum())

.. code-block:: bash

    $ python run.py < out.fmio
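
For comparison, a text-based equivalent of ``run.py`` might look like the sketch below. It uses pandas rather than fmio, so every value is parsed from a string, which is the cost the binary format is meant to avoid. The function name ``row_sums_from_tsv`` and the TSV layout (header row, row names in the first column) are assumptions for illustration.

```python
import io

import pandas as pd


def row_sums_from_tsv(handle):
    """Parse a TSV matrix and return (row name, row sum) pairs."""
    # Assumed layout: header row of column names, row names in column 0.
    frame = pd.read_csv(handle, sep="\t", index_col=0)
    return [(name, float(row.sum())) for name, row in frame.iterrows()]


# A tiny in-memory stand-in for a TSV file piped on standard input.
sample = io.StringIO("\tx\ty\nrow_a\t1.0\t2.0\nrow_b\t3.0\t4.5\n")
for name, total in row_sums_from_tsv(sample):
    print(name, total)
```

Unlike the streaming loop in ``run.py``, this version loads the whole table before producing any output.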
Warnings
========
The file format stores data in the machine's native byte order.
Although almost all modern processors are little-endian, files
written on one architecture may not be readable on a machine with
a different byte order.
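
The following sketch shows why native byte order affects portability. It uses plain numpy and does not reproduce the fmio file layout; the sample values are arbitrary.

```python
import sys

import numpy as np

values = np.array([1.0, 2.5], dtype="=f8")  # "=" requests native byte order
raw = values.tobytes()

# Reading the bytes back with the writer's byte order recovers the data.
native_code = "<" if sys.byteorder == "little" else ">"
same_order = np.frombuffer(raw, dtype=native_code + "f8")

# Reading them with the opposite byte order yields different numbers.
other_code = ">" if native_code == "<" else "<"
swapped_order = np.frombuffer(raw, dtype=other_code + "f8")

print(sys.byteorder, same_order, swapped_order)
```

A reader on a machine with the opposite byte order is in the position of ``swapped_order`` unless the format records the byte order or the reader converts explicitly.
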
The library is still in development. The file format is mostly
stable but still subject to change. Don't use this for long-term
data storage.
License
=======
AGPLv3