Tomate
Tool to Manipulate and Aggregate data
Tomate is a Python package that provides ways to manipulate data in the form of a multi-dimensional array. It manages multiple variables, as well as the coordinates along which the data varies. It also provides convenience functions to retrieve subparts of the data, do computations, or plot the data.
The data can be retrieved from disk, where it can be arranged in multiple ways and formats. Information on the data, such as variable attributes or coordinate values, can be retrieved automatically.
Features
For data in memory:
- Keep information about the data, the variables, the coordinates. All this information is in sync with the data.
- Select subparts of data easily.
- Use and create convenience functions for analysis, plotting, and more.
For data on disk:
- Easily load data that spans multiple files and comes from different sources. Different file formats? Different structures (rows or columns first)? A lower or upper indexing origin? A varying number of time steps in each file? This is now all a breeze!
- Scan the files automatically to find coordinate values, variable attributes, data indexing, and more.
- Load only subparts of data.
- Logs let you check that you are loading what you intend to load.
And in general:
- Highly modular, can be tailored to your needs.
- Fully documented.
As of now, only NetCDF files are supported out of the box, but the package can easily be extended to other file formats. See the section 'Expanding the package' of the documentation.
Only tested on Linux; it should work on other operating systems.
See examples for use cases.
Warning
The code has not been extensively tested for all the possible use cases it supports, and is evolving quickly. I recommend you check thoroughly in the logs that the correct files are opened, and that the correct slices of data are taken from those files. See the documentation on logging for more information.
The features supplied in 'data_write', which save information about a database in a JSON file to avoid re-scanning it each time, should be considered very experimental.
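As a minimal sketch of how such a check could start, the snippet below turns on verbose output using only Python's standard `logging` module. It assumes that Tomate emits its messages through standard-library loggers that propagate to the root logger; the helper name `enable_verbose_logs` is hypothetical and not part of Tomate. The documentation on logging remains the reference for what the package actually provides.

```python
import logging


def enable_verbose_logs(level=logging.DEBUG):
    """Make the root logger print messages at `level` and above.

    Hypothetical helper, not part of Tomate. It relies on the standard
    behaviour that library loggers propagate records to the root logger.
    """
    root = logging.getLogger()
    if not root.handlers:
        # No handler configured yet: print records to stderr.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(name)s %(levelname)s: %(message)s")
        )
        root.addHandler(handler)
    root.setLevel(level)
    return root


root_logger = enable_verbose_logs()
```

Calling this before loading data would make any file-opening or slicing messages visible, provided the assumption above holds.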
Documentation
Documentation is available online at ReadTheDocs.
Requirements
Tomate requires Python >= 3.7. From this version on, dictionaries preserve the order in which keys are added. The code relies heavily on this feature. This could be avoided, but would require a fair bit of refactoring.
Tomate requires the following Python packages:
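The ordering guarantee can be illustrated with a short snippet. The dictionary and its keys are made up for the example and do not come from Tomate's code.

```python
# Since Python 3.7, the language guarantees that a dict keeps its keys
# in insertion order. The coordinate names below are illustrative only.
coords = {}
coords["time"] = 0
coords["lat"] = 1
coords["lon"] = 2

# Iterating over the dict yields the keys in the order they were added.
dim_order = list(coords)
print(dim_order)  # ['time', 'lat', 'lon']
```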
- numpy
Optional dependencies:
- [time] cftime>=1.1.3 - To manage dates in time coordinates
- [netcdf] netcdf4-python - To open netCDF4 files
- [plot] matplotlib - To create plots easily
- [compute] scipy - To do various computations on the data
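To see which optional dependencies are present in the current environment, a small check along these lines may help. It is a sketch using only the standard library; the mapping uses the usual import names of the packages listed above (`cftime`, `netCDF4`, `matplotlib`, `scipy`), and the function `available_extras` is hypothetical, not something Tomate provides.

```python
from importlib.util import find_spec

# Feature name (as listed above) -> top-level module to look for.
OPTIONAL_MODULES = {
    "time": "cftime",
    "netcdf": "netCDF4",
    "plot": "matplotlib",
    "compute": "scipy",
}


def available_extras():
    """Return {feature name: True if its module can be found}."""
    return {
        extra: find_spec(module) is not None
        for extra, module in OPTIONAL_MODULES.items()
    }


for extra, found in available_extras().items():
    print(f"{extra}: {'installed' if found else 'missing'}")
```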
Install
The package is distributed through PyPI. To install, run:
pip install tomate-data
To add optional dependencies:
pip install tomate-data[feature name]
Feature name can be Time, NetCDF, Plot, or Compute.
The code is evolving quickly, so it is recommended to upgrade it regularly:
pip install --upgrade tomate-data
Alternatively, install it directly from the development branch. This places the package files in ./src, from where a git pull is enough to update to the latest commit:
pip install -e git+https://github.com/Descanonge/tomate.git@develop#egg=tomate-data