# Shepherd - Data

Programming and CLI interface for the HDF5 data format of the Shepherd testbed.

This Python module eases the handling of the HDF5 recordings used by the Shepherd testbed. Users can read, validate and create files, and also extract, down-sample and plot their contents.
## Installation

### PIP - Online

```shell
pip3 install shepherd-data
```

### PIP - Offline

- clone the repository
- navigate your shell into the directory
- install the local module

```shell
git clone https://github.com/orgua/shepherd-datalib
cd shepherd-datalib
pip3 install ./
```
## Development

### PipEnv

- clone the repository
- navigate your shell into the directory
- install the environment
- activate the shell
- update pipenv (optional)
- add special packages with the `--dev` switch (optional)

```shell
git clone https://github.com/orgua/shepherd-datalib
cd shepherd-datalib
pipenv install --dev
pipenv shell
pipenv update
pipenv install --dev pytest
```
### Running the Testbench

- run pytest

```shell
pytest
```

### Code Coverage (with pytest)

- run coverage
- check the results in a browser (`./htmlcov/index.html`)

```shell
coverage run -m pytest
coverage html
# or simpler
coverage report
```
## Programming Interface

### Basic Usage (recommendation)

```python
import shepherd_data as shpd

with shpd.Reader("./hrv_sawtooth_1h.h5") as db:
    print(f"Mode: {db.get_mode()}")
    print(f"Window: {db.get_window_samples()}")
    print(f"Config: {db.get_config()}")
```
### Available Functionality

- `Reader()`
  - the file can be checked for plausibility and validity (`is_valid()`)
  - show the internal structure of the h5-file (`get_metadata()` or `save_metadata()` ... to YAML) with lots of additional data
  - access data and various converters / calculators:
    - `read_buffers()` -> generator that provides one buffer per call, can be configured on first call
    - `get_calibration_data()`
    - `get_window_samples()`
    - `get_mode()`
    - `get_config()`
    - direct access to the root h5-structure via `reader['element']`
    - converters for raw / physical units: `si_to_raw()` & `raw_to_si()`
    - `energy()` sums up the recorded power over time
    - `downsample()` (if needed)
    - visualize the recording (`plot_to_file()`)
- `Writer()`
  - inherits all functionality from the Reader
  - `append_iv_data_raw()`
  - `append_iv_data_si()`
  - `set_config()`
  - `set_window_samples()`
- IVonne Reader
  - `convert_2_ivcurves()` converts an IVonne recording into shepherd ivcurves
  - `upsample_2_isc_voc()` TODO: for now an upsampled but unusable version of samples of short-circuit current and open-circuit voltage
  - `convert_2_ivsamples()` already applies a simple harvesting algorithm and creates ivsamples
- `./examples/`
  - `example_convert_ivonne.py` converts an IVonne recording (`jogging_10m.iv`) to shepherd ivcurves, NOTE: slow implementation
  - `example_extract_logs.py` analyzes all files in a directory, saves the logging data and calculates cpu-load and data-rate
  - `example_generate_sawtooth.py` uses the Writer to generate a 60 s ramp with 1 h repetition and uses the Reader to dump the metadata of that file
  - `example_plot_traces.py` demos some matplotlib plots with various zoom levels
  - `example_repair_recordings.py` makes old recordings from shepherd 1.x fit for v2
  - `jogging_10m.iv` - 50 Hz measurement with short-circuit current and two other parameters, recorded with "IVonne"
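The `si_to_raw()` / `raw_to_si()` converters map integer ADC counts to physical units via calibration data. A minimal self-contained sketch of that idea follows; the gain and offset values are made up for illustration and are not shepherd's actual calibration:

```python
# Illustration of the raw <-> SI conversion idea behind si_to_raw()/raw_to_si().
# GAIN/OFFSET are hypothetical values, not the library's calibration data.
GAIN = 3e-9     # e.g. ~3 nA per LSB for a current channel (assumed)
OFFSET = 0.0

def raw_to_si(raw: int) -> float:
    """Convert an integer raw reading to a physical (SI) value."""
    return raw * GAIN + OFFSET

def si_to_raw(value: float) -> int:
    """Convert a physical (SI) value back to the integer raw domain."""
    return round((value - OFFSET) / GAIN)

print(si_to_raw(raw_to_si(1000)))  # round-trips back to 1000
```

In the library the calibration values come from the file itself (`get_calibration_data()`), so the conversion stays consistent with the recording hardware.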
## CLI-Interface

After installing the module, the datalib offers some often-needed functionality on the command line:
### Validate Recordings

- takes a file or directory as argument

```shell
shepherd-data validate dir_or_file
# examples:
shepherd-data validate ./
shepherd-data validate hrv_saw_1h.h5
```
### Extract IV-Samples to CSV

- takes a file or directory as argument
- can take a down-sample factor as argument

```shell
shepherd-data extract [-f ds-factor] [-s separator_symbol] dir_or_file
# examples:
shepherd-data extract ./
shepherd-data extract -f 1000 -s ";" hrv_saw_1h.h5
```
### Extract Meta-Data and Sys-Logs

- takes a file or directory as argument

```shell
shepherd-data extract-meta dir_or_file
# examples:
shepherd-data extract-meta ./
shepherd-data extract-meta hrv_saw_1h.h5
```
### Plot IVSamples

- takes a file or directory as argument
- can take start- and end-time as arguments
- can take image width and height as arguments

```shell
shepherd-data plot [-s start_time] [-e end_time] [-w plot_width] [-h plot_height] [--multiplot] dir_or_file
# examples:
shepherd-data plot --multiplot ./
shepherd-data plot -s10 -e20 hrv_saw_1h.h5
```
### Downsample IVSamples (for later GUI usage, TODO)

- generates a set of downsamplings (20 kHz to 0.1 Hz in x4 to x5 steps)
- takes a file or directory as argument
- can take a down-sample factor or sample rate as argument

```shell
shepherd-data downsample [-f ds-factor] [-r sample-rate] dir_or_file
# examples:
shepherd-data downsample ./
shepherd-data downsample -f 1000 hrv_saw_1h.h5
shepherd-data downsample -r 100 hrv_saw_1h.h5
```
## Data-Layout and Design Choices

Details about the file structure can be found in the main project.

TODO:

- update design of file
- data dtype, mode, ...
### Modes and Datatypes

- Mode `harvester` recorded a harvesting source like solar with one of various algorithms
  - Datatype `ivsample` is directly usable by shepherd, input for the virtual source / converter
  - Datatype `ivcurve` is directly usable by shepherd, input for a virtual harvester (output are ivsamples)
  - Datatype `isc_voc` is specifically for solar cells and needs to be (at least) transformed into ivcurves later
- Mode `emulator` replayed a harvester recording through a virtual converter and supplied a target while recording its power consumption
  - Datatype `ivsample` is the only output of this mode
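The mode/datatype rules above form a small lookup table. A hypothetical helper sketching that mapping (the names here are illustrative and not the library's actual API):

```python
from enum import Enum

# Illustrative mapping of the mode/datatype rules listed above;
# class and variable names are made up, not shepherd's API.
class Mode(Enum):
    HARVESTER = "harvester"
    EMULATOR = "emulator"

VALID_DATATYPES = {
    Mode.HARVESTER: {"ivsample", "ivcurve", "isc_voc"},
    Mode.EMULATOR: {"ivsample"},
}

def is_valid_datatype(mode: Mode, datatype: str) -> bool:
    """Check whether a datatype is allowed for a given mode."""
    return datatype in VALID_DATATYPES[mode]

print(is_valid_datatype(Mode.EMULATOR, "ivcurve"))  # False
```

A mechanism like this could back the "datatype-hint in h5-file" task mentioned under Open Tasks below.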
### Compression & Beaglebone

- supported are uncompressed, lzf, and gzip with level 1 (in order of recommendation)
- lzf seems better suited due to its lower load; if space isn't a constraint, use uncompressed (`None` as argument)
- note: lzf seems to cause trouble with some third-party hdf5 tools
- compression is a heavy load for the BeagleBone, but it got more performant with recent Python versions
- size experiment A: 24 h of ramping / sawtooth (data is repetitive with a 1-minute ramp)
  - gzip-1: 49'646 MiB -> 588 KiB/s
  - lzf: 106'445 MiB -> 1262 KiB/s
  - uncompressed: 131'928 MiB -> 1564 KiB/s
- cpu-load experiments (input is a 24 h sawtooth, Python 3.10 with the most recent libs as of 2022-04)
  - warning: gpio-traffic and other logging data can cause lots of load

```
emu_120s_gz1_to_gz1.h5 -> emulator, cpu_util [%] = 65.59, data-rate = 352.0 KiB/s
emu_120s_gz1_to_lzf.h5 -> emulator, cpu_util [%] = 57.37, data-rate = 686.0 KiB/s
emu_120s_gz1_to_unc.h5 -> emulator, cpu_util [%] = 53.63, data-rate = 1564.0 KiB/s
emu_120s_lzf_to_gz1.h5 -> emulator, cpu_util [%] = 63.18, data-rate = 352.0 KiB/s
emu_120s_lzf_to_lzf.h5 -> emulator, cpu_util [%] = 58.60, data-rate = 686.0 KiB/s
emu_120s_lzf_to_unc.h5 -> emulator, cpu_util [%] = 55.75, data-rate = 1564.0 KiB/s
emu_120s_unc_to_gz1.h5 -> emulator, cpu_util [%] = 63.84, data-rate = 351.0 KiB/s
emu_120s_unc_to_lzf.h5 -> emulator, cpu_util [%] = 57.28, data-rate = 686.0 KiB/s
emu_120s_unc_to_unc.h5 -> emulator, cpu_util [%] = 51.69, data-rate = 1564.0 KiB/s
```
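Why gzip-1 shrinks this data so well can be illustrated with the stdlib `zlib` module (gzip's DEFLATE algorithm) on a synthetic ramp; this is only an illustration of the effect on repetitive data, not the h5py chunk pipeline used on the BeagleBone:

```python
import zlib

# 1 MiB of repetitive sawtooth-like bytes, similar in spirit to
# the ramping data of size-experiment A above.
raw = bytes(range(256)) * 4096
gz1 = zlib.compress(raw, level=1)  # gzip level 1, as recommended above

print(f"compression ratio: {len(raw) / len(gz1):.1f}x")
```

Real recordings are noisier than this toy ramp, so the ratios in experiment A (~2.7x for gzip-1 vs. uncompressed) are far smaller than what a perfect ramp achieves.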
## Release Procedure

- increase the version number in `__init__.py`
- install and run pre-commit for QA checks, see steps below
- every commit gets automatically tested by GitHub
- put together a release on GitHub - the tag should match the current version number
- GitHub automatically pushes the release to PyPI

```shell
pip3 install pre-commit
pre-commit run --all-files
```
## Open Tasks

- click progressbar -> could replace tqdm
- implementations for this lib
  - generalize up- and down-sampling, use `out_sample_rate` instead of `ds-factor`
    - lib `samplerate` (tested) -> promising, but designed for float32 and a range of +-1.0
    - lib `resampy` (tested) -> could be problematic with the slice-iterator
    - https://stackoverflow.com/questions/29085268/resample-a-numpy-array
    - scipy.signal.resample, https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample.html
    - scipy.signal.decimate, https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.decimate.html
    - scipy.signal.resample_poly, https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample_poly.html
    - timestamps could be regenerated with `np.arange(tmin, tmax, 1e9 / samplerate)`
  - generalize converters (currently in IVonne)
    - isc&voc <-> ivcurve
    - ivcurve -> ivsample
  - plotting and downsampling for ivcurves
  - more generalized plotting (power, cpu-util, ..., if IV then offer power as well)
  - some metadata is calculated wrong (non-scalar datasets)
  - unittests & code coverage -> 79% with v22.5.4, https://pytest-cov.readthedocs.io/en/latest/config.html
    - test example: https://github.com/kvas-it/pytest-console-scripts
    - use coverage to test some edge-cases
  - sub-divide `valid()` into `healthy()`
  - add gain/factor to time, with repair-code
  - add https://pypi.org/project/nessie-recorder/#files
- main shepherd-code
  - proper validation first
  - update commentary
  - pin-description should be in YAML (and other descriptions for cpu, io, ...)
  - datatype-hint in the h5-file (ivcurve, ivsample, isc_voc), add a mechanism to prevent misuse
  - hostname for emulation
  - full and minimal config into the h5-file
  - use the datalib as a base
- isc-voc-harvesting
  - directly process isc-voc -> resample to ivcurve?
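The timestamp-regeneration idea from the resampling task above can be sketched with numpy; the sample rate and time window are made-up example values:

```python
import numpy as np

# Regenerate equidistant nanosecond timestamps after resampling,
# as suggested in the task list above (values are illustrative).
samplerate = 100_000                  # Hz, assumed output sample rate
tmin, tmax = 0, 1_000_000_000         # ns, a 1-second window
timestamps = np.arange(tmin, tmax, 1e9 / samplerate)

print(len(timestamps))  # one timestamp per output sample -> 100000
```

Regenerating timestamps this way keeps them exactly equidistant, which interpolation of the original timestamps would not guarantee.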