Library to generate quicklooks and data quality checks on Helikite campaigns

helikite-data-processing

This library supports Helikite campaigns by unifying field-collected data, generating quicklooks, and performing quality control on instrument recordings. It is available on PyPI, can be used through a command-line interface (CLI), and can optionally run in a Docker container.

Table of Contents

Getting Started
Using the Library
Cleaner
Documentation & Examples
Command-line Usage
Development
    1. The Instrument class
    2. Adding more instruments
Configuration
    1. Application constants
    2. Runtime configuration

Getting Started

Pip Installation

Helikite is published on PyPI. To install it via pip, run:

```
pip install helikite-data-processing
```

After installation, the CLI is available as a system command:

```
helikite --help
```

Docker

Note: Docker usage is now optional. For most users, installing via pip is the recommended approach.

Building and Running with Docker

Build the Docker image:
```
docker build -t helikite .
```

Generate the project folders and create the configuration file:
```
docker run \
    -v ./inputs:/app/inputs \
    -v ./outputs:/app/outputs \
    helikite:latest generate_config
```

Preprocess the input files and update the configuration file:
```
docker run \
    -v ./inputs:/app/inputs \
    -v ./outputs:/app/outputs \
    helikite:latest preprocess
```

Process the data and generate plots:
```
docker run \
    -v ./inputs:/app/inputs \
    -v ./outputs:/app/outputs \
    helikite:latest
```

You can also use the pre-built image from GitHub Packages:
```
docker run \
    -v ./inputs:/app/inputs \
    -v ./outputs:/app/outputs \
    ghcr.io/eerl-epfl/helikite-data-processing:latest generate_config
```

Makefile

The Makefile provides simple commands for common tasks:

```
make build             # Build the Docker image
make generate_config   # Generate the configuration file in the inputs folder
make preprocess        # Preprocess data and update the configuration file
make process           # Process data and generate plots (output goes into a timestamped folder)
```

Using the Library

Helikite can be used both as a standalone CLI tool and as an importable Python package. For non-programmers, the CLI is the simplest way to use the library. For programmers, the library can be imported and used in your own scripts:

```
import helikite
from helikite.processing import preprocess, sorting
from helikite.constants import constants

# For example, to generate a configuration file programmatically:
preprocess.generate_config()
```

A complete list of available functions and modules is documented on the auto-published documentation site.

Cleaner

The cleaner module is designed to tidy up output folders generated by the application. For instructions on how to use it, refer to the Level 0 notebook.

Documentation & Examples

For full API documentation, usage examples, and tutorials, please visit the Helikite Data Processing Documentation.

The notebooks folder also contains a Level 0 processing example that demonstrates how to use the library for basic data processing tasks.

Command-line Usage

Once installed (via pip or Docker), you can use the CLI to run the three main stages of the application:

Generate a configuration file: This creates a config file in your inputs folder.
```
helikite generate-config
```
Preprocess: Scans the input folder, associates raw instrument files to configurations, and updates the config file.
```
helikite preprocess
```
Process: Processes the input data based on the configuration, normalizes timestamps, and generates plots. (Running without any command runs this stage.)
```
helikite
```

For detailed help on any command, append --help (e.g., helikite preprocess --help).
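
The three stages can also be driven from a script. The following is a minimal sketch, assuming the `helikite` executable is on the PATH and that the subcommand names are exactly those listed above; `stage_commands` and `run_pipeline` are hypothetical helper names, not part of the library.

```python
import subprocess

# Subcommand names are taken from the list above; an empty string means
# "no subcommand", which runs the processing stage.
STAGES = ["generate-config", "preprocess", ""]


def stage_commands(executable: str = "helikite") -> list:
    """Build one argv list per stage, in the order the stages should run."""
    return [[executable, stage] if stage else [executable] for stage in STAGES]


def run_pipeline(executable: str = "helikite") -> None:
    """Run each stage in turn and stop at the first failure."""
    for argv in stage_commands(executable):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(argv, check=True)


# Show the commands that would be executed, without running them.
for argv in stage_commands():
    print(" ".join(argv))
```

Calling `run_pipeline()` executes the stages back to back. If the configuration file needs manual edits between stages, run the commands one at a time instead.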

Development

The Instrument class

The structure of the Instrument class allows specific data cleaning activities to be overridden for each instrument that inherits from it. The main application (in helikite.py) calls these class methods to process the data.
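
The override pattern can be illustrated in isolation. This sketch uses a stand-in `Instrument` base class written for the example; the method name `clean`, the `Barometer` subclass, and the `process` function are hypothetical and do not reflect the library's actual API.

```python
class Instrument:
    """Stand-in base class for illustration only (not helikite's class)."""

    name = "generic"

    def clean(self, rows: list) -> list:
        # Default behaviour: pass the rows through unchanged.
        return rows


class Barometer(Instrument):
    """Hypothetical instrument that overrides the default cleaning step."""

    name = "barometer"

    def clean(self, rows: list) -> list:
        # Drop rows with a missing pressure reading.
        return [row for row in rows if row.get("pressure") is not None]


def process(instrument: Instrument, rows: list) -> list:
    # The caller only knows the base-class interface; the subclass decides
    # what cleaning actually happens.
    return instrument.clean(rows)


rows = [{"pressure": 1013.2}, {"pressure": None}]
print(process(Instrument(), rows))  # both rows are kept
print(process(Barometer(), rows))   # only the row with a reading is kept
```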

Adding more instruments

The configuration file is generated during the generate_config/preprocess steps by iterating over the instantiated classes imported in helikite/instruments/__init__.py. To add a new instrument, create a subclass of Instrument and import it in __init__.py.

First, the class must inherit from Instrument and set a unique name in its constructor (e.g., for the MCPC instrument):
```
# Inside the Instrument subclass for the MCPC:
def __init__(self, *args, **kwargs) -> None:
    super().__init__(*args, **kwargs)
    self.name = 'mcpc'
```

At a minimum, the subclass must implement the following methods:

file_identifier(): Accepts the first 50 lines of a CSV file and returns True if they match the instrument’s criteria (typically by checking the header content).
```
# Example for the pico instrument:
def file_identifier(self, first_lines_of_csv) -> bool:
    # A pico file is recognised by this run of column names on its first line.
    return (
        "win0Fit0,win0Fit1,win0Fit2,win0Fit3,win0Fit4,win0Fit5,win0Fit6,"
        "win0Fit7,win0Fit8,win0Fit9,win1Fit0,win1Fit1,win1Fit2"
    ) in first_lines_of_csv[0]
```
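
To show how such an identifier might be used when matching files to instruments, the sketch below reads the first 50 lines of a file and asks each candidate instrument whether it recognises them. The `HeaderInstrument` stand-in class, the `read_first_lines` and `identify` helpers, and the sample file content are assumptions made for this example; only the 50-line limit and the `file_identifier` signature come from the text above.

```python
import io
from itertools import islice
from typing import Optional


def read_first_lines(handle, limit: int = 50) -> list:
    """Return at most `limit` lines from an open text file."""
    return list(islice(handle, limit))


class HeaderInstrument:
    """Stand-in instrument that matches on a fragment of the header line."""

    def __init__(self, name: str, header_fragment: str) -> None:
        self.name = name
        self.header_fragment = header_fragment

    def file_identifier(self, first_lines_of_csv: list) -> bool:
        # Guard against empty files before indexing the first line.
        return bool(first_lines_of_csv) and (
            self.header_fragment in first_lines_of_csv[0]
        )


def identify(handle, instruments) -> Optional[str]:
    """Return the name of the first instrument that claims the file."""
    first_lines = read_first_lines(handle)
    for instrument in instruments:
        if instrument.file_identifier(first_lines):
            return instrument.name
    return None


# Hypothetical instruments and file content, for demonstration only.
instruments = [
    HeaderInstrument("pico", "win0Fit0,win0Fit1,win0Fit2"),
    HeaderInstrument("filter", "#YY/MM/DD"),
]
sample = io.StringIO("#YY/MM/DD, HR:MN:SC, value\n22/09/29, 10:21:58, 1.0\n")
print(identify(sample, instruments))  # filter
```

Returning None for an unrecognised file lets the caller decide whether to skip the file or report it.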

set_time_as_index(): Converts the instrument's timestamp information into a common pandas DatetimeIndex.
```
import pandas as pd

# Example for the filter instrument:
def set_time_as_index(self, df: pd.DataFrame) -> pd.DataFrame:
    # Combine the date and time columns into a single timestamp.
    df['DateTime'] = pd.to_datetime(
        df['#YY/MM/DD'].str.strip() + ' ' + df['HR:MN:SC'].str.strip(),
        format='%y/%m/%d %H:%M:%S'
    )
    # The source columns are redundant once the combined timestamp exists.
    df.drop(columns=["#YY/MM/DD", "HR:MN:SC"], inplace=True)
    df.set_index('DateTime', inplace=True)
    return df
```
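
Before applying a format string to a whole column, it can help to check it against a single sample value. The sketch below does this with the standard library only; the sample date and time strings are made up, and `parse_sample` is a hypothetical helper rather than part of the library.

```python
from datetime import datetime

# Same format string as in the filter example above.
FILTER_FORMAT = "%y/%m/%d %H:%M:%S"


def parse_sample(date_text: str, time_text: str, fmt: str = FILTER_FORMAT) -> datetime:
    """Join the stripped date and time fields and parse them with `fmt`."""
    return datetime.strptime(f"{date_text.strip()} {time_text.strip()}", fmt)


# Made-up sample values with stray whitespace around the fields.
parsed = parse_sample(" 22/09/29", "10:21:58 ")
print(parsed.isoformat())  # 2022-09-29T10:21:58
```

A value that does not match the format raises ValueError, which points to a mismatch before any data is processed.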

For more details and examples, refer to the auto-published documentation.

Configuration

Configuration parameters come from the following sources:

Application constants

These are defined in helikite/constants.py and include settings such as filenames, folder paths for inputs/outputs, logging formats, and default plotting parameters.

Runtime configuration

The runtime configuration is stored in config.yaml (located in your inputs folder). This file is generated during the generate_config or preprocess steps. It holds runtime arguments for each instrument (e.g., file locations, time adjustments, and plotting settings).

Below is an example snippet from a generated config.yaml:

```
global:
  time_trim:
    start: 2022-09-29 10:21:58
    end: 2022-09-29 12:34:36
ground_station:
  altitude: null
  pressure: null
  temperature: 7.8
instruments:
  filter:
    config: filter
    date: null
    file: /app/inputs/220209A3.TXT
    pressure_offset: null
    time_offset:
      hour: 5555
      minute: 0
      second: 0
plots:
  altitude_ground_level: false
  grid:
    resample_seconds: 60
```
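
To illustrate how values like these might be consumed, the sketch below works on a plain dictionary that mirrors part of the snippet above. How the library itself applies `time_offset` and `time_trim` is not described here, so adding the offset to each timestamp and keeping only timestamps inside the trim window are assumptions. `offset_from_config` and `within_trim` are hypothetical helpers, and the offset values are chosen for the example.

```python
from datetime import datetime, timedelta

# Dictionary mirroring part of the config above; the offset values are
# illustrative and not taken from a real campaign.
config = {
    "global": {
        "time_trim": {
            "start": datetime(2022, 9, 29, 10, 21, 58),
            "end": datetime(2022, 9, 29, 12, 34, 36),
        }
    },
    "instruments": {
        "filter": {"time_offset": {"hour": 1, "minute": 0, "second": 30}}
    },
}


def offset_from_config(time_offset: dict) -> timedelta:
    """Convert an hour/minute/second mapping into a timedelta."""
    return timedelta(
        hours=time_offset.get("hour", 0),
        minutes=time_offset.get("minute", 0),
        seconds=time_offset.get("second", 0),
    )


def within_trim(timestamp: datetime, time_trim: dict) -> bool:
    """Check whether a timestamp falls inside the inclusive trim window."""
    return time_trim["start"] <= timestamp <= time_trim["end"]


offset = offset_from_config(config["instruments"]["filter"]["time_offset"])
raw = datetime(2022, 9, 29, 9, 30, 0)
corrected = raw + offset  # assumed: the offset is added to raw timestamps
print(corrected, within_trim(corrected, config["global"]["time_trim"]))
```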
