A Python library for SeaFlow data.

Seaflowpy

A Python package for SeaFlow flow cytometer data.

Table of Contents

  1. Install
  2. Read EVT/OPP/VCT Files
  3. Command-line Interface
  4. Configuration
  5. Integration with R
  6. Testing
  7. Development

Install

This package is compatible with Python 3.7 and 3.8.

Source

The commands below clone the repo and create a new virtual environment named seaflowpy in ~/venvs. venv can be replaced with virtualenv, conda, etc.

git clone https://github.com/armbrustlab/seaflowpy
cd seaflowpy
[[ -d ~/venvs ]] || mkdir ~/venvs
python3 -m venv ~/venvs/seaflowpy
source ~/venvs/seaflowpy/bin/activate
pip3 install -U pip setuptools wheel
pip3 install -r requirements-test.txt
pip3 install .
# Confirm the seaflowpy command-line tool is accessible
seaflowpy version
# Make sure basic tests pass
pytest
# Leave the new virtual environment
deactivate

PyPI

pip3 install seaflowpy

Docker

Docker images are available from Docker Hub at ctberthiaume/seaflowpy.

docker pull ctberthiaume/seaflowpy
docker run -it ctberthiaume/seaflowpy seaflowpy version

The Docker build file is in this repo at /Dockerfile. The build process for the Docker image is detailed in /build.sh.

Read EVT/OPP/VCT Files

All file reading functions will return a pandas.DataFrame of particle data. Gzipped EVT, OPP, or VCT files can be read if they end with a ".gz" extension. For these code examples assume seaflowpy has been imported as sfp and pandas has been imported as pd, e.g.

import pandas as pd
import seaflowpy as sfp

and *_filepath has been set to the correct data file.
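
The ".gz" rule above is a check on the file name. As an illustration only, a reader that chooses gzip decompression from the extension could look like the minimal sketch below. This is not seaflowpy's own code, and open_maybe_gzip is a hypothetical helper name.

```python
import gzip
from pathlib import Path


def open_maybe_gzip(path):
    """Open a file for binary reading, decompressing if the name ends in .gz.

    Hypothetical helper for illustration; not part of seaflowpy.
    """
    path = Path(path)
    if path.suffix == ".gz":
        # gzip.open returns a file object that decompresses on read.
        return gzip.open(path, "rb")
    return open(path, "rb")
```

Both branches return a file object that works as a context manager, so calling code does not need to know whether the file was compressed.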

Read an EVT file

evt = sfp.fileio.read_evt_labview(evt_filepath)

Read an OPP file, select the 50th quantile data using pandas.DataFrame boolean indexing, then keep only columns you're interested in.

opp = sfp.fileio.read_opp_labview(opp_filepath)
opp50 = opp[opp["q50"]]
opp50 = opp50[['fsc_small', 'chl_small', 'pe']]
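
To see what the two indexing steps above do without a real OPP file, here is a small sketch on a synthetic DataFrame. The values are made up, and it assumes q50 is a boolean column, as the indexing in the example implies. A real OPP table has more columns than shown.

```python
import pandas as pd

# Synthetic stand-in for an OPP DataFrame; the values are invented.
opp = pd.DataFrame({
    "fsc_small": [10.0, 20.0, 30.0],
    "chl_small": [1.0, 2.0, 3.0],
    "pe": [0.1, 0.2, 0.3],
    "q50": [True, False, True],
})

# Indexing with a boolean column keeps only the rows where it is True.
opp50 = opp[opp["q50"]]

# Indexing with a list of names keeps only those columns, in that order.
opp50 = opp50[["fsc_small", "chl_small", "pe"]]
```

The row labels of the original DataFrame are preserved after the boolean selection, so opp50 keeps the index values of the rows that passed the filter.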

Read a VCT file and attach to an OPP DataFrame.

vct50 = sfp.fileio.read_vct_csv(vct_filepath)  # <-- vct_filepath is for one quantile
df = sfp.particleops.merge_opp_vct(opp50, vct50)

Command-line Interface

All seaflowpy CLI tools are accessible from the seaflowpy executable. Run seaflowpy --help to begin exploring the CLI usage documentation.

SFL validation workflow

SFL validation sub-commands are available under the seaflowpy sfl command. The usage details for each command can be accessed as seaflowpy sfl <cmd> -h.
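
As a convenience, the help text for each sub-command can be printed in one pass from Python. The sketch below only builds and runs seaflowpy sfl <cmd> -h. The sub-command names are the ones this README mentions in the workflow steps, and the list may not be complete. sfl_help_command is a hypothetical helper name.

```python
import shutil
import subprocess

# Sub-command names taken from the workflow steps in this README.
SFL_SUBCOMMANDS = [
    "print", "validate", "dedup", "convert-gga",
    "detect-gga", "fix-event-rate", "manifest",
]


def sfl_help_command(subcommand):
    """Return the argument list for `seaflowpy sfl <cmd> -h`."""
    return ["seaflowpy", "sfl", subcommand, "-h"]


if __name__ == "__main__":
    if shutil.which("seaflowpy") is None:
        print("seaflowpy is not on PATH; activate its virtual environment first")
    else:
        for name in SFL_SUBCOMMANDS:
            # check=False: keep going even if one sub-command exits non-zero.
            subprocess.run(sfl_help_command(name), check=False)
```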

The basic workflow should be

  1. If starting with an SDS file, first convert it to SFL with seaflowpy sds2sfl.

  2. If the SFL file is output from sds2sfl or is a raw SeaFlow SFL file, convert it to a normalized format with seaflowpy sfl print. This command can be used to concatenate multiple SFL files, e.g. merge all SFL files in day-of-year directories.

  3. Check for potential errors or warnings with seaflowpy sfl validate.

  4. Fix errors and warnings. Duplicate file errors can be fixed with seaflowpy sfl dedup. Bad lat/lon errors may be fixed with seaflowpy sfl convert-gga, assuming the bad coordinates are GGA to begin with. This can be checked with seaflowpy sfl detect-gga. Other errors or missing values may need to be fixed manually.

  5. (Optional) Update event rates based on true event counts and file duration with seaflowpy sfl fix-event-rate. True event counts for raw EVT files can be determined with seaflowpy evt count. If filtering has already been performed, event counts can be pulled from the all_count column of the opp table in the SQLite3 database, e.g. sqlite3 -separator $'\t' SCOPE_14.db 'SELECT file, all_count FROM opp ORDER BY file'

  6. (Optional) As a check for dataset completeness, the list of files in an SFL file can be compared to the actual EVT files present with seaflowpy sfl manifest. It's normal for a few files to differ, especially near midnight. If a large number of files are missing it may be a sign that the data transfer was incomplete or the SFL file is missing some days.

  7. Once all errors or warnings have been fixed, do a final seaflowpy sfl validate before adding the SFL file to the appropriate repository.
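
The event-count lookup in step 5 can also be done from Python with the standard sqlite3 module. This is a minimal sketch: the table name opp and the columns file and all_count come from the step above, while opp_event_counts is a hypothetical helper name.

```python
import sqlite3


def opp_event_counts(db_path):
    """Return (file, all_count) rows from the opp table, ordered by file.

    Hypothetical helper for illustration; not part of seaflowpy.
    """
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute("SELECT file, all_count FROM opp ORDER BY file")
        return cur.fetchall()
    finally:
        # Close the connection whether or not the query succeeded.
        con.close()
```

Called as opp_event_counts("SCOPE_14.db"), this returns a list of tuples, one per row of the opp table.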

Configuration

To use seaflowpy sfl manifest, AWS credentials need to be configured. The easiest way to do this is to install the awscli Python package and go through its configuration.

pip3 install awscli
aws configure

This will store AWS configuration in ~/.aws, which seaflowpy will use to access SeaFlow data in S3 storage.
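
To confirm that aws configure wrote something usable, the credentials file can be inspected from Python. The sketch below assumes the credentials are in the INI-style file ~/.aws/credentials that aws configure creates. It does not cover credentials supplied another way, such as environment variables. aws_profiles is a hypothetical helper name.

```python
import configparser
from pathlib import Path


def aws_profiles(aws_dir=None):
    """List profile names found in an AWS credentials file.

    aws_dir defaults to ~/.aws. Returns an empty list if no credentials
    file exists there. Hypothetical helper; not part of seaflowpy.
    """
    aws_dir = Path(aws_dir) if aws_dir is not None else Path.home() / ".aws"
    credentials = aws_dir / "credentials"
    if not credentials.is_file():
        return []
    parser = configparser.ConfigParser()
    parser.read(credentials)
    # Each [section] in the file is one named profile.
    return parser.sections()
```

An empty list suggests that aws configure has not been run for the current user, or that it wrote its files somewhere other than the directory being checked.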

Integration with R

To call seaflowpy from R, update the PATH environment variable in ~/.Renviron. For example:

PATH=${PATH}:${HOME}/venvs/seaflowpy/bin
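
Before editing ~/.Renviron, it can help to check that the directory being appended really contains the seaflowpy executable. This sketch uses shutil.which with an explicit search path. The ~/venvs/seaflowpy/bin location matches the install instructions in this README and is an assumption for any other setup. find_seaflowpy is a hypothetical helper name.

```python
import os
import shutil


def find_seaflowpy(path_value):
    """Return the full path of the seaflowpy executable on path_value, or None."""
    return shutil.which("seaflowpy", path=path_value)


if __name__ == "__main__":
    # Mirror the PATH line above: the current PATH plus the venv bin directory.
    venv_bin = os.path.expanduser("~/venvs/seaflowpy/bin")
    candidate = os.environ.get("PATH", "") + os.pathsep + venv_bin
    print(find_seaflowpy(candidate))
```

If this prints None, R will not find the command with that PATH value either.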

Testing

Seaflowpy uses pytest for testing. Run pytest from this directory to test the installed version of the package, or run tox to install the source into a temporary virtual environment and test it there.

Development

Source code structure

This project follows the Git feature branch workflow. Active development happens on the develop branch and on feature branches which are eventually merged into develop. Commits on the master branch represent stable release snapshots with version tags and build products, merged from develop with --no-ff to create a single commit in master while keeping the complete commit history in develop.

Build

To build the source tarball, wheel, PyInstaller files, and Docker image, run ./build.sh. This will create

  • dist with the source tarball and wheel file

  • executable files in ./pyinstaller/macos/dist/seaflowpy and ./pyinstaller/linux64/dist/seaflowpy

  • a Docker image named seaflowpy:<version>

To remove all build files, run git clean -fd.

Creating the PyInstaller files and Docker image depends on the wheel file located in dist.

Updating requirements files

Create a new virtual environment

python3 -m venv newenv
source newenv/bin/activate

Update pip, wheel, setuptools

pip3 install -U pip wheel setuptools

And install seaflowpy

pip3 install .

Then freeze the requirements

pip3 freeze | grep -v seaflowpy >requirements.txt

Then install test dependencies, test, and freeze

pip3 install pytest pytest-benchmark
pytest
pip3 freeze | grep -v seaflowpy >requirements-test.txt

Then install dev dependencies, test, and freeze

pip3 install pylint twine
pytest
pip3 freeze | grep -v seaflowpy >requirements-dev.txt

Leave the virtual environment

deactivate
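
Each freeze step above pipes pip3 freeze through grep -v seaflowpy so the package does not list itself as a requirement. The same filter is sketched below in Python for anyone scripting these steps. drop_package is a hypothetical helper name, and the sample input uses made-up package names.

```python
def drop_package(freeze_output, package="seaflowpy"):
    """Mimic `grep -v <package>`: drop every line that contains the name.

    Like grep -v, this is a substring match, so a line for any package
    whose name contains the string is dropped too.
    """
    return [line for line in freeze_output.splitlines() if package not in line]


# Made-up pip freeze output, for illustration only.
sample = "examplepkg==1.0\nseaflowpy==3.1.0\notherpkg==2.0\n"
kept = drop_package(sample)
```

Here kept holds the lines for examplepkg and otherpkg only.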
