
GeoSeeq command line tools and Python API


GeoSeeq API Client

This package is a Python library for interacting with a GeoSeeq server. It includes a command line interface that can be used to perform common GeoSeeq tasks from the terminal.

GeoSeeq is a platform for sharing biological, climatological, and public health datasets. Learn more here.

This API client is a work in progress and we welcome suggestions, feedback, comments, and criticisms.


Installation

Install from PyPI

pip install geoseeq

Install from source

Download this directory and run python setup.py install
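
After either install route, it can help to confirm which version of the package the current Python environment sees. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and is not part of GeoSeeq.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name="geoseeq"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("geoseeq is not installed in this environment")
    else:
        print(f"geoseeq {found} is installed")
```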


Using the Command Line

Run the command line tool by typing geoseeq at a terminal prompt. See the available options by adding --help.

$ geoseeq --help

Configuration and Using an API token

For many tasks you will need an API token to interact with GeoSeeq. You can get this token by logging into the GeoSeeq Portal, going to your user profile, and clicking the "Tokens" tab.

Once you have a token, you will need to configure GeoSeeq to use it. Run geoseeq config and leave the profile name and URL blank to accept the defaults. You will then be prompted to enter your API token.

$ geoseeq config
Set custom profile name? (Leave blank for default) []:
Enter the URL to use for GeoSeeq (Most users can use the default) [https://backend.geoseeq.com]:
Enter your GeoSeeq API token:
Profile configured.

This command stores your token in a file called ~/.config/geoseeq/profiles.json. The stored token will be used by all future commands.
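
If a later command reports an authentication problem, one quick check is whether the profile file exists and parses as JSON. The sketch below assumes only the path given above and that the file is JSON (as its extension suggests); it makes no assumption about the keys inside the file and never prints its contents, since the file holds your token. The function name profile_file_status is hypothetical.

```python
import json
from pathlib import Path

# Path taken from the text above; the layout inside the file is not assumed.
DEFAULT_PROFILE_PATH = Path.home() / ".config" / "geoseeq" / "profiles.json"


def profile_file_status(path=DEFAULT_PROFILE_PATH):
    """Return 'missing', 'unreadable', or 'ok' for the profile file."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    try:
        # Parse only to confirm the file is valid JSON; the result is discarded.
        json.loads(path.read_text())
    except (OSError, ValueError):
        return "unreadable"
    return "ok"


if __name__ == "__main__":
    print(f"{DEFAULT_PROFILE_PATH}: {profile_file_status()}")
```

A status of missing suggests that geoseeq config has not been run for this user account.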

Example Commands

You can find more command line examples in docs/

Download Short Read Sequencing data from one sample in a project as a set of FASTQ files

This command will download data from the project GeoSeeq/Example CLI Project.

$ geoseeq download files --extension fastq.gz "GeoSeeq/Example CLI Project"

Uploading sequencing data

GeoSeeq can automatically group fastq files into samples according to their sample name, read number, and lane number. It supports paired end, single end, nanopore, and pacbio reads.

Assume you have data from a paired-end sequencing run, spread over two lanes and stored as fastq files:

  • Sample1_L1_R1.fastq.gz
  • Sample1_L1_R2.fastq.gz
  • Sample1_L2_R1.fastq.gz
  • Sample1_L2_R2.fastq.gz

You can upload these files to GeoSeeq using the command line. This example passes --cores 32, which uploads up to 32 files in parallel:

# navigate to the directory where the fastq files are stored
$ ls -1 *.fastq.gz > fastq_files.txt  # write the list of fastq files to a text file

$ geoseeq upload reads --cores 32 "GeoSeeq/Example CLI Project" fastq_files.txt
Using regex: "(?P<sample_name>[^_]*)_L(?P<lane_num>[0-9]*)_R(?P<pair_num>1|2)\.fastq\.gz"
All files successfully grouped.
sample_name: Sample1
  module_name: short_read::paired_end
    short_read::paired_end::read_1::lane_1: Sample1_L1_R1.fastq.gz
    short_read::paired_end::read_2::lane_1: Sample1_L1_R2.fastq.gz
    short_read::paired_end::read_1::lane_2: Sample1_L2_R1.fastq.gz
    short_read::paired_end::read_2::lane_2: Sample1_L2_R2.fastq.gz
Do you want to upload these files? [y/N]: y
Uploading Sample: Sample1

GeoSeeq will automatically create a new sample named Sample1 if it does not already exist.

This command would upload data to the project GeoSeeq/Example CLI Project. Since only organization members can upload data, you will need to replace GeoSeeq with your organization name.
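
To preview how your own file names would be grouped before uploading, you can apply the regex printed in the output above with Python's re module. This is a sketch of the grouping idea only, not GeoSeeq's implementation: the function group_fastqs is hypothetical, and requiring the pattern to match the whole file name is an assumption.

```python
import re
from collections import defaultdict

# Regex copied from the "Using regex:" line in the upload output above.
FASTQ_PATTERN = re.compile(
    r"(?P<sample_name>[^_]*)_L(?P<lane_num>[0-9]*)_R(?P<pair_num>1|2)\.fastq\.gz"
)


def group_fastqs(filenames):
    """Group file names by sample, then by (read number, lane number)."""
    samples = defaultdict(dict)
    unmatched = []
    for name in filenames:
        # Assumption: the whole file name must match the pattern.
        match = FASTQ_PATTERN.fullmatch(name)
        if match is None:
            unmatched.append(name)
            continue
        key = (match["pair_num"], match["lane_num"])
        samples[match["sample_name"]][key] = name
    return dict(samples), unmatched


if __name__ == "__main__":
    files = [
        "Sample1_L1_R1.fastq.gz",
        "Sample1_L1_R2.fastq.gz",
        "Sample1_L2_R1.fastq.gz",
        "Sample1_L2_R2.fastq.gz",
        "notes.txt",  # does not match the pattern
    ]
    grouped, unmatched = group_fastqs(files)
    for sample, reads in grouped.items():
        print(sample)
        for (pair_num, lane_num), name in sorted(reads.items()):
            print(f"  read_{pair_num} lane_{lane_num}: {name}")
    print("unmatched:", unmatched)
```

Files that end up in the unmatched list would likely need renaming, or a different pattern, before they could be grouped into samples.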

To rename samples on the fly, provide a CSV file with current and new names using the --name-map option:

$ geoseeq upload reads --name-map sample_map.csv current_name new_name "GeoSeeq/Example CLI Project" fastq_files.txt

Note: You will need to have an API token set to use this command (see above).
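
A name map file such as sample_map.csv can be written with Python's csv module. The sketch below assumes that current_name and new_name in the command above are the CSV column headers; that is an inference from the command, not something this document states. The helper write_name_map and the new sample name are hypothetical.

```python
import csv


def write_name_map(path, renames, current_col="current_name", new_col="new_name"):
    """Write a two-column CSV mapping current sample names to new ones."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Header row; the column names are an assumption based on the command.
        writer.writerow([current_col, new_col])
        for current, new in renames.items():
            writer.writerow([current, new])


if __name__ == "__main__":
    # Hypothetical new name, for illustration only.
    write_name_map("sample_map.csv", {"Sample1": "Patient_A_Day0"})
```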

Using the Python API in a program

Please see geoseeq_api/cli/download.py for examples of how to download data using the Python API directly.

Development in GitHub Codespaces

This repository includes a .devcontainer configuration so it can be opened directly in GitHub Codespaces. When the codespace is created the development dependencies are installed and pre-commit hooks are set up automatically.


Notes

Terminology

Some terms have changed in GeoSeeq since this package was written. The command line tool and code may contain references to old names.

Old Name                 New Name
Sample Group             Project
Library                  (defunct)
Analysis Result          ResultFolder
Analysis Result Field    ResultFile

License and Credits

GeoSeeq is built and maintained by Biotia.

The GeoSeeq API client is licensed under the MIT license.
