Fetch, process, analyze, and aggregate microbiome sequencing data with SRA Toolkit and QIIME2.
q2sra
Conventional microbiome bioinformatics workflows are often inefficient: users must navigate a variety of fragmented tools, command-line utilities, and file management systems. When multiple people contribute to a single project, inconsistencies between their outputs often arise and complicate later data aggregation and analysis. The q2sra package addresses these obstacles by providing a streamlined, centralized, and standardized framework for microbiome data analysis with QIIME 2.
Installation
$ pip install q2sra
Prerequisites
- Python v3.11.6+
- QIIME 2 v2023.7+
- SRA Toolkit v3.0.0+
Installing QIIME 2 with Conda
$ wget https://data.qiime2.org/distro/core/qiime2-2023.7-py38-linux-conda.yml
$ conda env create \
-n qiime2-2023.7 \
--file qiime2-2023.7-py38-linux-conda.yml
$ rm qiime2-2023.7-py38-linux-conda.yml
Installing SRA Toolkit
Instructions can be found here.
Creating a q2sra Project
To create a project, simply initialize a q2sra.Proj object, supplying the intended project name as the sole parameter.
>>> from q2sra import Proj
>>> proj = Proj('my_proj')
q2sra Project Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| name | String | None | Project name |
| fields | List of str | [ ] | Metadata fields |
| nsamples | Integer | 30 | Maximum number of samples from each study |
| paired | Boolean | True | Whether to use forward and reverse reads or exclusively forward reads |
Adding Metadata Fields
q2sra.Proj.add_field(field: str, required: bool) -> None
Arguments
- field - Name of field
- required - Whether the field is required [default=False]
Example Run
>>> proj.add_field('Phylum')
>>> proj.add_field('Country', required = True)
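The `required` flag presumably decides whether a field may be left blank when a study's metadata is entered. The sketch below illustrates that idea only; `collect_metadata` and its `ask` parameter are hypothetical names, not part of q2sra, and the re-prompting behavior is an assumption.

```python
from typing import Callable, Dict, List, Tuple


def collect_metadata(
    fields: List[Tuple[str, bool]],
    ask: Callable[[str], str] = input,
) -> Dict[str, str]:
    """Ask for one value per (field, required) pair.

    Hypothetical helper, not the q2sra implementation. Optional fields
    may be left blank; required fields are asked again until the answer
    is non-empty.
    """
    values: Dict[str, str] = {}
    for name, required in fields:
        prompt = f"[Required] {name}: " if required else f"{name}: "
        answer = ask(prompt).strip()
        while required and not answer:
            answer = ask(prompt).strip()
        values[name] = answer
    return values


# Scripted answers stand in for keyboard input so the demo runs unattended.
# The empty string simulates pressing Enter on the required field.
scripted = iter(["Chordata", "", "Japan"])
values = collect_metadata(
    [("Phylum", False), ("Country", True)],
    ask=lambda prompt: next(scripted),
)
print(values)
```

Passing the prompt function as a parameter keeps the logic testable without a terminal.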
Saving a Project
>>> proj.save()
Output
<proj name>.pkl - a pickle file storing the project's attributes.
Loading a Pre-configured Project
Any existing q2sra project saved in .pkl format (see previous step) can later be loaded to perform additional actions (adding more studies, merging runs, etc.).
q2sra.Proj.load(name: str) -> Proj
Arguments
- name - Name of project to load
Example Run
>>> proj = Proj.load('my_proj')
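As a rough illustration of the save/load round trip, the sketch below pickles the documented attributes to `<name>.pkl` and rebuilds an object from that file. `ProjSketch` is a hypothetical stand-in, not the q2sra class, and the `directory` argument is an assumption added so the example can write to a temporary folder.

```python
import pickle
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class ProjSketch:
    """Hypothetical stand-in that mirrors the documented attributes."""

    name: str
    fields: List[str] = field(default_factory=list)
    nsamples: int = 30
    paired: bool = True

    def save(self, directory: Path) -> Path:
        # Store only the attribute dictionary in <name>.pkl.
        path = directory / f"{self.name}.pkl"
        with path.open("wb") as handle:
            pickle.dump(asdict(self), handle)
        return path

    @classmethod
    def load(cls, name: str, directory: Path) -> "ProjSketch":
        # Rebuild the object from the stored attribute dictionary.
        with (directory / f"{name}.pkl").open("rb") as handle:
            return cls(**pickle.load(handle))


with tempfile.TemporaryDirectory() as tmp:
    original = ProjSketch("my_proj", fields=["Phylum", "Country"])
    saved_path = original.save(Path(tmp))
    restored = ProjSketch.load("my_proj", Path(tmp))

print(saved_path.name, restored)
```

Because unpickling can execute arbitrary code, `.pkl` files should only be loaded from sources you trust.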
Adding Studies
q2sra.Proj.run(study_name: str, accession: str, include: list, exclude: list) -> None
Arguments
- study_name - Name of study
- accession - Study accession number in the NCBI SRA database
- include - List of substrings that must be included when filtering .fastq files [default=[]]
- exclude - List of substrings that must be excluded when filtering .fastq files [default=[]]
Example Run (w/ user input)
>>> proj.run('takagi_2022', 'PRJNA809527')
Phylum: Chordata
[Required] Country: Japan
Aggregating Studies
After compiling a satisfactory number of studies, individual metadata files and QIIME 2 feature tables/representative sequences can be merged for further analysis using the following method:
>>> proj.merge()
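To make the metadata side of merging concrete, the sketch below combines per-study rows into one table, tagging each row with its study and filling fields a study did not record with an empty string. It is a minimal illustration under those assumptions, not the q2sra implementation; `merge_metadata`, the `study` column, and the sample rows are hypothetical, and merging feature tables and representative sequences is not covered.

```python
from typing import Dict, List


def merge_metadata(
    studies: Dict[str, List[Dict[str, str]]],
) -> List[Dict[str, str]]:
    """Combine per-study metadata rows into a single list of rows."""
    # Collect the union of column names in first-seen order.
    columns: List[str] = []
    for rows in studies.values():
        for row in rows:
            for key in row:
                if key not in columns:
                    columns.append(key)

    merged: List[Dict[str, str]] = []
    for study, rows in studies.items():
        for row in rows:
            record = {"study": study}
            # Missing fields become empty strings so all rows align.
            record.update({col: row.get(col, "") for col in columns})
            merged.append(record)
    return merged


# Hypothetical rows, for illustration only.
merged = merge_metadata(
    {
        "study_a": [{"sample-id": "A1", "Country": "Japan", "Phylum": "Chordata"}],
        "study_b": [{"sample-id": "B1", "Country": "Brazil"}],
    }
)
for row in merged:
    print(row)
```

Keeping a `study` column in the merged table preserves each sample's origin for downstream batch-effect checks.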