process SRSLY UMIs

SRSLY UMI processing

SRSLY UMIs are attached to the i7 index, and require a bit of handling to make it through bcl2fastq. This package helps guide that process.

SRSLY UMI dual-index sequencing runs

Illumina sequencing performs read cycles for the i7 and i5 indices in between the fragment reads. SRSLY UMIs are attached to the i7 reads, like this:

Order of read cycles
+------------------------+--------------+------------+----------------+-----------------------+
| Forward fragment read  | i7 index     |   UMI      | i5 index       | Reverse fragment read |
+------------------------+--------------+------------+----------------+-----------------------+

Standard bcl2fastq processing
+------------------------+---------------------------+----------------+-----------------------+
| RunInfo.xml Read 1     | RunInfo.xml R2            | RunInfo.xml R3 | RunInfo.xml R4        |
+------------------------+---------------------------+----------------+-----------------------+
| FASTQ output R1        | FASTQ header, without UMI                  | FASTQ output R2       |
+------------------------+--------------------------------------------+-----------------------+

Reformatted bcl2fastq processing
+------------------------+--------------+------------+----------------+-----------------------+
| new RunInfo.xml Read 1 | R2 IsIndex   | R3         | R4 IsIndex     | R5                    |
+------------------------+--------------+------------+----------------+-----------------------+
| fixed FASTQ output R1  | FASTQ header, with UMI                     | fixed FASTQ output R3 |
+------------------------+--------------------------------------------+-----------------------+

However, bcl2fastq can't insert UMIs into the fragment name in the FASTQ header unless they are part of either output R1 or output R2. (Note that index reads as defined in RunInfo.xml do not count as output reads.)

To solve this, we define a new RunInfo.xml with five reads instead of four, as in the "Reformatted bcl2fastq processing" layout above.
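As a rough illustration of the four-to-five read change, the sketch below splits the combined i7 + UMI index read into an index read and a non-index read, then renumbers the remaining reads. It is not this package's actual implementation. The function name split_index_read, the i7_length parameter, and the cycle counts in EXAMPLE are all hypothetical. It assumes the usual RunInfo.xml layout of Read elements with Number, NumCycles, and IsIndexedRead attributes under Run/Reads.

```python
import xml.etree.ElementTree as ET

# Illustrative RunInfo.xml fragment; the cycle counts are assumptions
# (here: an 8 bp i7 index followed by 5 UMI cycles in Read 2).
EXAMPLE = """<RunInfo><Run><Reads>
<Read Number="1" NumCycles="151" IsIndexedRead="N"/>
<Read Number="2" NumCycles="13" IsIndexedRead="Y"/>
<Read Number="3" NumCycles="8" IsIndexedRead="Y"/>
<Read Number="4" NumCycles="151" IsIndexedRead="N"/>
</Reads></Run></RunInfo>"""


def split_index_read(runinfo_xml, i7_length):
    """Return RunInfo XML text in which Read 2 is split into i7 and UMI reads."""
    root = ET.fromstring(runinfo_xml)
    reads = root.find("./Run/Reads")
    old_reads = list(reads)

    new_attrs = []
    for read in old_reads:
        cycles = int(read.get("NumCycles"))
        is_index = read.get("IsIndexedRead")
        if read.get("Number") == "2" and is_index == "Y" and cycles > i7_length:
            # First part stays an index read; the remainder becomes a
            # regular (non-index) read so it counts as an output read.
            new_attrs.append({"NumCycles": str(i7_length), "IsIndexedRead": "Y"})
            new_attrs.append({"NumCycles": str(cycles - i7_length), "IsIndexedRead": "N"})
        else:
            new_attrs.append({"NumCycles": str(cycles), "IsIndexedRead": is_index})

    # Replace the old Read elements with renumbered ones.
    for read in old_reads:
        reads.remove(read)
    for number, attrs in enumerate(new_attrs, start=1):
        ET.SubElement(reads, "Read", Number=str(number), **attrs)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(split_index_read(EXAMPLE, i7_length=8))
```

With the illustrative input, the result has five Read elements: the forward read, the i7 index read, a non-index UMI read, the i5 index read, and the reverse read.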

Running standard bcl2fastq with the TrimUMI option against this RunInfo.xml places the UMI in the fragment name in the FASTQ files. However, it has two side effects. First, the UMI field contains the 5 bp of the UMI followed by the first 5 bp of the actual read 2. This should be compatible with most UMI analysis. Second, the proper R2 file is labeled R3. Post-processing can easily rename R3 to R2.
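For downstream code that needs only the 5 bp UMI, a minimal sketch of pulling it out of the FASTQ header might look like the following. It assumes the bcl2fastq read-name layout in which the UMI is appended as an eighth colon-separated field (after instrument, run, flowcell, lane, tile, x, and y). The function name extract_umi and the example header are hypothetical.

```python
def extract_umi(header, umi_length=5):
    """Split the UMI field of a FASTQ header into (umi, read_prefix).

    Returns None when the read name has no eighth colon-separated field.
    """
    # The read name is the first whitespace-separated token of the header.
    name = header.split()[0].lstrip("@")
    fields = name.split(":")
    if len(fields) < 8:
        return None
    tag = fields[7]
    # The first umi_length bases are the UMI proper; the rest are the
    # leading bases of the fragment read that were carried along with it.
    return tag[:umi_length], tag[umi_length:]


if __name__ == "__main__":
    example = "@M00001:12:000000000-ABCDE:1:1101:15589:1332:ACGTAGGCTA 1:N:0:1"
    print(extract_umi(example))
```

Tools that group reads by the full 10 bp field can skip this step, since the extra bases do not need to be removed for that purpose.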

Using this package

After installation of this Python package, the srslyumi command takes two arguments: 1) an existing run directory, and 2) an output directory for the FASTQ files and bcl2fastq reports. Inside this output directory, a new RunInfo.xml and SampleSheet.csv will be created, along with a bcl2fastq2.sh script that can be used to rerun the process. Note that at the end of this command, the _R3_ files are renamed to _R2_.
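If bcl2fastq is run by hand against the reformatted RunInfo.xml, the final rename step can be reproduced separately. The sketch below is one way to do it and is not taken from this package. The function name rename_r3_to_r2 and the *_R3_*.fastq.gz file pattern are assumptions about how the output files are named.

```python
from pathlib import Path


def rename_r3_to_r2(output_dir):
    """Rename every *_R3_*.fastq.gz file under output_dir to *_R2_*."""
    renamed = []
    # sorted() materializes the file list before any rename happens.
    for path in sorted(Path(output_dir).rglob("*_R3_*.fastq.gz")):
        target = path.with_name(path.name.replace("_R3_", "_R2_"))
        if target.exists():
            # Refuse to overwrite an existing R2 file.
            raise FileExistsError(str(target))
        path.rename(target)
        renamed.append(target)
    return renamed
```

The function returns the new paths, so a caller can log or verify what was changed.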

Developing this package further

When your working directory is the root of this repository, the same directory that contains setup.py, you can run pip install -e . to install the package in a form that lets you edit your code and run it as a python package at the same time.

Testing and test coverage

During development, tox will set up testing virtual environments for Python 2.7 and Python 3.6 and run all tests. Before code can be accepted into the main repository, it must pass tests on Python 2.7, 3.5, 3.6, 3.7, and 3.8, which run on GitHub automatically.

For quick tests in your current Python environment, run pytest, though you may need to install the test dependencies as listed under the tox section of pyproject.toml.

To assess code coverage of the tests, run pytest --cov --cov-report=html.
