
OPU analysis for Raman single-cell spectroscopy data


Raman OPU Analysis

An operational phenotypic unit (OPU) analysis library for Raman single-cell spectroscopy data

Dependencies

This library requires Python >= 3.6. The following packages are also required:

  • numpy, scipy, scikit-learn
  • matplotlib
  • skfeature [1]

[1]: the original repo contains a bug in the Laplacian score; the installation below instead uses the skfeature-gli package, available at https://github.com/lguangyu/scikit-feature.git

Installation

The package can be installed with a single command:

pip install raman-opu-analysis

Synopsis and Basic Command-line Usage

This package provides three command-line scripts:

  • opu_analysis: the main analysis script
  • opu_transform_labspec_txt: a supporting script that converts txt dumps from LabSpec into the tabular format needed by this package
  • opu_spectra_preview: a convenience script for visualizing a spectra dataset

OPU Analysis

The library can be used from the CLI, in a Jupyter notebook, or from another Python library. The example below shows CLI usage as a standalone script:

If you have already prepared

Prepare the Dataset Config File

A JSON config file needs to be prepared before using the opu_analysis script. The config is a list of biosample configs with the following structure:

[
	{
		# biosample-1 configs
	},
	{
		# biosample-2 configs
	},
	... # repeat to add more
]

Each biosample config contains 3 key-value pairs:

  • name: the name of the biosample, shown in outputs
  • color: the color of the biosample in the standard HTML format (#RRGGBB); currently only used in the HCA plot
  • file: the tabular data file(s) belonging to the biosample; either a string (the path to a single data file) or a list (paths to multiple data files)

An example is:

{
	"name": "biosample-1",
	"color": "#0000ff",
	"file": "biosample-1.data.tsv"
}

Another example with file being a list:

{
	"name": "biosample-2",
	"color": "#0000ff",
	"file": [
		"biosample-2.data_1.tsv",
		"biosample-2.data_2.tsv"
	]
}

All files in that list are combined under the same biosample. They cannot be told apart afterwards except by inspecting the spectra names in the outputs.

A working example of such a config JSON can be found in doc/example.json.
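Because the config is plain JSON, it can also be generated by a short script instead of being written by hand. The sketch below is not part of the package: the helper make_biosample_config, the second color value, and the file names are illustrative assumptions. It only builds the three keys described above and checks that the color follows the #RRGGBB format.

```python
import json
import re


def make_biosample_config(name, color, files):
    """Build one biosample entry with the keys name, color and file.

    Hypothetical helper for illustration; not part of raman-opu-analysis.
    """
    if not re.fullmatch(r"#[0-9A-Fa-f]{6}", color):
        raise ValueError("color must be in #RRGGBB format, got %r" % color)
    if not files:
        raise ValueError("at least one data file is required")
    # A single path is stored as a string, several paths as a list.
    file_value = files[0] if len(files) == 1 else list(files)
    return {"name": name, "color": color, "file": file_value}


# The config is a list of biosample entries (file names are placeholders).
config = [
    make_biosample_config("biosample-1", "#0000ff", ["biosample-1.data.tsv"]),
    make_biosample_config(
        "biosample-2",
        "#ff0000",
        ["biosample-2.data_1.tsv", "biosample-2.data_2.tsv"],
    ),
]

config_json = json.dumps(config, indent="\t")
print(config_json)
```

The printed text can be saved to a file and passed to opu_analysis in place of a hand-written config.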

Analysis

Here we use the example provided in the doc directory. First run cd doc to enter the directory, then run the following command in a terminal:

opu_analysis example.json \
	-b 5.0 -L 400 -H 1800 -N l2 \
	--metric cosine \
	--cutoff-threshold 0.7 \
	--opu-min-size 0.05 \
	--opu-labels example.json.opu_labels.txt \
	--opu-collection-prefix example.json.opu_collection \
	--opu-hca-plot example.json.hca.png \
	--abund-table example.json.opu_abund.tsv \
	--abund-alpha-diversity example.json.opu_alpha_diversity.tsv \
	--abund-stackbar-plot example.json.opu_abund.png \
	--abund-biplot example.json.opu_pca.png \
	--feature-rank-method fisher_score \
	--feature-rank-table example.json.opu_feature_rank.tsv \
	--feature-rank-plot example.json.opu_feature_rank.png

A full set of output files will be generated in the doc folder.
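For readers unfamiliar with the metric named by --metric cosine in the command above, the sketch below computes the standard cosine distance between two spectra in plain Python. It shows the textbook definition only; how opu_analysis uses the metric internally is not described here, and the function name and the sample vectors are assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle) between two equal-length intensity vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of wavenumbers")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for an all-zero spectrum")
    return 1.0 - dot / (norm_a * norm_b)


# Made-up intensity vectors sharing the same wavenumber axis.
spectrum_a = [0.2, 0.5, 0.1]
spectrum_b = [0.4, 1.0, 0.2]  # same shape as spectrum_a, twice the intensity
spectrum_c = [0.9, 0.0, 0.3]

print(cosine_distance(spectrum_a, spectrum_b))
print(cosine_distance(spectrum_a, spectrum_c))
```

Cosine distance depends only on the shape of a spectrum, not on its overall intensity, so two spectra that differ by a constant scale factor have a distance of (numerically almost) zero.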

Convert LabSpec txt Dumps

Data in the LabSpec txt dump format needs to be converted into the tabular format. The LabSpec format is a 2-column tab-delimited table, similar to the following:

401.23	0.39
402.56	0.01
...

The first column is the wavenumber and the second column is the intensity; each file encodes only one spectrum. To convert multiple spectra into a single file, first place them in the same directory (e.g. inputdir), then run the following:

opu_transform_labspec_txt \
	-x txt -b 5 -L 400 -H 1800 -N l2 \
	-o output.data.tsv \
	inputdir

The program will scan the inputdir folder, discover all files with the txt extension, and combine them into a single file, output.data.tsv. The other parameters in the above example instruct the program to bin the wavenumbers using a window size of 5, extract only the 400-1800 cm-1 wavenumber range, and apply l2 normalization to each spectrum. These data processing parameters are optional; however, the binning parameter (-b/--bin-size) is highly recommended. It forces the wavenumbers discovered in the different spectrum files to be aligned and unified. If the bin size is not given (meaning no binning is performed) and the wavenumbers differ between input spectrum files, an error will occur.
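To make the three processing steps concrete, the sketch below applies range extraction, binning, and l2 normalization to one spectrum in plain Python. It is an illustration under stated assumptions, not the package's implementation: the function names are hypothetical, each bin is assumed to start at a multiple of the bin size, intensities falling in the same bin are assumed to be averaged, and the third data line is made up. The actual script may place bin edges or aggregate intensities differently.

```python
import math
from collections import defaultdict


def parse_labspec_text(text):
    """Parse 2-column tab-delimited text into (wavenumber, intensity) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        wavenumber, intensity = line.split("\t")
        pairs.append((float(wavenumber), float(intensity)))
    return pairs


def preprocess(pairs, bin_size=5.0, low=400.0, high=1800.0):
    """Extract [low, high], bin the wavenumbers, then l2-normalize."""
    bins = defaultdict(list)
    for wavenumber, intensity in pairs:
        if low <= wavenumber <= high:
            # Assumption: a bin is labeled by its lower edge.
            bin_start = math.floor(wavenumber / bin_size) * bin_size
            bins[bin_start].append(intensity)
    # Assumption: intensities within one bin are averaged.
    binned = {k: sum(v) / len(v) for k, v in sorted(bins.items())}
    norm = math.sqrt(sum(x * x for x in binned.values()))
    if norm == 0:
        raise ValueError("cannot l2-normalize an empty or all-zero spectrum")
    return {k: x / norm for k, x in binned.items()}


# Two lines from the format example above plus one made-up line.
sample = "401.23\t0.39\n402.56\t0.01\n406.10\t0.30\n"
spectrum = preprocess(parse_labspec_text(sample))
for bin_start, value in spectrum.items():
    print(bin_start, round(value, 4))
```

After binning, every spectrum processed with the same bin size and range is indexed by the same set of bin labels, which is why binning lets files with slightly different wavenumber axes be combined into one table.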

Jupyter Notebook Usage

To use this package in a Jupyter notebook, or as a library integrated into other analysis pipelines, import it as follows:

from opu_analysis_lib import OPUAnalysis

The detailed analysis steps and function calls are shown in doc/example.ipynb.

