
Collection of scripts to transform and remap MaxQuant output files


MaxQuant Handler

Setup for proper usage

Install package

pip install mqhandler

Run mqhandler

1. Filter IDs

Filter proteins or genes by organism and/or decoy names.

1.1 Imports

# imports
import pandas as pd
from mqhandler import filter_ids as fi
from mqhandler.mq_utils.runner_utils import find_delimiter

1.2 Load your data

# load data into a dataframe, detecting the delimiter automatically
file = "<file>"  # path to your MaxQuant output file
data = pd.read_table(file, sep=find_delimiter(file)).fillna("")
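
For intuition about what an automated delimiter finder does, here is a minimal sketch built on Python's standard csv.Sniffer. It is not mqhandler's implementation of find_delimiter. The helper name sniff_delimiter and the sample text are assumptions made for illustration.

```python
import csv


def sniff_delimiter(sample: str, candidates: str = "\t,;") -> str:
    """Guess the column delimiter of a text sample (illustrative helper, not part of mqhandler)."""
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Toy sample that mimics a tab-separated table
sample = "Protein IDs\tGene names\nP04637\tTP53\nP38398\tBRCA1\n"
print(repr(sniff_delimiter(sample)))  # '\t'
```

Restricting the candidates to tab, comma and semicolon keeps the guess from landing on characters that merely repeat inside the cells.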

1.3 Set preferences

organism = "human"  # Organism the IDs should match
in_type = "protein"  # Type of the source IDs
protein_column = "Protein IDs"  # Name of the column with protein IDs
gene_column = "Gene names"  # Name of the column with gene names
action = "delete"  # What to do if an ID cell is empty after filtering: keep the empty cell, delete it, or fill it based on the gene name
reviewed = True  # Whether newly retrieved protein IDs should be reduced to reviewed ones
decoy = False  # Whether protein IDs from the decoy FASTA (REV__, CON__) should be kept

1.4 Filter IDs

# filter the protein IDs using the preferences set above
filtered_data = fi.filter_protein_ids(data = data, id_column = protein_column, organism = organism,
                                      decoy = decoy, action = action, gene_column = gene_column,
                                      reviewed = reviewed)
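
To illustrate the decoy part of this step: MaxQuant writes several IDs into one cell, separated by semicolons, and entries from the decoy FASTA carry the REV__ or CON__ prefix. The sketch below drops such entries from a single cell. It is not the code behind filter_protein_ids. The helper name drop_decoy_ids is hypothetical, and the organism check is left out.

```python
DECOY_PREFIXES = ("REV__", "CON__")


def drop_decoy_ids(cell: str, sep: str = ";") -> str:
    """Remove IDs with a decoy prefix from one delimited cell (hypothetical helper)."""
    kept = [pid for pid in cell.split(sep) if pid and not pid.startswith(DECOY_PREFIXES)]
    return sep.join(kept)


print(drop_decoy_ids("P04637;REV__Q9Y6K9;CON__P02768"))  # P04637
print(repr(drop_decoy_ids("REV__P38398")))               # ''
```

A cell that ends up empty, as in the second call, is the situation the action preference decides on.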

2. Remap gene names

Remap gene names in a MaxQuant file.

2.1 Imports

# imports
import pandas as pd
from mqhandler import remap_genenames as rg
from mqhandler.mq_utils.runner_utils import find_delimiter

2.2 Load your data

# load data into a dataframe, detecting the delimiter automatically
file = "<file>"  # path to your MaxQuant output file
data = pd.read_table(file, sep=find_delimiter(file)).fillna("")

2.3 Set preferences

organism = "human"  # Organism the IDs should match
mode = "<mode>"  # Mode of refilling, one of the modes listed below
protein_column = "Protein IDs"  # Name of the column with protein IDs
gene_column = "Gene names"  # Name of the column with gene names
skip_filled = True  # Whether already filled gene names should be skipped
fasta = "<file>"  # Path to the FASTA file, needed for mode "all" or "fasta"

Modes

all              Use primarily FASTA information and additionally UniProt information.
fasta            Use information extracted from FASTA headers.
uniprot          Use mapping information from UniProt and use all gene names.
uniprot_primary  Use mapping information from UniProt and use only the primary gene names.
uniprot_one      Use mapping information from UniProt and use only the single most frequent gene name.
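
The difference between keeping all gene names and keeping only the single most frequent one can be sketched as follows. This is a toy illustration under stated assumptions, not mqhandler's code: the helper most_frequent_name and the candidate list are made up, and the tie-breaking rule (first name seen wins) is an assumption.

```python
from collections import Counter


def most_frequent_name(names):
    """Return the most common gene name; on ties the name seen first wins (assumption)."""
    if not names:
        return ""
    return Counter(names).most_common(1)[0][0]


# Toy candidates, as if collected for the protein IDs of one row
candidates = ["TP53", "P53", "TP53"]
print(";".join(sorted(set(candidates))))  # all names: P53;TP53
print(most_frequent_name(candidates))     # single name: TP53
```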

2.4 Remap gene names

# remap the gene names using the preferences set above
remapped_data = rg.remap_genenames(data = data, mode = mode, protein_column = protein_column, gene_column = gene_column,
                                   skip_filled = skip_filled, organism = organism, fasta = fasta)

3. Get orthologs

Map gene names from the source organism to their orthologs in the target organism.

3.1 Imports

# imports
import pandas as pd
from mqhandler import map_orthologs as mo
from mqhandler.mq_utils.runner_utils import find_delimiter

3.2 Load your data

# load data into a dataframe, detecting the delimiter automatically
file = "<file>"  # path to your MaxQuant output file
data = pd.read_table(file, sep=find_delimiter(file)).fillna("")

3.3 Set preferences

source_organism = "rat"  # Organism the IDs originate from
target_organism = "human"  # Organism to get the orthologs for
protein_column = "Protein IDs" # Name of column with protein IDs
gene_column = "Gene names" # Name of column with gene names

3.4 Get orthologs

# get the ortholog gene names for the target organism
ortholog_data = mo.get_orthologs(data = data, protein_column = protein_column, gene_column = gene_column,
                                 organism = source_organism, tar_organism = target_organism)
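
Conceptually, this step translates each gene name of the source organism into its counterpart in the target organism. The sketch below shows the idea on a single semicolon-separated cell. The lookup table is a hand-written toy and the helper map_cell is hypothetical. How get_orthologs obtains its mapping and handles unmatched names is not covered here, so dropping unmatched names is an assumption.

```python
# Toy rat -> human lookup, hand-written for illustration only
TOY_ORTHOLOGS = {"Tp53": "TP53", "Brca1": "BRCA1"}


def map_cell(cell, table, sep=";"):
    """Translate each gene name in a delimited cell; names without a match are dropped (assumption)."""
    mapped = [table[name] for name in cell.split(sep) if name in table]
    return sep.join(mapped)


print(map_cell("Tp53;Brca1", TOY_ORTHOLOGS))      # TP53;BRCA1
print(repr(map_cell("Unknown1", TOY_ORTHOLOGS)))  # ''
```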

Download files

Source Distribution

mqhandler-0.0.29.tar.gz (30.4 kB)
