Righor-py
Companion to righor, used to publish the Python package. Install with pip install righor.
Load a model:
import righor
import matplotlib.pyplot as plt
import seaborn
import pandas as pd
from tqdm.notebook import tqdm
from collections import Counter
import numpy as np
igor_model = righor.load_model("human", "trb")
# alternatively, you can load a model from igor files
# igor_model = righor.load_model_from_files("params.txt", "marginals.txt", "anchor_v.csv", "anchor_j.csv")
Generate sequences fast:
# Create a generator object
generator = igor_model.generator(seed=42) # or igor_model.generator() to run it without a seed
# Generate 10'000 functional sequences (not out-of-frame, no stop codons, right boundaries)
for _ in tqdm(range(10000)):
    # generate_without_errors ignores the Igor error model; use "generate" if it is needed
    sequence = generator.generate_without_errors(functional=True)
    if "IGH" in sequence.cdr3_aa:
        print("TRB CDR3 containing \"IGH\":", sequence.cdr3_aa)
# Generate one sequence with a particular V/J gene family
V_genes = righor.genes_matching("TRBV5", igor_model) # returns all the V genes that match TRBV5
J_genes = righor.genes_matching("TRBJ", igor_model) # all the J genes
generator = igor_model.generator(seed=42, available_v=V_genes, available_j=J_genes)
generation_result = generator.generate_without_errors(functional=True)
print("Result:")
print(generation_result)
print("Explicit recombination event:")
print(generation_result.recombination_event)
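The generation loop above can be extended into a quick summary of the generated repertoire. The following is a minimal sketch, assuming only that each generated result exposes a cdr3_aa string as in the example above. cdr3_length_distribution is a hypothetical helper, not part of righor, and the sample strings are made-up placeholders.

```python
from collections import Counter


def cdr3_length_distribution(cdr3_list):
    """Return {length: fraction} for an iterable of CDR3 amino-acid strings."""
    counts = Counter(len(cdr3) for cdr3 in cdr3_list)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {length: n / total for length, n in sorted(counts.items())}


# Placeholder strings for illustration only. With righor, one could instead pass
# the cdr3_aa attribute of each generated sequence (assumption based on the
# example above).
example = ["CASSLGF", "CASSQETQYF", "CASSPGF", "CASSLGQYF"]
distribution = cdr3_length_distribution(example)
print(distribution)
```

The helper only depends on plain strings, so it can be checked without generating any sequences.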
Evaluate a given sequence:
my_sequence = "ACCCTCCAGTCTGCCAGGCCCTCACATACCTCTCAGTACCTCTGTGCCAGCAGTGAGGACAGGGACGTCACTGAAGCTTTCTTTGGACAAGGCACC"
# first align the sequence
align_params = righor.AlignmentParameters() # default alignment parameters
aligned_sequence = igor_model.align_sequence(my_sequence, align_params)
# we can also align a sequence from a CDR3 and a list of V-genes and J-genes (much faster)
# v_genes = righor.genes_matching("TRBV1", igor_model)
# j_genes = righor.genes_matching("TRBJ1", igor_model)
# igor_model.align_cdr3('TGTGTGAGAGATATTGTAGTAGTACCAGCTGCTAACCGCTTTCCTTCTTACTACTACTACTACTACATGGACGTCTGG', v_genes, j_genes)
# then evaluate it
infer_params = righor.InferenceParameters() # default inference parameters
result_inference = igor_model.evaluate(aligned_sequence, infer_params)
# Most likely scenario
best_event = result_inference.best_event
print(f"Probability that this specific event chain created the sequence: {best_event.likelihood / result_inference.likelihood:.2f}.")
print("Reconstructed sequence (without errors):", best_event.reconstructed_sequence)
print(f"Pgen: {result_inference.pgen:.1e}")
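The ratio printed above (best_event.likelihood divided by result_inference.likelihood) is the share of the total sequence likelihood carried by the single best scenario. The sketch below illustrates that arithmetic on its own, together with a log-scale view of Pgen, which is often easier to read for very small probabilities. scenario_share and log10_pgen are hypothetical helpers, not part of righor, and the numbers are made up.

```python
import math


def scenario_share(event_likelihood, total_likelihood):
    """Fraction of the total sequence likelihood carried by one scenario."""
    if total_likelihood <= 0:
        raise ValueError("total likelihood must be positive")
    return event_likelihood / total_likelihood


def log10_pgen(pgen):
    """Base-10 logarithm of a generation probability; -inf when pgen is 0."""
    return math.log10(pgen) if pgen > 0 else float("-inf")


# Made-up values standing in for best_event.likelihood,
# result_inference.likelihood and result_inference.pgen.
share = scenario_share(2.0e-12, 8.0e-12)
log_pgen = log10_pgen(1e-9)
print(f"share of best scenario: {share:.2f}")
print(f"log10(Pgen): {log_pgen:.1f}")
```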
Infer a model:
# here we just generate the sequences needed
generator = igor_model.generator()
example_seq = generator.generate(False)
sequences = [generator.generate(False).full_seq for _ in range(1000)]
# define parameters for the alignment and the inference
align_params = righor.AlignmentParameters()
align_params.left_v_cutoff = 40
infer_params = righor.InferenceParameters()
# generate a uniform model as a starting point
# (it's generally *much* faster to start from an already inferred model)
model = igor_model.copy()
model.p_ins_vd = np.ones(model.p_ins_vd.shape)
model.error_rate = 0
# align multiple sequences at once
aligned_sequences = model.align_all_sequences(sequences, align_params)
# multiple rounds of expectation-maximization to infer the model
models = {}
model = igor_model.uniform()
model.error_rate = 0
models[0] = model
for ii in tqdm(range(35)):
    models[ii+1] = models[ii].copy()
    models[ii+1].infer(aligned_sequences, infer_params)
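Because every round of the loop above is stored in models, one way to judge whether the expectation-maximization has settled is to compare a parameter between consecutive rounds. The following is a minimal sketch of such a check using the total variation distance. total_variation is a hypothetical helper, not part of righor, and the two weight lists are made-up placeholders.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    Both inputs are normalized first, so unnormalized weights are accepted.
    """
    p = [float(x) for x in p]
    q = [float(x) for x in q]
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("distributions must have positive total weight")
    return 0.5 * sum(abs(a / sum_p - b / sum_q) for a, b in zip(p, q))


# Made-up weights standing in for a parameter before and after one EM round.
previous = [1.0, 1.0, 1.0, 1.0]
current = [0.4, 0.3, 0.2, 0.1]
distance = total_variation(previous, current)
print(f"change between rounds: {distance:.3f}")
```

With the models dictionary above, one might call total_variation(models[ii].p_ins_vd, models[ii + 1].p_ins_vd) after each round and stop once the value becomes small. This assumes p_ins_vd is a one-dimensional array of insertion-length weights, which the source does not state explicitly.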