
The Cytocraft package predicts chromosome configuration from subcellular-resolution spatial transcriptomics data.


Cytocraft

Overview

The Cytocraft package generates a 3D reconstruction of transcription centers from subcellular-resolution spatial transcriptomics data.

Cytocraft employs a multi-start optimization strategy, running the reconstruction algorithm from multiple distinct random initializations (default: 5) and selecting the solution with the lowest final reconstruction error. This approach improves robustness and helps avoid local minima.
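The multi-start idea can be illustrated with a small, generic sketch. It is not Cytocraft's reconstruction code: the toy objective `bumpy`, the helper `local_search`, and the function `multi_start` are hypothetical names chosen for this example. It only shows the pattern of running a local optimizer from several random starting points and keeping the run with the lowest final error.

```python
import random


def local_search(objective, x0, rng, steps=200, step_size=0.5):
    """Greedy random-perturbation descent from x0; returns (x, error)."""
    x, err = x0, objective(x0)
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        cand_err = objective(candidate)
        if cand_err < err:  # accept only improvements
            x, err = candidate, cand_err
    return x, err


def multi_start(objective, n_starts=5, seed=0, low=-10.0, high=10.0):
    """Run local_search from n_starts random initial points; keep the best run."""
    rng = random.Random(seed)
    runs = []
    for start_id in range(n_starts):
        x0 = rng.uniform(low, high)
        runs.append((start_id, *local_search(objective, x0, rng)))
    best = min(runs, key=lambda run: run[2])  # lowest final error wins
    return best, runs


def bumpy(x):
    """Toy objective with two local minima (near x = -2 and x = 2)."""
    return (x * x - 4) ** 2 + x


best, runs = multi_start(bumpy, n_starts=5, seed=0)
print("best start:", best[0], "final error:", best[2])
```

A single start can settle in the shallower minimum of `bumpy`; comparing the final errors of several starts makes that outcome less likely to be the one reported.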

System Requirements

  • Operating Systems:
    • Ubuntu 20.04
    • macOS 11.2
    • Windows 10
  • Software Dependencies:
    • Python 3.8 or higher
    • Required Python packages: numpy, pandas, scanpy, matplotlib, shapely
  • Tested Versions:
    • Python 3.8, 3.9, 3.12
  • Hardware Requirements:
    • No specific non-standard hardware required

Installation

pip install cytocraft

Installation Guide

  1. Ensure Python 3.8 or higher is installed.
  2. Run the command above to install Cytocraft and its dependencies.
  3. Typical install time on a standard desktop computer is approximately 1-5 minutes.
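After installing, one way to confirm the environment is to check the interpreter version and look up the installed distribution through the standard library. The helper names `python_is_supported` and `installed_version` are made up for this sketch; only `sys.version_info` and `importlib.metadata` are real standard-library APIs.

```python
import sys
from importlib import metadata


def python_is_supported(min_version=(3, 8)):
    """True if the running interpreter is at least min_version."""
    return sys.version_info >= min_version


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.8 or higher:", python_is_supported())
    print("cytocraft version:", installed_version("cytocraft"))
```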

Interactive Mode Usage

Import

import cytocraft.craft as cc

Read input

gem_path = 'path_to_your_data.csv'
gem = cc.read_gem_as_csv(gem_path, sep=',')
adata = cc.read_gem_as_adata(gem_path, sep=',')
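Both readers take a `sep` argument, so it can help to confirm the separator of the input file before calling them. The sketch below is a simple heuristic and not part of Cytocraft: `guess_separator` is a hypothetical helper, and the header strings in the usage lines use placeholder column names, not the actual GEM column layout.

```python
def guess_separator(header_line, candidates=(",", "\t", ";")):
    """Pick the candidate separator that occurs most often in the header line."""
    counts = {sep: header_line.count(sep) for sep in candidates}
    best = max(counts, key=counts.get)
    if counts[best] == 0:
        raise ValueError("none of the candidate separators occur in the header")
    return best


# Placeholder headers (column names are illustrative only)
print(repr(guess_separator("gene,x,y,count,cell")))
print(repr(guess_separator("gene\tx\ty\tcount\tcell")))

# With a real file, the first line could be read like this:
# with open(gem_path) as fh:
#     sep = guess_separator(fh.readline())
```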

Run Cytocraft (see the tutorial for more details)

adata = cc.craft(
    gem=gem,
    adata=adata,
    species='YourSpecies',
    seed=your_seed,  # Replace with your integer random seed
    samplename='your_sample_name',
    n_starts=5,  # Number of random initializations (default: 5)
)

The n_starts parameter controls the multi-start optimization strategy. Cytocraft will run the reconstruction from n_starts distinct random initializations and automatically select the solution with the lowest final reconstruction error.

CLI Mode Usage

python craft.py [-h] [-p/-n PERCENT/NUMBER] [-t GENE_FILTER_THRESH] [-r RMSD_THRESH] [--sep \t] [-c CELLTYPE] [--ctkey CTKEY] [--cikey CIKEY] [--csep \t] [--seed SEED] [--n_starts N_STARTS] -i gem_path -o out_path --species {Human,Mouse,Axolotls,Monkey}

Required arguments:

-i,--gem_path Input: Path of input gene expression matrix file

-o,--out_path Output: Directory to save results

--species Species of the input data, e.g. {Human,Mouse,Axolotls,Monkey}

Optional arguments:

-h, --help Show this help message and exit

-p/-n, --percent/--number Percent or number of anchor genes used for rotation derivation, default: 0.001 (--percent) / 10 (--number)

-t, --gene_filter_thresh The maximum allowable proportion of np.nan values in a column (representing a gene) of the observed transcription centers (Z), default: 0.90

-r, --rmsd_thresh RMSD threshold. The iteration loop exits once the computed RMSD is less than or equal to this threshold, which is treated as convergence. default: 0.01

--sep Separator of the input gene expression matrix file

-c, --celltype Path of the annotation file containing cell types, multi-celltype mode only

--ctkey Key of celltype column in the cell type file, multi-celltype mode only

--cikey Key of cell id column in the cell type file, multi-celltype mode only

--csep Separator of the annotation file, multi-celltype mode only, default: \t

--seed Random seed, default: random integer between 0 and 1000

--n_starts Number of random initializations for multi-start optimization strategy, default: 5
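The two thresholds above (-t and -r) can be read as simple numeric tests. The sketch below illustrates them in plain Python under stated assumptions: `Z` is taken to be a list of rows whose columns represent genes, and the RMSD is computed directly between two lists of 3D points without any alignment step. The function names are hypothetical, and Cytocraft's own implementation may differ.

```python
import math


def nan_fraction_per_gene(Z):
    """Proportion of NaN entries in each column (gene) of Z, a list of rows."""
    n_rows = len(Z)
    return [sum(math.isnan(v) for v in column) / n_rows for column in zip(*Z)]


def filter_genes(Z, thresh=0.90):
    """Keep columns whose NaN proportion is at most thresh; return (Z_kept, kept_indices)."""
    fractions = nan_fraction_per_gene(Z)
    keep = [j for j, frac in enumerate(fractions) if frac <= thresh]
    return [[row[j] for j in keep] for row in Z], keep


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized lists of points."""
    squared = [sum((p - q) ** 2 for p, q in zip(pa, pb)) for pa, pb in zip(a, b)]
    return math.sqrt(sum(squared) / len(squared))


def converged(previous, current, rmsd_thresh=0.01):
    """True once two successive configurations differ by at most rmsd_thresh."""
    return rmsd(previous, current) <= rmsd_thresh


nan = float("nan")
Z = [[1.0, nan], [2.0, nan]]  # second gene is entirely NaN
filtered, kept = filter_genes(Z, thresh=0.90)
print("kept gene columns:", kept)
```

In an iterative loop, `converged` would be evaluated on the configurations from consecutive iterations, and the loop would stop at the first `True`.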

One-celltype example:

python craft.py -i ./data/SS200000108BR_A3A4_scgem.Spinal_cord_neuron.csv -o ./results/ --species Mouse

Multi-celltype example:

python craft.py -i ./data/SSSS200000108BR_A3A4_scgem.csv -o ./results/ --species Mouse --celltype ./data/cell_feature.csv --ctkey cell_type --cikey cell_id
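When several samples need to be processed, the command line can be assembled from Python. This is a minimal sketch that only uses flags listed in the usage line above; `build_craft_command` is a hypothetical helper, and the paths in the usage lines are placeholders.

```python
import shlex


def build_craft_command(gem_path, out_path, species, celltype=None,
                        ctkey=None, cikey=None, seed=None, n_starts=None):
    """Assemble the craft.py argument list; optional flags are added only when set."""
    cmd = ["python", "craft.py", "-i", gem_path, "-o", out_path, "--species", species]
    optional = {
        "--celltype": celltype,
        "--ctkey": ctkey,
        "--cikey": cikey,
        "--seed": seed,
        "--n_starts": n_starts,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_craft_command("./data/sample.csv", "./results/", "Mouse", n_starts=5)
print(shlex.join(cmd))

# To actually launch the run (requires craft.py in the working directory):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.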

Demo

Instructions to run on demo data

  1. Download the example data from the repository.
  2. Open and run the tutorial notebook provided in the repository to process the demo data step-by-step.

Expected Output

  • An h5ad-format AnnData (adata) file containing the 3D reconstruction, rotation matrices of cells, and all other input information.
    • The adata.uns["multistart_info"] field contains metadata about the multi-start optimization, including:
      • n_starts: Number of initializations used
      • best_start_id: Which initialization produced the best result
      • best_seed: Random seed of the best run
      • best_rmsd: Final RMSD of the best run
      • all_rmsds: List of final RMSDs from all runs
      • convergence_status: Convergence status for each run
  • A log file containing the following information:
    • Species
    • Sample Name
    • Seed
    • Cell Number
    • Gene Number
    • Arguments
    • Task ID
    • Multi-start optimization summary
    • RMSD values in each loop for each start
    • Number of transcription centers in the configuration
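The multi-start metadata listed above can be condensed into a short report. In this sketch a plain dictionary stands in for adata.uns["multistart_info"]; the numeric values are invented for illustration, and `summarize_multistart` is a hypothetical helper. The convergence_status field is left out because its exact format is not specified here.

```python
def summarize_multistart(info):
    """Format the key multi-start fields as a short text report."""
    spread = max(info["all_rmsds"]) - min(info["all_rmsds"])
    lines = [
        f"starts run: {info['n_starts']}",
        f"best start: {info['best_start_id']} (seed {info['best_seed']})",
        f"best RMSD: {info['best_rmsd']:.4f}",
        f"RMSD spread across starts: {spread:.4f}",
    ]
    return "\n".join(lines)


# Illustrative stand-in for adata.uns["multistart_info"]; values are made up
example_info = {
    "n_starts": 3,
    "best_start_id": 1,
    "best_seed": 42,
    "best_rmsd": 0.012,
    "all_rmsds": [0.05, 0.012, 0.03],
}
print(summarize_multistart(example_info))
```

A large spread between the final RMSDs suggests the starts ended in different local minima, which is the situation the multi-start strategy is meant to guard against.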

Expected Run Time

  • Approximately 5-20 hours.

