Tool to refine cg/wgMLST Schemas

Reason this release was yanked:

The pinned NumPy and pandas versions are incompatible.

Project description

SchemaRefinery - A Tool for Refining Genomic Schemas

For a more detailed description, refer to the SchemaRefinery documentation.

Description

The SchemaRefinery repository contains tools and modules for refining genomic schemas. These tools help identify paralogous loci and spurious genes, and annotate schemas. Each module exposes configurable command-line parameters for its processing steps.

Installation Guide

Follow these steps to install the SchemaRefinery package on your system.

  1. Install Git: Ensure that Git is installed on your system. You can install Git using the following command:

    # For macOS
    brew install git
    
    # For Ubuntu/Debian
    sudo apt-get install git
    
    # For Fedora
    sudo dnf install git
    
  2. Install Conda: Ensure that Conda is installed on your system. You can install Miniconda (a minimal Conda installer) using the following commands:

    # For Linux (x86_64); on macOS, download the matching Miniconda3-latest-MacOSX-*.sh installer instead
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh
    
  3. Clone the Repository: Clone the SchemaRefinery repository from GitHub:

    git clone https://github.com/MForofontov/Schema_Refinery.git
    
  4. Change Directory: Navigate to the cloned repository:

    cd Schema_Refinery
    
  5. Create a Conda Environment: It is recommended to create a conda environment to manage dependencies:

    conda create --name schema_refinery python=3.9
    conda activate schema_refinery
    
  6. Install Dependencies: Install BLAST (distributed through the bioconda channel) and the required Python packages:

    conda install -c bioconda -c conda-forge blast
    pip install -r requirements.txt
    
  7. Install the Package: Install the SchemaRefinery package from the repository root (pip is used here because invoking setup.py directly is deprecated by setuptools):

    pip install .
    
  8. Verify Installation: Verify the installation by running the following command:

    SR --help
    
  9. Deactivate the Conda Environment: Once you are done, you can deactivate the conda environment:

    conda deactivate
    

Modules

The repository includes the following main modules:

  1. IdentifyParalogousLoci: Identifies paralogous loci in a schema.
  2. IdentifySpuriousGenes: Identifies spurious genes in a schema.
  3. SchemaAnnotation: Annotates schemas with additional information.
  4. MatchSchemas: Matches schemas in a directory.
  5. DownloadAssemblies: Downloads genomic assemblies from various databases.
  6. AdaptLoci: Adapts loci in FASTA format to a schema format.

Dependencies

  • Python 3.9 or higher
  • Biopython library (pip install biopython)
  • NCBI Datasets command-line tool (datasets)
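
The Python requirement can be checked from the shell. The following is a minimal sketch: the `version_ge` helper is a hypothetical name introduced here, it compares only the major and minor components, and it assumes the interpreter is available as `python3`.

```bash
# Hypothetical helper: succeed when version $1 is at least version $2.
# Only the major and minor components are compared.
version_ge() {
  have_major=${1%%.*}
  have_rest=${1#*.}
  have_minor=${have_rest%%.*}
  need_major=${2%%.*}
  need_rest=${2#*.}
  need_minor=${need_rest%%.*}
  [ "$have_major" -gt "$need_major" ] ||
    { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

if command -v python3 >/dev/null 2>&1; then
  # Ask the interpreter for its own major.minor version.
  py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_ge "$py_version" 3.9; then
    echo "Python $py_version satisfies the 3.9 requirement"
  else
    echo "Python $py_version is older than 3.9"
  fi
else
  echo "python3 not found on PATH"
fi
```

Comparing the components numerically avoids the common pitfall of a plain string comparison, which would rank 3.10 below 3.9.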

Modules Usage

Each module can be used independently by running the SR command followed by the module name and the required command-line arguments. The commands below print the available options for each module:

IdentifyParalogousLoci

```bash
SR IdentifyParalogousLoci --help
```

IdentifySpuriousGenes

```bash
SR IdentifySpuriousGenes --help
```

SchemaAnnotation

```bash
SR SchemaAnnotation --help
```

MatchSchemas

```bash
SR MatchSchemas --help
```

DownloadAssemblies

```bash
SR DownloadAssemblies --help
```

AdaptLoci

```bash
SR AdaptLoci --help
```
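
To review every module's options in one pass, the help commands above can be wrapped in a loop. This is only a convenience sketch: the module names are the ones listed in this README, and the guard assumes `SR` is on the PATH only while the conda environment is active.

```bash
# Module names as listed in this README.
modules="IdentifyParalogousLoci IdentifySpuriousGenes SchemaAnnotation MatchSchemas DownloadAssemblies AdaptLoci"

if command -v SR >/dev/null 2>&1; then
  for module in $modules; do
    echo "===== $module ====="
    SR "$module" --help
  done
else
  echo "SR not found on PATH; activate the conda environment first." >&2
fi
```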

Troubleshooting

If you encounter issues while using the modules, consider the following troubleshooting steps:

  • Verify that the paths to the schema, output, and other directories are correct.
  • Check the output directory for any error logs or messages.
  • Increase the number of CPUs using the -c or --cpu option if the process is slow.
  • Ensure that you have a stable internet connection.
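
The path and CPU checks above can be scripted. The sketch below is illustrative only: `check_dir` is a hypothetical helper, and `./my_schema` and `./my_output` are placeholder paths standing in for whatever directories you pass to a module.

```bash
# Hypothetical helper: report whether a directory exists.
check_dir() {
  if [ -d "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Placeholder paths; replace them with the directories passed to the module.
check_dir "./my_schema" || true
check_dir "./my_output" || true

# Number of online CPUs, a sensible upper bound when choosing a value for -c/--cpu.
getconf _NPROCESSORS_ONLN || echo "could not determine CPU count"
```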

If the issue persists, please report it to the development team through GitHub issues.

Contributing

We welcome contributions to the SchemaRefinery project. If you would like to contribute, please follow these steps:

  1. Fork the repository on GitHub.
  2. Create a new branch for your feature or bugfix.
  3. Make your changes and commit them with a clear message.
  4. Push your changes to your forked repository.
  5. Create a pull request to the main repository.
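
Steps 2 and 3 can be sketched with standard Git commands, run inside your clone of the fork. The helper name `start_contribution` is hypothetical, and the branch name and commit message are supplied by you. Pushing (step 4) and opening the pull request (step 5) are left as manual steps.

```bash
# Hypothetical helper: create a branch and commit all current changes.
# Usage: start_contribution <branch-name> <commit-message>
start_contribution() {
  branch="$1"
  message="$2"
  git checkout -b "$branch" || return 1   # step 2: new branch
  git add -A || return 1                  # stage every change in the working tree
  git commit -m "$message" || return 1    # step 3: commit with a clear message
  echo "Committed on branch $branch"
  # Step 4 is left manual, for example:
  #   git push -u origin "$branch"
}
```

For example, `start_contribution fix-readme-typo "Fix typo in README"` followed by `git push -u origin fix-readme-typo` publishes the branch to your fork, assuming `origin` points at it.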

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file (https://www.gnu.org/licenses/gpl-3.0.html) for details.

Contact Information

For support or to report issues, please open an issue in the SchemaRefinery GitHub repository.
