
Kyoto Encyclopedia of Genes and Genomes Markup Language File parser and converter

Project description

KNeXT downloads and parses Kyoto Encyclopedia of Genes and Genomes (KEGG) Markup Language (KGML) files. The tool uses NetworkX's framework to create gene-only networks, and it can also generate mixed (gene, compound, pathway) networks. All output files are in TSV format. KNeXT can also write a TXT file of node x-y coordinates for use with NetworkX's graph visualization functions, and it can convert KEGG IDs into UniProt and NCBI IDs. The information attached to each edge is preserved so that as much metadata as possible is retained.

Usage

Primary line: knext get-kgml [SPECIES_NAME]

  KEGG NetworkX Topological (KNeXT) parser uses the KEGG
  API to gather all KGML files for a single species.
  Input the species name as a 3 to 4 letter KEGG organism code.

Options:
  --help,   shows options and website for KEGG organism codes
  -d/--d,   directory in which to save output
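
How KNeXT calls the KEGG API internally is not shown here. As a rough sketch of the kind of request involved, the public KEGG REST service exposes a pathway listing per organism and a KGML download per pathway. The helper names below (validate_organism, pathway_list_url, kgml_url, fetch_text) are hypothetical and are not part of KNeXT; the check for lowercase letters only is also an assumption of this sketch.

```python
import re
import urllib.request

KEGG_REST = "https://rest.kegg.jp"  # public KEGG REST endpoint


def validate_organism(code: str) -> str:
    """Return the code unchanged if it looks like a 3 to 4 letter organism code."""
    # Assumption: codes are lowercase ASCII letters only.
    if not re.fullmatch(r"[a-z]{3,4}", code):
        raise ValueError(f"not a 3 to 4 letter organism code: {code!r}")
    return code


def pathway_list_url(organism: str) -> str:
    """URL that lists the pathway IDs available for one organism."""
    return f"{KEGG_REST}/list/pathway/{validate_organism(organism)}"


def kgml_url(pathway_id: str) -> str:
    """URL of the KGML file for one pathway, e.g. hsa04010."""
    return f"{KEGG_REST}/get/{pathway_id}/kgml"


def fetch_text(url: str) -> str:
    """Download a URL as text (needs network access; not called in this sketch)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


print(pathway_list_url("hsa"))
print(kgml_url("hsa04010"))
```

Looping fetch_text(kgml_url(...)) over every ID in the listing would approximate what get-kgml is described as doing for a single species.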

Primary line: knext genes [Input]

  KNeXT parser uses NetworkX's
  framework to create gene-only representations of KGML files.
  Genes between compounds are propagated before compounds are dropped.

Options:
  Input     KGML file or folder of KGML files to parse
  -r/--results      file or folder where output should be stored
  -g/--graphics     outputs TXT file of x-y axis coordinates
  -u/--unique       TSV file's genes have a terminal modifier
  --help    shows options and file types
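
The statement that genes between compounds are propagated before compounds are dropped can be read in more than one way. The sketch below assumes one interpretation: every compound node is removed, and each of its predecessors is linked directly to each of its successors. This is not KNeXT's actual code. The function name propagate_genes and the node IDs are hypothetical, and edge metadata is ignored for brevity.

```python
def propagate_genes(edges, is_compound):
    """Remove compound nodes, linking each predecessor to each successor."""
    edge_set = set(edges)
    compounds = {n for edge in edge_set for n in edge if is_compound(n)}
    for c in sorted(compounds):
        preds = {u for u, v in edge_set if v == c and u != c}
        succs = {v for u, v in edge_set if u == c and v != c}
        # Drop every edge that touches the compound ...
        edge_set = {(u, v) for u, v in edge_set if c not in (u, v)}
        # ... and bridge its neighbours directly.
        edge_set |= {(u, v) for u in preds for v in succs if u != v}
    return sorted(edge_set)


def is_compound(node):
    # Assumption: compound entries carry the "cpd:" prefix.
    return node.startswith("cpd:")


# Hypothetical mixed network: geneA -> C1 -> C2 -> geneB -> geneC
mixed_edges = [
    ("hsa:geneA", "cpd:C1"),
    ("cpd:C1", "cpd:C2"),
    ("cpd:C2", "hsa:geneB"),
    ("hsa:geneB", "hsa:geneC"),
]
gene_edges = propagate_genes(mixed_edges, is_compound)
print(gene_edges)
# [('hsa:geneA', 'hsa:geneB'), ('hsa:geneB', 'hsa:geneC')]
```

Removing compounds one at a time means chains of consecutive compounds collapse as well, so geneA ends up linked to geneB even though two compounds sat between them.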

Primary line: knext mixed [Input]

  KNeXT parser creates mixed (genes, compounds, pathways)
  representations of KGML files.

Options:
  Input     KGML file or folder of KGML files to parse
  -r/--results      file or folder where output should be stored
  -g/--graphics     outputs TXT file of x-y axis coordinates
  -u/--unique       TSV file's genes have a terminal modifier
  --help    shows options and file types

Primary line: knext convert [OPTIONS]

  KNeXT parser converts KEGG entry IDs in TSV output files into
  UniProt or NCBI IDs.

Options:
  file      PATH:   path to TSV file
  species   TEXT:   KEGG 3 to 4 letter organism code
  --uniprot optional flag for output:       use if UniProt IDs are the desired output
  --unique  optional flag for output:       use if the TSV file has terminal modifiers
  --graphics        PATH:   graphics file
  --help    optional flag:  shows options

Options:
  folder    PATH:   path to folder containing TSV files
  species   TEXT:   KEGG 3 to 4 letter organism code
  --uniprot optional flag for output:         use if UniProt IDs are the desired output
  --unique  optional flag for output:         use if the TSV file has terminal modifiers
  --graphics        PATH:       path to folder containing graphics files
  --help    optional flag:            shows options
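
To illustrate what an ID conversion of this kind involves, here is a minimal sketch. It assumes the mapping comes as tab-separated lines of the form `hsa:<id>` then `up:<accession>`, which is the shape returned by the KEGG REST conv operation. The sample lines are hand-written for illustration rather than fetched, and should be checked against KEGG. The functions parse_conv and convert_entry are hypothetical names, not KNeXT's API, and taking the first mapped ID is an arbitrary choice for this sketch.

```python
SAMPLE_CONV = "hsa:10000\tup:Q9Y243\nhsa:1147\tup:O15111\n"  # illustrative lines


def parse_conv(text):
    """Parse 'kegg_id<TAB>db:accession' lines into {kegg_id: [accession, ...]}."""
    mapping = {}
    for line in text.strip().splitlines():
        kegg_id, target = line.split("\t")
        mapping.setdefault(kegg_id, []).append(target.split(":", 1)[1])
    return mapping


def convert_entry(entry, mapping, unique=False):
    """Swap a KEGG ID for its first mapped ID, keeping any terminal modifier."""
    base, modifier = entry, ""
    if unique and "-" in entry:
        # With --unique output, the text after the last hyphen is the modifier.
        base, _, tail = entry.rpartition("-")
        modifier = "-" + tail
    targets = mapping.get(base)
    # Leave unmapped IDs untouched rather than dropping the edge.
    return (targets[0] if targets else base) + modifier


mapping = parse_conv(SAMPLE_CONV)
print(convert_entry("hsa:10000-23", mapping, unique=True))  # Q9Y243-23
print(convert_entry("hsa:1147", mapping))                   # O15111
```

The unique flag matters because, without it, the hyphenated suffix would be treated as part of the ID and the lookup would fail.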

For example, KNeXT can obtain all KGML files for Homo sapiens:

$ knext get-kgml hsa

The resulting output folder can be used to parse the files:

$ knext genes folder kgml_hsa --graphics

The resulting output folder can be used to convert the TSV files and graphics file:

$ knext convert folder kegg_gene_network_hsa hsa --graphics kegg_gene_network_hsa

Inputs

KNeXT only accepts KGML files downloaded from KEGG as parser input. The parser's output can be used in successive commands. All input to the convert command must be in TSV format. Column names are mandatory and should not be changed.

Data Frames

Example TSV file with KEGG IDs

  entry1              entry2          type    value   name
  hsa:100271927-98    hsa:22800-12    PPrel   -->     activation
  hsa:100271927-98    hsa:22808-12    PPrel   -->     activation
  hsa:100271927-98    hsa:3265-12     PPrel   -->     activation
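
A TSV file with these five columns can be read with Python's standard library alone. The sketch below embeds the example rows as a string, with the activation arrow written in ASCII as `-->`, and builds a simple adjacency mapping that keeps each edge's metadata. The function name read_edges is hypothetical.

```python
import csv
import io

SAMPLE_TSV = (
    "entry1\tentry2\ttype\tvalue\tname\n"
    "hsa:100271927-98\thsa:22800-12\tPPrel\t-->\tactivation\n"
    "hsa:100271927-98\thsa:22808-12\tPPrel\t-->\tactivation\n"
    "hsa:100271927-98\thsa:3265-12\tPPrel\t-->\tactivation\n"
)


def read_edges(handle):
    """Build an adjacency mapping: source -> list of (target, edge attributes)."""
    adjacency = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        attrs = {key: row[key] for key in ("type", "value", "name")}
        adjacency.setdefault(row["entry1"], []).append((row["entry2"], attrs))
    return adjacency


graph = read_edges(io.StringIO(SAMPLE_TSV))
for source, targets in graph.items():
    for target, attrs in targets:
        print(source, attrs["value"], target, attrs["name"])
```

With NetworkX installed, the same rows could instead be added to a `networkx.DiGraph` through `add_edge(entry1, entry2, **attrs)`.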

Example TSV file for UniProt conversion with --unique output

  entry1       entry2        type          value   name
  Q9Y243-23    O15111-59     PPrel         -->     activation
  Q9Y243-23    Q6GYQ0-240    PPrel,PPrel   |,+p    inhibition,phosphorylation
  Q9Y243-23    O14920-59     PPrel         -->     activation
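
Some rows carry several interactions for the same pair of nodes, joined with commas (for example `PPrel,PPrel` with `inhibition,phosphorylation`). The sketch below assumes the comma-separated fields line up position by position, which the example suggests but the source does not state. The function name split_edge is hypothetical.

```python
def split_edge(row):
    """Expand comma-joined type/value/name fields into one dict per interaction."""
    types = row["type"].split(",")
    values = row["value"].split(",")
    names = row["name"].split(",")
    if not (len(types) == len(values) == len(names)):
        raise ValueError(f"field counts differ in row: {row}")
    return [
        {"entry1": row["entry1"], "entry2": row["entry2"],
         "type": t, "value": v, "name": n}
        for t, v, n in zip(types, values, names)
    ]


row = {
    "entry1": "Q9Y243-23",
    "entry2": "Q6GYQ0-240",
    "type": "PPrel,PPrel",
    "value": "|,+p",
    "name": "inhibition,phosphorylation",
}
interactions = split_edge(row)
for interaction in interactions:
    print(interaction["name"], interaction["value"])
```

Rows with a single interaction pass through unchanged as a one-element list, so the function can be applied to every row of a file.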

Installation

The current release is v1.1.0. Installation is via pip:

$ pip install https://github.com/everest/knext/knext-1.0.0.tar.gz

Alternatively, the repository can be cloned and installed with Poetry:

$ git clone https://github.com/everest/knext.git
$ cd knext
$ poetry shell
$ poetry install
$ poetry run knext [get-kgml, genes, mixed, or convert]

Requirements

Requirements are listed in pyproject.toml.
