
Kyoto Encyclopedia of Genes and Genomes Markup Language File parser and converter


KNeXT downloads and parses Kyoto Encyclopedia of Genes and Genomes (KEGG) Markup Language (KGML) files. The tool uses NetworkX's framework to create gene-only networks, and it can also generate mixed (gene, compound, pathway) networks. All output files are in TSV format. KNeXT can also write a TXT file of node x-y coordinates for use with NetworkX's graph visualization library, and it can convert KEGG IDs into UniProt and NCBI IDs.

Usage

Primary line: get-kgml [SPECIES_NAME]

  The KEGG NetworkX Topological (KNeXT) parser uses the KEGG
  API to gather all KGML files for a single species,
  given as a 3 to 4 letter KEGG organism code.

Options:
  --help    shows options and the website for KEGG organism codes
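
For orientation, the public KEGG REST API offers a per-organism pathway listing and a per-pathway KGML download. The sketch below only builds those URLs and makes no network calls. The helper names are hypothetical, and this is not necessarily how get-kgml works internally.

```python
# Hypothetical helpers that build KEGG REST URLs for one organism code.
# They illustrate the public KEGG REST API, not KNeXT's own implementation.
KEGG_REST = "https://rest.kegg.jp"


def pathway_list_url(species: str) -> str:
    """URL that lists all pathways for a KEGG organism code such as 'hsa'."""
    code = species.strip().lower()
    if not (3 <= len(code) <= 4 and code.isalnum()):
        raise ValueError(f"expected a 3 to 4 letter organism code, got {species!r}")
    return f"{KEGG_REST}/list/pathway/{code}"


def kgml_url(pathway_id: str) -> str:
    """URL that returns the KGML (XML) for one pathway, e.g. 'hsa04010'."""
    return f"{KEGG_REST}/get/{pathway_id}/kgml"


print(pathway_list_url("hsa"))
print(kgml_url("hsa04010"))
```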

Primary line: parse-genes [OPTIONS]

  The KNeXT parser uses NetworkX's
  framework to create gene-only representations of KGML files.

Options:
  file        KGML file
  --unique    TSV file's genes have a terminal modifier
  --graphics  outputs x-y axis coordinates
  --help      shows options and file types

  folder      folder containing KGML files
  --unique    TSV file's genes have a terminal modifier
  --graphics  outputs x-y axis coordinates
  --help      shows options and file types
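
To make the gene-only idea concrete, here is a minimal sketch that reads a KGML document with the standard library and keeps only relations whose two endpoints are gene entries. The KGML fragment is hand-written for illustration, the function name is hypothetical, and the sketch assumes the terminal modifier added by --unique is the KGML entry id, which this description does not state. Real KGML entries can also list several gene IDs in one name attribute; that case is not handled here.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written KGML fragment for illustration (not a real pathway).
KGML = """<pathway name="path:demo" org="hsa">
  <entry id="1" name="hsa:1111" type="gene"/>
  <entry id="2" name="hsa:2222" type="gene"/>
  <entry id="3" name="cpd:C00001" type="compound"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
  <relation entry1="1" entry2="3" type="PCrel">
    <subtype name="compound" value="3"/>
  </relation>
</pathway>"""


def gene_edges(kgml_text, unique=False):
    """Return (entry1, entry2, type, value, name) rows between gene entries."""
    root = ET.fromstring(kgml_text)
    # Map KGML entry id -> KEGG name, keeping gene entries only.
    genes = {e.get("id"): e.get("name")
             for e in root.iter("entry") if e.get("type") == "gene"}
    rows = []
    for rel in root.iter("relation"):
        a, b = rel.get("entry1"), rel.get("entry2")
        if a not in genes or b not in genes:
            continue  # skip relations that touch compounds, maps, etc.
        # Assumption: the "terminal modifier" is the KGML entry id.
        src = f"{genes[a]}-{a}" if unique else genes[a]
        dst = f"{genes[b]}-{b}" if unique else genes[b]
        for sub in rel.iter("subtype"):
            rows.append((src, dst, rel.get("type"), sub.get("value"), sub.get("name")))
    return rows


for row in gene_edges(KGML, unique=True):
    print("\t".join(row))
```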

Primary line: parse-mixed [OPTIONS]

  KNeXT parser creates mixed
  (genes, compounds, pathways) representations of KGML files.

Options:
  file        KGML file
  --unique    TSV file's nodes have a terminal modifier
  --graphics  outputs x-y axis coordinates
  --help      shows options and file types

  folder      folder containing KGML files
  --unique    TSV file's nodes have a terminal modifier
  --graphics  outputs x-y axis coordinates
  --help      shows options and file types
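
The x-y coordinates written by --graphics come from the graphics elements that KGML attaches to each entry. As a rough sketch, the snippet below collects them into a dictionary of node name to (x, y), which is the shape NetworkX's drawing functions accept as their pos argument. The KGML fragment and the function name are made up for illustration, and the layout of KNeXT's own TXT file is not described here, so this is not a reader for that file.

```python
import xml.etree.ElementTree as ET

# Hand-written KGML fragment with graphics elements (illustrative only).
KGML = """<pathway name="path:demo" org="hsa">
  <entry id="1" name="hsa:1111" type="gene">
    <graphics name="GENE1" type="rectangle" x="100" y="200" width="46" height="17"/>
  </entry>
  <entry id="2" name="cpd:C00001" type="compound">
    <graphics name="C00001" type="circle" x="150" y="80" width="8" height="8"/>
  </entry>
</pathway>"""


def node_positions(kgml_text):
    """Return {entry name: (x, y)} for every entry that carries coordinates."""
    root = ET.fromstring(kgml_text)
    pos = {}
    for entry in root.iter("entry"):
        graphics = entry.find("graphics")
        if graphics is None:
            continue
        x, y = graphics.get("x"), graphics.get("y")
        if x is None or y is None:
            continue  # some graphics elements may lack point coordinates
        pos[entry.get("name")] = (float(x), float(y))
    return pos


print(node_positions(KGML))
```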

Primary line: convert-network [OPTIONS]

  KNeXT parser converts KEGG entry IDs in TSV output files into
  UniProt or NCBI IDs.

Options:
  file        PATH: path to TSV file
  species     TEXT: KEGG 3 to 4 letter organism code
  --uniprot   optional flag for output: use if UniProt IDs are the desired output
  --unique    optional flag for output: use if the TSV file has terminal modifiers
  --graphics  PATH: graphics file
  --help      optional flag: shows options

Options:
  folder      PATH: path to folder containing TSV files
  species     TEXT: KEGG 3 to 4 letter organism code
  --uniprot   optional flag for output: use if UniProt IDs are the desired output
  --unique    optional flag for output: use if the TSV file has terminal modifiers
  --graphics  PATH: path to folder containing graphics files
  --help      optional flag: shows options
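
The conversion step can be pictured as a lookup from KEGG gene IDs to external IDs that leaves any terminal modifier in place. The sketch below assumes a two-column, tab-separated mapping in the shape returned by the KEGG REST conv operation (for example https://rest.kegg.jp/conv/uniprot/hsa). The sample lines are illustrative rather than fetched, the function names are hypothetical, and KNeXT's actual handling of one-to-many mappings is not documented here.

```python
# Illustrative mapping lines in the two-column shape of KEGG's "conv" output.
CONV_TEXT = "hsa:10000\tup:Q9Y243\nhsa:1147\tup:O15111\n"


def load_mapping(conv_text):
    """Map KEGG gene IDs to external IDs, dropping the database prefix."""
    mapping = {}
    for line in conv_text.splitlines():
        if not line.strip():
            continue
        kegg_id, external = line.split("\t")
        # Assumption: when a gene maps to several IDs, keep the first one seen.
        mapping.setdefault(kegg_id, external.split(":", 1)[1])
    return mapping


def convert_id(node, mapping, unique=False):
    """Convert one node ID; with unique=True, keep the trailing '-<modifier>'."""
    if unique and "-" in node:
        base, suffix = node.rsplit("-", 1)
        return f"{mapping.get(base, base)}-{suffix}"
    return mapping.get(node, node)  # unmapped IDs pass through unchanged


mapping = load_mapping(CONV_TEXT)
print(convert_id("hsa:10000-23", mapping, unique=True))
print(convert_id("hsa:1147", mapping))
```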

For example, KNeXT can obtain all KGML files for Homo sapiens:

$ get-kgml hsa

The resulting output folder can be used to parse the files:

$ parse-genes folder kgml_hsa --graphics

The resulting output folder can be used to convert the TSV files and graphics file:

$ convert-network folder kegg_gene_network_hsa hsa --graphics kegg_gene_network_hsa

Inputs

The parsing commands only accept KGML files downloaded from KEGG. The output of each command can be used as input to the next.

Inputs to convert-network must be in TSV format. Column names are mandatory and should not be changed.
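
Because the column names are mandatory, it can help to check a TSV header before passing the file on. This is a small sketch using only the standard library; the required column list is taken from the example tables in this description, and the function name is hypothetical.

```python
import csv
import io

# Column names as they appear in the example TSV files in this description.
REQUIRED_COLUMNS = ["entry1", "entry2", "type", "value", "name"]


def check_header(tsv_text):
    """Return the header row, or raise ValueError if a required column is missing."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required column(s): {', '.join(missing)}")
    return header


sample = "entry1\tentry2\ttype\tvalue\tname\nhsa:1111\thsa:2222\tPPrel\t-->\tactivation\n"
print(check_header(sample))
```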

Data Frames

Example TSV file with KEGG IDs

entry1            entry2        type   value  name
hsa:100271927-98  hsa:22800-12  PPrel  -->    activation
hsa:100271927-98  hsa:22808-12  PPrel  -->    activation
hsa:100271927-98  hsa:3265-12   PPrel  -->    activation

Example TSV file after UniProt conversion with --unique output

entry1     entry2      type         value  name
Q9Y243-23  O15111-59   PPrel        -->    activation
Q9Y243-23  Q6GYQ0-240  PPrel,PPrel  |,+p   inhibition,phosphorylation
Q9Y243-23  O14920-59   PPrel        -->    activation
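
Some rows carry several relation subtypes joined by commas in the type, value and name columns. When working with these files downstream, one option is to expand such rows into one record per subtype. The following is a minimal standard-library sketch of that idea; the function name is hypothetical and the sample text mirrors the example rows in this section.

```python
import csv
import io

# Sample rows in the column layout shown in this section.
TSV = (
    "entry1\tentry2\ttype\tvalue\tname\n"
    "Q9Y243-23\tO15111-59\tPPrel\t-->\tactivation\n"
    "Q9Y243-23\tQ6GYQ0-240\tPPrel,PPrel\t|,+p\tinhibition,phosphorylation\n"
)


def expand_rows(tsv_text):
    """Split comma-joined type/value/name cells into one record per subtype."""
    records = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        types = row["type"].split(",")
        values = row["value"].split(",")
        names = row["name"].split(",")
        # zip pairs the i-th type, value and name of the same edge.
        for rel_type, value, name in zip(types, values, names):
            records.append({**row, "type": rel_type, "value": value, "name": name})
    return records


for record in expand_rows(TSV):
    print(record["entry1"], record["entry2"], record["name"])
```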

Installation

The current release is v1.0.0. Installation is via pip:

$ pip install https://github.com/everest/knext/knext-1.0.0.tar.gz

The repository can also be cloned and installed with Poetry:

$ git clone https://github.com/everest/knext.git
$ cd knext
$ poetry shell
$ poetry install
$ poetry run [get-kgml, parse-genes, parse-mixed, or convert-network]

Requirements

Requirements are listed in pyproject.toml.


