NESS: Vector-based Alignment-free Sequence Search
NESS
NESS is an alignment-free sequence search tool based on word embeddings. It is still under development, and the code in this repository is a proof of concept distributed under the GPL v3 license.
Usage
The NESS CLI currently provides the following commands:
ness build_model
Creates a Word2Vec model from a multi-FASTA file. For DNA sequences, use --both-strands.
$ ness build_model \
--input swissprot.fasta \
--output swissprot.model
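Word2Vec needs "sentences" made of "words", so a sequence has to be tokenized before a model can be trained. The sketch below shows one common way to do that, splitting each sequence into overlapping k-mers and optionally adding the reverse complement strand. It is only an illustration of the idea: the helper names, the choice of k=3, and the assumption that --both-strands corresponds to adding the reverse complement are not taken from the NESS source code.

```python
# Illustrative sketch only: function names and k=3 are assumptions,
# not NESS defaults or NESS internals.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq: str, k: int = 3) -> list:
    """Split a sequence into overlapping k-mers (the 'words')."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sentences(seq: str, k: int = 3, both_strands: bool = False) -> list:
    """Build one k-mer 'sentence' per strand of the sequence."""
    result = [kmers(seq, k)]
    if both_strands:
        result.append(kmers(reverse_complement(seq), k))
    return result


# Forward strand k-mers first, then reverse complement k-mers.
print(sentences("ATGCGT", k=3, both_strands=True))
```

Lists of k-mers like these are the kind of input that a Word2Vec implementation is typically trained on.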
ness build_database
Similarly to makeblastdb, formats a sequence database with vectors computed using a previously built model. For DNA sequences, use --both-strands.
$ ness build_database \
--input swissprot.fasta \
--model swissprot.model \
--output swissprot
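To make "vectors computed using a model" more concrete, here is a minimal sketch of one simple way to turn a sequence into a single vector: average the embedding vectors of its k-mers. This is an assumption for illustration; how NESS actually aggregates k-mer vectors is not described here. The toy embedding table and the function names are hypothetical.

```python
# Illustrative sketch only: mean pooling and all names below are assumptions,
# not a description of how NESS builds its database.
def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sequence_vector(seq, embeddings, k=3):
    """Average the vectors of all k-mers found in the embedding table.

    Returns None when no k-mer of the sequence is in the vocabulary.
    """
    vectors = [embeddings[w] for w in kmers(seq, k) if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Toy 2-dimensional embeddings standing in for a trained model.
toy_embeddings = {"MKT": [1.0, 0.0], "KTA": [0.0, 1.0], "TAY": [1.0, 1.0]}
vec = sequence_vector("MKTAY", toy_embeddings)
print(vec)
```

With one fixed-length vector per sequence, a database reduces to a table of names and vectors that can be compared without any alignment.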
ness search
Similarly to the blast* programs, compares a multi-FASTA file against the previously formatted database.
$ ness search --input sequences.fasta --database swissprot --output hits.csv
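Conceptually, a vector-based search ranks database entries by how close their vectors are to the query vector. The sketch below uses cosine similarity over a small in-memory dictionary. The similarity measure, the data layout, and the names `cosine` and `search` are assumptions chosen for illustration; they do not describe the actual NESS implementation or the columns of its output file.

```python
import math


# Illustrative sketch only: cosine similarity and a brute-force scan are
# assumptions, not the method NESS is known to use.
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, database, top=2):
    """Return the `top` (name, score) pairs most similar to the query."""
    scored = [(name, cosine(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)[:top]


# Toy database of sequence vectors keyed by hypothetical sequence names.
database = {"seqA": [1.0, 0.0], "seqB": [0.7, 0.7], "seqC": [0.0, 1.0]}
hits = search([1.0, 0.1], database)
for name, score in hits:
    print(name, round(score, 3))
```

A brute-force scan like this is enough to show the idea; larger databases would usually call for an approximate nearest-neighbor index.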
Cite
Kremer, FS et al (2021). NESS: an word embedding-based tool for alignment-free sequence search. Available at: https://github.com/omixlab/ness.
Hashes for ness_search-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 97fefe04775c8e8af34f5cbe1b7348e2e05f9fd3753f45b138c580b85277c7bc
MD5 | ec1b8f15ceae25381d90596eef251e21
BLAKE2b-256 | 67e69d71a829e52d41a09f89f2f3e43f0e09cbe59863d0bbfc61191718448ff4