NESS: Vector-based Alignment-free Sequence Search
NESS
NESS is an alignment-free tool for sequence search based on word embedding and approximate nearest neighbor (ANN) search. The tool is still under development, and the code in this repository is a proof of concept distributed under the GPL v3 license.
Installation
$ pip install ness-search
Usage
Currently, the NESS CLI provides the following commands:
ness build_model
Creates a Word2Vec model from a multi-FASTA file. For DNA sequences, use --both-strands.
$ ness build_model \
--input swissprot.fasta \
--output swissprot.model
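To make the idea behind this step concrete, here is a minimal sketch of how a sequence could be turned into "words" before a Word2Vec-style model is trained on it. It is not NESS's code: the overlapping k-mer tokenization, the k-mer size of 3, and the function names `reverse_complement`, `kmers` and `sentences` are all assumptions made for illustration. The `both_strands` argument only mimics what a flag like --both-strands plausibly does for DNA, namely adding the reverse complement as a second "sentence".

```python
# Illustrative sketch only. The tokenization scheme, k-mer size and all
# function names are assumptions, not NESS internals.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers (the 'words')."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sentences(seq, k=3, both_strands=False):
    """Return one list of k-mers per strand considered."""
    strands = [seq]
    if both_strands:
        # Hypothetical equivalent of --both-strands for DNA input.
        strands.append(reverse_complement(seq))
    return [kmers(strand, k) for strand in strands]


if __name__ == "__main__":
    for tokens in sentences("ATGCGT", k=3, both_strands=True):
        print(tokens)
```

Token lists like these are the kind of input a word-embedding trainer expects; the actual training parameters used by NESS are not documented here.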
ness build_database
Similarly to makeblastdb, formats a sequence database with vectors computed using a previously built model. For DNA sequences, use --both-strands.
$ ness build_database \
--input swissprot.fasta \
--model swissprot.model \
--output swissprot
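One common way to get a single vector per sequence from a word-embedding model is to average the vectors of its k-mers. The sketch below shows that idea with a toy, hand-written embedding table. Whether NESS aggregates vectors this way is an assumption; `mean_vector` and `toy_embeddings` are hypothetical names, and the two-dimensional vectors are made up for the example.

```python
# Illustrative sketch only. Averaging k-mer vectors is an assumed
# aggregation strategy, not a documented NESS behaviour.


def mean_vector(tokens, embeddings):
    """Average the embedding vectors of the tokens found in the table.

    Tokens missing from the table are skipped. Returns None when no
    token is known.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy 2-dimensional embeddings standing in for a trained model.
toy_embeddings = {
    "ATG": [1.0, 0.0],
    "TGC": [0.0, 1.0],
    "GCG": [1.0, 1.0],
}

if __name__ == "__main__":
    # "CGT" is not in the table, so only three k-mers contribute.
    print(mean_vector(["ATG", "TGC", "GCG", "CGT"], toy_embeddings))
```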
ness search
Similarly to the blast* programs, compares a multi-FASTA file against the previously formatted database.
$ ness search \
--input sequences.fasta \
--database swissprot \
--output hits.csv
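Conceptually, a vector-based search ranks database sequences by how close their vectors are to the query vector. The sketch below does this with an exact, brute-force cosine-similarity scan, which is a simple stand-in for the approximate nearest neighbor (ANN) index that NESS uses. The names `cosine`, `top_hits` and `toy_database`, the similarity measure, and the toy vectors are assumptions for illustration only.

```python
# Illustrative sketch only. This is an exact brute-force scan, used here
# as a stand-in for an ANN index. Names and vectors are hypothetical.
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_hits(query, database, n=2):
    """Return the n database entries most similar to the query vector."""
    scored = [(name, cosine(query, vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)[:n]


# Toy database of sequence vectors.
toy_database = {
    "seqA": [1.0, 0.0],
    "seqB": [0.7, 0.7],
    "seqC": [0.0, 1.0],
}

if __name__ == "__main__":
    for name, score in top_hits([0.9, 0.1], toy_database, n=2):
        print(name, round(score, 3))
```

A brute-force scan compares the query against every database vector, so its cost grows linearly with database size; ANN indexes trade a small amount of accuracy for much faster lookups.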
Cite
Kremer, FS et al (2021). NESS: an word embedding-based tool for alignment-free sequence search. Available at: https://github.com/omixlab/ness.
Acknowledgements
NESS was supported by grants from Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS) and is developed in partnership with BiomeHub.