Skip to main content

A datastructure for efficient storage of string data

Project description

No Frills Prefix Tree (nf_prefix_tree)

This is a prefix tree with no frills. Supports simple insert, search, and basic querying of strings that a (can be) keyed. Written in C++ 11 and has python wrappers. Written because I couldn't find a fast, easy to install and use basic prefix tree in python.

Installation

C++

In order to build the C++, you need to have

  1. C++ 11
  2. g++ compiler

This program uses the STL so no other packages are necessary

Linux and macOS

  1. Change to the nf_prefix_tree directory
  2. Run chmod +x buildAndTest.sh
  3. Run ./buildAndTest.sh

If no errors occur, you should be good to go!

Windows

In order to build, ensure you have the g++ compiler and it is added to path

  1. Change to nf_prefix_tree/src/ directory
  2. Run make
  3. Change to nf_prefix_tree/tests/ directory if you want to run tests
  4. Run make
  5. Run ./testmain

Python

If you are installing on pip and use macOS, all you need to run is pip install nf_prefix_tree

Otherwise you will need to build from source. First follow the above steps for C++. After doing so, make sure you have the following

  1. Python 3
  2. Cython package

Once you have these, change into the python_bindings directory. Run python3 setup.py build_ext --inplace. You can now do from nf_prefix_tree import PyPrefixTree in the same directory!

Examples

C++

#include "PrefixTree.hpp"

int main() {

    PrefixTree * t = new PrefixTree();

    t->addSequence("ABCDEFG", "Key1");
    t->addSequence("ABCDEFG", "Key2");
    t->addSequence("ABCXYZ", "Key3");
    t->addSequence("LMNOP", "Key4");

    t->show();

    t->show(2);
}

Output:

root
|-> Sequence: A	 keys: Key1, Key2, Key3, 
 |-> Sequence: AB	 keys: Key1, Key2, Key3, 
  |-> Sequence: ABC	 keys: Key1, Key2, Key3, 
   |-> Sequence: ABCD	 keys: Key1, Key2, 
    |-> Sequence: ABCDE	 keys: Key1, Key2, 
     |-> Sequence: ABCDEF	 keys: Key1, Key2, 
      |-> Sequence: ABCDEFG	 keys: Key1, Key2, 
   |-> Sequence: ABCX	 keys: Key3, 
    |-> Sequence: ABCXY	 keys: Key3, 
     |-> Sequence: ABCXYZ	 keys: Key3, 
|-> Sequence: L	 keys: Key4, 
 |-> Sequence: LM	 keys: Key4, 
  |-> Sequence: LMN	 keys: Key4, 
   |-> Sequence: LMNO	 keys: Key4, 
    |-> Sequence: LMNOP	 keys: Key4, 
root
|-> Sequence: A	 keys: Key1, Key2, Key3, 
 |-> Sequence: AB	 keys: Key1, Key2, Key3, 
|-> Sequence: L	 keys: Key4, 
 |-> Sequence: LM	 keys: Key4, 

Python

from nf_prefix_tree import PyPrefixTree

t = PyPrefixTree()

t.addSequence('ABCDE', 'key1')
t.addSequence('ABCXY', 'key2')
t.addSequence('ZYAPW', 'key3')

t.show()

Output:

root
|-> Sequence: A	 keys: key1, key2, 
 |-> Sequence: AB	 keys: key1, key2, 
  |-> Sequence: ABC	 keys: key1, key2, 
   |-> Sequence: ABCD	 keys: key1, 
    |-> Sequence: ABCDE	 keys: key1, 
   |-> Sequence: ABCX	 keys: key2, 
    |-> Sequence: ABCXY	 keys: key2, 
|-> Sequence: Z	 keys: key3, 
 |-> Sequence: ZY	 keys: key3, 
  |-> Sequence: ZYA	 keys: key3, 
   |-> Sequence: ZYAP	 keys: key3, 
    |-> Sequence: ZYAPW	 keys: key3, 

API

C++

  • PrefixTree()
    • Constructor for the prefix tree
  • void addSequence(sequence, key)
    • Add a sequence with a key to the prefix tree
    • sequence - std::string the sequence to compress via tree
    • key - std::string the key for this sequence. If not desired, make sequence empty string
  • bool hasPrefix(prefix)
    • Check to see if the input sequence is in the tree
    • prefix - std::string the prefix to look up
  • std::vector<std::string> getKeysWithPrefix(prefix)
    • Get all keys associated with a prefix. Empty vector returned if the prefix is not in the tree
    • prefix - std::string the prefix to look for
  • void show(level=-1)
    • Show the tree in the console down to the specified level
    • level -- int the level down to which to print the tree. Leave as -1 if you want to print the entire tree

Python

  • PyPrefixTree()
    • Constructor for the prefix tree
  • addSequence(sequence, key) -> None
    • Add a sequence with a key to the prefix tree
    • sequence - str the sequence to compress via tree
    • key - str the key for this sequence. If not desired, make
  • hasPrefix(prefix) -> bool
    • Check to see if the input sequence is in the tree
    • prefix - str the prefix to look up
    • Returns True if prefix is found, False otherwise
  • getKeysWithPrefix(prefix) -> list
    • Get all keys associated with a prefix
    • prefix - str the prefix to look for
    • Returns a list of str of all keys. [] is returned if the prefix is not in the tree
  • show(level=-1)
    • Show the tree in the console down to the specified level
    • level -- int the level down to which to print the tree. Leave as -1 if you want to print the entire tree

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nf_prefix_tree-0.1.0.tar.gz (70.0 kB view details)

Uploaded Source

Built Distribution

nf_prefix_tree-0.1.0-cp37-cp37m-macosx_10_9_x86_64.whl (54.5 kB view details)

Uploaded CPython 3.7m macOS 10.9+ x86-64

File details

Details for the file nf_prefix_tree-0.1.0.tar.gz.

File metadata

  • Download URL: nf_prefix_tree-0.1.0.tar.gz
  • Upload date:
  • Size: 70.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.7.4

File hashes

Hashes for nf_prefix_tree-0.1.0.tar.gz
Algorithm Hash digest
SHA256 cfdd3a4f7967719b296889b8a614e06595ca999467178944825a7cd2ec3ad1eb
MD5 eb329e50f356c7cfb0c743489d7f9d68
BLAKE2b-256 a8b95066b6c1d9e96f4ef02be1d78954261cb00e99c261fb640a1ae6ff0059c2

See more details on using hashes here.

File details

Details for the file nf_prefix_tree-0.1.0-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: nf_prefix_tree-0.1.0-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 54.5 kB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.7.4

File hashes

Hashes for nf_prefix_tree-0.1.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 ec25f5558633157cafd68764664302b43f41e48a49620626fa5df351f31dc89a
MD5 4460d4166b098f1e6086bb051d4dcd81
BLAKE2b-256 68a4a3b0fb3b1d5fe8de076b2239927fcdf16196b64866b5470e2423d26f0436

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page