Succinct BWT-Based Sequence Prediction (Subseq)

What is it?

This project is a C++ implementation, with a Python wrapper, of the Succinct BWT-Based Sequence Prediction model.

Subseq is a sequence prediction model over a finite alphabet. It is a lossless model (it does not discard information while training) and uses the succinct Wavelet Tree data structure and the Burrows-Wheeler Transform to compactly store and efficiently access training sequences for prediction.
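To give an intuition for the Burrows-Wheeler Transform mentioned above, here is a toy sketch in pure Python. It is not how Subseq implements it (the library is written in C++ and relies on a Wavelet Tree); the function names bwt and inverse_bwt and the "$" sentinel are illustrative assumptions. The round trip shows that the transform is reversible, which is what makes a lossless model possible.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the transformed column to every row, then re-sort.
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The original text is the row that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


encoded = bwt("banana")
print(encoded)               # annb$aa
print(inverse_bwt(encoded))  # banana
```

This naive version sorts every rotation in memory and is only suitable for short inputs; real implementations work from a suffix array and a succinct index.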

This implementation is based on the following research paper:

Installation

Subseq is published on PyPI. Running pip install subseq should be enough.

Simple example

You can test the model with the following code:

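from subseq.subseq import Subseq

# The constructor argument is presumably threshold_query (see Tuning).
model = Subseq(1)

# fit takes a list of training sequences; here, a single two-item sequence.
model.fit([['hello', 'world']])

# predict takes the items observed so far.
model.predict(['hello'])
# Output: ['world']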

Features

Train

The model can be trained with the fit method, which takes a list of sequences (each sequence being a list of items), as in the example above.

Tuning

Subseq has only one hyperparameter that needs to be tuned: threshold_query, the number of similar queries that need to be retrieved to make a confident prediction.

A threshold_query of 0 does not limit the number of queries.
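One way to choose threshold_query is to try a few candidate values on held-out sequences and keep the one with the best next-item accuracy. The sketch below is a minimal illustration, not part of the library: next_item_accuracy, pick_threshold and make_model are hypothetical names, and it assumes that predict returns a list whose first element is the predicted next item, as the simple example suggests.

```python
def next_item_accuracy(model, sequences):
    """Share of sequences whose last item is predicted from the preceding items."""
    hits = 0
    total = 0
    for sequence in sequences:
        if len(sequence) < 2:
            continue  # nothing to predict from
        prefix, target = list(sequence[:-1]), sequence[-1]
        prediction = model.predict(prefix)
        # Assumption: predict returns a (possibly empty) list of items.
        if len(prediction) > 0 and prediction[0] == target:
            hits += 1
        total += 1
    return hits / total if total else 0.0


def pick_threshold(make_model, train, valid, candidates=(0, 1, 2, 5, 10)):
    """Fit one model per candidate value and return the best value with all scores."""
    scores = {}
    for threshold in candidates:
        model = make_model(threshold)  # hypothetical factory: threshold -> model
        model.fit(train)
        scores[threshold] = next_item_accuracy(model, valid)
    best = max(scores, key=scores.get)
    return best, scores
```

With the package installed, make_model could plausibly be the Subseq class itself, assuming its first positional argument is threshold_query, as Subseq(1) in the simple example suggests. The candidate values above are arbitrary placeholders.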

Benchmark

The benchmark was run on the FIFA dataset; the data can be found on the SPMF website.

Details on the benchmark can be found here.
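To feed an SPMF dataset to fit, the file first has to be turned into a list of sequences. The sketch below assumes the usual SPMF sequence format, where each line is one sequence, -1 closes an itemset, -2 closes the sequence, and lines starting with #, % or @ are metadata. The helper name parse_spmf_sequences is hypothetical, and itemsets are simply flattened into a single list of items.

```python
def parse_spmf_sequences(lines):
    """Turn SPMF-formatted lines into a list of item sequences (items kept as strings)."""
    sequences = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and SPMF metadata/comment lines.
        if not line or line[0] in "#%@":
            continue
        # Drop the -1 (end of itemset) and -2 (end of sequence) separators.
        items = [token for token in line.split() if token not in ("-1", "-2")]
        if items:
            sequences.append(items)
    return sequences


sample = ["1 -1 2 -1 3 -1 -2", "@CONVERTED_FROM_TEXT", "4 -1 5 -1 -2"]
print(parse_spmf_sequences(sample))  # [['1', '2', '3'], ['4', '5']]
```

Passing an open file object as lines works as well, since the function only iterates over it line by line.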
