Succinct BWT-Based Sequence Prediction (Subseq)

What is it?

This project is a C++ implementation, with a Python wrapper, of the Succinct BWT-Based Sequence Prediction model.

Subseq is a sequence prediction model over a finite alphabet. It is a lossless model (it does not discard information during training) and uses the succinct Wavelet Tree data structure and the Burrows-Wheeler Transform to store training sequences compactly and access them efficiently for prediction.
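To give a feel for the Burrows-Wheeler Transform mentioned above, here is a minimal pure-Python sketch. It is a naive textbook version built on sorted rotations, not the code used inside Subseq; the function names and the "$" sentinel are illustrative choices. It shows that the transform is reversible, which is what makes lossless storage possible.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler Transform via sorted rotations.

    Illustrative only: real implementations use suffix arrays
    instead of materialising every rotation.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Sort all cyclic rotations and keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Rebuild the original text by repeatedly prepending and sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The row ending with the sentinel is the original string.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


encoded = bwt("banana")
print(encoded)               # annb$aa
print(inverse_bwt(encoded))  # banana
```

Equal characters tend to cluster in the transformed string, which is the property that succinct structures such as wavelet trees exploit.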

This implementation is based on the following research paper:

Installation

Subseq is published on PyPI. Running pip install subseq should be enough.

Simple example

You can test the model with the following code:

from subseq.subseq import Subseq
model = Subseq(1)

model.fit([['hello', 'world']])

model.predict(['hello'])
# Output: ['world']

Features

Train

The model can be trained with the fit method.

Tuning

Subseq has only one hyperparameter to tune: threshold_query, the number of similar queries that need to be retrieved to make a confident prediction.

Setting threshold_query to 0 does not limit the number of queries.
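One way to pick a value is a small search over candidates, scored on held-out sequences. The sketch below is hypothetical: ToyModel is a stand-in that only mimics the fit/predict shape from the simple example, and its use of the threshold is invented for illustration, not taken from Subseq. The helpers accuracy and tune_threshold are likewise illustrative names.

```python
from collections import Counter, defaultdict


class ToyModel:
    """Stand-in with the same fit/predict shape as the example above.

    NOT Subseq: it predicts the most frequent successor of the last
    item, and returns nothing if that successor was seen fewer than
    `threshold_query` times (0 means no limit). Hypothetical behaviour.
    """

    def __init__(self, threshold_query):
        self.threshold_query = threshold_query
        self.successors = defaultdict(Counter)

    def fit(self, sequences):
        for seq in sequences:
            for prev, nxt in zip(seq, seq[1:]):
                self.successors[prev][nxt] += 1

    def predict(self, prefix):
        counts = self.successors.get(prefix[-1]) if prefix else None
        if not counts:
            return []
        item, support = counts.most_common(1)[0]
        if self.threshold_query and support < self.threshold_query:
            return []
        return [item]


def accuracy(model, sequences):
    """Fraction of sequences whose last item is predicted from the rest."""
    hits = total = 0
    for seq in sequences:
        if len(seq) < 2:
            continue
        total += 1
        prediction = model.predict(list(seq[:-1]))
        if len(prediction) > 0 and prediction[0] == seq[-1]:
            hits += 1
    return hits / total if total else 0.0


def tune_threshold(make_model, train, valid, candidates):
    """Fit one model per candidate and return (best value, all scores)."""
    scores = {}
    for value in candidates:
        model = make_model(value)
        model.fit(train)
        scores[value] = accuracy(model, valid)
    return max(scores, key=scores.get), scores


train = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"], ["x", "y"]]
valid = [["a", "b", "c"], ["x", "y"]]
best, scores = tune_threshold(ToyModel, train, valid, [0, 1, 2, 3])
print(best, scores)
```

With the real library, the factory would presumably be something like lambda t: Subseq(t), assuming the constructor's positional argument is threshold_query as the Subseq(1) call in the simple example suggests. Check the package's own documentation before relying on that.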

Benchmark

The benchmark was run on the FIFA dataset; the data can be found on the SPMF website.

Details on the benchmark can be found here.
