Succinct BWT-Based Sequence Prediction

Project description

Succinct BWT-Based Sequence Prediction (Subseq)

What is it?

This project is a C++ implementation, with a Python wrapper, of the Succinct BWT-Based Sequence Prediction model.

Subseq is a sequence prediction model over a finite alphabet. It is lossless (it discards no information during training) and uses the succinct Wavelet Tree data structure together with the Burrows-Wheeler Transform to store training sequences compactly and access them efficiently at prediction time.
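To make the role of the Burrows-Wheeler Transform more concrete, here is a minimal pure-Python sketch. It is not the library's C++ implementation: the names bwt, inverse_bwt, count_occurrences and the "$" end marker are assumptions made for illustration, and the rank queries are answered by naive counting where a wavelet tree would answer them far faster. It shows that the transform is reversible (lossless) and that pattern occurrences can be counted from the transformed sequence alone.

```python
SENTINEL = "$"  # assumed end marker; it must not occur in the input tokens


def bwt(tokens):
    """Return the last column of the sorted rotations of tokens + [SENTINEL]."""
    s = list(tokens) + [SENTINEL]
    n = len(s)
    # Sort rotation start positions by the rotation they denote (naive, O(n^2 log n)).
    order = sorted(range(n), key=lambda i: s[i:] + s[:i])
    # The last symbol of the rotation starting at i is the symbol just before i.
    return [s[(i - 1) % n] for i in order]


def inverse_bwt(last):
    """Rebuild the original tokens by repeatedly prepending and sorting."""
    n = len(last)
    table = [[] for _ in range(n)]
    for _ in range(n):
        table = sorted([last[i]] + table[i] for i in range(n))
    # The original sequence is the rotation that ends with the sentinel.
    row = next(r for r in table if r[-1] == SENTINEL)
    return row[:-1]


def count_occurrences(last, pattern):
    """Count occurrences of pattern using backward search over the BWT."""
    first = sorted(last)  # first column of the sorted rotation table
    lo, hi = 0, len(last)
    for c in reversed(pattern):
        if c not in first:
            return 0
        start = first.index(c)  # number of symbols that sort before c
        # Naive rank: how many times c appears in last[:lo] and last[:hi].
        lo = start + last[:lo].count(c)
        hi = start + last[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt(list("banana"))
print("".join(transformed))                      # annb$aa
print("".join(inverse_bwt(transformed)))         # banana
print(count_occurrences(transformed, list("ana")))  # 2
```

The same functions work on lists of arbitrary comparable tokens, such as words, as long as the sentinel does not appear among them.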

This implementation is based on the following research paper:

Installation

Subseq is published on PyPI, so pip install subseq should be enough.

Simple example

You can test the model with the following code:

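from subseq.subseq import Subseq

model = Subseq(1)

# Train on a list of sequences (here, a single sequence of two items)
model.fit([['hello', 'world']])

# Ask the model what follows 'hello'
model.predict(['hello'])
# Output: ['world']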

Features

Train

The model can be trained with the fit method.
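In the simple example, fit receives a list of sequences, each sequence being a list of items. Assuming that is the expected input format in general, the sketch below shows one way to build such a list from a flat event log. The helper name events_to_sequences and the (session_id, item) layout are hypothetical and not part of Subseq.

```python
from collections import defaultdict


def events_to_sequences(events):
    """Group (session_id, item) pairs into one ordered item list per session."""
    sessions = defaultdict(list)
    for session_id, item in events:
        sessions[session_id].append(item)
    # Dicts preserve insertion order, so sessions come out in first-seen order.
    return list(sessions.values())


events = [
    ("u1", "home"),
    ("u2", "home"),
    ("u1", "search"),
    ("u2", "cart"),
    ("u1", "cart"),
]
sequences = events_to_sequences(events)
print(sequences)  # [['home', 'search', 'cart'], ['home', 'cart']]

# The result could then be passed to the model, e.g. model.fit(sequences),
# with model created as in the simple example.
```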

Tuning

Subseq has only one meta-parameter to tune: threshold_query, the number of similar queries that need to be retrieved to make a confident prediction.

Setting threshold_query to 0 places no limit on the number of queries.
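The effect of such a threshold can be illustrated with a toy predictor. This is only a sketch under stated assumptions, not Subseq's actual algorithm: toy_predict is a hypothetical function that looks up progressively shorter suffixes of the query in the training data, counts the items that follow each match, and stops once at least threshold matches have been collected (0 meaning it never stops early).

```python
from collections import Counter


def toy_predict(training, query, threshold=0):
    """Vote on the next item using matches of the query's suffixes (toy model)."""
    votes = Counter()
    for k in range(len(query), 0, -1):  # longest suffix first
        suffix = list(query[-k:])
        for seq in training:
            # Only positions that leave room for a following item.
            for i in range(len(seq) - k):
                if seq[i:i + k] == suffix:
                    votes[seq[i + k]] += 1
        # A non-zero threshold stops the search once enough matches are found.
        if threshold and sum(votes.values()) >= threshold:
            break
    return votes.most_common(1)[0][0] if votes else None


training = [
    ["a", "b", "c"],
    ["x", "b", "d"],
    ["y", "b", "d"],
    ["z", "b", "d"],
]
print(toy_predict(training, ["a", "b"], threshold=1))  # c
print(toy_predict(training, ["a", "b"], threshold=0))  # d
```

With a threshold of 1 the single exact match decides the answer; with no limit, the shorter and more frequent context outvotes it. In this toy setting a low threshold favours specific matches, while a high or unlimited one favours broader evidence.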

Benchmark

The benchmark was run on the FIFA dataset; the data can be found on the SPMF website.

Details on the benchmark can be found here.
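Datasets from the SPMF website are commonly distributed in the SPMF sequence format, where each line is one sequence, -1 closes an itemset and -2 closes the sequence. Assuming the FIFA file follows that convention, a loader could look like the sketch below. The function names are hypothetical, items are kept as strings, and multi-item itemsets are simply flattened in order.

```python
def parse_spmf_line(line):
    """Turn one SPMF sequence line into a flat list of item tokens."""
    items = []
    for token in line.split():
        if token == "-2":  # end of sequence
            break
        if token != "-1":  # -1 only separates itemsets
            items.append(token)
    return items


def parse_spmf(text):
    """Parse SPMF-formatted text into a list of sequences."""
    sequences = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment/metadata lines.
        if not line or line[0] in "#%@":
            continue
        seq = parse_spmf_line(line)
        if seq:
            sequences.append(seq)
    return sequences


sample = "1 -1 2 -1 3 -1 -2\n# a comment line\n4 -1 5 -1 -2\n"
print(parse_spmf(sample))  # [['1', '2', '3'], ['4', '5']]
```

The resulting list of lists has the same shape as the input used in the simple example.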
