pysettrie package
pysettrie
https://github.com/GregoryMorse/pysettrie
pysettrie is a Cython-based Python 3 package for efficient storage and querying of sets of sets using the trie data structure. It supports operations such as finding all supersets or subsets of a given set in a collection of sets. The original motivation for this module was to provide efficient search for supersets of sets of feature-value pairs in our natural language parser project (e.g. matching nouns against verb argument positions).
The following classes are included:
- SetTrie: set-trie container for sets; supports efficient superset/subset queries for a given search set.
- SetTrieMap: mapping container using sets as keys; supports the same efficient operations as SetTrie but also stores a value associated with each key set.
- SetTrieMultiMap: like SetTrieMap, but supports multiple values associated with each key.
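To make the idea behind these containers concrete, here is a minimal pure-Python sketch of a set-trie used as a multimap. It is an illustration only: the names `Node`, `TinySetTrieMultiMap`, `get` and `count_nodes` are hypothetical and are not pysettrie's API, and the real package is implemented in Cython. The sketch assumes each key set is stored as the path of its elements in sorted order, so sets with a common sorted prefix share nodes.

```python
class Node:
    """One trie node; `values` stays None unless a stored key set ends here."""

    def __init__(self):
        self.children = {}   # element -> Node
        self.values = None   # list of values for the key set ending at this node


class TinySetTrieMultiMap:
    """Hypothetical illustration of the data structure, not pysettrie itself."""

    def __init__(self):
        self.root = Node()

    def assign(self, key_set, value):
        """Attach `value` to `key_set`; return how many values the key now has."""
        node = self.root
        # Sorting gives every set exactly one canonical path in the trie.
        # Elements must therefore be mutually comparable.
        for elem in sorted(key_set):
            node = node.children.setdefault(elem, Node())
        if node.values is None:
            node.values = []
        node.values.append(value)
        return len(node.values)

    def get(self, key_set, default=None):
        """Return the list of values stored for exactly `key_set`, else `default`."""
        node = self.root
        for elem in sorted(key_set):
            node = node.children.get(elem)
            if node is None:
                return default
        return node.values if node.values is not None else default

    def count_nodes(self):
        """Count the non-root nodes, to show how shared prefixes save space."""
        stack, total = [self.root], 0
        while stack:
            node = stack.pop()
            total += len(node.children)
            stack.extend(node.children.values())
        return total


mm = TinySetTrieMultiMap()
mm.assign({1, 2, 3}, "a")
mm.assign({1, 2, 4}, "b")
count_after_second_value = mm.assign({3, 2, 1}, "c")  # same key as {1, 2, 3}
print(count_after_second_value)   # 2
print(mm.get({1, 2, 3}))          # ['a', 'c']
print(mm.count_nodes())           # 4: the paths 1-2-3 and 1-2-4 share nodes 1 and 2
```

A plain set container corresponds to keeping only an "a set ends here" flag per node, and a single-valued map to keeping one value instead of a list.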
For further information, please see the documentation.
The module test_settrie.py contains unit tests for all the containers.
Authors: Gregory Morse and Márton Miháltz https://sites.google.com/site/mmihaltz/
One recommended way to install (tested on Ubuntu):
If you don't have pip3:
sudo apt-get install python3-setuptools
sudo easy_install3 pip
pysettrie is partly based on:
I. Savnik: Index data structure for fast subset and superset queries. CD-ARES, IFIP LNCS, 2013. http://osebje.famnit.upr.si/~savnik/papers/cdares13.pdf
Remarks on the paper:
- Algorithm 1 does not mention sorting the children (or doing a sorted insert) in the insert operation (line 5).
- Algorithm 4 is wrong and will always return false; line 7 should be: "for (each child of node labeled l: word.currentElement <= l) & (while not found) do"
- The descriptions of the getAllSubSets and getAllSuperSets operations are wrong; they would not produce all sub/supersets.
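The operations these remarks concern can be sketched in pure Python as follows. This is a hedged illustration, not the paper's pseudocode and not pysettrie's Cython code: the names `Node`, `insert`, `has_superset`, `iter_supersets` and `iter_subsets` are hypothetical. The sketch assumes sets are inserted as sorted paths and that children are visited in sorted order, so the superset search can stop at the first child whose label is greater than the search element it is currently looking for.

```python
class Node:
    def __init__(self):
        self.children = {}    # element -> Node
        self.is_end = False   # True if a stored set ends at this node


def insert(root, aset):
    """Store `aset` as the path of its elements in sorted order."""
    node = root
    for elem in sorted(aset):
        node = node.children.setdefault(elem, Node())
    node.is_end = True


def has_superset(node, word, i=0):
    """True if some stored set below `node` contains all of word[i:] (word sorted)."""
    if i == len(word):
        # Everything matched; any stored set at or below this node qualifies.
        return node.is_end or bool(node.children)
    target = word[i]
    for label in sorted(node.children):
        if label > target:
            break  # paths are increasing, so target cannot occur below this child
        next_i = i + 1 if label == target else i
        if has_superset(node.children[label], word, next_i):
            return True
    return False


def iter_supersets(node, word, i=0, path=()):
    """Yield every stored set that contains all elements of the sorted list `word`."""
    if i == len(word):
        if node.is_end:
            yield set(path)
        for label in sorted(node.children):
            yield from iter_supersets(node.children[label], word, i, path + (label,))
        return
    target = word[i]
    for label in sorted(node.children):
        if label > target:
            break
        next_i = i + 1 if label == target else i
        yield from iter_supersets(node.children[label], word, next_i, path + (label,))


def iter_subsets(node, word, i=0, path=()):
    """Yield every stored set whose elements all occur in the sorted list `word`."""
    if node.is_end:
        yield set(path)
    for j in range(i, len(word)):
        child = node.children.get(word[j])
        if child is not None:
            yield from iter_subsets(child, word, j + 1, path + (word[j],))


root = Node()
for s in [{1, 3}, {1, 3, 5}, {1, 4}, {1, 2, 4}, {2, 4}, {2, 3, 5}]:
    insert(root, s)

print(has_superset(root, sorted({1, 5})))               # True, because of {1, 3, 5}
print(list(iter_supersets(root, sorted({1, 3}))))       # [{1, 3}, {1, 3, 5}]
print(list(iter_subsets(root, sorted({1, 2, 4, 11}))))  # [{1, 2, 4}, {1, 4}, {2, 4}]
```

Callers pass the search set as a sorted list, which keeps the recursion simple and mirrors the sorted insert.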
See also:
- http://stackoverflow.com/questions/9353100/quickly-checking-if-set-is-superset-of-stored-sets
- http://stackoverflow.com/questions/1263524/superset-search?rq=1
Changes:
- Version 1.0.2:
  - Continuous integration, remove unnecessary files, improve build requirements.
- Version 1.0.0:
  - Some bug fixes, complete Cython translation for improved performance.
- Version 0.1.3:
  - SetTrieMultiMap.assign() returns number of values associated to key after assignment.
Licensed under the GNU LESSER GENERAL PUBLIC LICENSE, Version 3.
File details
Details for the file pysettrie-1.0.2.tar.gz.
File metadata
- Download URL: pysettrie-1.0.2.tar.gz
- Upload date:
- Size: 141.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | fbad8b94f7b2b4bca2acf7e2bdb5acb7a5ccd245d0b60fcfe06e6b70d698b65e
MD5 | 027d516cfab24526adb7962314843422
BLAKE2b-256 | a99d23b4b75062d89f7c9dd1a7fb4db99882d20f9ccd2d593a0c073817ee28ac
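A downloaded copy of the source distribution can be checked against the SHA256 digest listed for it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed for pysettrie-1.0.2.tar.gz
EXPECTED_SHA256 = "fbad8b94f7b2b4bca2acf7e2bdb5acb7a5ccd245d0b60fcfe06e6b70d698b65e"


def sha256_of_file(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals `expected_hex` (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.lower()
```

For example, `matches_expected("pysettrie-1.0.2.tar.gz")` should return True for an intact download saved under that name in the current directory.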
File details
Details for the file pysettrie-1.0.2-cp39-cp39-win_amd64.whl.
File metadata
- Download URL: pysettrie-1.0.2-cp39-cp39-win_amd64.whl
- Upload date:
- Size: 218.9 kB
- Tags: CPython 3.9, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6736b02cbc9284ce981740341ba8da55fb1deb0565f5e57982ad0c8f5f8d29e4
MD5 | 4dcbe66585e7b76dcb7afaffdf5e8126
BLAKE2b-256 | 62f9823fc1a518d5daadae7212a179ec7368c9bc996d4c613291bfb050325fd2