Fast and memory efficient DAWG (DAFSA) for Python
Project description
DAWG2
This is a fork of the DAWG project, rebuilt with Python 3.10+ support.
This package provides DAWG (DAFSA)-based dictionary-like read-only objects for Python.
String data in a DAWG may take 200x less memory than in a standard Python dict, with comparable raw lookup speed; a DAWG also provides fast advanced methods such as prefix search.
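The memory saving comes from sharing common suffixes as well as common prefixes. The following is a minimal pure-Python sketch of that idea, not the package's implementation (the package wraps the dawgdic C++ library). It builds a trie, then merges equivalent subtrees into a DAFSA. All names here (Node, build_trie, minimize, count_nodes, contains, keys_with_prefix) are hypothetical and chosen for illustration only.

```python
class Node:
    """One automaton state: outgoing edges plus an end-of-key flag."""
    __slots__ = ("edges", "final")

    def __init__(self):
        self.edges = {}
        self.final = False


def build_trie(words):
    """Build a plain trie: keys share prefixes but not suffixes."""
    root = Node()
    for word in words:
        node = root
        for ch in word:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root


def minimize(node, registry):
    """Merge equivalent subtrees bottom-up and return the canonical node."""
    for ch, child in list(node.edges.items()):
        node.edges[ch] = minimize(child, registry)
    # Two states are equivalent if they agree on finality and on their
    # (label, canonical child) pairs. Canonical children are kept alive
    # by the registry, so their id() values are stable.
    signature = (
        node.final,
        tuple(sorted((ch, id(child)) for ch, child in node.edges.items())),
    )
    return registry.setdefault(signature, node)


def count_nodes(root):
    """Count distinct states reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


def contains(root, word):
    """Exact-match lookup: follow one edge per character."""
    node = root
    for ch in word:
        node = node.edges.get(ch)
        if node is None:
            return False
    return node.final


def keys_with_prefix(root, prefix):
    """Return all stored keys that start with prefix, in sorted order."""
    node = root
    for ch in prefix:
        node = node.edges.get(ch)
        if node is None:
            return []
    found = []

    def walk(state, suffix):
        if state.final:
            found.append(prefix + suffix)
        for ch in sorted(state.edges):
            walk(state.edges[ch], suffix + ch)

    walk(node, "")
    return found


words = ["tap", "taps", "top", "tops"]
trie_size = count_nodes(build_trie(words))
dafsa = minimize(build_trie(words), {})
dafsa_size = count_nodes(dafsa)
```

In this small example the trie needs 8 states and the minimized automaton needs 5, because "tap"/"top" and "taps"/"tops" end in identical subtrees. Lookups and prefix searches work the same way on both structures.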
Source code: https://github.com/pymorphy2-fork/DAWG
New issue tracker: https://github.com/pymorphy2-fork/DAWG/issues
“Old” issue tracker: https://github.com/pytries/DAWG/issues
License
The wrapper code is licensed under the MIT License. The bundled dawgdic C++ library is licensed under the BSD license. The bundled libb64 is in the public domain.
Changes
0.10.0 (2023-09-05)
More flexible char substitutes (by bt2901)
Support for Python versions older than 3.8 is dropped
Building binary wheels for pypi.org
0.9.0 (2023-05-23)
Python 3.9, 3.10 and 3.11 support is added
0.8.0 (2020-02-19)
Python 3.8 support is added
Python 3.2, 3.3 and 3.4 support is dropped
Extension is rebuilt with Cython 0.29.15
0.7.8 (2015-04-18)
extra type annotations are added to make the code a bit faster;
mercurial mirror at bitbucket is dropped;
wrapper is rebuilt with Cython 0.22.
0.7.7 (2014-11-19)
DAWG.b_prefixes method for avoiding utf8 encoding/decoding (thanks Ikuya Yamada);
wrapper is rebuilt with Cython 0.21.1.
0.7.6 (2014-08-10)
Wrapper is rebuilt with Cython 0.20.2 to fix some issues.
0.7.5 (2014-06-05)
Switched to setuptools;
some wheels are uploaded to pypi.
0.7.4 (2014-05-29)
Fixed a bug in DAWG building: input should be sorted according to its binary representation.
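To illustrate what ordering by binary representation means, here is a small sketch that sorts text keys by their encoded bytes. It assumes the keys are encoded as UTF-8; sort_by_bytes is a hypothetical helper, not part of the package.

```python
def sort_by_bytes(keys):
    """Sort text keys by their UTF-8 bytes (assumed encoding)."""
    return sorted(keys, key=lambda key: key.encode("utf-8"))


keys = ["apple", "Zebra", "éclair"]

# Byte order: 'Z' (0x5A) sorts before 'a' (0x61), and the two-byte
# sequence for 'é' (0xC3 0xA9) sorts after every ASCII letter.
byte_order = sort_by_bytes(keys)

# A case-insensitive sort gives a different order for the same keys.
casefold_order = sorted(keys, key=str.casefold)
```

Byte order can differ from case-insensitive or locale-aware orderings, so keys that are pre-sorted for display are not necessarily sorted by their binary representation.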
0.7.3 (2014-05-29)
Wrapper is rebuilt with Cython 0.21dev;
Python 3.4 compatibility is verified.
0.7.2 (2013-10-03)
has_keys_with_prefix(prefix) method (thanks Matt Hickford)
0.7.1 (2013-05-25)
Extension is rebuilt with Cython 0.19.1;
fixed segfault that happened on lookup from incorrectly loaded DAWG (thanks Alex Moiseenko).
0.7 (2013-04-05)
IntCompletionDAWG
0.6.1 (2013-03-23)
Installation issues in environments with LC_ALL=C are fixed;
PyPy is officially unsupported now (use DAWG-Python with PyPy).
0.6 (2013-03-22)
many thread-safety bugs are fixed (at the cost of slowing the library down).
0.5.5 (2013-02-19)
fix installation under PyPy (note: DAWG is slow under PyPy and may have bugs).
0.5.4 (2013-02-14)
small tweaks for docstrings;
the extension is rebuilt using Cython 0.18.
0.5.3 (2013-01-03)
small improvements to .compile_replaces method;
benchmarks for .similar_items method;
the extension is rebuilt with Cython pre-0.18; this made .prefixes and .iterprefixes methods faster (up to 6x in some cases).
0.5.2 (2013-01-02)
tests are included in source distribution;
benchmark results in the README were nonrepresentative because of my broken (slow) Python 3.2 install;
installation is fixed under Python 3.x with LC_ALL=C (thanks Jakub Wilk).
0.5.1 (2012-10-11)
better error reporting while building DAWGs;
__contains__ is fixed for keys with zero bytes;
dawg.Error exception class;
building of BytesDAWG and RecordDAWG fails instead of producing incorrect results if some of the keys have unsupported characters.
0.5 (2012-10-08)
The storage scheme of BytesDAWG and RecordDAWG is changed in this release in order to provide the alphabetical ordering of items.
This is a backwards-incompatible release. To read a BytesDAWG or RecordDAWG created with previous versions of DAWG, use the payload_separator constructor argument:
>>> BytesDAWG(payload_separator=b'\xff').load('old.dawg')
0.4.1 (2012-10-01)
Segfaults with empty DAWGs are fixed by updating dawgdic to latest svn.
0.4 (2012-09-26)
iterkeys, iteritems and iterprefixes methods (thanks Dan Blanchard).
0.3.2 (2012-09-24)
prefixes method for finding all prefixes of a given key.
0.3.1 (2012-09-20)
bundled dawgdic C++ library is updated to the latest version.
0.3 (2012-09-13)
similar_keys, similar_items and similar_item_values methods for more permissive lookups (they may be useful e.g. for umlaut handling);
load method returns self;
Python 3.3 support.
0.2 (2012-09-08)
Greatly improved memory usage for DAWGs loaded with load method.
There is currently a bug somewhere in the wrapper, so DAWGs loaded with the read() method and unpickled DAWGs use 3x-4x as much memory as DAWGs loaded with the load() method. load() is fixed in this release but the other methods are not.
0.1 (2012-09-08)
Initial release.