Fast and memory efficient DAWG (DAFSA) for Python
DAWG
This package provides DAWG(DAFSA)-based dictionary-like read-only objects for Python (2.x and 3.x).
String data in a DAWG may take 200x less memory than in a standard Python dict and the raw lookup speed is comparable; it also provides fast advanced methods like prefix search.
Source code: https://github.com/kmike/DAWG
Issue tracker: https://github.com/kmike/DAWG/issues
License
Wrapper code is licensed under MIT License. Bundled dawgdic C++ library is licensed under BSD license. Bundled libb64 is Public Domain.
Changes
0.7.8 (2015-04-18)
extra type annotations are added to make the code a bit faster;
mercurial mirror at bitbucket is dropped;
wrapper is rebuilt with Cython 0.22.
0.7.7 (2014-11-19)
DAWG.b_prefixes method for avoiding utf8 encoding/decoding (thanks Ikuya Yamada);
wrapper is rebuilt with Cython 0.21.1.
0.7.6 (2014-08-10)
Wrapper is rebuilt with Cython 0.20.2 to fix some issues.
0.7.5 (2014-06-05)
Switched to setuptools;
some wheels are uploaded to pypi.
0.7.4 (2014-05-29)
Fixed a bug in DAWG building: input should be sorted according to its binary representation.
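The ordering requirement in the 0.7.4 note can be illustrated with a minimal sketch. It assumes that "binary representation" means the UTF-8 encoding of each key; `sort_for_dawg` is a hypothetical helper written for this example, not part of the package.

```python
def sort_for_dawg(keys):
    """Return keys ordered by their UTF-8 bytes (assumed binary representation)."""
    return sorted(keys, key=lambda key: key.encode("utf-8"))


keys = ["b", "a", "B", "\u00e9", "z"]

# Byte order: uppercase ASCII first, then lowercase ASCII, then non-ASCII.
byte_ordered = sort_for_dawg(keys)

# A case-insensitive "dictionary" order is a different ordering and would
# not satisfy a builder that expects byte-sorted input.
human_ordered = sorted(keys, key=str.lower)

print(byte_ordered)
print(human_ordered)
```

The point of the sketch is only that a human-friendly sort and a byte-wise sort can disagree; the changelog does not say whether callers need to pre-sort their input themselves.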
0.7.3 (2014-05-29)
Wrapper is rebuilt with Cython 0.21dev;
Python 3.4 compatibility is verified.
0.7.2 (2013-10-03)
has_keys_with_prefix(prefix) method (thanks Matt Hickford)
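What a prefix-existence check answers can be sketched in pure Python over a sorted list. This is an illustration of the semantics only: the function below is a stand-in that borrows the method name, and it uses binary search rather than the automaton the package is built on.

```python
import bisect


def has_keys_with_prefix(sorted_keys, prefix):
    """Return True if any key in the sorted list starts with `prefix`."""
    # The first key >= prefix is the only candidate that needs checking:
    # keys sharing a prefix form a contiguous run that starts there.
    index = bisect.bisect_left(sorted_keys, prefix)
    return index < len(sorted_keys) and sorted_keys[index].startswith(prefix)


words = sorted(["bar", "foo", "foobar"])
print(has_keys_with_prefix(words, "fo"))
print(has_keys_with_prefix(words, "fa"))
```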
0.7.1 (2013-05-25)
Extension is rebuilt with Cython 0.19.1;
fixed segfault that happened on lookup from incorrectly loaded DAWG (thanks Alex Moiseenko).
0.7 (2013-04-05)
IntCompletionDAWG
0.6.1 (2013-03-23)
Installation issues in environments with LC_ALL=C are fixed;
PyPy is officially unsupported now (use DAWG-Python with PyPy).
0.6 (2013-03-22)
many thread-safety bugs are fixed (at the cost of slowing the library down).
0.5.5 (2013-02-19)
fix installation under PyPy (note: DAWG is slow under PyPy and may have bugs).
0.5.4 (2013-02-14)
small tweaks for docstrings;
the extension is rebuilt using Cython 0.18.
0.5.3 (2013-01-03)
small improvements to .compile_replaces method;
benchmarks for .similar_items method;
the extension is rebuilt with Cython pre-0.18; this made .prefixes and .iterprefixes methods faster (up to 6x in some cases).
0.5.2 (2013-01-02)
tests are included in source distribution;
benchmark results in README were nonrepresentative because of my broken (slow) Python 3.2 install;
installation is fixed under Python 3.x with LC_ALL=C (thanks Jakub Wilk).
0.5.1 (2012-10-11)
better error reporting while building DAWGs;
__contains__ is fixed for keys with zero bytes;
dawg.Error exception class;
building of BytesDAWG and RecordDAWG fails instead of producing incorrect results if some of the keys have unsupported characters.
0.5 (2012-10-08)
The storage scheme of BytesDAWG and RecordDAWG is changed in this release in order to provide the alphabetical ordering of items.
This is a backwards-incompatible release. To read a BytesDAWG or RecordDAWG created with a previous version of DAWG, pass the payload_separator constructor argument:
>>> BytesDAWG(payload_separator=b'\xff').load('old.dawg')
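Why the separator byte affects item ordering can be shown with a small sketch. It rests on two assumptions that the changelog does not spell out: that each item is stored as key bytes, then the separator, then the payload bytes, and that the newer default separator is a low byte such as `b'\x01'`. `stored_order` is a hypothetical helper.

```python
def stored_order(items, separator):
    """Sort (key, payload) pairs by their concatenated byte strings."""
    return sorted(items, key=lambda item: item[0] + separator + item[1])


items = [(b"foobar", b"x"), (b"foo", b"y")]

# With a high separator byte, b"foo\xff..." sorts after b"foobar\xff...".
old_order = stored_order(items, b"\xff")

# With a low separator byte (assumed), b"foo\x01..." sorts before b"foobar\x01...".
new_order = stored_order(items, b"\x01")

print([key for key, _ in old_order])
print([key for key, _ in new_order])
```

Under these assumptions only the low separator keeps shorter keys ahead of their extensions, which matches the alphabetical ordering the 0.5 release describes.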
0.4.1 (2012-10-01)
Segfaults with empty DAWGs are fixed by updating dawgdic to latest svn.
0.4 (2012-09-26)
iterkeys, iteritems and iterprefixes methods (thanks Dan Blanchard).
0.3.2 (2012-09-24)
prefixes method for finding all prefixes of a given key.
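The result of a "all prefixes of a given key" lookup can be modelled with a plain set. This is a sketch of the semantics only; the stand-in function below scans slices of the key and says nothing about how the package traverses its automaton.

```python
def prefixes(keys, key):
    """Return every member of `keys` that is a prefix of `key`, shortest first."""
    return [key[:end] for end in range(1, len(key) + 1) if key[:end] in keys]


words = {"foo", "foobar", "bar"}
found = prefixes(words, "foobarz")
print(found)
```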
0.3.1 (2012-09-20)
bundled dawgdic C++ library is updated to the latest version.
0.3 (2012-09-13)
similar_keys, similar_items and similar_item_values methods for more permissive lookups (they may be useful e.g. for umlaut handling);
load method returns self;
Python 3.3 support.
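The permissive lookups introduced in 0.3 can be pictured with a brute-force sketch. It assumes the replacement table maps single characters to single characters (for example `o` to `ö`); the function is a stand-in that reuses the method name and enumerates variants explicitly, which is exponential in the key length and meant only to show what such a lookup returns.

```python
def similar_keys(keys, key, replaces):
    """Return members of `keys` reachable from `key` by optionally
    swapping characters according to `replaces`."""
    variants = [""]
    for char in key:
        options = [char]
        if char in replaces:
            options.append(replaces[char])
        # Extend every partial variant with every allowed character.
        variants = [prefix + option for prefix in variants for option in options]
    return [variant for variant in variants if variant in keys]


words = {"foo", "f\u00f6o", "bar"}
found = similar_keys(words, "foo", {"o": "\u00f6"})
print(found)
```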
0.2 (2012-09-08)
Greatly improved memory usage for DAWGs loaded with load method.
There is currently a bug somewhere in the wrapper, so DAWGs loaded with the read() method or unpickled DAWGs use 3x-4x more memory than DAWGs loaded with the load() method. load() is fixed in this release but the other methods are not.
0.1 (2012-09-08)
Initial release.
Built Distributions
Hashes for DAWG-0.7.8-cp36-cp36m-macosx_10_12_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 3a5ea13d5a424542d1a7fa908db974e712be90ccdd86cec9e24c6b20794f5f5e
MD5 | b9c2a5a22c9d581f9906a0562fef0f65
BLAKE2b-256 | 874ae2933c2e02abe8034ea7e61f0694d5d170b2facf5f8e68d91f89f133da65
Hashes for DAWG-0.7.8-cp35-cp35m-macosx_10_11_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 402659e3044a5fb79dadefeaabb15ba9c0ef56c844bb4bcde6b102afbf4788f8
MD5 | 6a558b550481fdd3123d5650491e9f3e
BLAKE2b-256 | bb150aa44dc0d70450a3364e8899e2aacca65379ad28b8c9770b08921cc83a7f
Hashes for DAWG-0.7.8-cp34-cp34m-macosx_10_10_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b1f9c72bb3eca530f78fcf82f2d60ff41298f10e1c9f018b402af0ecbe246171
MD5 | 57a38ee7a52781972c620a26d48decd2
BLAKE2b-256 | e499d35b9459c15988d869ff04b698e59ccf10e4f17c498ab03eab165b7ec762
Hashes for DAWG-0.7.8-cp27-none-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 7accbfe484a353e1f02a947f84f817846f30738d1170d4e855f536d5708632a3
MD5 | e2ef2a4e6fb4861cb856df1b2f984385
BLAKE2b-256 | 82f0b5c567db487355a8d14fed1b2eb5af9209b52faf175b4111f1f6924d9365
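A downloaded wheel can be checked against the SHA256 digests listed above with the standard library. The sketch below is one way to do it; the helper names and the chunk size are choices made for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published hex digest, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_published_hash` with the path to a downloaded wheel and the SHA256 value from the matching table should return `True` for an intact file.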