Fast and memory efficient DAWG for Python
DAWG
This package provides read-only, dictionary-like objects for Python (2.x and 3.x) based on a DAWG (directed acyclic word graph, also known as DAFSA).
String data in a DAWG may take 200x less memory than in a standard Python dict and the raw lookup speed is comparable; it also provides fast advanced methods like prefix search.
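To give a rough intuition for where the memory savings come from, the sketch below compares a plain trie with the same structure after identical subtrees are merged, which is the core idea behind a DAWG. It is an illustrative pure-Python toy, not the bundled dawgdic implementation; the function names and the use of an empty-string key as an end-of-word marker are assumptions made for this example.

```python
def build_trie(words):
    """Build a nested-dict trie; the "" key marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # end-of-word marker (illustrative convention)
    return root


def count_trie_nodes(node):
    """Count every node of the trie, including end-of-word markers."""
    return 1 + sum(count_trie_nodes(child) for child in node.values())


def count_minimized_nodes(root):
    """Count distinct subtrees, i.e. nodes left after merging equal ones."""
    seen = {}

    def canon(node):
        # A subtree is identified by its sorted (label, child id) pairs.
        sig = tuple(sorted((ch, canon(child)) for ch, child in node.items()))
        return seen.setdefault(sig, len(seen))

    canon(root)
    return len(seen)


words = ["tap", "taps", "top", "tops"]
trie = build_trie(words)
print(count_trie_nodes(trie), count_minimized_nodes(trie))
```

Words that share suffixes as well as prefixes collapse into far fewer nodes, and the effect grows with the size of the vocabulary.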
Docs: http://dawg.readthedocs.org
Source code:
Issue tracker: https://github.com/kmike/DAWG/issues
License
Wrapper code is licensed under MIT License. Bundled dawgdic C++ library is licensed under BSD license. Bundled libb64 is Public Domain.
Changes
0.7.6 (2014-08-10)
Wrapper is rebuilt with Cython 0.20.2 to fix some issues.
0.7.5 (2014-06-05)
Switched to setuptools;
some wheels are uploaded to PyPI.
0.7.4 (2014-05-29)
Fixed a bug in DAWG building: input should be sorted according to its binary representation.
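As an illustration of what ordering by binary representation means, the sketch below sorts keys by their encoded bytes and contrasts the result with a case-insensitive sort. It assumes keys are encoded as UTF-8; the sample keys are made up for the example.

```python
keys = ["b", "a", "B", "é"]

# Order by the encoded bytes of each key (UTF-8 is assumed here).
byte_sorted = sorted(keys, key=lambda k: k.encode("utf-8"))

# A "human-friendly" case-insensitive order is different.
caseless = sorted(keys, key=str.lower)

print(byte_sorted)
print(caseless)
```

Any ordering that ignores case or follows locale rules can therefore disagree with the byte order.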
0.7.3 (2014-05-29)
Wrapper is rebuilt with Cython 0.21dev;
Python 3.4 compatibility is verified.
0.7.2 (2013-10-03)
has_keys_with_prefix(prefix) method (thanks Matt Hickford)
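The question this method answers can be sketched over a plain sorted list with the standard bisect module. This is only an illustration of the semantics, not how the DAWG answers it internally; `any_key_startswith` is a hypothetical helper name.

```python
import bisect


def any_key_startswith(sorted_keys, prefix):
    """Return True if some key in the sorted list starts with prefix."""
    # Keys sharing a prefix form a contiguous run that begins at the
    # first key that is >= prefix.
    i = bisect.bisect_left(sorted_keys, prefix)
    return i < len(sorted_keys) and sorted_keys[i].startswith(prefix)


keys = sorted(["bar", "foo", "foobar"])
print(any_key_startswith(keys, "fo"), any_key_startswith(keys, "baz"))
```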
0.7.1 (2013-05-25)
Extension is rebuilt with Cython 0.19.1;
fixed segfault that happened on lookup from incorrectly loaded DAWG (thanks Alex Moiseenko).
0.7 (2013-04-05)
IntCompletionDAWG
0.6.1 (2013-03-23)
Installation issues in environments with LC_ALL=C are fixed;
PyPy is officially unsupported now (use DAWG-Python with PyPy).
0.6 (2013-03-22)
many thread-safety bugs are fixed (at the cost of slowing the library down).
0.5.5 (2013-02-19)
fix installation under PyPy (note: DAWG is slow under PyPy and may have bugs).
0.5.4 (2013-02-14)
small tweaks for docstrings;
the extension is rebuilt using Cython 0.18.
0.5.3 (2013-01-03)
small improvements to .compile_replaces method;
benchmarks for .similar_items method;
the extension is rebuilt with Cython pre-0.18; this made .prefixes and .iterprefixes methods faster (up to 6x in some cases).
0.5.2 (2013-01-02)
tests are included in source distribution;
benchmark results in README were nonrepresentative because of my broken (slow) Python 3.2 install;
installation is fixed under Python 3.x with LC_ALL=C (thanks Jakub Wilk).
0.5.1 (2012-10-11)
better error reporting while building DAWGs;
__contains__ is fixed for keys with zero bytes;
dawg.Error exception class;
building of BytesDAWG and RecordDAWG fails instead of producing incorrect results if some of the keys have unsupported characters.
0.5 (2012-10-08)
The storage scheme of BytesDAWG and RecordDAWG is changed in this release in order to provide the alphabetical ordering of items.
This is a backwards-incompatible release. In order to read BytesDAWG or RecordDAWG created with previous versions of DAWG use payload_separator constructor argument:
>>> from dawg import BytesDAWG
>>> BytesDAWG(payload_separator=b'\xff').load('old.dawg')
0.4.1 (2012-10-01)
Segfaults with empty DAWGs are fixed by updating dawgdic to latest svn.
0.4 (2012-09-26)
iterkeys, iteritems and iterprefixes methods (thanks Dan Blanchard).
0.3.2 (2012-09-24)
prefixes method for finding all prefixes of a given key.
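What "all prefixes of a given key" means can be sketched in plain Python as follows. This shows the expected result only; the DAWG presumably finds the same keys in one walk along the key, and `prefixes_of` is a hypothetical helper name.

```python
def prefixes_of(keys, key):
    """Return the stored keys that are prefixes of key, shortest first."""
    key_set = set(keys)
    return [key[:i] for i in range(1, len(key) + 1) if key[:i] in key_set]


print(prefixes_of(["f", "bar", "foo", "foobar"], "foobarz"))
```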
0.3.1 (2012-09-20)
bundled dawgdic C++ library is updated to the latest version.
0.3 (2012-09-13)
similar_keys, similar_items and similar_item_values methods for more permissive lookups (they may be useful e.g. for umlaut handling);
load method returns self;
Python 3.3 support.
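The idea behind the permissive lookups added in 0.3 can be sketched in plain Python: given a mapping of single-character replacements, try every variant of the query and keep those that are stored keys. This is a brute-force illustration under assumed semantics, not the library's algorithm; `similar_keys_sketch` and the sample data are hypothetical.

```python
from itertools import product


def similar_keys_sketch(keys, word, replaces):
    """Return stored keys reachable from word via optional replacements."""
    key_set = set(keys)
    # Each character may stay as-is or be swapped for its replacement.
    options = [(ch, replaces[ch]) if ch in replaces else (ch,) for ch in word]
    candidates = ("".join(chars) for chars in product(*options))
    return [c for c in candidates if c in key_set]


print(similar_keys_sketch(["uber", "über"], "uber", {"u": "ü"}))
```

The brute-force version grows exponentially with the number of replaceable characters; a graph-based lookup can avoid this by discarding variants as soon as they leave the set of stored prefixes.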
0.2 (2012-09-08)
Greatly improved memory usage for DAWGs loaded with load method.
There is currently a bug somewhere in the wrapper, so DAWGs loaded with the read() method or unpickled DAWGs use 3x-4x more memory than DAWGs loaded with the load() method. load() is fixed in this release but the other methods are not.
0.1 (2012-09-08)
Initial release.
Built Distributions
Hashes for DAWG-0.7.6-cp34-cp34m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 5a95d52d88d5ed6675483df58a3ece6c38331a246dbf34493f0ca73dc9b1516a
MD5 | 2bbc2aaa87fdc2186d9349ee8baabee1
BLAKE2b-256 | 579f9476ba37d1ee5c0807f15f9376970d99ecfa0199a52871476ad08d414646
Hashes for DAWG-0.7.6-cp27-none-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | d31b95365f2349918b00fa7af53a0af393b837e0d2c738f31f466c426ce9fa29
MD5 | 10ac96e2ef9ab62dbde8a75746fe58c8
BLAKE2b-256 | cc188c1b31df06c8fdb2b88d1aaf314611aeb8006ec6893693b8e9fb76273716