Fast and memory efficient DAWG for Python

Project description

DAWG

This package provides DAWG(DAFSA)-based dictionary-like read-only objects for Python (2.x and 3.x).

String data in a DAWG may take 200x less memory than in a standard Python dict and the raw lookup speed is comparable; it also provides fast advanced methods like prefix search.
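To make "prefix search" concrete, here is a minimal pure-Python sketch of the two kinds of prefix query. It does not use this package: the helper names `keys_with_prefix` and `prefixes_of` are hypothetical, and a plain sorted list and a set stand in for the DAWG, which answers the same questions from a much more compact structure.

```python
from bisect import bisect_left


def keys_with_prefix(sorted_keys, prefix):
    """Return every key in sorted_keys that starts with prefix."""
    result = []
    # Keys sharing a prefix are contiguous in sorted order and begin
    # at the first key that is >= prefix.
    for key in sorted_keys[bisect_left(sorted_keys, prefix):]:
        if not key.startswith(prefix):
            break
        result.append(key)
    return result


def prefixes_of(key_set, key):
    """Return every stored key that is a prefix of key, shortest first."""
    return [key[:i] for i in range(1, len(key) + 1) if key[:i] in key_set]


words = sorted(["bar", "foo", "foobar", "food"])
print(keys_with_prefix(words, "foo"))     # ['foo', 'foobar', 'food']
print(prefixes_of(set(words), "foobar"))  # ['foo', 'foobar']
```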

Docs: http://dawg.readthedocs.org

Source code:

Issue tracker: https://github.com/kmike/DAWG/issues

License

The wrapper code is licensed under the MIT License. The bundled dawgdic C++ library is licensed under the BSD license. The bundled libb64 is in the public domain.

Changes

0.7.5 (2014-06-05)

  • Switched to setuptools;

  • some wheels are uploaded to PyPI.

0.7.4 (2014-05-29)

  • Fixed a bug in DAWG building: input should be sorted according to its binary representation.
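As a rough illustration of what "sorted according to its binary representation" can mean for text keys, the sketch below sorts by the encoded bytes rather than by case-insensitive or locale rules. The helper name `sort_by_bytes` is hypothetical, and UTF-8 as the encoding is an assumption; the changelog entry does not name one.

```python
def sort_by_bytes(keys, encoding="utf-8"):
    """Sort text keys by their encoded byte strings (encoding is an assumption)."""
    return sorted(keys, key=lambda k: k.encode(encoding))


keys = ["b", "B", "a", "é"]
print(sorted(keys, key=str.lower))  # case-insensitive order: ['a', 'b', 'B', 'é']
print(sort_by_bytes(keys))          # byte order: ['B', 'a', 'b', 'é']
```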

0.7.3 (2014-05-29)

  • Wrapper is rebuilt with Cython 0.21dev;

  • Python 3.4 compatibility is verified.

0.7.2 (2013-10-03)

0.7.1 (2013-05-25)

  • Extension is rebuilt with Cython 0.19.1;

  • fixed segfault that happened on lookup from incorrectly loaded DAWG (thanks Alex Moiseenko).

0.7 (2013-04-05)

  • IntCompletionDAWG

0.6.1 (2013-03-23)

  • Installation issues in environments with LC_ALL=C are fixed;

  • PyPy is officially unsupported now (use DAWG-Python with PyPy).

0.6 (2013-03-22)

  • many thread-safety bugs are fixed (at the cost of slowing the library down).

0.5.5 (2013-02-19)

  • fix installation under PyPy (note: DAWG is slow under PyPy and may have bugs).

0.5.4 (2013-02-14)

  • small tweaks for docstrings;

  • the extension is rebuilt using Cython 0.18.

0.5.3 (2013-01-03)

  • small improvements to .compile_replaces method;

  • benchmarks for .similar_items method;

  • the extension is rebuilt with Cython pre-0.18; this made .prefixes and .iterprefixes methods faster (up to 6x in some cases).

0.5.2 (2013-01-02)

  • tests are included in source distribution;

  • benchmark results in README were unrepresentative because of my broken (slow) Python 3.2 install;

  • installation is fixed under Python 3.x with LC_ALL=C (thanks Jakub Wilk).

0.5.1 (2012-10-11)

  • better error reporting while building DAWGs;

  • __contains__ is fixed for keys with zero bytes;

  • dawg.Error exception class;

  • building of BytesDAWG and RecordDAWG fails instead of producing incorrect results if some of the keys have unsupported characters.

0.5 (2012-10-08)

The storage scheme of BytesDAWG and RecordDAWG is changed in this release in order to provide the alphabetical ordering of items.

This is a backwards-incompatible release. To read a BytesDAWG or RecordDAWG created with a previous version of DAWG, pass the payload_separator constructor argument:

>>> from dawg import BytesDAWG
>>> BytesDAWG(payload_separator=b'\xff').load('old.dawg')

0.4.1 (2012-10-01)

  • Segfaults with empty DAWGs are fixed by updating dawgdic to latest svn.

0.4 (2012-09-26)

  • iterkeys, iteritems and iterprefixes methods (thanks Dan Blanchard).

0.3.2 (2012-09-24)

  • prefixes method for finding all prefixes of a given key.

0.3.1 (2012-09-20)

  • bundled dawgdic C++ library is updated to the latest version.

0.3 (2012-09-13)

  • similar_keys, similar_items and similar_item_values methods for more permissive lookups (they may be useful e.g. for umlaut handling);

  • load method returns self;

  • Python 3.3 support.
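The idea behind the permissive lookups added in 0.3 can be sketched in pure Python as follows. This is not the package's implementation or API: `similar_keys_sketch` is a hypothetical helper that enumerates every spelling allowed by a replacement table and keeps those present in a plain set. Its cost grows exponentially with the number of replaceable characters, so it only illustrates the behaviour.

```python
from itertools import product


def similar_keys_sketch(known_keys, key, replaces):
    """Return stored keys equal to key up to the per-character replacements."""
    # Each character may stay as it is or be swapped for its replacement.
    options = [(ch, replaces[ch]) if ch in replaces else (ch,) for ch in key]
    candidates = ("".join(chars) for chars in product(*options))
    return [candidate for candidate in candidates if candidate in known_keys]


known = {"uber", "über", "tur"}
print(similar_keys_sketch(known, "uber", {"u": "ü"}))  # ['uber', 'über']
print(similar_keys_sketch(known, "tur", {"u": "ü"}))   # ['tur']
```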

0.2 (2012-09-08)

Greatly improved memory usage for DAWGs loaded with load method.

There is currently a bug somewhere in the wrapper, so DAWGs loaded with the read() method and unpickled DAWGs use 3x-4x more memory than DAWGs loaded with the load() method. load() is fixed in this release but the other methods are not.

0.1 (2012-09-08)

Initial release.

Download files

Source Distribution

DAWG-0.7.5.tar.gz (250.1 kB)

Uploaded Source

Built Distributions

DAWG-0.7.5-cp34-cp34m-macosx_10_9_x86_64.whl (138.9 kB)

Uploaded CPython 3.4m macOS 10.9+ x86-64

DAWG-0.7.5-cp33-cp33m-macosx_10_6_intel.whl (262.6 kB)

Uploaded CPython 3.3m macOS 10.6+ intel

DAWG-0.7.5-cp27-none-macosx_10_9_x86_64.whl (136.5 kB)

Uploaded CPython 2.7 macOS 10.9+ x86-64

File details

Details for the file DAWG-0.7.5.tar.gz.

File metadata

  • Download URL: DAWG-0.7.5.tar.gz
  • Upload date:
  • Size: 250.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for DAWG-0.7.5.tar.gz
Algorithm Hash digest
SHA256 98738dd7a33875632f83aea34b46b7acf32c50d85e07391dd92037fbc8744e11
MD5 ae7ed1637c66b407f79fb4aebc5b87ae
BLAKE2b-256 35857abba561d3c4f92f2f1823109dd8d77d1e7d11989f3cb02e3c12ff1fb6ba
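One way to use the digests listed here is to recompute the SHA256 of a downloaded file and compare it with the published value. The sketch below uses only the standard library; the helper names `sha256_of` and `matches_published` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare the file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call, assuming the archive sits in the current directory:
# matches_published(
#     "DAWG-0.7.5.tar.gz",
#     "98738dd7a33875632f83aea34b46b7acf32c50d85e07391dd92037fbc8744e11",
# )
```

If the comparison is false, the download is incomplete or is not the file that was published.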

See more details on using hashes here.

File details

Details for the file DAWG-0.7.5-cp34-cp34m-macosx_10_9_x86_64.whl.

File hashes

Hashes for DAWG-0.7.5-cp34-cp34m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 bd563b683d8fe78517102df9768c59dafb04c8ab6a371ddeef4c31cd63668a21
MD5 0f7d5d2daf644a460e34d65360dbbd52
BLAKE2b-256 68d7f15002a9e072c5fc30f6339a59831aafbb171caf1cfab983027ab56e8812

File details

Details for the file DAWG-0.7.5-cp33-cp33m-macosx_10_6_intel.whl.

File hashes

Hashes for DAWG-0.7.5-cp33-cp33m-macosx_10_6_intel.whl
Algorithm Hash digest
SHA256 45fd67803dc3b5fe322d2183eacdb3b7f72dc00781e701f80a2941ac8f6721dd
MD5 79f7a76b074185dbb04ff51d070fbd7a
BLAKE2b-256 0fddc3e9cc3ec55777313c588ec52398b0e619bfc8786d0291c6d415ad4284d0

File details

Details for the file DAWG-0.7.5-cp27-none-macosx_10_9_x86_64.whl.

File hashes

Hashes for DAWG-0.7.5-cp27-none-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 6a914ee881c26ecb3681ff677291f31ff52c51c04f47b8ac304c16776ffd224f
MD5 7078484100ceb3ebf2df1bfdcbe44d6f
BLAKE2b-256 072e608b2efd76af178e14e9404f07bf97fedab927782a9b3ae10c03d2d82c02
