
Trigram-based algorithm for Addok.


Addok-trigrams

Alternative indexation pattern for Addok, based on trigrams.

Installation

pip install addok-trigrams

Configuration

In your local configuration file:

  • remove the default RESULTS_COLLECTORS_PYPATHS entries that rely on token reduction, autocomplete, or fuzzy matching:

      from addok.config.default import RESULTS_COLLECTORS_PYPATHS
      RESULTS_COLLECTORS_PYPATHS.remove('addok.helpers.collectors.extend_results_reducing_tokens')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_but_geohash_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.no_meaningful_but_common_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.autocomplete_meaningful_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.fuzzy.fuzzy_collector')
    
  • once all autocomplete and fuzzy collectors are removed from RESULTS_COLLECTORS_PYPATHS, add the trigram-based ones:

      RESULTS_COLLECTORS_PYPATHS += [
          'addok_trigrams.extend_results_removing_numbers',
          'addok_trigrams.extend_results_removing_one_whole_word',
          'addok_trigrams.extend_results_removing_successive_trigrams',
      ]
    
  • add trigramize to PROCESSORS_PYPATHS:

      from addok.config.default import PROCESSORS_PYPATHS
      PROCESSORS_PYPATHS += [
          'addok_trigrams.trigramize',
      ]
    
  • remove pairs and autocomplete indexers from INDEXERS_PYPATHS:

      from addok.config.default import INDEXERS_PYPATHS
      INDEXERS_PYPATHS.remove('addok.pairs.PairsIndexer')
      INDEXERS_PYPATHS.remove('addok.autocomplete.EdgeNgramIndexer')
    

By default, digit-only words are not turned into trigrams. To have them split into trigrams as well, set TRIGRAM_SKIP_DIGIT = False in your local configuration file.
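
To make the effect of this setting concrete, here is a minimal sketch of what splitting a word into trigrams means. It is an illustration only, not the actual addok_trigrams code: the function name `to_trigrams` and its `skip_digit` parameter are hypothetical stand-ins for the `trigramize` processor and the `TRIGRAM_SKIP_DIGIT` setting, and the handling of words shorter than three characters is an assumption.

```python
def to_trigrams(word, skip_digit=True):
    """Split a word into overlapping three-character chunks.

    Hypothetical helper for illustration; `skip_digit` mimics the
    documented behaviour of TRIGRAM_SKIP_DIGIT.
    """
    # Digit-only words (house numbers, postcodes) are kept whole by default.
    if skip_digit and word.isdigit():
        return [word]
    # Assumption: words too short to form a trigram are kept as-is.
    if len(word) < 3:
        return [word]
    # Slide a window of width 3 over the word, one character at a time.
    return [word[i:i + 3] for i in range(len(word) - 2)]


print(to_trigrams("lilas"))                    # ['lil', 'ila', 'las']
print(to_trigrams("75011"))                    # ['75011']
print(to_trigrams("75011", skip_digit=False))  # ['750', '501', '011']
```

Keeping digit-only words whole means a number only matches exactly, while splitting them allows partial matches on numbers at the cost of more index entries.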

Usage

Use addok batch to import documents, just as with plain Addok. There is no need to run addok ngrams afterwards, since the trigrams are already part of the indexing strategy.


