Trigram-based algorithm for Addok.

Addok-trigrams

Alternative indexation pattern for Addok, based on trigrams.

Installation

pip install addok-trigrams

Configuration

In your local configuration file:

  • remove the token-reducing, autocomplete and fuzzy collectors from RESULTS_COLLECTORS_PYPATHS:

      from addok.config.default import RESULTS_COLLECTORS_PYPATHS
      RESULTS_COLLECTORS_PYPATHS.remove('addok.helpers.collectors.extend_results_reducing_tokens')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_but_geohash_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.no_meaningful_but_common_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_try_autocomplete_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.autocomplete_meaningful_collector')
      RESULTS_COLLECTORS_PYPATHS.remove('addok.fuzzy.fuzzy_collector')
    
  • add the trigram collectors to RESULTS_COLLECTORS_PYPATHS (the autocomplete and fuzzy ones must already be removed):

      RESULTS_COLLECTORS_PYPATHS += [
          'addok_trigrams.extend_results_removing_numbers',
          'addok_trigrams.extend_results_removing_one_whole_word',
          'addok_trigrams.extend_results_removing_successive_trigrams',
      ]
    
  • add trigramize to PROCESSORS_PYPATHS:

      from addok.config.default import PROCESSORS_PYPATHS
      PROCESSORS_PYPATHS += [
          'addok_trigrams.trigramize',
      ]
    
  • remove pairs and autocomplete indexers from INDEXERS_PYPATHS:

      from addok.config.default import INDEXERS_PYPATHS
      INDEXERS_PYPATHS.remove('addok.pairs.PairsIndexer')
      INDEXERS_PYPATHS.remove('addok.autocomplete.EdgeNgramIndexer')
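
Note that list.remove() raises ValueError when the entry is not in the list, so the snippets above fail if the installed Addok version does not ship one of these paths by default. The following is a minimal sketch of a more tolerant variant. The helper name remove_if_present is hypothetical (it is not part of Addok or addok-trigrams), and the list below is a stand-in for addok.config.default.INDEXERS_PYPATHS so that the example runs without Addok installed.

```python
def remove_if_present(paths, unwanted):
    """Return a copy of `paths` without any entry listed in `unwanted`.

    Hypothetical helper: unlike list.remove(), it does not raise
    when an unwanted entry is absent.
    """
    unwanted = set(unwanted)
    return [path for path in paths if path not in unwanted]


# Stand-in for addok.config.default.INDEXERS_PYPATHS (assumption: the
# first entry is a placeholder, not a real Addok indexer path).
INDEXERS_PYPATHS = [
    'some.other.Indexer',
    'addok.pairs.PairsIndexer',
    'addok.autocomplete.EdgeNgramIndexer',
]

INDEXERS_PYPATHS = remove_if_present(INDEXERS_PYPATHS, [
    'addok.pairs.PairsIndexer',
    'addok.autocomplete.EdgeNgramIndexer',
    'not.in.the.list',  # silently ignored instead of raising ValueError
])
```

In a real configuration file, the list would come from the from addok.config.default import ... lines shown above instead of being defined by hand.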
    

By default, digit-only words are not turned into trigrams. To turn them into trigrams as well, set TRIGRAM_SKIP_DIGIT=False.
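
To illustrate what splitting a word into trigrams means, and what skipping digit-only words changes, here is a minimal sketch. It is not the addok-trigrams implementation: the function name to_trigrams, its skip_digit parameter, and the choice to leave words of three characters or fewer untouched are all assumptions made for the example.

```python
def to_trigrams(word, skip_digit=True):
    """Split `word` into overlapping 3-character chunks (illustrative only).

    Assumptions: short words (3 characters or fewer) are returned whole,
    and digit-only words are returned whole when `skip_digit` is true.
    """
    if len(word) <= 3 or (skip_digit and word.isdigit()):
        return [word]
    return [word[i:i + 3] for i in range(len(word) - 2)]


print(to_trigrams('lilas'))                    # ['lil', 'ila', 'las']
print(to_trigrams('75011'))                    # ['75011']
print(to_trigrams('75011', skip_digit=False))  # ['750', '501', '011']
```

With the digit skip enabled, a number such as 75011 only matches as a whole token. With it disabled, partial numbers can match too, at the cost of more index entries.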

Usage

Import documents with addok batch, exactly as with stock Addok. There is no need to run addok ngrams afterwards, because the trigrams are already part of the index strategy.
