Composite index for the Catalog

Project Description

Introduction

CompositeIndex is a plugin index for the ZCatalog. An index that uses more than one attribute to index an object is called a “composite index”. Such an index should be created if you expect to run queries that have multiple attributes in the search phrase and all attributes combined give significantly fewer hits than any of the attributes alone. The key of a composite index is called a “composite key” and is composed of two or more attributes of an object.
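
To make the idea of a composite key concrete, here is a minimal sketch in plain Python. It is a toy model and not the package's implementation: the function name `build_composite_index`, the sample objects and the choice of a tuple as the key are all assumptions made for illustration, and multi-valued attributes are ignored.

```python
from collections import defaultdict


def build_composite_index(objects, attributes):
    """Map each composite key (a tuple of attribute values) to a set of ids.

    Hypothetical helper for illustration only; assumes every attribute
    holds a single hashable value.
    """
    index = defaultdict(set)
    for doc_id, obj in objects.items():
        key = tuple(obj[name] for name in attributes)
        index[key].add(doc_id)
    return index


# Made-up sample data: id -> attribute values.
objects = {
    1: {"portal_type": "Document", "review_state": "published"},
    2: {"portal_type": "Document", "review_state": "private"},
    3: {"portal_type": "Folder", "review_state": "published"},
    4: {"portal_type": "Document", "review_state": "published"},
}

index = build_composite_index(objects, ("portal_type", "review_state"))

# One lookup answers a query on both attributes at once.
hits = index.get(("Document", "published"), set())
```

The combined key narrows the result in a single step, which is where the benefit comes from when each attribute alone would match many objects.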

Catalog queries containing attributes managed by CompositeIndex are transparently caught and transformed into a CompositeIndex query (via a monkey patch). In particular, large sites with a combination of additional indexes (FieldIndex, KeywordIndex) and lots of content (>100k) will profit. The expected performance enhancement for catalog queries is a factor of 2-3 or more.

Statistics

Figure: ratio of calculation time between atomic index and composite index queries.

The plot shows that the performance advantage of CompositeIndex increases significantly with an increasing number of indexed objects (>1000 catalog entries) and with an increasing number of combined attributes. The hit rate of the queries, relative to the total number of catalog entries, was about 6% for two combined attributes and 1% for three combined attributes. To keep the measurements comparable, the ZODB cache was cleared before each query.

Usage

From the ZCatalog indexes tab, add an index of type CompositeIndex.

Id: pick any valid id you like

Composite key: names of the attributes to concatenate, separated by commas

Example for Plone’s portal_catalog

Many catalog queries in Plone are based on a combination of the following indexed attributes: is_default_page, review_state, portal_type and allowedRolesAndUsers. Normally, the ZCatalog queries each corresponding atomic index in sequence and calculates the intersection of the results. For large sites in particular, this strategy decreases the performance of the catalog and at the same time increases the volatility of ZODB’s object cache, because each index on its own returns a high number of hits whereas the intersection of the index results has a low number of hits.

CompositeIndex overcomes this difficulty because its composite keys already contain a pre-calculated intersection. Loading large sets and then computing their expensive intersection is therefore no longer necessary.
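
The difference between the two strategies can be sketched with plain Python sets. This is a toy comparison under stated assumptions, not a benchmark and not the package's code: the 1000 synthetic entries, the helper `atomic_index` and the attribute values are made up for illustration.

```python
# Made-up catalog: every 2nd entry is a Document, every 3rd is published.
docs = {
    i: {
        "portal_type": "Document" if i % 2 == 0 else "Folder",
        "review_state": "published" if i % 3 == 0 else "private",
    }
    for i in range(1000)
}


def atomic_index(attr):
    """Hypothetical helper: one index per attribute, value -> set of ids."""
    index = {}
    for doc_id, values in docs.items():
        index.setdefault(values[attr], set()).add(doc_id)
    return index


type_index = atomic_index("portal_type")
state_index = atomic_index("review_state")

# Atomic strategy: load one large set per index, then intersect them.
type_hits = type_index["Document"]
state_hits = state_index["published"]
atomic_result = type_hits & state_hits

# Composite strategy: the intersection is already stored under one key.
composite_index = {}
for doc_id, values in docs.items():
    key = (values["portal_type"], values["review_state"])
    composite_index.setdefault(key, set()).add(doc_id)
composite_result = composite_index[("Document", "published")]

# Ids that had to be loaded by each strategy.
loaded_atomic = len(type_hits) + len(state_hits)
loaded_composite = len(composite_result)
```

In this toy data the atomic strategy loads 834 ids to arrive at 167 hits, while the composite lookup loads only those 167. Both strategies return the same result set.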

Here we show a configuration example for Plone. From the portal_catalog indexes tab, add an index of type CompositeIndex.

Id: comp01

Composite key: is_default_page,review_state,portal_type,allowedRolesAndUsers

Reindex the CompositeIndex “comp01”.

Now each query that contains two or more components of the composite key is automatically transformed into a query on the CompositeIndex “comp01”.
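
The rule stated above (two or more components trigger the transformation) can be sketched as follows. This only illustrates the decision rule; the helper `split_query` and the shape of the rewritten query are hypothetical, since the package's internal query format is not described here.

```python
# Components of the composite key configured for "comp01" above.
COMPOSITE_KEY = (
    "is_default_page",
    "review_state",
    "portal_type",
    "allowedRolesAndUsers",
)


def split_query(query, components=COMPOSITE_KEY, index_id="comp01"):
    """Hypothetical helper: route composite-key components to one index.

    If fewer than two components appear in the query, it is returned
    unchanged. Otherwise the matching components are grouped under
    index_id (an assumed representation, not the package's own).
    """
    matched = {name: query[name] for name in components if name in query}
    if len(matched) < 2:
        return dict(query)
    rest = {name: value for name, value in query.items() if name not in matched}
    rest[index_id] = matched
    return rest


# Two components of the composite key plus one unrelated index.
query = {"portal_type": "Document", "review_state": "published", "Title": "report"}
rewritten = split_query(query)

# Only one component: the query is left as it is.
untouched = split_query({"portal_type": "Document"})
```

Here `rewritten` groups `portal_type` and `review_state` under `comp01` and leaves `Title` for its own index, while `untouched` is identical to its input.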

Changelog

0.1 - rc1

  • Initial release

Release History

  • 0.1rc1-r84937 (this version)
  • 0.1rc1
  • 0.1dev-r84257
  • 0.1dev-r84057

Download Files

File Name                                               Python Version  File Type  Upload Date
unimr.compositeindex-0.1rc1_r84937-py2.4.egg (29.4 kB)  2.4             Egg        Apr 23, 2009
unimr.compositeindex-0.1rc1-r84937.tar.gz (42.5 kB)     -               Source     Apr 23, 2009
