experimental.catalogqueryplan

Static query optimized with one plan

These details have been verified by PyPI

Maintainers

fschulze hannosch mj tesdal wichert witsch

These details have not been verified by PyPI

Project links

Homepage

Framework
- Plone
Programming Language
- Python
Topic
- Software Development :: Libraries :: Python Modules

Project description

Introduction

While the catalog tool in Zope is immensely useful, we have seen some slowdowns in large Plone sites with a combination of additional indexes and lots of content.

The catalog implementation uses BTree set operations such as union, multiunion and intersection. Those operations are fairly fast, especially when everything is in memory. However, the catalog applies them naively, which leads to many set operations on rather large sets.

Query plan

Search engines and databases use query optimizers to select query plans that shrink the result set as early as possible, because working with large amounts of data is time consuming.

What we want to do is to search the indexes that give the smallest result set first. However, for that to be useful, we need to pass that result along to the remaining indexes so that each index can limit its own result set internally as early as possible. When calculating a path search, there is no need to look at all 150000 results if the portal type index has already limited the possible results to 10000. Once the result is limited to 10000 items, all subsequent set operations are significantly faster.
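As a rough illustration of passing an intermediate result into the next index, here is a minimal sketch. It uses plain Python sets and dictionaries as stand-ins for the BTree structures, and the names `apply_index`, `search` and `restrict_to` are hypothetical, not the package's actual API.

```python
def apply_index(index, value, restrict_to=None):
    """Return the document ids matching ``value`` in a toy index.

    ``index`` maps a value to a set of document ids. If ``restrict_to``
    is given, the result is limited to ids already present in it.
    """
    matches = index.get(value, set())
    if restrict_to is None:
        return set(matches)
    return matches & restrict_to


def search(indexes, query, order):
    """Apply the indexes in ``order``, passing each result to the next."""
    result = None
    for name in order:
        result = apply_index(indexes[name], query[name], restrict_to=result)
        if not result:
            break  # nothing left to narrow down
    return result if result is not None else set()


# Toy data: 10000 documents of one type, 150000 items under one path.
indexes = {
    "portal_type": {"Document": set(range(0, 10000))},
    "path": {"/site": set(range(5000, 155000))},
}
query = {"portal_type": "Document", "path": "/site"}

# Searching the more selective index first keeps the working set small.
hits = search(indexes, query, order=["portal_type", "path"])
```

The order does not change the final result, only how much data each step has to handle.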

We identify different searches by the list of indexes that are searched. If there is no query plan for a set of indexes, the query is run as normal while the number of results for each index is recorded. When all indexes have been checked, the list is sorted by number of results and stored as a query plan. The next time a search on the same indexes comes in, the query plan is looked up and the indexes are searched in that order.
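The bookkeeping described above could be sketched as follows. This is a simplified model using plain dictionaries and sets, not the package's implementation; the class name `PlanningCatalog` and its attributes are assumptions made for the example.

```python
class PlanningCatalog:
    """Toy catalog that learns an index order per set of queried indexes."""

    def __init__(self, indexes):
        self.indexes = indexes  # index name -> {value -> set of doc ids}
        self.plans = {}         # tuple of index names -> ordered index names

    def _hits(self, name, value):
        return self.indexes[name].get(value, set())

    def search(self, query):
        key = tuple(sorted(query))  # the searched indexes identify the plan
        plan = self.plans.get(key)
        result = None
        if plan is None:
            # No plan yet: run as normal and record result counts per index.
            counts = {}
            for name, value in query.items():
                hits = self._hits(name, value)
                counts[name] = len(hits)
                result = set(hits) if result is None else result & hits
            # Store the indexes sorted by number of results, smallest first.
            self.plans[key] = sorted(counts, key=counts.get)
        else:
            for name in plan:
                hits = self._hits(name, query[name])
                result = set(hits) if result is None else result & hits
                if not result:
                    break
        return result if result is not None else set()


catalog = PlanningCatalog({
    "portal_type": {"Document": {1, 2, 3, 4, 5, 6}},
    "review_state": {"draft": {2, 9}},
})
query = {"portal_type": "Document", "review_state": "draft"}
first = catalog.search(query)   # unplanned run, stores a plan
second = catalog.search(query)  # planned run, smallest index first
```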

To get different query plans for similar queries, you can provide additional bogus index names. They will be ignored by the catalog, but will become part of the key. This means that if you search for Documents in draft state for a worklist, you can have a different ordering than when searching for published Documents, as there are likely to be very few items in draft state, but many in published state.
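To make the idea of bogus index names concrete, the sketch below shows how an extra name could change the plan key without affecting which indexes are searched. The helper names, the `KNOWN_INDEXES` set and the bogus name `worklist_hint` are all hypothetical and chosen only for this example.

```python
# Index names the toy catalog actually knows about (an assumption).
KNOWN_INDEXES = {"portal_type", "review_state"}


def plan_key(query):
    """Key used to look up a query plan: every name in the query counts."""
    return tuple(sorted(query))


def effective_query(query):
    """The part of the query that is really searched: unknown names are ignored."""
    return {name: value for name, value in query.items() if name in KNOWN_INDEXES}


published = {"portal_type": "Document", "review_state": "published"}
# Same indexes, plus a bogus name so the worklist search gets its own plan.
drafts = {"portal_type": "Document", "review_state": "draft", "worklist_hint": True}

published_key = plan_key(published)
drafts_key = plan_key(drafts)
```

Because the two keys differ, each search can end up with its own index ordering, while the bogus name has no effect on the results.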

Testing

To test, import the monkey patch in the tests of another package, such as CMFPlone:

import experimental.catalogqueryplan

and run the test.

Changelog

1.0 - 2009-01-02

Removed redundant intersections, added type checking to difference [tesdal]
Add alternative weightedIntersection, and reuse BTree tests [tesdal]
Don’t monkeypatch intersection as zc.relationship will try to pickle the function. Added new ExtendedPathIndex code. [tesdal]
Optimize UnIndex.apply_index internally, sort sets for AND, use multiunion for OR. [tesdal]
Limit the number of if-statements in intersection, and added test for fastest way of finding max and min. [tesdal]
Monkeypatch difference to handle big/tiny difference in Python. This doesn’t belong in queryplan, as it’s only a BTree patch, and should be refactored out. [tesdal]
Added performance tests. [tesdal]
Fixed a bug with UnIndex return result missing index id [tesdal]
Added tests for intersection, fixed a bug with empty second argument set [tesdal]
Monkeypatch intersect to handle big/tiny intersects in Python [tesdal]
Improved UnIndex query, to avoid redundant intersections [tesdal]
Clarified LanguageIndex support. Fallback support is still missing, so the optimization is now disabled when fallback is enabled. [hannosch, mj]

0.9 - 2008-10-18

Added support for LinguaPlone’s LanguageIndex. [hannosch]

0.8 - 2008-09-03

Let each index patch register itself with the ADVANCEDTYPES list. This should enable patching of other indexes as well, and remove the dependency on ExtendedPathIndex. [tesdal]

0.7 - 2008-08-22

Check whether we’re supposed to use daterangeindex at all before retrieving cached data. [tesdal]

0.6 - 2008-07-03

Use a volatile instance variable to store the prioritymap. [mj]

0.5 - 2008-06-23

DateRangeIndex shouldn’t overwrite the semi-request passed into the apply_index method. [mj]

0.4 - 2008-06-23

DateRangeIndex now doesn’t assume that REQUEST is available. [tesdal]

0.3

Handle request being a dictionary. [tesdal]

0.3

Refactored patches into multiple files. [tesdal]
Dynamic query optimization based on result set analysis from queries against the same indexes. [tesdal]
Manual query optimization based on typical usage pattern. [tesdal]

0.1

Initial release

Release history

3.2.8 - Jan 7, 2013
3.2.7 - Aug 23, 2011
3.2.6 - Aug 20, 2011
3.2.5 - May 27, 2011
3.2.4 - Apr 27, 2011
3.2.3 - Apr 10, 2011
3.2.2 - Apr 9, 2011
3.2.1 - Mar 16, 2011
3.2.0 - Mar 8, 2011
3.1.0 - Dec 27, 2010
3.0.2 - Sep 28, 2010
3.0.1 - Sep 24, 2010
3.0 - May 13, 2010
3.0a3 (pre-release) - Mar 7, 2010
3.0a2 (pre-release) - Feb 21, 2010
3.0a1 (pre-release) - Feb 21, 2010
2.1 - Nov 19, 2009
2.0 - Nov 10, 2009
1.9 - Nov 6, 2009
1.8 - Nov 6, 2009
1.7 - Oct 17, 2009
1.6 - Sep 10, 2009
1.5 - Jul 27, 2009
1.4 - May 20, 2009
1.3 - Mar 15, 2009
1.2 - Mar 3, 2009
1.1 - Jan 2, 2009
1.0 (this version) - Jan 2, 2009
0.9 - Oct 18, 2008
0.8 - Sep 3, 2008
0.7 - Aug 22, 2008
0.6 - Jul 29, 2008

Download files

Source Distribution

experimental.catalogqueryplan-1.0.zip (30.3 kB view details)

Uploaded Jan 2, 2009 Source

File details

Details for the file experimental.catalogqueryplan-1.0.zip.

File metadata

Download URL: experimental.catalogqueryplan-1.0.zip
Upload date: Jan 2, 2009
Size: 30.3 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for experimental.catalogqueryplan-1.0.zip
Algorithm	Hash digest
SHA256	`1acb679acd13156f850a590c2d188406f07c7e14e5cb9c4f202b31f1cfaf1651`
MD5	`31d06a53c476460959b5d4f8f1285e8d`
BLAKE2b-256	`56f60b2b7e407aa3c5a950a5c5d66d9f229f29d4346402e12692cf8d2e404e69`
