Solr integration for Plone using collective.solr
Introduction
============
``ftw.solr`` provides various customizations and enhancements on top of
``collective.solr`` which integrates the Solr search engine with Plone.
.. contents:: Table of Contents
Features
========
Atomic updates (aka partial updates)
------------------------------------
Since Solr 4.0 it's possible to update fields in a Solr document individually,
sending only the fields that actually changed, whereas before it was necessary
to send *all* the fields again every time something changed (and therefore ask
Plone again to index them all, causing a massive performance penalty).
``ftw.solr`` supports atomic updates for Solr version 4.1 and above.
In order for atomic updates to work, three things must be taken care of:
- An ``<updateLog />`` must be enabled in ``solrconfig.xml``. If it's missing,
Solr will reject any update messages that contain atomic update instructions.
- A ``_version_`` field must be defined in the Solr schema.
- All fields in the Solr schema must be defined with ``stored="true"``.
In the stock Solr configs from 4.1 upwards ``<updateLog />`` and the
``_version_`` field are already configured correctly. If you're using
``collective.recipe.solrinstance``, check the generated ``solrconfig.xml``,
it might not have been updated for the use of atomic updates yet.
If there's a field in the Solr schema that's *not* ``stored="true"``, it will get
**dropped** from documents in Solr on the next update to that document.
Indexing won't fail, but that field simply won't have any content any more.
Apart from those prerequisites, there's nothing more to be done in order to use
atomic updates. ``ftw.solr`` will automatically perform atomic updates whenever
possible.
Also see http://wiki.apache.org/solr/Atomic_Updates
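
To illustrate what "sending only the fields that actually changed" means, the
following minimal sketch builds an atomic update message in Solr's JSON update
syntax, where each changed field is wrapped in a ``set`` modifier. It is not how
``ftw.solr`` builds its messages internally; the helper name, the document key
and the assumption that ``UID`` is the unique key field are hypothetical.

```python
import json


def build_atomic_update(unique_key, changed_fields, key_field="UID"):
    """Return a JSON atomic update message for a single Solr document.

    Only the changed fields are sent. Each one is wrapped in a ``set``
    modifier so that Solr replaces just that field and keeps the rest.
    ``key_field`` is assumed to be the unique key of the Solr schema.
    """
    doc = {key_field: unique_key}
    for name, value in changed_fields.items():
        doc[name] = {"set": value}
    return json.dumps([doc], sort_keys=True)


# Hypothetical document key and field value.
payload = build_atomic_update("a1b2c3", {"Title": "New title"})
print(payload)
```

Because Solr rebuilds the document from its stored fields when it applies such a
message, any field that is not stored is lost, which is why the prerequisites
above matter.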
Highlighting (aka Snippets)
---------------------------
When displaying search results, Plone by default displays the title and the
description of an item. Solr, like Google and other search engines, can return a
snippet of the text containing the words searched for.
``ftw.solr`` enables this feature in Plone.
Live search grouping
--------------------
Search results in Plone's live search can be grouped by ``portal_type``.
This is the way search results are shown in Spotlight on Mac OS X.
Facet queries
-------------
In addition to facet fields support provided by ``collective.solr``,
``ftw.solr`` adds support for facet queries.
This type of faceting offers a lot of flexibility.
Instead of faceting on the values of a specific field, you can specify
multiple Solr queries, each of which becomes a facet of its own.
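
As a rough sketch of what facet queries look like on the wire, the snippet
below builds a Solr query string with two ``facet.query`` parameters. The helper
name, the search term and the date ranges on a ``modified`` index are
assumptions chosen for illustration, not the queries ``ftw.solr`` ships with.

```python
from urllib.parse import urlencode


def facet_query_params(search_term, facet_queries):
    """Build a Solr query string with one ``facet.query`` per given query."""
    params = [("q", search_term), ("facet", "true")]
    # A list of tuples lets the same parameter name appear several times.
    params.extend(("facet.query", fq) for fq in facet_queries)
    return urlencode(params)


# Hypothetical time range facets on the ``modified`` index.
query_string = facet_query_params(
    "SearchableText:solr",
    ["modified:[NOW-7DAYS TO NOW]", "modified:[NOW-1MONTH TO NOW]"],
)
print(query_string)
```

Solr returns a count for each of these queries, so every query can be shown as
a separate facet in the search results.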
Word Cloud
----------
Assuming there is a correctly configured index ``wordCloudTerms``, a Word Cloud
showing the most common terms across documents can be displayed.
The Word Cloud is implemented in a browser view that can either be displayed
stand-alone by traversing to ``/@@wordcloud`` or rendered in a portlet.
Ajax-ified search form
----------------------
The search form is fully ajax-ified which leads to faster search results when
changing search criteria.
Tika
----
Up to and including ``ftw.solr`` 1.11.0, the ``BinaryAdder`` from
``collective.solr`` was unregistered because of unresolved issues with that feature.
These issues are solved as of ``collective.solr`` 5.0.3.
From 1.12.0 on, ``ftw.solr`` uses Solr's Tika integration by default to extract text from content.
If you have ``ftw.tika`` installed, unregister the ``ftw.tika`` portal transform or don't install it at all.
Otherwise the content will be indexed twice: once through ``ftw.tika`` and once through the Tika integrated in Solr.
To use the ``ftw.tika`` transform instead, unregister the adders from ``collective.solr`` by placing the following code into your project::
<include package="z3c.unconfigure" file="meta.zcml" />
<unconfigure>
<adapter
factory="collective.solr.indexer.BinaryAdder"
name="File"
/>
<adapter
factory="collective.solr.indexer.BinaryAdder"
name="Image"
/>
</unconfigure>
Solr connection configuration in ZCML
-------------------------------------
The connection settings for Solr can be configured in ZCML and thus in
buildout. This makes it easier when copying databases between multiple Zope
instances with different Solr servers. Example::

    zcml-additional =
        <configure xmlns:solr="http://namespaces.plone.org/solr">
            <solr:connection host="localhost" port="8983" base="/solr"/>
        </configure>

Solr Configuration
==================
Search Handlers
---------------
``ftw.solr`` requires two custom search handlers that must be configured on the
Solr server. Search handlers are configured in ``solrconfig.xml`` of your
collection.
The ``livesearch`` request handler is used for live search and should limit the
returned fields to a minimum for maximum speed. Example::
<requestHandler name="livesearch" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">1000</int>
</lst>
<lst name="invariants">
<str name="fl">Title Description portal_type path_string getIcon</str>
</lst>
</requestHandler>
The ``hlsearch`` request handler should contain the configuration for highlighting.
Example::
<requestHandler name="hlsearch" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<bool name="hl">true</bool>
<bool name="hl.useFastVectorHighlighter">true</bool>
<str name="hl.fl">snippetText</str>
<int name="hl.fragsize">200</int>
<str name="hl.alternateField">Description</str>
<int name="hl.maxAlternateFieldLength">200</int>
<int name="hl.snippets">3</int>
</lst>
</requestHandler>
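
For orientation, this sketch shows how snippets can be read from a Solr JSON
response produced with highlighting enabled. Solr returns them in a
``highlighting`` section keyed by the unique key of each document. The sample
response, the document key and the helper name are made up for illustration.

```python
def snippets_for(response, doc_key, field="snippetText"):
    """Return the highlighted snippets of one document, or an empty list."""
    highlighting = response.get("highlighting", {})
    return highlighting.get(doc_key, {}).get(field, [])


# Hypothetical, heavily shortened Solr JSON response.
sample_response = {
    "response": {"numFound": 1, "docs": [{"UID": "a1b2c3", "Title": "Solr"}]},
    "highlighting": {
        "a1b2c3": {"snippetText": ["Integrates <em>Solr</em> with Plone"]}
    },
}

for doc in sample_response["response"]["docs"]:
    print(doc["Title"], snippets_for(sample_response, doc["UID"]))
```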
Field types and indexes
-----------------------
Highlighting
~~~~~~~~~~~~
Highlighting requires an index named ``snippetText``
with its own field type which does not do too much text analysis.
Fields and indexes are configured in ``schema.xml`` of your collection.
Example::
<fieldType name="text_snippets" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<field name="snippetText" type="text_snippets" indexed="true"
stored="true" required="false" multiValued="false"
termVectors="true" termPositions="true"
termOffsets="true"/>
Word Cloud
~~~~~~~~~~
The Word Cloud feature requires an index named ``wordCloudTerms``
with its own field type.
It's basically a copy of ``SearchableText`` but with less analysis and
filtering (no lowercasing, no character normalization, etc.).
Field type example::
<fieldType name="cloud_terms" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="${buildout:directory}/german_stop.txt" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory"
splitOnCaseChange="1"
splitOnNumerics="1"
stemEnglishPossessive="1"
generateWordParts="0"
generateNumberParts="0"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
preserveOriginal="1"/>
<!-- Strip punctuation characters from beginning and end of terms -->
<filter class="solr.PatternReplaceFilterFactory" pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
<!-- Filter everything that does not contain at least 3 regular letters -->
<filter class="solr.PatternReplaceFilterFactory" pattern="^([^a-zA-Z]*)([a-zA-Z]{0,2})([^a-zA-Z]*)$" replacement=""/>
<!-- Filter any term shorter than 2 characters (incl. empty string) -->
<filter class="solr.LengthFilterFactory" min="2" max="50"/>
</analyzer>
</fieldType>
Index example::
<field name="wordCloudTerms" type="cloud_terms" indexed="true"
stored="true" required="false" multiValued="false"
termVectors="true" termPositions="true"
termOffsets="true"/>
<copyField source="SearchableText" dest="wordCloudTerms"/>
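
To make the effect of the ``cloud_terms`` filter chain easier to follow, here
is a rough Python approximation of it. It is only a sketch: the
``WordDelimiterFilterFactory`` step is left out, ``string.punctuation`` stands in
for Java's ``\p{Punct}``, and the function name and the stop word set are
hypothetical.

```python
import re
import string

PUNCT = re.escape(string.punctuation)
# Strip punctuation characters from the beginning and end of a term.
STRIP_PUNCT = re.compile(r"^[%s]*(.*?)[%s]*$" % (PUNCT, PUNCT))
# Matches terms that do not contain at least 3 consecutive regular letters.
TOO_FEW_LETTERS = re.compile(r"^[^a-zA-Z]*[a-zA-Z]{0,2}[^a-zA-Z]*$")


def cloud_terms(text, stopwords=(), min_len=2, max_len=50):
    """Approximate the index-time analysis of the ``cloud_terms`` field type."""
    terms = []
    for token in text.split():  # whitespace tokenizer
        if token.lower() in stopwords:  # stop filter with ignoreCase
            continue
        token = STRIP_PUNCT.sub(r"\1", token)
        token = TOO_FEW_LETTERS.sub("", token)
        if min_len <= len(token) <= max_len:  # length filter
            terms.append(token)
    return terms


# Hypothetical input text and stop word set.
print(cloud_terms("Plone and (Solr), v2 -- ok!", stopwords={"and"}))
```

Note that case is preserved, in line with the "no lowercasing" remark above.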
Search / Livesearch
-------------------
``ftw.solr`` provides an improved livesearch implementation based on the jQuery Autocomplete widget.
A new search and result template is also included.
Suggestions
-----------
By default, suggestions are disabled on the advanced search input field.
If you want autocomplete while typing, you need to install the autocomplete profile of ``ftw.solr`` and meet the following prerequisite.
**Prerequisite (Solr config)**::
<!-- Suggester for autocomplete -->
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
<str name="field">SearchableText</str>
<float name="threshold">0.0005</float>
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
The portal searchbox no longer provides this feature in favor of the new livesearch autocomplete feature.
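
As a small sketch of how the ``/suggest`` handler configured above can be
addressed, the helper below builds a request URL for it. The defaults reuse the
host, port and base path from the ZCML connection example. The helper name and
the use of ``wt=json`` are assumptions, and the code does not reflect how
``ftw.solr`` issues the request internally.

```python
from urllib.parse import urlencode


def suggest_url(term, host="localhost", port=8983, base="/solr"):
    """Build the URL for a request to the ``/suggest`` request handler."""
    query = urlencode({"q": term, "wt": "json"})
    return "http://%s:%d%s/suggest?%s" % (host, port, base, query)


# Ask for completions of the partial term "plo".
url = suggest_url("plo")
print(url)
```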
Installation
============
Add as dependency
-----------------
Install ``ftw.solr`` by adding it to the list of eggs in your
buildout or by adding it as a dependency of your policy package.
.. code:: ini

    [instance]
    eggs +=
        ftw.solr

Extend your buildout
--------------------
For production:
.. code:: ini

    [buildout]
    extends =
        https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/production.cfg
        https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/solr.cfg
    deployment-number = 05

For local development:
.. code:: ini

    [buildout]
    extends =
        https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/plone-development.cfg
        https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/plone-development-solr.cfg

Update metadata.xml
-------------------
Add ``ftw.solr`` to your metadata.xml:
.. code:: xml
<?xml version="1.0"?>
<metadata>
<dependencies>
<dependency>profile-ftw.solr:default</dependency>
</dependencies>
</metadata>
Run buildout
------------
Once Solr is configured, run buildout and restart your instance.
- Install the Generic Setup profile of ``ftw.solr``.
Links
=====
- GitHub: https://github.com/4teamwork/ftw.solr
- Issues: https://github.com/4teamwork/ftw.solr/issues
- PyPI: http://pypi.python.org/pypi/ftw.solr
- Continuous integration: https://jenkins.4teamwork.ch/search?q=ftw.solr
Copyright
=========
This package is copyright by `4teamwork <http://www.4teamwork.ch/>`_.
``ftw.solr`` is licensed under GNU General Public License, version 2.
Changelog
=========
1.12.0 (2019-04-23)
-------------------
- Use the binary adders from collective.solr. They work since version 5.0.3. [mathias.leimgruber]
1.11.0 (2019-04-11)
-------------------
- Accessibility: Don't use "cite" for breadcrumbs in results, use a list [mathias.leimgruber]
- Accessibility: Add heading for searchresult and hint for link path. [mathias.leimgruber]
- Fix use of highlightSearchTerms on livesearch if searching for multiple terms.
jquery.highlightsearchterms needs a list of searchterms to work properly.
[elioschmutz]
- Fix bad return value for facet infos.
[elioschmutz]
- Fix disallowed html tag id name.
[elioschmutz]
- Trigger an event after updating the search-results.
[elioschmutz]
- Fix broken html-markup in available-facets.pt.
[elioschmutz]
1.10.0 (2017-11-28)
-------------------
- Refactor `ftw.solr.browser.livesearch.FtwSolrLiveSearchReplyView` for easier
subclassing. [mbaechtold]
1.9.5 (2017-11-03)
------------------
- Make snippet text upgrade from 1.9.4 more robust. [jone]
1.9.4 (2017-11-03)
------------------
- Fix escaping issue when highlighting plaintext. [jone]
- Always quote values of a list containing an operator.
[mathias.leimgruber]
- Use searchform action as endpoint in search.js.
[mathias.leimgruber]
1.9.3 (2017-08-28)
------------------
- Fix position of facet closing icon.
[mbaechtold]
1.9.2 (2017-08-08)
------------------
- Also prevent invalid URLs generated by livesearch.py.
[mathias.leimgruber]
1.9.1 (2017-08-08)
------------------
- Prevent invalid URL if the object has a GET parameter in its path.
[lknoepfel]
- Remove search viewlet submit from tabindex.
[lknoepfel]
1.9.0 (2017-06-28)
------------------
- Implement first suggestion on livesearch view, if suggestions are enabled.
[mathias.leimgruber]
1.8.6 (2017-06-06)
------------------
- Hide and order the viewlets for all skins and not only for the "Sunburst Theme"-skin
[elioschmutz]
- Prevent invalid URL if the object has a GET parameter in its path.
[lknoepfel]
1.8.5 (2017-04-19)
------------------
- Lower searchwords in ISearchwords indexer, since the value is always lowered in ftw.solr.
This applies only for DX content.
[mathias.leimgruber]
- Fix font for livesearch results.
[Kevin Bieri]
1.8.4 (2016-12-16)
------------------
- Make getting the facets more robust which caused projects having customizations
of selected facets to crash when displaying the search results.
[mbaechtold]
1.8.3 (2016-12-15)
------------------
- Remove hidden input from tabindex.
[mathias.leimgruber]
- Only append "Search in current folder" option to livesearch if not on plone root.
[mathias.leimgruber]
1.8.2 (2016-11-24)
------------------
- Scan for external links after every search page refresh.
[mathias.leimgruber]
- Do not use TALES string expressions for facet links, since it automatically transforms
chars into html entities.
The facet links were like "&amp;amp;amp;" every time you add/remove a facet.
Since we have three facets by default, this happened three times on every click.
[mathias.leimgruber]
1.8.1 (2016-11-04)
------------------
- Fix suggestions with umlauts by decoding them to unicode.
[mathias.leimgruber]
- Avoid AttributeError "solr_response" on solr errors. [jone]
1.8.0 (2016-10-31)
------------------
- Unquote the quoted description in autocomplete search result.
[mathias.leimgruber]
- Explicitly remove invalid characters for solr search before passing the SearchableText to the catalog.
Basically this means all Lucene query syntax special characters are stripped.
Check https://lucene.apache.org/core/2_9_4/queryparsersyntax.html
[mathias.leimgruber]
1.7.1 (2016-10-13)
------------------
- Allow hyphens, commas and single quotes in simple terms and simple searches.
Fixes issue where a comma/hyphen/single quote in a search term causes
search_pattern template to be circumvented.
[mathias.leimgruber]
1.7.0 (2016-07-22)
------------------
- Implement link to remove the path from the querystring on the result page.
[mathias.leimgruber]
1.6.3 (2016-07-18)
------------------
- Abort livesearch init if no searchfield is available to make the script failsafe.
[elioschmutz]
1.6.2 (2016-07-15)
------------------
- Prevent the result text from being read out by screen readers.
[Kevin Bieri]
- Set searchwords as a non-required field.
[elioschmutz]
1.6.1 (2016-07-11)
------------------
- Style selected facets.
[mathias.leimgruber]
- Add icon to facets dropdown menu.
[Kevin Bieri]
- Fix faulty upgrade step which incorrectly added the Solr index queue processor
utility in 1.5.0.
[mbaechtold]
1.6.0 (2016-06-07)
------------------
- Add IShowInSearch and ISearchwords behaviors. [jone]
- Fix html structure of search in current folder. input in a tag is not allowed.
[mathias.leimgruber]
1.5.3 (2016-05-23)
------------------
- BugFix: Search for "current folder". Label was some kind of unclickable. :-( omg sry.
[mathias.leimgruber]
1.5.2 (2016-05-20)
------------------
- BugFix: Search for "current folder". Checkbox was unclickable.
[mathias.leimgruber]
- Disable special links on facets dropdown.
[Kevin Bieri]
1.5.1 (2016-04-29)
------------------
- Fix faulty upgrade step which removed the Solr connection manager utility
in 1.5.0.
[mbaechtold]
- Respect the facets configured in the Solr control panel.
[mbaechtold]
1.5.0 (2016-03-31)
------------------
- Fix CSS selector of selected facets.
[mathias.leimgruber]
- Implement accessible remove facet link.
[mathias.leimgruber]
- Implement accessibility support for solr facets.
[Kevin Bieri]
- Make `only in current folder section` selectable in the dropdown menu.
[Kevin Bieri]
- Adjust stylings for the new ftw.theming.
[elioschmutz]
- Do not reset the page title on facet search.
[mathias.leimgruber]
- Use a different request handler for the livesearch.
[buchi]
- Implement a new livesearch based on jquery autocomplete widget.
[mathias.leimgruber]
- Moved atomic update feature to collective.solr.
[mathias.leimgruber]
1.4.4 (2015-12-23)
------------------
- Add English, French and Italian translation of the time range facets (and
a selection of other message ids).
[mbaechtold]
1.4.3 (2015-08-11)
------------------
- Fix display of results when the search returns Brains and not Solr flares.
This fixes searches which don't include searchableText.
[tschanzt]
1.4.2 (2015-04-24)
------------------
- Fix quoting of searchterm url parameter in search result links.
[buchi]
1.4.1 (2015-03-25)
------------------
- Allow dots in simple terms / simple searches.
Fixes issue where a dot in a search term causes the search_pattern template to
be circumvented, and therefore messing up the relevancy ranking.
[lgraf]
1.4.0 (2015-02-26)
------------------
- Add support for search results from external sites. Solr flares having a
getRemoteUrl attribute that starts with 'http' are handled as external
results. External sites can be indexed with ftw.crawler.
[buchi]
- BugFix in "recursive_index_security" method. Do not try to reindex the
security on objects that are not catalog aware.
[mathias.leimgruber]
- Some work on docs (fix Pythonism, add some info, improve markup). [jean]
1.3.1 (2014-06-12)
------------------
- Fix error in suggestions when there is no solr response.
[buchi]
- Make sure that string attributes of a PloneFlare are always utf-8 encoded
byte strings to be consistent with catalog brains.
[buchi]
1.3 (2013-12-19)
----------------
- Updated jquery.history.js to latest version (1.8.0b2) which fixes issues
with URI encoding in IE9.
[buchi]
- Removed link to advanced search in livesearch when no results are found.
[buchi]
- "show all"-link includes the path attribute.
[elioschmutz]
- Added support for forward as well as reverse wildcard search.
This is done by providing two additional dynamic variables in the search pattern,
value_lwc and value_twc that have leading respectively trailing wildcards appended
to each of the search terms.
[lgraf]
- Fix querystring of suggestions with list parameters.
[buchi]
1.2.2 (2013-09-24)
------------------
- Added class around link to advanced search.
[Julian Infanger]
1.2.1 (2013-09-10)
------------------
- Fixed monkey patch of mangleQuery.
[buchi]
1.2 (2013-09-10)
----------------
- Improve reindexing object security performance.
We now walk down the children and stop walking down if the security indexes
of an object have not changed.
[jone]
- Added support for atomic updates.
This means whenever possible, only the necessary / specified attributes get updated
in Solr, and more importantly, re-indexed by Plone's indexers.
IMPORTANT: This requires the Solr instance to have an <updateLog/> configured in
solrconfig.xml and the schema needs to contain a _version_ field.
See http://wiki.apache.org/solr/Atomic_Updates for details.
[lgraf]
- Make sure values in search patterns are all lowercase.
[buchi]
1.1.2 (2013-07-18)
------------------
- Sort facet fields in the order specified in the Solr control panel.
[buchi]
- Fixed handling of path filter which was always removed when respect_navroot
is set to False.
[buchi]
- Handle invalid facet parameters.
[buchi]
- Monkey patch reindexObjectSecurity for both CatalogAware and CatalogMultiplex
so the relevant security indexes in solr also get updated.
[lgraf]
- Only add the default search argument to the query if it's not None and if
Solr has a default search field defined in its schema (which is deprecated
in Solr). This mainly prevents logging of 'dropping unknown search attribute'
warnings.
[buchi]
- Escape forward slashes in all query values, not only in paths.
[buchi]
- Always insert the default 'select' search handler into the query parameters if
no 'qt' parameter is provided. We need this because we have to disable the
/select search handler in the Solr configuration to be able to select other search
handlers by parameter.
[buchi]
1.1.1 (2013-06-01)
------------------
- Also use livesearch request handler in livesearch when grouping is disabled.
[buchi]
- Fixed "show more" link in live search to point to @@search view.
[buchi]
1.1 (2013-05-31)
----------------
- Reorganized monkey patches.
Everything patch-related now lives in the patches subpackage.
[lgraf]
- Make sure @@search view doesn't fail when called without parameters.
[lgraf]
- Only display selected facets list if there actually are selected facets.
[lgraf]
- Added spellchecking feature (aka "Did you mean ...").
[buchi]
- Made respecting the navroot for searches configurable.
Only if `ISearchSettings.respect_navroot` is set searches will be constrained
to the navigation root (defaults to False).
[lgraf]
- Added autocomplete support based on Solr's suggester component.
[buchi]
1.0.2 (2013-05-28)
------------------
- Fixed querytarget of livesearch for Plone 4.2 and later.
Use our @@livesearch_reply view instead of livesearch_reply.
[buchi]
- Include description in snippetText.
[buchi]
- If there's a SearchableText indexer, use it for snippetText generation.
[buchi]
- Make length of breadcrumbs shown in search results configurable.
[buchi]
- Added option to generate breadcrumbs from path rather than calling
breadcrumbs_view for each item.
[buchi]
- Added support for dexterity content types in snippetText indexer.
[buchi]
1.0.1 (2013-05-21)
------------------
- Monkey patching c.solr.search.Search.buildQuery in order to escape slashes in paths.
[lgraf]
- Overwrite search extender: Add write permissions, fixed translations and
allowed content types in textfield.
[Julian Infanger]
- Added option to scale Word Cloud by a constant factor.
[lgraf]
- Added basic portlet to display Word Cloud.
[lgraf]
- Added basic Word Cloud browser view.
[lgraf]
1.0a1 (2012-08-22)
------------------
- Initial release
============
``ftw.solr`` provides various customizations and enhancements on top of
``collective.solr`` which integrates the Solr search engine with Plone.
.. contents:: Table of Contents
Features
========
Atomic updates (aka partial updates)
------------------------------------
Since Solr 4.0 it's possible to update fields in a Solr document individually,
sending only the fields that actually changed, whereas before it was necessary
to send *all* the fields again every time something changed (and therefore ask
Plone again to index them all, causing a massive performance penalty).
``ftw.solr`` supports atomic updates for Solr version 4.1 and above.
In order for atomic updates to work, three things must be taken care of:
- An ``<updateLog />`` must be enabled in ``solrconfig.xml``. If it's missing,
Solr will reject any update messages that contain atomic update instructions.
- A ``_version_`` field must be defined in the Solr schema.
- All fields in the Solr schema must be defined with ``stored="true"``
In the stock Solr configs from 4.1 upwards ``<updateLog />`` and the
``_version_`` field are already configured correctly. If you're using
``collective.recipe.solrinstance``, check the generated ``solrconfig.xml``,
it might not have been updated for the use of atomic updates yet.
If there's a field in the Solr schema that's *not* ``stored="true"``,
it will get
**dropped** from documents in Solr on the next update to that document.
Indexing won't fail, but that field simply won't have any content any more.
Apart from those prerequisites, there's nothing more to be done in order to use
atomic updates. ``ftw.solr`` will automatically perform atomic updates whenever
possible.
Also see http://wiki.apache.org/solr/Atomic_Updates
Highlighting (aka Snippets)
---------------------------
When displaying search results, Plone by default displays the title and the
description of an item. Solr, like Google and other search engines, can return a
snippet of the text containing the words searched for.
``ftw.solr`` enables this feature in Plone.
Live search grouping
--------------------
Search results in Plone's live search can be grouped by ``portal_type``.
This is the way search results are shown in Spotlight on Mac OS X.
Facet queries
-------------
In addition to facet fields support provided by ``collective.solr``,
``ftw.solr`` adds support for facet queries.
This type of faceting offers a lot of flexibility.
Instead of choosing a specific field to facet its values, multiple
Solr queries can be specified, that themselves become facets.
Word Cloud
----------
Assuming there is a correctly configured index ``wordCloudTerms``,
a Word Cloud
showing the most common terms across documents can be displayed.
The Word Cloud is implemented in a browser view that can either be displayed
stand-alone by traversing to ``/@@wordcloud`` or rendered in a portlet.
Ajax-ified search form
----------------------
The search form is fully ajax-ified which leads to faster search results when
changing search criteria.
Tika
----
Until ftw.solr 1.11.0 we unregistered the BinaryAdder from collective.solr, since there were unresolved issues with this feature.
Since collective.solr 5.0.3 this issue is solved.
By default ftw.solr (from 1.12.0 on) is using solr's tika integration to extract text from content.
If you have ftw.tika installed, please unregister the ftw.tika portal transform or don't install it at all.
Otherwise the content will be indexed twice. With ftw.tika and in solr (also with the integrated tika in solr).
To use ftw.tika's transform please unregister the adders from collective.solr by placing the following code into your project:
<include package="z3c.unconfigure" file="meta.zcml" />
<unconfigure>
<adapter
factory="collective.solr.indexer.BinaryAdder"
name="File"
/>
<adapter
factory="collective.solr.indexer.BinaryAdder"
name="Image"
/>
</unconfigure>
Solr connection configuration in ZCML
-------------------------------------
The connections settings for Solr can be configured in ZCML and thus in
buildout. This makes it easier when copying databases between multiple Zope
instances with different Solr servers. Example::
zcml-additional =
<configure xmlns:solr="http://namespaces.plone.org/solr">
<solr:connection host="localhost" port="8983" base="/solr"/>
</configure>
Solr Configuration
==================
Search Handlers
---------------
``ftw.solr`` requires two custom search handlers that must be configured on the
Solr server. Search handlers are configured in ``solrconfig.xml`` of your
collection.
The ``livesearch`` request handler is used for live search and should limit the
returned fields to a minimum for maximum speed. Example::
<requestHandler name="livesearch" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">1000</int>
</lst>
<lst name="invariants">
<str name="fl">Title Description portal_type path_string getIcon</str>
</lst>
</requestHandler>
The ``hlsearch`` request handler should contain the configuration for highlighting.
Example::
<requestHandler name="hlsearch" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<bool name="hl">true</bool>
<bool name="hl.useFastVectorHighlighter">true</bool>
<str name="hl.fl">snippetText</str>
<int name="hl.fragsize">200</int>
<str name="hl.alternateField">Description</str>
<int name="hl.maxAlternateFieldLength">200</int>
<int name="hl.snippets">3</int>
</lst>
</requestHandler>
Field types and indexes
-----------------------
Highlighting
~~~~~~~~~~~~
Highlighting requires an index named ``snippetText``
with its own field type which does not do too much text analysis.
Fields and indexes are configured in ``schema.xml`` of your collection.
Example::
<fieldType name="text_snippets" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<field name="snippetText" type="text_snippets" indexed="true"
stored="true" required="false" multiValued="false"
termVectors="true" termPositions="true"
termOffsets="true"/>
Word Cloud
~~~~~~~~~~
The Word Cloud feature requires an index named ``wordCloudTerms``
with it's own field type.
It's basically a copy of ``SearchableText`` but with less analysis and
filtering (no lowercasing, no character normalization, etc...).
Field type example::
<fieldType name="cloud_terms" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="${buildout:directory}/german_stop.txt" enablePositionIncrements="true"/>
<filter class="solr.WordDelimiterFilterFactory"
splitOnCaseChange="1"
splitOnNumerics="1"
stemEnglishPossessive="1"
generateWordParts="0"
generateNumberParts="0"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
preserveOriginal="1"/>
<!-- Strip punctuation characters from beginning and end of terms -->
<filter class="solr.PatternReplaceFilterFactory" pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
<!-- Filter everything that does not contain at least 3 regular letters -->
<filter class="solr.PatternReplaceFilterFactory" pattern="^([^a-zA-Z]*)([a-zA-Z]{0,2})([^a-zA-Z]*)$" replacement=""/>
<!-- Filter any term shorter than 3 characters (incl. empty string) -->
<filter class="solr.LengthFilterFactory" min="2" max="50"/>
</analyzer>
</fieldType>
Index example::
<field name="wordCloudTerms" type="cloud_terms" indexed="true"
stored="true" required="false" multiValued="false"
termVectors="true" termPositions="true"
termOffsets="true"/>
<copyField source="SearchableText" dest="wordCloudTerms"/>
Search / Livesearch
-------------------
``ftw.solr`` provides a better livesearch implementation using jQuery Autocomplete widget.
A new search and result template is also included.
Suggestions
-----------
By default suggestions are disabled on the advanced search input field.
if you want autocomplete while typing you need to install the autocomplete profile of ftw.solr and...
**Prerequisit (solr config)**::
<!-- Suggester for autocomplete -->
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
<str name="field">SearchableText</str>
<float name="threshold">0.0005</float>
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
The portal searchbox no longer provides this feature in favor of the new livesearch autocomplete feature.
Installation
============
Add as dependency
-----------------
Install ``ftw.solr`` by adding it to the list of eggs in your
buildout or by adding it as a dependency of your policy package.
.. code:: rst
[instance]
eggs +=
ftw.solr
Extend your buildout
--------------------
For production:
.. code:: ini
[buildout]
extends =
https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/production.cfg
https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/solr.cfg
deployment-number = 05
For local development:
.. code:: ini
[buildout]
extends =
https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/plone-development.cfg
https://raw.githubusercontent.com/4teamwork/ftw-buildouts/master/plone-development-solr.cfg
Update metadata.xml
-------------------
Add ``ftw.solr`` to your metadata.xml:
.. code:: xml
<?xml version="1.0"?>
<metadata>
<dependencies>
<dependency>profile-ftw.solr:default</dependency>
</dependencies>
</metadata>
Run buildout
------------
Once Solr is configured, run buildout and restart your instance.
- Install the generic setup profile of ``ftw.solr``.
Links
=====
- Github: https://github.com/4teamwork/ftw.solr
- Issues: https://github.com/4teamwork/ftw.solr/issues
- PyPI: http://pypi.python.org/pypi/ftw.solr
- Continuous integration: https://jenkins.4teamwork.ch/search?q=ftw.solr
Copyright
=========
This package is copyright by `4teamwork <http://www.4teamwork.ch/>`_.
``ftw.solr`` is licensed under GNU General Public License, version 2.
Changelog
=========
1.12.0 (2019-04-23)
-------------------
- Use binary adder from collective.solr. They work since version 5.0.3 [mathias.leimgruber]
1.11.0 (2019-04-11)
-------------------
- Accessibility: Don't use "cite" for breadcrumbs in results, use a list [mathias.leimgruber]
- Accessibility: Add heading for searchresult and hint for link path. [mathias.leimgruber]
- Fix use of highlightSearchTerms on livesearch if searching for multiple terms.
jquery.highlightsearchterms needs a list of searchterms to work properly.
[elioschmutz]
- Fix bad return value for facet infos.
[elioschmutz]
- Fix disallowed html tag id name.
[elioschmutz]
- Trigger an event after updating the search-results.
[elioschmutz]
- Fix broken html-markup in available-facets.pt.
[elioschmutz]
1.10.0 (2017-11-28)
-------------------
- Refactor `ftw.solr.browser.livesearch.FtwSolrLiveSearchReplyView` for easier
subclassing. [mbaechtold]
1.9.5 (2017-11-03)
------------------
- Make snippet text upgrade from 1.9.4 more robust. [jone]
1.9.4 (2017-11-03)
------------------
- Fix escaping issue when highlighting plaintext. [jone]
- Always quote values of a list containing an operator.
[mathias.leimgruber]
- Use searchform action as endpoint in search.js.
[mathias.leimgruber]
1.9.3 (2017-08-28)
------------------
- Fix position of facet closing icon.
[mbaechtold]
1.9.2 (2017-08-08)
------------------
- Also prevent invalid URLs generated by livesearch.py
[mathias.leimgruber]
1.9.1 (2017-08-08)
------------------
- Prevent invalid URL if the object has a GET parameter in its path.
[lknoepfel]
- Remove search viewlet submit from tabindex.
[lknoepfel]
1.9.0 (2017-06-28)
------------------
- Implement first suggestion on livesearch view, if suggestions are enabled.
[mathias.leimgruber]
1.8.6 (2017-06-06)
------------------
- Hide and order the viewlets for all skins and not only for the "Sunburst Theme"-skin
[elioschmutz]
- Prevent invalid URL if the object has a GET parameter in its path.
[lknoepfel]
1.8.5 (2017-04-19)
------------------
- Lowercase searchwords in the ISearchwords indexer, since the value is always lowercased in ftw.solr.
This applies only for DX content.
[mathias.leimgruber]
- Fix font for livesearch results.
[Kevin Bieri]
1.8.4 (2016-12-16)
------------------
- Make getting the facets more robust. Previously, projects with customized
selected facets crashed when displaying the search results.
[mbaechtold]
1.8.3 (2016-12-15)
------------------
- Remove hidden input from tabindex.
[mathias.leimgruber]
- Only append "Search in current folder" option to livesearch if not on plone root.
[mathias.leimgruber]
1.8.2 (2016-11-24)
------------------
- Scan for external links after every search page refresh.
[mathias.leimgruber]
- Do not use TALES string expressions for facet links, since it automatically transforms
chars into html entities.
The facet links ended up like "&amp;amp;amp;" every time a facet was added or removed.
Since there are three facets by default, this happened three times on every click.
[mathias.leimgruber]
1.8.1 (2016-11-04)
------------------
- Fix suggestions with umlauts by decoding them to unicode.
[mathias.leimgruber]
- Avoid AttributeError "solr_response" on solr errors. [jone]
1.8.0 (2016-10-31)
------------------
- Unquote the quoted description in autocomplete search result.
[mathias.leimgruber]
- Explicitly remove invalid characters for solr search before passing the SearchableText to the catalog.
Basically this means all Lucene query syntax special characters are stripped.
Check https://lucene.apache.org/core/2_9_4/queryparsersyntax.html
[mathias.leimgruber]
1.7.1 (2016-10-13)
------------------
- Allow hyphens, commas and single quotes in simple terms and simple searches.
Fixes issue where a comma/hyphen/single quote in a search term causes
search_pattern template to be circumvented.
[mathias.leimgruber]
1.7.0 (2016-07-22)
------------------
- Implement link to remove the path from the querystring on the result page.
[mathias.leimgruber]
1.6.3 (2016-07-18)
------------------
- Abort livesearch init if no searchfield is available to make the script failsafe.
[elioschmutz]
1.6.2 (2016-07-15)
------------------
- Prevent the result text from being read out by screen readers.
[Kevin Bieri]
- Set searchwords as a non-required field.
[elioschmutz]
1.6.1 (2016-07-11)
------------------
- Style selected facets.
[mathias.leimgruber]
- Add icon to facets dropdown menu.
[Kevin Bieri]
- Fix faulty upgrade step which incorrectly added the Solr index queue processor
utility in 1.5.0.
[mbaechtold]
1.6.0 (2016-06-07)
------------------
- Add IShowInSearch and ISearchwords behaviors. [jone]
- Fix html structure of search in current folder. input in a tag is not allowed.
[mathias.leimgruber]
1.5.3 (2016-05-23)
------------------
- BugFix: Search for "current folder". The label was not clickable.
[mathias.leimgruber]
1.5.2 (2016-05-20)
------------------
- BugFix: Search for "current folder". Checkbox was unclickable.
[mathias.leimgruber]
- Disable special links on facets dropdown.
[Kevin Bieri]
1.5.1 (2016-04-29)
------------------
- Fix faulty upgrade step which removed the Solr connection manager utility
in 1.5.0.
[mbaechtold]
- Respect the facets configured in the Solr control panel.
[mbaechtold]
1.5.0 (2016-03-31)
------------------
- Fix CSS selector of selected facets.
[mathias.leimgruber]
- Implement accessible remove facet link.
[mathias.leimgruber]
- Implement accessibility support for solr facets.
[Kevin Bieri]
- Make `only in current folder section` selectable in the dropdown menu.
[Kevin Bieri]
- Adjust stylings for the new ftw.theming.
[elioschmutz]
- Do not reset the page title on facet search.
[mathias.leimgruber]
- Use a different request handler for the livesearch.
[buchi]
- Implement a new livesearch based on jquery autocomplete widget.
[mathias.leimgruber]
- Moved atomic update feature to collective.solr.
[mathias.leimgruber]
1.4.4 (2015-12-23)
------------------
- Add English, French and Italian translation of the time range facets (and
a selection of other message ids).
[mbaechtold]
1.4.3 (2015-08-11)
------------------
- Fix display of results when the search returns Brains and not Solr flares.
This fixes searches which don't include searchableText.
[tschanzt]
1.4.2 (2015-04-24)
------------------
- Fix quoting of searchterm url parameter in search result links.
[buchi]
1.4.1 (2015-03-25)
------------------
- Allow dots in simple terms / simple searches.
Fixes issue where a dot in a search term causes the search_pattern template to
be circumvented, and therefore messing up the relevancy ranking.
[lgraf]
1.4.0 (2015-02-26)
------------------
- Add support for search results from external sites. Solr flares having a
getRemoteUrl attribute that starts with 'http' are handled as external
results. External sites can be indexed with ftw.crawler.
[buchi]
- BugFix in "recursive_index_security" method. Do not try to reindex the
security of objects that are not catalog aware.
[mathias.leimgruber]
- Some work on docs (fix Pythonism, add some info, improve markup). [jean]
1.3.1 (2014-06-12)
------------------
- Fix error in suggestions when there is no solr response.
[buchi]
- Make sure that string attributes of a PloneFlare are always utf-8 encoded
byte strings to be consistent with catalog brains.
[buchi]
1.3 (2013-12-19)
----------------
- Updated jquery.history.js to latest version (1.8.0b2) which fixes issues
with URI encoding in IE9.
[buchi]
- Removed link to advanced search in livesearch when no results are found.
[buchi]
- "show all"-link includes the path attribute.
[elioschmutz]
- Added support for forward as well as reverse wildcard search.
This is done by providing two additional dynamic variables in the search pattern,
value_lwc and value_twc, which have leading and trailing wildcards, respectively,
appended to each of the search terms.
[lgraf]
- Fix querystring of suggestions with list parameters.
[buchi]
1.2.2 (2013-09-24)
------------------
- Added class around link to advanced search.
[Julian Infanger]
1.2.1 (2013-09-10)
------------------
- Fixed monkey patch of mangleQuery.
[buchi]
1.2 (2013-09-10)
----------------
- Improve reindexing object security performance.
We now walk down the children and stop walking down if the security indexes
of an object have not changed.
[jone]
- Added support for atomic updates.
This means whenever possible, only the necessary / specified attributes get updated
in Solr, and more importantly, re-indexed by Plone's indexers.
IMPORTANT: This requires the Solr instance to have an <updateLog/> configured in
solrconfig.xml and the schema needs to contain a _version_ field.
See http://wiki.apache.org/solr/Atomic_Updates for details.
[lgraf]
- Make sure values in search patterns are all lowercase.
[buchi]
1.1.2 (2013-07-18)
------------------
- Sort facet fields in the order specified in the Solr control panel.
[buchi]
- Fixed handling of path filter which was always removed when respect_navroot
is set to False.
[buchi]
- Handle invalid facet parameters.
[buchi]
- Monkey patch reindexObjectSecurity for both CatalogAware and CatalogMultiplex
so the relevant security indexes in solr also get updated.
[lgraf]
- Only add the default search argument to the query if it's not None and if
Solr has a default search field defined in its schema (which is deprecated
in Solr). This mainly prevents logging of 'dropping unknown search attribute'
warnings.
[buchi]
- Escape forward slashes in all query values, not only in paths.
[buchi]
- Always insert the default 'select' search handler into the query parameters if
no 'qt' parameter is provided. We need this because we have to disable the
/select search handler in the Solr configuration to be able to select other search
handlers by parameter.
[buchi]
1.1.1 (2013-06-01)
------------------
- Also use livesearch request handler in livesearch when grouping is disabled.
[buchi]
- Fixed "show more" link in live search to point to @@search view.
[buchi]
1.1 (2013-05-31)
----------------
- Reorganized monkey patches.
Everything patch-related now lives in the patches subpackage.
[lgraf]
- Make sure @@search view doesn't fail when called without parameters.
[lgraf]
- Only display selected facets list if there actually are selected facets.
[lgraf]
- Added spellchecking feature (aka "Did you mean ...").
[buchi]
- Made respecting the navroot for searches configurable.
Only if `ISearchSettings.respect_navroot` is set searches will be constrained
to the navigation root (defaults to False).
[lgraf]
- Added autocomplete support based on Solr's suggester component.
[buchi]
1.0.2 (2013-05-28)
------------------
- Fixed querytarget of livesearch for Plone 4.2 and later.
Use our @@livesearch_reply view instead of livesearch_reply.
[buchi]
- Include description in snippetText.
[buchi]
- If there's a SearchableText indexer, use it for snippetText generation.
[buchi]
- Make length of breadcrumbs shown in search results configurable.
[buchi]
- Added option to generate breadcrumbs from path rather than calling
breadcrumbs_view for each item.
[buchi]
- Added support for dexterity content types in snippetText indexer.
[buchi]
1.0.1 (2013-05-21)
------------------
- Monkey patching c.solr.search.Search.buildQuery in order to escape slashes in paths.
[lgraf]
- Overwrite search extender: Add write permissions, fixed translations and
allowed content types in textfield.
[Julian Infanger]
- Added option to scale Word Cloud by a constant factor.
[lgraf]
- Added basic portlet to display Word Cloud.
[lgraf]
- Added basic Word Cloud browser view.
[lgraf]
1.0a1 (2012-08-22)
------------------
- Initial release