Plone blueprints for collective.transmogrifier pipelines

Project description

This package contains several blueprints for collective.transmogrifier pipelines, commonly used to import content into a Plone site.


See docs/INSTALL.txt for installation instructions.


Development sponsored by

Elkjøp Nordic AS

Design and development

Martijn Pieters at Jarn
Florian Schulze at Jarn

Detailed Documentation

ATSchema updater section

An AT schema updater pipeline section is another important element of a transmogrifier content import pipeline. It updates field values on Archetypes objects, according to their schema, from the items it processes. AT schema updater sections operate on objects already present in the ZODB, whether created by a constructor or pre-existing.

The AT schema updater section blueprint name is

Schema updating needs at least one piece of information: the path to the object to update. To determine the path, the schema updater section inspects each item and looks for one key, as described below. Any item missing this piece of information will be skipped. Similarly, items whose path doesn’t exist or doesn’t point to an Archetypes object will be skipped as well.

For the object path, it’ll look (in order) for _[blueprintname]_[sectionname]_path, _[blueprintname]_path, _[sectionname]_path and _path, where [blueprintname] is replaced with the blueprint name and [sectionname] with the name given to the current section. This allows you to target the right section precisely if needed. Alternatively, you can specify what key to use for the path by specifying the path-key option, which should be a list of keys to try (one key per line; use a re: or regexp: prefix to specify regular expressions).
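The specific-to-generic lookup can be pictured with a small, self-contained sketch. It is not the package’s implementation: default_path_keys and find_path are hypothetical helper names, "example.blueprint" is a placeholder rather than a real blueprint name, and the sketch assumes the blueprint-prefixed keys come first in the order.

```python
def default_path_keys(blueprint, section):
    """Return the default path keys, most specific first (illustrative only)."""
    return [
        "_%s_%s_path" % (blueprint, section),
        "_%s_path" % blueprint,
        "_%s_path" % section,
        "_path",
    ]


def find_path(item, keys):
    """Return (key, value) for the first key present in the item."""
    for key in keys:
        if key in item:
            return key, item[key]
    return None, None


# "example.blueprint" and "schemaupdater" are placeholder names for this sketch.
keys = default_path_keys("example.blueprint", "schemaupdater")
item = {"_path": "/spam/eggs/foo", "_schemaupdater_path": "/spam/eggs/bar"}
print(find_path(item, keys))
```

Because the section-specific key is tried before the generic _path key, the item above resolves to /spam/eggs/bar for this section while other sections would still see /spam/eggs/foo.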

Paths to objects are always interpreted as relative to the context. Any writable field whose id matches a key in the current item will be updated with the corresponding value, using the field’s mutator.

>>> import pprint
>>> atschema = """
... [transmogrifier]
... pipeline =
...     schemasource
...     schemaupdater
...     printer
... [schemasource]
... blueprint =
... [schemaupdater]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'', atschema)
>>> transmogrifier(u'')
[('_path', '/spam/eggs/foo'),
 ('fieldnotchanged', 'nochange'),
 ('fieldone', 'one value'),
 ('fieldtwo', 2),
 ('fieldunicode', u'\xe5'),
 ('nosuchfield', 'ignored')]
[('_path', 'not/existing/bar'),
 ('fieldone', 'one value'),
 ('title', 'Should not be updated, not an existing path')]
[('fieldone', 'one value'), ('title', 'Should not be updated, no path')]
[('_path', '/spam/eggs/notatcontent'),
 ('fieldtwo', 2),
 ('title', 'Should not be updated, not an AT base object')]
>>> pprint.pprint(plone.updated)
(('spam/eggs/foo', 'fieldone', 'one value'), ('spam/eggs/foo', 'fieldtwo', 2))

UID updater section

If an Archetypes content object is created in a pipeline, e.g. by the standard content constructor section, it will get a new UID. If you are importing content from another Plone site, and you have references (or links embedded in content using Plone’s link-by-UID feature) to existing content, you may want to retain UIDs. The UID updater section allows you to set the UID on an existing object for this purpose.

The UID updater blueprint name is

UID updating requires two pieces of information: the path to the object to update, and the new UID to set.

To determine the path, the UID updater section inspects each item and looks for a path key, as described below. Any item missing this key will be skipped. Similarly, items whose path doesn’t exist or doesn’t point to a referenceable (Archetypes) object will be skipped.

The object path will be found under the first key found among the following:

  • _[blueprintname]_[sectionname]_path

  • _[blueprintname]_path

  • _[sectionname]_path

  • _path

where [blueprintname] is replaced with the blueprint name and [sectionname] with the name given to the current section. This allows you to target the right section precisely if needed.

Alternatively, you can specify what key to use for the path by specifying the path-key option, which should be a list of keys to try (one key per line; use a re: or regexp: prefix to specify regular expressions).
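As a rough illustration of how a path-key value with plain and re:-prefixed entries might be interpreted, here is a standalone sketch. make_key_matcher is a hypothetical helper written for this example, not the matcher the package uses; it assumes plain entries match the whole key and regular expressions are anchored at the start of the key.

```python
import re


def make_key_matcher(option_value):
    """Build a matcher from a newline-separated list of keys or regexps."""
    patterns = []
    for line in option_value.splitlines():
        line = line.strip()
        if not line:
            continue
        for prefix in ("re:", "regexp:"):
            if line.startswith(prefix):
                patterns.append(re.compile(line[len(prefix):]))
                break
        else:
            # Plain entries only match the exact key name.
            patterns.append(re.compile(re.escape(line) + r"\Z"))

    def first_matching_key(item):
        # Earlier lines in the option win over later ones.
        for pattern in patterns:
            for key in item:
                if pattern.match(key):
                    return key
        return None

    return first_matching_key


matcher = make_key_matcher("""
    _uidupdater_path
    re:_import_.*_path
""")
print(matcher({"_import_legacy_path": "/spam/eggs/foo", "title": "Foo"}))
```

Items for which the matcher returns None have no usable path and would simply be passed along untouched.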

Paths to objects are always interpreted as relative to the context.

Similarly, the UID to set must be a string under a given key. You can set the key with the uid-key option, which behaves much like path-key. The default is to look under:

  • _[blueprintname]_[sectionname]_uid

  • _[blueprintname]_uid

  • _[sectionname]_uid

  • _uid

If the UID key is missing, the item will be skipped.

Below is an example of a standard updater. The test uid source produces items with two keys: a path under _path and a UID string under _uid.

>>> import pprint
>>> atschema = """
... [transmogrifier]
... pipeline =
...     schemasource
...     schemaupdater
...     printer
... [schemasource]
... blueprint =
... [schemaupdater]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'', atschema)
>>> transmogrifier(u'')
[('_path', '/spam/eggs/foo'), ('_uid', 'abc')]
[('_path', '/spam/eggs/bar'), ('_uid', 'xyz')]
[('_path', 'not/existing/bar'), ('_uid', 'def')]
[('_uid', 'geh')]
[('_path', '/spam/eggs/baz')]
[('_path', '/spam/notatcontent'), ('_uid', 'ijk')]
>>> pprint.pprint(plone.uids_set)
[('spam/eggs/foo', 'abc')]

Workflow updater section

A workflow updater pipeline section is another important element of a transmogrifier content import pipeline. It executes workflow transitions on Plone content based on the items it processes. Workflow updater sections operate on objects already present in the ZODB, whether created by a constructor or pre-existing.

The workflow updater section blueprint name is

Workflow updating needs two pieces of information: the path to the object, and what transitions to execute. To determine these, the workflow updater section inspects each item and looks for two keys, as described below. Any item missing either of these two pieces will be skipped. Similarly, items whose path doesn’t exist will be skipped as well.

For the object path, it’ll look (in order) for _[blueprintname]_[sectionname]_path, _[blueprintname]_path, _[sectionname]_path and _path, where [blueprintname] is replaced with the blueprint name and [sectionname] with the name given to the current section. This allows you to target the right section precisely if needed. Alternatively, you can specify what key to use for the path by specifying the path-key option, which should be a list of keys to try (one key per line; use a re: or regexp: prefix to specify regular expressions).

For the transitions, use the transitions-key option (same interpretation as path-key), defaulting to _[blueprintname]_[sectionname]_transitions, _[blueprintname]_transitions, _[sectionname]_transitions and _transitions.

Unicode paths are encoded to ASCII. Paths to objects are always interpreted as relative to the context object. Transitions are specified as a sequence of transition names, or as a string specifying one transition. Transitions are executed in order; failing transitions are silently ignored.
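The handling of transitions described above can be sketched in plain Python. This is only an illustration under stated assumptions: run_transitions and fake_do_action are hypothetical names, and fake_do_action stands in for whatever actually executes a workflow transition and raises an exception when it fails.

```python
def run_transitions(obj, transitions, do_action):
    """Run transitions in order, ignoring the ones that fail."""
    if isinstance(transitions, str):
        # A bare string names a single transition.
        transitions = (transitions,)
    executed = []
    for transition in transitions:
        try:
            do_action(obj, transition)
        except Exception:
            continue  # failing transitions are silently ignored
        executed.append(transition)
    return executed


def fake_do_action(obj, transition):
    # Hypothetical stand-in: only 'spam' and 'eggs' are valid transitions here.
    if transition not in ("spam", "eggs"):
        raise ValueError("no such transition: %s" % transition)


print(run_transitions(None, "spam", fake_do_action))
print(run_transitions(None, ("spam", "nonsuch", "eggs"), fake_do_action))
```

Note that a failed transition does not stop the remaining ones from being attempted; only the successful ones are recorded.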

>>> import pprint
>>> workflow = """
... [transmogrifier]
... pipeline =
...     workflowsource
...     workflowupdater
...     printer
... [workflowsource]
... blueprint =
... [workflowupdater]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'',
...                workflow)
>>> transmogrifier(u'')
[('_path', '/spam/eggs/foo'), ('_transitions', 'spam')]
[('_path', '/spam/eggs/baz'), ('_transitions', ('spam', 'eggs'))]
[('_path', 'not/existing/bar'),
 ('_transitions', ('spam', 'eggs')),
 ('title', 'Should not be updated, not an existing path')]
[('_path', 'spam/eggs/incomplete'),
 ('title', 'Should not be updated, no transitions')]
[('_path', '/spam/eggs/nosuchtransition'),
 ('_transitions', ('nonsuch',)),
 ('title', 'Should not be updated, no such transition')]
>>> pprint.pprint(plone.updated)
(('spam/eggs/foo', 'spam'),
 ('spam/eggs/baz', 'spam'),
 ('spam/eggs/baz', 'eggs'))

Browser default section

A browser default pipeline section sets the default page on a folder and the layout template on content objects. It is the Transmogrifier equivalent of the display menu in Plone. Browser default sections operate on objects already present in the ZODB, whether created by a constructor or pre-existing.

The browser default section blueprint name is

Setting the browser default needs at least one piece of information: the path to the object to modify. To determine the path, the browser default section inspects each item and looks for one key, as described below. Any item missing this piece of information will be skipped. Similarly, items whose path doesn’t exist, or whose object does not support the Plone ISelectableBrowserDefault interface, will be skipped as well.

For the object path, it’ll look (in order) for _[blueprintname]_[sectionname]_path, _[blueprintname]_path, _[sectionname]_path and _path, where [blueprintname] is replaced with the blueprint name and [sectionname] with the name given to the current section. This allows you to target the right section precisely if needed. Alternatively, you can specify what key to use for the path by specifying the path-key option, which should be a list of keys to try (one key per line; use a re: or regexp: prefix to specify regular expressions).

Once an object has been located, the section will look for defaultpage and layout keys. Like the path key, these can be named in the configuration with the default-page-key and layout-key options, respectively. By default, the section looks for the usual list of specific-to-generic keys based on blueprint and section names, from _[blueprintname]_[sectionname]_defaultpage and _[blueprintname]_[sectionname]_layout down to _defaultpage and _layout.

The defaultpage key will set the id of the default page that should be presented when the content object is loaded, and the layout key will set the id of the layout to use for the content item.

>>> import pprint
>>> browserdefault = """
... [transmogrifier]
... pipeline =
...     browserdefaultsource
...     browserdefault
...     printer
... [browserdefaultsource]
... blueprint =
... [browserdefault]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'',
...                browserdefault)
>>> transmogrifier(u'')
[('_layout', 'spam'), ('_path', '/spam/eggs/foo')]
[('_defaultpage', 'eggs'), ('_path', '/spam/eggs/bar')]
[('_defaultpage', 'eggs'), ('_layout', 'spam'), ('_path', '/spam/eggs/baz')]
[('_layout', 'spam'),
 ('_path', 'not/existing/bar'),
 ('title', 'Should not be updated, not an existing path')]
[('_path', 'spam/eggs/incomplete'),
 ('title', 'Should not be updated, no layout or defaultpage')]
[('_layout', ''),
 ('_path', 'spam/eggs/emptylayout'),
 ('title', 'Should not be updated, no layout or defaultpage')]
[('_defaultpage', ''),
 ('_path', 'spam/eggs/emptydefaultpage'),
 ('title', 'Should not be updated, no layout or defaultpage')]
>>> pprint.pprint(plone.updated)
(('spam/eggs/foo', 'layout', 'spam'),
 ('spam/eggs/bar', 'defaultpage', 'eggs'),
 ('spam/eggs/baz', 'layout', 'spam'),
 ('spam/eggs/baz', 'defaultpage', 'eggs'))

Criterion adder section

A criterion adder section is used to add criteria to collections. Criterion adder sections operate on objects already present in the ZODB, whether created by a constructor or pre-existing.

The criterion adder section blueprint name is

Given a path, a criterion type and a field name, this section will look up a Collection at the given path and add a criterion for that field, then alter the path of the item so that further sections will act on the added criterion. For example, an item with keys _path=bar/baz, _field=modified and _criterion=ATFriendlyDateCriteria will result in a new date criterion added inside the bar/baz collection, and the item’s path will be updated to bar/baz/crit__modified_ATFriendlyDateCriteria.
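The path rewrite can be sketched as follows. The crit__[field]_[criterion] pattern is inferred from the test output in this section, so treat it as an assumption; criterion_path is a hypothetical helper and the sketch does not touch any real collection.

```python
def criterion_path(path, field, criterion):
    """Return the path of the criterion added to the collection at ``path``.

    Assumes ids of the form crit__<field>_<criterion>, as seen in the
    test output of this section.
    """
    return "%s/crit__%s_%s" % (path.rstrip("/"), field, criterion)


item = {"_path": "/spam/eggs/foo", "_field": "baz", "_criterion": "bar"}
item["_path"] = criterion_path(item["_path"], item["_field"], item["_criterion"])
print(item["_path"])  # /spam/eggs/foo/crit__baz_bar
```

Sections placed after the criterion adder then see the criterion’s path under _path and can, for example, update its fields.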

For the path, criterion type and field keys, it’ll look (in order) for _[blueprintname]_[sectionname]_[key], _[blueprintname]_[key], _[sectionname]_[key] and _[key], where [blueprintname] is replaced with the blueprint name, [sectionname] with the name given to the current section, and [key] is path, criterion and field respectively. This allows you to target the right section precisely if needed. Alternatively, you can specify what key to use for these by specifying the path-key, criterion-key and field-key options, which should be a list of keys to try (one key per line; use a re: or regexp: prefix to specify regular expressions).

Paths to objects are always interpreted as relative to the context, and must resolve to objects that provide IATTopic.

>>> import pprint
>>> criteria = """
... [transmogrifier]
... pipeline =
...     criteriasource
...     criterionadder
...     printer
... [criteriasource]
... blueprint =
... [criterionadder]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'', criteria)
>>> transmogrifier(u'')
[('_criterion', 'bar'),
 ('_field', 'baz'),
 ('_path', '/spam/eggs/foo/crit__baz_bar')]
[('_criterion', 'bar'),
 ('_field', 'baz'),
 ('_path', 'not/existing/bar'),
 ('title', 'Should not be updated, not an existing path')]
[('_path', 'spam/eggs/incomplete'),
 ('title', 'Should not be updated, no criterion or field')]
>>> pprint.pprint(plone.criteria)
(('spam/eggs/foo', 'baz', 'bar'),)

Portal Transforms section

A portal transforms pipeline section lets you use Portal Transforms to transform item values. The portal transforms section blueprint name is

What values to transform is determined by the keys option, which takes a set of newline-separated key names. If a key name starts with re: or regexp: it is treated as a regular expression instead.

You can specify what transformation to apply in two ways. Firstly, you can directly specify a transformation by naming it with the transform option; the named transformation is run directly. Alternatively you can let the portal transforms tool figure out what transform to use by specifying target and an optional from mimetype. The portal transforms tool will select one or more transforms based on these mimetypes, and if no from option is given the original item value is used to determine one.
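The choice between the two modes can be illustrated with a toy stand-in. FakeTransformTool and its by_name and by_mimetype methods are hypothetical and are not the portal transforms API; the sketch also assumes that a transform option, when present, takes precedence over target and from.

```python
class FakeTransformTool(object):
    """Hypothetical stand-in for a transforms tool; not the real API."""

    def by_name(self, name, value):
        return "Transformed %r using the %s transform" % (value, name)

    def by_mimetype(self, target, value, source=None):
        if source is None:
            return "Transformed %r to %s" % (value, target)
        return "Transformed %r from %s to %s" % (value, source, target)


def apply_transform(tool, options, value):
    """Pick the transformation mode from the section options."""
    if "transform" in options:
        # A named transform is run directly.
        return tool.by_name(options["transform"], value)
    # Otherwise the tool chooses transforms based on the mimetypes.
    return tool.by_mimetype(options["target"], value, options.get("from"))


tool = FakeTransformTool()
print(apply_transform(tool, {"transform": "identity"}, "foo"))
print(apply_transform(tool, {"target": "text/plain"}, "foo"))
print(apply_transform(tool, {"from": "text/plain", "target": "text/plain"}, "foo"))
```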

Also optional is the condition option, which lets you specify a TALES expression that when evaluating to False will prevent any transformations from happening. The condition is evaluated for every matched key.

>>> ptransforms = """
... [transmogrifier]
... pipeline =
...     source
...     transform-id
...     transform-title
...     transform-status
...     printer
... [source]
... blueprint = collective.transmogrifier.sections.tests.samplesource
... encoding = utf8
... [transform-id]
... blueprint =
... transform = identity
... keys = id
... [transform-title]
... blueprint =
... target = text/plain
... keys = title
... [transform-status]
... blueprint =
... from = text/plain
... target = text/plain
... keys = status
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'',
...                ptransforms)
>>> transmogrifier(u'')
[('id', "Transformed 'foo' using the identity transform"),
 ('status', "Transformed '\\xe2\\x84\\x97' from text/plain to text/plain"),
 ('title', "Transformed 'The Foo Fighters \\xe2\\x84\\x97' to text/plain")]
[('id', "Transformed 'bar' using the identity transform"),
 ('status', "Transformed '\\xe2\\x84\\xa2' from text/plain to text/plain"),
 ('title', "Transformed 'Brand Chocolate Bar \\xe2\\x84\\xa2' to text/plain")]
[('id', "Transformed 'monty-python' using the identity transform"),
 ('status', "Transformed '\\xc2\\xa9' from text/plain to text/plain"),
 ('title',
  'Transformed "Monty Python\'s Flying Circus \\xc2\\xa9" to text/plain')]

The condition expression has access to the following:

  • the current pipeline item

  • the name of the matched key

  • if the key was matched by a regular expression, the match object, otherwise boolean True

  • the transmogrifier

  • the name of the current section

  • the options of the current section



URL Normalizer section

A URLNormalizer section allows you to parse any piece of text into a url-safe string which is then assigned to a specified key. It uses plone.i18n.normalizer to perform the normalization. The url normalizer section blueprint name is

The URL normalizer accepts the following optional keys:

  • source-key: the name of the item key whose value you wish to normalize

  • destination-key: the key under which the normalized string is stored

  • locale: the locale the normalizer should take into account, if any

>>> import pprint
>>> urlnormalizer = """
... [transmogrifier]
... pipeline =
...     urlnormalizersource
...     urlnormalizer
...     printer
... [urlnormalizersource]
... blueprint =
... [urlnormalizer]
... blueprint =
... source-key = title
... destination-key = string:id
... locale = string:en
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'',
...                urlnormalizer)
>>> transmogrifier(u'')
[('id', 'mytitle'), ('title', 'mytitle')]
[('id', 'is-this-a-title-of-any-sort'),
 ('title', 'Is this a title of any sort?')]
[('id', 'put-some-br-1lly-v4lues-here-there'),
 ('title', 'Put some <br /> $1llY V4LUES -- here&there')]
[('id', 'what-about-\r\n-line-breaks-system'),
 ('title', 'What about \r\n line breaks (system)')]
[('id', 'try-one-of-these-oh'), ('title', 'Try one of these --------- oh')]
[('language', 'My language is de')]
[('language', 'my language is en')]

As you can see, only items containing the specified source-key have been processed; the others have been ignored and yielded without change.

Destination-key and locale accept TALES expressions, so for example you could set your destination-key based on your locale element, which is in turn derived from your source-key:

>>> import pprint
>>> urlnormalizer = """
... [transmogrifier]
... pipeline =
...     urlnormalizersource
...     urlnormalizer
...     printer
... [urlnormalizersource]
... blueprint =
... [urlnormalizer]
... blueprint =
... source-key = language
... locale = python:str(item.get('${urlnormalizer:source-key}', 'na')[-2:])
... destination-key = ${urlnormalizer:locale}
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'',
...                urlnormalizer)
>>> transmogrifier(u'')
[('title', 'mytitle')]
[('title', 'Is this a title of any sort?')]
[('title', 'Put some <br /> $1llY V4LUES -- here&there')]
[('title', 'What about \r\n line breaks (system)')]
[('title', 'Try one of these --------- oh')]
[('de', 'my-language-is-de'), ('language', 'My language is de')]
[('en', 'my-language-is-en'), ('language', 'my language is en')]

In this case only items containing the ‘language’ key have been processed, and the destination-key has been set to the same value as the locale was. This is more to illuminate the fact that the locale was set, rather than providing a sensible use-case for destination-key.

If no options are specified, the normalizer falls back to the following default values: source-key: title, locale: en, destination-key: _id

>>> import pprint
>>> urlnormalizer = """
... [transmogrifier]
... pipeline =
...     urlnormalizersource
...     urlnormalizer
...     printer
... [urlnormalizersource]
... blueprint =
... [urlnormalizer]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'',
...                urlnormalizer)
>>> transmogrifier(u'')
[('_id', 'mytitle'), ('title', 'mytitle')]
[('_id', 'is-this-a-title-of-any-sort'),
 ('title', 'Is this a title of any sort?')]
[('_id', 'put-some-br-1lly-v4lues-here-there'),
 ('title', 'Put some <br /> $1llY V4LUES -- here&there')]
[('_id', 'what-about-\r\n-line-breaks-system'),
 ('title', 'What about \r\n line breaks (system)')]
[('_id', 'try-one-of-these-oh'), ('title', 'Try one of these --------- oh')]
[('language', 'My language is de')]
[('language', 'my language is en')]

In this case, the destination-key is set to a controller variable, like _path, as it is expected that the newly formed Id will in most cases be used further down the pipeline in constructing the full, final path to the new Plone object.

Note that this section can turn any piece of text into a normalized, web-safe string (max 255 chars). This string does not necessarily need to be used for a URL.
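To give a feel for the behaviour without a Plone installation, here is a rough standard-library approximation. It is not the algorithm used by plone.i18n.normalizer (for instance, it ignores locales and simply drops non-ASCII characters); rough_normalize and normalize_item are hypothetical names, and the defaults mirror the fallback values listed above.

```python
import re


def rough_normalize(text, max_length=255):
    """Approximate a web-safe id: lowercase ASCII letters, digits, hyphens."""
    text = text.lower()
    # Collapse every run of other characters into a single hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")[:max_length]


def normalize_item(item, source_key="title", destination_key="_id"):
    """Store a normalized value under destination_key if source_key exists."""
    if source_key in item:
        item[destination_key] = rough_normalize(item[source_key])
    return item


print(normalize_item({"title": "Is this a title of any sort?"}))
print(normalize_item({"language": "My language is de"}))
```

Items without the source key come back unchanged, which matches the pass-through behaviour shown in the examples of this section.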

Mime encapsulator section

A mime encapsulator section wraps arbitrary data in OFS.Image.File objects, together with a MIME type. This wrapping is a pre-requisite for Archetypes image, file or text fields, which can only take such File objects. The mime encapsulator blueprint name is

An encapsulator section needs three pieces of information: the key at which to find the data to encapsulate, the MIME type of this data, and the name of the field where the encapsulated data will be stored. The idea is that the data is copied from a “data key” (defaulting to _data and settable with the data-key option), wrapped into a File object with a MIME type (read from the mimetype option, which contains a TALES expression), and then saved into the pipeline item dictionary under a new key, most likely corresponding to an Archetypes field name (read from the field option, which is also a TALES expression).

The data key defaults to the series _[blueprintname]_[sectionname]_data, _[blueprintname]_data, _[sectionname]_data and _data, where [blueprintname] is replaced with the blueprint name and [sectionname] with the name of the current section. You can override this by specifying the data-key option.

You specify the mimetype with the mimetype option, which takes a TALES expression.

The field option, also a TALES expression, sets the output field name.

Optionally, you can specify a condition option, again a TALES expression, that when evaluating to False, causes the section to skip encapsulation for that item.

>>> encapsulator = """
... [transmogrifier]
... pipeline =
...     source
...     encapsulator
...     conditionalencapsulator
...     printer
... [source]
... blueprint =
... [encapsulator]
... blueprint =
... # Read the mimetype from the item
... mimetype = item/_mimetype
... field = string:datafield
... [conditionalencapsulator]
... blueprint =
... data-key = portrait
... mimetype = python:item.get('_%s_mimetype' % key)
... # replace the data in-place
... field = key
... condition = mimetype
... [printer]
... blueprint =
... """
>>> registerConfig(u'',
...                encapsulator)
>>> transmogrifier(u'')
datafield: (application/x-test-data) foobarbaz
portrait: (image/jpeg) someportraitdata

The field expression has access to the following:

  • the current pipeline item

  • the name of the matched data key

  • if the key was matched by a regular expression, the match object, otherwise boolean True

  • the transmogrifier

  • the name of the current section

  • the options of the current section

The mimetype expression has access to the same information as the field expression, plus:

  • the name of the field in which the encapsulated data will be stored.

The condition expression has access to the same information as the mimetype expression, plus:

  • the mimetype used to encapsulate the data.
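The order in which the three expressions build on each other (field, then mimetype, then condition) can be sketched with plain Python callables standing in for TALES expressions. WrappedData is a hypothetical stand-in for the File wrapper, and encapsulate and the three helper functions are illustrative names only.

```python
class WrappedData(object):
    """Hypothetical stand-in for a file wrapper carrying a MIME type."""

    def __init__(self, data, mimetype):
        self.data = data
        self.mimetype = mimetype


def encapsulate(item, data_key, field_expr, mimetype_expr, condition_expr=None):
    """Wrap item[data_key] and store it under the computed field name."""
    if data_key not in item:
        return item
    field = field_expr(item, data_key)
    # The mimetype expression can also see the field name.
    mimetype = mimetype_expr(item, data_key, field)
    # The condition expression can also see the mimetype.
    if condition_expr is not None and not condition_expr(item, data_key, field, mimetype):
        return item
    item[field] = WrappedData(item[data_key], mimetype)
    return item


def same_key(item, key):
    return key  # replace the data in place


def mimetype_from_item(item, key, field):
    return item.get("_%s_mimetype" % key)


def has_mimetype(item, key, field, mimetype):
    return mimetype


item = {"portrait": "someportraitdata", "_portrait_mimetype": "image/jpeg"}
encapsulate(item, "portrait", same_key, mimetype_from_item, has_mimetype)
print(item["portrait"].mimetype, item["portrait"].data)
```

An item carrying portrait data but no _portrait_mimetype key would fail the condition and keep its raw data untouched.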

Indexing section

A ReindexObject section allows you to reindex an existing object in the portal_catalog. ReindexObject sections operate on objects already present in the ZODB, be they created by a constructor or pre-existing objects.

The ReindexObject blueprint name is

To determine the path, the ReindexObject section inspects each item and looks for a path key, as described below. Any item missing this key will be skipped. Similarly, items whose path doesn’t exist, or whose object is not referenceable (Archetypes) or does not inherit from CMFCatalogAware, will be skipped as well.

The object path will be found under the first key found among the following:

  • _[blueprintname]_[sectionname]_path

  • _[blueprintname]_path

  • _[sectionname]_path

  • _path

where [blueprintname] is replaced with the blueprint name and [sectionname] with the name given to the current section. This allows you to target the right section precisely if needed.

Alternatively, you can specify what key to use for the path by specifying the path-key option, which should be a list of keys to try (one key per line; use a re: or regexp: prefix to specify regular expressions).

Paths to objects are always interpreted as relative to the context.
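The overall shape of such a section, which passes every item along and only acts on the ones that qualify, can be sketched as a generator. This is a simplified illustration, not the package’s code: resolve is a hypothetical callable mapping a path to an object (or None), FakeContent is a stand-in class, and the sketch uses a duck-typing check instead of testing for CMFCatalogAware.

```python
def reindex_section(previous, resolve, path_key="_path"):
    """Yield every item unchanged, reindexing the objects that qualify."""
    for item in previous:
        path = item.get(path_key)
        if path is not None:
            obj = resolve(path.lstrip("/"))
            # Only objects offering reindexObject() can be reindexed.
            if obj is not None and hasattr(obj, "reindexObject"):
                obj.reindexObject()
        yield item


class FakeContent(object):
    """Hypothetical stand-in for a catalog-aware content object."""

    def __init__(self):
        self.reindexed = False

    def reindexObject(self):
        self.reindexed = True


site = {"spam/eggs/foo": FakeContent(), "spam/eggs/plain": object()}
items = [
    {"_path": "/spam/eggs/foo"},
    {"_path": "/spam/eggs/plain"},
    {"_path": "not/existing/bar"},
    {"title": "no path"},
]
result = list(reindex_section(items, site.get))
print(site["spam/eggs/foo"].reindexed, len(result))
```

Skipped items are not dropped from the pipeline; they are yielded like any other so that later sections can still process them.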

>>> import pprint
>>> reindexobject = """
... [transmogrifier]
... pipeline =
...     reindexobjectsource
...     reindexobject
...     printer
... [reindexobjectsource]
... blueprint =
... [reindexobject]
... blueprint =
... [printer]
... blueprint = collective.transmogrifier.sections.tests.pprinter
... """
>>> registerConfig(u'', reindexobject)
>>> transmogrifier(u'')
[('_path', '/spam/eggs/foo')]
[('_path', '/spam/eggs/bar')]
[('_path', '/spam/eggs/baz')]
[('_path', 'not/a/catalog/aware/content'),
 ('title', 'Should not be reindexed, not a CMFCatalogAware content')]
[('_path', 'not/existing/bar'),
 ('title', 'Should not be reindexed, not an existing path')]
>>> pprint.pprint(plone.reindexed)
(('spam/eggs/foo', 'reindexed'),
 ('spam/eggs/bar', 'reindexed'),
 ('spam/eggs/baz', 'reindexed'))

Change History

(name of developer listed in brackets)

1.1 (2010-03-30)

  • Added Indexing section. See reindexobject.txt. [sylvainb]

  • Added UID updater section. See uidupdater.txt. [optilude]

  • Fixed tests for Plone 4, in the same way that they were fixed in collective.transmogrifier. [optilude]

1.0 (2009-08-09)

  • Initial package. [mj]


