Migrate old plone content to current plone
ILRT Content Migrator
Ed Crewe, ILRT at University of Bristol, February 2010
This egg and the companion Product it contains were written to migrate content from pre-Archetypes plone 2.0 sites (or later) to current plone.
The ilrt.contentmigrator egg extends the generic setup content import system to handle binary files and custom content. Hence a fully populated site can be generated from file system content held in a profiles structure folder.
The egg follows the paradigm of the existing generic setup, but adds workflow state to the properties metadata. It also adds .ini files for each binary content item so that these can have all their associated metadata imported and exported.
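The idea of a metadata .ini file beside each binary item can be sketched as below. This is a minimal illustration in modern Python, not the egg's own code: the section name `metadata`, the property names and the helper functions are all assumptions, and the real file layout may differ.

```python
import configparser
import io

# Hypothetical section name; the egg's real .ini layout is not documented here.
SECTION = "metadata"


def _new_parser():
    """Build a parser that keeps key case and does not treat '%' specially."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the case of property names
    return parser


def dump_sidecar(metadata):
    """Serialise a dict of string properties to .ini text."""
    parser = _new_parser()
    parser[SECTION] = metadata
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_sidecar(text):
    """Parse .ini text back into a plain dict of properties."""
    parser = _new_parser()
    parser.read_string(text)
    return dict(parser[SECTION])
```

Written next to a binary file such as a PDF, a sidecar like this would let the title, content type and workflow state survive a round trip through the file system.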
It contains a companion old-style plone product. This can be dropped into the Products folder in an old plone site. The site gains a portal_exportcontent tool. Running the export from this tool writes the content to a structure folder in the var directory, ready to be used to populate a current plone site and hence migrate the content.
The code arose from the need to migrate a large number of obsolete plone sites. Research into the issue showed that most tools assume a plone version from the last few years, where Archetypes, Five, Marshall and XML, or in-place content migration are viable.
Instead the code applies the methodology discussed in Andreas Jung’s blog posting ‘Plone migration fails - doing content-migration only’.
Using the Content Migrator
Copy ilrt.contentmigrator/Products/ContentMigrator to the Products directory of the old plone site. Restart and you should have a ‘Content Migrator Tool’ listed in the right hand content drop down. Pick this and add it to the portal.
There will be a new portal_exportcontent tool in your site. Select this and choose the Export content tab. Click export and wait whilst your site becomes files in var/zope/structure. If you only wish to export a subsection of your site then specify the path in the textbox at the top of the page.
Go to your new plone install. Add ilrt.contentmigrator to your buildout config and run bin/buildout.
Copy (or symlink) the exported structure folder to a profile folder, either in the ilrt.contentmigrator egg or in the main theme egg for your new plone site, and restart, e.g. ilrt.contentmigrator/ilrt/contentmigrator/profiles/import/structure
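The copy-or-symlink step can also be scripted. The following is a rough sketch only: `place_structure` is a hypothetical helper, not part of the egg, and the paths passed to it are whatever your own export and profile locations happen to be.

```python
import os
import shutil


def place_structure(exported, profile_dir, copy=False):
    """Make the exported structure folder available as <profile_dir>/structure.

    By default a symlink is created; pass copy=True to copy the tree instead
    (useful where symlinks are unavailable).
    """
    target = os.path.join(profile_dir, "structure")
    if os.path.lexists(target):
        # Refuse to overwrite an existing structure folder or link.
        raise FileExistsError(target)
    os.makedirs(profile_dir, exist_ok=True)
    if copy:
        shutil.copytree(exported, target)
    else:
        os.symlink(os.path.abspath(exported), target, target_is_directory=True)
    return target
```

A symlink avoids duplicating a large export on disk, while a copy keeps the profile independent of the old site's var directory.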
If you wish to specify another path for the structure folder import, just adjust the directory in the profile that you are using, e.g. directory="c:\import" in profiles.zcml
If a default profile is used then generic setup will automatically create the content when the egg with the profile is reinstalled or selected for Plone site creation. Whereas if another profile is used (such as /import above) then it has to be manually selected first and then run via the setup tool or this migrator tool. For large content imports this is likely to be preferable.
Standard generic setup runs the adapter in CMFCore.exportimport.content which will only populate content for HTML documents, and no properties or workflow states will be added.
To do a full import you must first install ilrt.contentmigrator via the quick installer. The content adapter for generic setup will be modified so that the content import step will now add all content and set workflow states. However if you go to the new portal_setupcontent tool you can run a further enhanced version of the generic setup content step that also sets up users, groups and memberdata and provides fuller logging to screen. In addition the tool provides access to the exporter so that you can re-export a site or a subfolder of its content.
The ilrt.contentmigrator modifies the generic setup site creation to do the following:
Populate binary content formats and archetypes if matching ones are found.
Use Marshall’s RFC822 marshaller to extract and apply the properties data.
Apply workflow state transitions. NB: The workflow migration requires the ilrt.migrationtool egg.
Translate old content types and add memberdata (see below)
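To give a feel for what RFC822-style properties data looks like, here is a small sketch that parses a header block followed by a body, using only the Python standard library. It is not Marshall's marshaller, and the property names in the usage note are assumed for illustration.

```python
from email import message_from_string


def parse_properties(text):
    """Split RFC822-style text into a dict of properties and the body.

    Headers come first, one 'name: value' per line, then a blank line,
    then the content body.
    """
    message = message_from_string(text)
    properties = dict(message.items())
    body = message.get_payload()
    return properties, body
```

For example, the text `"title: Welcome\nreview_state: published\n\n<p>Hello</p>\n"` would yield the two properties as a dict and the HTML fragment as the body.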
Please note that the import takes much longer than the export. For example, a gigabyte of content might take only 5 minutes to export, but an hour to import!
Content Types Translation
There is a mapping of old types to newer archetypes for old plone sites. Currently this just handles ‘Calendar Item’ to ‘Event’ and ‘Link’ to ‘ATLink’. It is in the ilrt/contentmigrator/ContentMigrator/config.py file. By modifying the TYPEMAP and NONATPROPS dictionaries of configuration data you can map other old custom types to new content, or even use it to migrate content from one new type to another.
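As an illustration of the type mapping, the sketch below assumes TYPEMAP is a plain dictionary from old type name to new type name. The actual structure in config.py may differ, NONATPROPS is left out because its layout is not described here, and the `translate_type` helper is hypothetical.

```python
# Assumed shape: old portal type name -> new portal type name.
# The two entries mirror the mappings described in the text above.
TYPEMAP = {
    "Calendar Item": "Event",
    "Link": "ATLink",
}


def translate_type(portal_type, typemap=TYPEMAP):
    """Return the new type for an old one, or the type unchanged if unmapped."""
    return typemap.get(portal_type, portal_type)
```

Under this assumption, mapping a further custom type would be a matter of adding one more key and value to the dictionary.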
The contentmigrator will also export and import zope held users, including passwords. It does so by generating the user, roles and groups data from GRUF or PAS based sites as generic setup xml files in the /structure/acl_users folder. Memberdata is saved as a csv file for each member in the portal_memberdata directory within acl_users.
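The one-csv-file-per-member layout can be sketched as follows. The file naming (`<member id>.csv`), the two-column name and value rows, and both helper functions are assumptions made for illustration; only the acl_users/portal_memberdata location comes from the description above.

```python
import csv
from pathlib import Path


def write_memberdata(structure_dir, member_id, properties):
    """Write one member's properties to acl_users/portal_memberdata/<id>.csv."""
    target_dir = Path(structure_dir) / "acl_users" / "portal_memberdata"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / (member_id + ".csv")
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Assumed row format: property name, property value.
        for name in sorted(properties):
            writer.writerow([name, properties[name]])
    return target


def read_memberdata(path):
    """Read a member csv file back into a dict of properties."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return {row[0]: row[1] for row in csv.reader(handle) if row}
```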
Changelog for ilrt.contentmigrator
(name of developer listed in brackets)
ilrt.contentmigrator - 0.6 (2010-02-10)
Replace user export page template xml generation with DTML to be more compatible with old zope
Fix adding of empty portal.REQUEST attribute causing error with site when exporting users from plone 2
Use indexObject not reindexObject so modified date is preserved
Document how to change import path
ilrt.contentmigrator - 0.5 (2009-11-20)
Export any folderish object by default
Make user export and reindexing optional
Just log failed object deletions and continue
Add the os.O_BINARY flag to all file writing to stop line ending tampering on Windows
Check for all string types when doing export
ilrt.contentmigrator - 0.4
Add conversion of old links to archetype links.
0.3 release was missing some files, doh!
[Jerry Van Baren]
ilrt.contentmigrator - 0.3
Bug fixes of utils AT types conversion methods - setting empty dates to current date (hence expiring most content!) and lines fields not converted correctly to tuples.
ilrt.contentmigrator - 0.2
Contains stand alone ContentMigrator product for exporting content from old plone sites
Uses generic setup style exportimport/content for importing content
Imports file and image content
Sets workflow state of content (requires ilrt.migrationtool)
Imports users, groups and roles
Translates old calendar item type to ATEvent
ilrt.contentmigrator - 0.1 Unreleased
Initial package structure.
Fix old plone quick installer addition for the ContentMigrator Product part of the egg
Add ATReference handling for export / import like GSXML
Fix AT file field handling to work if object also has default file data attribute
Add portrait handling to memberdata export / import
More old plone type translations?
Adapt to use for old zope only based content, eg. add html filters to grab the body text from ZPT content.