Migrate old plone content to current plone

ILRT Content Migrator

Ed Crewe, ILRT at University of Bristol, November 2009

Overview

This egg and the companion Product it contains were written to migrate content from pre-Archetypes plone 2.0 sites (or later) to current plone.

The ilrt.contentmigrator egg extends the generic setup content import system to handle binary files and custom content. Hence a fully populated site can be generated from file system content held in a profiles structure folder.

The egg follows the paradigm of the existing generic setup, but adds workflow state to the properties metadata. It also adds .ini files for each binary content item so that these can have all their associated metadata imported and exported.
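As an illustration of the sidecar idea, here is a minimal stand-alone sketch that round-trips a metadata dictionary through .ini text. It is not code from the egg: the use of a [DEFAULT] section, the field names and the helper names dump_metadata and load_metadata are all assumptions made for the example.

```python
import configparser
import io


def dump_metadata(metadata):
    """Serialise a dict of string metadata to .ini text (assumed layout)."""
    # interpolation=None so that '%' characters in values are kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser["DEFAULT"] = metadata
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_metadata(text):
    """Read the .ini text back into a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["DEFAULT"])


# Hypothetical metadata for one exported binary item.
example = {
    "title": "Annual report",
    "content_type": "application/pdf",
    "review_state": "published",
}
ini_text = dump_metadata(example)
restored = load_metadata(ini_text)
```

Keeping the metadata in a separate text file means the binary file itself is written untouched, while its title, type and workflow state can still travel with it.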

It contains a companion old-style plone product. This can be dropped into the Products folder in an old plone site. The site gains a portal_exportcontent tool. Running the export from this tool writes the content to a structure folder in the var directory, ready to be used to populate a current plone site and hence migrate the content.

Concept

The code arose from the need to migrate a large number of obsolete plone sites. Having researched the issue, we found that most tools assume a plone version from the last few years, where Archetypes, Five, Marshall and XML, or in-place content migration are viable.

Instead the code applies the methodology discussed in Andreas Jung’s blog posting ‘Plone migration fails - doing content-migration only’.

Using the Content Migrator

Copy ilrt.contentmigrator/Products/ContentMigrator to the Products directory of the old plone site. Restart and you should have a ‘Content Migrator Tool’ listed in the right hand content drop down. Pick this and add it to the portal.

There will be a new portal_exportcontent tool in your site. Select this and choose the Export content tab. Click export and wait whilst your site becomes files in var/zope/structure. If you only wish to export a subsection of your site then specify the path in the textbox at the top of the page.

Go to your new plone install. Add ilrt.contentmigrator to your buildout config and run bin/buildout.

Copy (or symlink) the exported structure folder to a profile folder either in the ilrt.contentmigrator egg or in the main theme egg for your new plone site and restart, eg. ilrt.contentmigrator/ilrt/contentmigrator/profiles/import/structure
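The symlink option can also be scripted. The following is a minimal sketch, assuming a platform where symlinks are available; link_structure is a hypothetical helper and both paths are supplied by the caller, so nothing here is specific to the egg.

```python
from pathlib import Path


def link_structure(exported_structure, profile_dir):
    """Create <profile_dir>/structure as a symlink to the exported folder."""
    exported = Path(exported_structure).resolve()
    if not exported.is_dir():
        raise ValueError("exported structure folder not found: %s" % exported)
    profile = Path(profile_dir)
    profile.mkdir(parents=True, exist_ok=True)
    target = profile / "structure"
    # Refuse to overwrite an existing folder or link.
    if target.exists() or target.is_symlink():
        raise FileExistsError("profile already has a structure entry: %s" % target)
    target.symlink_to(exported, target_is_directory=True)
    return target
```

Symlinking avoids duplicating a large export on disk, at the cost of the profile depending on the var directory staying in place.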

If a default profile is used then generic setup will automatically create the content when the egg with the profile is reinstalled or selected for Plone site creation. If another profile is used (such as /import above) then it has to be selected manually and then run via the setup tool or this migrator tool. For large content imports this is likely to be preferable.

Standard generic setup runs the adapter in CMFCore.exportimport.content which will only populate content for HTML documents, and no properties or workflow states will be added.

To do a full import you must first install ilrt.contentmigrator via the quick installer. The content adapter for generic setup will be modified so that the content import step adds all content and sets workflow states. However, if you go to the new portal_setupcontent tool you can run a further enhanced version of the generic setup content step that also sets up users, groups and memberdata and provides fuller logging to screen. In addition, the tool provides access to the exporter so that you can re-export a site or a subfolder of its content.

The ilrt.contentmigrator egg modifies the generic setup site creation to do the following:

  • Populate binary content formats and archetypes if matching ones are found.

  • Use Marshall’s RFC822 marshaller to extract and apply the properties data.

  • Apply workflow state transitions. NB: The workflow migration requires the ilrt.migrationtool egg.

  • Translate old content types and add memberdata (see below)
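To show what properties data in RFC822 form looks like, here is a small sketch that splits a header block from the body text. It uses the standard library email module as a stand-in, not the Marshall product that the egg actually relies on, and the sample field names are assumptions for illustration only.

```python
from email import message_from_string

# Hypothetical exported item: RFC822-style headers, a blank line, then the body.
SAMPLE = """\
id: annual-report
title: Annual report
Content-Type: text/html
review_state: published

<p>Body text of the document.</p>
"""


def split_properties(text):
    """Return (properties, body) parsed from RFC822-style text."""
    message = message_from_string(text)
    properties = dict(message.items())  # header name -> value
    body = message.get_payload()        # everything after the blank line
    return properties, body


properties, body = split_properties(SAMPLE)
```

The same header-then-body layout lets properties such as the workflow state sit alongside the content in a single text file.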

Please note that the import takes much longer than the export. For example, a gigabyte of content might take only 5 minutes to export but an hour to import!

Content Types Translation

There is a mapping of old types to newer archetypes for old plone sites. Currently this just handles ‘Calendar Item’ to ‘Event’ and ‘Link’ to ‘ATLink’. It is in the ilrt/contentmigrator/ContentMigrator/config.py file. By modifying the TYPEMAP and NONATPROPS dictionaries of configuration data you can map other old custom types to new content, or even use it to migrate content from one new type to another.
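The translation can be pictured as a simple lookup. This sketch assumes TYPEMAP is a flat dictionary of old type name to new type name, which may not match the real structure in config.py; translate_type and the ‘Press Release’ entry are hypothetical, and NONATPROPS is not modelled here.

```python
# Assumed shape: old portal type name -> new portal type name.
TYPEMAP = {
    "Calendar Item": "Event",
    "Link": "ATLink",
}


def translate_type(old_type, typemap=TYPEMAP):
    """Return the mapped type, or the original name if no mapping exists."""
    return typemap.get(old_type, old_type)


# Extending the mapping for a hypothetical custom type.
custom_map = dict(TYPEMAP)
custom_map["Press Release"] = "News Item"
```

Falling back to the original name means types that need no translation pass through unchanged.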

User migration

The contentmigrator will also export and import users held in zope, including passwords. It does so by generating the user, roles and groups data from GRUF or PAS based sites as generic setup xml files in the /structure/acl_users folder. Memberdata is saved as one csv file per member in the portal_memberdata directory within acl_users.
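A per-member csv file could look something like the sketch below. The two-column property/value layout, the property names and the helper names are assumptions; the real files written by the tool may be laid out differently.

```python
import csv
import io


def dump_member(properties):
    """Write one member's properties as name,value rows of csv text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for name in sorted(properties):
        writer.writerow([name, properties[name]])
    return buffer.getvalue()


def load_member(text):
    """Read name,value rows back into a dict."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return {name: value for name, value in reader}


# Hypothetical memberdata for a single member.
member = {"email": "jane@example.org", "fullname": "Doe, Jane"}
csv_text = dump_member(member)
```

Using the csv module rather than joining strings by hand keeps values that contain commas or quotes intact.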

Changelog for ilrt.contentmigrator

(name of developer listed in brackets)

ilrt.contentmigrator - 0.5 (2009-11-20)

  • Export any folderish object by default

  • Make user export and reindexing optional

  • Just log failed object deletions and continue

  • Add the os.O_BINARY flag to all file writing to stop line ending tampering on Windows

[Ed Crewe]

  • Check for all string types when doing export [Dominic Hiles]

ilrt.contentmigrator - 0.4

  • Add conversion of old links to archetype links. [Ed Crewe]

  • Windows bug fixes due to line ending issues. [Ed Crewe]

  • 0.3 release was missing some files, doh! [Jerry Van Baren]

[Ed Crewe]

ilrt.contentmigrator - 0.3

  • Bug fixes of utils AT types conversion methods - setting empty dates to current date (hence expiring most content!) and lines fields not converted correctly to tuples.

[Ed Crewe]

ilrt.contentmigrator - 0.2

  • Contains stand alone ContentMigrator product for exporting content from old plone sites

  • Uses generic setup style exportimport/content for importing content

  • Imports file and image content

  • Sets workflow state of content (requires ilrt.migrationtool)

  • Imports users, groups and roles

  • Translates old calendar item type to ATEvent

[Ed Crewe]

ilrt.contentmigrator - 0.1 Unreleased

  • Initial package structure.

[zopeskel]

TO DO

  • Add ATReference handling for export / import like GSXML

  • Fix AT file field handling to work if object also has default file data attribute

  • Add portrait handling to memberdata export / import

  • More old plone type translations?

  • Adapt for use with old zope-only content, eg. add html filters to grab the body text from ZPT content.
