Drupal cross-version migration – “a” definitive guide

So you have a Drupal site you installed a few years ago, and you want to migrate data from that old installation. With some luck you were already at version 5.x. But with the current version being 7.x, the upgrade path is not only rather long, it’s also very painful.
In this article, I’ll explain how you can easily and flexibly migrate data from Drupal 5.x to Drupal 7.x.

Now, before we start, let’s define the problem at hand. After that, we can explore the options you have for migrating. And finally, I’ll go into detail on how to migrate using CSV and the Drupal 7 Feeds module.

Problem description

First off, let’s define what you want to do – or at least what I’ll help you to do.
I’m not talking about full Drupal migrations offering all capabilities of Drupal 5 in Drupal 7. I’m talking about data migration, and not even all data. I’m only talking about specific Drupal 5 CCK data. Having custom fields and look-up fields is no problem.
So, what you want to do is:

Migrate nodes with custom CCK content types and fields from Drupal 5 to Drupal 7, using the built-in Drupal 7 fields instead of CCK.

This means you need to set up the content type on a working Drupal 7 installation before migrating. The content type definition is not updated for you. You may need to make some design decisions for Drupal 5 fields that are no longer supported in Drupal 7 (e.g. usernode references) or fields that didn’t exist yet in Drupal 5 (e.g. entity reference).

Your Options

It’s important to realize that there are many ways to migrate data from Drupal 5 to Drupal 7. Even within this limited problem scope, you can write your own script to pull the data from the old Drupal 5 database, use a custom module based on a migration API, use Feeds and Services for Drupal-to-Drupal migrations …
A note on that last one. I set up the Feeds “Clients” module to connect to an old Drupal site running Services. The Clients module does not have a stable release yet. The most crucial elements, namely the Feeds Clients fetcher and the Feeds Clients parser, are not even available in the 7.x version; you can only find an old 6.x version, which no longer works. So even though the documentation mentions this as a possibility, you may get stuck after installing and configuring about 10 modules…

Note that most migrations will not be one-to-one copies from Drupal 5 to 7. You will probably have (minor) field changes, which make direct copies harder and require a transformation step. Read on, I’ve got you covered.

Migrating using Views Export and Feeds

You need to investigate the Feeds module and its submodule Feeds Import. Just read a bit so you get the initial idea of Fetchers, Parsers and Processors. Important stuff there 🙂

So how does the migration I propose to you work? Simple but powerful.

On the old site

This works for sites running Drupal 5. It may work for higher versions as well.

First, think about the data you want to export. Try visualizing this in your old site. And before you start making crazy queries, try thinking of the smallest content type you need to migrate. For example, you may link to locations (addresses) from different nodes. Locations also have route descriptions or something, but they are atomic – they link to no other entity.
Now create a View of the data you want to export. That will be the data you export to CSV.

To be able to export, you need to install the Views Bonus Pack module(s). After installation, enable the “Bonus: Views Export” module. This allows you to add additional parameters to views to export to CSV.

On the view you want to export, make sure you have a Page to output to, and under the Parameters section, select the following:

[Screenshot: Add parameter to your View]

After adding it, you’ll see:

[Screenshot: Your arguments after adding the CSV argument]

Now, when you go to your View’s page, you’ll see the following at the bottom of the page:

[Screenshot: CSV icon]

Clicking this link will allow you to download the CSV file. The export uses the display names of your selected fields as column headers. Make sure you set these correctly, or you’ll end up with overly descriptive names that make the import harder later on.
Attention: it may be a good idea to (re)format or (re)arrange your fields and their input already to make it easier to parse in the import step. For example, if you have node references, include them in your View (show the node’s title), so you don’t have to map them in your destination site.

Two more remarks:
– Uniqueness: by default, I use the “Title” property of a node to determine uniqueness. This is a handy way of working if your nodes actually have unique titles. The import module will require one field to be unique, so if you don’t have a title, you can use the node ID as a field.
– Node reference: it is perfectly possible to import nodes that have links to other nodes. The only thing you have to do is make sure the referenced nodes are included in your View as fields (either title which I use or node ID if you save the old node ID in your destination site).
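If you prefer to tidy up the export outside of Views, the two checks above (short column names, unique titles) can be sketched in a few lines of Python. This is only an illustration: the header names and sample rows below are made up, and the helper functions are hypothetical, not part of Views or Feeds.

```python
import csv
import io
from collections import Counter


def rename_headers(rows, renames):
    """Return the rows with column names replaced according to `renames`."""
    return [{renames.get(key, key): value for key, value in row.items()}
            for row in rows]


def duplicate_values(rows, column):
    """Return the values that occur more than once in `column`, sorted."""
    counts = Counter(row[column].strip() for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical export; in practice you would open the downloaded CSV file.
sample = io.StringIO(
    "Node: Title,Location (node reference)\n"
    "Summer party,Town hall\n"
    "Winter party,Town hall\n"
    "Summer party,Beach\n"
)
rows = list(csv.DictReader(sample))
rows = rename_headers(rows, {"Node: Title": "title",
                             "Location (node reference)": "locatie"})

# Titles listed here cannot serve as the unique field as-is.
print(duplicate_values(rows, "title"))  # ['Summer party']
```

If the list is not empty, exporting the node ID as an extra field and using that as the unique column is the safer choice.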

On the destination site

Install the Feeds module. It requires the Chaos Tools and Job Scheduler modules (even though you don’t need to schedule imports if you don’t want to).

Next (in Drupal 7), go to Structure > Feeds Importers.
Next, create a new importer or clone the default “Node import”.

Basic settings

Under “Attach to content type”, make sure you don’t select the content type you want to migrate to, as this will make imports fail. Just select “Use standalone form”. Also make sure you check “Import on submission”:

[Screenshot: Basic Feeds import settings]

Then, make sure you select the “File upload” fetcher. You could also use the HTTP fetcher and point it at your Views CSV link, but I haven’t tried that yet.
Next, choose the CSV parser. The default delimiter “,” is fine; just make sure you don’t check “No headers”. The other defaults are fine too.
Use the “Node processor” processor.
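Since the CSV parser relies on the header row and the delimiter, it can help to sanity-check the file before uploading it. Below is a rough Python sketch, assuming a comma-delimited file with a header row; the function name and the sample data are made up for illustration.

```python
import csv
import io


def inconsistent_rows(csv_text, delimiter=","):
    """Return the row numbers (header = row 1) whose column count
    differs from the number of columns in the header."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    return [row_no for row_no, row in enumerate(reader, start=2)
            if len(row) != len(header)]


# Hypothetical export where the third row is missing a column.
sample = (
    "title,locatie\n"
    "Summer party,Town hall\n"
    "Winter party\n"
)
print(inconsistent_rows(sample))  # [3]
```

An empty list means every row has as many columns as the header, which is what the mapping step expects.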

Node processor settings

Here, select the target content type in the “Bundle” drop-down list. You can change the other settings if you like.

Node mapping

Here comes the magic:

[Screenshot: Feeds import mapping]

In the Source column, select the name of the column as it is used in the CSV file.
Then select the content type destination field in the Target column.
The unique column is important: it is used to figure out whether a node matching a record in your CSV file (here, by title) already exists. You must mark one field as unique for the record.
Note the “Locatie” field. In my target site, this is an Entity reference field. Feeds knows how to work with this, but you need to provide a destination to link to in your CSV. In my case, I used the title of the Locatie node in my CSV directly. Feeds can find this node easily and links it during the import process.
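Because the reference is resolved by title, a typo in the CSV can leave a reference unlinked. A small Python sketch of a pre-import check follows, assuming you have one CSV with the referencing nodes and one with the referenced nodes; the column names and the helper are hypothetical.

```python
import csv
import io


def missing_references(source_rows, ref_column, target_rows, title_column="title"):
    """Return referenced titles in `source_rows` that do not occur
    as a title in `target_rows`, sorted and without duplicates."""
    known = {row[title_column].strip() for row in target_rows}
    referenced = {row[ref_column].strip() for row in source_rows}
    return sorted(value for value in referenced if value and value not in known)


# Hypothetical exports: events that reference a "Locatie" node by title.
events = list(csv.DictReader(io.StringIO(
    "title,Locatie\n"
    "Summer party,Town hall\n"
    "Winter party,Old mill\n"
)))
locations = list(csv.DictReader(io.StringIO(
    "title\n"
    "Town hall\n"
    "Beach\n"
)))

print(missing_references(events, "Locatie", locations))  # ['Old mill']
```

Any title printed here should be fixed in the CSV, or the referenced node imported first, before running the import.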
Watch out with dates. I noticed that dates need to be in US English format, e.g. mm/dd/yyyy hh:ii. Also, some dates may not parse correctly out of the box. I’ve found that the Feeds Tamper module lets you parse dates for a field, using an extra tab on the importer:

[Screenshot: Feeds Tamper tab (excerpt)]

The Tamper module allows you to apply extra filters per field. In this case, that lets us make sure dates are parsed correctly. Mapping the date column directly didn’t work.
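As an alternative to Feeds Tamper, you could also rewrite the dates in the CSV itself before importing. Here is a minimal Python sketch, assuming the old site exported dates as dd-mm-yyyy hh:mm; that source format is an assumption, so adjust it to whatever your View actually outputs.

```python
from datetime import datetime


def to_feeds_date(value, source_format="%d-%m-%Y %H:%M"):
    """Convert a date string to the mm/dd/yyyy hh:ii layout.

    `source_format` is an assumed export format, not something Views
    guarantees; a value that does not match raises ValueError.
    """
    parsed = datetime.strptime(value.strip(), source_format)
    return parsed.strftime("%m/%d/%Y %H:%M")


print(to_feeds_date("31-12-2012 18:30"))  # 12/31/2012 18:30
```

Letting bad values raise an error is deliberate here: it is easier to fix a handful of odd dates in the CSV than to hunt for wrongly imported ones afterwards.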

The actual migration

To do the actual migration, go to /import and select the Feeds importer you just created. Then, select your CSV file and click Import:

[Screenshot: The Import screen]

A nice progress dialog is shown while your data is being fetched, parsed and processed.
Make sure you check your nodes, the fields and especially the node references. If most of them look fine, you can continue with the next content type. If you have made a mistake or forgot something, you can easily delete the nodes you just imported by clicking the “Delete items” tab:

[Screenshot: Delete the items you just migrated]


You can now start over 🙂

Conclusion

In this article, I’ve discussed how to export nodes with custom content types and fields to CSV and how to import them into Drupal 7 with the Feeds module.
If I am to ever migrate content again, this is the preferred way for me. The intermediate CSV step allows you to perform some extra modifications to the data (e.g. using regular expressions) that you would otherwise have to do in code.
For the record: the largest batch I’ve imported was one with 5 field mappings for a total of 300 nodes. So no, I haven’t had performance issues. The 5×300 batch imported in less than 30 seconds on a shared hosting provider (dev site).
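To give an idea of the kind of regular-expression clean-up the intermediate CSV makes possible, here is a small Python sketch. The “phone” column and its contents are invented for the example, and the helper is hypothetical.

```python
import csv
import io
import re


def rewrite_column(csv_text, column, pattern, replacement):
    """Apply a regular-expression substitution to one column and
    return the resulting CSV as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = re.sub(pattern, replacement, row[column])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical data: strip everything except digits and "+" from a phone column.
sample = "title,phone\nTown hall,+32 (0)9 123 45 67\n"
print(rewrite_column(sample, "phone", r"[^\d+]", ""))
```

Going through the csv module instead of running the regular expression over the raw file keeps quoted fields and embedded commas intact.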

I did have some issues with multivalued fields. Views on Drupal 5.x allows you to export these, but only through the “Group multiple values” option for a field. This just concatenates the multiple values, without separating them in any way. I don’t know how to fix this, but with the limited amount of data I had, I could work around it.

Happy migrating, let me know in the comments 😉
