Specbee: How to Migrate Content from XML files to Drupal 8 (or 9)

Maithri Shetty
05 Oct, 2021

Migration is all about pulling data from different sources into your Drupal CMS. Your migration source can be almost anything: a database, another CMS, CSV files, or XML files, and Drupal gives you the flexibility to migrate data from any of them. We have written extensively on Drupal migrations before, including migrating from a database source, migrating multilingual content from CSV, migrating from an SQL source, and a complete how-to guide to migrations. If you are looking for a guide to migrating data from XML files to Drupal 8 or Drupal 9, you have arrived at the right place, as that is what we are going to discuss here!

If you’re looking to import external feeds to your Drupal 9 website using the Drupal Feeds module, check out this article.

Drupal 8/9 Migration Modules

Here, we will be using the whole set of Drupal migrate modules: the core Migrate module along with the contributed Migrate Plus and Migrate Tools modules.

After installing these modules, you will need to create a custom module in which you will write the migration scripts. First, create the custom module folder (test_migration in this example) with an info.yml file that holds the module’s details. Then enable the module.

name: test migration
type: module
description: Migrating data from the xml files.
core_version_requirement: ^8 || ^9
package: Migration
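dependencies:
 # Only core modules use the drupal: prefix; contributed modules are
 # namespaced by their own project name.
 - drupal:migrate
 - migrate_plus:migrate_plus
 - migrate_tools:migrate_tools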
config_devel:
 install:
   - migrate_plus.migration_group.test_migration
   - migrate_plus.migration.test_migration_sessions
   - migrate_plus.migration.test_migration_paper
   - migrate_plus.migration.test_migration_user

Once the info.yml file is created, you need to create a migration group for this migration. Create the group as a file named migrate_plus.migration_group.test_migration.yml in the path test_migration > config > install. Each migration refers to the group by its id, and these details are needed while running the migration:

id: test_migration
label: test Migration
description: Migrating xml data

After creating the migration group, register it under the config_devel key in your info.yml file.

config_devel:
 install:
   - migrate_plus.migration_group.test_migration

Now, you need to write a migration script for the content type. This file should also be placed inside config > install, and it should be named migrate_plus.migration.test_migration_paper.yml:

id: test_migration_paper
label: 'Migrate Paper data from the xml file'
migration_group: test_migration
source:
 plugin: url
 data_fetcher_plugin: file
 data_parser_plugin: xml
 # Full path to the file.
 urls: private://xml/session.xml
 item_selector: /program/session/papers/paper
 fields:
   - name: title
     label: 'Paper title'
     selector: papertitle
   - name: abstract
     label: 'Paper abstract'
     selector: abstract
   - name: first_name
     label: 'Author first name'
     selector: authors/author/name/givenname
   - name: last_name
     label: 'Author last name'
     selector: authors/author/name/surname
   - name: paper_id
     label: 'Paper identifier'
     selector: paperid
   - name: session_id
     label: 'Session identifier'
     selector: sessionid
   - name: start_time
     label: 'Paper presentation start time'
     selector: starttime
   - name: end_time
     label: 'Paper presentation end time'
     selector: endtime
   - name: author_name
     label: 'Author name'
     selector: authors/author/name/fullname
   - name: session
     label: 'Session name'
     selector: session
 ids:
   # Each item is a paper, so the paper identifier is used as the unique key.
   paper_id:
     type: string
process:
 # Adding the mapping between the destination fields and the source fields defined above.
 title:
   - plugin: skip_on_empty
     method: process
     source: title
 field_abstract/value: abstract
 field_abstract/format:
   plugin: default_value
   default_value: "full_html"
 field_presentation_timing/value:
   plugin: format_date
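   # PHP date formats read an unescaped T as the timezone abbreviation,
   # so the literal T separator is escaped.
   from_format: 'Y-m-d\TH:i:s'
   to_format: 'Y-m-d\TH:i:s'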
   source: start_time
 field_presentation_timing/end_value:
   plugin: format_date
   from_format: 'Y-m-d\TH:i:s'
   to_format: 'Y-m-d\TH:i:s'
   source: end_time
 field_paper_identifier: paper_id
 field_session_identifier: session_id
 field_author:
   - plugin: skip_on_empty
     method: process
     source: author_name
   - plugin: paper_user_migration_import_process
     no_stub: true
 field_session:
   - plugin: skip_on_empty
     method: process
     source: session
   - plugin: node_migration_import_process
     no_stub: true
destination:
 plugin: 'entity:node'
 default_bundle: paper
migration_dependencies:
 required: { }
dependencies: { }
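
The format_date steps in the process section expect each timestamp to look like 2021-10-05T09:00:00: a date, a literal T, then the time. Before running the migration, it can help to check the source values against that shape. The following is a minimal sketch in Python rather than Drupal code; the sample values and the function name are made up for illustration, and the Python pattern is only an approximate counterpart of the PHP one.

```python
from datetime import datetime

# Approximate Python counterpart of the PHP pattern 'Y-m-d\TH:i:s'
# (an assumption made for this sketch, not something Drupal uses).
EXPECTED_FORMAT = "%Y-%m-%dT%H:%M:%S"


def matches_expected_format(value):
    """Return True when value parses with the expected timestamp pattern."""
    try:
        datetime.strptime(value, EXPECTED_FORMAT)
    except ValueError:
        return False
    return True


# Hypothetical starttime/endtime values as they might appear in the XML file.
samples = [
    "2021-10-05T09:00:00",
    "2021-10-05 09:00:00",
    "05/10/2021 09:00",
]

for sample in samples:
    status = "ok" if matches_expected_format(sample) else "does not match"
    print(f"{sample}: {status}")
```

Values that do not match would need either a different from_format or a clean-up step before format_date runs.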


id: A unique identifier for each migration yaml file; it is displayed in the migration list.
label: A short description of the migration.
migration_group: The main group that contains all the migrations.
source: Contains all the source details for this migration.
urls: The file to migrate from. In the migration script above, the source file is placed inside the private directory.
item_selector: The path to the element that represents one item. Each element matched by this path should contain one complete set of data.
fields: Here we assign tag values to named variables.
name: The variable name.
label: A short description of the variable.
selector: The tag, relative to the item, whose value should be assigned to the variable.
ids: The source field(s) that uniquely identify each item.
process: Here we map the variables to the machine names of the destination fields.
destination: The entity type to which the data should be migrated.
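
To make item_selector, selector, and ids more concrete, here is a small sketch that mimics them with Python's standard library. It is not how Migrate Plus parses XML internally; the sample XML, the helper names, and the reduced field list are assumptions made for illustration. Each element matched by the item selector becomes one row, each field selector is resolved relative to that element, and a last helper reports which candidate key has repeated values.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data; element names follow the selectors in the script.
SAMPLE_XML = """
<program>
  <session>
    <papers>
      <paper>
        <paperid>P-1</paperid>
        <sessionid>S-1</sessionid>
        <papertitle>First paper</papertitle>
        <authors>
          <author><name><fullname>Ada Lovelace</fullname></name></author>
        </authors>
      </paper>
      <paper>
        <paperid>P-2</paperid>
        <sessionid>S-1</sessionid>
        <papertitle>Second paper</papertitle>
        <authors>
          <author><name><fullname>Alan Turing</fullname></name></author>
        </authors>
      </paper>
    </papers>
  </session>
</program>
"""

ITEM_SELECTOR = "/program/session/papers/paper"
# A subset of the fields from the migration script: name -> selector.
FIELDS = {
    "title": "papertitle",
    "paper_id": "paperid",
    "session_id": "sessionid",
    "author_name": "authors/author/name/fullname",
}


def select_items(root, item_selector):
    """Return the elements matched by an absolute item selector."""
    parts = item_selector.strip("/").split("/")
    if parts[0] != root.tag:
        return []
    # ElementTree only accepts paths relative to the element being searched.
    relative = "/".join(parts[1:])
    return root.findall(relative) if relative else [root]


def extract_rows(xml_text):
    """Build one dict per item, resolving each selector relative to the item."""
    root = ET.fromstring(xml_text.strip())
    return [
        {name: item.findtext(selector) for name, selector in FIELDS.items()}
        for item in select_items(root, ITEM_SELECTOR)
    ]


def duplicated_values(rows, key):
    """List the values of a candidate id field that occur more than once."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, count in counts.items() if count > 1)


rows = extract_rows(SAMPLE_XML)
for row in rows:
    print(row)
print("duplicate paper_id values:", duplicated_values(rows, "paper_id"))
print("duplicate session_id values:", duplicated_values(rows, "session_id"))
```

In this made-up sample both papers share one session, so only paper_id is unique per item. Running the same kind of check on real data is a quick way to confirm which field is safe to use under ids.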

Based on your data, you can also write custom process plugins for the migration. In the migration script above, the custom plugin paper_user_migration_import_process is written for the field_author field.

After this, go to Structure, where you will see Migration. Click on it, and you can see all the migration groups.

Click on List migration. There you can see the list of migration yaml files we have created.
Click on the Execute button. You can see the migration options.