Evolving Web: Drupal 8 Migration: Migrating Hierarchical Taxonomy Terms

When you migrate relations between two different entities, you usually migrate the target entities first and then the source entities.

But what if the target and source are of the same entity type and bundle? How do you ensure that the target entity gets migrated before the source entity?

Have you run into a chicken-and-egg problem?

In this article, we’ll look at how to migrate hierarchical taxonomy terms to Drupal 8 or Drupal 9 by following along with an example migration. We’ll address the above questions to give you a good understanding of how to implement similar migrations in your projects.


The Problem

We have received two CSV files from our imaginary client: categories.csv and articles.csv.

We need to:

  • Migrate categories including their relationships (parent categories) from categories.csv
  • Migrate articles from articles.csv

Let’s get started.

Setting up the Migrations

Create the migration module

We need to create a module for our migrations. In this example, we’re naming it migrate_example_hierarchical_terms.

We then need to add the following modules as dependencies in the module declaration:
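A minimal sketch of what that declaration might look like is shown here. It assumes the dependencies are the modules this article relies on elsewhere: core migrate, migrate_plus for the migration group, migrate_tools for the drush commands, and migrate_source_csv for the csv source plugin. The name, description, package, and core version constraint are illustrative assumptions.

```yaml
# migrate_example_hierarchical_terms.info.yml
# Sketch only: the dependency list is inferred from the plugins and
# commands used in this article, not copied from the example module.
name: 'Migrate Example: Hierarchical Terms'
type: module
description: 'Example migration of hierarchical taxonomy terms from CSV files.'
package: Custom
core_version_requirement: ^8.8 || ^9
dependencies:
  - drupal:migrate
  - drupal:node
  - drupal:taxonomy
  - migrate_plus:migrate_plus
  - migrate_tools:migrate_tools
  - migrate_source_csv:migrate_source_csv
```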

Create a migration group

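To group the migrations, we also need to create a migration group. To do so, we’ll create a fairly simple configuration file in the module’s config/install folder so that the group gets created when the module is installed. Following the migrate_plus naming convention for migration groups, the file would be called migrate_plus.migration_group.hierarchical_terms.yml.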

The file’s contents should be as follows:

id: hierarchical_terms
label: Hierarchical Terms Group
source_type: CSV

Writing the Migrations

The next step is to write the actual migrations. For our requirements, we need two of them: one for categories and one for articles. The hierarchy will be handled as part of the categories migration.

Since Drupal 8.1.x, migrations are plugins that should be stored in a migrations folder inside any module. You can still make them configuration entities with the migrate_plus module, but I personally prefer to follow the core recommendation because it’s easier to develop with (you can edit the file and just rebuild the cache to pick up the change).
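A migration plugin file in that folder typically starts with a few identifying keys. The following is a sketch for the categories migration written in the next section. The id matches the one referenced later by migration_lookup; the label is a hypothetical value, and the migration_group key is assumed to be how migrate_plus and migrate_tools attach a plugin-based migration to the group defined above.

```yaml
# migrations/categories.yml (top-level keys only)
id: categories
# Hypothetical label, shown in drush and UI listings.
label: 'Categories'
# Assumption: this key links the migration to the hierarchical_terms group.
migration_group: hierarchical_terms
```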

Write the category migration

For the categories migration, we’re creating a categories.yml file inside the migrations folder. This migration uses CSV as source, so we need to declare it like this:

source:
  plugin: 'csv'
  path: 'modules/custom/migrate_example_hierarchical_terms/data/categories.csv'
  delimiter: ','
  enclosure: '"'
  header_offset: 0
  ids:
    - name
  fields:
    0:
      name: name
      label: 'Name'
  ...

The destination part of the migration is pretty standard so we’ll skip it, but you can still look at it in the code.
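For reference, a taxonomy term destination usually looks like the sketch below. The vocabulary machine name (categories) is an assumption made for illustration; the vocabulary is expected to exist on the site before the migration runs.

```yaml
destination:
  # Core destination plugin for taxonomy term entities.
  plugin: 'entity:taxonomy_term'
  # Assumption: the target vocabulary is named "categories".
  default_bundle: categories
```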

For the process, the important part is the parent field:

  parent:
    -
      plugin: migration_lookup
      migration: categories
      source: parent
    -
      plugin: default_value
      default_value: 0

We’re using the migration_lookup plugin and the current migration (categories) to look for the parent entities. We’re allowing stubs to be created (by not setting no_stub: true) so that if a child term is migrated before its parent term, the parent will be created as a stub and it will be completed later with the real data.

We’re also defaulting to 0 as parent if no parent is set in the source data. This way, the hierarchy will be preserved when running the migration.
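For contrast, here is a sketch of the same mapping with stubs disabled. It is not part of the example module; it only illustrates what the no_stub setting mentioned above changes.

```yaml
process:
  parent:
    -
      plugin: migration_lookup
      migration: categories
      source: parent
      # Stubs disabled: a parent that has not been migrated yet
      # is not created on the fly, and the lookup yields no value.
      no_stub: true
    -
      plugin: default_value
      default_value: 0
```

With this variant, a child row that appears before its parent in the CSV file would fall through to the default value and end up at the root of the vocabulary, so the hierarchy would only survive if parents were always listed before their children.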

Write the article migration

To migrate the articles, we’ve created the articles.yml migration file. If you have previous experience with migrations in Drupal 8, it’s pretty straightforward. It’s also using CSV as a source, so its source section is pretty similar to the one in the categories migration. The destination is set to be the article bundle of the node entity type.
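A sketch of what the source and destination sections of articles.yml could look like follows. The column names (title, content, category) come from the process section of this migration; the file path mirrors the categories one and is an assumption, as is the choice of title as the unique source ID.

```yaml
source:
  plugin: 'csv'
  # Assumption: articles.csv sits next to categories.csv.
  path: 'modules/custom/migrate_example_hierarchical_terms/data/articles.csv'
  delimiter: ','
  enclosure: '"'
  header_offset: 0
  # Assumption: titles are unique in the sample data.
  ids:
    - title
  fields:
    - name: title
      label: 'Title'
    - name: content
      label: 'Content'
    - name: category
      label: 'Category'
destination:
  plugin: 'entity:node'
  default_bundle: article
```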

The process section looks like this:

process:
  title: title
  body/value: content
  field_category:
    -
      plugin: migration_lookup
      migration: categories
      source: category

The title column in the CSV file is mapped directly to title in the node. The content column in the CSV file is mapped to the value sub-field of the body field. For field_category, we’re also using the migration_lookup plugin to get the categories that we’ve previously migrated.

We’re also setting a dependency on the categories migration to ensure that categories run before articles:

migration_dependencies:
  required:
    - categories

Now, everything is in place and we’re ready to run the migrations.

Running the Migrations

Given that we have set dependencies, we can instruct Drupal to run the migration group and it will run the migrations in the right order.

To do so, execute drush mim --group=hierarchical_terms. The output will look like this:

 [notice] Processed 11 items (7 created, 4 updated, 0 failed, 0 ignored) - done with 'categories'
 [notice] Processed 10 items (10 created, 0 updated, 0 failed, 0 ignored) - done with 'articles'

Note that the counts for categories are not what you’d expect looking at the data: 11 items are reported as processed, even though the source contains only 9 categories. This is because of the stub creation that happened during the migration. However, if you run drush ms, the output will be as expected:

 ----------------------------------------------- -------------- -------- ------- ---------- ------------- ---------------------
  Group                                           Migration ID   Status   Total   Imported   Unprocessed   Last Imported
 ----------------------------------------------- -------------- -------- ------- ---------- ------------- ---------------------
  Hierarchical Terms Group (hierarchical_terms)   categories     Idle     9       9          0             2020-08-21 19:18:46
  Hierarchical Terms Group (hierarchical_terms)   articles       Idle     10      10         0             2020-08-21 19:18:46
 ----------------------------------------------- -------------- -------- ------- ---------- ------------- ---------------------
