Blogs

EMAIL: info@example.com

Agaric Collective: Using migration groups to share configuration among Drupal migrations

In the previous posts we talked about option to manage migrations as configuration entities and some of the benefits this brings. Today, we are going to learn another useful feature provided by the Migrate Plus module: migration groups. We are going to see how they can be used to execute migrations together and share configuration among them. Let’s get started.

Example definition of a migration group.

Understanding migration groups

The Migrate Plus module defines a new configuration entity called migration group. When the module is enabled, each migration can define one group they belong to. This serves two purposes:

  1. It is possible to execute operations per group. For example, you can import or rollback all migrations in the same group with one Drush command provided by the Migrate Tools module.
  2. It is is possible to declare shared configuration for all the migrations within a group. For example, if they use the same source file, the value can be set in a single place: the migration group.

To demonstrate how to leverage migration groups, we will convert the CSV source example to use migrations defined as configuration and groups. You can get the full code example at https://github.com/dinarcon/ud_migrations The module to enable is UD configuration group migration (CSV source) whose machine name is ud_migrations_config_group_csv_source. It comes with three migrations: udm_config_group_csv_source_paragraph, udm_config_group_csv_source_image, and  udm_config_group_csv_source_node. Additionally, the demo module provides the udm_config_group_csv_source group.

Note: The Migrate Tools module provides a user interface for managing migrations defined as configuration. It is available under “Structure > Migrations” at /admin/structure/migrate. This user interface will be explained in a future entry. For today’s example, it is assumed that migrations are executed using the Drush commands provided by Migrate Plus. In the past we have used the Migrate Run module to execute migrations, but this module does not offer the ability to import or rollback migrations per group.

Creating a migration group

The migration groups are defined in YAML files using the following naming convention: migrate_plus.migration_group.[migration_group_id].yml. Because they are configuration entities, they need to be placed in the config/install directory of your module. Files placed in that directory following that pattern will be synced into Drupal’s active configuration when the module is installed for the first time (only). If you need to update the migration groups, you make the modifications to the files and then sync the configuration again. More details on this workflow can be found in this article. The following snippet shows an example migration group:

uuid: e88e28cc-94e4-4039-ae37-c1e3217fc0c4
id: udm_config_group_csv_source
label: 'UD Config Group (CSV source)'
description: 'A container for migrations about individuals and their favorite books. Learn more at https://understanddrupal.com/migrations.'
source_type: 'CSV resource'
shared_configuration: null

The uuid key is optional. If not set, the configuration management system will create one automatically and assign it to the migration group. Setting one simplifies the workflow for updating configuration entities as explained in this article. The id key is required. Its value is used to associate individual migrations to this particular group.

The label, description, and source_type keys are used to give details about the migration. Their value appear in the user interface provided by Migrate Tools. label is required and serves as the name of the group. description is optional and provides more information about the group. source_type is optional and gives context about the type of source you are migrating from. For example, “Drupal 7”, “WordPress”, “CSV file”, etc.

To associate a migration to a group, set the migration_group key in the migration definition file: For example:

uuid: 97179435-ca90-434b-abe0-57188a73a0bf
id: udm_config_group_csv_source_node
label: 'UD configuration host node migration for migration group example (CSV source)'
migration_group: udm_config_group_csv_source
source: ...
process: ...
destination: ...
migration_dependencies: ...

Note that if you omit the migration_group key, it will default to a null value meaning the migration is not associated with any group. You will still be able to execute the migration from the command line, but it will not appear in the user interface provided by Migrate Tools. If you want the migration to be available in the user interface without creating a new group, you can set the migration_group key to default. This group is automatically created by Migrate Plus and can be used as a generic container for migrations.

Organizing and executing migrations

Migration groups are used to organize migrations. Migration projects usually involve several types of elements to import. For example, book reports, events, subscriptions, user accounts, etc. Each of them might require multiple migrations to be completed. Let’s consider a news articles migration. The “book report” content type has many entity reference fields: book cover (image), support documents (file), tags (taxonomy term), author (user), citations (paragraphs). In this case, you will have one primary node migration that depends on five migrations of multiple types. You can put all of them in the same group and execute them together.

It is very important not to confuse migration groups with migration dependencies. In the previous example, the primary book report node migration should still list all its dependencies in the migration_dependencies section of its definition file. Otherwise, there is no guarantee that the five migrations it depends on will be executed in advance. This could cause problems if the primary node migration is executed before images, files, taxonomy terms, users, or paragraphs have already been imported into the system.

It is possible to execute all migrations in a group by issuing a single Drush with the --group flag. This is supported by the import and rollback commands exposed by Migrate Tools. For example, drush migrate:import --group='udm_config_group_csv_source'. Note that as of this writing, there is no way to run all migrations in a group in a single operation from the user interface. You could import the main migration and the system will make sure that if any explicit dependency is set, they will be run in advance. If the group contained more migrations than the ones listed as dependencies, those will not be imported. Moreover, migration dependencies are only executed automatically for import operations. Dependent migrations will not be rolled back automatically if the main migration is rolled back individually.

Note: This example assumes you are using Drush to execute the migrations. At the time of this writing, it is not possible to rollback a CSV migration from the user interface. See this issue in the Migrate Source CSV for more context.

Sharing configuration with migration groups

Arguably, the major benefit of migration groups is the ability to share configuration among migrations. In the example, there are three migrations all reading from CSV files. Some configurations like the source plugin and header_offset can be shared. The following snippet shows an example of declaring shared configuration in the migration group for the CSV example:

uuid: e88e28cc-94e4-4039-ae37-c1e3217fc0c4
id: udm_config_group_csv_source
label: 'UD Config Group (CSV source)'
description: 'A container for migrations about individuals and their favorite books. Learn more at https://understanddrupal.com/migrations.'
source_type: 'CSV resource'
shared_configuration:
  dependencies:
    enforced:
      module:
        - ud_migrations_config_group_csv_source
  migration_tags:
    - UD Config Group (CSV Source)
    - UD Example
  source:
    plugin: csv
    # It is assumed that CSV files do not contain a headers row. This can be
    # overridden for migrations where that is not the case.
    header_offset: null

Any configuration that can be set in a regular migration definition file can be set under the shared_configuration key. When the migrate system loads the migration, it will read the migration group it belongs to, and pull any shared configuration that is defined. If both the migration and the group provide a value for the same key, the one defined in the migration definition file will override the one set in the migration group. If a key only exists in the group, it will be added to the migration when the definition file is loaded.

In the example, dependencies, migration_tag, and source options are being set. They will apply to all migrations that belong to the udm_config_group_csv_source group. If you updated any of these values, the changes would propagate to every migration in the group. Remember that you would need to sync the migration group configuration for the update to take effect. You can do that with partial configuration imports as explained in this article.

Any configuration set in the group can be overridden in specific migrations. In the example, the header_offset is set to null which means the CSV files do not contain a header row. The node migration uses a CSV file that contains a header row so that configuration needs to be redeclared. The following snippet shows how to do it:

uuid: 97179435-ca90-434b-abe0-57188a73a0bf
id: udm_config_group_csv_source_node
label: 'UD configuration host node migration for migration group example (CSV source)'
# Any configuration defined in the migration group can be overridden here
# by re-declaring the configuration and assigning a value.
# dependencies inherited from migration group.
# migration_tags inherited from migration group.
migration_group: udm_config_group_csv_source
source:
  # plugin inherited from migration group.
  path: modules/custom/ud_migrations/ud_migrations_csv_source/sources/udm_people.csv
  ids: [unique_id]
  # This overrides the header_offset defined in the group. The referenced CSV
  # file includes headers in the first row. Thus, a value of 0 is used.
  header_offset: 0
process: ...
destination: ...
migration_dependencies: ...

Another example would be multiple migrations reading from a remote JSON. Let’s say that instead of fetching a remote file, you want to read a local file. The only file you would have to update is the migration group. Change the data_fetcher_plugin key to file and the urls array to the path to the local file. You can try this with the ud_migrations_config_group_json_source module from the demo repository.

What did you learn in today’s blog post? Did the know that migration groups can be used to share configuration among different migrations? Share your answers in the comments. Also, I would be grateful if you shared this blog post with others.

This blog post series, cross-posted at UnderstandDrupal.com as well as here on Agaric.coop, is made possible thanks to these generous sponsors. Contact Understand Drupal if your organization would like to support this documentation project, whether it is the migration series or other topics.

Read more and discuss at agaric.coop.