Web Omelette: Quickly generate the headers for the CSV migrate source plugin using Drush

Using migrate in Drupal is a very powerful way to bring data into a Drupal application. I have talked and written extensively on this matter here and elsewhere. Most of my examples use the CSV source plugin to illustrate migrations from CSV-formatted data. And if you are familiar with this source plugin, you know you have to “configure” it by specifying all the file’s column names. Kind of like this (a very simple YAML array):

  column_names:
    0:
      id: 'Unique Id'
    1:
      column one: 'What this column is about'
    2:
      column two: 'Another column'

I have written many, many migrations from CSV files of various sizes, but it took me years to finally utter the following out loud:

Can’t I just generate these stupid column names automatically instead of manually writing them every time?

As you can imagine, there can be files with 30 columns. And a given migration effort can even contain 20 migration files. So, a pain. I’m not the sharpest tool in the shed, but finally my laziness got the best of me and I decided to write a Drush command that I want to share with you today. Also, I have not written anything in such a long time and I feel proper shame.

So what I wanted was simple: a command that I can run and point at a file, and that prints the column names so I can just paste them into the migration file. No fuss, no muss. Or is it the other way around?

So this is what I came up with.

First, in the module’s composer file, we have to add an extra bit to inform Drush about the services file used for Drush commands. Apparently this will be mandatory in Drush 10.

    "extra": {
        "drush": {
            "services": {
                "drush.services.yml": "^9"
            }
        }
    }

Then, we have the actual drush.services.yml file where we declare the command service:

services:
  my_module.commands:
    class: \Drupal\my_module\Commands\MigrationCommands
    tags:
      - { name: drush.command }

It’s a simple tagged service. The tag tells Drush to treat it as a command class, and a command class can contain multiple commands.

And finally, the interesting bit, the command class:

<?php

namespace Drupal\my_module\Commands;

use Drupal\Core\Serialization\Yaml;
use Drush\Commands\DrushCommands;

class MigrationCommands extends DrushCommands {

  /**
   * Generates the YAML representation of the CSV headers.
   *
   * @param string $file
   *   The path to the CSV file, relative to the current working directory.
   *
   * @command generate-migration-column-headers
   */
  public function generateYaml($file) {
    $spl = new \SplFileObject($this->getConfig()->cwd() . DIRECTORY_SEPARATOR . $file, 'r');
    // fgetcsv() reads from the current position, which is the first row
    // (the header) on a freshly opened file.
    $headers = $spl->fgetcsv();

    $source_headers = [];
    foreach ($headers as $header) {
      $source_headers[] = [$header => $header];
    }

    $yml = Yaml::encode($source_headers);
    $this->output()->write($yml);
  }

}

What happens here is very simple. We first read the file whose path is the first and only mandatory argument of the command. This path needs to be relative to the directory the Drush command is called from, because we concatenate it with that location using $this->getConfig()->cwd(). Then we take the values from the first row of the CSV (the header) and build an array in the format expected by the CSV source plugin. Finally, we output a YAML-encoded version of that array.

Do note, however, that the column description is just the column name again since we don’t have data for that. So if you wanna add descriptions, you’ll have to add them manually in the migration file. Run the command, copy and paste and bill your client less.
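If you want to poke at the header logic without bootstrapping Drupal or Drush, something like the following standalone sketch should do. The buildColumnNames() helper and the throwaway CSV file are made up for illustration and are not part of the module. It also skips the YAML encoding and simply prints the PHP array that would be passed to Yaml::encode().

```php
<?php

/**
 * Reads the first row of a CSV file and returns one [column => description]
 * entry per column, the same structure the command class builds.
 *
 * buildColumnNames() is a hypothetical helper, used here for illustration only.
 */
function buildColumnNames(string $path): array {
  $file = new SplFileObject($path, 'r');

  // On a freshly opened file, fgetcsv() returns the first row (the header).
  // Separator, enclosure and escape are passed explicitly instead of relying
  // on the defaults.
  $headers = $file->fgetcsv(',', '"', '\\');

  // An empty file or a blank first line gives us nothing usable.
  if (!is_array($headers) || $headers === [null]) {
    return [];
  }

  $columns = [];
  foreach ($headers as $header) {
    // We have no description at this point, so the column name is reused.
    $columns[] = [$header => $header];
  }

  return $columns;
}

// Try it out on a throwaway CSV file (assumed sample data).
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,title,body\n1,Hello,World\n");
print_r(buildColumnNames($path));
unlink($path);
```

Only the header row is read, so the size of the file does not matter much. The descriptions still have to be edited by hand after pasting.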

Hope this helps. Can’t believe I’ve been writing CSV-based migrations since like the beginning and I only came up with this thing now.