Amtrak, B-Movies, Web Development, and other nonsense

Tag: Submodules

Overlaying dependency management

I’ve described how Lafayette’s deployment strategy involved pushing rocks uphill. A key change in our thinking came when we started treating each of our WordPress multisite installations as its own software project, with its own dependencies and lifecycle. Enabling this pivot was a technology which wasn’t mature in 2013: Composer.

What is Composer?

Composer is a package manager for PHP. It fills a role similar to npm for Node.js and bundler for Ruby. It uses a JSON file to capture metadata about the project and the project’s dependencies. If you have a custom application built on Symfony and Silex, your composer.json file might look like this:


{
    "name": "outlandish/wpackagist",
    "description": "Install and manage WordPress plugins with Composer",
    "require": {
        "php": ">=5.3.2",
        "composer/composer": "1.3.*",
        "silex/silex": "~1.1",
        "twig/twig": ">=1.8,<2.0-dev",
        "symfony/console": "*",
        "symfony/filesystem":"*",
        "symfony/twig-bridge": "~2.3",
        "symfony/form": "~2.3",
        "symfony/security-csrf": "~2.3",
        "symfony/locale": "~2.3",
        "symfony/config": "~2.3",
        "symfony/translation": "~2.3",
        "pagerfanta/pagerfanta": "dev-master",
        "franmomu/silex-pagerfanta-provider": "dev-master",
        "doctrine/dbal": "2.5.*",
        "knplabs/console-service-provider": "1.0",
        "rarst/wporg-client": "dev-master",
        "guzzlehttp/guzzle-services": "1.0.*"
    },
    "bin-dir": "bin",
    "autoload": {
        "psr-4": {
            "Outlandish\\Wpackagist\\": "src/"
        }
    }
}

The real action here is in the require block, where we spell out all the different packages that compose our application. Each key/value pair is an individual package and its version constraint. Composer users semantic versioning, and supports a wide range of expressions. Here are some quick examples:

  • “*”: wildcard; highest version possible (probably don’t ever do this)
  • “2.*”: highest version within the 2.x major version
  • “~1.2”: highest version between 1.2 and 2.0

The key name is the name of the package, in the format vendor/project name. By default Composer assumes that you’re installing these packages from Packagist, but as we’ll see that’s just the beginning of the Composer ecosystem.

What is Packagist?

Packagist is a centralized Composer repository which anyone can register packages on. It’s full of packages like the ones listed in the example above. Given a specific package and version constraint, it returns the matching files. Packagist is special inasmuch as it’s available by default for every Composer project, but it’s possible to define additional repositories if you have packages which aren’t in Packagist.

What Repositories, where?

Let’s say you’ve got a private project that you can’t publish to Packagist, but you’d like to make it available to an internal Composer project. Composer has a number of options for doing this. The simplest is adding your version control repository (VCS) as a repository to your project:


{
  "name": "yourcompany/yourproject",
  "description": "Your sample project",
  "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/yourcompany/someotherproject"
    }
  ]
}

This means that Composer will scan that repository, in addition to Packagist, when searching for packages. All you need to do to make this work is to add a reasonable composer.json file to that private project.

This is fine for one or two packages, but becomes unwieldy with a dozen or more. This is where a project like Satis becomes useful. Satis transforms a repository block into a lightweight Composer repository. This way, your internal projects need to include the Satis repository only—as you add new VCS repositories to Satis they become available to your downstream projects:


{
  "name": "yourcompany/yourproject",
  "description": "Your sample project",
  "repositories": [
    {
      "type": "composer",
      "url": "https://satis.yourcompany.com"
    }
  ]
}

What’s this got to do with WordPress?

Composer’s structure lets you overlay a dependency management system on existing code without too much pain. With your own private packages adding composer.json files is straightforward. Obviously you’re not going to get composer.json files added to every package on the WordPress theme and plugin repositories.

Fortunately, this is a solved problem. The fine folks at Outlandish maintain the WordPress Packagist composer repository. WPackagist scans the WordPress theme and plugin subversion repositories, mapping projects and versions in a way that composer can understand. The themes and plugins themselves are downloaded as zip files and extracted. At a stroke, all themes and plugins from WordPress.org are available to your project along with your private projects:


{
  "name": "yourcompany/yourproject",
  "description": "Your sample project",
  "repositories": [
    {
      "type": "composer",
      "url": "https://satis.yourcompany.com"
    },
    {
      "type":"composer",
      "url":"https://wpackagist.org"
    }
  ]
}

This is far more efficient than converting repositories to git, and you arrive at the same end result: specifying a known version constraint for a given project. Next up: how you actually deploy a WordPress site with this stuff.

Don’t push rocks uphill

Pictured: developer pushing rocks uphill

For three years Lafayette’s official WordPress deployment strategy was to push rocks uphill. This was a doubtful plan, but it represented an improvement over its predecessor, which was to stare at the rocks doubtfully, then roll them around a field at random. Here follows a warning to others.

Git all the things!

In 2013 we had embraced git with the fervor of the converted. Applying this to WordPress was difficult. WordPress.org gave us two options for getting themes and plugins:

  • Download them as a ZIP file manually
  • Clone them from subversion

These weren’t great options. We weren’t going to adopt a strategy which incorporated subversion as part of the deployment itself, and we didn’t want to lose revision information with a manual download process.

We hit upon the strategy of pushing rocks uphill. We setup a platform to clone the WordPress themes and plugins we need and then convert them from SVN to git, using the svn2git ruby gem. We pushed the result into a private git repository. This git repository was then added as a submodule to our WordPress deployment repository.

This was cumbersome and time-consuming. The WordPress theme and plugin SVN repositories are massive. The initial conversion of a module could take hours, or just fail. Repository structures varied according to the whims of the plugin maintainers. Tagging was inconsistent. WordPress doesn’t encourage atomic committing to SVN, which undercut the value of having commit messages. Maintaining a private repository for each theme and plugin added significant overhead.

Submodules: threat or menace?

Deployment in progress. Taken by Peter Standing [CC BY-SA 2.0], via Wikimedia Commons

We haven’t even talked about submodules. With a submodule you nest a git repository inside a git repository. The top-level repository has a .gitmodules file which tracks the location of the submodule remote; the revision history tracks which commit should be checked out in the submodule.

In a sample WordPress deployment, you would have your WordPress git repository, and you would attach your theme and plugin submodules. You then clone this repository on to your web server, and update from git as needed. This works, but it’s not as slick as it sounds.

Time was I wouldn’t hear a word said against them. That time is past. They’re a kludge and should be used sparingly. Most of “Submodules considered harmful” think pieces focus on their use in development. Here’s a couple: Why your company shouldn’t use Git submodules and Git: why submodules are evil. Their use is more defensible in a deployment context, but there are still problems:

  1. When you clone a git repository which has submodules, the submodules have to be initialized and updated separately.
  2. When you update a repository, the submodules have to updated as well. A sample command would be git submodule foreach git fetch —all.
  3. With deployments, you’re now worried about the state of each git repository, submodule or no.

With a collection of shell scripts this is manageable, but again it’s a lot of overhead. It also doesn’t self-document very well. Looking at my project repo, I can run git submodule status and get a mixture of useful and not useful information:


 20e6e064792e9735157d88f97eece9c5aef826a8 wp-content/plugins/conditional-widgets (2.2)
 90c74decfb020fdaa255bba68acb142550dfac35 wp-content/plugins/contact-form-7 (4.4.1)
 13cfa86ceb8001438b0fec9ea3a5a094d42e2397 wp-content/plugins/custom-field-template (2.3.4)
 292e34607378de0b4352ba164ccf7e1ecdaa44e9 wp-content/plugins/mathjax-latex (v1.1-59-g292e346)

This is okay as far as it goes, but I’m at the mercy of what’s in the submodules. I can’t rely on it to be human-readable, and I can’t use them for anything. If I want to update a module, I’ve got to push the new code through the pipeline, then update the submodule on my local machine, commit, and then send it out to the web server.

Also, you can’t easily remove a submodule, which you might want to do if you’re deleting a plugin. The last time I did this, I had to follow this process:

  1. Using a text editor, delete its entry from .gitmodules
  2. Stage that change
  3. Using a text editor, delete its entry from .git/config
  4. Run git rm --cached path/to/submodule
  5. Run rm -rf path/to/submodule
  6. Commit everything
  7. Push to your repository
  8. Repeat steps 3-5 when updating on the remote

A developer removes a submodule

Rocks. Uphill. Pushing. Allegedly this has gotten better in the last few years, but (a) we’ve moved on and (b) the version of git is low enough on our servers that it wouldn’t matter anyway.

Do you have a better idea?

The mistake here was forcing the WordPress modules into an unnatural path. They have versions on the WordPress.org repository; we should have adopted a method that leveraged that, rather than re-create that method in our own private repositories. Next up: overlaying dependency management.