Amtrak, B-Movies, Web Development, and other nonsense

Tag: WordPress

WordPress and partial content

Eighteen months ago we had an anomalous problem where video playback didn’t work on some, but not all, of our WordPress multisites. Videos wouldn’t play, or would play but wouldn’t seek. The problem was confined to local uploads embedded in a page. Videos from YouTube played fine; if you viewed the video directly playback worked as expected.

The problem turned out to be a long-standing issue with how ms-files.php served up files from pre-WordPress 3.5 multisites. Solutions had floated around for years. Our difficulty was describing the problem with enough specificity to actually find the right solution.


Overlaying dependency management

I’ve described how Lafayette’s deployment strategy involved pushing rocks uphill. A key change in our thinking came when we started treating each of our WordPress multisite installations as its own software project, with its own dependencies and lifecycle. Enabling this pivot was a technology which wasn’t mature in 2013: Composer.

What is Composer?

Composer is a package manager for PHP. It fills a role similar to npm for Node.js and Bundler for Ruby. It uses a JSON file to capture metadata about the project and the project’s dependencies. If you have a custom application built on Symfony and Silex, your composer.json file might look like this:


{
    "name": "outlandish/wpackagist",
    "description": "Install and manage WordPress plugins with Composer",
    "require": {
        "php": ">=5.3.2",
        "composer/composer": "1.3.*",
        "silex/silex": "~1.1",
        "twig/twig": ">=1.8,<2.0-dev",
        "symfony/console": "*",
        "symfony/filesystem":"*",
        "symfony/twig-bridge": "~2.3",
        "symfony/form": "~2.3",
        "symfony/security-csrf": "~2.3",
        "symfony/locale": "~2.3",
        "symfony/config": "~2.3",
        "symfony/translation": "~2.3",
        "pagerfanta/pagerfanta": "dev-master",
        "franmomu/silex-pagerfanta-provider": "dev-master",
        "doctrine/dbal": "2.5.*",
        "knplabs/console-service-provider": "1.0",
        "rarst/wporg-client": "dev-master",
        "guzzlehttp/guzzle-services": "1.0.*"
    },
    "bin-dir": "bin",
    "autoload": {
        "psr-4": {
            "Outlandish\\Wpackagist\\": "src/"
        }
    }
}

The real action here is in the require block, where we spell out all the different packages that compose our application. Each key/value pair is an individual package and its version constraint. Composer uses semantic versioning, and supports a wide range of expressions. Here are some quick examples:

  • “*”: wildcard; highest version possible (probably don’t ever do this)
  • “2.*”: highest version within the 2.x major version
  • “~1.2”: highest version from 1.2 up to, but not including, 2.0
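As a rough illustration of how those three expressions narrow the field, here is a minimal Python sketch. It is not Composer's resolver: the helper names (parse, bounds, highest) are made up for this example, it only understands plain numeric versions, and it covers just the forms listed above.

```python
def parse(version):
    """Turn a plain numeric version such as '1.9.4' into (1, 9, 4)."""
    return tuple(int(part) for part in version.split("."))


def bounds(constraint):
    """Return (lower, upper): lower is inclusive, upper is exclusive or None."""
    if constraint == "*":
        return (0,), None                      # anything goes
    if constraint.endswith(".*"):
        lower = parse(constraint[:-2])         # "2.*" -> (2,)
        return lower, lower[:-1] + (lower[-1] + 1,)
    if constraint.startswith("~"):
        lower = parse(constraint[1:])          # "~1.2" -> (1, 2)
        if len(lower) < 2:
            raise ValueError("this sketch needs at least two segments after ~")
        head = lower[:-1]                      # the last segment may float upward
        return lower, head[:-1] + (head[-1] + 1,)
    raise ValueError(f"unsupported constraint: {constraint}")


def highest(available, constraint):
    """Pick the highest version in `available` that satisfies `constraint`."""
    lower, upper = bounds(constraint)
    matches = [
        v for v in available
        if parse(v) >= lower and (upper is None or parse(v) < upper)
    ]
    return max(matches, key=parse) if matches else None


# Hypothetical list of published versions for some package.
available = ["1.1.0", "1.2.0", "1.9.4", "2.0.0", "2.7.1", "3.0.0"]
print(highest(available, "*"))     # 3.0.0
print(highest(available, "2.*"))   # 2.7.1
print(highest(available, "~1.2"))  # 1.9.4
```

The wildcard happily jumps to 3.0.0, which is why it is rarely what you want; the other two stay inside a single major version.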

The key name is the name of the package, in the format vendor/project name. By default Composer assumes that you’re installing these packages from Packagist, but as we’ll see that’s just the beginning of the Composer ecosystem.

What is Packagist?

Packagist is a centralized Composer repository which anyone can register packages on. It’s full of packages like the ones listed in the example above. Given a specific package and version constraint, it returns the matching files. Packagist is special inasmuch as it’s available by default for every Composer project, but it’s possible to define additional repositories if you have packages which aren’t in Packagist.

What Repositories, where?

Let’s say you’ve got a private project that you can’t publish to Packagist, but you’d like to make it available to an internal Composer project. Composer has a number of options for doing this. The simplest is adding the project’s version control system (VCS) repository as a repository in your project:


{
  "name": "yourcompany/yourproject",
  "description": "Your sample project",
  "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/yourcompany/someotherproject"
    }
  ]
}

This means that Composer will scan that repository, in addition to Packagist, when searching for packages. All you need to do to make this work is to add a reasonable composer.json file to that private project.

This is fine for one or two packages, but becomes unwieldy with a dozen or more. This is where a project like Satis becomes useful. Satis transforms a repository block into a lightweight Composer repository. This way, your internal projects need to include the Satis repository only—as you add new VCS repositories to Satis they become available to your downstream projects:


{
  "name": "yourcompany/yourproject",
  "description": "Your sample project",
  "repositories": [
    {
      "type": "composer",
      "url": "https://satis.yourcompany.com"
    }
  ]
}

What’s this got to do with WordPress?

Composer’s structure lets you overlay a dependency management system on existing code without too much pain. With your own private packages, adding composer.json files is straightforward. Obviously you’re not going to get composer.json files added to every package on the WordPress theme and plugin repositories.

Fortunately, this is a solved problem. The fine folks at Outlandish maintain the WordPress Packagist Composer repository. WPackagist scans the WordPress theme and plugin subversion repositories, mapping projects and versions in a way that Composer can understand. The themes and plugins themselves are downloaded as zip files and extracted. At a stroke, all themes and plugins from WordPress.org are available to your project along with your private projects:


{
  "name": "yourcompany/yourproject",
  "description": "Your sample project",
  "repositories": [
    {
      "type": "composer",
      "url": "https://satis.yourcompany.com"
    },
    {
      "type":"composer",
      "url":"https://wpackagist.org"
    }
  ]
}

This is far more efficient than converting repositories to git, and you arrive at the same end result: specifying a known version constraint for a given project. Next up: how you actually deploy a WordPress site with this stuff.

Don’t push rocks uphill

Pictured: developer pushing rocks uphill

For three years Lafayette’s official WordPress deployment strategy was to push rocks uphill. This was a doubtful plan, but it represented an improvement over its predecessor, which was to stare at the rocks doubtfully, then roll them around a field at random. Here follows a warning to others.

Git all the things!

In 2013 we had embraced git with the fervor of the converted. Applying this to WordPress was difficult. WordPress.org gave us two options for getting themes and plugins:

  • Download them as a ZIP file manually
  • Clone them from subversion

These weren’t great options. We weren’t going to adopt a strategy which incorporated subversion as part of the deployment itself, and we didn’t want to lose revision information with a manual download process.

We hit upon the strategy of pushing rocks uphill. We set up a platform to clone the WordPress themes and plugins we needed and then convert them from SVN to git, using the svn2git Ruby gem. We pushed the result into a private git repository. This git repository was then added as a submodule to our WordPress deployment repository.

This was cumbersome and time-consuming. The WordPress theme and plugin SVN repositories are massive. The initial conversion of a module could take hours, or just fail. Repository structures varied according to the whims of the plugin maintainers. Tagging was inconsistent. WordPress doesn’t encourage atomic committing to SVN, which undercut the value of having commit messages. Maintaining a private repository for each theme and plugin added significant overhead.

Submodules: threat or menace?

Deployment in progress. Taken by Peter Standing [CC BY-SA 2.0], via Wikimedia Commons

We haven’t even talked about submodules. With a submodule you nest a git repository inside a git repository. The top-level repository has a .gitmodules file which tracks the location of the submodule remote; the revision history tracks which commit should be checked out in the submodule.

In a sample WordPress deployment, you would have your WordPress git repository, and you would attach your theme and plugin submodules. You then clone this repository on to your web server, and update from git as needed. This works, but it’s not as slick as it sounds.

Time was I wouldn’t hear a word said against them. That time is past. They’re a kludge and should be used sparingly. Most “Submodules considered harmful” think pieces focus on their use in development. Here are a couple: Why your company shouldn’t use Git submodules and Git: why submodules are evil. Their use is more defensible in a deployment context, but there are still problems:

  1. When you clone a git repository which has submodules, the submodules have to be initialized and updated separately.
  2. When you update a repository, the submodules have to be updated as well. A sample command would be git submodule foreach git fetch --all.
  3. With deployments, you’re now worried about the state of each git repository, submodule or no.

With a collection of shell scripts this is manageable, but again it’s a lot of overhead. It also doesn’t self-document very well. Looking at my project repo, I can run git submodule status and get a mixture of useful and not useful information:


 20e6e064792e9735157d88f97eece9c5aef826a8 wp-content/plugins/conditional-widgets (2.2)
 90c74decfb020fdaa255bba68acb142550dfac35 wp-content/plugins/contact-form-7 (4.4.1)
 13cfa86ceb8001438b0fec9ea3a5a094d42e2397 wp-content/plugins/custom-field-template (2.3.4)
 292e34607378de0b4352ba164ccf7e1ecdaa44e9 wp-content/plugins/mathjax-latex (v1.1-59-g292e346)

This is okay as far as it goes, but I’m at the mercy of what’s in the submodules. I can’t rely on the version labels to be human-readable, and I can’t use them for anything. If I want to update a module, I’ve got to push the new code through the pipeline, then update the submodule on my local machine, commit, and then send it out to the web server.
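To make that mixture concrete, here is a small Python sketch that picks the status lines above apart. The function name and dictionary keys are hypothetical; the line layout (a one-character flag, the commit SHA, the path, then git describe output in parentheses) is what git prints. The sketch assumes paths without spaces.

```python
import re

# One status line: flag (' ', '+', '-' or 'U'), SHA, path, optional "(describe)".
STATUS_LINE = re.compile(r"^([ +\-U])([0-9a-f]+) (\S+)(?: \((.+)\))?$")
# git describe output when HEAD is past a tag: <tag>-<commits>-g<short sha>
PAST_TAG = re.compile(r"^(.+)-(\d+)-g([0-9a-f]+)$")


def parse_status(line):
    """Split one line of `git submodule status` output into its parts."""
    match = STATUS_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognised status line: {line!r}")
    flag, sha, path, describe = match.groups()
    info = {
        "path": path,
        "sha": sha,
        "initialized": flag != "-",   # '-' marks an uninitialized submodule
        "tag": describe,              # may be None if nothing could be described
        "commits_past_tag": 0,
    }
    past = PAST_TAG.match(describe) if describe else None
    if past:
        info["tag"] = past.group(1)
        info["commits_past_tag"] = int(past.group(2))
    return info


lines = [
    " 90c74decfb020fdaa255bba68acb142550dfac35 wp-content/plugins/contact-form-7 (4.4.1)",
    " 292e34607378de0b4352ba164ccf7e1ecdaa44e9 wp-content/plugins/mathjax-latex (v1.1-59-g292e346)",
]
for line in lines:
    info = parse_status(line)
    print(info["path"], info["tag"], info["commits_past_tag"])
```

The first line sits exactly on a tag named 4.4.1. The second is 59 commits past a tag named v1.1, which tells you very little about which release is actually deployed.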

Also, you can’t easily remove a submodule, which you might want to do if you’re deleting a plugin. The last time I did this, I had to follow this process:

  1. Using a text editor, delete its entry from .gitmodules
  2. Stage that change
  3. Using a text editor, delete its entry from .git/config
  4. Run git rm --cached path/to/submodule
  5. Run rm -rf path/to/submodule
  6. Commit everything
  7. Push to your repository
  8. Repeat steps 3-5 when updating on the remote

A developer removes a submodule

Rocks. Uphill. Pushing. Allegedly this has gotten better in the last few years, but (a) we’ve moved on and (b) the version of git is low enough on our servers that it wouldn’t matter anyway.

Do you have a better idea?

The mistake here was forcing the WordPress modules into an unnatural path. They have versions on the WordPress.org repository; we should have adopted a method that leveraged that, rather than re-create that method in our own private repositories. Next up: overlaying dependency management.

The Changelog Is A Lie

Today at WordCamp Lancaster Ryan Duff gave a talk on “Choosing WordPress Themes And Plugins: A Guide To Making Good Decisions.” It jogged my mind about an incident on the WordPress.org plugins database I observed last year. This incident, though minor, illustrates the significant limitations with that place.

Two years ago–to the day–I called the WordPress.org plugins database a “swamp” and I stand by that. Ryan noted that there’s no canonical right way to select plugins and themes. You have to mitigate risk as much as possible. That means you have to look at a plugin in the round. WordPress.org gives you some tools for that: ratings, reviews, installation base, support forums. You can evaluate the social credit of the developer. You can review the code yourself, if you’re so inclined and have the technical background.

Here at Lafayette we use a plugin called Category Posts Widget. It’s pretty simple: it creates a widget which will display recent posts from a given category. Its original author released version 3.3 in August 2011 and then never updated it again. We’d been running it since 2010 or earlier. If we’d stumbled on it in 2013 we’d have seen it was outdated and passed, but if a plugin keeps working you never really notice it’s been abandoned unless you have a regular review process (which we don’t).

In September 2014 a new author took ownership of the plugin and released an update, 4.0, which was of course automatically available for site owners. As we manage our multisites with git we have a code import process using svn2git, so we generally know how significant the changes are. Every plugin page on WordPress.org has a changelog, and the changes for this update sounded pretty routine:

  • Added CSS file for post styling
  • Now compaitable [sic] with latest versions of WordPress

Okay that sounds pretty helpful and…hey, check out the diff on those changes:

 cat-posts.php | 504 ++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 279 insertions(+), 225 deletions(-)

Wait, what? That plugin was only 262 lines long! What the hell?

At the risk of a tired metaphor this was a wolf in sheep’s clothing. The new author had inserted a completely new plugin with no upgrade path under the guise of an update. While it provided the same functionality, you would have to manually update your widgets. If, like us, you maintain multiple multisite installations with hundreds of sites, this simply isn’t an option. This support forum discussion gives a taste of the anguish for downstream users.

We dodged a bullet because of our internal code review process, but there are few external indications on WordPress.org about what happened:

  • As of today, the plugin has 80,000+ active installs. That no doubt includes those clients who, like us, stayed on version 3.3. In November, when WordPress still counted downloads and not installations, it had 300,000+ downloads.
  • It stands at 3.9 of 5 stars, with 8 5-star reviews and 3 1-star reviews. Tellingly, most of these reviews are from after the 4.0 update, and apparently from new users who weren’t burned by the update. Only one of the 1-star reviews flags the upgrade issue.
  • The author has 3 plugins, though if you dig in you notice he isn’t very active in the WordPress community and his other two plugins aren’t widely used. His plugins page shows 317,000 downloads, which sounds great until you realize almost all of those predate his involvement.

Nothing in the WordPress.org environment flags that the new author usurped the plugin, assumed the social credit generated by the previous author, and then pushed through a breaking update which raised hell on downstream production sites. Discussion after the fact showed that he either didn’t care or didn’t understand how serious this was. The offer to submit pull requests to GitHub was better than nothing…except that months later there’s been no activity and no pull requests have been accepted.

I’m not sure how you fix this. On the face of it, a new developer assuming responsibility for an abandoned but popular plugin (or theme) is a Good Thing, so outlawing it isn’t a solution. Maybe if WordPress.org tracked activation history and author history, so you could drill down and get stats? Alternatively, some way to flag when a plugin has a breaking change. But for now,

Wait, let me finish!

This summer we have our student worker building out a set of Behat tests for our WordPress environment. We’ve started with smoke tests. For example, on www.lafayette.edu we’re looking at the following:

  • Are there news items? If so, do the links to those items work?
  • Are there calendar events? If so, do the links to the events work?
  • Does the “Offices & Resources” drop-down function? Do the links in that drop-down work?

That’s a short list of tests but it covers a lot of ground:

  • The RSS feed validity between the main site and the news site
  • The RSS feed validity between the main site and the calendar
  • Whether JavaScript still works on the main site
  • The proper functioning of every link in the drop-down

If any of these tests fails there’s a non-trivial problem with the main page. In the first iteration, we ran into a problem with testing the links in the drop-down. This was the original test:

                When I click on "#navResources-toggle" 
                And I click the link <link>

This was within a Scenario Outline testing each link. It failed, each time, with some variation of the following:

 Exception thrown by (//html/.//a[./@href][(((./@id = 'foo' or contains(normalize-space(string(.)), 'foo')) or contains(./@title, 'foo') or contains(./@rel, 'foo')) or .//img[contains(./@alt, 'foo')])] | .//*[./@role = 'link'][((./@id = 'foo' or contains(./@value, 'foo')) or contains(./@title, 'foo') or contains(normalize-space(string(.)), 'foo'))])[1]
 unknown error: Element is not clickable at point (573, -163)

Googling suggested fiddling with the click location, which didn’t feel right. Triggering a drop-down menu and clicking a link is a simple enough use case. Simple problems should have simple answers.

Turns out this is a race condition and it reveals something about behavioral testing. The drop-down menu on the main page doesn’t open right away. We have some easing, timeouts, and animation which all mean that it takes a second or so to finish loading. During that time the links are actually moving from their starting point at the top of the page. Go try clicking on the menu and you’ll see what I’m describing. This means that a normal user will wait for 1-2 seconds before clicking a link. We didn’t write the test that way, which meant that the location of the link changed between the time we told Behat to click it and the time Behat tried to click it.

The solution? Write the test like Behat is an actual user and build in that delay:

                When I click on "#navResources-toggle"
                And wait 2 seconds
                And I click the link <link>

The order of operations matters. Now, Behat doesn’t click the link until two seconds have passed, at which point the drop-down is done expanding and the links are in their final location.
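The race can be pictured with a toy timeline. The Python sketch below is only a model, not Behat or browser code: the class name, the linear easing, the one-second duration, and the final position are all assumptions, and only the starting offset of -163 comes from the error message above.

```python
class SlidingLink:
    """Toy model of a menu link that slides into place after the toggle."""

    def __init__(self, start_y=-163.0, final_y=240.0, duration=1.0):
        self.start_y = start_y      # off-screen offset from the error message
        self.final_y = final_y      # hypothetical resting position
        self.duration = duration    # hypothetical animation length in seconds

    def y_at(self, elapsed):
        """Vertical position `elapsed` seconds after the toggle was clicked."""
        progress = min(max(elapsed / self.duration, 0.0), 1.0)
        return self.start_y + (self.final_y - self.start_y) * progress

    def click(self, elapsed):
        """Click the link; fail if it is still above the top of the page."""
        y = self.y_at(elapsed)
        if y < 0:
            raise RuntimeError(f"Element is not clickable at point (573, {y:.0f})")
        return y


link = SlidingLink()

try:
    link.click(elapsed=0.0)         # the original test: click immediately
except RuntimeError as error:
    print(error)                    # Element is not clickable at point (573, -163)

print(link.click(elapsed=2.0))      # the revised test: wait, then click -> 240.0
```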

Draining the swamp

It’s best to imagine WordPress’s plugin ecosystem as a swamp. Swamps are terrible. You don’t want to be there. You run a constant risk of disease and/or drowning. Anything that sinks into the swamp–it’s not coming back.

Belarus-Peat Mining near Rudzensk-Swamp-2

I’ve been debugging an odd problem on our WordPress installations involving categories. On some sites, posts which have multiple categories don’t display more than one category. That would be strange enough, but the category permalinks are coming out in the format SITE_URL/category/foo with the title baz, where foo is one category and baz a different category:

<a href="http://my.wordpress.site/category/category1">Category2</a>

Strange, seemingly non-deterministic behavior? The usual suspects would be database corruption or a theme bug. Yet neither seemed likely in this case. Database corruption usually isn’t so…predictable…and we quickly verified that this error was occurring in both our custom themes and stock TwentyTwelve. That would leave a core bug (unlikely with something so fundamental, but still possible) or a bad plugin.

After several patient hours of tracing execution I’d narrowed the problem to the function WordPress uses when building up the category list: the_category(). The category link string was correct before going in for formatting and it came out mangled. WordPress uses filters to allow plugins to “hook in” and modify output. A search of our plugin code revealed the culprit: Remove Title Attributes.

WordPress adds title attributes to links by default, a behavior which apparently annoys the hell out of many people, including at least one person at Lafayette in the past. This plugin simply removes them with a regex (I would be remiss if I didn’t link to the famous StackExchange thread about why you should never, ever, parse HTML with a regex). To accomplish this the plugin added a filter which washed the generated category code through its regex.

Unfortunately, the regex is improperly written. In jargon, it’s greedy. This is the expression evaluated:

` title='(.+)'`

If you pass a string with multiple links it’s going to match beginning on the title attribute of the first link and ending on the last quote of the last link, swallowing everything in between. A more properly focused regex would be this:

` title=\"([^"]*)\"`
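Here is a minimal Python sketch of the difference, not the plugin's PHP. The sample markup is an assumption (two category links with double-quoted attributes), and both patterns are written with double quotes to match it.

```python
import re

# Hypothetical category list with two links, as a theme might print it.
html = (
    '<a href="http://my.wordpress.site/category/category1" '
    'title="View all posts in Category1">Category1</a>, '
    '<a href="http://my.wordpress.site/category/category2" '
    'title="View all posts in Category2">Category2</a>'
)

# Greedy: .+ runs to the LAST double quote in the whole string.
greedy = re.sub(r' title="(.+)"', "", html)

# Negated class: [^"]* stops at the first closing quote of each attribute.
focused = re.sub(r' title="([^"]*)"', "", html)

print(greedy)
# <a href="http://my.wordpress.site/category/category1">Category2</a>
print(focused)
# <a href="http://my.wordpress.site/category/category1">Category1</a>, <a href="http://my.wordpress.site/category/category2">Category2</a>
```

The greedy version reproduces the symptom: the first category's URL wrapped around the second category's name, with everything in between gone.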

That’s it. Mystery solved.

Unresolved, however, is the larger problem with the WordPress plugin ecosystem. This plugin was added to the plugin repository in August 2009. It has never been updated since. It has been broken from the very beginning. The author has disappeared. The support forums are moribund. There’s no GitHub repository for me to fork, should I want to continue support, since WordPress in its infinite wisdom uses SVN for everything. Spend some time Googling and you’ll find people talking up this plugin, never realizing the problems inherent. It’s still being downloaded. This may be inexperience (I’m a Moodle veteran and new to WordPress) but I don’t see a good way to get the word out that this plugin has a serious bug. If WordPress allowed you to usurp a plugin then I could push out an updated version so at least you’d get notified in your Dashboard. All I can do is leave a review indicating that it’s broken in 3.5.1 (for this specific use case) and link back to this post.

Not that it matters overmuch in this case since we’re likely to deep-six it here, but the situation feels inadequate. There’s got to be a way to do better.