Using Pantheon Advanced Page Cache in Drupal 7

No more cache clear all for content changes!

Our clients actively update their sites with new and timely content, and delivering fresh pages to their international audiences is crucial to their success.

Because they also need precise control over when content is released, waiting 15 minutes (or longer, depending on configuration) for the cache to expire before visitors see new or updated content is unacceptable. And we don’t want to force editors to clear the entire site cache on every content change. That step is easy to forget until marketing contacts the editors wondering why the CEO’s press release isn’t on the site yet.

Now that Pantheon has a module connecting Drupal to its Global CDN, clearing stale pages from the CDN (also referred to as the edge cache) happens instantly and automatically. We’ve been experimenting with the alpha version of Pantheon Advanced Page Cache with great results.

How it works

For D7 sites, Pantheon’s module depends on Drupal 8 Cache Backport (D8cache for short), a backport of the Drupal 8 cache system, or part of it at least. If you’re on Drupal 8, you only need Pantheon’s module, but this article is about Drupal 7.

D8cache works by attaching cache tags to content, users, entities, taxonomy terms and more, using Drupal hooks to listen for a variety of CRUD operations. The tags look like node:1234, taxonomy_term:5678, views:blog_recent, and so on.

The Pantheon module adds an HTTP header listing the tags to each page it serves. And when D8cache calls for tags to be invalidated by invoking hook_invalidate_cache_tags(), the Pantheon module talks to the CDN to clear the stale pages.
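To make the header idea concrete, here is a minimal sketch of how a list of tags could be flattened into a single header value. It is not the Pantheon module’s code: site_example_surrogate_key_value() is a hypothetical helper, and the space-separated format is our understanding of how surrogate keys are listed.

```php
/**
 * Hypothetical helper: turn a list of cache tags into one header value.
 */
function site_example_surrogate_key_value(array $tags) {
  // Trim whitespace, drop empty entries and remove duplicates.
  $tags = array_unique(array_filter(array_map('trim', $tags), 'strlen'));
  // Keys in the header are assumed to be separated by single spaces.
  return implode(' ', $tags);
}

// Example: a blog post page that also embeds a "recent posts" view.
$value = site_example_surrogate_key_value(['node:1234', 'views:blog_recent', 'node:1234']);
// $value is now "node:1234 views:blog_recent".
```

When any one of those tags is invalidated, every cached page whose header lists it can be dropped from the edge cache.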

In Drupal 7, when you turn on these two modules, which requires a few lines in settings.php, they just work. We found that cached pages were cleared almost instantly from wherever the server floats in the cloud all the way through Pantheon’s Global CDN. (Drupal still takes care of caching in the database.)
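As a rough guide, the settings.php lines look something like the following. The module path is an assumption about where D8cache is installed on your site, and the exact cache bins and class names should be checked against the D8cache README for the version you install.

```php
// Assumption: D8cache lives in sites/all/modules/d8cache. Adjust the path
// if your contrib modules are in a different directory.
$conf['cache_backends'][] = 'sites/all/modules/d8cache/d8cache-ac.cache.inc';

// Let D8cache collect cache tags for these bins (verify the list against
// the module's README before relying on it).
$conf['cache_class_cache_views_data'] = 'D8CacheAttachmentsCollector';
$conf['cache_class_cache_block'] = 'D8CacheAttachmentsCollector';
$conf['cache_class_cache_page'] = 'D8CacheAttachmentsCollector';
```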

When to extend cache clearing

It just works? Well, it just works, except for a few cases that you can quickly handle with some custom code (D8 has the Views Custom Cache Tags and Cache Control Override modules to help).

The special case for our clients is most typically a page with a views list of blog or press release nodes. Other cases might include new products or price changes, status updates or emergency alerts.

Without any help, the stale listing page is cleared from the CDN when an existing post is updated, but not when a post is added, deleted or newly published.

D8cache doesn’t know which content types are used in a view, but it’s easy enough to make it aware with this variation of what D8cache does in its includes/views.inc file. (See all the includes to get an idea of what’s possible for your specific needs.) Below, we add tags based on the node type that look like views:node_type.press_release, where the suffix is the content type’s machine name:

/**
 * Implements hook_views_pre_render().
 */
function site_views_pre_render(&$view) {
  $tags = _site_views_get_cache_tags($view);
  drupal_add_cache_tags($tags);
}

/**
 * Build tags array from allowed content types in view.
 *
 * This only includes types that are checked in the filter. It will miss all
 * of them if none are checked or if the operator is 'not in'.
 *
 * @param object $view
 *   The view being rendered.
 *
 * @return array
 *   Cache tags like 'views:node_type.TYPE'.
 */
function _site_views_get_cache_tags($view) {
  $tags = [];
  if (isset($view->filter['type']->value) && $view->filter['type']->operator == 'in') {
    foreach ($view->filter['type']->value as $type) {
      $tags[] = 'views:node_type.' . $type;
    }
  }
  return $tags;
}

A couple of caveats are noted in the function documentation block. You could modify the targeting in a number of ways to be more or less focused, based on anything in the view object.

Tags are added to an HTTP header called Surrogate-Key. A few tags are obscured to keep the client’s name hidden in this image of the header:

[Screenshot: the Surrogate-Key response header listing the page’s cache tags]

Now that all pages with views are tagged with the node types they use, we can easily clear pages in the edge cache for appropriate views by invalidating the custom view cache tag on node CRUD operations.

/**
 * Implements hook_entity_delete().
 */
function site_entity_delete($entity, $entity_type) {
  site_invalidate_entity_cache_tags($entity, $entity_type);
}

/**
 * Implements hook_entity_insert().
 */
function site_entity_insert($entity, $entity_type) {
  site_invalidate_entity_cache_tags($entity, $entity_type);
}

/**
 * Implements hook_entity_update().
 */
function site_entity_update($entity, $entity_type) {
  site_invalidate_entity_cache_tags($entity, $entity_type);
}

/**
 * Invalidate custom views cache tags by entity type.
 * @param $entity object
 * @param $entity_type string
 */
function site_invalidate_entity_cache_tags($entity, $entity_type) {
  if ($entity_type === 'node') {
    $tags = ['views:node_type.' . $entity->type];
    drupal_invalidate_cache_tags($tags);
  }
}

The hook_entity_update() implementation is needed if your content workflow includes creating unpublished content. A new node that is saved without being published won’t be listed in a view that only shows published content, so the listing page is never tagged with that node’s ID. When the node is later published, only the update hook fires (the insert hook fires once, on node creation), so this hook is what invalidates the listing page. Nodes that are published on creation are handled by the insert hook. From then on the view is tagged with their node IDs, like node:1234 node:1235, so later edits clear the view through those tags and don’t need this hook to clear it by content type.

A second case for this hook occurs when existing published content would newly appear or reappear in a list through manual curation, date change or maybe the addition of a taxonomy term.

Implementing hook_entity_update() this way probably results in some unneeded edge cache clearing, since every node save invalidates every listing tagged with that content type. That’s a small price to pay to let our clients use their desired workflow, but it might be a significant performance hit for some sites.
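If that extra clearing becomes a problem, one option is to skip updates that cannot change a published listing. The sketch below is an assumption about how that might look, not something D8cache or the Pantheon module provides. site_node_update_affects_lists() is a hypothetical helper, and it relies on $node->original, which Drupal 7 sets to the previously stored node while a save is in progress. It only filters out drafts that stay unpublished, so both cases described above still trigger invalidation.

```php
/**
 * Hypothetical helper: does this node update need to clear listing pages?
 *
 * Returns FALSE only when the node was unpublished before the save and is
 * still unpublished after it.
 */
function site_node_update_affects_lists($node) {
  // $node->original holds the stored version during node_save().
  $was_published = isset($node->original) && !empty($node->original->status);
  $is_published = !empty($node->status);
  return $was_published || $is_published;
}

/**
 * Variation of the update hook above that skips draft-only edits.
 */
function site_entity_update($entity, $entity_type) {
  if ($entity_type !== 'node' || site_node_update_affects_lists($entity)) {
    site_invalidate_entity_cache_tags($entity, $entity_type);
  }
}
```

This version would replace the simpler site_entity_update() shown earlier rather than sit alongside it, since PHP does not allow the same function to be declared twice.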

What if you’re not using Views?

You’ll need to add the tags wherever you’re creating your lists. Here’s how we did it with EntityFieldQuery for one site. You could do something similar with db_select() or other database calls.

/**
 * Add tags from custom tags array, for lists that don't use views.
 *
 * @param array $suffixes
 *   Array of tag suffixes for tagging custom lists.
 */
function site_set_site_list_cache_tags($suffixes) {
  // Initialize so an empty $suffixes array doesn't leave $tags undefined.
  $tags = [];
  foreach ($suffixes as $suffix) {
    $tags[] = 'list_node_type:' . $suffix;
  }
  if (!empty($tags)) {
    drupal_add_cache_tags($tags);
  }
}

/**
 * Implements hook_entity_query_alter().
 */
function site_entity_query_alter($query) {
  $conditions = $query->entityConditions;
  // Only tag node queries that are restricted by bundle.
  if (isset($conditions['entity_type']) && $conditions['entity_type']['value'] == 'node' && isset($conditions['bundle'])) {
    $value = $conditions['bundle']['value'];
    $operator = $conditions['bundle']['operator'];
    // With no operator, EntityFieldQuery uses 'IN' for arrays and '=' for
    // single values.
    if (empty($operator)) {
      $operator = is_array($value) ? 'IN' : '=';
    }

    $tags = [];
    if (in_array(strtoupper($operator), ['IN', '='])) {
      // 'IN' takes an array of bundle names; '=' takes a single name.
      foreach ((array) $value as $bundle) {
        $tags[] = trim($bundle);
      }
    }

    if (!empty($tags)) {
      site_set_site_list_cache_tags($tags);
    }
  }
}
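For reference, a list query like the following would be picked up by the alter hook above when it executes. This is only a sketch: the press_release bundle, the sort and the range are placeholder choices, not taken from a real site.

```php
// Build a list of the ten most recent published press releases.
$query = new EntityFieldQuery();
$query->entityCondition('entity_type', 'node')
  // Placeholder bundle name. With no operator given, this is an "=" match.
  ->entityCondition('bundle', 'press_release')
  ->propertyCondition('status', NODE_PUBLISHED)
  ->propertyOrderBy('created', 'DESC')
  ->range(0, 10);

// execute() triggers hook_entity_query_alter(), which adds the
// list_node_type:press_release tag to the page being built.
$result = $query->execute();
$nodes = isset($result['node']) ? node_load_multiple(array_keys($result['node'])) : [];
```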

Then add the new tag to your cache tag invalidation.

/**
 * Invalidate custom views cache tags by entity type.
 * @param $entity object
 * @param $entity_type string
 */
function site_invalidate_entity_cache_tags($entity, $entity_type) {
  if ($entity_type === 'node') {
    $tags = [
      'views:node_type.' . $entity->type,
      // Must match the prefix used in site_set_site_list_cache_tags().
      'list_node_type:' . $entity->type,
    ];
    drupal_invalidate_cache_tags($tags);
  }
}

Conclusion

We’re using D8cache and Pantheon Advanced Page Cache module in production on most of our Pantheon sites. Remember that neither module has a stable release yet. API changes are possible, if not likely, in the future. Update carefully. We’re looking forward to seeing the modules develop.