Jen Wachter

Web Developer

Hub API

The Hub API enables authorized users to interact with data from the Hub database, a central repository of news articles, events, announcements, photo galleries, and faculty experts from Johns Hopkins University.

See the API documentation for a detailed usage guide. Keep in mind that the documentation is written for general consumers of the API and therefore does not cover all of the API’s capabilities.

The back end

The Hub API uses headless Drupal to store and manage its content, taxonomies, and files. Each type of content and taxonomy has a corresponding endpoint in the API.

Content-first philosophy

Image processing

For every image uploaded to Drupal, approximately 30 distinct crops are generated, a mix of hard and soft crops at a range of sizes. These variations support responsive image loading across the Hub website.

Normally, Drupal creates these images on demand when they are requested, but the Hub’s files are served from a CDN; therefore, the crop sizes need to be generated and made available in the CDN as soon as the original image is uploaded. To avoid prolonged processing times for users, the generation of the sizes and their subsequent synchronization to cloud storage are sent to a background server queue. Passing this work off to a queue allows the user to continue with their work while the image is processed in the background.
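The hand-off to a queue can be sketched as follows. This is a minimal illustration in Python, not the Hub's implementation: the in-process queue stands in for the background server queue, and the crop style names, generate_crops, and sync_to_cloud are hypothetical placeholders.

```python
import queue
import threading

# Hypothetical crop styles; the real system generates roughly 30.
CROP_STYLES = ["hard_320", "hard_640", "soft_320", "soft_640"]

jobs = queue.Queue()  # stands in for the background server queue
synced = []           # stands in for cloud storage


def generate_crops(image_path):
    """Placeholder: return the file names a real resizer would produce."""
    stem, _, ext = image_path.rpartition(".")
    return [f"{stem}.{style}.{ext}" for style in CROP_STYLES]


def sync_to_cloud(paths):
    """Placeholder: record the files instead of uploading them."""
    synced.extend(paths)


def worker():
    """Process queued images until a None sentinel arrives."""
    while True:
        image_path = jobs.get()
        if image_path is None:
            jobs.task_done()
            break
        sync_to_cloud(generate_crops(image_path))
        jobs.task_done()


def handle_upload(image_path):
    """Enqueue the heavy work and return to the user immediately."""
    jobs.put(image_path)
    return "upload accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()
status = handle_upload("campus.jpg")
jobs.put(None)  # tell the worker to stop after the pending job
jobs.join()     # wait only so this example can inspect the result
```

The upload handler returns as soon as the job is queued; the crops and the sync happen on the worker thread.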

The front end

The API front end is a custom-built Model-View-Controller (MVC) application that uses Slim as the underlying framework. Authentication, authorization, and validation are handled via application middleware.
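As a rough sketch of how such a middleware chain fits together, the Python below wraps a controller in three checks. It is illustrative only: the real application is built on Slim, and the key store, the per_page parameter, and every function name here are assumptions.

```python
# Hypothetical key store: API key -> set of permitted HTTP methods.
API_KEYS = {"abc123": {"GET"}}


def authenticate(request, next_handler):
    """Reject requests that do not carry a known API key."""
    if request.get("key") not in API_KEYS:
        return {"status": 401, "body": "unknown API key"}
    return next_handler(request)


def authorize(request, next_handler):
    """Reject methods the key is not permitted to use."""
    if request.get("method") not in API_KEYS[request["key"]]:
        return {"status": 403, "body": "method not permitted"}
    return next_handler(request)


def validate(request, next_handler):
    """Reject malformed parameters before they reach the controller."""
    per_page = request.get("params", {}).get("per_page", "10")
    if not str(per_page).isdigit():
        return {"status": 400, "body": "per_page must be an integer"}
    return next_handler(request)


def controller(request):
    """Placeholder controller, reached only when every check passes."""
    return {"status": 200, "body": "articles"}


def build_app(handler, middleware):
    """Wrap the handler so the first middleware listed runs first."""
    app = handler
    for layer in reversed(middleware):
        # Bind layer and app as defaults so each lambda keeps its own pair.
        app = (lambda req, layer=layer, inner=app: layer(req, inner))
    return app


app = build_app(controller, [authenticate, authorize, validate])
```

Because each check can return early, the controller never has to handle an unauthenticated or malformed request.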

Abstraction

Within the application, interactions with Drupal are abstracted away into a CMS/database adapter, which keeps CMS-dependent logic separate from the application layer. As a result, neither a CMS upgrade nor a migration to a completely different CMS requires changes in the core application; only the adapter needs to be modified.
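The adapter idea can be illustrated with a small Python sketch. The interface, class names, and field names are all hypothetical and are not taken from the Hub's code; the point is only that the controller depends on the interface rather than on a specific CMS.

```python
from typing import Protocol


class ContentAdapter(Protocol):
    """The only interface the application layer depends on."""

    def find_article(self, article_id: int) -> dict: ...


class DrupalAdapter:
    """Hypothetical adapter translating Drupal-shaped records."""

    def __init__(self, nodes):
        self._nodes = nodes  # stands in for the Drupal database

    def find_article(self, article_id: int) -> dict:
        node = self._nodes[article_id]
        return {"id": node["nid"], "headline": node["title"]}


class OtherCmsAdapter:
    """A different CMS with different field names, same interface."""

    def __init__(self, entries):
        self._entries = entries

    def find_article(self, article_id: int) -> dict:
        entry = self._entries[article_id]
        return {"id": entry["uid"], "headline": entry["name"]}


def article_controller(adapter: ContentAdapter, article_id: int) -> dict:
    """Application code: unaware of which CMS sits behind the adapter."""
    return {"status": 200, "data": adapter.find_article(article_id)}


drupal = DrupalAdapter({7: {"nid": 7, "title": "Campus news"}})
other = OtherCmsAdapter({7: {"uid": 7, "name": "Campus news"}})
```

Swapping drupal for other changes nothing in article_controller, which is the property the adapter is meant to provide.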

Authentication and authorization

Affiliates of Johns Hopkins University can request access to the API, which is granted in the form of an API key. An API key does not open the entirety of the API to the user; instead, each key represents a set of privileges that are evaluated when requests are made to the API. For example, most API keys are given GET request access to content endpoints with limited response data.
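One way to picture key-based privileges is the Python sketch below. The privilege model shown (permitted methods, endpoints, and response fields) is an assumption made for illustration; the source only states that each key represents a set of privileges and that most keys get GET access with limited response data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privileges:
    """What one API key may do. All names here are illustrative."""
    methods: frozenset = frozenset({"GET"})
    endpoints: frozenset = frozenset({"articles", "events"})
    fields: frozenset = frozenset({"id", "headline"})


# Hypothetical keys: a typical read-only key and a broader one.
KEYS = {
    "basic-key": Privileges(),
    "editor-key": Privileges(
        methods=frozenset({"GET", "POST"}),
        fields=frozenset({"id", "headline", "body", "author"}),
    ),
}

# Sample record standing in for content loaded from the CMS.
ARTICLE = {"id": 1, "headline": "Hello", "body": "...", "author": "Staff"}


def handle(key, method, endpoint):
    """Evaluate the key's privileges, then trim the response to match."""
    privileges = KEYS.get(key)
    if privileges is None:
        return {"status": 401}
    if method not in privileges.methods or endpoint not in privileges.endpoints:
        return {"status": 403}
    data = {k: v for k, v in ARTICLE.items() if k in privileges.fields}
    return {"status": 200, "data": data}
```

Here the same endpoint returns fewer fields for basic-key than for editor-key, mirroring the "limited response data" described above.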

Caching strategy

To ensure quick response times, the API has two layers of cache:

  • an outer layer that uses an external CDN
  • an inner layer within the origin web servers

The outer layer is unsophisticated: it relies on the URI to create the cache key, and because a user-specific API key is required in the query string of each request, each API user’s requests are cached individually. Even if two users make requests that are identical apart from the API key, the responses are cached separately in the CDN.

The inner layer is more sophisticated, as it can use private data, such as a user’s privilege level, to group a larger number of requests under the same cache key. Identical requests made by users of the same privilege level share a cache key, making up for some of the shortcomings of the outer cache layer.
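The difference between the two layers comes down to how the cache key is derived. The Python sketch below shows one plausible derivation; the query parameter name key, the privilege level names, and the key format are assumptions, not details of the actual CDN or origin configuration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical mapping of API keys to privilege levels.
PRIVILEGE_LEVELS = {"key-a": "basic", "key-b": "basic", "key-c": "editor"}


def outer_cache_key(uri):
    """CDN-style key: the full URI, API key included."""
    return uri


def inner_cache_key(uri):
    """Origin-style key: swap the API key for its privilege level."""
    parts = urlsplit(uri)
    params = dict(parse_qsl(parts.query))
    level = PRIVILEGE_LEVELS[params.pop("key")]
    # Sort the remaining parameters so their order does not split the cache.
    query = urlencode(sorted(params.items()))
    return f"{level}:{parts.path}?{query}"


a = "/articles?key=key-a&page=2"
b = "/articles?key=key-b&page=2"
c = "/articles?key=key-c&page=2"
```

Requests a and b differ only in their API key, so they get separate outer keys but the same inner key; request c belongs to a different privilege level and gets its own inner key.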

Evolution over the years

  • 2013: Hub API launches. With Drupal 7 as the content management system, article, file, and taxonomy endpoints are made available in the API, and basic documentation is provided to users. Articles are fairly basic at this point, with images being the only type of content that can be embedded within them.
  • 2014: Campus announcements transition to the Hub API. Previously, campus announcements were distributed to faculty, staff, and students through a manually generated daily email. As part of this launch, not only are announcements now available via the API and Hub website, but the creation of the daily email moves to an automated process that generates the email using API data and schedules it for delivery using the MailChimp API.
  • 2015: Embeddable photo galleries launch. Photo galleries can be created and embedded within the body of an announcement or article.
  • 2016
    • Events launch. Stakeholders from around the university are invited to submit their events for inclusion in the Hub event calendar. Divisions from across the university begin using the Hub calendar to promote their events.
    • Additional embeddable content is made available. Editors can now embed videos, teasers, pull quotes, section breaks, and drop-caps within content.
  • 2017: CDN integration. All traffic to the API is routed through a CDN (Akamai) with caching rules designed to make the API as fast and efficient as possible. Files are stored and served from cloud storage instead of the API’s origin servers.
  • 2018: Refactor of API source code and unit tests. Using code profiling reports, bloat and inefficiencies are identified and removed, resulting in an approximately 43% reduction in time cost across sampled endpoints.
  • 2019: Recurring events launch. Previously, if an event had multiple instances, a separate event record needed to be created for each instance. With the launch of recurring events, these instances can now be created within the same event record.
  • 2020: Faculty profiles launch. Faculty profiles are included in the API to power the Faculty Experts Guide.
  • 2021: Stand-alone photo galleries launch. Photo galleries no longer need to be part of another piece of content; they can stand on their own.
  • 2022: Migration to Drupal 9. The API front end, CMS back end, and 60K content nodes are migrated from Drupal 7 to Drupal 9 with zero front-end downtime.