To Federate or Stitch a GraphQL gateway, revisited

A closer look at Apollo Federation versus the newly renovated Schema Stitching

Vox Media’s Chorus publishing platform is—as the name suggests—a chorus of applications working together to publish modern media. While building the new Chorus GraphQL API, we looked at gateway tools that could unite these services into a single graph.

Schema Stitching (a component of GraphQL Tools) got a bad rap when it was famously abandoned by Apollo in favor of their Federation architecture some years back. However, Stitching came under stewardship of The Guild and friends in 2020, and they’ve since overhauled it with numerous automations and performance enhancements. Seemingly out of nowhere, Schema Stitching has reemerged as something of a nimble hummingbird racing alongside the stallion that is Apollo Federation.

Let’s compare the two systems using a small application. Say we’re building a publishing platform with two services: one manages written content, while the other manages uploaded images.

In the content service, we can compose Entry objects that may contain placed Image objects from the images service; each placed image is represented by an EntryImage that controls the crop of the placement.

Apollo Federation

Federation leans heavily into a declarative SDL (schema definition language) for its operations. The Entry/Image model described above would look something like this:

With federation, Content service and Image service have interwoven schemas. The Image service resolves some fields using data from the Content service. These connections are defined using SDL annotations.
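As a rough sketch of what such annotated schemas might look like, here are two service schemas written as TypeScript string constants. Only the Entry, Image, and EntryImage types, the imageId foreign key, and the Federation directives come from the discussion here; the remaining field names (title, url, croppedUrl) and the exact layout are assumptions made for illustration.

```typescript
// Hypothetical Federation SDL for the Content service (illustrative field names).
const contentTypeDefs: string = `
  type Entry @key(fields: "id") {
    id: ID!
    title: String!
    images: [EntryImage!]!
  }

  # EntryImage has no primary key of its own, so the foreign key stands in.
  type EntryImage @key(fields: "imageId") {
    imageId: ID!
    crop: String
  }

  extend type Query {
    entry(id: ID!): Entry
  }
`;

// Hypothetical Federation SDL for the Images service.
const imagesTypeDefs: string = `
  type Image @key(fields: "id") {
    id: ID!
    url: String!
  }

  # Extends the type that originates in the Content service.
  extend type EntryImage @key(fields: "imageId") {
    imageId: ID! @external
    crop: String @external
    image: Image
    croppedUrl: String @requires(fields: "crop")
  }
`;

console.log(contentTypeDefs, imagesTypeDefs);
```

Note that the Images schema refers to fields it does not own (imageId and crop), which is why it only makes sense once combined with the Content schema in a gateway.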

Federation encourages schemas to be organized by concern. As such, Federation posits that EntryImage should exist in both the Content service and the Images service because it has overlapping concerns with both. We can say that EntryImage originates in the Content service, and then is extended (using the extend keyword) in the Images service with additional fields. To scaffold this federated architecture, we use the Apollo federation and gateway packages to build dedicated servers for each component.

The @apollo/federation package configures sub-services while @apollo/gateway hooks them all together.

There are some quick wins to be had with Apollo’s prebuilt Node tools. The federation package prepares each sub-service schema while the gateway package hooks them all together. The underlying protocols that exchange data between servers are automatically configured, and the setup code is pretty concise.

However, nuances appear as you dig into the federated architecture. By design, Federation services are not autonomous; they contain fields that cannot be resolved outside of the combined gateway context. Federation builds around cross-service type hierarchies and field-level dependencies, patterns that are unintuitive and tricky to support in non-Node programming languages without the help of Apollo tools. For a system that encourages separation of concerns, the overall service architecture is remarkably codependent.

Schema Stitching

Stitching’s new type merging pattern offers a totally different strategy for building a distributed graph, and it starts with self-contained sub-service schemas composed using plain GraphQL syntax:

With Stitching’s new type merging pattern, Content service and Image service are two self-contained schemas without any awareness of each other. Types with the same name are merged together in the gateway.
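For comparison, here is a sketch of how the same model might be written as two self-contained schemas in plain GraphQL, again as TypeScript string constants. The field names and the `images(ids:)` query are assumptions for illustration; the point is only that neither schema uses directives or references the other service.

```typescript
// Hypothetical plain-GraphQL schema for the Content service.
const stitchedContentTypeDefs: string = `
  type Entry {
    id: ID!
    title: String!
    images: [EntryImage!]!
  }

  type EntryImage {
    crop: String
    image: Image!
  }

  # A partial Image: this service only knows the image's id.
  type Image {
    id: ID!
  }

  type Query {
    entry(id: ID!): Entry
  }
`;

// Hypothetical plain-GraphQL schema for the Images service.
const stitchedImagesTypeDefs: string = `
  type Image {
    id: ID!
    url: String!
  }

  type Query {
    images(ids: [ID!]!): [Image]!
  }
`;

console.log(stitchedContentTypeDefs, stitchedImagesTypeDefs);
```

Both schemas define a type named Image, each with the fields it can supply. Those two partial definitions are what the gateway would merge into one type.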

Type merging encourages sub-service schemas to be independently valid. A type may exist in any number of services, and each service simply defines as many (or as few) fields for the type as it has available. Types with the same name are then merged into one unified type in the gateway schema. The only requirement is that each partial type containing unique fields must register a query for itself with the gateway (or provide some kind of “_entities” style abstract service):

Merged types each register a query with the gateway so it knows how to fetch each service’s version of the type. This query configuration may be omitted for types that contain no unique fields.
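A registration like this could be sketched as a small configuration object. The interface below is a local stand-in, loosely modeled on the merge options in GraphQL Tools; apart from selectionSet, which is named in this article, the option names (fieldName, key, argsFromKeys) and the `images(ids:)` query are assumptions, so check the library documentation before relying on them.

```typescript
// Local stand-in type (assumed shape, not imported from any library).
interface MergeConfigSketch {
  // Fields the gateway must collect to identify an object of this type.
  selectionSet: string;
  // Root query field that fetches this service's version of the type.
  fieldName: string;
  // Extracts a key from a partial object.
  key: (obj: { id: string }) => string;
  // Builds the arguments for one batched query from a list of keys.
  argsFromKeys: (keys: string[]) => { ids: string[] };
}

// Images service: registers `images(ids:)` as the query for its Image type.
const imageMergeConfig: MergeConfigSketch = {
  selectionSet: "{ id }",
  fieldName: "images",
  key: (obj) => obj.id,
  argsFromKeys: (keys) => ({ ids: keys }),
};

// How a gateway might turn partial objects into arguments for that query.
const partialImages = [{ id: "7" }, { id: "9" }];
const batchedArgs = imageMergeConfig.argsFromKeys(
  partialImages.map(imageMergeConfig.key)
);

console.log(imageMergeConfig.fieldName, batchedArgs);
```

The Content service's partial Image contains only an id and no unique fields, so under the rule above it would not need a configuration of its own.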

Once the gateway knows how to query for each service-specific type, it can smartly delegate portions of a request to each relevant service in dependency order, and then merge all results for the final API response. Thanks to the recent addition of query batching, all partial types are efficiently fetched with a single query per service (greatly improving upon Stitching’s old networking overhead).
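The delegate-then-merge flow can be illustrated with a toy model in plain TypeScript. This is not how the gateway is implemented; the service function, URLs, and data are made up, and the sketch only shows the idea of collecting keys, fetching them in one request, and merging the results onto partial objects.

```typescript
type ImageRecord = { id: string; url: string };
type Placement = { crop: string; image: { id: string; url?: string } };

// Stand-in for the Images service; counts how many requests it receives.
let imageServiceCalls = 0;
function fetchImagesByIds(ids: string[]): ImageRecord[] {
  imageServiceCalls += 1;
  return ids.map((id) => ({ id: id, url: "https://img.example/" + id + ".jpg" }));
}

// Stand-in for a Content service response: partial Images carry only an id.
const placements: Placement[] = [
  { crop: "square", image: { id: "7" } },
  { crop: "wide", image: { id: "9" } },
  { crop: "tall", image: { id: "7" } },
];

function mergeImages(items: Placement[]): Placement[] {
  // Collect each image id once, preserving first-seen order.
  const ids: string[] = [];
  for (const item of items) {
    if (ids.indexOf(item.image.id) === -1) ids.push(item.image.id);
  }
  // One batched request covers every partial Image in the result.
  const byId: Record<string, ImageRecord | undefined> = {};
  for (const img of fetchImagesByIds(ids)) byId[img.id] = img;
  // Merge the fetched fields onto each partial object.
  return items.map((item) => {
    const full = byId[item.image.id];
    return full ? { crop: item.crop, image: { id: full.id, url: full.url } } : item;
  });
}

const mergedPlacements = mergeImages(placements);
console.log(mergedPlacements, imageServiceCalls);
```

Three placements reference two distinct images, yet the stand-in Images service is called only once.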

While you have to write your own service APIs for Stitching, the bare-metal nature of this configuration is quite extensible. A deliberate omission here is the extend keyword. In fact, type merging makes extend feel like something of a relic. Merge patterns are considerably more flexible than type hierarchies, they use plain GraphQL, they keep services encapsulated, and they combine schemas into one decentralized type graph (versus Federation’s nuanced distinctions around origin/extended types). Overall, merging proves to be a surprisingly extensible and intuitive strategy.

Side-by-side features

Service design

Each system has a unique strategy for combining sub-services:

  • Federation services are aware of each other’s data while the gateway is a generic agent that combines them. The gateway configures itself by reading SDLs from each service, and may be reloaded on the fly with new SDLs.
  • Stitching services remain unaware of each other while the gateway loads and combines their schemas. Recent development has added SDL annotations that allow stitched schemas to also be reloaded on the fly.

Object keys

For type-level keys, Federation’s @key directive and Stitching’s selectionSet config are relatively comparable.

Federation’s @key directive is odd because it’s supposed to refer to a type’s primary key in its origin (non-extended) service. When a type doesn’t have a primary key (such as EntryImage above), the key becomes a nonsensical directive whose sole purpose is to turn the type into a federated entity. By comparison, Stitching’s selectionSet simply collects the fields from an object that are needed to query for subsequent parts of it.

Also noteworthy is that Federation expects objects to include unsightly foreign keys in their schema (such as imageId above) for internal reference purposes. Stitching cleverly hides foreign keys as types, so Image.id (a typed object with a primary key) acts as both a public field and an internal implementation reference.

Field-level dependencies

Federation’s @requires directive describes a classic dependency pattern: this field requires data from other services to be fulfilled. However, these dependent fields are inoperable outside of the combined gateway schema. While Stitching generally discourages tightly-coupled services, it does offer an analogous computed fields feature with the express intent of supporting Federation service interactions.

Ironically, Stitching’s computed fields are slightly more robust than their Federation counterpart: Stitching may compute dependencies from anywhere in the type graph, versus Federation’s constraint of only requiring fields from a type’s origin service.
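To make the dependency pattern concrete, here is a toy model in plain TypeScript. It does not use either library. The croppedUrl field, the service functions, and the URLs are hypothetical, and the sketch only shows why a dependent field works through a gateway but not when its service is called on its own.

```typescript
// What the gateway sends to the Images service: a key plus any required fields.
type EntryImageRepresentation = { imageId: string; crop?: string };

// Images service resolver for a dependent field (hypothetical `croppedUrl`).
// It requires `crop`, which only the Content service can supply.
function resolveCroppedUrl(rep: EntryImageRepresentation): string {
  if (rep.crop === undefined) {
    throw new Error("croppedUrl requires `crop` from the Content service");
  }
  return "https://img.example/" + rep.imageId + "?crop=" + rep.crop;
}

// Content service: the origin of EntryImage and its crop.
function fetchEntryImage(imageId: string): EntryImageRepresentation {
  return { imageId: imageId, crop: "square" };
}

// Gateway: fetch the required fields first, then call the dependent service.
function gatewayResolve(imageId: string): string {
  const rep = fetchEntryImage(imageId);
  return resolveCroppedUrl(rep);
}

console.log(gatewayResolve("7"));

// Called directly, outside the gateway, the field cannot be fulfilled.
let standaloneError = "";
try {
  resolveCroppedUrl({ imageId: "7" });
} catch (err) {
  standaloneError = (err as Error).message;
}
console.log(standaloneError);
```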

Field hints

Among the least ergonomic aspects of Federation are its @external and @provides directives, which indicate how fields may interact in subtle ways. For example, @external denotes a local field whose data comes from another service; this ghost field is mainly provided for validation purposes, yet is left in the runtime schema despite not being inherently consistent. Stitching makes everything implicit—services provide only the fields they can fulfill, and the gateway always selects as many fields as possible from as few services as possible, eliminating the need for @provides-style hints.
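The idea of selecting as many fields as possible from as few services as possible can be sketched as a simple greedy plan. This is a simplified heuristic written for illustration, not the planner used by GraphQL Tools, and the service names and field lists are made up.

```typescript
// Which fields each service can fulfill for one merged type (hypothetical data).
const fieldsByService: Record<string, string[]> = {
  content: ["id", "crop"],
  images: ["id", "url", "width"],
};

// Greedy plan: repeatedly pick the service that covers the most remaining fields.
function planFields(
  requested: string[],
  services: Record<string, string[]>
): Record<string, string[]> {
  const plan: Record<string, string[]> = {};
  let remaining = requested.slice();
  while (remaining.length > 0) {
    let bestService = "";
    let bestFields: string[] = [];
    for (const name of Object.keys(services)) {
      const available = services[name] || [];
      const covered = remaining.filter((field) => available.indexOf(field) !== -1);
      if (covered.length > bestFields.length) {
        bestService = name;
        bestFields = covered;
      }
    }
    if (bestFields.length === 0) {
      throw new Error("No service provides: " + remaining.join(", "));
    }
    plan[bestService] = bestFields;
    const taken = bestFields;
    remaining = remaining.filter((field) => taken.indexOf(field) === -1);
  }
  return plan;
}

const fieldPlan = planFields(["id", "url", "width", "crop"], fieldsByService);
console.log(fieldPlan);
```

Here the shared id field is fetched from the images service along with url and width, leaving only crop to be requested from the content service. No hint in either schema was needed to reach that plan.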

Unique features

  • The Apollo ecosystem of hosted tools is a clear advantage for Federation. There’s Apollo Studio for schema registry and managed deployments, and Apollo Data Graph for full enterprise solutions.
  • Schema Stitching supports subscriptions, and is generally more customizable. Cross-service interfaces and proxyable scalars allow for flexible gateway composition, and classic schema extensions are still fully supported for one-off needs. There’s also a suite of schema transformations for integrating third-party schemas.

Conclusion

While Schema Stitching fell into deprecation limbo for a few years, its new maintainers have brought it back in a big way, and it feels like it’s breaking new ground into a post-extend era of GraphQL gateway design. Our team really appreciates that Stitching follows standard GraphQL syntax and best practices, while Federation expects engineers to become specialists with Federation. Unless you’re specifically in the market for an enterprise solution, consider giving Stitching a fresh look. You might not need Federation.