Beginner Friendly But With TLDRs!

Gatsby + Contentful Rich Text: Migrate to gatsby-source-contentful v4+

Ed Pike
6 min read · Feb 11, 2021

The gatsby-source-contentful plugin changed a lot in late 2020, from version 2 to version 4. One significant breaking change concerned Rich Text fields, especially if you have linked/embedded Contentful content and assets in your Rich Text field. Fixing this breaking change can be tricky, especially if you are fairly new to GraphQL and Gatsby.

Overview

TLDR: The plugin changed from 2.x to 4.x in four months. Rich Text changed from having one “json” field to a combination of a “raw” string field and “references” array.

If you use the Contentful headless CMS with Gatsby, you need the gatsby-source-contentful plugin to gain access to your content. Rendering most of the data types to HTML is straightforward. The Rich Text field is a major exception. Most examples, tutorials, and Gatsby starters refer to an older, simpler version of the plugin.

The gatsby-source-contentful package was stable for nearly 3 years at version 2.x. However, people building large Gatsby sites with nested references in their Rich Text fields started having serious issues while building their pages. See the issues on GitHub here and here.

In response, the plugin implementation for accessing Rich Text changed dramatically to force you to deal with embedded references explicitly.
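To make "dealing with embedded references explicitly" concrete, here is a minimal sketch of the new shape: a "raw" JSON string plus a "references" array. The sample data and the helper name collectEmbeds are made up for illustration. This is not the plugin's own code, and in a real project you would more likely use the rendering helper documented in the plugin read-me.

```javascript
// Illustrative sample of the v4 shape: `raw` is a JSON *string*,
// `references` holds the nodes you selected in your GraphQL query.
const richBody = {
  raw: JSON.stringify({
    nodeType: "document",
    content: [
      { nodeType: "paragraph", data: {}, content: [{ nodeType: "text", value: "Hello" }] },
      {
        nodeType: "embedded-asset-block",
        data: { target: { sys: { id: "asset1", type: "Link", linkType: "Asset" } } },
        content: [],
      },
    ],
  }),
  references: [
    { __typename: "ContentfulAsset", contentful_id: "asset1", title: "Cover" },
  ],
}

// Hypothetical helper: walk the parsed document and pair every embedded
// node with the matching entry in `references` (matched on contentful_id).
function collectEmbeds({ raw, references }) {
  const byId = new Map(references.map(ref => [ref.contentful_id, ref]))
  const found = []
  const visit = node => {
    const target = node.data && node.data.target
    if (target && target.sys) {
      found.push({ nodeType: node.nodeType, reference: byId.get(target.sys.id) || null })
    }
    for (const child of node.content || []) {
      visit(child)
    }
  }
  visit(JSON.parse(raw))
  return found
}

const embeds = collectEmbeds(richBody)
console.log(embeds)
```

The point of the sketch is that the raw document only carries ids. The actual data for an embed exists only if your query asked for it under "references".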

If you look on npm, you can see that it stayed at 2.x for almost 3 years, then jumped through 3.x to 4.6 in 4 months. If they were following standard SemVer, that's 2 sets of breaking changes.

However, the vast majority of tutorials on its use did not get updated and, as of Feb 2021, all of the Gatsby starters that use Contentful are still using the 2.x version. Finally, none of these resources seem eager to cover embedded references. That’s understandable: it’s not a sexy topic and it can quickly lead beginners off the “happy path”.

The documentation for the new Rich Text format is towards the bottom of the plugin read-me (https://www.npmjs.com/package/gatsby-source-contentful). However, it is minimal, especially if you are new to Gatsby and GraphQL. Most importantly, it does not address one serious issue: optional embeds of Contentful content types in your Rich Text field.

If you are rewriting your code from “json” to use “raw” and “references” that means you are not a real beginner so it’s not so tough. This GitHub issue from Nov 16, 2020 was helpful. I will have a blog post for beginners about using Gatsby and GraphQL in the near future!

Bad Optional Embeds: The Symptoms

When you run gatsby develop you will see something like:

success onPreExtractQueries - 0.002s

ERROR #85901 GRAPHQL

There was an error in your GraphQL query:

Fragment cannot be spread here as objects of type "ContentfulAssetContentfulPostContentfulTagUnion" can never be of type "ContentfulPage".

Note for Beginners: My Contentful space has a custom content type of “Page”. The Gatsby schema inferrer always prepends “Contentful” to the content type.

This is the page query that is causing the problem:

// page query from bottom of page...
/*
allContentfulPost is a collection generated by the Gatsby schema inferrer.
It automatically prepends "Contentful" to content types on Contentful CMS.
In this case I have a custom content type of Post.
Further down you will see the "references" field refer to ContentfulAssets (images)
and my custom types of Tag, Post and Page.
*/

export const query = graphql`
  {
    allContentfulPost(
      filter: { contentful_id: { eq: "7tC182oOY4JywGH9PY98Vh" } }
    ) {
      nodes {
        __typename
        contentful_id
        title
        slug
        publishDate
        tags {
          contentful_id
          title
          slug
        }
        richBody {
          raw
          references {
            ... on ContentfulTag {
              __typename
              contentful_id
              id
              title
            }
            ... on ContentfulAsset {
              __typename
              contentful_id
              title
              fixed(width: 1600) {
                width
                height
                src
                srcSet
              }
            }
            ... on ContentfulPost {
              __typename
              contentful_id
              title
              metaDescription {
                metaDescription
              }
            }
            ... on ContentfulPage {
              __typename
              contentful_id
              title
            }
          }
        }
      }
    }
  }
`

I built the query in Gatsby’s GraphiQL app and slapped it into the page query above. Everything worked fine.

The root of the problem is that when I created that page query, I had a Post with a Rich Text field containing embedded references to my Contentful content types Tag, Image, Post and Page. Gatsby inferred the “references” field from those embeds.

Then, I edited that specific Post in Contentful and deleted the embedded reference to another Post. I published, ran gatsby develop, and got the error.

The core problem is that the Gatsby schema inferrer did not see a reference to an embedded post in my Rich Body rich text field, so when it built the schema again, it did not have the definition. However, my page query did not change in concert, so the GraphQL compiler threw an error.
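The mechanism can be pictured with a toy model. Gatsby's real inference is far more involved, and the function names here are hypothetical. The only idea the sketch tries to capture is that the union is built from the data that exists right now, while the page query stays fixed.

```javascript
// Toy model only, NOT Gatsby's implementation.
// Each post lists the types of the entries embedded in its Rich Text field.
const postsBefore = [
  { references: ["ContentfulTag", "ContentfulAsset", "ContentfulPost", "ContentfulPage"] },
]
const postsAfter = [
  // The embedded Post was deleted in Contentful.
  { references: ["ContentfulTag", "ContentfulAsset", "ContentfulPage"] },
]

// The inferred union only contains types that actually occur in the data.
function inferUnionMembers(posts) {
  const seen = new Set()
  for (const post of posts) {
    for (const typeName of post.references) {
      seen.add(typeName)
    }
  }
  return [...seen].sort()
}

// Fragments used by the (unchanged) page query.
const queryFragments = ["ContentfulTag", "ContentfulAsset", "ContentfulPost", "ContentfulPage"]

// A fragment on a type outside the union is what the GraphQL compiler rejects.
function invalidFragments(fragments, unionMembers) {
  return fragments.filter(typeName => !unionMembers.includes(typeName))
}

const invalidBefore = invalidFragments(queryFragments, inferUnionMembers(postsBefore))
const invalidAfter = invalidFragments(queryFragments, inferUnionMembers(postsAfter))
console.log(invalidBefore) // no invalid fragments
console.log(invalidAfter) // the ContentfulPost fragment is now invalid
```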

Because I could never predict what a blogger might put in the Rich Body field, I needed a fix that would allow any combination of embeds in the content. I found a solution, shown below, but you have to edit it every time you add a new Contentful content type that *might* be embedded in the Rich Text field. Contentful and Gatsby are working on a better solution, but in the meantime…

The Fix

TLDR: If you want your content editors to be able to optionally embed Contentful assets and content types in a Rich Text field, you must implement exports.createSchemaCustomization in your gatsby-node.js.

// in gatsby-node.js...
/*
This code addresses the fact that if you **could** have an embedded link (aka union)
from a Rich Text field on one Content Type to one or more Content Types in your Contentful space,
Gatsby's schema inference will fail UNLESS you have at least one embed for each Content Type inside
an instance of the Rich Text field. The code below lets Gatsby's schema inferrer know that these "unions"
COULD happen even if there is not one right now.
*/
exports.createSchemaCustomization = ({ actions }) => {
  const { createTypes } = actions

  const typeDefs = `
    # The union name is arbitrary. However, I'm using the name that would have been generated by the schema inferrer.
    union ContentfulAssetContentfulPageContentfulPostContentfulTagUnion = ContentfulAsset | ContentfulPage | ContentfulPost | ContentfulTag
    # ContentfulPostRichBody does not implement Node
    type ContentfulPostRichBody {
      references: [ContentfulAssetContentfulPageContentfulPostContentfulTagUnion] @link(by: "id", from: "references___NODE")
    }
  `

  createTypes(typeDefs)
}
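Since the type definitions above have to be edited whenever a new embeddable content type appears, one option is to generate the string from a list. The sketch below is only an idea: buildRichTextTypeDefs is a hypothetical helper, not part of Gatsby or gatsby-source-contentful, and it assumes the sorted-and-joined naming pattern seen in the union name above.

```javascript
// Hypothetical helper: builds the same SDL as the hand-written typeDefs
// from a list of type names, so adding a content type is a one-line change.
function buildRichTextTypeDefs(richTextType, memberTypes) {
  const members = [...memberTypes].sort()
  // Assumption: mirror the naming pattern seen in the article (sorted names + "Union").
  const unionName = `${members.join("")}Union`
  return [
    `union ${unionName} = ${members.join(" | ")}`,
    `type ${richTextType} {`,
    `  references: [${unionName}] @link(by: "id", from: "references___NODE")`,
    `}`,
  ].join("\n")
}

const typeDefs = buildRichTextTypeDefs("ContentfulPostRichBody", [
  "ContentfulTag",
  "ContentfulAsset",
  "ContentfulPost",
  "ContentfulPage",
])
console.log(typeDefs)

// In gatsby-node.js the string would then be passed to createTypes:
// exports.createSchemaCustomization = ({ actions }) => actions.createTypes(typeDefs)
```

You still have to remember to update the list, so this only shortens the edit. It does not remove it.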

How I Found The Fix

I’m new to GraphQL, let alone Gatsby and its schema inferrer, so I tried to use first principles. I needed to see what had changed in the schema.

The easiest way was to install the `gatsby-plugin-schema-snapshot` plugin. This saves out the schema that the Gatsby inferrer creates.
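A minimal gatsby-config.js entry might look like the sketch below. The path and update options are assumptions based on the plugin's documented options, so check them against the version you install. Your existing plugins, including gatsby-source-contentful, stay as they are.

```javascript
// gatsby-config.js (sketch)
const config = {
  plugins: [
    // ...your existing plugins, including gatsby-source-contentful
    {
      resolve: "gatsby-plugin-schema-snapshot",
      options: {
        // Assumed option: where the snapshot file is written.
        path: "schema.gql",
        // Assumed option: with false, an existing snapshot file is left alone,
        // so rename or delete it to capture a fresh schema.
        update: false,
      },
    },
  ],
}

module.exports = config
```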

Capture Happy Path Schema

  • I added the embed to a Post on the Contentful Rich Text interface and published.
  • Back in VSCode, I ran gatsby develop. The schema.gql file appeared. I renamed it “default schema.gql”. Because it’s no longer named “schema.gql”, it will be ignored by Gatsby and will not be overwritten by the snapshot code.

Capture Sad Path Schema

  • I deleted the Post embedded in the Rich Text field on Contentful. I published the change.
  • I ran gatsby develop. The fresh schema.gql file appeared.

What Had Changed?

I used VSCode to compare the two schema files.

union ContentfulAssetContentfulPageContentfulPostContentfulTagUnion = ContentfulAsset | ContentfulPage | ContentfulPost | ContentfulTag

changed to

union ContentfulAssetContentfulPageContentfulPostContentfulTagUnion = ContentfulAsset | ContentfulPage | ContentfulTag
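If you would rather not eyeball the diff, the same comparison can be scripted. This is a rough sketch with hypothetical helper names. It assumes each union sits on a single line, as in the two lines above, and in a real project you would read the two snapshot files from disk instead of using inline strings.

```javascript
// Hypothetical helper: pull "union Name = A | B" definitions out of schema text.
function parseUnions(schemaText) {
  const unions = {}
  const pattern = /^union\s+(\w+)\s*=\s*(.+)$/gm
  for (const match of schemaText.matchAll(pattern)) {
    unions[match[1]] = match[2].split("|").map(name => name.trim())
  }
  return unions
}

// Report the member types that disappeared between two snapshots.
function removedUnionMembers(beforeText, afterText) {
  const before = parseUnions(beforeText)
  const after = parseUnions(afterText)
  const removed = {}
  for (const [name, members] of Object.entries(before)) {
    const gone = members.filter(member => !(after[name] || []).includes(member))
    if (gone.length > 0) {
      removed[name] = gone
    }
  }
  return removed
}

const happyPath =
  "union ContentfulAssetContentfulPageContentfulPostContentfulTagUnion = ContentfulAsset | ContentfulPage | ContentfulPost | ContentfulTag"
const sadPath =
  "union ContentfulAssetContentfulPageContentfulPostContentfulTagUnion = ContentfulAsset | ContentfulPage | ContentfulTag"

const removed = removedUnionMembers(happyPath, sadPath)
console.log(removed)
```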

That gave me enough info to make an intelligent Google search, a key pillar of programming :-). That led me to this ongoing GitHub pull request: https://github.com/gatsbyjs/gatsby/pull/12816.

That in turn led me to the Gatsby documentation on schema customization. I had to play around with it for a bit.

One big gotcha is that even when I got the schema to look exactly the same in GraphiQL, the references field was “null”. In other words, it was not pulling the actual embedded fields over from Contentful. The key to that was adding the directive @link(by: "id", from: "references___NODE") (note the three underscores) in the gatsby-node.js code above.
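Here is a toy model of what that directive asks Gatsby to do. It is not Gatsby's implementation, and resolveLink and the sample nodes are invented for illustration. The idea is that the Rich Text field only stores node ids in references___NODE (three underscores, as in the gatsby-node.js code), and the link step swaps each id for the full node.

```javascript
// Toy model of @link(by: "id", from: "references___NODE"), NOT Gatsby's code.
const nodeStore = [
  { id: "node-1", __typename: "ContentfulAsset", title: "Cover image" },
  { id: "node-2", __typename: "ContentfulTag", title: "Gatsby" },
]

// The Rich Text field only stores the ids of the embedded nodes.
const richBodyField = {
  raw: "{}",
  references___NODE: ["node-2", "node-1"],
}

// `from` names the field holding the ids, `by` names the node field to match on.
function resolveLink(source, { by, from }, nodes) {
  const ids = source[from] || []
  return ids.map(id => nodes.find(node => node[by] === id) || null)
}

const linked = resolveLink(richBodyField, { by: "id", from: "references___NODE" }, nodeStore)
console.log(linked.map(node => node.title))

// With a misspelled source field (two underscores) nothing is found.
const unlinked = resolveLink(richBodyField, { by: "id", from: "references__NODE" }, nodeStore)
console.log(unlinked.length)
```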

Well, that’s it! Don’t forget to “watch” gatsby-source-contentful on GitHub. Version 5 is in the works and might create a deeper fix.

PS: This was my first blog post!

