FullStack Labs

Please Upgrade Your Browser.

Unfortunately, Internet Explorer is an outdated browser and we do not currently support it. To have the best browsing experience, please upgrade to Microsoft Edge, Google Chrome or Safari.
Upgrade

Optimizing GraphQL N+1 query problems in Ruby on Rails

How to fix common scenarios where N+1 query problems arise using graphql-batch

Written by 
Felipe Moyano
,
Ruby on Rails Developer
Optimizing GraphQL N+1 query problems in Ruby on Rails
blog post background
Jira Time Tracking
Must-Have Productivity Tools for Working from Home
Hybrid Business Analyst / Project Manager Role in Agile Software-Development Projects
Node.js Developer Salaries
Comparing Hourly Rates Between Software Consultancies
React Native App Of The Week: Honey

Here at FullStack Labs, we use GraphQL, an amazingly powerful tool, to establish a common language between teams of back-end developers and front-end developers. It allows teams to focus on what the underlying domain model is instead of how to expose it to clients.

Front end applications can now declare what data they need, and in what shape, and trust that the server will deliver this data and avoid waterfall scenarios where multiple queries are needed to render a single screen. 

Hitting the N+1 Query Problem

Seems like a win-win scenario, right? And it is, mostly. However, this approach shifts some of the complexity of gathering data back to the server side and, if we’re not careful, we might end up firing several queries for each fetched by the client. We have found during our Ruby on Rails projects that this is particularly common because queries are executed lazily.

Setup

For our purposes,we’ll assume that you’re familiar with Ruby on Rails, have a basic understanding of GraphQL, and have set up the graphql-ruby gem on your application. 

We’ll be using Shopify’s graphql-batch to load records efficiently. Let’s start by adding the following line to your gemfile:

-- CODE language-shell keep-markup --
gem 'graphql-batch'

Now add the line use GraphQL::Batch to your AppSchema.rb:

	
class BlogSchema < GraphQL::Schema
  query QueryType
  mutation MutationType

  use GraphQL::Batch
end
	

We’re now ready to start going through some common scenarios where you might encounter an N+1 problem, and we’ll go through how to optimize it.

Loading a belongs_to association

Let’s consider a typical Blog application:

-- CODE language-jsx keep-markup --
class
Post < ApplicationRecord
  belongs_to :author
end

-- CODE language-jsx keep-markup --
class
Author < ApplicationRecord
  has_many :posts
end

Now imagine that our blog has an index page where we show all the posts with their respective authors, and our client is fetching this data with the following Query:

	
  posts{
    id
    title
    description
    author {
      id
      name
    }
  }
	

This will result in the following queries being performed, before we apply any optimization:

-- CODE language-jsx keep-markup --
SELECT
"posts".* FROM "posts"
SELECT
"authors".* FROM "authors" WHERE "authors"."id" = $1
SELECT
"authors".* FROM "authors" WHERE "authors"."id" = $1
SELECT
"authors".* FROM "authors" WHERE "authors"."id" = $1
SELECT
"authors".* FROM "authors" WHERE "authors"."id" = $1

We can clearly see that for every Post loaded, there’s an additional query being performed to retrieve its author, thus raising an N+1 Query Problem. We can solve this with the help of graphql-batch by modifying the PostType class:

	
module Types
 class PostType < Types::BaseObject
   field :id, ID
   field :title, String
   field :description, String
   field :content, String
   field :author, AuthorType

   def author
     RecordLoader.for(Author).(object.author_id)
   end
 end
end
	

We’ll also have to include a new RecordLoader class to our application, which will handle loading records lazily:

If we run the previous query one more time, we’ll see the output changed to:

-- CODE language-jsx keep-markup --
SELECT "posts".* FROM "posts"
SELECT
"authors".* FROM "authors" WHERE "authors"."id" IN ($1, $2, $3)

Loading a has_many association

Now let’s consider the reverse scenario, where we might have a page that displays  the different authors whose children are all the posts each has written:

	
authors {
  id
  name
  posts {
    id
    title
    description
  }
}
	

As in the previous example, this would cause an additional Posts query for each author loaded. Luckily, we can fix it using an AssociationLoader:

	
class AuthorType < Types::BaseObject
   field :id, ID, null: false
   field :name, String, null: false
   field :posts, [PostType], null: false

   def posts
     AssociationLoader.for(object.class, :posts).load(object)
   end
 end
	

Finally let’s include the AssociationLoader class in our project:

	
class AssociationLoader < GraphQL::Batch::Loader
 def self.validate(model, association_name)
   new(model, association_name)
   nil
 end

 def initialize(model, association_name)
   @model = model
   @association_name = association_name
   validate
 end

 def tload(record)
   unless record.is_a?(@model)
     raise TypeError,
           "#{@model} loader can't load association for #{record.class}"
   end
   return Promise.resolve(read_association(record)) if association_loaded?(record)

   super
 end

 # We want to load the associations on all records, even if they have the same id
 def cache_key(record)
   record.object_id
 end

 def perform(records)
   preload_association(records)
   records.each { |record| fulfill(record, read_association(record)) }
 end

 private

 def validate
   raise ArgumentError, "No association #{@association_name} on #{@model}" unless @model.reflect_on_association(@association_name)
 end

 def preload_association(records)
   ::ActiveRecord::Associations::Preloader.new.preload(
     records,
     @association_name
   )
 end

 def read_association(record)
   record.public_send(@association_name)
 end

 def association_loaded?(record)
   record.association(@association_name).loaded?
 end
end
	

And now we can see that our posts are being batched together:

-- CODE language-jsx keep-markup --
SELECT "authors".* FROM "authors"
SELECT
"posts".* FROM "posts" WHERE "posts"."author_id" IN ($1, $2, $3)  [["author_id", 1], ["author_id", 2], ["author_id", 3]]

Loading a has_many through association

Let’s enrich our Blog application by allowing Posts to have multiple tags. We’d model that relationship the following way:

	
class Post < ApplicationRecord
 belongs_to :author
 has_many :taggings
 has_many :tags, through: :taggings
end
	

The interesting part about this scenario is that we don’t actually care to expose the Taggings part of the relationship to our clients, since all they care about is the actual Tags. The Tagging is just an implementation detail, but we still somehow need to go through it to get the data the client is asking for. Rails makes it extremely easy to achieve this, but we end up with yet another N+1 scenario:

	
SELECT "posts".* FROM "posts"
SELECT "tags".* FROM "tags" INNER JOIN "taggings" ON "tags"."id" = "taggings"."category_id" WHERE "taggings"."post_id" = $1  [["post_id", 1]]
SELECT "tags".* FROM "tags" INNER JOIN "taggings" ON "tags"."id" = "taggings"."category_id" WHERE "taggings"."post_id" = $1  [["post_id", 3]]
SELECT "tags".* FROM "tags" INNER JOIN "taggings" ON "tags"."id" = "taggings"."category_id" WHERE "taggings"."post_id" = $1  [["post_id", 4]]
SELECT "tags".* FROM "tags" INNER JOIN "taggings" ON "tags"."id" = "taggings"."category_id" WHERE "taggings"."post_id" = $1  [["post_id", 2]]
	

We can see that Rails is smart enough to do the join between our tables, but we can still do even better and batch them all together by using (once again) the association loader:

	
class PostType < Types::BaseObject
 # no need to expose the taggings
 field :tags, [TagType]

 def tags
   AssociationLoader.for(object.class, :tags).load(object)
 end
end
	
	
 SELECT "posts".* FROM "posts"
 SELECT "taggings".* FROM "taggings" WHERE "taggings"."post_id" IN ($1, $2, $3, $4)  [["post_id", 1], ["post_id", 3], ["post_id", 4], ["post_id", 2]]
 SELECT "categories".* FROM "categories" WHERE "categories"."id" IN ($1, $2)  [["id", 2], ["id", 1]]
	

Loading dependent associations

Sometimes you’ll arrive at a scenario where you need to load a record that depends on the result of a previous query. Luckily, graphql-batch provides us with a mechanism to resolve lazy loading operations via a Promise API. 

Continuing with our Blog application example, let's assume that Authors have an Image associated with their accounts and that we expose a query that returns said image given a post id.

Using our RecordLoader class from previous examples, we can chain together the Post and Image queries in a way that both will be executed lazily:

	
def author_image(post_id:)
  RecordLoader.for(Post).load(post_id).then do |post|
    RecordLoader.for(Image).load(post.image_id)
  end
end
	

Summary

We’ve addressed the most common scenarios where you’d run into N+1 queries on a GraphQL endpoint in Rails. These examples demonstrate how batching is a powerful tool to address these types of problems and the possibilities that custom loaders uncover, such as having loaders for ActiveStorage attachments, Cache loaders that avoid hitting the database and much more.

A word of caution

When addressing performance issues and optimizations it’s good to remember the trade-offs and avoid early optimizations. These techniques, while powerful, do introduce complexity into your code base. To properly optimize, always measure things before and after introducing new techniques to make sure that you’re on the right track.

Felipe Moyano
Written by
Felipe Moyano
Felipe Moyano

As a Senior Software Engineer at FullStack Labs, I focus on creating top-of-the-line web applications and delivering value to clients. I have more than 7 years of experience working with frameworks like Ruby on Rails and React Native to deliver high-quality products for both enterprise clients and consumers. I love working on products that have a positive impact on people and have experience managing software projects, teams and products. Prior to FullStack Labs I was a co-founder and CTO at Savy Co., a fitness and wellness Startup based in Bogotá. I have a BS degree in Systems and Computing Engineering from the Universidad de los Andes at Bogotá.

FullStack Labs Icon

Let's Talk!

We’d love to learn more about your project. Contact us below for a free consultation with our CEO.
Projects start at $50,000.

company name
name
email
phone
Type of project
Reason for contact
How did you hear about us?
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.