Defending Against GraphQL Attacks: A Deep Dive into Common Vulnerabilities

This article is an in-depth look at the most common GraphQL vulnerabilities, why they occur, and how they can be mitigated.

Mon 21 October 2024

GraphQL has revolutionized API development with its flexible and efficient approach to data querying. However, like any other technology, it has its own set of security challenges.

This article reflects our experience at Ostorlab in automating the detection and testing of GraphQL vulnerabilities.

We will take an in-depth look at the most common GraphQL vulnerabilities, why they occur, and how they can be mitigated.

What is GraphQL?

GraphQL is a query language for APIs and a runtime for executing those queries. It allows clients to request exactly the data they need, no more and no less, which can significantly optimize API interactions.

Originally developed by Facebook in 2012 and released as open-source in 2015, GraphQL was designed to overcome the limitations of traditional REST APIs, such as over-fetching or under-fetching data.

According to Wappalyzer, more than 176,000 websites use GraphQL at the time of writing. This number is growing rapidly as more companies adopt GraphQL, including AWS, PayPal, and GitHub. Check GraphQL Landscape for the full list.

[Figure: GraphQL Landscape]

Example of a GraphQL Request:

query CurrentUser {
  currentUser {
    name
    age
  }
}

Response:

{
   "currentUser":{
      "name":"John Doe",
      "age":23
   }
}
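Over HTTP, a query like the one above is typically sent as a JSON POST body with a `query` key. The following is a minimal sketch of how such a request could be built in Python; the endpoint URL and the helper name `build_graphql_request` are assumptions made for illustration, not part of any particular API.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical endpoint, used only for illustration.
GRAPHQL_URL = "https://api.example.com/graphql"

QUERY = """
query CurrentUser {
  currentUser {
    name
    age
  }
}
"""


def build_graphql_request(
    url: str, query: str, variables: Optional[dict] = None
) -> urllib.request.Request:
    """Wrap a GraphQL query in a JSON POST request to the single endpoint."""
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request(GRAPHQL_URL, QUERY)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Every query and mutation goes through the same URL; only the body changes.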

Key Features

- Single Endpoint: GraphQL APIs expose a single endpoint through which all interactions occur.

- Precise Data Fetching: Clients request exactly what they need, reducing network overhead.

- Strongly Typed Schema: A well-defined schema specifies types and relationships, enabling powerful queries. GraphQL's built-in schema integrity and type safety largely remove the need for API versioning.

[Figure: GraphQL Key Features]

Types in GraphQL

GraphQL schemas are built on three main kinds of types:

- Root types
- Scalar types
- Object types

[Figure: Types in GraphQL]

Root Types

1. Queries: Retrieve or fetch data from the server.

2. Mutations: Modify or manipulate data (create, update, delete).

3. Subscriptions: Allow clients to receive real-time data updates.

[Figure: Types in GraphQL]

Scalar Types

Scalar types represent simple values like integers, strings, booleans, etc. These are the basic building blocks of a schema.

Object Types

Object types represent complex entities with multiple fields. These fields can be scalars or other object types. For example:

type Human {
  id: String
  name: String
  homePlanet: Planet
}

type Planet {
  id: String
  name: String
}

[Figure: Graph of Types]

Discovery of GraphQL Schema and Introspection

GraphQL provides introspection, allowing developers to query the schema to understand the available types, queries, and mutations (Think of it as an OPTIONS request in REST).

Although introspection is not a security issue by itself and is useful in development, attackers can exploit it to better understand API capabilities and potentially abuse your GraphQL API.

[Figure: GraphQL Schema introspection]

Introspection is Enabled

Example of an Introspection Query to get all mutations:

Request

{
  __schema {
    mutationType {
      kind
      name
      fields {
        name
        description
        deprecationReason
      }
    }
  }
}

Response

{
   "__schema":{
      "mutationType":{
         "kind":"OBJECT",
         "name":"Mutation",
         "fields":[
            {
               "name":"createUser",
               "description":"Create a new user"
            }
         ]
      }
   }
}
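Once a response like this comes back, listing the exposed mutations is a few lines of parsing. The sketch below assumes the response shape shown above (some servers wrap it in a top-level `data` key, which the helper tolerates); `list_mutations` is a hypothetical helper name.

```python
import json
from typing import List

RESPONSE = """
{
   "__schema": {
      "mutationType": {
         "kind": "OBJECT",
         "name": "Mutation",
         "fields": [
            {"name": "createUser", "description": "Create a new user"}
         ]
      }
   }
}
"""


def list_mutations(response_text: str) -> List[str]:
    """Return the mutation names found in an introspection response."""
    payload = json.loads(response_text)
    # Tolerate responses wrapped in a top-level "data" key.
    root = payload.get("data", payload)
    mutation_type = root["__schema"]["mutationType"]
    if mutation_type is None:  # The schema defines no mutations.
        return []
    return [field["name"] for field in mutation_type["fields"]]


mutations = list_mutations(RESPONSE)
```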

Example of an introspection query that can dump the entire schema, including queries, mutations, and object types:

query IntrospectionQuery {
  __schema {
    queryType { name }
    mutationType { name }
    subscriptionType { name }
    types {
      ...FullType
    }
    directives {
      name
      locations
      args {
        ...InputValue
      }
    }
  }
}

fragment FullType on __Type {
  kind
  name
  fields(includeDeprecated: true) {
    name
    args {
      ...InputValue
    }
    type {
      ...TypeRef
    }
    isDeprecated
    deprecationReason
  }
  inputFields {
    ...InputValue
  }
  interfaces {
    ...TypeRef
  }
  enumValues(includeDeprecated: true) {
    name
    isDeprecated
    deprecationReason
  }
  possibleTypes {
    ...TypeRef
  }
}

fragment InputValue on __InputValue {
  name
  type { ...TypeRef }
  defaultValue
}

fragment TypeRef on __Type {
  kind
  name
  ofType {
    kind
    name
    ofType {
      kind
      name
      ofType {
        kind
        name
        ofType {
          kind
          name
          ofType {
            kind
            name
            ofType {
              kind
              name
              ofType {
                kind
                name
                ofType {
                  kind
                  name
                  ofType {
                    kind
                    name
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Bypassing Regex Filters on `__schema`

When developers disable introspection, they might use a regex to exclude the __schema keyword in queries. You can try characters like spaces, new lines, and commas, as these are ignored by GraphQL but not by flawed regex filters.

For example, if the developer has only excluded __schema{, the introspection query below, with a newline after __schema, would not be excluded:

{
    "query": "query{__schema\n{queryType{name}}}"
}
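The bypass is easy to reproduce offline. The sketch below assumes a naive server-side filter that only looks for the literal text `__schema{`; the pattern and the helper name `naive_filter_blocks` are illustrative assumptions, not taken from any specific server.

```python
import re

# Assumed naive filter: blocks only the literal sequence "__schema{".
NAIVE_FILTER = re.compile(r"__schema\{")


def naive_filter_blocks(query: str) -> bool:
    """Return True when the naive filter would reject the query."""
    return NAIVE_FILTER.search(query) is not None


plain = "query{__schema{queryType{name}}}"
with_newline = "query{__schema\n{queryType{name}}}"
with_comma = "query{__schema,{queryType{name}}}"

results = {
    "plain": naive_filter_blocks(plain),
    "with_newline": naive_filter_blocks(with_newline),
    "with_comma": naive_filter_blocks(with_comma),
}
```

Only the first variant is rejected. GraphQL treats whitespace, line terminators, and commas as insignificant, so all three queries mean the same thing to the server.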

Schema from Suggestions and Error Handling

Sometimes, even when introspection is disabled, it is still possible to infer parts of the GraphQL schema by leveraging suggestions and error messages provided by the server. For example, when a request contains a field that is not correctly spelled but is close to an existing field, the GraphQL server can return an error suggesting the correct field name.

[Figure: Schema from Suggestions and Error Handling]

Example:

[Figure: Example of GraphQL Schema Suggestions]

The same goes for inferring mutations and arguments.

[Figure: Example of GraphQL Schema Suggestions for arguments]
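Harvesting these hints can be automated. The following sketch pulls the suggested names out of an error message; the sample messages are assumptions modeled on the "Did you mean" style that many servers use, and the exact wording varies by implementation.

```python
import re
from typing import List

QUOTED_NAME = re.compile(r'"([^"]+)"')


def extract_suggestions(message: str) -> List[str]:
    """Return the names suggested after 'Did you mean' in an error message."""
    _, marker, tail = message.partition("Did you mean")
    if not marker:
        return []
    return QUOTED_NAME.findall(tail)


# Hypothetical error messages, for illustration only.
single = 'Cannot query field "nam" on type "User". Did you mean "name"?'
several = 'Cannot query field "user" on type "Query". Did you mean "users" or "userById"?'
no_hint = 'Cannot query field "zzz" on type "Query".'

found = [extract_suggestions(m) for m in (single, several, no_hint)]
```

Repeating this with a wordlist of likely field names lets a tester rebuild parts of the schema one suggestion at a time.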

Disabling introspection

When possible, introspection should be disabled in production environments and enabled only during development. According to the GraphQL specification:

"All types and directives defined within a schema must not have a name that begins with __ (two underscores), as this is reserved exclusively for GraphQL's introspection system."

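A common approach to disabling introspection is to reject queries containing fields that start with __.

This method effectively limits introspection, but it must be applied consistently across every production-facing endpoint, and the check should operate on parsed field names rather than raw text so that it cannot be sidestepped with the whitespace tricks described above.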
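As a rough illustration of checking names instead of raw text, the sketch below tokenizes a query, skips string literals and comments, and flags any name that starts with `__`. Allowing `__typename` is an assumption made here because many clients request it; the tokenizer is deliberately simplified, and a production setup would rely on the server's own parser or validation rules.

```python
import re

# Matches, in order of priority: block strings, strings, comments, names.
TOKEN = re.compile(
    r'"""[\s\S]*?"""|"(?:\\.|[^"\\\n])*"|#[^\n]*|[_A-Za-z][_0-9A-Za-z]*'
)

# Assumption: __typename stays allowed because clients commonly request it.
ALLOWED_META_FIELDS = {"__typename"}


def uses_introspection(query: str) -> bool:
    """Return True if the query references a reserved __ name."""
    for match in TOKEN.finditer(query):
        token = match.group(0)
        if token.startswith(('"', "#")):
            continue  # Ignore string literals and comments.
        if token.startswith("__") and token not in ALLOWED_META_FIELDS:
            return True
    return False


blocked = uses_introspection("query{__schema\n{queryType{name}}}")
allowed = uses_introspection('{ user(name: "__schema") { __typename } }')
```

Because the check works on names rather than on surrounding punctuation, inserting newlines or commas after `__schema` does not change the outcome.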

Denial of Service (DoS) Attacks

In this section, we take a deep dive into the various Denial of Service (DoS) attacks that attackers can (and will) employ against your application.

GraphQL exposes application data as a graph, enabling clients to retrieve data by traversing the relationships between nodes (types).

Most of the vulnerabilities listed below stem from the graph nature of GraphQL.

Circular Fragments (Low severity)

GraphQL fragments allow you to reuse query logic by defining reusable fields.
However, circular fragments can be used to craft queries that consume excessive server resources.

Example of Circular Fragment:

fragment UserFields on User {
  comments {
    ...CommentFields
  }
}

fragment CommentFields on Comment {
  owner {
    ...UserFields
  }
}

As you can see, the UserFields fragment references CommentFields, and CommentFields references UserFields in turn. A server that does not validate fragment cycles would expand them in an infinite loop.

[Figure: Example of Circular Fragment]

GraphQL servers commonly identify and reject queries with cyclic fragment references, returning an error message.

However, in some cases, the server's implementation may not strictly adhere to the GraphQL specification.

Therefore, extra measures need to be taken:

- Circular Fragment Detection: Use tools like GraphQL-ESLint to detect and prevent circular fragments.

- Implement Query Cost Analysis: Assign costs to fields and queries, rejecting those that exceed a predefined limit. For instance, if you are using Apollo Server, you can use the graphql-query-complexity library:

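const {
    createComplexityRule,
    simpleEstimator
} = require('graphql-query-complexity');

// `ApolloServer` and `schema` are assumed to be imported and defined elsewhere.
const server = new ApolloServer({
    schema,
    validationRules: [
        createComplexityRule({
            // Every field costs 1 unless another estimator says otherwise
            estimators: [
                simpleEstimator({
                    defaultComplexity: 1
                })
            ],
            maximumComplexity: 1000, // Reject queries that exceed this cost
        }),
    ],
});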
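To make the circular fragment check concrete, here is a minimal sketch of cycle detection over fragment spreads. It uses regular expressions instead of a real GraphQL parser, so it assumes the document contains only fragment definitions; the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set

FRAGMENT_HEADER = re.compile(r"fragment\s+(\w+)\s+on\s+\w+")
SPREAD = re.compile(r"\.\.\.\s*(\w+)")


def fragment_spreads(document: str) -> Dict[str, Set[str]]:
    """Map each fragment name to the fragment names it spreads."""
    headers = list(FRAGMENT_HEADER.finditer(document))
    graph: Dict[str, Set[str]] = {}
    for index, header in enumerate(headers):
        end = headers[index + 1].start() if index + 1 < len(headers) else len(document)
        body = document[header.end():end]
        # "... on Type" is an inline fragment, not a spread of a fragment named "on".
        graph[header.group(1)] = set(SPREAD.findall(body)) - {"on"}
    return graph


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of names, or None if there is none."""
    visiting: List[str] = []
    done: Set[str] = set()

    def visit(name: str) -> Optional[List[str]]:
        if name in visiting:
            return visiting[visiting.index(name):] + [name]
        if name in done or name not in graph:
            return None
        visiting.append(name)
        for child in sorted(graph[name]):
            cycle = visit(child)
            if cycle:
                return cycle
        visiting.pop()
        done.add(name)
        return None

    for name in sorted(graph):
        cycle = visit(name)
        if cycle:
            return cycle
    return None


DOCUMENT = """
fragment UserFields on User { comments { ...CommentFields } }
fragment CommentFields on Comment { owner { ...UserFields } }
"""

cycle = find_cycle(fragment_spreads(DOCUMENT))
```

A query whose fragments form a cycle can then be rejected before execution, whether or not the server enforces that rule itself.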

Alias overloading (Low severity)

Alias overloading in GraphQL happens when an attacker uses a large number of aliases in a query to overwhelm the server's processing capabilities.

In GraphQL, aliases allow clients to request the same field multiple times with different names. However, excessive use of aliases in a single query can lead to Denial of Service (DoS) attacks by exhausting the server’s resources. This attack can degrade performance or cause complete service outages.

For example:

query AliasOverLoading {
    alias1: __typename
    alias2: __typename
    alias3: __typename
    alias4: __typename
    alias5: __typename
}

Security Impact of Alias Overloading:

Alias overloading poses significant security risks for GraphQL APIs. One major consequence is Denial of Service (DoS), where a query with an excessive number of aliases forces the server to allocate disproportionate resources to process and respond. This can slow down or even crash the service, disrupting availability. Additionally, the server might experience resource exhaustion, as handling numerous aliases simultaneously can lead to memory depletion, CPU spikes, or severely degraded performance.

Several strategies can be implemented to mitigate the risk of Alias Overloading. First, enforcing query timeouts is an effective measure. By setting a maximum execution time, the server can automatically terminate queries that take too long to resolve, preventing malicious queries from overloading resources and causing a DoS.

Another important measure is to limit the number of aliases allowed in a single query, alongside the query cost analysis covered in the previous section.
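A rough way to count aliases before executing a query is sketched below. It tokenizes the query and counts `name: name` pairs that appear outside argument parentheses; the tokenizer is simplified, and `MAX_ALIASES` is an arbitrary value chosen for illustration.

```python
import re

# Strings and comments are matched first so their content is never counted.
TOKEN = re.compile(
    r'"""[\s\S]*?"""|"(?:\\.|[^"\\\n])*"|#[^\n]*|[_A-Za-z][_0-9A-Za-z]*|[():]'
)
NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*\Z")

MAX_ALIASES = 3  # Arbitrary limit, for illustration only.


def count_aliases(query: str) -> int:
    """Count 'alias: field' pairs outside of argument lists."""
    tokens = [t for t in TOKEN.findall(query) if not t.startswith(('"', "#"))]
    depth = 0  # Parenthesis depth; arguments live at depth > 0.
    aliases = 0
    for i, token in enumerate(tokens):
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        elif (
            token == ":"
            and depth == 0
            and 0 < i < len(tokens) - 1
            and NAME.match(tokens[i - 1])
            and NAME.match(tokens[i + 1])
        ):
            aliases += 1
    return aliases


def exceeds_alias_limit(query: str, limit: int = MAX_ALIASES) -> bool:
    return count_aliases(query) > limit


QUERY = """
query AliasOverLoading {
    alias1: __typename
    alias2: __typename
    alias3: __typename
    alias4: __typename
    alias5: __typename
}
"""

alias_count = count_aliases(QUERY)
```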

Circular references (Medium severity)

[Figure: Circular references]

Circular references occur when GraphQL object types refer back to each other, which can lead to deeply nested queries.

An attacker can exploit this to craft a recursive query, overwhelming the server's resources and causing Denial of Service and Resource Exhaustion.

We tested the impact of this vulnerability on a real target running Django and graphene-python, backed by a Cloud SQL instance with 42 GB of memory, 8 vCPUs, and 2.2 TB of SSD storage.

We managed to craft a recursive query that drove the Cloud SQL instance to 100% CPU utilization, and the issue persisted until we manually killed the hanging SQL transactions.

[Figure: Recursive query that caused our Cloud SQL instance to reach 100% CPU utilization]

Let's say we want to expose a user alongside a list of comments the user has made using the following types:

type User {
  id: ID!
  comments: [Comment]
  username: String
}

type Comment {
  id: ID
  owner: User!
  content: String!
}

The User type references Comment, and Comment references User back. This circular reference can be used to create complex or recursive queries.

Example of a Circular Query:

query {
  user(id: 1) {
    id
    comments {
      id
      owner {
        id
        comments {
          id
          owner {
            id
            ... # can go on forever
          }
        }
      }
    }
  }
}

[Figure: Circular reference]

To mitigate the risk of circular references in GraphQL, you can:

Refactor the schema to avoid direct circular references where possible. For example, use a new `LimitedUser` type in the `Comment` type to break the loop:

type User {
  id: ID!
  username: String!
  comments: [Comment]!
}

type LimitedUser {
  id: ID!
  username: String!
}

type Comment {
  id: ID!
  owner: LimitedUser!
  content: String!
}

[Figure: Fixed circular reference]

Still, this solution is not always practical, as it may break existing clients and require major changes to the codebase.

Enforce query depth limits: Set a limit on how deep queries can go to prevent excessively deep queries.
Most GraphQL server implementations support this through validation rules or plugins. For example, in Apollo Server with the graphql-depth-limit package:

const depthLimit = require('graphql-depth-limit');

const server = new ApolloServer({
    schema,
    validationRules: [depthLimit(10)], // Set maximum query depth to 10
});
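The idea behind a depth limit can be sketched without any framework: track how deeply selection sets are nested and reject the query when the maximum exceeds a threshold. The version below is a simplified assumption-laden sketch; it does not expand fragments, which real validation rules do, and the limit values are arbitrary.

```python
import re

# Strings and comments are matched first so braces inside them are ignored.
TOKEN = re.compile(r'"""[\s\S]*?"""|"(?:\\.|[^"\\\n])*"|#[^\n]*|[{}()]')


def query_depth(query: str) -> int:
    """Return the maximum nesting of selection sets in the query."""
    depth = 0
    max_depth = 0
    parens = 0  # Braces inside arguments are input objects, not selections.
    for token in TOKEN.findall(query):
        if token == "(":
            parens += 1
        elif token == ")":
            parens -= 1
        elif parens:
            continue
        elif token == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif token == "}":
            depth -= 1
    return max_depth


def exceeds_depth_limit(query: str, limit: int) -> bool:
    return query_depth(query) > limit


SAMPLE = """
query {
  user(id: 1) {
    id
    comments {
      id
      owner {
        id
      }
    }
  }
}
"""

sample_depth = query_depth(SAMPLE)
```

Each additional `comments { owner { ... } }` round trip adds two levels, so a modest limit stops the recursive query shown earlier long before it becomes expensive.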

Brute Force Login Using Alias Batching (Medium severity)

Brute Force Login Using Alias Batching in GraphQL involves an attacker leveraging the alias feature to automate login attempts, making it easier to submit numerous credential combinations in a single query.

In GraphQL, aliases allow clients to send multiple versions of the same query under different names. Attackers exploit this by batching login requests within a single query using different alias names for each attempt. This can lead to an efficient brute force attack, bypassing traditional rate-limiting protections and overwhelming the authentication system with login attempts.

Example:

query loginBatch {
  login1: login(username: "user1", password: "password1") {
    token
  }

  login2: login(username: "user2", password: "password2") {
    token
  }

  login3: login(username: "user3", password: "password3") {
    token
  }
 ...
}

To mitigate this issue, you should limit the number of aliases allowed in a single query. By configuring server-side restrictions or using tools like GraphQL Armor, you can cap the number of aliases per request.

Limiting the number of failed login attempts per user is also effective, because the limit then applies no matter how many attempts are packed into one request.
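A per-user failure limit could look like the sketch below. It keeps state in memory and uses hypothetical names and thresholds; a real deployment would more likely store counters in a shared cache and call this from the login resolver.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class FailedLoginLimiter:
    """Lock a username after too many failures inside a sliding window."""

    def __init__(
        self,
        max_failures: int = 5,
        window_seconds: float = 900.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_failures = max_failures
        self._window = window_seconds
        self._clock = clock
        self._failures: Dict[str, Deque[float]] = defaultdict(deque)

    def _recent(self, username: str) -> Deque[float]:
        """Drop failures older than the window and return the rest."""
        cutoff = self._clock() - self._window
        attempts = self._failures[username]
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        return attempts

    def is_locked(self, username: str) -> bool:
        return len(self._recent(username)) >= self._max_failures

    def record_failure(self, username: str) -> None:
        self._recent(username).append(self._clock())

    def record_success(self, username: str) -> None:
        self._failures.pop(username, None)


limiter = FailedLoginLimiter(max_failures=3, window_seconds=60.0)
for _ in range(3):
    limiter.record_failure("user1")
locked = limiter.is_locked("user1")
```

Because the counter is keyed by username and not by HTTP request, ten aliased login attempts in one query count as ten failures.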

Authorization Misconfiguration in GraphQL (High severity)

In GraphQL APIs, a severe vulnerability can occur when access to sensitive data is correctly restricted in one query path but left exposed in another due to inconsistent access control checks. Attackers can retrieve unauthorized data by taking an alternative query route that bypasses restrictions.

This kind of vulnerability is easy to introduce.

Here is a simple example (with Django and graphene-django):

We have a small social network platform where users can talk to each other and also publish posts.

from django.db import models
from django.contrib.auth.models import AbstractUser

class SocialUser(AbstractUser):
    pass

class Post(models.Model):
    user = models.ForeignKey(SocialUser, related_name='posts', on_delete=models.CASCADE)
    content = models.TextField()

class Discussion(models.Model):
    user = models.ForeignKey(SocialUser, related_name='discussions', on_delete=models.CASCADE)
    content = models.TextField()

And using `graphene_django`, we have the following types:

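import graphene
from graphene_django import DjangoObjectType

from . import models  # Assumes the models above live in this app's models.py


class DiscussionType(DjangoObjectType):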
    class Meta:
        model = models.Discussion

class PostType(DjangoObjectType):
    class Meta:
        model = models.Post

class UserProfileType(DjangoObjectType):
    class Meta:
        model = models.SocialUser

We declare the following queries:

class Query(graphene.ObjectType):
    my_discussions = graphene.List(DiscussionType)
    posts = graphene.List(PostType)
    my_profile = graphene.Field(UserProfileType)

    def resolve_my_discussions(self, info):
        user = info.context.user
        if user.is_anonymous:
            raise Exception("Not logged in!")
        return user.discussions.all() # Only the discussions of the logged in user

    def resolve_posts(self, info):
        user = info.context.user
        if user.is_anonymous:
            raise Exception("Not logged in!")
        return models.Post.objects.all() # All posts are public of course.

    def resolve_my_profile(self, info):
        user = info.context.user
        if user.is_anonymous:
            raise Exception("Not logged in!")
        return user

At first glance, nothing is wrong here. A user can only see their own discussions.

[Figure: discussions query]

You can also retrieve a list of public posts from other users.

[Figure: posts query]

But if you look closer at the type `PostType`, you can see that it has a UserProfileType field.

[Figure: PostType]

This is because the `Post` model has a `ForeignKey` to the `SocialUser` model, and graphene-django exposes it using the first object type it can find for that model.

[Figure: code of types]

Therefore, we can execute the following query to gain unauthorized access to other users' discussions.

[Figure: unauthorized access to other users' discussions]
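The underlying fix is to enforce the ownership check where the data is resolved, not only at the query entry point, so that every path through the graph (for example posts, then user, then discussions) hits the same check. The sketch below is framework-free and uses hypothetical names; in graphene-django the equivalent would presumably be a `resolve_discussions` method on `UserProfileType`, or removing the field from the type altogether.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    id: int
    username: str
    discussions: List[str] = field(default_factory=list)


def resolve_user_discussions(parent: User, viewer: User) -> List[str]:
    """Field-level resolver: only the owner may read their discussions.

    `parent` is the user object reached through the graph and `viewer` is
    the authenticated user making the request.
    """
    if viewer.id != parent.id:
        raise PermissionError("Not allowed to read another user's discussions")
    return parent.discussions


alice = User(id=1, username="alice", discussions=["private chat"])
bob = User(id=2, username="bob")

own = resolve_user_discussions(alice, viewer=alice)
try:
    resolve_user_discussions(alice, viewer=bob)
    leaked = True
except PermissionError:
    leaked = False
```

With the check attached to the field, it no longer matters which query path leads to the user object.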

GraphQL Extensions: Debug Mode as an Example

What are GraphQL Extensions?

GraphQL extensions are pieces of code that add new features to your GraphQL setup. They help you do things that GraphQL doesn't usually do on its own, like adding new ObjectTypes and exposing debug information.

Although this is a useful feature, some extensions that are implemented by default in GraphQL servers such as Graphene-Django and graphql-ruby can cause serious security problems.

GraphQL Debug (High severity)

When troubleshooting GraphQL applications, developers rely on debugging information.

When debug mode is enabled, a GraphQL server provides verbose messages in response to client requests for backend server errors that are typically not displayed.

For example, rather than returning standard error messages, a client might receive a stack trace and detailed information about the error.

GraphQL debug mode is implemented by default in many GraphQL implementations, but not all of them (see the table below).

While this is invaluable during development, leaving it enabled in a production environment can expose sensitive information about the server's internal structure and implementation details.

When debug mode is enabled, error responses may include:

  • Detailed stack traces
  • Database query information (including SQL queries)
  • Internal server paths and file names
  • Sensitive configuration details

Example of a request and response with debug mode enabled in Graphene-Django, which wraps Django's debug tooling:

Request

{
  query {
    nonexistentField {
      id
    }
  }
  _debug {
    sql {
      sql
      transId
      transStatus
      isoLevel
      encoding
    }
  }
}

Response

{
   "errors":[
      {
         "message":"Cannot query field \"nonexistentField\" on type \"Query\".",
         "locations":[
            {
               "line":3,
               "column":3
            }
         ],
         "path":[
            "nonexistentField"
         ]
      }
   ],
   "data":null,
   "extensions":{
      "debug":{
         "sql":[
            {
               "sql":"SELECT \"auth_user\".\"id\", \"auth_user\".\"password\", \"auth_user\".\"last_login\", \"auth_user\".\"is_superuser\", \"auth_user\".\"username\", \"auth_user\".\"first_name\", \"auth_user\".\"last_name\", \"auth_user\".\"email\", \"auth_user\".\"is_staff\", \"auth_user\".\"is_active\", \"auth_user\".\"date_joined\" FROM \"auth_user\" WHERE \"auth_user\".\"id\" = %s LIMIT 21",
               "time":"0.000",
               "params":[
                  "1"
               ]
            }
         ],
         "python_version":"3.8.5",
         "django_version":"3.1.3",
         "graphene_version":"2.1.8",
         "graphql_version":"3.0.0"
      }
   }
}

Security Impact of GraphQL Debug Mode:

  • Information Disclosure: Debug information can reveal internal details about the GraphQL server's structure, dependencies, and potential vulnerabilities, which could be leveraged by attackers to plan more targeted attacks.

  • Sensitive Data Exposure: Stack traces, error messages, and especially SQL queries might inadvertently include sensitive information such as database structure, internal file paths, or environment variables.

  • Easier Exploitation: Detailed error messages and SQL queries can assist attackers in refining their attacks by providing immediate feedback on what worked or didn't work in their malicious queries.

  • Performance Information Leakage: The timing information provided for SQL queries could be used by attackers to infer the structure of the database or to perform timing attacks.

To mitigate the risks associated with GraphQL debug mode, first and foremost disable it in production environments. For example, if you are using Django with Graphene-Django, debug mode in GraphQL is controlled by the same setting as debug in Django:

settings.py
# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = False
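As an extra safeguard, the debug middleware can be registered only when `DEBUG` is on. The sketch below builds the `GRAPHENE` settings dictionary from that flag; the helper name and the schema path `myproject.schema.schema` are hypothetical, and the middleware path is the one Graphene-Django documents for its debug middleware, so verify it against the version you run.

```python
from typing import Dict, List, Union


def graphene_settings(debug: bool) -> Dict[str, Union[str, List[str]]]:
    """Build the GRAPHENE setting, adding debug middleware only in debug mode."""
    middleware: List[str] = []
    if debug:
        middleware.append("graphene_django.debug.DjangoDebugMiddleware")
    return {
        "SCHEMA": "myproject.schema.schema",  # Hypothetical schema path.
        "MIDDLEWARE": middleware,
    }


# In settings.py this would read: GRAPHENE = graphene_settings(DEBUG)
production = graphene_settings(debug=False)
development = graphene_settings(debug=True)
```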

Conclusion

While GraphQL offers significant advantages for API development, such as flexible queries and efficient data fetching, it also introduces unique security challenges. Addressing these challenges requires a deep understanding not only of potential vulnerabilities but also of the specific GraphQL server you're using and the security features it supports.

This is a list of the most popular GraphQL servers and the features they support:

✅ - Enabled by Default
⚠️ - Disabled by Default
❌ - No Support

graphql-threat-matrix

Source: graphql-threat-matrix
