Query User-Defined Functions

In GSQL, users can supplement the language by defining their own query user-defined functions (query UDFs) in C++. Query UDFs can be called in queries and subqueries to perform a set of defined actions and return a value like the built-in functions.

This page introduces the process to define a query UDF. Once defined, the new functions are added into GSQL automatically the next time GSQL is executed.

UDFs are written in C++ in two files, ExprFunctions.hpp and ExprUtil.hpp. ExprFunctions is used for functions that are called directly in GSQL queries, while ExprUtil contains structs or helper functions that used called by the functions in ExprFunctions.

These files are stored as .hpp files in AppRoot/dev/gdk/gsql/src/QueryUdf/ in a TigerGraph Server installation.

There are two ways to modify these files to add user-defined functions to GSQL:

  • Store the files in a GitHub repository, and configure GSQL to read from the repository.

  • Use GET and PUT commands to download, modify, and store the files locally.

    • The GET command requires the READ_FILE privilege.

    • The PUT command requires the WRITE_FILE privilege.

It is strongly recommended that you enable GSQL user authentication by changing the password of the default user tigergraph to protect your UDF files. If you don’t enable user authentication, anyone with GSQL shell access can log in as the default user with superuser privileges and modify your UDF files.

This section first explains how to define a query UDF, then how to integrate query UDFs into GSQL.

Define a query UDF in C++

User-defined functions are C++ functions with a certain set of allowed data types. The function definition must include the keyword inline.

Example 1. Sample user-defined function

This is a sample function that returns true if the value passed to the function is greater than 3.

inline bool greater_than_three (double x) {
  return x > 3;
}
Table 1. Allowed Data Types in User-Defined Functions
Data Type Argument Return Function Body

bool

Yes

Yes

Yes

int

Yes

Yes

Yes

float

Yes

Yes

Yes

double

Yes

Yes

Yes

string

Yes

Yes

Yes

std::string

No

No

Yes

Accumulators

Yes

Yes

Yes

All other C++ data types

No

No

Yes

You can write your functions in the ExprFunctions file provided as a sample in TigerGraph Server installations, or create your own .hpp files from scratch.

If your function requires a user-defined struct or helper function, that struct or helper function must be defined in a separate ExprUtil file.

Below is an example of a short ExprFunctions file containing a single UDF that reverses a string. Note the include statement on the first line.

New code in ExprFunctions.hpp
#include <algorithm>  // for std::reverse
inline string reverse(string str){
  std::reverse(str.begin(), str.end());
  return str;
}

Use GitHub to store UDFs

You can configure GSQL to read from a GitHub repository for ExprFunctions and ExprUtil.

If GitHub access is configured, GSQL will retrieve user source code files from GitHub before files added via PUT, so long as the files exist.

TigerGraph only allows one UDF file at a time. Files on GitHub take priority. If GitHub is connected but files are missing, TigerGraph will look for an UDF file added via PUT.

New additions to the files in the GitHub repository are instantly available in GSQL.

You can retrieve ExprFunctions.hpp and ExprUtil.hpp from AppRoot/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp and copy them to a Git repository of your choice.

When the files are hosted on GitHub, the typedef statements and include directives must be part of the .hpp files. These are included by default in the files located in the /QueryUDF/ folder.

The file names must be ExprFunctions.hpp and ExprUtil.hpp. This is in contrast to the PUT method, where the files could have any file name.

The gadmin configuration parameters for setting up the connection to GitHub are as follows:

Table 2. Parameters for GitHub
Parameter Description Example

GSQL.GithubUserAcessToken

The credential used to access the repository

anonymous

GSQL.GithubRepository

The user and repository where the files are held

sample_user/repository

GSQL.GithubBranch

The branch to access

main

GSQL.GithubPath

Path to the directory in the repository that has ExprFunctions.hpp and ExprUtil.hpp

src/

GSQL.GithubUrl

Optional parameter used for GitHub Enterprise

https://api.github.com

Use the gadmin config set command to configure the aforementioned parameters to connect GSQL to the GitHub repository hosting your files.

Below is an example configuration. Remember to run gadmin config apply after changing the parameters. If GSQL is already running, you will need to run gadmin restart all to restart GSQL before the UDFs become available.

gadmin config set GSQL.GithubUserAcessToken anonymous
gadmin config set GSQL.GithubRepository tigergraph/ecosys
gadmin config set GSQL.GithubBranch demo_github
gadmin config set GSQL.GithubPath sample_code/src
gadmin config apply

After the parameters are successfully configured, you can access your UDFs in new queries right away.

Store a UDF file locally

Step 1: Modify current query UDF file

Use the GET ExprFunctions command in GSQL to copy the current set of functions into a local file. The path can be absolute or relative to your current directory, but the file extension must be .hpp:

GSQL > GET ExprFunctions TO "/example/path/to/ExprFunctions.hpp"
GSQL > GET ExprFunctions TO "./ExprFunctions.hpp"

If your query UDF requires a user-defined struct or helper function, also use the GET ExprUtil command to download the current ExprUtil file:

GSQL > GET ExprUtil TO "/example/path/ExprUtil.hpp"

Step 2: Define your function

Write your function in ExprFunctions and any helper functions in ExprUtil.

If any code in ExprFunctions.hpp or ExprUtil.hpp causes a compilation error, GSQL will be unable to install any new queries, whether containing user-defined functions or not.

Step 3: Store the updated query UDF file

After you have defined the function, use the PUT command to store the files you modified.

GSQL > PUT ExprFunctions FROM "/path/to/udf_file.hpp"
PUT ExprFunctions successfully.
GSQL > PUT ExprUtil FROM "/path/to/utils_file.hpp"
PUT ExprUtil successfully.

The PUT command will automatically store the files in all nodes in a cluster, overwriting any existing files that contain UDFs.

Once the files are stored, you will be able to call the Query UDF the next time GSQL is executed. This includes the next time you start the GSQL shell or execute GSQL scripts from a bash shell. If you are using GraphStudio, however, you will be able to use the queries without needing to refresh the page.

Example of a GSQL query that uses the UDF
CREATE QUERY udfExample() FOR GRAPH minimalNet {
  DOUBLE x;
  BOOL y;

  x = 3.5;
  PRINT greater_than_three(x);
  y = greater_than_three(2.5);
  PRINT y;

}

Example

Suppose you are working in a distributed environment and want to add a function rng() that that returns a random double between 0 and 1. In this example, suppose you want to modify the ExprFunctions file locally rather than using GitHub.

Start by downloading the current UDF file with the GET command. In this example, we will place our download in the working directory and use the name udf.hpp in contrast to above, where it was named ExprFunctions.hpp, to illustrate the flexibility of the naming scheme.

GSQL > GET ExprFunctions TO "./udf.hpp"

In the downloaded file, add the function definition for the rng() function.

udf.hpp
inline double rng() {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_real_distribution < double > distribution(0.0, 1.0);

    return distribution(gen);
    }

After adding your query, use the PUT command to store the file in all nodes in a cluster:

GSQL > PUT ExprFunctions FROM "./udf.hpp"
PUT ExprFunctions successfully.

The file has been stored and the UDF has now been added to GSQL. You can add it to a query, then run the commands INSTALL QUERY and RUN QUERY to test the rng() function.

The following commands demonstrate the process with a one-line query called rngExample that simply prints the output of the new function rng().

GSQL > CREATE QUERY rngExample() FOR GRAPH example_graph {PRINT rng();}
Successfully created queries: [rngExample].

GSQL > INSTALL QUERY rngExample
Start installing queries, about 1 minute ...
rngExample query: curl -X GET 'http://127.0.0.1:9000/query/example_graph/rngExample'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select 'm1' as compile server, now connecting ...
Node 'm1' is prepared as compile server.

[=========================================================================================] 100% (1/1)
Query installation finished.

GSQL > RUN QUERY rngExample()
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"rng()": 0.51352}]
}