Query User-Defined Functions

In GSQL, users can supplement the language by defining their own query user-defined functions (query UDF) in C++. Query UDFs can be called in queries and subqueries to perform a set of defined actions and return a value like the built-in functions.

This page introduces the process to define a query UDF. Once defined, the new functions are added into GSQL automatically the next time GSQL is executed.

Query UDFs are user-editable in the form of an .hpp file locally on your machine. Creating, editing, or removing Query UDFs is done by downloading them from the server, editing the file, and re-uploading through GET and PUT commands.

Uploading and downloading the UDF file (ExprFunctions.hpp) requires the following privileges:

  • The GET command requires the READ_FILE privilege.

  • The PUT command requires the WRITE_FILE privilege.

It is strongly recommended that you enable GSQL user authentication by changing the password of the default user tigergraph to protect your UDF files. If you don’t enable user authentication, anyone with GSQL shell access can log in as the default user with superuser privileges and modify your files.

Define a query UDF

Below are the steps to add a Query UDF to GSQL:

Step 1: Download current query UDF file

Use the GET ExprFunctions command in GSQL to download the current list of functions into a local file. The path can be absolute or relative to your current directory, but the file extension must be .hpp:

GSQL > GET ExprFunctions TO "/example/path/to/ExprFunctions.hpp"
GSQL > GET ExprFunctions TO "./ExprFunctions.hpp"

If your query UDF requires a user-defined struct or helper function, also use the GET ExprUtil command to download the current ExprUtil file:

GSQL > GET ExprUtil TO "/example/path/ExprUtil.hpp"

Step 2: Define C++ function

Define the C++ function inside the UDF file you just downloaded in Step 1. The definition of the function should include the keyword inline.

Below is a list of data types and where each is allowed in a function.

Table 1. Allowed Data Types

Data Type

Argument

Return

Function Body

bool

Yes

Yes

Yes

int

Yes

Yes

Yes

float

Yes

Yes

Yes

double

Yes

Yes

Yes

string

Yes

Yes

Yes

std::string

No

No

Yes

Accumulators

Yes

Yes

Yes

All other C++ data types

No

No

Yes

If the function requires a user-defined struct or helper function, define it in the ExprUtil file you downloaded in Step 1.

Below is an example of a query UDF definition that includes two separate functions. The first function returns true if the passed value is greater than 3, and the second reverses a string.

New code in ExprFunctions.hpp
#include <algorithm>  // for std::reverse
inline bool greater_than_three (double x) {
  return x > 3;
}

inline string reverse(string str){
  std::reverse(str.begin(), str.end());
  return str;
}

If any code in ExprFunctions.hpp or ExprUtil.hpp causes a compilation error, GSQL will be unable to install any new queries, whether containing user-defined functions or not.

Step 3: Upload files

After you have defined the function, use the PUT command to upload the files you modified.

GSQL > PUT ExprFunctions FROM "/path/to/udf_file.hpp"
PUT ExprFunctions successfully.
GSQL > PUT ExprUtil FROM "/path/to/utils_file.hpp"
PUT ExprUtil successfully.

The PUT command will automatically upload the files to all nodes in a cluster, overwriting any existing files that contain UDFs.

Once the files are uploaded, you will be able to call the query UDF the next time GSQL is executed. This includes the next time you start the GSQL shell or execute GSQL scripts from a bash shell. If you are using GraphStudio, however, you will be able to use the queries without needing to refresh the page.

Example of a GSQL query that uses the UDF
CREATE QUERY udfExample() FOR GRAPH minimalNet {
  DOUBLE x;
  BOOL y;

  x = 3.5;
  PRINT greater_than_three(x);
  y = greater_than_three(2.5);
  PRINT y;

  PRINT reverse("abc");
}

Example

Suppose you are working in a distributed environment and want to add a function rng() that that returns a random double between 0 and 1.

Start by downloading the current UDF file with the GET command. In this example, we will place our download in the working directory and use the name udf.hpp in contrast to above, where it was named ExprFunctions.hpp.

GSQL > GET ExprFunctions TO "./udf.hpp"

In the downloaded file, add the function definition for the rng() function.

udf.hpp
inline double rng() {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_real_distribution < double > distribution(0.0, 1.0);

    return distribution(gen);
    }

After adding your query, use the PUT command to upload the file. This will upload the file to all nodes in a cluster:

GSQL > PUT ExprFunctions FROM "./udf.hpp"
PUT ExprFunctions successfully.

The UDF has now been added to GSQL. You can add it to a query, then run the commands INSTALL QUERY and RUN QUERY to test the rng() function. The following commands demonstrate the process with a one-line query called rngExample that simply prints the output of the new function rng().

GSQL > CREATE QUERY rngExample() FOR GRAPH example_graph {PRINT rng();}
Successfully created queries: [rngExample].

GSQL > INSTALL QUERY rngExample
Start installing queries, about 1 minute ...
rngExample query: curl -X GET 'http://127.0.0.1:9000/query/example_graph/rngExample'. Add -H "Authorization: Bearer TOKEN" if authentication is enabled.
Select 'm1' as compile server, now connecting ...
Node 'm1' is prepared as compile server.

[=========================================================================================] 100% (1/1)
Query installation finished.

GSQL > RUN QUERY rngExample()
{
  "error": false,
  "message": "",
  "version": {
    "schema": 0,
    "edition": "enterprise",
    "api": "v2"
  },
  "results": [{"rng()": 0.51352}]
}