Query User-Defined Functions
In GSQL, users can supplement the language by defining their own query user-defined functions (query UDFs) in C++. Query UDFs can be called in queries and subqueries to perform a set of defined actions and return a value like the built-in functions.
This page introduces the process to define a query UDF. Once defined, the new functions are added into GSQL automatically the next time GSQL is executed.
1. UDF files
UDFs are written in C++ in two files, ExprFunctions.hpp
and ExprUtil.hpp
:
-
ExprFunctions is used for functions that are called directly in GSQL queries.
-
ExprUtil contains structs or helper functions that are used by the functions in ExprFunctions. The functions defined in
ExprUtil.hpp
cannot be used in a GSQL query.
2. UDF security
The TigerGraph database implements a number of security measures to prevent unauthorized installation and modification of UDFs.
-
For self-managed databases, there are two methods for installing a UDF:
-
Via a set of gadmin config operations. Since gadmin can only be executed by the TigerGraph Linux user directly on the database server, this method has inherent privilege restrictions.
-
Via the GSQL shell
GET
andPUT
commands. This is disabled by default and can only be enabled by the TigerGraph Linux user.
-
-
UDF installation is not available as a self-service feature in TigerGraph Cloud at this time. Cloud customers can request an UDF installation as a TigerGraph support service.
All UDF files are scanned by the system to ensure that they comply with the UDF file policy as set by configuration commands in TigerGraph Server.
-
Macros with replacement are not allowed.
-
There is a file header allowlist that restricts the possible file headers to a customizable list.
-
Additional C++ features are disabled for security reasons.
For more details, see Security - UDF file scanning.
3. Define a query UDF in C++
User-defined functions are C++ functions with a certain set of allowed data types.
The function definition must include the keyword inline
.
Beginning in TigerGraph version 3.9, user-defined functions may not begin with the prefix tg_
.
This is a sample function that returns true
if the value passed to the function is greater than 3.
inline bool greater_than_three (double x) {
return x > 3;
}
Data Type | Argument | Return | Function Body |
---|---|---|---|
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
Yes |
Yes |
Yes |
|
No |
No |
Yes |
Yes |
Yes |
Yes |
|
All other C++ data types |
No |
No |
Yes |
You can write your functions in the ExprFunctions
file provided as a sample in TigerGraph Server installations, or create your own .hpp
files from scratch.
If your function requires a user-defined struct or helper function, that struct or helper function must be defined in a separate ExprUtil
file.
C++ supports changing the value of a for loop variable within a loop.
However, C++ UDFs in TigerGraph do not have this functionality.
Use a different structure such as a while loop if your function requires more complicated variable iteration.
|
The following is an example of a short ExprFunctions
file containing a single UDF that reverses a string. Note the include
statement on the first line.
#include <algorithm> // for std::reverse
inline string reverse(string str){
std::reverse(str.begin(), str.end());
return str;
}
4. Upload UDFs
There are two ways to modify the UDF files to add user-defined functions to GSQL.
Both are secured by gadmin
and require administrator-level access:
-
Store UDF files in a GitHub repository, and configure GSQL to read from the repository.
-
This is the recommended approach. It also takes the highest precedence. If GSQL is configured to read from GitHub for UDFs and the configurations are valid, UDFs in other sources are ignored.
-
-
Use the GSQL
PUT
command to upload UDFs.-
For security reasons, this method is disabled by default, and can only be enabled by the TigerGraph Linux user with access to
gadmin
.
-
4.1. Use GitHub to store UDFs
You can configure GSQL to read from a GitHub repository for ExprFunctions and ExprUtil. This is TigerGraph’s recommended approach to managing UDFs.
If GitHub access is configured, GSQL retrieves user source code files from GitHub before other sources, so long as the files exist on GitHub.
If GitHub is connected but files are missing, TigerGraph looks for a UDF file added locally with the PUT
command.
New additions to the files in the GitHub repository are instantly available in GSQL.
You can retrieve ExprFunctions.hpp and ExprUtil.hpp from AppRoot/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
and copy them to a Git repository of your choice.
The file names must be ExprFunctions.hpp
and ExprUtil.hpp
.
The gadmin
configuration parameters for setting up the connection to GitHub are as follows:
Parameter | Description | Example |
---|---|---|
|
The credential used to access the repository |
|
|
The user and repository where the files are held |
|
|
The branch to access |
|
|
Path to the directory in the repository that has ExprFunctions.hpp and ExprUtil.hpp |
|
|
Optional parameter used for GitHub Enterprise |
Use the gadmin config set
command to configure the aforementioned parameters to connect GSQL to the GitHub repository hosting your files.
The following is an example configuration.
Remember to run gadmin config apply
after changing the parameters.
If GSQL is already running, run gadmin restart all
to restart GSQL before the UDFs become available.
gadmin config set GSQL.GithubUserAcessToken anonymous
gadmin config set GSQL.GithubRepository tigergraph/ecosys
gadmin config set GSQL.GithubBranch demo_github
gadmin config set GSQL.GithubPath sample_code/src
gadmin config apply
After the parameters are successfully configured, you can access your UDFs in new queries right away.
4.2. Upload UDFs with GSQL PUT
command
TigerGraph offers the ability to upload UDFs directly from GSQL using the PUT
command.
This command is securely managed by the management tool gadmin
and is disabled by default, requiring a Tigergraph Linux user with gadmin access to enable it.
After making modifications to your UDF files, it is recommended that you disable the ability to upload UDF files using PUT
until you need to modify the files again
4.2.1. Enable uploading UDFs through GSQL
Run the following command to enable uploading query UDFs through the GSQL PUT
command:
$ gadmin config set GSQL.UDF.EnablePutExpr true
$ gadmin config apply
$ gadmin restart gsql
4.2.2. Modify current query UDF file
Use the GET ExprFunctions
command in GSQL to copy the current set of functions into a local file.
The path can be absolute or relative to your current directory, but the file extension must be .hpp
:
GSQL > GET ExprFunctions TO "/example/path/to/ExprFunctions.hpp"
GSQL > GET ExprFunctions TO "./ExprFunctions.hpp"
If your query UDF requires a user-defined struct or helper function, also use the GET ExprUtil
command to download the current ExprUtil
file:
GSQL > GET ExprUtil TO "/example/path/ExprUtil.hpp"
4.2.3. Define your function
Write your function in ExprFunctions and any helper functions in ExprUtil as described in Define a query UDF in C++.
4.2.4. Upload the updated query UDF file
After you have defined the function, use the PUT
command to upload the files you modified.
GSQL > PUT ExprFunctions FROM "/path/to/udf_file.hpp"
PUT ExprFunctions successfully.
GSQL > PUT ExprUtil FROM "/path/to/utils_file.hpp"
PUT ExprUtil successfully.
The PUT
command automatically uploads the files in all nodes in a cluster and updates all existing files.
Once the files are stored, you can call the UDFs in a query the next time GSQL is executed. This includes the next time you start the GSQL shell or execute GSQL scripts from a bash shell. If you are using GraphStudio, you can use the queries without needing to refresh the page.
CREATE QUERY udf_example() FOR GRAPH Minimal_Net {
DOUBLE x;
BOOL y;
x = 3.5;
PRINT greater_than_three(x);
y = greater_than_three(2.5);
PRINT y;
}
5. Example
Suppose you are working in a distributed environment and want to add a function rng()
that that returns a random double between 0 and 1.
In this example, suppose you want to modify the ExprFunctions file locally rather than using GitHub.
Start by enabling uploading query UDFs with the PUT
command:
$ gadmin config set GSQL.UDF.EnablePutExpr true
$ gadmin config apply
$ gadmin restart gsql
After enabling, download the current UDF file with the GET
command.
In this example, we place our download in the current working directory and use the name udf.hpp
in contrast to above, where it was named ExprFunctions.hpp
, to illustrate the flexibility of the naming scheme.
GSQL > GET ExprFunctions TO "./udf.hpp"
In the downloaded file, add the function definition for the rng()
function.
inline double rng() {
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution < double > distribution(0.0, 1.0);
return distribution(gen);
}
After adding your query, use the PUT
command to store the file in all nodes in a cluster:
GSQL > PUT ExprFunctions FROM "./udf.hpp"
PUT ExprFunctions successfully.
The file has been stored and the UDF has now been added to GSQL.
You can add it to a query, then run the commands INSTALL QUERY
and RUN QUERY
to test the rng()
function.
After making modifications, you should disable the ability to upload UDFs to secure your server:
$ gadmin config set GSQL.UDF.EnablePutExpr false
$ gadmin config apply
$ gadmin restart gsql