Query User-Defined Functions
In GSQL, users can supplement the language by defining their own query user-defined functions (query UDFs). Query UDFs can be called in queries and subqueries to perform a set of defined actions and return a value, just like built-in functions.
This page describes how to define a query UDF. Once defined, new functions are added to GSQL automatically the next time GSQL is executed.
Define a query UDF
Below are the steps to add a query UDF to GSQL:
Step 1: Download current query UDF file
Use the GET ExprFunctions command in GSQL to download the current UDF file to any location on your machine. The file and its directories will be created if they do not exist, and the file name must end with the extension .hpp.
If your query UDF requires a user-defined struct or helper function, also use the GET ExprUtil command to download the current ExprUtil file.
Step 2: Define C++ function
Define the C++ function inside the UDIMPL namespace of the UDF file you downloaded in Step 1. The function definition should include the keyword inline. Only bool, int, float, double, and string (NOT std::string) are allowed as return types and argument types. However, any C++ type is allowed inside the function body.
If the function requires a user-defined struct or helper function, define it in the ExprUtil file you downloaded in Step 1.
Below is an example of a query UDF definition:
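As a minimal sketch, two UDFs might look like the following. The function names greater_than_three and distance_2d are hypothetical, and the sketch opens its own UDIMPL namespace only so that it compiles on its own; in the downloaded file, the functions would go inside the namespace that is already there.

```cpp
#include <cmath>  // std::sqrt

namespace UDIMPL {

// Hypothetical UDF: true when x is strictly greater than 3.
inline bool greater_than_three(double x) {
  return x > 3.0;
}

// Hypothetical UDF: Euclidean distance between two 2-D points.
// Only the signature is restricted to the allowed types; the body
// is free to use any C++ type or standard library call.
inline double distance_2d(double x1, double y1, double x2, double y2) {
  const double dx = x2 - x1;
  const double dy = y2 - y1;
  return std::sqrt(dx * dx + dy * dy);
}

}  // namespace UDIMPL
```

Both functions use only double and bool in their signatures and carry the inline keyword, matching the rules above.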
If any code in ExprFunctions.hpp or ExprUtil.hpp causes a compilation error, GSQL cannot install any GSQL query, even if the query doesn't call any query UDF. Therefore, please test each new query UDF after adding it. One way of testing a function is to create a new file test.cpp and compile it:
> g++ test.cpp
> ./a.out
You might need to remove the include directive #include <gle/engine/cpplib/headers.hpp> from ExprFunctions.hpp and ExprUtil.hpp in order to compile.
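The contents of test.cpp are up to you. One possible shape, sketched under assumptions, is a small set of checks that call each UDF with known inputs. The UDF clamp_to_unit and the helper run_udf_checks below are hypothetical, and the UDF is defined in place so the sketch stands alone; a real test.cpp would include ExprFunctions.hpp instead.

```cpp
#include <iostream>

// A real test.cpp would instead contain: #include "ExprFunctions.hpp"
// The stand-in UDF below (hypothetical) keeps this sketch self-contained.
namespace UDIMPL {
inline double clamp_to_unit(double x) {
  if (x < 0.0) return 0.0;
  if (x > 1.0) return 1.0;
  return x;
}
}  // namespace UDIMPL

// Runs a few checks against the UDF and returns the number that failed.
inline int run_udf_checks() {
  int failures = 0;
  if (UDIMPL::clamp_to_unit(-2.0) != 0.0) ++failures;
  if (UDIMPL::clamp_to_unit(0.25) != 0.25) ++failures;
  if (UDIMPL::clamp_to_unit(7.0) != 1.0) ++failures;
  if (failures > 0) {
    std::cerr << failures << " UDF check(s) failed\n";
  }
  return failures;
}
```

Ending the file with `int main() { return run_udf_checks(); }` gives a program that compiles with the g++ command shown earlier and exits with a nonzero status when a check fails.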
Step 3: Upload files
After you have defined the function, use the PUT command to upload the files you modified.
The PUT command automatically uploads the files to all nodes in a cluster. Once the files are uploaded, you can call the query UDF the next time GSQL is executed, whether that means starting the GSQL shell or executing a GSQL script from a bash shell.
Example
Suppose you are working in a distributed environment and want to add a function that returns a random double between 0 and 1.
Start by downloading the current UDF file with the GET command.
In the downloaded file, add the definition of the function rng and add the necessary include directives at the top.
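One way rng could be written is sketched below, using the standard <random> header. The choice of std::mt19937 and a thread-local generator is an assumption made for illustration, not a requirement of GSQL, and the sketch opens its own UDIMPL namespace only so that it compiles in isolation.

```cpp
#include <random>  // goes with the other include directives at the top of the file

namespace UDIMPL {

// Returns a pseudo-random double drawn uniformly from the range 0 to 1.
inline double rng() {
  // Assumption: one generator per thread, seeded once from the
  // system entropy source, so repeated calls do not reseed.
  static thread_local std::mt19937 gen(std::random_device{}());
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  return dist(gen);
}

}  // namespace UDIMPL
```

The function takes no arguments and returns a double, so its signature stays within the types allowed for query UDFs.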
Lastly, use the PUT command to upload the file. This uploads the file to all nodes in the cluster.
The UDF has now been added to GSQL and you can start using the function in GSQL queries.