Workload Management
Certain TigerGraph operations, such as running online analytical processing (OLAP) queries that touch a lot of data, can be memory-intensive. TigerGraph provides the following mechanisms for you to manage workload in your TigerGraph instances.
Workload Queue
You can configure workload queues so that queries are routed to the appropriate queues at runtime. Each queue has a few properties, such as the maximum number of concurrent queries allowed and the maximum number of queries that can be queued, which help prevent system overload. You can grant workload queues to users based on their roles so that users submit queries to the appropriate workload queues.
What APIs are managed by the workload queue?
The following types of requests will be routed to either the default workload queue or the one specified by the user:
- Run installed queries.
- Interpret queries.
- Run heavy built-in queries, mostly used to "Explore Graph" in GraphStudio.
Configurations
You can toggle the workload queue feature on and off, and add, update, or delete workload queues as you need.
Put Workload Queue
POST /gsqlserver/gsql/workload-manager/configs
Upload the workload queue configs.
Request Body
The request body expects a JSON object with the following schema:
{
"isEnabled": true,
"queues": {
"OLTP": {
"description": "OLTP queries",
"isDefault": false,
"maxConcurrentQueries": 100,
"maxDelayQueueSize": 200
},
"scheduled_jobs": {
"description": "Scheduled jobs",
"maxConcurrentQueries": 10,
"maxDelayQueueSize": 20
},
"AdHoc": {
"description": "Ad-hoc queries",
"isDefault": true,
"maxConcurrentQueries": 1,
"maxDelayQueueSize": 2
}
}
}
The request body must have the following fields at the top level:
Field | Description | Data type |
---|---|---|
isEnabled | The feature flag to enable or disable the workload queue. | boolean |
queues | The map of the available workload queues. | object |
Objects under queues consist of a queue ID (key) and its properties (value).
The queue ID must be a string of fewer than 64 characters, using only alphanumeric characters and underscores.
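The queue ID rule above can be checked on the client before uploading a config. The following is a minimal sketch, not TigerGraph's own validation: the helper names are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits and that the allowed length is 1 to 63 characters.

```python
import re

# Assumption: ASCII letters, digits, and "_" only; length 1..63 ("less than 64").
QUEUE_ID_PATTERN = re.compile(r"[A-Za-z0-9_]{1,63}")


def is_valid_queue_id(queue_id):
    """Return True if queue_id satisfies the documented ID constraints."""
    return isinstance(queue_id, str) and QUEUE_ID_PATTERN.fullmatch(queue_id) is not None


def invalid_queue_ids(config):
    """List the queue IDs in a config payload that break the constraints."""
    return [qid for qid in config.get("queues", {}) if not is_valid_queue_id(qid)]
```

Calling invalid_queue_ids on the payload before sending it gives a quick local check; the server remains the authority on what it accepts.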
Each queue has the following properties:
Field | Description | Data type |
---|---|---|
description | The description of the queue. | string |
isDefault | The flag to indicate if the queue is the default queue. Must be set to | boolean |
maxConcurrentQueries | The maximum number of concurrent queries allowed in the queue. | integer |
maxDelayQueueSize | The maximum number of queries that can be queued in the delay queue. | integer |
maxConcurrentQueries and maxDelayQueueSize
maxConcurrentQueries and maxDelayQueueSize are enforced at the per-machine level. More specifically, they limit how many requests ONE GPE process can handle.
For example, in a TigerGraph cluster with 4 nodes (and therefore 4 GPE processes), the total number of queries allowed for a WorkloadQueue is 4*maxConcurrentQueries.
Similarly, the total number of queries that can be put into the corresponding delay queue is 4*maxDelayQueueSize.
TigerGraph tries to distribute queries evenly among the nodes, so the WorkloadQueue on each GPE fills at a similar pace.
The query concurrency is also limited by the number of physical cores that the machine has.
Once the configurations change, GPE must be restarted for the changes to take effect.
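The per-machine arithmetic above can be written out as a small helper. This is only an illustration of the multiplication described in the note; the function name is hypothetical, and it assumes one GPE process per node, as in the 4-node example.

```python
def cluster_capacity(gpe_count, max_concurrent_queries, max_delay_queue_size):
    """Cluster-wide totals for one workload queue, given per-GPE limits."""
    return {
        "running": gpe_count * max_concurrent_queries,
        "delayed": gpe_count * max_delay_queue_size,
    }


# The OLTP queue from the sample config on a 4-node cluster.
print(cluster_capacity(4, 100, 200))
```

These totals are upper bounds; as noted above, concurrency is also limited by the physical cores available on each machine.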
Examples
To modify the whole config:
curl -X POST -u tigergraph:tigergraph \
<hostname>:<nginx-port>/gsqlserver/gsql/workload-manager/configs \
-d '{"isEnabled":true,"queues":{"OLTP":{"description":"OLTP queries","isDefault":false,"maxConcurrentQueries":100,"maxDelayQueueSize":200},"scheduled_jobs":{"description":"Scheduled jobs","maxConcurrentQueries":10,"maxDelayQueueSize":20},"AdHoc":{"description":"Ad-hoc queries","isDefault":true,"maxConcurrentQueries":1,"maxDelayQueueSize":2}}}'
To toggle only the feature flag, omit queues:
curl -X POST -u tigergraph:tigergraph \
<hostname>:<nginx-port>/gsqlserver/gsql/workload-manager/configs \
-d '{"isEnabled":true}'
To add, delete, or update the queues while leaving the feature flag untouched, omit isEnabled:
curl -X POST -u tigergraph:tigergraph \
<hostname>:<nginx-port>/gsqlserver/gsql/workload-manager/configs \
-d '{"queues":{"OLTP":{"description":"OLTP queries","isDefault":false,"maxConcurrentQueries":100,"maxDelayQueueSize":200},"scheduled_jobs":{"description":"Scheduled jobs","maxConcurrentQueries":10,"maxDelayQueueSize":20},"AdHoc":{"description":"Ad-hoc queries","isDefault":true,"maxConcurrentQueries":1,"maxDelayQueueSize":2}}}'
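The three request bodies above differ only in which top-level fields they include. A minimal sketch of building such a body programmatically is shown below; the function name is hypothetical, and the code only produces the JSON string and does not send it.

```python
import json


def build_config_payload(is_enabled=None, queues=None):
    """Build a request body that contains only the fields to be changed."""
    payload = {}
    if is_enabled is not None:
        payload["isEnabled"] = bool(is_enabled)
    if queues is not None:
        payload["queues"] = queues
    if not payload:
        raise ValueError("specify is_enabled, queues, or both")
    # Compact separators match the style of the curl examples.
    return json.dumps(payload, separators=(",", ":"))


# Toggle only the feature flag.
print(build_config_payload(is_enabled=True))
```

The resulting string can be passed as the value of the -d option in the curl commands above.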
Response Status Codes
Status Code | Description |
---|---|
200 | The queue configs have been uploaded successfully. |
400 | The payload is ill-formed. |
403 | The user doesn’t have the privilege. |
GSQL Command
From a local file:
PUT WORKLOAD QUEUE FROM "/path/to/queue.json"
From a raw string:
PUT WORKLOAD QUEUE FROM "{\"queues\":{\"OLTP\":{\"description\":\"OLTP queries\",\"isDefault\":false,\"maxConcurrentQueries\":100,\"maxDelayQueueSize\":200},\"scheduled_jobs\":{\"description\":\"Scheduled jobs\",\"maxConcurrentQueries\":10,\"maxDelayQueueSize\":20},\"AdHoc\":{\"description\":\"Ad-hoc queries\",\"isDefault\":true,\"maxConcurrentQueries\":1,\"maxDelayQueueSize\":2}}}"
Get Workload Queue
GET /gsqlserver/gsql/workload-manager/configs
Dump the queue configs. The response is equivalent to the payload for POST.
The purpose of this API is to retrieve the active configs so that you can modify them and upload the result.
For purposes other than administration, you can use GET WORKLOAD QUEUE instead.
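The retrieve-then-modify workflow described above can be sketched as a pure function that takes the dumped config and returns an updated copy, ready to be uploaded with POST. The function name is hypothetical, and the sketch assumes the dump has the same shape as the POST payload, as stated above.

```python
import copy


def with_queue_property(config, queue_id, name, value):
    """Return a copy of a dumped config with one queue property changed."""
    if queue_id not in config.get("queues", {}):
        raise KeyError(f"unknown queue: {queue_id}")
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated["queues"][queue_id][name] = value
    return updated


dumped = {
    "isEnabled": True,
    "queues": {"AdHoc": {"isDefault": True, "maxConcurrentQueries": 1, "maxDelayQueueSize": 2}},
}
print(with_queue_property(dumped, "AdHoc", "maxConcurrentQueries", 5))
```

Working on a copy keeps the original dump available for comparison or rollback.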
Example Request
curl -X GET -u tigergraph:tigergraph \
<hostname>:<nginx-port>/gsqlserver/gsql/workload-manager/configs
Permissions
You can grant workload queues to, or revoke them from, a user based on their user name, groups, and/or roles.
Grant/Revoke Workload Queue
POST /gsqlserver/gsql/workload-manager/permission
Grant a workload queue to, or revoke it from, users, groups, and/or roles.
Request Body
The request body expects a JSON object with the following schema:
{
"OLTP": {
"granted": {
"USER": [],
"GROUP": ["*"],
"ROLE": ["r1", "r2"]
}
}
}
The request body must have the following fields at the top level:
Field | Description | Data type |
---|---|---|
action | | |
queue | The ID of the queue to be granted or revoked. | |
user | The list of the user names to be granted/revoked. | |
group | The list of the group names to be granted/revoked. | |
role | The list of the role names to be granted/revoked. | |
TIP: You can use the wildcard "*" to grant or revoke the queue for all users, groups, or roles. Note that "*" must be the only entry in the list when it is used.
Example Request
Grant the queue OLTP to the users u1 and u2:
curl -X POST -u tigergraph:tigergraph \
<hostname>:<nginx-port>/gsqlserver/gsql/workload-manager/permission \
-d '{"action": "grant", "queue": "OLTP", "user": ["u1", "u2"]}'
Revoke the queue scheduled_jobs from all users and the role r1:
curl -X POST -u tigergraph:tigergraph \
<hostname>:<nginx-port>/gsqlserver/gsql/workload-manager/permission \
-d '{"action": "REVOKE", "queue": "scheduled_jobs", "user": "*", "role": ["r1"]}'
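A request body like the ones above can be assembled with a small helper that also enforces the wildcard rule from the tip. This is a sketch under assumptions: the function name is hypothetical, the "group" key is inferred by analogy with "user" and "role" in the examples, and wildcards are always emitted as a one-element list.

```python
import json


def build_permission_payload(action, queue, user=None, group=None, role=None):
    """Build a grant/revoke body; "*" must be the only entry in a list."""
    payload = {"action": action, "queue": queue}
    # Assumption: "group" mirrors the "user" and "role" keys seen in the examples.
    for key, names in (("user", user), ("group", group), ("role", role)):
        if names is None:
            continue
        names = [names] if isinstance(names, str) else list(names)
        if "*" in names and len(names) != 1:
            raise ValueError(f'"*" must be the only entry in the {key} list')
        payload[key] = names
    return json.dumps(payload)


print(build_permission_payload("grant", "OLTP", user=["u1", "u2"]))
```

Rejecting a mixed list locally avoids a round trip that would presumably end in a 400 response.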
Response Status Codes
Status Code | Description |
---|---|
200 | The queue has been granted/revoked successfully. |
400 | The payload is ill-formed so none of the given entities could be granted/revoked. |
403 | The user doesn’t have the privilege. |
GSQL Command
# GRANT
GRANT WORKLOAD QUEUE OLTP TO USER u1, u2
GRANT WORKLOAD QUEUE OLTP TO GROUP g1, g2
GRANT WORKLOAD QUEUE OLTP TO ROLE r1, r2
GRANT WORKLOAD QUEUE OLTP TO ALL USERS
GRANT WORKLOAD QUEUE OLTP TO ALL GROUPS
GRANT WORKLOAD QUEUE OLTP TO ALL ROLES
# REVOKE
REVOKE WORKLOAD QUEUE OLTP FROM USER u1, u2
REVOKE WORKLOAD QUEUE OLTP FROM GROUP g1, g2
REVOKE WORKLOAD QUEUE OLTP FROM ROLE r1, r2
REVOKE WORKLOAD QUEUE OLTP FROM ALL USERS
REVOKE WORKLOAD QUEUE OLTP FROM ALL GROUPS
REVOKE WORKLOAD QUEUE OLTP FROM ALL ROLES
Unlike the REST API, the GSQL commands don’t allow you to specify USER, GROUP, and ROLE in a single command. You must use a separate command for each entity type.
Show Workload Queue
GET /gsqlserver/gsql/workload-manager/permission
Show info on a specific workload queue or on all queues.
Query Parameters
Parameter | Description | Data type |
---|---|---|
id | The ID of the queue to be shown. If not specified, all queues will be shown. | string |
Example Request
To retrieve the permission info of the queue OLTP:
curl -X GET -u tigergraph:tigergraph \
"localhost:14240/gsqlserver/gsql/workload-manager/permission?id=OLTP"
Example Response
The response will be the combination of configs and permission, e.g.
{
"OLTP": {
"description": "OLTP queries",
"isDefault": false,
"maxConcurrentQueries": 100,
"maxDelayQueueSize": 200,
"granted": {
"USER": [],
"GROUP": ["*"],
"ROLE": ["r1", "r2"]
}
}
}
List Workload Queue
GET /restpp/workload-manager/queue
List all workload queues granted to the current user, so that the user can choose the appropriate queue from the list.
Example Request
curl -X GET -u tigergraph:tigergraph \
<hostname>:<nginx-port>/restpp/workload-manager/queue
Use Cases
Suppose we have configured the following workload queues, shown here as the output of the SHOW WORKLOAD QUEUE command:
{
"OLTP": {
"description": "OLTP queries",
"isDefault": true,
"maxConcurrentQueries": 100,
"maxDelayQueueSize": 100,
"granted": {
"USER": [],
"GROUP": ["g1", "g2"],
"ROLE": []
}
},
"scheduled_jobs": {
"description": "Scheduled jobs",
"maxConcurrentQueries": 5,
"maxDelayQueueSize": 0,
"granted": {
"USER": ["u1"],
"GROUP": [],
"ROLE": ["r1"]
}
},
"AdHoc": {
"description": "Ad-hoc queries",
"isDefault": false,
"maxConcurrentQueries": 10,
"maxDelayQueueSize": 10,
"granted": {
"USER": [],
"GROUP": ["g3"],
"ROLE": ["r2"]
}
}
}
Running a Query
When running a query, you can specify the workload queue to run the query on.
If the queue is not specified, the query will be routed to the default queue.
To specify the queue in the GSQL shell, use the -queue option, e.g.
RUN QUERY -queue AdHoc q1()
Alternatively, use the HTTP header Workload-Queue:
curl -X POST -u tigergraph:tigergraph \
-H "Workload-Queue: AdHoc" \
"<hostname>:14240/restpp/query/ldbc_snb/q1"
If the given queue is not granted to the current user, the query is rejected with the error code REST-14000 and the HTTP status 422 Unprocessable Entity.
For example, if the user tigergraph, who neither belongs to the group g3 nor holds the role r2, tries to run a query on the queue AdHoc, the query is rejected.
If the queue is at full capacity, the query is rejected.
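The access rule in this use case can be approximated on the client with the "granted" block from the SHOW WORKLOAD QUEUE output. The sketch below is not the server's implementation: the function name is hypothetical, and it assumes a queue is accessible when the user name, any of the user's groups, or any of the user's roles is listed, with "*" matching every entity of that kind.

```python
def is_queue_granted(queue_info, user, groups=(), roles=()):
    """Guess whether a user may submit to a queue, based on its "granted" block."""
    granted = queue_info.get("granted", {})

    def matches(kind, names):
        allowed = granted.get(kind, [])
        # Assumption: "*" grants the queue to every entity of this kind.
        return any(name in allowed or "*" in allowed for name in names)

    return matches("USER", [user]) or matches("GROUP", groups) or matches("ROLE", roles)


adhoc = {"granted": {"USER": [], "GROUP": ["g3"], "ROLE": ["r2"]}}
print(is_queue_granted(adhoc, "tigergraph"))
```

A check like this can pick a queue before submitting, but only the server's response is authoritative.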
Monitoring
You can use the following API to check the status of the workload queues for monitoring purposes.
Check Running Queries
POST /restpp/workload-manager/queuestatus
Return the status of the given workload queue on each GPE instance.
Request Body
Field | Description | Data type |
---|---|---|
queuelist (optional) | The list of the IDs of the workload queues. If not specified, all queues will be shown. | list of strings |
mode (optional) | | string |
For the mode field, if stats is specified, the response gives only the numbers of queries waiting and running. If verbose is specified, the response includes the request IDs of the queries that are waiting and running.
If the request body is not provided, the response is generated as if both fields use their default values.
Example Request
curl -X POST -u tigergraph:tigergraph \
<hostname>:<nginx-port>/restpp/workload-manager/queuestatus \
-d '{"queuelist": ["AdHoc"], "mode": "verbose"}'
Example Response
{
"version": {
"edition": "enterprise",
"api": "v2",
"schema": 0
},
"error": false,
"message": "Completes",
"WorkloadQueueStatusByInstances": [
{
"version": {
"edition": "enterprise",
"api": "v2",
"schema": 0
},
"error": false,
"message": "",
"results": {
"GPE_2_1": [
{
"WorkloadQueueName": "AdHoc",
"maxConcurrentQueries": 1,
"maxDelayQueueSize": 2,
"runningQueries": [
"196702.RESTPP_1_1.1707799387957.N"
],
"delayQueries": [
"65630.RESTPP_1_1.1707799387958.N"
]
}
]
}
},
{
"version": {
"edition": "enterprise",
"api": "v2",
"schema": 0
},
"error": false,
"message": "",
"results": {
"GPE_1_1": [
{
"WorkloadQueueName": "AdHoc",
"maxConcurrentQueries": 1,
"maxDelayQueueSize": 2,
"runningQueries": [
"94.RESTPP_1_1.1707799387957.N"
],
"delayQueries": [
"131167.RESTPP_1_1.1707799387959.N"
]
}
]
}
}
],
"code": "REST-0000"
}
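For monitoring, a verbose response like the one above can be reduced to per-queue totals across all GPE instances. The following is a minimal sketch with a hypothetical function name; it assumes the verbose layout shown in the example, where runningQueries and delayQueries are lists of request IDs.

```python
from collections import defaultdict


def summarize_queue_status(response):
    """Count running and delayed queries per workload queue across all GPEs."""
    totals = defaultdict(lambda: {"running": 0, "delayed": 0})
    for instance in response.get("WorkloadQueueStatusByInstances", []):
        # "results" maps a GPE instance name (e.g. GPE_1_1) to its queue list.
        for queues in instance.get("results", {}).values():
            for queue in queues:
                entry = totals[queue["WorkloadQueueName"]]
                entry["running"] += len(queue.get("runningQueries", []))
                entry["delayed"] += len(queue.get("delayQueries", []))
    return dict(totals)
```

Comparing these totals with the number of GPEs multiplied by maxConcurrentQueries and maxDelayQueueSize shows how close a queue is to rejecting new queries.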
Other Query Concurrency Control Methods
Limit the number of concurrent built-in heavy queries
This configuration is deprecated as of TigerGraph 3.10.0 and will be removed in a future release. It is ignored once the workload queue feature is enabled.
TigerGraph has a few built-in queries that are memory-intensive, here referred to as "heavy".
These queries tend to be invoked by applications such as GraphStudio.
You can set a limit on how many of these heavy queries are allowed to run concurrently by configuring the parameter RESTPP.WorkLoadManager.MaxHeavyBuiltinQueries with the gadmin config command.
For example, to set the maximum number of heavy built-in queries to 10, run the following command:
$ gadmin config set RESTPP.WorkLoadManager.MaxHeavyBuiltinQueries 10
You must restart the RESTPP service for the change to take effect.
Limit number of concurrent queries
This configuration is deprecated as of TigerGraph 3.10.0 and will be removed in a future release. It is ignored once the workload queue feature is enabled.
You can use the RESTPP.WorkLoadManager.MaxConcurrentQueries parameter to set a limit on how many queries are allowed to run concurrently.
The count of these queries does not include the built-in heavy queries.
For example, to specify that there can only be 50 concurrent queries at a time, excluding the heavy built-in queries, change the value of the configuration parameter to 50 with the gadmin config command:
$ gadmin config set RESTPP.WorkLoadManager.MaxConcurrentQueries 50
If the maximum number of concurrent queries is reached, newly submitted queries are placed in a delay queue and begin to run as the currently running queries finish.
If the queue is at capacity, newly submitted queries are rejected, and you need to wait until there is capacity before running the query again.
You can adjust the size of the queue with the configuration parameter RESTPP.WorkLoadManager.MaxDelayQueueSize.
For example, to specify that a maximum of 20 queries may remain in the queue, run the following command:
$ gadmin config set RESTPP.WorkLoadManager.MaxDelayQueueSize 20
You must restart the RESTPP service for the change to take effect.
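The run, delay, or reject behavior described in this section can be modeled as a three-way decision. This is a simplified sketch of the documented behavior, not the RESTPP implementation; the function name and return values are hypothetical.

```python
def admit(running, delayed, max_concurrent_queries, max_delay_queue_size):
    """Decide what happens to a newly submitted query."""
    if running < max_concurrent_queries:
        return "run"  # a concurrency slot is free
    if delayed < max_delay_queue_size:
        return "delay"  # wait in the delay queue for a slot
    return "reject"  # both the slots and the delay queue are full


# With the example limits above (50 concurrent, 20 delayed).
print(admit(50, 19, 50, 20))
```

With these limits, the 71st query submitted while none have finished would be rejected.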
Specify number of threads used by a query
You can specify the limit of the number of threads that can be used by one query through the Run Query REST endpoint.
For example, to specify a limit of four threads that can be used by a query, use the GSQL-THREAD-LIMIT header and set its value to 4:
curl -X POST -H "GSQL-THREAD-LIMIT: 4" -d '{"p":{"id":"Tom","type":"person"}}' "http://localhost:9000/query/social/hello"
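The per-request headers described on this page (Workload-Queue, GSQL-THREAD-LIMIT, and GSQL-REPLICA) can be collected in one place before a query is sent. The helper below is a hypothetical convenience, not part of any TigerGraph client library; it only builds the header dictionary, and the check that the values are at least 1 is an assumption.

```python
def query_headers(queue=None, thread_limit=None, replica=None):
    """Build the optional HTTP headers for a Run Query request."""
    headers = {}
    if queue is not None:
        headers["Workload-Queue"] = queue
    for name, value in (("GSQL-THREAD-LIMIT", thread_limit), ("GSQL-REPLICA", replica)):
        if value is None:
            continue
        if int(value) < 1:  # assumption: both values are positive integers
            raise ValueError(f"{name} must be at least 1")
        headers[name] = str(int(value))
    return headers


print(query_headers(queue="AdHoc", thread_limit=4))
```

The returned dictionary can be passed to any HTTP client, for example as the headers argument of urllib.request.Request.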
Specify replica to run query on
On a distributed cluster, you can specify on which replica you want a query to be run through the Run Query REST endpoint.
For example, to run the query on the primary cluster, use the GSQL-REPLICA header when running a query and set its value to 1:
curl -X POST -H "GSQL-REPLICA: 1" -d '{"p":{"id":"Tom","type":"person"}}' \
"http://localhost:9000/query/social/hello"
Query Routing Schemes
In a distributed or replicated cluster, REST++ automatically routes queries to different GPEs, in order to spread the workload.
If the GSQL-REPLICA header is used when invoking a query, this header overrides the routing scheme for that query.
Round Robin routing
The default query routing scheme is round-robin. The first query is managed by GPE 0, the next query by GPE 1, and so on. After the last GPE, the cycle returns to GPE 0.
Version 3.9.3 adds a system configuration parameter, RESTPP.CPULoadAware.Mode, that lets system administrators select other query routing schemes:
- Mode = 0 (default): Round-Robin
- Mode = 1: CPU Load Aware
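The default round-robin scheme can be illustrated in a few lines. This is a generic sketch of round-robin selection over GPE indices, not REST++ code; the function name is hypothetical.

```python
from itertools import cycle


def round_robin(gpe_count):
    """Yield GPE indices 0, 1, ..., gpe_count - 1, then start over."""
    return cycle(range(gpe_count))


router = round_robin(3)
print([next(router) for _ in range(5)])  # [0, 1, 2, 0, 1]
```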
CPU Load Aware Query Routing
When this query routing mode is selected, REST++ tries to direct incoming queries to the GPEs that are currently less busy.
Specifically, the system periodically polls CPU usage data to find a GPE whose CPU usage percentage is below RESTPP.QueryRouting.TargetSelectionCPUThreshold (default 50).
If no GPE satisfies the CPU threshold condition, REST++ falls back to the default behavior (round-robin selection).
$ gadmin config set RESTPP.QueryRouting.TargetSelectionCPUThreshold 40
$ gadmin config set RESTPP.QueryRouting.Mode 1
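CPU-load-aware selection with a round-robin fallback can be sketched as follows. This is an illustration of the idea rather than the REST++ algorithm: the function names are hypothetical, and choosing the least busy GPE among those below the threshold is an assumption, since the text above only says that a GPE below the threshold is chosen.

```python
from itertools import cycle


def make_router(gpe_count, cpu_threshold=50):
    """Return a chooser that prefers GPEs whose CPU usage is below the threshold."""
    fallback = cycle(range(gpe_count))  # round-robin when every GPE is busy

    def choose(cpu_usage):
        """cpu_usage: the latest polled CPU percentage of each GPE, by index."""
        candidates = [i for i, usage in enumerate(cpu_usage) if usage < cpu_threshold]
        if candidates:
            # Assumption: pick the least busy candidate.
            return min(candidates, key=lambda i: cpu_usage[i])
        return next(fallback)

    return choose


choose = make_router(3, cpu_threshold=40)
print(choose([80, 30, 35]))  # 1
```

When every GPE is above the threshold, successive calls cycle through the GPE indices in order, mirroring the default scheme.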