1 of 9

Release Notes, FAQs and Troubleshooting

Release Notes - TigerGraph Server 3.2

Release Date: Sep 30th, 2021

Release Notes for Previous Versions:

Release Notes - TigerGraph 3.1
Release Notes - TigerGraph 3.0
Release Notes - TigerGraph 2.6
Release Notes - TigerGraph 2.5
Release Notes - TigerGraph 2.4
Release Notes - TigerGraph 2.3
Release Notes - TigerGraph 2.2
For v2.1 and older, contact TigerGraph Support

For the running log of bug fixes, see the Change Log.

Compatibility with TigerGraph 3.1

The following changes were made to the built-in roles in TigerGraph's Role-based Access Control

The built-in role queryreader can no longer run queries that include updates to the database.
- To emulate the old queryreader role, create a role with all queryreader privileges, and also grant the WRITE_DATA privilege to the new role to allow users with the role to run queries that update the graph.
The built-in role admin can no longer create users
- To emulate the old admin role, create a global role with all admin privileges, and also grant the WRITE_USER privilege to the new role to allow users with the role to create users.

Compatibility with TigerGraph 2

Major revisions (e.g., from TigerGraph 2 to TigerGraph 3) are the opportunity to deliver significant improvements. While we make every effort to maintain backward compatibility, in selected cases APIs have changed or deprecated features have been dropped, in order to advance the overall product.

Data migration: A tool is available to migrate the data in TigerGraph 2.6 to TigerGraph 3.0. Please contact TigerGraph Support for assistance.

Query and API compatibility:

Some gadmin syntax has changed. Notably. gadmin set config is now gadmin config set. Please see Managing with gadmin.
Some features which were previously deprecated have been dropped. Please see V3.0 Removal of Previously Deprecated Features for a detailed list.

New Features

User-defined Roles

Privileges are introduced as the atomic unit for managing database user access. Database administrators can now define their own roles with customizable collections of privileges.

For details on this feature, see User Access Management.

Cross-region replication

TigerGraph 3.2 provides complete native support for all data and metadata cross-region replication including automated schema changes, user and query management.

For details on this feature, see Cross-Region Replication.

Containerization of services

TigerGraph 3.2 allows users to deploy TigerGraph single servers and clusters on Kubernetes. Running applications in containers on Kubernetes provides rapid spin-up and repeatability across environments.

For details on this feature, see Kubernetes.

Cluster resizing

TigerGraph 3.2 provides built-in cluster management features that allow users to expand, shrink, and repartition their TigerGraph clusters.

For details on this feature, see Cluster resizing.

Workload Management

TigerGraph 3.2 provides the GSQL-REPLICA and GSQL-THREAD-LIMIT header to specify the replica for a query to run on and a thread limit that a query is allowed to use.

For details on this feature, see Run a query.

File output policy

Starting with TigerGraph 3.2, GSQL has a file output policy that contains a whitelist and a blacklist. GSQL queries will only write to the whitelist and are forbidden from writing to the blacklist.

Log viewing with Elasticsearch, Kibana, and Filebeat

Starting with TigerGraph 3.2, TigerGraph provides a gadmin command that can generate Filebeat configuration files for a TigerGraph cluster. Read our step-by-step guide to set up Elasticsearch, Kibana, and Filebeat to view TigerGraph logs.

General Improvements

Expanded library of built-in GSQL functions

TigerGraph 3.2 added over 30 built-in functions to the GSQL query language.

Relaxed variable declaration rules

Starting in TigerGraph 3.2, base type variables and accumulators can be declared anywhere in a query and are block-scoped.

Starting in TigerGraph 3.2, variables and accumulators can be initialized with expressions.

Parameter passing with JSON

Starting in TigerGraph 3.2, users can pass in parameters to a GSQL query with a JSON object.

Query installation improvements

TigerGraph 3.2 implemented the following improvements for query installation:

Updating a subquery will no longer require reinstalling all dependent queries
Schema change will no longer trigger reinstalling all queries of the graph
Concurrent query installation between graphs is now supported
When installing queries on a cluster, TigerGraph will now utilize the computing power of multiple nodes to compile the queries, greatly improving installation performance

WCAG-compliant accessibility features

The user interfaces of GraphStudio and Admin Portal - TigerGraph’s GUI are improved to meet WCAG accessibility criteria. More users across a wider range of physical abilities will now be able to work effectively with GraphStudio and the visual Admin Portal.

Edit mode for graph exploration in GraphStudio

Starting with TigerGraph 3.2, users need to enter Edit Mode in the Graph Exploration Panel in order to write to the graph.

Enhanced cluster service status for AdminPortal

Cluster service status is moved from the footer of the Admin Portal page to the bottom of the navigation menu.

Improved GSQL auto-complete and syntax highlighting in GraphStudio

We made improvements to GraphStudio's auto-complete and syntax highlighting features so users have a better experience writing and editing queries in GraphStudio.

Substring search within graph exploration result in GraphStudio

Starting with TigerGraph 3.2, users can search substring to locate the vertices in addition to using exact match.

Anonymous tuples as return type in subqueries

Starting in TigerGraph 3.2, subqueries in GSQL can return an anonymous tuple or a collection of anonymous tuples.

Function overloading

Starting in TigerGraph 3.2, function overloading is now available. Query UDFs with the same name but different signatures can be defined in the UDF library.

Known Issues

Applications

GraphStudio
- Known Issues for GraphStudio
- The No-Code Data Migration feature is in Alpha release. Your feedback would be appreciated.
- The No-Code Visual Query Builder is in Beta release. Your feedback would be appreciated.
AdminPortal
- Known Issues for AdminPortal

GSQL

Multiple (Conjunctive) Path Patterns:
- There are no known functional problems, but the performance has not be optimized. Your feedback would be appreciated.
DML type check error in V2 Syntax:
- GSQL will report a wrong type check error for Query block with multiple POST-ACCUM clauses and Delete/Update attribute operation.
Turn on GSQL HA manually when upgrading from 3.0.x
- Users who are upgrading from 3.0.X need to manually start GSQL HA service. Please reach out to support for help with the process documented in: https://tigergraph.freshdesk.com/a/solutions/articles/5000865072
Stale data visible after Deletes using index
- Queries that use secondary index may still see the vertices being deleted until after the snapshots are fully rebuilt.

Change Log

This page will document all the changes to TigerGraph product including New Features and Bug Fixes.

Distributed Graph support and certain other enterprise-level features are available in the Enterprise Edition only. They do not pertain to the Developer Edition.

TigerGraph 3.2.1

Changes

Increased limit for graph catalog size
Added retention for Metadata topic in Kafka
Improved error handling when retrieving patterns from server in Visual Query Builder

Fixes

Fixed an issue during upgrade related to initiating Kafka
Fixed a bug that in rare cases caused a catalog size limit issue
Fixed an issue that slowed queries that write to files in 3.2.0
Fixed a bug that in rare cases caused query compilation issues

TigerGraph 3.2

Release Date: 2021-09-30

Features

Check release notes: 3.2.0 Release notes

Changes:

GSQL:

ADD Edge Pair commands as part of Schema Change operations are now allowed.
Query-calling-query limitation: Distributed main query cannot call a distributed sub-query.
Default logging level for GSQL logs has been changed from DEBUG to INFO

Enhancements

Database Server:

Core: Improve Abort transaction if the transaction is too long
Core: GPE hung under high number of concurrent queries
Core: Turn on transaction for RestPP Post for atomicity
Core: Workload management: Specify replica for a query to run in a distributed cluster
Core: Standardize the correct http response code for query requests
GSQL: Query Installation Performance Improvements
- Support longer reload time for Query installation on 1000’s of queries
- Don't drop parent queries when subquery is installed
GSQL: Support for new built-in functions:
- Math-related functions: round(), reverse(), repeat(), insert(), cot(), degrees(), radians, square, truncate, log2
- String-related functions instr(), length(), substr(), PI(), rand(), lpad(), rpad(), replace(), ascii(), chr(), soundex(), difference(), translate(), space(), ltrim(), rtrim(), find_in_set(), left(), right();
GSQL: Support FROM/TO vertex type change for edge type metadata
GSQL: Support VLAC tags in Import and Export operations
GSQL: Allow variable declaration anywhere in query body
GSQL: Support initialization from an expression;
GSQL: Support revoking superuser role from default user
GSQL: Improve the error message displayed when connecting to LDAP server
Platform: Upgrade to Java 11
Platform: Add support for ubuntu20
Platform: Show executor status and updated status of other services
Platform: Run upgrade locally without ssh if user is local with only a single node
Platform: Start/stop local executor will no longer need ssh,
Platform: Increase Backup/Restore S3 upload Partition Size
Platform: Make Backup/Restore Heartbeat timeout configurable to allow media with slower speeds.

GraphStudio:

WCAG compliance changes
Support overwriting exploration result
Support duplicate file-edge mappings and fix setSelection error;
Add graph information and variable names to auto-complete list;

Admin Portal:

WCAG changes
Support Privilege based management
Improve unauthorized access warning popup message
Display secrets table for each graph

Fixed

Database Server:

Core: Kafka loader should exit gracefully
Core: GPE crash if the request specifies an invalid replica
Core: Health check for 1 mins in RESTPP startup
Core: Fixed file loading failed due to OOM
Core: Fixed no error message when edge does not exist
Core: Fixed issue with deleted_vertex_check API after dropping vertex type;
GSQL: LDAP user privilege parsing missed authorization checks
GSQL: Fixed rhs check issue for direct interpret query;
GSQL: Fixed print Vset issue with vertex accum declaration order;
GSQL: Added semantic checker for rhs with the same name;
GSQL: Export fails due to mismatching token of an unexpected graph
GSQL: Fixed wrong name when looking up variable from global
GSQL: Fix datetime_format function not working for v2 syntax
GSQL: The result of printing string differs in interpret mode and installed mode
GSQL: Fixed issue with Order by for interpret query
GSQL: Fix to handle abort while adding queries if a concurrent delete fails
Platform: Service status for KAFKA is down when one zookeeper server offline
Platform: Fix for Admin log rotation time issue

GraphStudio:

Addressed Schema change logic for reversed edge
Fix for privilege based access control issue
Fix for loading job information migration failure
Remove loading job log on export;
Remove graphName from loading job information interface;
Use authorization token in header instead of logging in;
Send heartbeat to keep client connection alive

TigerGraph 3.1.6

Release Date: 2021-08-09

Fixed

Application

Configuration for light or dark mode in GraphStudio/Admin Portal
Multiple maps from a single file to an edge are indistinguishable
GraphStudio: Implement responsive design for all sizes of screens
GraphStudio: Rearrange elements to avoid overlay in small screen
GraphStudio: Support toolbar button announcement for screen readers
GraphStudio: Support keyboard shortcut for focusing elements within working panels

TigerGraph 3.1.5

Release Date: 2021-07-23

Fixed

Database Server

Core: GPE on DR cluster stuck in warm up state after failover due to invalid requests
GSQL: Prevent QueryReader role to run any graph updates query
GSQL: Validation script to check schema consistency issue
Platform: Increase in proxy request buffer size for NGINX
Platform: Change in GRPC maximum message size for GBAR backup of catalog data

Application

GraphStudio: Reuse controller connections to avoid running out of used ports
GraphStudio: Remove "change layout" button in toolbar in Visual Editor

TigerGraph 3.1.4

Release Date: 2021-07-01

Enhancements

GSQL: \requesttoken API can be used to create authorization tokens using User name/password in addition to secret.
GSQL: Secrets created without alias will be assigned a system-generated alias so that they can be dropped
Platform: Nginx upgrade from 1.18.0 to 1.21.0
Platform: Backup/Restore configuration improvements to allow use of slower HDD media for storage
GraphStudio: UI enhancements to support WCAG compliance

Fixed

Database Server

Core: GPE need to verify catalog updates after new schema changes are applied
Core: Running Louvain algorithm as a distributed query crashed GPE due to unnecessary vertex activation
Core: Backup failed with WaitForDeltaToBeProcessed timeout
Core: Updated log messages to reference /deleted_vertex_check endpoint in RESTPP correctly
GSQL: Fix schema consistency issues due to duplicate Vertex/Edge type names
GSQL: Fix for schema consistency issue due to GPE referencing a dropped Vertex
GSQL: Additional semantic check for local schema change job to prevent schema inconsistency
GSQL: Error when making schema changes using UI/ Install all queries fails
GSQL: Inconsistency between GSQL and GPE catalog data after ‘Drop graph’ fails
GSQL: ‘From’ clause missing from delete loading jobs when Export Graph command is run
GSQL: Query installation will fail due to wrong order of arguments in PRINT statement
GSQL: “Incompatible argument types for function/tuple evaluate" error when using evaluate without second argument on v2 syntax
GSQL: Designer Role unable to run a query in Interpret Mode
Platform: Updates to Nginx templates for security updates
Platform: Change in default value for UI request timeout to 3600

Application

GraphStudio: Vertex and Edge statistics generation optimization to avoid Cluster CPU usage spike
GraphStudio: Unexpected error when dropping edge with reversed edge
GraphStudio: Fix for failure to migrate loading job info from 3.0.x to 3.1.2+

TigerGraph 3.1.3

Release Date: 2021-06-05

Enhancements

GraphStudio

Theme color adjustment to meet Web Content Accessibility Guidelines(WCAG).
Support responsive page layout for "Home" page, "Load Data" page and "Write Queries" page.
Add information transcripts for visualization areas in each page.
Add keyboard navigation in graph charts.
Improve tabbing capability and tabbing order.
Improve element status announcement.
Add headings for the entire application.
Add aria-labels for the entire application to meet WCAG compliance.
Add captions for all table elements.

AdminPortal

Theme color adjustment to meet WCAG compliance.

TigerGraph 3.1.2

Release Date: 2021-05-20

Features

SQL to GSQL translation for Enterprise BI tools like Tableau and Power BI
- This enriches data visualization tools with graph-enabled dashboards

Enhancements

Core: Increase the maximum allowed size of Vertex/Edge delta files to allow larger number of updates for write-heavy applications.
GSQL: Support for more than 10K elements in a Set<> of a query parameter
GSQL: Support VertexAccessControl Tags in DBImportExport

Fixed

Database Server

Core: Pick the latest version of GPE data for backup
GSQL: datetime attribute type in a schema-level user-defined tuple translated as int32_t
GSQL: NullPointerException when handle VSet variable in nested if statement
GSQL: NullPointerException when using multiple POST-ACCUM clauses
GSQL: INSERT statement with non-existent edge does not report error in V1 syntax
GSQL: GSQL does not produce type error when inserting non-existent edge with vertices from query parameters
GSQL: NoSuchElementException when using a non-existent edge on INSERT statement
GSQL: Lexical error when a newline is followed by an exclamation mark (!) in a string
GSQL: Printing string with newline fails compilation
GSQL: Refresh RESTPP Token: output and default lifetime is not correct
GSQL: Multiplicity propagation ACCUM clauses should reset only if the block is within a loop
GSQL: Create user don't allow an empty password
GSQL: Pattern match - propagation accumulator values not cleared
GSQL: Push-down error reported for non-alias expressions
GSQL: Support TAGS in DBImportExport
GSQL: Fix TokenBank compilation slowdown
Platform: Graceful handling of port used by Executor component
Platform: Got failed to authenticate with GSQL server error when login with SSO on tg3.1.1
Platform: Remove gsql password printing

GraphStudio

The loading data status is incorrect while import a solution
Imported solution with no modification, should not ask user to publish Data mapping.
Failed to overwrite datafile in Map Data to Graph

AdminPortal

Display of secrets on AdminPortal - User management should be paginated.

TigerGraph 3.1.1

Release Date: 2021-04-02

Changes:

Change BY(OR|OVERWRITE) syntax to BY OR|OVERWRITE for explicit tag creation
Changed name of 'dbsanitycheck' endpoint to 'deleted_vertex_check'

Enhancements

Database Server

Core: Improved throttling mechanism for Updates when memory usage has hit critical threshold
Core: Improved reliability of transferring in-memory data to on-disk within GSE
Core: Logging improvements to support both time-based and size-based configuration for all the component logs
Fixes/Enhancements for Vertex Level Access Control feature
- GSQL: Performance improvement for tag creation only operations
- GSQL: Make tag description optional
- GSQL: Block altering taggable value of global vertex if being used in tag based graph
- GSQL: Show tag expression of tag graphs in base graph “ls” command
- GSQL: Allow vertex taggable property to be updated even if it is currently being used in a tag-based graph
GSQL: Support for accumulators in table-style SELECT clause expression lists
GSQL Query syntax extensions for table support
GADMIN: Allow script to be used to configure LDAP TrustStore Path
Platform: Security enhancement to allow HTTPS traffic only access securely through dedicated interfaces when SSLis enabled.
Platform: Upgrade grpc to 1.33.0

GraphStudio

Add a * in the label of a data source if the loading job is changed
Return detailed error messages when install queries failed
Enable only one column header to be editable at the same time
Enable closing popup with Escape
Add a max validator for timeout field for configuration
Query name conflict check uses all available type names from GSQL

Fixed

Database Server

Core: Retry logic for adding data to GSE in the DR cluster
Core: Fix for GPE crash due to potential race condition between queries and updates.
Core: Partial result output in extreme cases before a running query has finished
Core: restpp crashed when missing parameter name
Core: Fixed file loading job failures due to OOM
GSQL: Fix for catalog access issue due to concurrent schema change requests
GSQL: GPE crash due to incorrect catalog update issued by GSQL
GSQL: LDAP password visible in GSQL logs
GSQL: Exit code from GSQL CLI needs to return non-zero code if there is an error
GSQL: Unable to run global schema change on global vertex if local vertex with same name exists
GSQL: Query created through GSQL shell, but returns error through GraphStudio
GSQL: Add check for GPE readiness for create/drop vertex/edge operations for global schema changes
GSQL: GSQL v2 syntax - vertex-attached containers cannot be read in WHERE/ACCUM clauses
GSQL: Enhance Export/Import by pre-creating necessary directories
GSQL: Fix calling subquery without RETURNS clause
GSQL: Code generation error for multiple dynamic expressions with the same parameter
GSQL: Wrong result for the output of datetime_format function
GSQL: SET<VERTEX> Not Working in Query Parameter
GSQL: GLE error message uses incorrect terminology: 'batch mode' should say 'distributed query mode'
GSQL: Printing vertex set variable with parentheses causes wrong printing for attributes
GSQL: GSQL pattern match - incorrect WHERE condition parsing
GSQL: GSQL query doesn't work on HA cluster when RESTPP#1 is down
GSQL: Fix for Catalog backup file cleaning failure
GSQL: Empty gsql password should not be allowed.
GSQL: NullPointerException on creating a query with a body-level DML delete statement
GSQL: Query cannot be dropped after its caller queries have been dropped
Platform: Remove user authentication information after installation
Platform: GSQL user defined functions are not backed up
Platform: Residual GPE/GSE processes are not terminated before restore
Platform: GBAR gracefully exit after ctrl-c
Platform: guninstall does not take into account the password login
Platform: gbar restore failed with message: Failed to import key-value store
Platform: Single node 3.1 installation in in VMware private cloud environment in VMWare Private Cloud Environment
Platform: Restore failure from S3 didn’t update the replicas correctly
Platform: Check to prevent migration tool running twice
Platform: GBAR restore fails with invalid checksums
Platform: User didn’t receive correct feedback when incorrect password entered during 3.1 upgrade

GraphStudio

Query goes back to a previous version after schema change in query editor
Remove the use of regex for GSQL CLI and rely on exit code instead
Progress bar hangs if query installation fails
datetime's default value field does not support rfc3339 nor iso8601 format
Export solution is only available for superuser
Unexpected error when changing the schema (Fix from GSQL side)
Update global schema after a local schema is dropped
Uploading progress bar hangs after choosing unsupported file type
Query editor does not display full text if line cannot break
Undo button should clear the expand list
JSON result of "write query" is not updated in error mode
Not possible to unset/cancel custom radius in Graph Exploration
Syntax highlighting is incomplete
Link to License page from GST is wrong
Long messages in Design Schema overlap vertex properties editor's ✓ button
The loading progress bar is stuck if import fails
The data mapping will disappear after change the global vertex's attribute
Address Export/Import solution migration issues

Admin Portal

Validate input on config management
Ignore blank spaces in log search

TigerGraph 3.1.0

Release Date: 2020-12-02

Features

New features are described in 3.1.0 Release notes.

Changes:

GSQL: STRING COMPRESS data type will no longer be allowed for new data objects. However, existing objects with STRING COMPRESS data type will continue to work.
GSQL: Changes to ADD/DROP Edge Pair commands
- ADD edge pair in schema change will not be allowed
- Drop vertex will be disallowed if it is currently being used in edge pair.
Platform: tigergraph user id included with default installation will be allowed to be dropped
Platform: Root user will now be disallowed to do an upgrade using installer -U option

Enhancements

Database Server

Engine: License enforcement check improvements
Engine: Restpp memory footprint reduction by recycling memory periodically
GSQL: Support JSON Payload Method for Calling GSQL Built-In Dynamic Endpoints
GSQL: Support Async query execution with query status/result functionality
GSQL: Enhanced Interpreted Query support:
- Support graph update for interpreted query
- Support Where filter in PRINT statement for interpreted query
GSQL: Logging for /requesttoken API endpoint
GSQL: Reset function for vertex attached accumulators
GSQL: Make token expiration maximum limit configurable
Platform: Enterprise Free Package improvement to make pre-installed license work in both interactive and non-interactive modes
Platform: Allow users to set hard coded timeout for Backup jobs
Platform: Allow configurable minimum and maximum memory limits for Kafka, Kafka Connect and Kafka Stream
Platform: Software upgrades for the following packages:
- etcd, Kafka plugins, Jsoncpp library

GraphStudio

Add new application server framework to offer continuous availability in GraphStudio and Admin Portal
Update APIs for the new application server
Support solution export/import without graph metadata
Integrate GraphStudio with the new application server
Increase unit test timeout

Admin Portal

Add log management for viewing, searching and downloading
Add configuration management settings
Add Restpp setting: Default query timeout
Add Nginx setting: SSL setting and whitelist IP setting
Add application server setting: Query return size
Add security management settings: LDAP, SSO
Integrate Admin Portal with the new application server
Change SSO authorization request URL
Handle SAML ACS for SSO
Disable authorization check for SSO metadata

Fixed

Database Server:

Engine: Correct HTTP response code will be returned when query times out
Engine: GPE status reporting is delayed due to backlog of large number of Kafka messages in the queue.
Engine: GPE crash in Sub-query print statement
Engine: Infinite loop in refresh index when some attributes are disabled
Engine: RESTPP memory consumption increase caused by timed out queries
Engine: Query using index will not fully utilize compute resources.
Engine: When query times out, JSON may not be well formed
Engine: Failed to post data when id is int and primary_id_as_attribute is true
Engine: Avoid converting string compress index hint in remote topology edge action
Engine: GPE not responding to SIGTERM
GSQL: Refactor memory usage in query installation to reduce the memory footprint when there is a large number of queries
GSQL: When creating the edge pairs, allow use of new vertex types that will be added from the current schema change job
Platform: Backup/Restore fails to backup GUI related data
Platform: Installer will print progress message during package install to avoid ssh timeout

TigerGraph 3.0.6

Release Date: 2020-11-11

Enhancements

Database Server

Audit Logging Enhancements
- User information for all requests.
- Request Status (request succeeded or failed) for all requests irrespective of access mode
Remove Hard timeout limit for Backup/Restore operations

Fixed

Database Server

Platform: Resolve the issues where Kafka start-up will hang in certain OS and shell environment.
Platform: Backup/Restore hangs if there are too many files
Platform: Backup/Restore list error when backup files on S3 are corrupted
Engine: Builtin query running background blocks schema change
GSQL: Fix for SSL certificate exception

TigerGraph 3.0.5

Release Date: 2020-09-05

Features

New features and described in 3.0.5 Release notes.

Enhancements

Database Server

Longer timeout for retrieving enum maps when using STRING COMPRESS
Socket timeout adjustment to improve RESTPP stability
Implement SetAccum<vertex> as bitset
Semantic check for println of File object for compiled query
Installer improvements
- Enhancement to change the user and group separately.
- Check permission of parent dir of App/Temp/Data/Log Roots
TigerGraph 2.x to 3.x Migration tool enhancements
- Support for copying UDFs and other functions during migration
Enhanced license support for Cloud deployments
Enhanced upgrade version checking
Zookeeper client connection retry mechanism to avoid Zookeeper operation failures

Changes

Installer Configuration JSON format

Install Configuration is separated into basic configuration and advanced configuration sections
Support for allowing replication factor to be set during installation as opposed to limited HA on/off setting previously

Fixed

Database Server

Core: GPE down during Backup for large number of files
Core: GPE will crash if the data comes from a machine without relevant metadata.
Core: Query failure due to string overflow
Core: Query with large UDF job didn't stop for configured time out setting
Platform: Kafka loading bug when number of loaders exceeds 10
Platform: Backup hangs when there are very large number of files in Graph Store
Platform: Backup reports successful operation even if it's actually incomplete
Platform: gadmin reset does not reset all files
GSQL: V2 syntax removes edge type that is excluded by Accum clause.
GSQL: Force query install should regenerate the endpoints
GSQL: Loading Job failed with SSL enabled
GSQL: Query installation performance issue for V2 syntax
GSQL: ArrayAccum value is not accessible in the ACCUM block when query is installed in distributed mode.
GSQL: Dictionary Fails when Tokens are too many
GSQL: Query installation fails due to schema change
GSQL: gsql_client strips out newlines when writing gsql queries by pasting into gsql shell

GraphStudio

Apply previous visualization result should handle empty saved schema
Displaying attribute for raw type in visualization should not use JSON stringify
Remove clear text user password in error log for migration from RDBMS to Graph

TigerGraph 3.0

Release Date: 2020-06-30

Features

New and modified features and described in the TigerGraph 3.0 Release Notes.

Enhancements

Database Server

Support for reload libudf command
Schema validation before apply settings
Relax Developer Edition restrictions
YAML parsing support for edge pairs
Support SPLIT for UDT loading, Load From/To Type from File
Data generator 2.0
Change log level by SIGUSER1, avoid unnecessary error log
Restpp self-report status
Allow users to remove data for reinstallation
Upgrade kafka to 2.3.0
Path pattern optimization with pattern flipping and PER clause
Combine service status and processState into one log event
Support validation of entry value during gadmin config set command
Add strong check for symlinks
Support to_datetime builtin function in expressions
Support string set filter for edge and target vertex
Support local vertex and edge with same name in multiple graphs
Index hint for interpret mode
Support string compress attributes in built-in Query filters
Enable jemalloc profiling
Utility function to get disk free percentage
Allow concurrent user query access during Query Installation

GraphStudio:

Support multiple-pair edge type
Schema change job for add/drop attribute index
Improved clear graph warning
New layout for logo and multiple graphs
Allow user edit header for sample data
Support multiple files upload
Cancel autofit for adding vertex and double click actions
Cancel auto login if user has logged out
Save JSON format of query result to local storage
Create Edge Type from Multiple Vertex Types to Multiple Vertex Types

Fixed

Database Server

Add on-demand heap profiling for jemalloc
Delete legacy ids data
Periodically force Jemalloc release memory to OS / on demand profiling
Change debug log in convertids into verbose
Print warning but no assert in ZMQ
Wrong JSON format for tempTables
Fix wrong check for loading job completion
Allow interpret query to recognize html encoded string constant
Handle logical type in json converter
Corrected URL decode for whitespace character
Add time before delete edges command to ensure rebuild has enough time to complete
Fix remove session bug for the aborted handler after 'ctrl + c'
Synchronize concurrent install queries
Change logic to check service status for cluster mode
Support the ‘=‘ operator SumAccum;
Drop vertex/edge/graph when there are local and global vertex/edge have the same name;
Support removing a SetAccum from another SetAccum;
Remove the reversed edge too when removing an edge;
Cannot create query due to the overflow of the size of the HeapAccum;
Query referred as subquery from interpreted mode query can not be dropped;
Index out of bound when ignoring the parameter checking for interpret query
Output error message for invalid job id
Fix codegen to insert a vertex/edge without attributes
Support file regexp in checking header of filename
Support the true value of key word header and transaction in the loading data job to be case-insensitive
Dedupe proxy user's own roles from groups
Make schema change metadata modification a transaction
Fix builtin k_step expansion query bug
Check disk space before exporting each vertex/edge type
Allowed non-English string constants in interpreted queries
Edge variable prints attribute by default
Print developer information only in gadmin status
Restrict symlinks and check their existence

GraphStudio:

Fix error message for new secret creation
Refactor keywords
Do not emit explorer config if saved exploration doesn't have it
Check for Valid date time
Extend wait time for progress bar finish
Add right border for side navigation
Upgrade color-picker
Fix check accumulator format
Fix percentage of performing schema change
Run interpreted query through websocket

TigerGraph 2.6.6

Release Date: 2021-03-23

Fixed

Database Server

Core: Fix concurrent access of abort messages
Core: Fix for GPE crash due to wrong license
Core: Fixes to gcollect utility:
- Improvements to work in clustered environments
- Accidental removal of directory with old data collection run
GSQL: Fix for catalog access issue due to concurrent schema change requests
GSQL: Increase timeout for download upload catalog, make it configurable
Platform: Upgrade of gRPC version to 1.33.0
Platform: Remove user authentication information after installation

TigerGraph 2.6.5

Release Date: 2021-01-15

Enhancements:

Database Server

GSE/GPE segment consistency check utility
Integration with GSE/GPE consistency check utility with Backup/Restore

Changes

Increase in refresh timeout for RESTPP from 20 to 60 seconds;

Fixed

Database Server

GSE replica synchronization for Zookeeper errors
Explicitly check replica follower status before automatic promotion to leader is allowed
RESTPP fix - memory leaks caused by timed out queries
Backup/Restore: Ensure GPE and GSE snapshots are done in correct order

TigerGraph 2.6.4

Release Date: 2020-11-02

Enhancements

Database Server

Allow RESTPP to manage log files based on timestamp
Upgrade NGINX to 1.18 version
Correct status code to indicate GSQL operation result
Remove Hard timeout limit for Backup/Restore operations
Token Management Improvements:
- Improve GSQL stability by setting a limit on number of tokens allowed
- Logging improvement to indicate new and refreshed tokens separately

Fixed

Database Server

Core: GSE follower replicas lag leader replica on the data updates
Core: Shuffle abort causing GPE crash
Core: Handle un-released lock gracefully during json print command failure
Core: Incremental Snapshot triggers creation of all segments causing delays
Core: Kafka loading fails when number of loaders exceed 10
GSQL: Query Install fails for batch installs
Backup/Restore hangs if there are too many files

TigerGraph 2.6.3

Release Date 2020-08-21

Enhancements

Improved handling of query time outs for distributed queries.

Fixed

Longer timeout for retrieving large memory map for attributes of STRING COMPRESS data type with large number of distinct values.
Backup jobs report incorrect successful runs
Incorrect type check logic for trim function;

TigerGraph 2.6.2

Release Date 2020-08-14

Enhancements

Improvements to GSE Upsert performance
Add User Id information to RESTPP logs for all user initiated calls
Improvements to Query Installation performance time
Provide warning message when revoking a role from proxy user if needed

Fixed

Core: GPE crash on unknown vertex / segment
Core: PostWriter needs to skip vertices if the internal vertex id is invalid one.
Core: Handle exception in ResponseThread of RemoteTopology
Core: Query re-installation issue caused by non-deterministic transformation
Core: Address Data Loading speed for hub loading
Core: Inconsistent result with and without using local accumulators
Core: RestPP payload scale issue due to 3rd party FCGI library
GSQL: GSQL pattern match - translation error when vertex type is the keyword "ANY"
GSQL: Issue with reduce function with Bitwise OR operator in the LOAD functions
GSQL: gsql_client strips out newlines when writing gsql queries by pasting into gsql shell
GSQL: Secrets and token associated with a graph and not removed during graph delete
GraphStudio: Displaying attribute for raw type in visualization should not use JSON stringify method

TigerGraph 2.6.1

Release Date 2020-06-12

Enhancements

Allow concurrent user query access during Query Installation
GPE & GSE Data Sync Check Utility
Use of POST for /requesttoken API so that user password is not exposed
Write Performance improvements
Error handling and reporting improvements for Query Timeout and Failures
UX improvement for ‘Clear Graph’ command in GraphStudio

Fixed

Ensure cleanup and compaction of delta records in a large transaction even in the event of TigerGraph service restart
Performance improvement to make Graph Updates faster by parallelizing and sharing transaction
Fix for the leftover Shuffle threads after Query Abort/Timeout
Change in the error message of AbortQuery request inside the Shuffle Operator
Bug fixes for GSE compaction feature to address exporting with mixed segments of data and load data from the database in worker mode
Fix for GSE crash triggered by schema change
Enable background thread on JEMALLOC for memory cleanup even when system is idle
/showprocesslist and /abortquery APIs do not list the running queries of old worker if RESTPP is refreshed
S3 loader header check doesn't apply file filter regex
GSQL V2 syntax does not handle ACCUM operator correctly
Fix for RESTPP timeout error

TigerGraph 2.6.0

Changes

Release Date 2020-04-24

New and modified features and described in the TigerGraph 2.6 Release Notes.

Enhancements

Remove SSH connection use dependency for GSQL Install Query command
New 'force' parameter to RebuildNow so that engine to start the rebuild.

Fixed

Core: GSE crash in HA setup when CPU usage is extremely high
Core: Out Of Memory handling improvements to prevent GPE crash due bad memory allocation call
GLE: fix builtin query crash in worker due to graph id missing
Core: Skewed CPU usage for high-query throughput scenarios
Fixes in Rebuild to address broken edge count
Fix for 2.5.2 bug - Inconsistent query results when running non-distributed query on a cluster
Unable to find local vertex and edge with same name in multiple graphs
RESTPP memory leak due to yaml file
Reverse edge id is wrong when two local edges with reverse edge are created with same name

TigerGraph 2.5.4

Release Date: 2020-04-24

Enhancements

New 'force' parameter to RebuildNow so that engine to start the rebuild.
Improved version of /abortquery so that query can be aborted more quickly

Fixed

Fixes in Rebuild to address broken edge count
RESTPP memory leak due to yaml file
Builtin query crashed due to missing Graph Id
RESTPP crash for same vertex name in the global graph
Resolved the distributed query hanging issue which could block rebuild and schema change
Core: Skewed CPU usage for high-query throughput scenarios

TigerGraph 2.5.3

Release Date: 2020-02-26

Fixed

Ensure catalog data backed up before schema change
Support creation of two local edges with same name with one being a reverse edge
Support Local vertex and edge type with same name in multiple graphs in
Support for multi-lingual string constant in Interpret query mode
Upgrade to Release 2.5.2 leads to inconsistent query results
Compute resource usage spikes on particular node in cluster
GCleanUp failed to cleanup all pointers when adjusting thread

TigerGraph 2.5.2

Release Date: 2020-01-27

TigerGraph 2.5.2 is not compatible with versions prior to 2.5.1. Customers who are using Pre-2.5.1 version and intending to migrate to 2.5.2 are advised to take backup of their existing version before upgrading to 2.5.2. This will enable them to downgrade back to the original Pre-2.5.1 version if nee

New Features

GPE: Increase MemoryCheck frequency based on Resource Usage
GPE: Abort Query if Memory usage crosses critical threshold
GSE: Support Log compaction as part of startup for GSE
GraphStudio: Support Multi-edge pair in design schema.
Core: Support OS RHEL 8.0 in Installer

Enhancements

REST: Increase the RESTPP reload timeout
GSQL: Change error message to specify user when default tigergraph user is dropped
GSQL: Make user tigergraph droppable
GraphStudio: Do not change layout when adding/updating/deleting vertex and edge

Fixed

Core: GPE crashed running distributed LDBC query
GST: Incorrect vertex count in TigerGraph GraphStudio
Core: Shuffle deadlock causing full system memory use
Core: Replace GASSERT with GWARN in GDataBox
Core: BATCH_SIZE of Kafka loader set from GSQL console doesn't work
GPE: Schema Change failed due to Query Install OOM
GSQL: Quote in string key is not escaped
GraphStudio: Reverse edge filter doesn't work
Core: Don't display LDAP password in IUM

TigerGraph 2.5.1

Release Date: 2019-11-25

Fixed

Core: Distributed delete affects data consistency after GPE restart
Core: Shuffle hangs when sendingQueue is full
Core: Longevity test failing due to change in memory allocator (TCMalloc)
GPE: Crash after upgrade from 2.4.1 to 2.5
GPE: Serialization error when reading from input stream
GPE: Query state can result in race condition inside ReadOneDelta;
GPE: GPE crashes when a query calls a sub-query with a write operation
GSE: Script to resolve delete inconsistency between GSE and GPE
GSE: Multiple Kafka loading jobs fail
GSQL: Built-in function names in GSQL are case sensitive
GSQL: Interpret query doesn't work when authentication is on
GSQL: Deadlock when graph store is being cleared and authentication is on
GSQL: Token authentication returning null during Global schema change
GSQL: SSO login failure due to missing org.apache.santuario:xmlsec library
GraphStudio: Vertex to edge expansion settings are not retained
GBAR Backup: Backup failure if loading jobs are in progress

TigerGraph 2.5.0

Release Date 2019-09-18

Changes

New and modified features and described in the TigerGraph 2.5 Release Notes.

Fixed

Improvements to fix possible crash, deadlock, overflow, and memory leak situations
Improve query performance stability
Fix some query string passing and parsing issues
Correct some inconsistencies between the documented specification and actual behavior
Improve robustness of Kakfa and S3 Loaders
Clean up files and graph properly after certain failed operations
Fix some installation issues

TigerGraph 2.4.1

Release Date 2019-07-23

Changes

To select pattern matching support in a query, the syntax is now CREATE QUERY ... SYNTAX v2 instead of CREATE QUERY ... SYNTAX("v2")

Fixed

GPE: Fix uint32 overflow
Loader: Allow temp_table to be used without flatten function
IDS: Disable empty UID
ZMQ: Fix crash on ill-formed message
Util: Fix Unix domain socket file not generated correctly in cron job
Util: Extend data size for GoutputStreamBuffer beyond 4GB
Connector: Fix first line is not ignored with has_header enabled
Connector: Fix failures on retrieving connector status
GSQL: Fix syntax version setting inconsistency issues
GSQL: Fix schema change with USING primary_id_as_attribute
GSQL: Fix JSON output format of requesttoken API
Admin Portal: Display correct counts of physical vertices and edges on each machine

TigerGraph 2.4.0

Release Date 2019-06-25

New Features

See Release Notes - TigerGraph 2.4

Fixed

GSQL: The built-in count() function gives the correct value in all cases.
GPE: startup hang
GSQL server start/stop command not working
LDAP config truncated by space
GSE: boolean values are not displayed correctly
Security issue CVE-2013-7459 caused by unused python crypto library
IUM status is displayed incorrectly in some cases;

TigerGraph 2.3.2

Release Date 2019-04-01

Issues

GSQL: The built-in count() function may give the incorrect value for clustered systems after some vertices have been deleted.

Fixed

GraphStudio: Send query pre-install dependency analysis result through WebSocket
GraphStudio: filter out improper attributes in when building filter expressions
GPE: fix wrong enumerator id issue
GPE: avoid using /tmp
GPE: handle exceptions for LIKE <expr>
GPE: Fix crash due to writing wrong size of STRING_LIST
GPE: Fix global schema change error which added local vertex twice
GSE (Developer Edition): Keep one copy of segment

TigerGraph 2.3.1

Release Date 2019-02-19

New Features

See Release Notes-TigerGraph 2.3

Issues

GSQL: The built-in count() function may give the incorrect value for clustered systems after some vertices have been deleted.

Fixed

Install: The IP list fetched by the installer could be incomplete.
Loading: Speed up batch-delta loading.
GraphStudio: Disable Install Query button for queryreader users.
GraphStudio: Re-initialize the database after import.
GraphStudio: Could not drop query with non-default username/password.
AdminPortal: Queries-Per-Second display didn't work if RESTPP authorization was enabled.
Schema change: Improve schema change stability by reducing schema change history and increasing gRPC max message limit.
GPE: Improve query HA stability.
GPE: Fix crash under certain conditions.
Core: Memory leak due to yamlcpp.
Core: compatibility issue between libc and ssh utility.
IUM: Fix exceptions due to legacy config entries.

TigerGraph 2.2.4

Release Date: 2018-12-13

Fixed

Distributed System: Fix possible deadlock and race conditions
GSE Storage Engine: Fix disk seek overflow
RESTPP: Optimize the memory consumption when system is idle
RESTPP: Optimize config reload time
GSQL: Fix query installation error with option -optimize
GSQL: Fix a code generation bug related to static variable
GSQL: Fix a compilation error when a statement is in nested if statement
GraphStudio: Security update for npm-run-all
GraphStudio: Change Help button to point to new docs.tigergraph.com site
Gadmin: Fix gadmin/ts3 restart and status error after changing port of TS3

TigerGraph 2.2.3

Release Date: 2018-11-30

Fixed

GraphStudio: Fix schema change bug (Note: In 2.2, GraphStudio now does not drop all data when making a schema change.)
GraphStudio: Fix display issue in Graph Explore when switch to a new graph
GraphStudio: Improve password security
GraphStudio: Modify URL to AdminPortal for better universal support
IUM: Fix kafka-loader configuration after cluster expansion
IUM: Resolve python module name conflict
IUM: Fix ssh_port is always 1 under bash interactive mode
GSE Storage Engine: Reduce memory consumption
RESTPP: Improve logging messages

TigerGraph 2.2

Release Date: 2018-11-05

New Features

See Release Notes-TigerGraph 2.2

Fixed

GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft
Admin Portal: Number of nodes in the cluster is reported as 0 when no graph yet exists

TigerGraph 2.1.8

Release Date: 2018-11-05

Issues

GBAR Backup fails if HA is enabled
GSE status shows unknown with HA enabled
TS3 fails to collect QPS when RESTPP Authentication is enabled (Admin Portal QPS monitor will be unavailable in this case).
GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft.
Admin Portal: Number of cluster nodes is reported as 0 when no graph exists.

Fixed

GSQL server error if schema is too large
In a cluster, not all servers may be aware of deleted vertices.
PAM limit set-up issue in installer
In MultiGraph, a local (FROM *, TO *) local edge has global side effects.
RESTPP's default API version is not set after installation
An engine bug which occasionally causes crash

Added

SSH port configuration in installer.
Installation script checks that the machine meets the minimum RAM (8GB) and CPU (2-core) requirements.
For Ubuntu 16.04/18.04, support logon with systemd service.

TigerGraph 2.1.7

Release Date: 2018-08-20

Issues

GBAR backup fails if HA is enabled.
TS3 fails to collect QPS when RESTPP Authentication is enabled (Admin Portal QPS monitor will be unavailable in this case).
GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft.
Admin Portal: Number of cluster nodes is reported as 0 when no graph exists.

Fixed

Cluster configuration with HA enabled is wrong if the number of nodes is odd (3, 5, 7, 9...).
GraphStudio and GSQL inconsistent checking for some keywords
GBAR backup and restore fail if special character is in tag name

TigerGraph 2.1.6

Release Date: 2018-08-15

Issues

Cluster configuration with HA enabled is wrong if the number of nodes is odd (3, 5, 7, 9...).
GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft.
TS3 fails to collect QPS when RESTPP Authentication is enabled (Admin Portal QPS monitor will be unavailable in this case).
Admin Portal: Number of cluster nodes is reported as 0 when no graph exists.

Fixed

GSQL null pointer exception during schema change if a directed edge is dropped but its partner reverse edge is kept.
Some complex attribute types cannot be correctly posted via /graph endpoint.
In some cases, tuple on reverse edge crashes GPE.
GraphStudio throws an authentication error if RESTPP authentication is enabled.

Added

License level control of MultiGraph functionality.

Tigergraph 2.1.5

Release Date: 2018-07-24

Known Issues

GSQL null pointer exception during schema change if a directed edge is dropped but its partner reverse edge is kept.
Some complex attribute types cannot be correctly posted via /graph endpoint.
In some cases, tuple on reverse edge crashes GPE.

Fixed

GraphStudio Export package is occasionally incomplete.
GSE status is always "not ready" if schema is too large.
Cannot modify RESTPP port configuration.
IUM error in a cluster when not running on node m1

V3.0 Removal of Previously Deprecated Features

TigerGraph 2.x contained some features which were labeled as deprecated. These features are no longer necessary because they have been superseded already by improved approaches for using the TigerGraph platform. The new approaches were developed because they use more consistent grammar, are more extensible, or offer higher performance. Therefore, TigerGraph 3.0 has streamlined the product by removing support for some of these deprecated features, listed below:

If a vertex type is specified, the vertex type must be within parentheses.

Query, Job, and Token Management

These are documented in several places throughout the GSQL Language Reference:

CREATE QUERY Statement
Creating a Loading Job and Running a Loading Job
RUN SCHEMA_CHANGE JOB, and RUN GLOBAL SCHEMA_CHANGE JOB
CREATE / SHOW / DROP / REFRESH Token
offline2online in 'Creating a Loading Job'

Output

See PRINT Statement in 'Output Statements and File Objects'

Built-in Queries

See Run Built-in Queries in 'GSQL 101'

Knowledge Base and FAQs

If you have a problem with the procedure described in the TigerGraph Platform Installation Guide, please contact support@tigergraph.com and summarize your issue in the email subject.

Getting Started and Basics

I need help installing the system.

What version of the TigerGraph platform am I running?

Use the following command:

$ gsql --version

To see the version numbers of individual components of the platform:

$ gadmin version

How do I upgrade from an earlier version?

Each release comes with documentation addressing how to perform an upgrade. Upgrade instructions are documented in Installation guide. Please contact support@tigergraph.com for help in your specific situation.

I'm not sure how to run the TigerGraph system.

If you correctly installed the system and are now logged in as the TigerGraph system user, you should be able to enter the GSQL shell by typing the gsql command from an operating system prompt. If this command has never worked, then probably the installation was not successful. If it works but you are not sure what to do next, please see the GSQL Demo Examples guide.

The system does not seem to be running correctly.

If you believe you have installed the system correctly (e.g., you followed the TigerGraph Platform Installation Guide and received no errors, and the gsql and gadmin commands are now recognized), then please contact support@tigergraph.com and summarize your issue in the email subject.

Do I need to start the TigerGraph servers (e.g., GPE, GSE) to run the system?

Different servers are needed for different purposes, but the TigerGraph should automatically turn services on and off as needed. Please be sure that the Dictionary (dict) server is on when using the TigerGraph system:

To check the status of servers:

$ gadmin status

Does the TigerGraph system have in-tool help?

Yes. For the GSQL shell and language, first enter the shell (type gsql from an operating system prompt). Then type the help command, e.g.,

HELP

This gives you a short list of commands. Note that "help" itself is one of the listed commands; there are help options to get more details about BASIC , QUERY commands. For example,

HELP QUERY

lists the command syntax for queries. See the "System Basics" section of the GSQL Language Reference, Part 1: Defining Graphs and Loading Data. The gadmin administration tool also has a help menu and a manual page:

$ gadmin help

Is the GSQL language case sensitive?

User-defined identifiers are case-sensitive. For example, the names User and user are different. The GSQL language keywords (e.g., CREATE, LOAD, VERTEX) are not case-sensitive, but in our documentation examples, we generally show keywords in ALL CAPS to make them easy to distinguish.

What are the rules for naming identifiers?

An identifier consists of letters, digits, and the underscore. Identifiers may not begin with a digit. Identifiers are case sensitive. Special naming rules apply to accumulators (see the Query section).

When are quotation marks required? Single or double quotes?

The general rule is that string literals within the GSQL language are enclosed in double quotation marks. For data that is to be imported (not yet in the GSQL data store), the GSQL loading language lets the user specify how data fields are delimited within your input files. The loading language has an option to specify whether single quotes or double quotes are used to mark strings. For more help on loading, see the "Loading Data" section of this document or of the GSQL Language Reference, Part 1: Defining Graphs and Loading Data .

Can I run GSQL Shell commands in batch command?

Yes. You can create a text file containing a sequence of GSQL commands and then execute that file. To execute from outside the shell:

$ gsql filename

To execute the command file from within the shell:

@filename

See also the "Language Basics" and "System Basics" sections of the GSQL Language Reference, Part 1: Defining Graphs and Loading Data document.

I have a long command line. Can I split it into multiple lines?

Yes. Normally, an end-of-line character triggers execution of a line. You can use the BEGINand ENDkeywords to mark off a multi-line block of text that should not be executed until ENDis encountered.

This is an example of a loading statement split into multiple lines using BEGIN and END:

BEGIN
CREATE ONLINE_POST JOB load1 FOR GRAPH LaborForce {
  LOAD 
    TO VERTEX user VALUES ($0, _, _, _),
    TO VERTEX occupation VALUES ($0, _),
    TO EDGE user_occupation VALUES ($0, $1);
}
END

What is Limited Capacity Mode?

When a license limit has been reached, your system will be placed in a read-only mode - incapable of loading anymore data. You will still be able to delete data and view the graph.

Defining a Graph Schema

What are the components of a graph schema?

A TigerGraph graph schema consists of (A) one or more vertex types, (B) one or more edge types, and (C) a graph type. Each edge type is defined to be either DIRECTED or UNDIRECTED. The graph type is simply the list of vertex types and edges types which may exist in the graph. For more: See the section "Defining a Graph Schema" in the GSQL Language Reference, Part 1: Defining Graphs and Loading Data . Below is an example of a graph schema containing two vertex types, one edge type, and one graph type:

CREATE VERTEX user (PRIMARY_ID user_id UINT, age UINT, gender STRING, postalCode STRING)
CREATE VERTEX occupation (PRIMARY_ID occ_id STRING, occ_name STRING)
CREATE UNDIRECTED EDGE user_occupation (FROM user, TO occupation)
CREATE GRAPH LaborForce (user, occupation, user_occupation)

Alternately, a generic CREATE GRAPH statement can be used:

CREATE GRAPH LaborForce (*)

Should I model this data field as an attribute or as a vertex type?

Property graphs can model data fields ("properties") as either a property of a vertex or edge or as a vertex linked to other vertices. If your property relates to an edge, it should be an attribute of that edge (for example, a Date attribute of a CustomerBoughtProduct edge). If your property relates to a vertex, you have a choice. The optimal choice depends on how you will typically use this attribute in your application. If you will frequently search or filter based on that data, we suggest your treat it as a separate vertex type. Otherwise, we recommend modeling this data as an attribute of the principal vertex.

What data types do you support for vertex and edge attributes?

Each attribute of a vertex or edge has an assigned data type. v0.8 of the TigerGraph adds support for many more attribute types.: DATETIME, UDT, and container types LIST, SET, and MAP. The following is an abbreviated list. For a complete list and description, see the section "Attribute Data Types" of the GSQL Language Reference, Part 1: Defining Graphs and Loading Data .

Discontinued Feature

The UINT_SET and STRING_SET COMPRESS types have been discontinued since there is now equivalent functionality from the more general SET and SET types.

Can I define and load multiple graph schemas?

The TigerGraph MultiGraph service, an add-on option, supports logical partitions of one unified global graph. Each partition is treated as an independent local graph, with its own set of user privileges. Local graphs can overlap, to create a shared data space.

How many vertex and edge types can I include in a graph?

For performance reasons, we recommend to keep the number of different vertex and edge types under 5,000. The upper limit for the number of different vertex and edge types is approximately 10,000, depending on the complexity of the types.

How do I check the definition of the current schema?

From within the GSQL Shell, the ls command lists the catalog : the vertex type, edge type, and graph type definitions, job definitions, query definitions, and some system configuration settings. If you have not set your active graph, then ls will show only item which have global scope. To see graph-specific items (including loading jobs and queries), you must define an active graph.

How do I modify my graph schema?

The GSQL language includes ADD, ALTER, and DROP commands. See the section "Update Your Data" in the GSQL Demo Examples or the section "Modifying a Graph Schema" in the GSQL Language Reference, Part 1: Defining Graphs and Loading Data for details. Note that altering the graph schema will invalidate your old data loading and query jobs. You should create and install new loading and query jobs.

How do I delete my entire graph schema?

To delete your entire catalog, containing not just your vertex, edge, and graph type definitions, but also your loading job and query definitions, use the following command: GSQL> DROP ALL

To delete just your graph schema, use the DROP GRAPH command: GSQL> DROP GRAPH g1

UPDATE Deleting the graph schema also erases the contents of the graph store. To erase the graph store without deleting the graph schema, use the following command: GSQL> CLEAR GRAPH STORE

See also " How do I erase all data? "

Loading Data

How do I load data?

To load structured data stored in files, you write a loading job and then execute it. See GSQL 101 and the GSQL Demo Examples for introductory examples. Loading jobs can include instructions for parsing and processing the data, in order to perform many ETL tasks. See Creating a Loading Job for the complete specifications. To load streaming data or data coming from other data stores, see Data Loader User Guides.

In v2.0, the TigerGraph introduced a more powerful and comprehensive syntax which has several advantages:

The TigerGraph platform can handle concurrent loading jobs, which can greatly increase throughput.
The data file locations can be specified at compile time or at run time. Run-time settings override compile-time settings.
A loading job definition can include several input files. When running the job, the user can choose to run only part of the job by specifying only some of the input files.
Loading jobs can be monitored, aborted, and restarted.

What types of data can be read?

The GSQL data loader reads text files organized in tabular or JSON format . Each field may represent numeric, boolean, string, or binary data. Each data field may contain a single value or a list of values (see How do I split a data field containing a list of values into separate vertices and edges? ).

Additional data formats are continually being added. See Data Loader User Guides and the TigerGraph Ecosystem Github Repository's etl folder https://github.com/tigergraph/ecosys/tree/master/tools/etl

What is the format of a tabular input data file?

Each tabular input data file should be structured as a table, in which each line represents a row, and each row is a sequence of data fields, or columns. A data field can contain string or numeric data. To represent boolean values, 0 or 1 is expected. A header line may be included, to associate a name with each column. A designated character separates columns. For example, if the designated separator character is the comma, this format is commonly called CSV, for Comma-Separated Values. Below is an example of a CSV file with a header. The uidcolumn is int type, nameis string type, avg_scoreis float type, and is_memberis boolean type. See simple examples in Real-Life Data Loading and Querying Examples and a complete specification in the section "Creating a Loading Job" in GSQL Language Reference, Part 1: Defining Graphs and Loading Data .

uid,name,avg_score,is_member
100,"Lee, Tom",48.5,1
101,"Wu, Ming",33.9,0
102,"Gables, Anne", 72.2,1

The loader does not filter out extra white space (spaces or tabs). The user should filter out extra white space from the files before loading into the TigerGraph system.

How should data fields be separated?

The data field (or token ) separator can be any single ASCII character, including one of the non-printing characters. The separator is specified with the SEPARATOR phrase in the USING clause. For example, to specify the semicolon as the separator: USING SEPARATOR=";"

To specify the tab character, use \t. To specify any ASCII character, use \nn where nn is the character's ASCII code, in decimal. For example, to specify ASCII 30, the Record Separator (RS): USING SEPARATOR="\30"

Should fields be enclosed in quotation marks?

TigerGraph does not require fields to be enclosed in quotation marks, but is it recommended for string fields. If the QUOTE option is enabled, and if the loader finds a pair of quotation marks, then the loader treats the text within the quotation marks as one value, regardless of any separation characters that may occur in the value. The user must specify whether strings are marked by single quotation marks or double quotation marks. USING QUOTE="single" or USING QUOTE="double"

For example, if SEPARATOR="," and QUOTE="double" are set, then when the following data are read,

uid,name,avg_score,is_member
100,"Lee, Tom",48.5,1
101,"Wu, Ming",33.9,0
102,"Gables, Anne,"72.2,1

"Lee, Tom" will be read as a single field. The comma between Lee and Tom will not separate the field.

Does the GSQL Loader automatically interpret quotation marks as enclosing strings?

No. You must specify either QUOTE="single" or QUOTE="double" .

What are the parameters (in the USING clause) for a loading job?

The following three parameters should be considered for every loading job from a tabular input file:

The next two parameters, FILENAME and EOL are required if the job is an ONLINE_POST job:

All of the these five parameters are combined into one USING clause with a list of parameter/value pairs. The parameters may appear in any order.

USING parameter1="value1", parameter2="value2",... , parameterN="valueN"

The location of the USING clause depends on whether the job is an offline loading job or an online loading job. For offline loading, the USING clause appears at the end of the LOAD statement. For example:

CREATE LOADING JOB load1 FOR GRAPH LaborForce{
  LOAD "jobs.csv" TO VERTEX occupation VALUES ($0, $1) USING HEADER="true", SEPARATOR="|", QUOTE="double";
}

For online loading, the USING clause appears at the end of the RUN statement

CREATE ONLINE_POST JOB load2 FOR GRAPH LaborForce{
  LOAD TO VERTEX occupation VALUES ($0, $1);
}
RUN JOB load2 USING FILENAME="./jobs.csv", HEADER="true", SEPARATOR="|", QUOTE="double", EOL="\n"

My data file doesn't have a header but I still want to name the columns.

You can define a header line (a sequence of column names) within a loading job using a DEFINE HEADER statement, such as the following:

DEFINE HEADER head1 = "jobId", "jobName";

This statement must appear before the LOAD statement that wishes to use the header definition. Then, the LOAD statement must set the USER_DEFINED_HEADER parameter in the USING clause. A brief example is shown below:

CREATE ONLINE_POST JOB load2 FOR GRAPH LaborForce{
  DEFINE HEADER head1 = "jobId", "jobName";
  LOAD TO VERTEX occupation VALUES ($"jobId", $"jobName") USING USER_DEFINED_HEADER="head1";
}

How do I identify and refer to the input data fields?

Input data fields can always be referenced by position. They can also be referenced by name, if a header has been defined.

- Position-based reference: The leftmost field is $0, the next one is $1, and so on.
- Name-based reference: $"name" , where name is one of the header column names.

For example, if the header is abc,def,ghi

then the third field can be referred to as either $2 or $"ghi" .

How do I split (flatten) a data field containing a list of values into separate vertices and edges?

First, to clarify the task, consider a graph schema with two vertex types, Book and Genre, and one edge type, book_genre:

CREATE VERTEX Book  (PRIMARY_ID bookcode STRING, title STRING)
CREATE VERTEX Genre (PRIMARY_ID genre_id STRING, genre_name STRING)
CREATE UNDIRECTED EDGE book_genre (FROM Book, TO Genre)
CREATE GRAPH book_rating (Book, Genre, book_genre)

Further, each row of the input data file contains three fields: bookcode , title , and genres , where genres is a list of strings associated with the book. For example, the first few lines of the data file could be the following:

bookcode|title|genres
101|"Harry Potter and the Philosopher's Stone"|fiction,fantasy,young adult
102|"The Three-Body Problem"|fiction,science fiction,Chinese

The data line for bookcode 101 should generate one Book instance ("Harry Potter and the Philosopher's Stone"), four Genre instances ("fiction", "adventure", "fantasy", "young adult"), and four Book_Genre instances, connecting the Book instance to each of the Genre instances. This process of creating multiple instances from a list field (e.g., the genres field) is called flattening .

To flatten the data, we use a two-step load. The first LOAD statement uses the flatten() function to split the multi-value field and stores the results in a TEMP_TABLE. The second LOAD statement takes the TEMP_TABLE contents and writes them to the final edge type.

CREATE ONLINE_POST JOB load_books FOR GRAPH book_rating {
  LOAD
      TO VERTEX Book VALUES ($0, $1),
      TO TEMP_TABLE t1(bookcode,genre) VALUES ($0, flatten($2,",",1));
 
  LOAD TEMP_TABLE t1
      TO VERTEX Genre VALUES($"genre", $"genre"),
      TO EDGE book_genre VALUES($"bookcode", $"genre");
}
RUN JOB load_books USING FILENAME="book.dat", SEPARATOR="|", HEADER="true", QUOTE="double", EOL="\n"

The flatten function has three arguments: (field_to_split, separator, number_of_parts_in_one_field). In this example, we want to split $2 (genres), the separator is the comma, and each field has only 1 part. So, the flatten function is called with the following arguments:flatten($2, ",",1) . Using the example of data file , TEMP_TABLE t1 will then contain the following:

The second LOAD statement uses the TEMP_TABLE t1 to generates Genre vertex instances and book_genre_instances. While there are 7 rows shown in the sample TEMP_TABLE, only 6 Genre vertices will be generated, because there are only 6 unique values; "Fiction" appears twice. Seven book_genre edges will be generated, one for each row in the TEMP_TABLE.

There is another version of the flatten function which has four arguments and which supports a two-level grouping. That is, the field contains a list of groups, each group composed of N subfields. The arguments are (field_to_split, group_separator, sub_field_separator, number_of_parts_in_one_group). For example, suppose the data line were organized this way instead:

bookcode|title|genres
101|"Harry Potter and the Philosopher's Stone"|FIC:fiction,ADV:adventure,FTS:fantasy,YA:young adult
102|"The Three-Body Problem"|FIC:fiction,SF:science fiction,CHN:Chinese"

Then the following loading statements would be appropriate:

CREATE ONLINE_POST JOB load_books2 FOR GRAPH book_rating {
  LOAD
      TO VERTEX Book VALUES ($0, $1),
      TO TEMP_TABLE t1(bookcode,genre_id,genre_name) VALUES ($0, flatten($2,",",":",2));
 
  LOAD TEMP_TABLE t1
      TO VERTEX Genre VALUES($"genre_id", $"genre_name"),
      TO EDGE book_genre VALUES($"bookcode", $"genre_id");
}
RUN JOB load_books2 USING FILENAME="book2.dat", SEPARATOR="|", EOL="\n"

Can the TigerGraph system load data from a streaming source?

Yes. Two approaches are to use our Kafka Loader or to periodically read from one or more files. A loading job lets you define a general loading process without naming the data source. Every time you call an online loading job, you name the source file. It can be a different file each time, or it can be the same file, if the contents of the file are changing over time. Also, if it happens that the loader re-reads a data line that it has encountered before, it will just reload the data (except for container attributes, e.g., a LIST attribute, using a reduce() loading function. In that case, there is an accumulative effect for re-reading a data line).

I want to compute an attribute value. What built-in functions are available?

The GSQL Loading includes some built-in token functions (a token is one column or field of a data input line.) A user can also define custom token functions. Please see the section "Built-In Loader Token Functions" in the GSQL Language Reference, Part 1: Defining Graphs and Loading Data .

Do I need a one-to-one correspondence between input files and vertex types and edge types?

No. One of the advantages of the TigerGraph loading system is the flexible relationship between input files and resulting vertex and edge instances. In general, there is a many-to-many relationship: one input file can generate many vertex and edge types.

From the LOAD statement perspective for a online loading job:

LOAD 
  TO VERTEX vertex_type VALUES (attr_expr...) [WHERE conditions],
  ...,
  TO VERTEX vertex_typeN VALUES (attr_expr...) [WHERE conditions],
  TO EDGE  edge_type VALUES (attr_expr...) [WHERE conditions] [OPTION (options)],
  ...,
  TO EDGE edge_typeN VALUES (attr_expr...) [WHERE conditions] [OPTION (options)]
  [Parsing_Conditions];

Each LOAD statement refers to one input file.
Each LOAD statement can have one or more resulting vertex types and one or more resulting edge types.
Hence, one LOAD statement can potentially describe the one-to-many mapping from one input file to many resulting vertex and edge types.
It is not necessary for every input line to always generate the same set of vertex types and edge types. The WHERE clause in each TO VERTEX | TO EDGE clause can be used to selectively choose and filter which input lines generate which resulting types.

My input data includes multiple edge instances between a pair of vertices. Why is there only one in the graph?

This not an error. There can only be one instance of a certain edge type between any given pair of vertices, so the most recently loaded edge data will be the edge that you will see in the graph.

Updating and Modifying Data

How can I insert / load more data?

If there is already data in the graph store and you wish to insert more data, you have a few options. First, if you have bulk data stored in a file (local disk, remote or distributed storage), you can us e Online Loading .

Second, if you have a few specific insertions, you can use the Upsert da ta command in the RESTPP API User Guide . For Upsert, the data must be formatted in JSON format.

Third, you can write a query containing INSERT statements. The syntax is similar to SQL INSERT. (See GSQL Language Reference Part 2 - Querying . ) The advantage of query-based INSERT is that the details (id values and attribute values) can be determined at run time and even can be based on an exploration and analysis of the existing graph. The disadvantage is that the query-insert job must be compiled first and data values must either be hardcoded or supposed as input parameters.

How can I modify the graph schema?

You can modify the schema in several ways:

Add new vertex or edge types
Drop existing vertex or edge types
Add or drop attributes from an existing vertex or edge type

Any schema change can invalidate existing loading jobs and queries.

See the section "Modifying a Graph Schema" in GSQL Language Reference Part 1 - Defining Graphs and Loading Data .

How do I modify data?

To make a known modification of a known vertex or edge: Option 1) Make a RESTPP endpoint request, to the POST /graph or DELETE /graph endpoint. See the RESTPP API User Guide .

Option 2) The Loading language includes an upsert command. The UPSERT statement performs a combined modify-or-add operation, depending on whether the indicated vertex or edge already exists. Examples of UPSERT are described in the GSQL Demo Examples document. The GSQL Language Reference Part 1 - Defining Graphs and Loading Data provides a full specification .

Option 3) The query language now includes an UPDATE statement which enables sophisticated selection of which vertices and edges to update and how to update them. Likewise, there is an INSERT statement in the query language. See the GSQL Language Reference Part 2 - Querying .

How do I selectively delete data?

You can write a query which selects vertices or edges to be deleted. See the DELETE subsections of the "Data Modification Statements" section in GSQL Language Reference Part 2 - Querying .

How do I erase all the data?

If you wish to completely clear all the data in the graph store, use the CLEAR GRAPH STORE -HARD command. Be very careful using this command; deleted data cannot be restored (except from a Backup). Note that clearing the data does not erase the catalog definitions of vertex, edge, and graph types. See also " How do I delete my entire graph schema? "

-HARD must be in all capital letters.

Querying

Is there more than one TigerGraph query language?

Yes. The GSQL Query Language is a full-featured graph query-and-data-computation language. In addition, there is a small lightweight set of built-in query commands that can inspect the set of stored vertices and edges, but these built-in commands do not support graph traversal (moving from one vertex to another via edges). We refer to this as the Standard Data Manipulation API or the Built-in Query Language (described in RESTPP API User Guide and the GSQL Demo Examples )

What is the basic syntax for the TigerGraph query language?

For a first-time user: See the documents GSQL Demo Examples and then GSQL Language Reference Part 2 - Querying . For users with some experience, a reference card is now available: GSQL Query Language Reference Card.

Is GSQL a query language or a programming language?

The GSQL Query Language supports powerful graph querying, but it is also designed to perform powerful computations. GSQL is Turing-complete, so it can be considered a programming language. It can be used for simple SQL-like queries, but it also features control flow (IF, WHILE, FOREACH), procedural calls, local and global variables, complex data types, and accumulators to enable much more sophisticated use.

What types of accumulators are available?

Three new types were introduced in v0.8: GroupByAccum, BitwiseAndAccum, and BitwiseOrAccum. Version 0.8.1. added ArrayAccum. This is a quick summary. For a more detailed explanation, see the "Accumulator Types" section of GSQL Language Reference Part 2 - Querying .

In the following table, baseType means any of the following: INT, UINT, FLOAT, DOUBLE, STRING, BOOL, VERTEX, EDGE, JSONARRAY, JSONOBJECT, DATETIME

How do I use accumulators?

See the section "Accumulators" in the GSQL Language Reference Part 2 - Querying document.

How do I reference the ID fields of a vertex or edge in a built-in query?

Vertex and edge IDs (i.e., the unique identifier for each vertex or edge) are treated differently than user-defined attributes. Special keywords must be used to refer to the PRIMARY_ID, FROM, or TO id fields.

Vertices :

In a CREATE VERTEX statement, the PRIMARY_ID is required and is always listed first. User-defined attributes are optional and come after the required ID fields.

CREATE VERTEX Book  (PRIMARY_ID bookcode STRING, title STRING)
CREATE VERTEX Genre (PRIMARY_ID genre_id STRING, genre_name STRING)
CREATE UNDIRECTED EDGE book_genre (FROM Book, TO Genre)
CREATE GRAPH book_rating (Book, Genre, book_genre)

In a built-in query, if you wish to select vertices by specifying an attribute value, you use the attribute name (e.g., title):

SELECT * FROM Book WHERE title=="The Three-Body Problem"

In contrast, if you wish to reference vertices by the id value, the lowercase keyword primary_id must be used. Note that that query does not use the id name pid .

SELECT * FROM Book WHERE primary_id=="101"

Edges :

In a CREATE EDGE statement, the FROM and TO vertex identifiers are required and are always listed first. The FROM and TO values should match the PRIMARY_ID values of a source vertex and a target vertex. In the example below, rating and date_time are user-defined optional attributes.

CREATE UNDIRECTED EDGE book_genre (FROM Book, TO Genre, rating uint, date_time datetime)

In a query, if you wish to select edges by specifying their FROM or TO vertex values, you must use the lowercase keywords from_id or to_id .

SELECT * FROM Book-(book_genre)->Genre WHERE from_id=="101"

What is the format of data returned by a query?

The data are in JSON format. See the section "Output Statements" in the GSQL Language Reference Part 2 - Querying .

Is there an output size limit for a data query?

Yes. The maximum output size for a query is 2GB. If the result of a query would be larger than 2GB, the system may return no data. No error message is returned.

Also, for built-in queries (using the Standard Data Manipulation REST API), queries return at most 10240 vertices or edges.

How and when do I use INSTALL QUERY and INSTALL QUERY -OPTIMIZE?

INSTALL QUERY query_name is required for each GSQL query, after its initial CREATE QUERY query_name statement and before using RUN QUERY query_name . After INSTALL query has been executed, RUN QUERY can now be used.

Anytime after INSTALL QUERY, another statement, INSTALL QUERY -OPTIMIZE can be executed once. This operation optimizes all previously installed queries, reducing their run times by about 20%.

Should I run INSTALL QUERY -OPTIMIZE?

Optimize a query if query run time is more important to you than query installation time. The initial INSTALL QUERY operation runs quickly. This is good for the development phase.

The optional additional operation INSTALL QUERY -OPTIMIZE will take more time, but it will speed up query run time. This makes sense for production systems.

Legal:

CREATE QUERY query1... 
INSTALL QUERY query1 
RUN QUERY query1(...) 
... 
INSTALL QUERY -OPTIMIZE    # (optional) optimizes run time performance for query1 and query2 
RUN QUERY query1(...)      # runs faster than before

Illegal:

INSTALL QUERY -OPTIMIZE query_name

Can multiple users install queries at the same time?

In short, yes. They will not be executed at the same time, but the installations will be queued by the order in which they were received.

Can I make a 2-dimensional (or multi-dimensional) array?

Yes. A ListAccum is like an array, a 1-dimensional array. If you nest ListAccums as the elements within an outer ListAccum, you have effectively made a 2-dimensional array. Please read Section "Nested Accumulators" in the GSQL Language Reference Part 2 - Querying for more details. Here is an example:

CREATE QUERY nestedAccumEx() FOR GRAPH anyGraph {
  ListAccum<ListAccum<INT>> @@_2d_list;
  ListAccum<ListAccum<ListAccum<INT>>> @@_3d_list;
  ListAccum<INT> @@_1d_list;
  SumAccum <INT> @@sum = 4;
   
  @@_1d_list += 1;
  @@_1d_list += 2;
  // add 1D-list to 2D-list as element
  @@_2d_list += @@_1d_list;
   
  // add 1D-enum-list to 2D-list as element
  @@_2d_list += [@@sum, 5, 6];
  // combine 2D-enum-list and 2d-list
  @@_2d_list += [[7, 8, 9], [10, 11], [12]];
   
  // add an empty 1D-list
  @@_1d_list.clear();
  @@_2d_list += @@_1d_list;
   
  // combine two 2D-list
  @@_2d_list += @@_2d_list;
   
  PRINT @@_2d_list;
   
  // test 3D-list
  @@_3d_list += @@_2d_list;
  @@_3d_list += [[7, 8, 9], [10, 11], [12]];
  PRINT @@_3d_list;
}

Can I make nested container Accumulators?

Yes, please read Section "Nested Accumulators" in the GSQL Language Reference Part 2 - Querying for more details. There are seven types of container accumulators: ListAccum, SetAccum, BagAccum, MapAccum, ArrayAccum HeapAccum, and GroupByAccum. Here the allowed combinations:

ListAccum can contain ListAccum.
MapAccum and GroupByAccum can contain any container accumulator except HeapAccum.
ArrayAccum is always nested.

Here is an example:

CREATE QUERY nestedMap() FOR GRAPH anyGraph
{
  MapAccum<String, MapAccum<int, String>> @@testMap;

  @@testMap += ("m1" -> (0 -> "value1"));
  @@testMap += ("m1" -> (1 -> "value2"));
  @@testMap += ("m2" -> (2 -> "value3"));

  IF @@testMap.containsKey("m1") THEN
    PRINT @@testMap.get("m1");
  END;
   //for map, we can get it's value, and then, get the value's key.
  PRINT @@testMap.get("m1").get(0);
}

Testing and Debugging

How can I validate a loading job?

To write a loading job, you must know the format of the input data files, so that you can describe to GSQL how to parse each data line and convert it into vertex and edge attributes. To validate a loading job, that is, to check that the actual input data meet your expectations, and that they produce the expected vertices and edges, you can use two features of the RUN JOB command: the -DRYRUN option and loading a specified range of data lines.

The full syntax for an (offline) loading job is the following:

RUN JOB [-DRYRUN] [-n [ first_line_num ,] last_line_num ] job_name

The -DRYRUN option will read input files and process data as instructed by the job, but it does not store data in the graph store.

The -n option limits the loading job to processing only a range of lines of each input data file. The selected data will be stored in the graph store, so the user can check the results. The -n flag accepts one or two arguments. For example,

-n 50 means read lines 1 to 50. -n 10,50 means read lines 10 to 50. The special symbol $ is interpreted as "last line", so -n 10,$ means reads from line 10 to the end.

Where are the logs?

The following command lists the log locations of the log files:

gadmin log

If the platform has been installed with default file locations, so that <TigerGraph_root_dir> = /home/tigergraph/tigergraph, then the output would be the following:

GPE : /home/tigergraph/tigergraph/logs/gpe/gpe1.out
GPE : /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO
GSE : /home/tigergraph/tigergraph/logs/gse/gse1.out
GSE : /home/tigergraph/tigergraph/logs/GSE_1_1/log.INFO
RESTPP : /home/tigergraph/tigergraph/logs/restpp/restpp1.out
RESTPP : /home/tigergraph/tigergraph/logs/RESTPP_1_1/log.INFO
RESTPP : /home/tigergraph/tigergraph/logs/RESTPP-LOADER_1_1/log.INFO
GSQL : /home/tigergraph/tigergraph/logs/gsql_server_log/GSQL_LOG

As of v2.4, the GSQL log files have been moved in order to keep all logs in a standard directory.

GPE: general system performance logs. GSE: Graph services logs. RESTPP: REST API call logs. GSQL: General GSQL logs.

Where are the log files of loading runs?

Each loading run creates a log file, stored in the folder <TigerGraph_rootdir>/app/<VERSION_NUM>/dev/gdk/gsql/output. The filename load_output.log is a link to the most recent log file. This file contains summary statistics on the number of lines read, the vertices created, and various types of errors encountered. Or, you can type a shell command to find log paths "gadmin log".

What are in the log files?

The log files record detailed internal operations and state information in response to user actions. They provide vital information for diagnosing and debugging your system. All log files can be found in the /home/tigergraph/tigergraph/logs directory. Through typing the command gadmin log, you will be given all the file paths of the most commonly used log files.

GPE Logs - Graph Processing Engine Logs GSE Logs - Graph Storage Engine Logs GSQL Logs - System & Query Logs RESTPP Logs - API call Logs NGINX Logs - HTTP Request Logs VIS Logs - GraphStudio Logs

I can’t seem to load any more data. What’s the matter?

One possible explanation is that you have reached a capacity limit controlled by your product license. To check if this is the case, run the command gadmin status. If the limit has been reached, there will be a warning message, such as the following:

[Warning] License limit exceeded. The system is running in limited capacity mode.

In Limited Capacity mode, additional data may not be inserted. Data may be queried and deleted.

Troubleshooting Guide

This troubleshooting guide is only up to date for v2.6 and below. Additional guidance for v3.0+ is in development.

Introduction

The Troubleshooting Guide teaches you how to monitor the status of your TigerGraph system, and when needed, find the log files in order to get a better understanding of why certain errors are occurring. This section covers log file debugging for data loading and querying.

General

Before any deeper investigation, always run these general system checks :

$ gadmin status        (Make sure all TigerGraph services are UP.)

$ df -lh               (Make sure all servers are getting enough disk space.)

$ free -g              (Make sure all servers have enough memory.)

$ tsar                 (Make sure there is no irregular memory usage on the system.)

$ dmesg -T | tail      (Make sure there are no Out of Memory, or any other errors.)

$ grun all "date"      (Make sure the time across all nodes are synchronized
                        with time difference under 2 seconds. )

Location of Log Files

The following command reveals the location of the log files :

gadmin log

You will be presented with a list of log files. The left side of the resulting file paths is the component for which the respective log file is logging information. The majority of the time, these files will contain what you are looking for. You may notice that there are multiple files for each TigerGraph component.

The .out file extension is for errors. The .INFO file extension is for normal behaviors.

In order to diagnose an issue for a given component, you'll want to check the .out log file extension for that component.

Other log files that are not listed by the gadmin log command are those for Zookeeper and Kafka, which can be found here:

zookeeper : ~/tigergraph/zk/zookeeper.out.*
kafka     : ~/tigergraph/kafka/kafka.out

Synchronize time across nodes in a cluster

TigerGraph will experience a variety of issues if clocks across different nodes in a cluster are out of sync. If running grun all "date" shows that the clocks are out of sync, it is highly recommended that you install NTP implementations such as chrony or timesyncd to keep them in sync.

Installation Error Debugging

Missing Dependencies

The installation will quit if there are any missing dependency packages, and output a message. Please run bash install_tools.sh to install all missing packages. You will need an internet connection to install the missing dependencies.

Pinpoint The Failed Step

Using the -x flag during installation will show you the detailed shell commands being run during installation. bash -x install.sh

Disk Space Errors

The /home directory requires at least 200MB of space, or the installation will fail with an out of disk message. This is temporary only during installation and will be moved to the root directory once installation is complete.
The /tmp directory requires at least 1GB of space, or the installation will fail with an out of disk message
The directory in which you choose to install TigerGraph requires at least 20GB of space, or the installation will report the error and exit.

Firewall Errors

If your firewall blocks all ports not defined for use, we recommend opening up internal ports 1000-50000.

If you are using a cloud instance, you will need to configure the firewall rules through the respective consoles. e.g. Amazon AWS or Microsoft Azure If you are managing a local machine, you can manage your open ports using the iptables command. Please refer to the example below to help with your firewall configuration.

# iptables help page
$ sudo iptables -h

# This will list your firewall rules
$ sudo iptables -L  

# Allow incoming SSH connections to port 22 from the 192.168.0.0 subnet
$ sudo iptables -A INPUT -p tcp --dport 22 -s 192.168.0.0/24 -j ACCEPT
$ sudo iptables -A INPUT -p tcp --dport 22 -s 127.0.0.0/8 -j ACCEPT
$ sudo iptables -A INPUT -p tcp --dport 22 -j DROP

Query Debugging

Checking the Logs - Flow of a query in the system

To better help you understand the flow of a query within the TigerGraph system, we've provided the diagram below with arrows showing the direction of information flow. We'll walk through the execution of a typical query to show you how to observe the information flow as recorded in the log files.

From calling a query to returning the result, here is how the information flows: 1. Nginx receives the request.

grep <QUERY_NAME> /home/tigergraph/tigergraph/logs/nginx/ngingx_1.access.log

You can click on the image below to expand it.

2. Nginx sends the request to Restpp.

grep <QUERY_NAME> /home/tigergraph/tigergraph/logs/RESTPP_1_1/log.INFO

3. Restpp sends an ID translation task to GSE and a query request to GPE. 4. GSE sends the translated ID to GPE, and the GPE starts to process the query. 5. GPE sends the query result to Restpp, and sends a translation task to GSE, which then sends the translation result to Restpp.

grep <REQUEST_ID> /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO

grep <REQUEST_ID> /home/tigergraph/tigergraph/logs/GSE_1_1/log.INFO

6. Restpp sends the result back to Nginx.

grep <REQUEST_ID> /home/tigergraph/tigergraph/logs/RESTPP_1_1/log.INFO

7. Nginx sends the response.

grep <QUERY_NAME> /home/tigergraph/tigergraph/logs/nginx/nginx_1.access.log

Other Useful Commands for Query Debugging

Check recently executed query:
$ grep UDF:: /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO | tail -n 50

Get the number of queries executed recently:
$ grep UDF::End /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO | wc -l

Grep distributed query log:
$ grep “Action done” /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO | tail -n 50


Grep logs from all servers:
$ grun all “grep UDF:: /home/tigergraph/tigergraph/logs/GPE_*/log.INFO | tail -n 50”

Slow Query Performance

Multiple situations can lead to slower than expected query performance:

Insufficient Memory When a query begins to use too much memory, the engine will start to put data onto the disk, and memory swapping will also kick in. Use the Linux command: free -g to check available memory and swap status, or you can also monitor the memory usage of specific queries through GPE logs. To avoid running into insufficient memory problems, optimize the data structure used within the query or increase the physical memory size on the machine.
GSQL Logic Usually, a single server machine can process up to 20 million edges per second. If the actual number of vertices or edges is much much lower, most of the time it can be due to inefficient query logic. That is, the query logic is now following the natural execution of GSQL. You will need to optimize your query to tune the performance.
Disk IO When the query writes the result to the local disk, the disk IO may be the bottleneck for the query's performance. Disk performance can be checked with this Linux command : sar 1 10. If you are writing (PRINT) one line at a time and there are many lines, storing the data in one data structure before printing may improve the query performance.
Huge JSON Response If the JSON response size of a query is too massive, it may take longer to compose and transfer the JSON result than to actually traverse the graph. To see if this is the cause, check the GPE log.INFO file. If the query execution is already completed in GPE but has not been returned, and CPU usage is at about 200%, this is the most probable cause. If possible, please reduce the size of the JSON being printed.
Memory Leak This is a very rare issue. The query will progressively become slower and slower, while GPE's memory usage increases over time. If you experience these symptoms on your system, please report this to the TigerGraph team.
Network Issues When there are network issues during communication between servers, the query can be slowed down drastically. To identify that this is the issue, you can check the CPU usage of your system along with the GPE log.INFO file. If the CPU usage stays at a very low level and GPE keeps printing ??? , it means network IO is very high.
Frequent Data Ingestion in Small Batches Small batches of data can increase the data loading overhead and query processing workload. Please increase the batch size to prevent this issue.

Query Hangs

When a query hangs or seems to run forever, it can be attributed to these possibilities :

Services are down Please check that TigerGraph services are online and running. Run gadmin status and possibly check the logs for any issues that you find from the status check.
Query is in an infinite loop To verify this is the issue, check the GPE log.INFO file to see if graph iteration log lines are continuing to be produced. If they are, and the edgeMaps log the same number of edges every few iterations, you have an infinite loop in your query. If this is the case, please restart GPE to stop the query : gadmin restart gpe -y. Proceed to refine your query and make sure your loops within the query are able to break out of the loop.
Query is simply slow If you have a very large graph, please be patient. Ensure that there is no infinite loop in your query, and refer to the slow query performance section for possible causes.
GraphStudio Error If you are running the query from GraphStudio, the loading bar may continue spinning as if the query has not finished running. You can right-click the page and select inspect->console (in the Google Chrome browser) and try to find any suspicious errors there.

Query Returns No Result

If a query runs and does not return a result, it could be due to two reasons: 1. Data is not loaded. From the Load Data page on GraphStudio, you are able to check the number of loaded vertices and edges, as well as the number of each vertex or edge type. Please ensure that all the vertices and edges needed for the query are loaded.

2. Properties are not loaded. The number of vertices and edges traversed can be observed in the GPE log.INFO file. If for one of the iterations you see activated 0 vertices, this means no target vertex satisfied your searching condition. For example, the query can fail to pass a WHERE clause or a HAVING clause. If you see 0 vertex reduces while the edge map number is not 0, that means that all edges have been filtered out by the WHERE clause, and that no vertices have entered into the POST-ACCUM phase. If you see more than 0 vertex reduces, but activated 0 vertices, this means all the vertices were filtered out by the HAVING clause.

To confirm the reasoning within the log file, use GraphStudio to pick a few vertices or edges that should have satisfied the conditions and check their attributes for any unexpected errors.

Query Installation Failed

Query Installation may fail for a handful of reasons. If a query fails to install, please check the GSQL log file. The default location for the GSQL log is here :

/home/tigergraph/tigergraph/logs/gsql_server_log/GSQL_LOG

Go down to the last error and it will point you to the error. This will show you any query errors that could be causing the failed installation. If you have created a user-defined function, you could potentially have a c++ compilation error.

If you have a c++ user-defined function error, your query will fail to install, even if it does not utilize the UDF.

How to monitor memory usage by query

GPE records memory usage by query at different stages of the query and saves it to $(gadmin config get System.LogRoot)/gpe/log.INFO. You can monitor how much memory a query is using by searching the log file for the request ID and filter for lines that contain "QueryMem":

grep -i <request_id> $(gadmin config get System.LogRoot)/gpe/log.INFO | 
    grep -i "querymem"

You can also run a query first, and then run the following command immediately after to retrieve the most recent query logs and filter for "QueryMem":

tail -n 50 $(gadmin config get System.LogRoot)/gpe/log.INFO |
    grep -i "querymem"

You will get results that look like the following, which shows memory usage by the query at different stages of its execution. The number at the end of each line indicates the number of bytes of memory utilized by the query:

0415 01:33:40.885433  6553 gpr.cpp:195] Engine_MemoryStats|     \ 
ldbc_snb::,196612.RESTPP_1_1.1618450420870.N,NNN,15,0,0|        \
MONITORING Step(1) BeforeRun[GPR][QueryMem]: 116656             

I0415 01:33:42.716199  6553 gpr.cpp:241] Engine_MemoryStats|    \
ldbc_snb::,196612.RESTPP_1_1.1618450420870.N,NNN,15,0,0|        \
MONITORING Step(1) AfterRun[GPR][QueryMem]: 117000

How to check system free memory percentage

You can check how much free memory your system has as a percentage of its total memory by running the following command:

tail -n 50 $(gadmin config get System.LogRoot)/gpe/log.INFO | grep -i 'freepct'

The number following "FreePct" indicates the percentage of the system free memory. The following example shows the system free memory is 69%:

I0520 23:40:09.845811  7828 gsystem.cpp:622] 
System_GSystem|GSystemWatcher|Health|ProcMaxGB|0|ProcAlertGB|0|
CurrentGB|1|SysMinFreePct|10|SysAlertFreePct|30|FreePct|69

When free memory drops below 10 percent (SysMinFreePct), all queries are aborted. This threshold is adjustable through gadmin config.

How to retrieve information on queries aborted due to memory usage

 log:W0312 02:10:57.839139 15171 scheduler.cpp:116] System Memory in Critical state. Aborted.. Aborting.

Data Loading Debugging

Checking the Logs

GraphStudio

Using GraphStudio, you are able to see, from a high-level, a number of errors that may have occurred during the loading. This is accessible from the Load Data page. Click on one of your data sources, then click on the second tab of the graph statistics chart. There, you will be able to see the status of the data source loading, number of loaded lines, number of lines missing, and lines that may have an incorrect number of columns. (Refer to picture below.)

Command Line

If you see there are a number of issues from the GraphStudio Load Data page, you can dive deeper to find the cause of the issue by examining the log files. Check the loading log located here:

/home/tigergraph/tigergraph/logs/restpp/restpp_loader_logs/<GRAPH_NAME>/

Open up the latest .log file and you will be able to see details about each data source. The picture below is an example of a correctly loaded data file.

Here is an example of a loading job with errors :

From this log entry, you are able to see the errors being marked as lines with invalid attributes. The log will provide you the line number from the data source which contains the loading error, along with the attribute it was attempting to load to.

Slow Loading

Normally, a single server running TigerGraph will be able to load from 100k to 1000k lines per second, or 100GB to 200GB of data per hour. This can be impacted by any of the following factors:

Loading Logic How many vertices/edges are generated from each line loaded?
Data Format Is the data formatted as JSON or CSV? Are multi-level delimiters in use? Does the loading job intensively use temp_tables?
Hardware Configuration Is the machine set up with HDD or SSD? How many CPU cores are available on this machine?
Network Issue Is this machine doing local loading or remote POST loading? Any network connectivity issues?
Size of Files How large are the files being loaded? Many small files may decrease the performance of the loading job.
High Cardinality Values Being Loaded to String Compress Attribute Type How diverse is the set of data being loaded to the String Compress attribute?

To combat the issue of slow loading, there are also multiple methods:

If the computer has many cores, consider increasing the number of Restpp load handlers.

$ gadmin --config handler
increase the number of handlers
save
$ gadmin --config apply

Separate ~/tigergraph/kafka from ~/tigergraph/gstore and store them on separate disks.
Do distributed loading.
Do offline batch loading.
Combine many small files into one larger file.

Loading Hangs

When a loading job seems to be stuck, here are things to check for :

GPE is DOWN You can check the status of GPE with this command : gadmin status gpe If GPE is down, you can find the logs necessary with this command : gadmin log -v gpe
Memory is full Run this command to check memory usage on the system : free -g
Disk is full Check disk usage on the system : df -lh
Kafka is DOWN You can check the status of Kafka with this command : gadmin status kafka If it is down, take a look at the log with this command : vim ~/tigergraph/kafka/kafka.out
Multiple Loading Jobs By default, the Kafka loader is configured to allow a single loading job. If you execute multiple loading jobs at once, they will run sequentially.

Data Not Loaded

If the loading job completes, but data is not loaded, there may be issues with the data source or your loading job. Here are things to check for:

Any invalid lines in the data source file. Check the log file for any errors. If an input value does not match the vertex or edge type, the corresponding vertex or edge will not be created.
Using quotes in the data file may cause interference with the tokenization of elements in the data file. Please check the GSQL Language Reference section under Other Optional LOAD Clauses. Look for the QUOTE parameter to see how you should set up your loading job.
Your loading job loads edges in the incorrect order. When you defined the graph schema, the from and to vertex order will affect the way you write the loading job. If you wrote the loading job in reversed order, the edges will not be created, possibly also affecting the population of vertices.

Loading is Incorrect

If you know what data you expect to see (number of vertices and edges, and attribute values), but the loaded data does not mean your expectations, there are a number of possible causes to investigate:

First, check the logs for important clues.
Are you reaching and reading all the data sources (paths and permissions)?
Is the data mapping correct?
Are your data fields correct? In particular, check data types. For strings, check for unwanted extra strings. Leading spaces are not removed unless you apply an optional token function to trim the extra spaces.
Do you have duplicate ids, resulting in the same vertex or edge being loading more than once. Is this intended or unintended? TigerGraph's default loading semantics is UPSERT. Check the loading documentation to maker sure you understand the semantics in detail:
https://docs.tigergraph.com/dev/gsql-ref/ddl-and-loading/creating-a-loading-job#cumulative-loading

Loading Failure

Possible causes of a loading job failure are:

Loading job timed out If a loading job hangs for 600 seconds, it will automatically time out.
Port Occupied Loading jobs require port 8500. Please ensure that this port is open.

Schema Change Debugging

This section will only cover the debugging schema change jobs, for more information about schema changes, please read the Modifying a Graph Schema page.

Understanding what happens behind the scenes during a schema change.

DSC (Dynamic Schema Change) Drain - Stops the flow of traffic to RESTPP and GPE If GPE receives a DRAIN command, it will wait 1 minute for existing running queries to finish up. If the queries do not finish within this time, the DRAIN step will fail, causing the schema change to fail.
DSC Validation - Verification that no queries are still running.
DSC Apply - Actual step where the schema is being changed.
DSC Resume - Traffic resumes after schema change is completed. Resume will automatically happen if a schema change fails. RESTPP comes back online. All buffered query requests will go through after RESTPP resumes, and will use the new updated schema.

Schema changes are not recommended for production environments. Even if attributes are deleted, TigerGraph's engine will still scan all previous attributes. We recommend limiting schema changes to dev environments.

Schema changes are all or nothing. If a schema change fails in the middle, changes will not be made to the schema.

Signs of Schema Change Failure

Failure when creating a graph
Global Schema Change Failure
Local Schema Change Failure
Dropping a graph fails
If GPE or RESTPP fail to start due to YAML error, please report this to TigerGraph.

If you encounter a failure, please take a look at the GSQL log file : gadmin log gsql. Please look for these error codes:

Error code 8 - The engine is not ready for the snapshot. Either the pre-check failed or snapshot was stopped. The system is in critical non-auto recoverable error state. Manual resolution is required. Please contact TigerGraph support.
Error code 310 - Schema change job failed and the proposed change has not taken effect. This is the normal failure error code. Please see next section for failure reasons.

Reasons For Dynamic Schema Change Failure

Another schema change or a loading job is running. This will cause the schema change to fail right away.
GPE is busy. Potential reasons include :
- Long running query.
- Loading job is running.
- Rebuild process is taking a long time.
Service is down. (RESTPP/GPE/GSE)
Cluster system clocks are not in sync. Schema change job will think the request is stale, causing this partition's schema change to fail.
Config Error. If the system is shrunk manually, schema change will fail.

Log Files

You will need to check the logs in this order : GSQL log, admin_server log, service log. Admin_server log files can be found here : ~/tigergraph/logs/admin_server/ You will want to take a look at the INFO file. The service log is each of the services respectively. gadmin log <service_name> will show you the location of these log files.

Example of a successful schema change job. (admin_server log)

$ grep DSC ~/tigergraph/logs/admin_server/INFO.20181011-101419.98774 

I1015 12:04:14.707512 116664 gsql_service.cpp:534] Notify RESTPP DSCDrain successfully.
I1015 12:04:15.765108 116664 gsql_service.cpp:534] Notify GPE DSCDrain successfully.
I1015 12:04:16.788666 116664 gsql_service.cpp:534] Notify GPE DSCValidation successfully.
I1015 12:04:17.805620 116664 gsql_service.cpp:534] Notify GSE DSCValidation successfully.
I1015 12:04:18.832386 116664 gsql_service.cpp:534] Notify GPE DSCApply successfully.
I1015 12:04:21.270011 116664 gsql_service.cpp:534] Notify RESTPP DSCApply successfully.
I1015 12:04:21.692147 116664 gsql_service.cpp:534] Notify GSE DSCApply successfully.

Example of DSC fail

E1107 14:13:03.625350 98794 gsql_service.cpp:529] Failed to notify RESTPP with command: DSCDrain. rc: kTimeout. Now trying to send Resume command to recover.
E1107 14:13:03.625562 98794 gsql_service.cpp:344] DSC failed at Drain stage, rc: kTimeout
E1107 14:14:03.814132 98794 gsql_service.cpp:513] Failed to notify RESTPP with command: DSCResume. rc: kTimeout

In this case, we see that RESTPP failed at the DRAIN stage. We need to first look at whether RESTPP services are all up. Then, verify that the time of each machine is the same. If all these are fine, we need to look at RESTPP log to see why it fails. Again, use the "DSC" keyword to navigate the log.

GSE Error Debugging

To check the status of GSE, and all other processes, run gadmin status to show the status of key TigerGraph processes. As with all other processes, you are able to find the log file locations for GSE by the gadmin log command. Refer to the Location of Log Files for more information about which files to check.

$ gadmin log gse
[Warning] License will expire in 5 days
GSE : /home/tigergraph/tigergraph/logs/gse/gse_1_1.out
GSE : /home/tigergraph/tigergraph/logs/GSE_1_1/log.INFO

GSE Process Fails To Start

If the GSE process fails to start, it is usually attributed to a license issue, please check these factors :

License Expiration gadmin status license This command will show you the expiration date of your license.
Single Node License on a Cluster If you are on a TigerGraph cluster, but using a license key intended for a single machine, this will cause issues. Please check with your point of contact to see which license type you have.
Graph Size Exceeds License Limit Two cases may apply for this reason. The first reason is you have multiple graphs but your license only allows for a single graph. The second reason is that your graph size exceeds the memory size that was agreed upon for the license. Please check with your point of contact to verify this information.

GSE status is "not_ready"

Usually in this state, GSE is warming up. This process can take quite some time depending on the size of your graph.

Very rarely, this will be a ZEROMQ issue. Restarting TigerGraph should resolve this issue

gadmin restart -y

GSE crash

GSE crashes are likely due to and Out Of Memory issue. Use the dmesg -T command to check any errors.

If GSE crashes, and there are no reports of OOM, please reach out to TigerGraph support.

GSE High Memory Consumption

If your system has unexpectedly high memory usage, here are possible causes :

Length of ID strings is too long GSE will automatically deny IDs with a length longer than 16k. Memory issues could also arise if an ID string is too long ( > 500). One proposed solution to this is to hash the string.
Too Many Vertex Types Check the number of unique vertex types in your graph schema. If your graph schema requires more than 200 unique vertex types, please contact TigerGraph support.

GraphStudio Debugging

Browser Crash / Freeze

If your browser crashes or freezes (shown below), please refresh your browser.

GraphStudio Crash

If you suspect GraphStudio has crashed, first run gadmin status to verify all the components are in good shape. Two known causes of GraphStudio crashes are :

Huge JSON response User-written queries can return very large JSON responses. If GraphStudio often crashes on large query responses, you can try reducing the size limit for JSON responses by changing the GUI.RESTPPResponseMaxSizeBytes configuration using gadmin config. The default limit is 33554432 bytes.

$ gadmin config entry GUI.RESTPPResponseMaxSizeBytes
New: 33554431
[   Info] Configuration has been changed. Please use 'gadmin config apply' to persist the changes.
$ gadmin config apply

Very Dense Graph Visualization On the Explore Graph page, the "Show All Paths" query on a very dense graph is known to cause a crash.

DEBUG mode

To find the location of GraphStudio log files, use this command : gadmin log gui

$ gadmin log vis
[Warning] License will expire in 5 days
VIS : /home/tigergraph/tigergraph/logs/gui/gui_ADMIN.log
VIS : /home/tigergraph/tigergraph/logs/gui/gui_INFO.log

Allowing GraphStudio DEBUG mode will print out more information to the log files. To allow DEBUG mode, please edit the following file : /home/tigergraph/tigergraph/visualization/server/src/config/local.json

After editing the file, run gadmin restart gui -y to restart the GraphStudio service. Follow along the log file to see what is happening : tail -f /home/tigergraph/tigergraph/logs/gui/gui_INFO.log

Repeat the error-inducing operations in GraphStudio and view the logs.

Known Issues

There is a list of known GraphStudio issues here.

Further Debugging

If after taking these actions you cannot solve the issue, please reach out to support@tigergraph.com to request assistance.

Log Files

TigerGraph Database captures key information on activities occurring across its different components through log functions that output to log files. These log files are not only helpful in troubleshooting but also serve as an auditory resource. This document gives a high-level overview of TigerGraph's logging structure and lists some common information one might need to monitor their database services and where to obtain them in the logs.

Overall Logging Structure

Logs in TigerGraph are stored at <tigergraph_root_dir>/log/. TigerGraph's logs are divided into different folders by the different internal components and each folder corresponds to a different component. Log formats also vary across the different components. In folders where logs are checked often, such as restpp, gsql, and admin, there are three symbolic links that help you quickly get to the most recent log file of that category:

log.INFO
- Contains regular output and errors
log.ERROR
- Contains errors only
<component_name>.out
- Contains all output from the component process
log.WARNING or log.DEBUG
- log.WARNING contains warnings
- In thegsql folder, log.DEBUG contains very specific information you only need when certain errors happen

Knowing where certain activities are recorded allows one to use tools such as the Linux grep command to easily obtain critical information from your database.

Log locations on a cluster

In a TigerGraph cluster, each node will only keep logs of activities that took place on the node itself. For example, the GSQL logs on the m1 node will only record events for m1 and are not replicated across the cluster.

For GSQL specifically, the cluster will elect a leader to which all GSQL requests will be forwarded. To check which node is the leader, start by checking the GSQL logs of the m1 node. Check the most recent lines of log.INFO and look for lines containing information about leader switch. For example, the logs below recorded a GSQL leader switch from m2 to m1:

I@20210709 13:56:52.214  (GsqlHAHandler.java:231) GSQL leader switches from 'm2' to 'm1' ...
E@20210709 13:56:52.215  (GsqlHAHandler.java:246) GSQL HA leader switches to 'm1', abort and clear all sessions now.
If you want to lower the chance of leader switch by increasing timeout, please use 'gadmin config' to increase 'Controller.LeaderElectionHeartBeatMaxMiss' and/or 'Controller.LeaderElectionHeartBeatIntervalMS'.
I@20210709 13:56:52.219  (SessionManager.java:197) Abort and clear all sessions...
I@20210709 13:56:52.220  (SessionManager.java:204) All sessions aborted.
I@20210709 13:56:52.224  (GsqlHAHandler.java:283) switched to new leader m1

Monitor request history

All requests made to TigerGraph's REST endpoints are recorded by the RESTPP logs and Nginx logs. Information available in the logs includes:

Timestamp of the request
API request parameters
Request Status
User information (when RESTPP authentication is turned on)

RESTPP is responsible for many tasks in the TigerGraph internal architecture and records many internal API calls, which can be hard to distinguish from manual requests. When RESTPP authentication is on, the RESTPP log will record the user information and mark a call if it is made by an internal API. Therefore, you can use the command below to filter for manual requests:

# In the restpp log directory
$ grep -i "requestinfo" log.INFO | grep -v "__INTERNAL_API__"

# All requests exluding the ones made by internal API
I0315 21:11:59.666318 14535 handler.cpp:351] RequestInfo|,1.RESTPP_1_1.1615842719666.N,NNN,0,0,0|user:tigergraph|api:v2|function:NoSchema|graph_name:social|libudf:
I0315 21:41:36.462616 14541 handler.cpp:351] RequestInfo|,196622.RESTPP_1_1.1615844496462.N,NNN,0,0,0|user:tigergraph|api:v2|function:NoSchema|graph_name:social|libudf:

RequestInfo contains the ID of the request, which you can use to look up more information on the request :

Here is an example of using a request ID to look up a request in the restpp log:

$ grep "1615842719666" log.INFO

# Returns all information about the specific request
# RawRequest log is captured at the entry point of a query
I0315 21:11:59.666026 14535 handler.cpp:285] RawRequest|,1.RESTPP_1_1.1615842719666.N,NNN,0,0,0|GET|/echo?parameter1=parameter_value|async = 0|payload_data.size() = 0|api = v2
# RequestInfo log is captured after the request has been parsed, 
# and contains information such as username and the function or UDF to run 
I0315 21:11:59.666318 14535 handler.cpp:351] RequestInfo|,1.RESTPP_1_1.1615842719666.N,NNN,0,0,0|user:tigergraph|api:v2|function:NoSchema|graph_name:social|libudf:
# ReturnResult is captured when the request has been processed
I0315 21:11:59.666509 14535 requestrecord.cpp:325] ReturnResult|0|0ms|RESTPP|1.RESTPP_1_1.1615842719666.N|user:tigergraph|/echo|graph_id=1&graph_name=social&parameter1=parameter_value|39

Monitor user management tasks

User management activities, such as logins, role and privilege changes are recorded in the GSQL logs in the folder gsql.

To view recent activities, use the symlink log.INFO. There is a lot of information in the logs - to filter for information that you need, you can use Linux commands such as grep and tail For example, to view recent changes in roles, you can run the following command in the gsql log directory:

$ grep -i "role" log.INFO

# Returns all lines containing the word "role"
#                        username     source IP       
I@20210312 22:41:16.167 tigergraph|127.0.0.1:45854|00000000077 (BaseHandler.java:133) Received|POST|/gsql/roles?action=grant&role=globaldesigner&name=lennessy|0
I@20210312 22:41:16.863 tigergraph|127.0.0.1:45854|00000000077 (BaseHandler.java:167) Successful|POST|/gsql/roles?action=grant&role=globaldesigner&name=lennessy|application/json; charset=UTF-8|696ms

To view login activities, search log.INFO for "login" instead.

$ grep -i "login" log.INFO

# Returns all lines containing the world "login"
I@20210315 21:08:42.047 tigergraph|127.0.0.1:53960|00000000001 (BaseHandler.java:133) Received|POST|/gsql/login|28
I@20210315 21:08:42.061 tigergraph|127.0.0.1:53960|00000000001 (LoginHandler.java:52) The gsql client is started on the server, and the working directory 
is /home/tigergraph/tigergraph/log/restpp
I@20210315 21:08:42.072 tigergraph|127.0.0.1:53960|00000000001 (LoginHandler.java:80) Successful|Login|tigergraph
I@20210315 21:08:42.080 tigergraph|127.0.0.1:53960|00000000001 (BaseHandler.java:167) Successful|POST|/gsql/login|application/json; charset=UTF-8|35ms

Set up Log Viewing with Elasticsearch, Kibana and Filebeat

The TigerGraph system produces extensive and detailed logs about each of its components. Starting with TigerGraph 3.2, TigerGraph provides a gadmin utility that allows users to easily view log files through an Elasticsearch, Kibana, and Filebeat setup. This page offers a step-by-step guide to set up log viewing for all components in a TigerGraph cluster with Elastic search, Kibana, and Filebeat.

Before you begin

Install Elasticsearch on a machine that is running TigerGraph.
- If you have a TigerGraph cluster, you only need to install Elasticsearch on one node.
Install Kibana on the same machine where you installed Elasticsearch.
Install Filebeat.
- If you have a TigerGraph cluster, you need to install Filebeat on all nodes in the cluster.

Procedure

1. Configure Elasticsearch for remote access

The default Elasticsearch settings only allow the Elasticsearch service to be accessed from the same machine it starts from. In order to allow Elasticsearch to receive log files from other servers in the cluster, we have to make the following edits to the file at /etc/elasticsearch/elasticsearch.yml

network.host: "<server_private_ip>"
discovery.seed_hosts: ["<server_private_ip>"]
# server_private_ip refers to the private ip address of the machine where 
# elasticsearch is installed
cluster.initial_master_nodes: [ "node-1" ] 
# "node-1" is the default name of the Elasticsearch node. If you changed
# the defualt name, you would use the name you chose instead

After editing the configurations, restart the Elasticsearch service.

Elasticsearch is a memory-intensive service. For more information on memory management for Elasticsearch, see Managing and Troubleshooting Elasticsearch Memory.

2. Configure Kibana with Elasticsearch and enable remote access

You need to make the following changes to the file at /etc/kibana/kibana.yml:

To allow remote access, change the value of server.host to the IP address or DNS name of the Kibana server. Since the Kibana server is on the same machine as Elasticsearch, this value should be the same private IP that you specified as Elasticsearch's network.host.
Additionally, you need to provide the address of the Elasticsearch server in the elasticsearch.hosts setting. By default, Elasticsearch is on port 9200, so the value for this setting should be ["server_private_ip:9200"]

After editing the configurations, restart the Kibana service.

3. Configure Filebeat

Finally, we need to configure Filebeat to have each component on each node send its logs to the Elasticsearch server. To do so, run the following gadmin command:

gadmin log build filebeat <--host={ip_address_1}[,{ip_address_n}]*>

The command outputs a Filebeat configuration file filebeat.yml . The following options are available:

After generating the filebeat.yml file, copy it to the directory /etc/filebeat on every node, and restart the Filebeat service on each node.

After the service restarts, you should be able to view the logs through Kibana's user interface in your browser at server_ip:5601.

Error Codes

The reference page for status codes on the TigerGraph platform.

This page documents the status codes and exit codes on the TigerGraph platform. Each status code follows the format: <component>-<code> , while exit codes are numeric values between 0 - 255.

The GSQL Client will exit with a non-zero code if there’s an error while handling a user request. To view the exit code of the GSQL client, run the command echo $? and the exit code of the most recent command will be printed to the terminal.

Return codes

REST

This section covers return codes from the REST++ server.

0000 - 0999

Codes in this range are success codes. When the conditions for multiple codes are true, the lowest code is returned.

1000 - 1999

RESTPP endpoint errors.

2000 - 2999

Payload errors.

3000 - 3999

RESTPP general errors.

10000+

Other RESTPP errors.

GSQL

0000 - 0999

Codes in this range are success codes. GSQL will return the smallest code when the conditions are met for multiple codes.

1000 - 1999

Query parameter errors.

2000 - 2499

JSON string related errors.

2500 - 2999

Operator errors.

3000 - 3999

Dynamic expression errors and expression function errors.

4500 - 4999

Vertex type, edge type, and ID translation errors

5000 - 5499

Print errors.

6000 - 6499

Errors related to updating the graph.

6500 - 6999

7000 - 7999

Built-in query errors.

8000 - 8999

Unexpected exceptions (C++).

40000+

User-defined exception errors.

SYS

This section covers engine-related errors.

GSQL client exit codes

The GSQL client will exit with non-zero code if there’s an error while handling a user request. To check the exit code, run the Linux command echo $? and the exit code of the most recent command will be printed in the console.

*: The exit codes marked with a star (*) are only applicable when a GSQL script is given as an argument.

Change Log

This page will document all the changes to TigerGraph product including New Features and Bug Fixes.

Distributed Graph support and certain other enterprise-level features are available in the Enterprise Edition only. They do not pertain to the Developer Edition.

TigerGraph 3.2.1

Changes

Increased limit for graph catalog size
Added retention for Metadata topic in Kafka
Improved error handling when retrieving patterns from server in Visual Query Builder

Fixes

Fixed an issue during upgrade related to initiating Kafka
Fixed a bug that in rare cases caused a catalog size limit issue
Fixed an issue that slowed queries that write to files in 3.2.0
Fixed a bug that in rare cases caused query compilation issues

TigerGraph 3.2

Release Date: 2021-09-30

Features

Check release notes: 3.2.0 Release notes

Changes:

GSQL:

ADD Edge Pair commands as part of Schema Change operations are now allowed.
Query-calling-query limitation: Distributed main query cannot call a distributed sub-query.
Default logging level for GSQL logs has been changed from DEBUG to INFO

Enhancements

Database Server:

Core: Improve Abort transaction if the transaction is too long
Core: GPE hung under high number of concurrent queries
Core: Turn on transaction for RestPP Post for atomicity
Core: Workload management: Specify replica for a query to run in a distributed cluster
Core: Standardize the correct http response code for query requests
GSQL: Query Installation Performance Improvements
- Support longer reload time for Query installation on 1000’s of queries
- Don't drop parent queries when subquery is installed
GSQL: Support for new built-in functions:
- Math-related functions: round(), reverse(), repeat(), insert(), cot(), degrees(), radians, square, truncate, log2
- String-related functions instr(), length(), substr(), PI(), rand(), lpad(), rpad(), replace(), ascii(), chr(), soundex(), difference(), translate(), space(), ltrim(), rtrim(), find_in_set(), left(), right();
GSQL: Support FROM/TO vertex type change for edge type metadata
GSQL: Support VLAC tags in Import and Export operations
GSQL: Allow variable declaration anywhere in query body
GSQL: Support initialization from an expression;
GSQL: Support revoking superuser role from default user
GSQL: Improve the error message displayed when connecting to LDAP server
Platform: Upgrade to Java 11
Platform: Add support for ubuntu20
Platform: Show executor status and updated status of other services
Platform: Run upgrade locally without ssh if user is local with only a single node
Platform: Start/stop local executor will no longer need ssh,
Platform: Increase Backup/Restore S3 upload Partition Size
Platform: Make Backup/Restore Heartbeat timeout configurable to allow media with slower speeds.

GraphStudio:

WCAG compliance changes
Support overwriting exploration result
Support duplicate file-edge mappings and fix setSelection error;
Add graph information and variable names to auto-complete list;

Admin Portal:

WCAG changes
Support Privilege based management
Improve unauthorized access warning popup message
Display secrets table for each graph

Fixed

Database Server:

Core: Kafka loader should exit gracefully
Core: GPE crash if the request specifies an invalid replica
Core: Health check for 1 mins in RESTPP startup
Core: Fixed file loading failed due to OOM
Core: Fixed no error message when edge does not exist
Core: Fixed issue with deleted_vertex_check API after dropping vertex type;
GSQL: LDAP user privilege parsing missed authorization checks
GSQL: Fixed rhs check issue for direct interpret query;
GSQL: Fixed print Vset issue with vertex accum declaration order;
GSQL: Added semantic checker for rhs with the same name;
GSQL: Export fails due to mismatching token of an unexpected graph
GSQL: Fixed wrong name when looking up variable from global
GSQL: Fix datetime_format function not working for v2 syntax
GSQL: The result of printing string differs in interpret mode and installed mode
GSQL: Fixed issue with Order by for interpret query
GSQL: Fix to handle abort while adding queries if a concurrent delete fails
Platform: Service status for KAFKA is down when one zookeeper server offline
Platform: Fix for Admin log rotation time issue

GraphStudio:

Addressed Schema change logic for reversed edge
Fix for privilege based access control issue
Fix for loading job information migration failure
Remove loading job log on export;
Remove graphName from loading job information interface;
Use authorization token in header instead of logging in;
Send heartbeat to keep client connection alive

TigerGraph 3.1.6

Release Date: 2021-08-09

Fixed

Application

Configuration for light or dark mode in GraphStudio/Admin Portal
Multiple maps from a single file to an edge are indistinguishable
GraphStudio: Implement responsive design for all sizes of screens
GraphStudio: Rearrange elements to avoid overlay in small screen
GraphStudio: Support toolbar button announcement for screen readers
GraphStudio: Support keyboard shortcut for focusing elements within working panels

TigerGraph 3.1.5

Release Date: 2021-07-23

Fixed

Database Server

Core: GPE on DR cluster stuck in warm up state after failover due to invalid requests
GSQL: Prevent QueryReader role to run any graph updates query
GSQL: Validation script to check schema consistency issue
Platform: Increase in proxy request buffer size for NGINX
Platform: Change in GRPC maximum message size for GBAR backup of catalog data

Application

GraphStudio: Reuse controller connections to avoid running out of used ports
GraphStudio: Remove "change layout" button in toolbar in Visual Editor

TigerGraph 3.1.4

Release Date: 2021-07-01

Enhancements

GSQL: \requesttoken API can be used to create authorization tokens using User name/password in addition to secret.
GSQL: Secrets created without alias will be assigned a system-generated alias so that they can be dropped
Platform: Nginx upgrade from 1.18.0 to 1.21.0
Platform: Backup/Restore configuration improvements to allow use of slower HDD media for storage
GraphStudio: UI enhancements to support WCAG compliance

Fixed

Database Server

Core: GPE need to verify catalog updates after new schema changes are applied
Core: Running Louvain algorithm as a distributed query crashed GPE due to unnecessary vertex activation
Core: Backup failed with WaitForDeltaToBeProcessed timeout
Core: Updated log messages to reference /deleted_vertex_check endpoint in RESTPP correctly
GSQL: Fix schema consistency issues due to duplicate Vertex/Edge type names
GSQL: Fix for schema consistency issue due to GPE referencing a dropped Vertex
GSQL: Additional semantic check for local schema change job to prevent schema inconsistency
GSQL: Error when making schema changes using UI/ Install all queries fails
GSQL: Inconsistency between GSQL and GPE catalog data after ‘Drop graph’ fails
GSQL: ‘From’ clause missing from delete loading jobs when Export Graph command is run
GSQL: Query installation will fail due to wrong order of arguments in PRINT statement
GSQL: “Incompatible argument types for function/tuple evaluate" error when using evaluate without second argument on v2 syntax
GSQL: Designer Role unable to run a query in Interpret Mode
Platform: Updates to Nginx templates for security updates
Platform: Change in default value for UI request timeout to 3600

Application

GraphStudio: Vertex and Edge statistics generation optimization to avoid Cluster CPU usage spike
GraphStudio: Unexpected error when dropping edge with reversed edge
GraphStudio: Fix for failure to migrate loading job info from 3.0.x to 3.1.2+

TigerGraph 3.1.3

Release Date: 2021-06-05

Enhancements

GraphStudio

Theme color adjustment to meet Web Content Accessibility Guidelines(WCAG).
Support responsive page layout for "Home" page, "Load Data" page and "Write Queries" page.
Add information transcripts for visualization areas in each page.
Add keyboard navigation in graph charts.
Improve tabbing capability and tabbing order.
Improve element status announcement.
Add headings for the entire application.
Add aria-labels for the entire application to meet WCAG compliance.
Add captions for all table elements.

AdminPortal

Theme color adjustment to meet WCAG compliance.

TigerGraph 3.1.2

Release Date: 2021-05-20

Features

SQL to GSQL translation for Enterprise BI tools like Tableau and Power BI
- This enriches data visualization tools with graph-enabled dashboards

Enhancements

Core: Increase the maximum allowed size of Vertex/Edge delta files to allow larger number of updates for write-heavy applications.
GSQL: Support for more than 10K elements in a Set<> of a query parameter
GSQL: Support VertexAccessControl Tags in DBImportExport

Fixed

Database Server

Core: Pick the latest version of GPE data for backup
GSQL: datetime attribute type in a schema-level user-defined tuple translated as int32_t
GSQL: NullPointerException when handle VSet variable in nested if statement
GSQL: NullPointerException when using multiple POST-ACCUM clauses
GSQL: INSERT statement with non-existent edge does not report error in V1 syntax
GSQL: GSQL does not produce type error when inserting non-existent edge with vertices from query parameters
GSQL: NoSuchElementException when using a non-existent edge on INSERT statement
GSQL: Lexical error when a newline is followed by an exclamation mark (!) in a string
GSQL: Printing string with newline fails compilation
GSQL: Refresh RESTPP Token: output and default lifetime is not correct
GSQL: Multiplicity propagation ACCUM clauses should reset only if the block is within a loop
GSQL: Create user don't allow an empty password
GSQL: Pattern match - propagation accumulator values not cleared
GSQL: Push-down error reported for non-alias expressions
GSQL: Support TAGS in DBImportExport
GSQL: Fix TokenBank compilation slowdown
Platform: Graceful handling of port used by Executor component
Platform: Got failed to authenticate with GSQL server error when login with SSO on tg3.1.1
Platform: Remove gsql password printing

GraphStudio

The loading data status is incorrect while import a solution
Imported solution with no modification, should not ask user to publish Data mapping.
Failed to overwrite datafile in Map Data to Graph

AdminPortal

Display of secrets on AdminPortal - User management should be paginated.

TigerGraph 3.1.1

Release Date: 2021-04-02

Changes:

Change BY(OR|OVERWRITE) syntax to BY OR|OVERWRITE for explicit tag creation
Changed name of 'dbsanitycheck' endpoint to 'deleted_vertex_check'

Enhancements

Database Server

Core: Improved throttling mechanism for Updates when memory usage has hit critical threshold
Core: Improved reliability of transferring in-memory data to on-disk within GSE
Core: Logging improvements to support both time-based and size-based configuration for all the component logs
Fixes/Enhancements for Vertex Level Access Control feature
- GSQL: Performance improvement for tag creation only operations
- GSQL: Make tag description optional
- GSQL: Block altering taggable value of global vertex if being used in tag based graph
- GSQL: Show tag expression of tag graphs in base graph “ls” command
- GSQL: Allow vertex taggable property to be updated even if it is currently being used in a tag-based graph
GSQL: Support for accumulators in table-style SELECT clause expression lists
GSQL Query syntax extensions for table support
GADMIN: Allow script to be used to configure LDAP TrustStore Path
Platform: Security enhancement to allow HTTPS traffic only access securely through dedicated interfaces when SSLis enabled.
Platform: Upgrade grpc to 1.33.0

GraphStudio

Add a * in the label of a data source if the loading job is changed
Return detailed error messages when install queries failed
Enable only one column header to be editable at the same time
Enable closing popup with Escape
Add a max validator for timeout field for configuration
Query name conflict check uses all available type names from GSQL

Fixed

Database Server

Core: Retry logic for adding data to GSE in the DR cluster
Core: Fix for GPE crash due to potential race condition between queries and updates.
Core: Partial result output in extreme cases before a running query has finished
Core: restpp crashed when missing parameter name
Core: Fixed file loading job failures due to OOM
GSQL: Fix for catalog access issue due to concurrent schema change requests
GSQL: GPE crash due to incorrect catalog update issued by GSQL
GSQL: LDAP password visible in GSQL logs
GSQL: Exit code from GSQL CLI needs to return non-zero code if there is an error
GSQL: Unable to run global schema change on global vertex if local vertex with same name exists
GSQL: Query created through GSQL shell, but returns error through GraphStudio
GSQL: Add check for GPE readiness for create/drop vertex/edge operations for global schema changes
GSQL: GSQL v2 syntax - vertex-attached containers cannot be read in WHERE/ACCUM clauses
GSQL: Enhance Export/Import by pre-creating necessary directories
GSQL: Fix calling subquery without RETURNS clause
GSQL: Code generation error for multiple dynamic expressions with the same parameter
GSQL: Wrong result for the output of datetime_format function
GSQL: SET<VERTEX> Not Working in Query Parameter
GSQL: GLE error message uses incorrect terminology: 'batch mode' should say 'distributed query mode'
GSQL: Printing vertex set variable with parentheses causes wrong printing for attributes
GSQL: GSQL pattern match - incorrect WHERE condition parsing
GSQL: GSQL query doesn't work on HA cluster when RESTPP#1 is down
GSQL: Fix for Catalog backup file cleaning failure
GSQL: Empty gsql password should not be allowed.
GSQL: NullPointerException on creating a query with a body-level DML delete statement
GSQL: Query cannot be dropped after its caller queries have been dropped
Platform: Remove user authentication information after installation
Platform: GSQL user defined functions are not backed up
Platform: Residual GPE/GSE processes are not terminated before restore
Platform: GBAR gracefully exit after ctrl-c
Platform: guninstall does not take into account the password login
Platform: gbar restore failed with message: Failed to import key-value store
Platform: Single node 3.1 installation in in VMware private cloud environment in VMWare Private Cloud Environment
Platform: Restore failure from S3 didn’t update the replicas correctly
Platform: Check to prevent migration tool running twice
Platform: GBAR restore fails with invalid checksums
Platform: User didn’t receive correct feedback when incorrect password entered during 3.1 upgrade

GraphStudio

Query goes back to a previous version after schema change in query editor
Remove the use of regex for GSQL CLI and rely on exit code instead
Progress bar hangs if query installation fails
datetime's default value field does not support rfc3339 nor iso8601 format
Export solution is only available for superuser
Unexpected error when changing the schema (Fix from GSQL side)
Update global schema after a local schema is dropped
Uploading progress bar hangs after choosing unsupported file type
Query editor does not display full text if line cannot break
Undo button should clear the expand list
JSON result of "write query" is not updated in error mode
Not possible to unset/cancel custom radius in Graph Exploration
Syntax highlighting is incomplete
Link to License page from GST is wrong
Long messages in Design Schema overlap vertex properties editor's ✓ button
The loading progress bar is stuck if import fails
The data mapping will disappear after change the global vertex's attribute
Address Export/Import solution migration issues

Admin Portal

Validate input on config management
Ignore blank spaces in log search

TigerGraph 3.1.0

Release Date: 2020-12-02

Features

New features are described in 3.1.0 Release notes.

Changes:

GSQL: STRING COMPRESS data type will no longer be allowed for new data objects. However, existing objects with STRING COMPRESS data type will continue to work.
GSQL: Changes to ADD/DROP Edge Pair commands
- ADD edge pair in schema change will not be allowed
- Drop vertex will be disallowed if it is currently being used in edge pair.
Platform: tigergraph user id included with default installation will be allowed to be dropped
Platform: Root user will now be disallowed to do an upgrade using installer -U option

Enhancements

Database Server

Engine: License enforcement check improvements
Engine: Restpp memory footprint reduction by recycling memory periodically
GSQL: Support JSON Payload Method for Calling GSQL Built-In Dynamic Endpoints
GSQL: Support Async query execution with query status/result functionality
GSQL: Enhanced Interpreted Query support:
- Support graph update for interpreted query
- Support Where filter in PRINT statement for interpreted query
GSQL: Logging for /requesttoken API endpoint
GSQL: Reset function for vertex attached accumulators
GSQL: Make token expiration maximum limit configurable
Platform: Enterprise Free Package improvement to make pre-installed license work in both interactive and non-interactive modes
Platform: Allow users to set hard coded timeout for Backup jobs
Platform: Allow configurable minimum and maximum memory limits for Kafka, Kafka Connect and Kafka Stream
Platform: Software upgrades for the following packages:
- etcd, Kafka plugins, Jsoncpp library

GraphStudio

Add new application server framework to offer continuous availability in GraphStudio and Admin Portal
Update APIs for the new application server
Support solution export/import without graph metadata
Integrate GraphStudio with the new application server
Increase unit test timeout

Admin Portal

Add log management for viewing, searching and downloading
Add configuration management settings
Add Restpp setting: Default query timeout
Add Nginx setting: SSL setting and whitelist IP setting
Add application server setting: Query return size
Add security management settings: LDAP, SSO
Integrate Admin Portal with the new application server
Change SSO authorization request URL
Handle SAML ACS for SSO
Disable authorization check for SSO metadata

Fixed

Database Server:

Engine: Correct HTTP response code will be returned when query times out
Engine: GPE status reporting is delayed due to backlog of large number of Kafka messages in the queue.
Engine: GPE crash in Sub-query print statement
Engine: Infinite loop in refresh index when some attributes are disabled
Engine: RESTPP memory consumption increase caused by timed out queries
Engine: Query using index will not fully utilize compute resources.
Engine: When query times out, JSON may not be well formed
Engine: Failed to post data when id is int and primary_id_as_attribute is true
Engine: Avoid converting string compress index hint in remote topology edge action
Engine: GPE not responding to SIGTERM
GSQL: Refactor memory usage in query installation to reduce the memory footprint when there is a large number of queries
GSQL: When creating the edge pairs, allow use of new vertex types that will be added from the current schema change job
Platform: Backup/Restore fails to backup GUI related data
Platform: Installer will print progress message during package install to avoid ssh timeout

TigerGraph 3.0.6

Release Date: 2020-11-11

Enhancements

Database Server

Audit Logging Enhancements
- User information for all requests.
- Request Status (request succeeded or failed) for all requests irrespective of access mode
Remove Hard timeout limit for Backup/Restore operations

Fixed

Database Server

Platform: Resolve the issues where Kafka start-up will hang in certain OS and shell environment.
Platform: Backup/Restore hangs if there are too many files
Platform: Backup/Restore list error when backup files on S3 are corrupted
Engine: Builtin query running background blocks schema change
GSQL: Fix for SSL certificate exception

TigerGraph 3.0.5

Release Date: 2020-09-05

Features

New features and described in 3.0.5 Release notes.

Enhancements

Database Server

Longer timeout for retrieving enum maps when using STRING COMPRESS
Socket timeout adjustment to improve RESTPP stability
Implement SetAccum<vertex> as bitset
Semantic check for println of File object for compiled query
Installer improvements
- Enhancement to change the user and group separately.
- Check permission of parent dir of App/Temp/Data/Log Roots
TigerGraph 2.x to 3.x Migration tool enhancements
- Support for copying UDFs and other functions during migration
Enhanced license support for Cloud deployments
Enhanced upgrade version checking
Zookeeper client connection retry mechanism to avoid Zookeeper operation failures

Changes

Installer Configuration JSON format

Install Configuration is separated into basic configuration and advanced configuration sections
Support for allowing replication factor to be set during installation as opposed to limited HA on/off setting previously

Fixed

Database Server

Core: GPE down during Backup for large number of files
Core: GPE will crash if the data comes from a machine without relevant metadata.
Core: Query failure due to string overflow
Core: Query with large UDF job didn't stop for configured time out setting
Platform: Kafka loading bug when number of loaders exceeds 10
Platform: Backup hangs when there are very large number of files in Graph Store
Platform: Backup reports successful operation even if it's actually incomplete
Platform: gadmin reset does not reset all files
GSQL: V2 syntax removes edge type that is excluded by Accum clause.
GSQL: Force query install should regenerate the endpoints
GSQL: Loading Job failed with SSL enabled
GSQL: Query installation performance issue for V2 syntax
GSQL: ArrayAccum value is not accessible in the ACCUM block when query is installed in distributed mode.
GSQL: Dictionary Fails when Tokens are too many
GSQL: Query installation fails due to schema change
GSQL: gsql_client strips out newlines when writing gsql queries by pasting into gsql shell

GraphStudio

Apply previous visualization result should handle empty saved schema
Displaying attribute for raw type in visualization should not use JSON stringify
Remove clear text user password in error log for migration from RDBMS to Graph

TigerGraph 3.0

Release Date: 2020-06-30

Features

New and modified features and described in the TigerGraph 3.0 Release Notes.

Enhancements

Database Server

Support for reload libudf command
Schema validation before apply settings
Relax Developer Edition restrictions
YAML parsing support for edge pairs
Support SPLIT for UDT loading, Load From/To Type from File
Data generator 2.0
Change log level by SIGUSER1, avoid unnecessary error log
Restpp self-report status
Allow users to remove data for reinstallation
Upgrade kafka to 2.3.0
Path pattern optimization with pattern flipping and PER clause
Combine service status and processState into one log event
Support validation of entry value during gadmin config set command
Add strong check for symlinks
Support to_datetime builtin function in expressions
Support string set filter for edge and target vertex
Support local vertex and edge with same name in multiple graphs
Index hint for interpret mode
Support string compress attributes in built-in Query filters
Enable jemalloc profiling
Utility function to get disk free percentage
Allow concurrent user query access during Query Installation

GraphStudio:

Support multiple-pair edge type
Schema change job for add/drop attribute index
Improved clear graph warning
New layout for logo and multiple graphs
Allow user edit header for sample data
Support multiple files upload
Cancel autofit for adding vertex and double click actions
Cancel auto login if user has logged out
Save JSON format of query result to local storage
Create Edge Type from Multiple Vertex Types to Multiple Vertex Types

Fixed

Database Server

Add on-demand heap profiling for jemalloc
Delete legacy ids data
Periodically force Jemalloc release memory to OS / on demand profiling
Change debug log in convertids into verbose
Print warning but no assert in ZMQ
Wrong JSON format for tempTables
Fix wrong check for loading job completion
Allow interpret query to recognize html encoded string constant
Handle logical type in json converter
Corrected URL decode for whitespace character
Add time before delete edges command to ensure rebuild has enough time to complete
Fix remove session bug for the aborted handler after 'ctrl + c'
Synchronize concurrent install queries
Change logic to check service status for cluster mode
Support the ‘=‘ operator SumAccum;
Drop vertex/edge/graph when there are local and global vertex/edge have the same name;
Support removing a SetAccum from another SetAccum;
Remove the reversed edge too when removing an edge;
Cannot create query due to the overflow of the size of the HeapAccum;
Query referred as subquery from interpreted mode query can not be dropped;
Index out of bound when ignoring the parameter checking for interpret query
Output error message for invalid job id
Fix codegen to insert a vertex/edge without attributes
Support file regexp in checking header of filename
Support the true value of key word header and transaction in the loading data job to be case-insensitive
Dedupe proxy user's own roles from groups
Make schema change metadata modification a transaction
Fix builtin k_step expansion query bug
Check disk space before exporting each vertex/edge type
Allowed non-English string constants in interpreted queries
Edge variable prints attribute by default
Print developer information only in gadmin status
Restrict symlinks and check their existence

GraphStudio:

Fix error message for new secret creation
Refactor keywords
Do not emit explorer config if saved exploration doesn't have it
Check for Valid date time
Extend wait time for progress bar finish
Add right border for side navigation
Upgrade color-picker
Fix check accumulator format
Fix percentage of performing schema change
Run interpreted query through websocket

TigerGraph 2.6.6

Release Date: 2021-03-23

Fixed

Database Server

Core: Fix concurrent access of abort messages
Core: Fix for GPE crash due to wrong license
Core: Fixes to gcollect utility:
- Improvements to work in clustered environments
- Accidental removal of directory with old data collection run
GSQL: Fix for catalog access issue due to concurrent schema change requests
GSQL: Increase timeout for download upload catalog, make it configurable
Platform: Upgrade of gRPC version to 1.33.0
Platform: Remove user authentication information after installation

TigerGraph 2.6.5

Release Date: 2021-01-15

Enhancements:

Database Server

GSE/GPE segment consistency check utility
Integration with GSE/GPE consistency check utility with Backup/Restore

Changes

Increase in refresh timeout for RESTPP from 20 to 60 seconds;

Fixed

Database Server

GSE replica synchronization for Zookeeper errors
Explicitly check replica follower status before automatic promotion to leader is allowed
RESTPP fix - memory leaks caused by timed out queries
Backup/Restore: Ensure GPE and GSE snapshots are done in correct order

TigerGraph 2.6.4

Release Date: 2020-11-02

Enhancements

Database Server

Allow RESTPP to manage log files based on timestamp
Upgrade NGINX to 1.18 version
Correct status code to indicate GSQL operation result
Remove Hard timeout limit for Backup/Restore operations
Token Management Improvements:
- Improve GSQL stability by setting a limit on number of tokens allowed
- Logging improvement to indicate new and refreshed tokens separately

Fixed

Database Server

Core: GSE follower replicas lag leader replica on the data updates
Core: Shuffle abort causing GPE crash
Core: Handle un-released lock gracefully during json print command failure
Core: Incremental Snapshot triggers creation of all segments causing delays
Core: Kafka loading fails when number of loaders exceed 10
GSQL: Query Install fails for batch installs
Backup/Restore hangs if there are too many files

TigerGraph 2.6.3

Release Date 2020-08-21

Enhancements

Improved handling of query time outs for distributed queries.

Fixed

Longer timeout for retrieving large memory map for attributes of STRING COMPRESS data type with large number of distinct values.
Backup jobs report incorrect successful runs
Incorrect type check logic for trim function;

TigerGraph 2.6.2

Release Date 2020-08-14

Enhancements

Improvements to GSE Upsert performance
Add User Id information to RESTPP logs for all user initiated calls
Improvements to Query Installation performance time
Provide warning message when revoking a role from proxy user if needed

Fixed

Core: GPE crash on unknown vertex / segment
Core: PostWriter needs to skip vertices if the internal vertex id is invalid one.
Core: Handle exception in ResponseThread of RemoteTopology
Core: Query re-installation issue caused by non-deterministic transformation
Core: Address Data Loading speed for hub loading
Core: Inconsistent result with and without using local accumulators
Core: RestPP payload scale issue due to 3rd party FCGI library
GSQL: GSQL pattern match - translation error when vertex type is the keyword "ANY"
GSQL: Issue with reduce function with Bitwise OR operator in the LOAD functions
GSQL: gsql_client strips out newlines when writing gsql queries by pasting into gsql shell
GSQL: Secrets and token associated with a graph and not removed during graph delete
GraphStudio: Displaying attribute for raw type in visualization should not use JSON stringify method

TigerGraph 2.6.1

Release Date 2020-06-12

Enhancements

Allow concurrent user query access during Query Installation
GPE & GSE Data Sync Check Utility
Use of POST for /requesttoken API so that user password is not exposed
Write Performance improvements
Error handling and reporting improvements for Query Timeout and Failures
UX improvement for ‘Clear Graph’ command in GraphStudio

Fixed

Ensure cleanup and compaction of delta records in a large transaction even in the event of TigerGraph service restart
Performance improvement to make Graph Updates faster by parallelizing and sharing transaction
Fix for the leftover Shuffle threads after Query Abort/Timeout
Change in the error message of AbortQuery request inside the Shuffle Operator
Bug fixes for GSE compaction feature to address exporting with mixed segments of data and load data from the database in worker mode
Fix for GSE crash triggered by schema change
Enable background thread on JEMALLOC for memory cleanup even when system is idle
/showprocesslist and /abortquery APIs do not list the running queries of old worker if RESTPP is refreshed
S3 loader header check doesn't apply file filter regex
GSQL V2 syntax does not handle ACCUM operator correctly
Fix for RESTPP timeout error

TigerGraph 2.6.0

Changes

Release Date 2020-04-24

New and modified features and described in the TigerGraph 2.6 Release Notes.

Enhancements

Remove SSH connection use dependency for GSQL Install Query command
New 'force' parameter to RebuildNow so that engine to start the rebuild.

Fixed

Core: GSE crash in HA setup when CPU usage is extremely high
Core: Out Of Memory handling improvements to prevent GPE crash due bad memory allocation call
GLE: fix builtin query crash in worker due to graph id missing
Core: Skewed CPU usage for high-query throughput scenarios
Fixes in Rebuild to address broken edge count
Fix for 2.5.2 bug - Inconsistent query results when running non-distributed query on a cluster
Unable to find local vertex and edge with same name in multiple graphs
RESTPP memory leak due to yaml file
Reverse edge id is wrong when two local edges with reverse edge are created with same name

TigerGraph 2.5.4

Release Date: 2020-04-24

Enhancements

New 'force' parameter to RebuildNow so that engine to start the rebuild.
Improved version of /abortquery so that query can be aborted more quickly

Fixed

Fixes in Rebuild to address broken edge count
RESTPP memory leak due to yaml file
Builtin query crashed due to missing Graph Id
RESTPP crash for same vertex name in the global graph
Resolved the distributed query hanging issue which could block rebuild and schema change
Core: Skewed CPU usage for high-query throughput scenarios

TigerGraph 2.5.3

Release Date: 2020-02-26

Fixed

Ensure catalog data backed up before schema change
Support creation of two local edges with same name with one being a reverse edge
Support Local vertex and edge type with same name in multiple graphs in
Support for multi-lingual string constant in Interpret query mode
Upgrade to Release 2.5.2 leads to inconsistent query results
Compute resource usage spikes on particular node in cluster
GCleanUp failed to cleanup all pointers when adjusting thread

TigerGraph 2.5.2

Release Date: 2020-01-27

New Features

GPE: Increase MemoryCheck frequency based on Resource Usage
GPE: Abort Query if Memory usage crosses critical threshold
GSE: Support Log compaction as part of startup for GSE
GraphStudio: Support Multi-edge pair in design schema.
Core: Support OS RHEL 8.0 in Installer

Enhancements

REST: Increase the RESTPP reload timeout
GSQL: Change error message to specify user when default tigergraph user is dropped
GSQL: Make user tigergraph droppable
GraphStudio: Do not change layout when adding/updating/deleting vertex and edge

Fixed

Core: GPE crashed running distributed LDBC query
GST: Incorrect vertex count in TigerGraph GraphStudio
Core: Shuffle deadlock causing full system memory use
Core: Replace GASSERT with GWARN in GDataBox
Core: BATCH_SIZE of Kafka loader set from GSQL console doesn't work
GPE: Schema Change failed due to Query Install OOM
GSQL: Quote in string key is not escaped
GraphStudio: Reverse edge filter doesn't work
Core: Don't display LDAP password in IUM

TigerGraph 2.5.1

Release Date: 2019-11-25

Fixed

Core: Distributed delete affects data consistency after GPE restart
Core: Shuffle hangs when sendingQueue is full
Core: Longevity test failing due to change in memory allocator (TCMalloc)
GPE: Crash after upgrade from 2.4.1 to 2.5
GPE: Serialization error when reading from input stream
GPE: Query state can result in race condition inside ReadOneDelta;
GPE: GPE crashes when a query calls a sub-query with a write operation
GSE: Script to resolve delete inconsistency between GSE and GPE
GSE: Multiple Kafka loading jobs fail
GSQL: Built-in function names in GSQL are case sensitive
GSQL: Interpret query doesn't work when authentication is on
GSQL: Deadlock when graph store is being cleared and authentication is on
GSQL: Token authentication returning null during Global schema change
GSQL: SSO login failure due to missing org.apache.santuario:xmlsec library
GraphStudio: Vertex to edge expansion settings are not retained
GBAR Backup: Backup failure if loading jobs are in progress

TigerGraph 2.5.0

Release Date 2019-09-18

Changes

New and modified features and described in the TigerGraph 2.5 Release Notes.

Fixed

Improvements to fix possible crash, deadlock, overflow, and memory leak situations
Improve query performance stability
Fix some query string passing and parsing issues
Correct some inconsistencies between the documented specification and actual behavior
Improve robustness of Kakfa and S3 Loaders
Clean up files and graph properly after certain failed operations
Fix some installation issues

TigerGraph 2.4.1

Release Date 2019-07-23

Changes

To select pattern matching support in a query, the syntax is now CREATE QUERY ... SYNTAX v2 instead of CREATE QUERY ... SYNTAX("v2")

Fixed

GPE: Fix uint32 overflow
Loader: Allow temp_table to be used without flatten function
IDS: Disable empty UID
ZMQ: Fix crash on ill-formed message
Util: Fix Unix domain socket file not generated correctly in cron job
Util: Extend data size for GoutputStreamBuffer beyond 4GB
Connector: Fix first line is not ignored with has_header enabled
Connector: Fix failures on retrieving connector status
GSQL: Fix syntax version setting inconsistency issues
GSQL: Fix schema change with USING primary_id_as_attribute
GSQL: Fix JSON output format of requesttoken API
Admin Portal: Display correct counts of physical vertices and edges on each machine

TigerGraph 2.4.0

Release Date 2019-06-25

New Features

See Release Notes - TigerGraph 2.4

Fixed

GSQL: The built-in count() function gives the correct value in all cases.
GPE: startup hang
GSQL server start/stop command not working
LDAP config truncated by space
GSE: boolean values are not displayed correctly
Security issue CVE-2013-7459 caused by unused python crypto library
IUM status is displayed incorrectly in some cases;

TigerGraph 2.3.2

Release Date 2019-04-01

Issues

GSQL: The built-in count() function may give the incorrect value for clustered systems after some vertices have been deleted.

Fixed

GraphStudio: Send query pre-install dependency analysis result through WebSocket
GraphStudio: filter out improper attributes in when building filter expressions
GPE: fix wrong enumerator id issue
GPE: avoid using /tmp
GPE: handle exceptions for LIKE <expr>
GPE: Fix crash due to writing wrong size of STRING_LIST
GPE: Fix global schema change error which added local vertex twice
GSE (Developer Edition): Keep one copy of segment

TigerGraph 2.3.1

Release Date 2019-02-19

New Features

See Release Notes-TigerGraph 2.3

Issues

GSQL: The built-in count() function may give the incorrect value for clustered systems after some vertices have been deleted.

Fixed

Install: The IP list fetched by the installer could be incomplete.
Loading: Speed up batch-delta loading.
GraphStudio: Disable Install Query button for queryreader users.
GraphStudio: Re-initialize the database after import.
GraphStudio: Could not drop query with non-default username/password.
AdminPortal: Queries-Per-Second display didn't work if RESTPP authorization was enabled.
Schema change: Improve schema change stability by reducing schema change history and increasing gRPC max message limit.
GPE: Improve query HA stability.
GPE: Fix crash under certain conditions.
Core: Memory leak due to yamlcpp.
Core: compatibility issue between libc and ssh utility.
IUM: Fix exceptions due to legacy config entries.

TigerGraph 2.2.4

Release Date: 2018-12-13

Fixed

Distributed System: Fix possible deadlock and race conditions
GSE Storage Engine: Fix disk seek overflow
RESTPP: Optimize the memory consumption when system is idle
RESTPP: Optimize config reload time
GSQL: Fix query installation error with option -optimize
GSQL: Fix a code generation bug related to static variable
GSQL: Fix a compilation error when a statement is in nested if statement
GraphStudio: Security update for npm-run-all
GraphStudio: Change Help button to point to new docs.tigergraph.com site
Gadmin: Fix gadmin/ts3 restart and status error after changing port of TS3

TigerGraph 2.2.3

Release Date: 2018-11-30

Fixed

GraphStudio: Fix schema change bug (Note: In 2.2, GraphStudio now does not drop all data when making a schema change.)
GraphStudio: Fix display issue in Graph Explore when switch to a new graph
GraphStudio: Improve password security
GraphStudio: Modify URL to AdminPortal for better universal support
IUM: Fix kafka-loader configuration after cluster expansion
IUM: Resolve python module name conflict
IUM: Fix ssh_port is always 1 under bash interactive mode
GSE Storage Engine: Reduce memory consumption
RESTPP: Improve logging messages

TigerGraph 2.2

Release Date: 2018-11-05

New Features

See Release Notes-TigerGraph 2.2

Fixed

GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft
Admin Portal: Number of nodes in the cluster is reported as 0 when no graph yet exists

TigerGraph 2.1.8

Release Date: 2018-11-05

Issues

GBAR Backup fails if HA is enabled
GSE status shows unknown with HA enabled
TS3 fails to collect QPS when RESTPP Authentication is enabled (Admin Portal QPS monitor will be unavailable in this case).
GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft.
Admin Portal: Number of cluster nodes is reported as 0 when no graph exists.

Fixed

GSQL server error if schema is too large
In a cluster, not all servers may be aware of deleted vertices.
PAM limit set-up issue in installer
In MultiGraph, a local (FROM *, TO *) local edge has global side effects.
RESTPP's default API version is not set after installation
An engine bug which occasionally causes crash

Added

SSH port configuration in installer.
Installation script checks that the machine meets the minimum RAM (8GB) and CPU (2-core) requirements.
For Ubuntu 16.04/18.04, support logon with systemd service.

TigerGraph 2.1.7

Release Date: 2018-08-20

Issues

GBAR backup fails if HA is enabled.
TS3 fails to collect QPS when RESTPP Authentication is enabled (Admin Portal QPS monitor will be unavailable in this case).
GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft.
Admin Portal: Number of cluster nodes is reported as 0 when no graph exists.

Fixed

Cluster configuration with HA enabled is wrong if the number of nodes is odd (3, 5, 7, 9...).
GraphStudio and GSQL inconsistent checking for some keywords
GBAR backup and restore fail if special character is in tag name

TigerGraph 2.1.6

Release Date: 2018-08-15

Issues

Cluster configuration with HA enabled is wrong if the number of nodes is odd (3, 5, 7, 9...).
GraphStudio: When both a query draft and an installed query exist, Export Solution will keep the installed query code instead of the query draft.
TS3 fails to collect QPS when RESTPP Authentication is enabled (Admin Portal QPS monitor will be unavailable in this case).
Admin Portal: Number of cluster nodes is reported as 0 when no graph exists.

Fixed

GSQL null pointer exception during schema change if a directed edge is dropped but its partner reverse edge is kept.
Some complex attribute types cannot be correctly posted via /graph endpoint.
In some cases, tuple on reverse edge crashes GPE.
GraphStudio throws an authentication error if RESTPP authentication is enabled.

Added

License level control of MultiGraph functionality.

Tigergraph 2.1.5

Release Date: 2018-07-24

Known Issues

GSQL null pointer exception during schema change if a directed edge is dropped but its partner reverse edge is kept.
Some complex attribute types cannot be correctly posted via /graph endpoint.
In some cases, tuple on reverse edge crashes GPE.

Fixed

GraphStudio Export package is occasionally incomplete.
GSE status is always "not ready" if schema is too large.
Cannot modify RESTPP port configuration.
IUM error in a cluster when not running on node m1

Troubleshooting Guide

This troubleshooting guide is only up to date for v2.6 and below. Additional guidance for v3.0+ is in development.

Introduction

General

Before any deeper investigation, always run these general system checks :

$ gadmin status        (Make sure all TigerGraph services are UP.)

$ df -lh               (Make sure all servers are getting enough disk space.)

$ free -g              (Make sure all servers have enough memory.)

$ tsar                 (Make sure there is no irregular memory usage on the system.)

$ dmesg -T | tail      (Make sure there are no Out of Memory, or any other errors.)

$ grun all "date"      (Make sure the time across all nodes are synchronized
                        with time difference under 2 seconds. )

Location of Log Files

The following command reveals the location of the log files :

gadmin log

The .out file extension is for errors. The .INFO file extension is for normal behaviors.

In order to diagnose an issue for a given component, you'll want to check the .out log file extension for that component.

Other log files that are not listed by the gadmin log command are those for Zookeeper and Kafka, which can be found here:

zookeeper : ~/tigergraph/zk/zookeeper.out.*
kafka     : ~/tigergraph/kafka/kafka.out

Synchronize time across nodes in a cluster

Installation Error Debugging

Missing Dependencies

Pinpoint The Failed Step

Using the -x flag during installation will show you the detailed shell commands being run during installation. bash -x install.sh

Disk Space Errors

The /home directory requires at least 200MB of space, or the installation will fail with an out of disk message. This is temporary only during installation and will be moved to the root directory once installation is complete.
The /tmp directory requires at least 1GB of space, or the installation will fail with an out of disk message
The directory in which you choose to install TigerGraph requires at least 20GB of space, or the installation will report the error and exit.

Firewall Errors

If your firewall blocks all ports not defined for use, we recommend opening up internal ports 1000-50000.

# iptables help page
$ sudo iptables -h

# This will list your firewall rules
$ sudo iptables -L  

# Allow incoming SSH connections to port 22 from the 192.168.0.0 subnet
$ sudo iptables -A INPUT -p tcp --dport 22 -s 192.168.0.0/24 -j ACCEPT
$ sudo iptables -A INPUT -p tcp --dport 22 -s 127.0.0.0/8 -j ACCEPT
$ sudo iptables -A INPUT -p tcp --dport 22 -j DROP

Query Debugging

Checking the Logs - Flow of a query in the system

From calling a query to returning the result, here is how the information flows: 1. Nginx receives the request.

grep <QUERY_NAME> /home/tigergraph/tigergraph/logs/nginx/ngingx_1.access.log

You can click on the image below to expand it.

2. Nginx sends the request to Restpp.

grep <QUERY_NAME> /home/tigergraph/tigergraph/logs/RESTPP_1_1/log.INFO

grep <REQUEST_ID> /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO

grep <REQUEST_ID> /home/tigergraph/tigergraph/logs/GSE_1_1/log.INFO

6. Restpp sends the result back to Nginx.

grep <REQUEST_ID> /home/tigergraph/tigergraph/logs/RESTPP_1_1/log.INFO

7. Nginx sends the response.

grep <QUERY_NAME> /home/tigergraph/tigergraph/logs/nginx/nginx_1.access.log

Other Useful Commands for Query Debugging

Check recently executed query:
$ grep UDF:: /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO | tail -n 50

Get the number of queries executed recently:
$ grep UDF::End /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO | wc -l

Grep distributed query log:
$ grep “Action done” /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO | tail -n 50


Grep logs from all servers:
$ grun all “grep UDF:: /home/tigergraph/tigergraph/logs/GPE_*/log.INFO | tail -n 50”

Slow Query Performance

Multiple situations can lead to slower than expected query performance:

Insufficient Memory When a query begins to use too much memory, the engine will start to put data onto the disk, and memory swapping will also kick in. Use the Linux command: free -g to check available memory and swap status, or you can also monitor the memory usage of specific queries through GPE logs. To avoid running into insufficient memory problems, optimize the data structure used within the query or increase the physical memory size on the machine.
GSQL Logic Usually, a single server machine can process up to 20 million edges per second. If the actual number of vertices or edges is much much lower, most of the time it can be due to inefficient query logic. That is, the query logic is now following the natural execution of GSQL. You will need to optimize your query to tune the performance.
Disk IO When the query writes the result to the local disk, the disk IO may be the bottleneck for the query's performance. Disk performance can be checked with this Linux command : sar 1 10. If you are writing (PRINT) one line at a time and there are many lines, storing the data in one data structure before printing may improve the query performance.
Huge JSON Response If the JSON response size of a query is too massive, it may take longer to compose and transfer the JSON result than to actually traverse the graph. To see if this is the cause, check the GPE log.INFO file. If the query execution is already completed in GPE but has not been returned, and CPU usage is at about 200%, this is the most probable cause. If possible, please reduce the size of the JSON being printed.
Memory Leak This is a very rare issue. The query will progressively become slower and slower, while GPE's memory usage increases over time. If you experience these symptoms on your system, please report this to the TigerGraph team.
Network Issues When there are network issues during communication between servers, the query can be slowed down drastically. To identify that this is the issue, you can check the CPU usage of your system along with the GPE log.INFO file. If the CPU usage stays at a very low level and GPE keeps printing ??? , it means network IO is very high.
Frequent Data Ingestion in Small Batches Small batches of data can increase the data loading overhead and query processing workload. Please increase the batch size to prevent this issue.

Query Hangs

When a query hangs or seems to run forever, it can be attributed to these possibilities :

Services are down Please check that TigerGraph services are online and running. Run gadmin status and possibly check the logs for any issues that you find from the status check.
Query is in an infinite loop To verify this is the issue, check the GPE log.INFO file to see if graph iteration log lines are continuing to be produced. If they are, and the edgeMaps log the same number of edges every few iterations, you have an infinite loop in your query. If this is the case, please restart GPE to stop the query : gadmin restart gpe -y. Proceed to refine your query and make sure your loops within the query are able to break out of the loop.
Query is simply slow If you have a very large graph, please be patient. Ensure that there is no infinite loop in your query, and refer to the slow query performance section for possible causes.
GraphStudio Error If you are running the query from GraphStudio, the loading bar may continue spinning as if the query has not finished running. You can right-click the page and select inspect->console (in the Google Chrome browser) and try to find any suspicious errors there.

Query Returns No Result

To confirm the reasoning within the log file, use GraphStudio to pick a few vertices or edges that should have satisfied the conditions and check their attributes for any unexpected errors.

Query Installation Failed

Query Installation may fail for a handful of reasons. If a query fails to install, please check the GSQL log file. The default location for the GSQL log is here :

/home/tigergraph/tigergraph/logs/gsql_server_log/GSQL_LOG

If you have a c++ user-defined function error, your query will fail to install, even if it does not utilize the UDF.

How to monitor memory usage by query

grep -i <request_id> $(gadmin config get System.LogRoot)/gpe/log.INFO | 
    grep -i "querymem"

You can also run a query first, and then run the following command immediately after to retrieve the most recent query logs and filter for "QueryMem":

tail -n 50 $(gadmin config get System.LogRoot)/gpe/log.INFO |
    grep -i "querymem"

0415 01:33:40.885433  6553 gpr.cpp:195] Engine_MemoryStats|     \ 
ldbc_snb::,196612.RESTPP_1_1.1618450420870.N,NNN,15,0,0|        \
MONITORING Step(1) BeforeRun[GPR][QueryMem]: 116656             

I0415 01:33:42.716199  6553 gpr.cpp:241] Engine_MemoryStats|    \
ldbc_snb::,196612.RESTPP_1_1.1618450420870.N,NNN,15,0,0|        \
MONITORING Step(1) AfterRun[GPR][QueryMem]: 117000

How to check system free memory percentage

You can check how much free memory your system has as a percentage of its total memory by running the following command:

tail -n 50 $(gadmin config get System.LogRoot)/gpe/log.INFO | grep -i 'freepct'

The number following "FreePct" indicates the percentage of the system free memory. The following example shows the system free memory is 69%:

I0520 23:40:09.845811  7828 gsystem.cpp:622] 
System_GSystem|GSystemWatcher|Health|ProcMaxGB|0|ProcAlertGB|0|
CurrentGB|1|SysMinFreePct|10|SysAlertFreePct|30|FreePct|69

When free memory drops below 10 percent (SysMinFreePct), all queries are aborted. This threshold is adjustable through gadmin config.

How to retrieve information on queries aborted due to memory usage

 log:W0312 02:10:57.839139 15171 scheduler.cpp:116] System Memory in Critical state. Aborted.. Aborting.

Data Loading Debugging

Checking the Logs

GraphStudio

Command Line

If you see there are a number of issues from the GraphStudio Load Data page, you can dive deeper to find the cause of the issue by examining the log files. Check the loading log located here:

/home/tigergraph/tigergraph/logs/restpp/restpp_loader_logs/<GRAPH_NAME>/

Open up the latest .log file and you will be able to see details about each data source. The picture below is an example of a correctly loaded data file.

Here is an example of a loading job with errors :

Slow Loading

Normally, a single server running TigerGraph will be able to load from 100k to 1000k lines per second, or 100GB to 200GB of data per hour. This can be impacted by any of the following factors:

Loading Logic How many vertices/edges are generated from each line loaded?
Data Format Is the data formatted as JSON or CSV? Are multi-level delimiters in use? Does the loading job intensively use temp_tables?
Hardware Configuration Is the machine set up with HDD or SSD? How many CPU cores are available on this machine?
Network Issue Is this machine doing local loading or remote POST loading? Any network connectivity issues?
Size of Files How large are the files being loaded? Many small files may decrease the performance of the loading job.
High Cardinality Values Being Loaded to String Compress Attribute Type How diverse is the set of data being loaded to the String Compress attribute?

To combat the issue of slow loading, there are also multiple methods:

If the computer has many cores, consider increasing the number of Restpp load handlers.

$ gadmin --config handler
increase the number of handlers
save
$ gadmin --config apply

Separate ~/tigergraph/kafka from ~/tigergraph/gstore and store them on separate disks.
Do distributed loading.
Do offline batch loading.
Combine many small files into one larger file.

Loading Hangs

When a loading job seems to be stuck, here are things to check for :

GPE is DOWN You can check the status of GPE with this command : gadmin status gpe If GPE is down, you can find the logs necessary with this command : gadmin log -v gpe
Memory is full Run this command to check memory usage on the system : free -g
Disk is full Check disk usage on the system : df -lh
Kafka is DOWN You can check the status of Kafka with this command : gadmin status kafka If it is down, take a look at the log with this command : vim ~/tigergraph/kafka/kafka.out
Multiple Loading Jobs By default, the Kafka loader is configured to allow a single loading job. If you execute multiple loading jobs at once, they will run sequentially.

Data Not Loaded

If the loading job completes, but data is not loaded, there may be issues with the data source or your loading job. Here are things to check for:

Any invalid lines in the data source file. Check the log file for any errors. If an input value does not match the vertex or edge type, the corresponding vertex or edge will not be created.
Using quotes in the data file may cause interference with the tokenization of elements in the data file. Please check the GSQL Language Reference section under Other Optional LOAD Clauses. Look for the QUOTE parameter to see how you should set up your loading job.
Your loading job loads edges in the incorrect order. When you defined the graph schema, the from and to vertex order will affect the way you write the loading job. If you wrote the loading job in reversed order, the edges will not be created, possibly also affecting the population of vertices.

Loading is Incorrect

First, check the logs for important clues.
Are you reaching and reading all the data sources (paths and permissions)?
Is the data mapping correct?
Are your data fields correct? In particular, check data types. For strings, check for unwanted extra strings. Leading spaces are not removed unless you apply an optional token function to trim the extra spaces.
Do you have duplicate ids, resulting in the same vertex or edge being loading more than once. Is this intended or unintended? TigerGraph's default loading semantics is UPSERT. Check the loading documentation to maker sure you understand the semantics in detail:
https://docs.tigergraph.com/dev/gsql-ref/ddl-and-loading/creating-a-loading-job#cumulative-loading

Loading Failure

Possible causes of a loading job failure are:

Loading job timed out If a loading job hangs for 600 seconds, it will automatically time out.
Port Occupied Loading jobs require port 8500. Please ensure that this port is open.

Schema Change Debugging

This section will only cover the debugging schema change jobs, for more information about schema changes, please read the Modifying a Graph Schema page.

Understanding what happens behind the scenes during a schema change.

DSC (Dynamic Schema Change) Drain - Stops the flow of traffic to RESTPP and GPE If GPE receives a DRAIN command, it will wait 1 minute for existing running queries to finish up. If the queries do not finish within this time, the DRAIN step will fail, causing the schema change to fail.
DSC Validation - Verification that no queries are still running.
DSC Apply - Actual step where the schema is being changed.
DSC Resume - Traffic resumes after schema change is completed. Resume will automatically happen if a schema change fails. RESTPP comes back online. All buffered query requests will go through after RESTPP resumes, and will use the new updated schema.

Schema changes are all or nothing. If a schema change fails in the middle, changes will not be made to the schema.

Signs of Schema Change Failure

Failure when creating a graph
Global Schema Change Failure
Local Schema Change Failure
Dropping a graph fails
If GPE or RESTPP fail to start due to YAML error, please report this to TigerGraph.

If you encounter a failure, please take a look at the GSQL log file : gadmin log gsql. Please look for these error codes:

Error code 8 - The engine is not ready for the snapshot. Either the pre-check failed or snapshot was stopped. The system is in critical non-auto recoverable error state. Manual resolution is required. Please contact TigerGraph support.
Error code 310 - Schema change job failed and the proposed change has not taken effect. This is the normal failure error code. Please see next section for failure reasons.

Reasons For Dynamic Schema Change Failure

Another schema change or a loading job is running. This will cause the schema change to fail right away.
GPE is busy. Potential reasons include :
- Long running query.
- Loading job is running.
- Rebuild process is taking a long time.
Service is down. (RESTPP/GPE/GSE)
Cluster system clocks are not in sync. Schema change job will think the request is stale, causing this partition's schema change to fail.
Config Error. If the system is shrunk manually, schema change will fail.

Log Files

Example of a successful schema change job. (admin_server log)

$ grep DSC ~/tigergraph/logs/admin_server/INFO.20181011-101419.98774 

I1015 12:04:14.707512 116664 gsql_service.cpp:534] Notify RESTPP DSCDrain successfully.
I1015 12:04:15.765108 116664 gsql_service.cpp:534] Notify GPE DSCDrain successfully.
I1015 12:04:16.788666 116664 gsql_service.cpp:534] Notify GPE DSCValidation successfully.
I1015 12:04:17.805620 116664 gsql_service.cpp:534] Notify GSE DSCValidation successfully.
I1015 12:04:18.832386 116664 gsql_service.cpp:534] Notify GPE DSCApply successfully.
I1015 12:04:21.270011 116664 gsql_service.cpp:534] Notify RESTPP DSCApply successfully.
I1015 12:04:21.692147 116664 gsql_service.cpp:534] Notify GSE DSCApply successfully.

Example of DSC fail

E1107 14:13:03.625350 98794 gsql_service.cpp:529] Failed to notify RESTPP with command: DSCDrain. rc: kTimeout. Now trying to send Resume command to recover.
E1107 14:13:03.625562 98794 gsql_service.cpp:344] DSC failed at Drain stage, rc: kTimeout
E1107 14:14:03.814132 98794 gsql_service.cpp:513] Failed to notify RESTPP with command: DSCResume. rc: kTimeout

GSE Error Debugging

$ gadmin log gse
[Warning] License will expire in 5 days
GSE : /home/tigergraph/tigergraph/logs/gse/gse_1_1.out
GSE : /home/tigergraph/tigergraph/logs/GSE_1_1/log.INFO

GSE Process Fails To Start

If the GSE process fails to start, it is usually attributed to a license issue, please check these factors :

License Expiration gadmin status license This command will show you the expiration date of your license.
Single Node License on a Cluster If you are on a TigerGraph cluster, but using a license key intended for a single machine, this will cause issues. Please check with your point of contact to see which license type you have.
Graph Size Exceeds License Limit Two cases may apply for this reason. The first reason is you have multiple graphs but your license only allows for a single graph. The second reason is that your graph size exceeds the memory size that was agreed upon for the license. Please check with your point of contact to verify this information.

GSE status is "not_ready"

Usually in this state, GSE is warming up. This process can take quite some time depending on the size of your graph.

Very rarely, this will be a ZEROMQ issue. Restarting TigerGraph should resolve this issue

gadmin restart -y

GSE crash

GSE crashes are likely due to and Out Of Memory issue. Use the dmesg -T command to check any errors.

If GSE crashes, and there are no reports of OOM, please reach out to TigerGraph support.

GSE High Memory Consumption

If your system has unexpectedly high memory usage, here are possible causes :

Length of ID strings is too long GSE will automatically deny IDs with a length longer than 16k. Memory issues could also arise if an ID string is too long ( > 500). One proposed solution to this is to hash the string.
Too Many Vertex Types Check the number of unique vertex types in your graph schema. If your graph schema requires more than 200 unique vertex types, please contact TigerGraph support.

GraphStudio Debugging

Browser Crash / Freeze

If your browser crashes or freezes (shown below), please refresh your browser.

GraphStudio Crash

If you suspect GraphStudio has crashed, first run gadmin status to verify all the components are in good shape. Two known causes of GraphStudio crashes are :

Huge JSON response User-written queries can return very large JSON responses. If GraphStudio often crashes on large query responses, you can try reducing the size limit for JSON responses by changing the GUI.RESTPPResponseMaxSizeBytes configuration using gadmin config. The default limit is 33554432 bytes.

$ gadmin config entry GUI.RESTPPResponseMaxSizeBytes
New: 33554431
[   Info] Configuration has been changed. Please use 'gadmin config apply' to persist the changes.
$ gadmin config apply

Very Dense Graph Visualization On the Explore Graph page, the "Show All Paths" query on a very dense graph is known to cause a crash.

DEBUG mode

To find the location of GraphStudio log files, use this command : gadmin log gui

$ gadmin log vis
[Warning] License will expire in 5 days
VIS : /home/tigergraph/tigergraph/logs/gui/gui_ADMIN.log
VIS : /home/tigergraph/tigergraph/logs/gui/gui_INFO.log

Repeat the error-inducing operations in GraphStudio and view the logs.

Known Issues

There is a list of known GraphStudio issues here.

Further Debugging

If after taking these actions you cannot solve the issue, please reach out to support@tigergraph.com to request assistance.

Knowledge Base and FAQs

If you have a problem with the procedure described in the TigerGraph Platform Installation Guide, please contact support@tigergraph.com and summarize your issue in the email subject.

Getting Started and Basics

I need help installing the system.

What version of the TigerGraph platform am I running?

Use the following command:

$ gsql --version

To see the version numbers of individual components of the platform:

$ gadmin version

How do I upgrade from an earlier version?

I'm not sure how to run the TigerGraph system.

The system does not seem to be running correctly.

Do I need to start the TigerGraph servers (e.g., GPE, GSE) to run the system?

To check the status of servers:

$ gadmin status

Does the TigerGraph system have in-tool help?

Yes. For the GSQL shell and language, first enter the shell (type gsql from an operating system prompt). Then type the help command, e.g.,

HELP

This gives you a short list of commands. Note that "help" itself is one of the listed commands; there are help options to get more details about BASIC , QUERY commands. For example,

HELP QUERY

$ gadmin help

Is the GSQL language case sensitive?

What are the rules for naming identifiers?

When are quotation marks required? Single or double quotes?

Can I run GSQL Shell commands in batch command?

Yes. You can create a text file containing a sequence of GSQL commands and then execute that file. To execute from outside the shell:

$ gsql filename

To execute the command file from within the shell:

@filename

See also the "Language Basics" and "System Basics" sections of the GSQL Language Reference, Part 1: Defining Graphs and Loading Data document.

I have a long command line. Can I split it into multiple lines?

This is an example of a loading statement split into multiple lines using BEGIN and END:

BEGIN
CREATE ONLINE_POST JOB load1 FOR GRAPH LaborForce {
  LOAD 
    TO VERTEX user VALUES ($0, _, _, _),
    TO VERTEX occupation VALUES ($0, _),
    TO EDGE user_occupation VALUES ($0, $1);
}
END

What is Limited Capacity Mode?

When a license limit has been reached, your system will be placed in a read-only mode - incapable of loading anymore data. You will still be able to delete data and view the graph.

Defining a Graph Schema

What are the components of a graph schema?

CREATE VERTEX user (PRIMARY_ID user_id UINT, age UINT, gender STRING, postalCode STRING)
CREATE VERTEX occupation (PRIMARY_ID occ_id STRING, occ_name STRING)
CREATE UNDIRECTED EDGE user_occupation (FROM user, TO occupation)
CREATE GRAPH LaborForce (user, occupation, user_occupation)

Alternately, a generic CREATE GRAPH statement can be used:

CREATE GRAPH LaborForce (*)

Should I model this data field as an attribute or as a vertex type?

What data types do you support for vertex and edge attributes?

Discontinued Feature

The UINT_SET and STRING_SET COMPRESS types have been discontinued since there is now equivalent functionality from the more general SET and SET types.

Can I define and load multiple graph schemas?

How many vertex and edge types can I include in a graph?

How do I check the definition of the current schema?

How do I modify my graph schema?

How do I delete my entire graph schema?

To delete your entire catalog, containing not just your vertex, edge, and graph type definitions, but also your loading job and query definitions, use the following command: GSQL> DROP ALL

To delete just your graph schema, use the DROP GRAPH command: GSQL> DROP GRAPH g1

UPDATE Deleting the graph schema also erases the contents of the graph store. To erase the graph store without deleting the graph schema, use the following command: GSQL> CLEAR GRAPH STORE

See also " How do I erase all data? "

Loading Data

How do I load data?

In v2.0, the TigerGraph introduced a more powerful and comprehensive syntax which has several advantages:

The TigerGraph platform can handle concurrent loading jobs, which can greatly increase throughput.
The data file locations can be specified at compile time or at run time. Run-time settings override compile-time settings.
A loading job definition can include several input files. When running the job, the user can choose to run only part of the job by specifying only some of the input files.
Loading jobs can be monitored, aborted, and restarted.

What types of data can be read?

Additional data formats are continually being added. See Data Loader User Guides and the TigerGraph Ecosystem Github Repository's etl folder https://github.com/tigergraph/ecosys/tree/master/tools/etl

What is the format of a tabular input data file?

uid,name,avg_score,is_member
100,"Lee, Tom",48.5,1
101,"Wu, Ming",33.9,0
102,"Gables, Anne", 72.2,1

The loader does not filter out extra white space (spaces or tabs). The user should filter out extra white space from the files before loading into the TigerGraph system.

How should data fields be separated?

Should fields be enclosed in quotation marks?

For example, if SEPARATOR="," and QUOTE="double" are set, then when the following data are read,

uid,name,avg_score,is_member
100,"Lee, Tom",48.5,1
101,"Wu, Ming",33.9,0
102,"Gables, Anne,"72.2,1

"Lee, Tom" will be read as a single field. The comma between Lee and Tom will not separate the field.

Does the GSQL Loader automatically interpret quotation marks as enclosing strings?

No. You must specify either QUOTE="single" or QUOTE="double" .

What are the parameters (in the USING clause) for a loading job?

The following three parameters should be considered for every loading job from a tabular input file:

The next two parameters, FILENAME and EOL are required if the job is an ONLINE_POST job:

All of the these five parameters are combined into one USING clause with a list of parameter/value pairs. The parameters may appear in any order.

USING parameter1="value1", parameter2="value2",... , parameterN="valueN"

CREATE LOADING JOB load1 FOR GRAPH LaborForce{
  LOAD "jobs.csv" TO VERTEX occupation VALUES ($0, $1) USING HEADER="true", SEPARATOR="|", QUOTE="double";
}

For online loading, the USING clause appears at the end of the RUN statement

CREATE ONLINE_POST JOB load2 FOR GRAPH LaborForce{
  LOAD TO VERTEX occupation VALUES ($0, $1);
}
RUN JOB load2 USING FILENAME="./jobs.csv", HEADER="true", SEPARATOR="|", QUOTE="double", EOL="\n"

My data file doesn't have a header but I still want to name the columns.

You can define a header line (a sequence of column names) within a loading job using a DEFINE HEADER statement, such as the following:

DEFINE HEADER head1 = "jobId", "jobName";

CREATE ONLINE_POST JOB load2 FOR GRAPH LaborForce{
  DEFINE HEADER head1 = "jobId", "jobName";
  LOAD TO VERTEX occupation VALUES ($"jobId", $"jobName") USING USER_DEFINED_HEADER="head1";
}

How do I identify and refer to the input data fields?

Input data fields can always be referenced by position. They can also be referenced by name, if a header has been defined.

- Position-based reference: The leftmost field is $0, the next one is $1, and so on.
- Name-based reference: $"name" , where name is one of the header column names.

For example, if the header is abc,def,ghi

then the third field can be referred to as either $2 or $"ghi" .

How do I split (flatten) a data field containing a list of values into separate vertices and edges?

First, to clarify the task, consider a graph schema with two vertex types, Book and Genre, and one edge type, book_genre:

CREATE VERTEX Book  (PRIMARY_ID bookcode STRING, title STRING)
CREATE VERTEX Genre (PRIMARY_ID genre_id STRING, genre_name STRING)
CREATE UNDIRECTED EDGE book_genre (FROM Book, TO Genre)
CREATE GRAPH book_rating (Book, Genre, book_genre)

bookcode|title|genres
101|"Harry Potter and the Philosopher's Stone"|fiction,fantasy,young adult
102|"The Three-Body Problem"|fiction,science fiction,Chinese

CREATE ONLINE_POST JOB load_books FOR GRAPH book_rating {
  LOAD
      TO VERTEX Book VALUES ($0, $1),
      TO TEMP_TABLE t1(bookcode,genre) VALUES ($0, flatten($2,",",1));
 
  LOAD TEMP_TABLE t1
      TO VERTEX Genre VALUES($"genre", $"genre"),
      TO EDGE book_genre VALUES($"bookcode", $"genre");
}
RUN JOB load_books USING FILENAME="book.dat", SEPARATOR="|", HEADER="true", QUOTE="double", EOL="\n"

bookcode|title|genres
101|"Harry Potter and the Philosopher's Stone"|FIC:fiction,ADV:adventure,FTS:fantasy,YA:young adult
102|"The Three-Body Problem"|FIC:fiction,SF:science fiction,CHN:Chinese"

Then the following loading statements would be appropriate:

CREATE ONLINE_POST JOB load_books2 FOR GRAPH book_rating {
  LOAD
      TO VERTEX Book VALUES ($0, $1),
      TO TEMP_TABLE t1(bookcode,genre_id,genre_name) VALUES ($0, flatten($2,",",":",2));
 
  LOAD TEMP_TABLE t1
      TO VERTEX Genre VALUES($"genre_id", $"genre_name"),
      TO EDGE book_genre VALUES($"bookcode", $"genre_id");
}
RUN JOB load_books2 USING FILENAME="book2.dat", SEPARATOR="|", EOL="\n"

Can the TigerGraph system load data from a streaming source?

I want to compute an attribute value. What built-in functions are available?

Do I need a one-to-one correspondence between input files and vertex types and edge types?

From the LOAD statement perspective for a online loading job:

LOAD 
  TO VERTEX vertex_type VALUES (attr_expr...) [WHERE conditions],
  ...,
  TO VERTEX vertex_typeN VALUES (attr_expr...) [WHERE conditions],
  TO EDGE  edge_type VALUES (attr_expr...) [WHERE conditions] [OPTION (options)],
  ...,
  TO EDGE edge_typeN VALUES (attr_expr...) [WHERE conditions] [OPTION (options)]
  [Parsing_Conditions];

Each LOAD statement refers to one input file.
Each LOAD statement can have one or more resulting vertex types and one or more resulting edge types.
Hence, one LOAD statement can potentially describe the one-to-many mapping from one input file to many resulting vertex and edge types.
It is not necessary for every input line to always generate the same set of vertex types and edge types. The WHERE clause in each TO VERTEX | TO EDGE clause can be used to selectively choose and filter which input lines generate which resulting types.

My input data includes multiple edge instances between a pair of vertices. Why is there only one in the graph?

This not an error. There can only be one instance of a certain edge type between any given pair of vertices, so the most recently loaded edge data will be the edge that you will see in the graph.

Updating and Modifying Data

How can I insert / load more data?

Second, if you have a few specific insertions, you can use the Upsert da ta command in the RESTPP API User Guide . For Upsert, the data must be formatted in JSON format.

How can I modify the graph schema?

You can modify the schema in several ways:

Add new vertex or edge types
Drop existing vertex or edge types
Add or drop attributes from an existing vertex or edge type

Any schema change can invalidate existing loading jobs and queries.

See the section "Modifying a Graph Schema" in GSQL Language Reference Part 1 - Defining Graphs and Loading Data .

How do I modify data?

To make a known modification of a known vertex or edge: Option 1) Make a RESTPP endpoint request, to the POST /graph or DELETE /graph endpoint. See the RESTPP API User Guide .

How do I selectively delete data?

You can write a query which selects vertices or edges to be deleted. See the DELETE subsections of the "Data Modification Statements" section in GSQL Language Reference Part 2 - Querying .

How do I erase all the data?

-HARD must be in all capital letters.

Querying

Is there more than one TigerGraph query language?

What is the basic syntax for the TigerGraph query language?

Is GSQL a query language or a programming language?

What types of accumulators are available?

In the following table, baseType means any of the following: INT, UINT, FLOAT, DOUBLE, STRING, BOOL, VERTEX, EDGE, JSONARRAY, JSONOBJECT, DATETIME

How do I use accumulators?

See the section "Accumulators" in the GSQL Language Reference Part 2 - Querying document.

How do I reference the ID fields of a vertex or edge in a built-in query?

Vertices :

In a CREATE VERTEX statement, the PRIMARY_ID is required and is always listed first. User-defined attributes are optional and come after the required ID fields.

CREATE VERTEX Book  (PRIMARY_ID bookcode STRING, title STRING)
CREATE VERTEX Genre (PRIMARY_ID genre_id STRING, genre_name STRING)
CREATE UNDIRECTED EDGE book_genre (FROM Book, TO Genre)
CREATE GRAPH book_rating (Book, Genre, book_genre)

In a built-in query, if you wish to select vertices by specifying an attribute value, you use the attribute name (e.g., title):

SELECT * FROM Book WHERE title=="The Three-Body Problem"

In contrast, if you wish to reference vertices by the id value, the lowercase keyword primary_id must be used. Note that that query does not use the id name pid .

SELECT * FROM Book WHERE primary_id=="101"

Edges :

CREATE UNDIRECTED EDGE book_genre (FROM Book, TO Genre, rating uint, date_time datetime)

In a query, if you wish to select edges by specifying their FROM or TO vertex values, you must use the lowercase keywords from_id or to_id .

SELECT * FROM Book-(book_genre)->Genre WHERE from_id=="101"

What is the format of data returned by a query?

The data are in JSON format. See the section "Output Statements" in the GSQL Language Reference Part 2 - Querying .

Is there an output size limit for a data query?

Yes. The maximum output size for a query is 2GB. If the result of a query would be larger than 2GB, the system may return no data. No error message is returned.

Also, for built-in queries (using the Standard Data Manipulation REST API), queries return at most 10240 vertices or edges.

How and when do I use INSTALL QUERY and INSTALL QUERY -OPTIMIZE?

Anytime after INSTALL QUERY, another statement, INSTALL QUERY -OPTIMIZE can be executed once. This operation optimizes all previously installed queries, reducing their run times by about 20%.

Should I run INSTALL QUERY -OPTIMIZE?

Optimize a query if query run time is more important to you than query installation time. The initial INSTALL QUERY operation runs quickly. This is good for the development phase.

The optional additional operation INSTALL QUERY -OPTIMIZE will take more time, but it will speed up query run time. This makes sense for production systems.

Legal:

CREATE QUERY query1... 
INSTALL QUERY query1 
RUN QUERY query1(...) 
... 
INSTALL QUERY -OPTIMIZE    # (optional) optimizes run time performance for query1 and query2 
RUN QUERY query1(...)      # runs faster than before

Illegal:

INSTALL QUERY -OPTIMIZE query_name

Can multiple users install queries at the same time?

In short, yes. They will not be executed at the same time, but the installations will be queued by the order in which they were received.

Can I make a 2-dimensional (or multi-dimensional) array?

CREATE QUERY nestedAccumEx() FOR GRAPH anyGraph {
  ListAccum<ListAccum<INT>> @@_2d_list;
  ListAccum<ListAccum<ListAccum<INT>>> @@_3d_list;
  ListAccum<INT> @@_1d_list;
  SumAccum <INT> @@sum = 4;
   
  @@_1d_list += 1;
  @@_1d_list += 2;
  // add 1D-list to 2D-list as element
  @@_2d_list += @@_1d_list;
   
  // add 1D-enum-list to 2D-list as element
  @@_2d_list += [@@sum, 5, 6];
  // combine 2D-enum-list and 2d-list
  @@_2d_list += [[7, 8, 9], [10, 11], [12]];
   
  // add an empty 1D-list
  @@_1d_list.clear();
  @@_2d_list += @@_1d_list;
   
  // combine two 2D-list
  @@_2d_list += @@_2d_list;
   
  PRINT @@_2d_list;
   
  // test 3D-list
  @@_3d_list += @@_2d_list;
  @@_3d_list += [[7, 8, 9], [10, 11], [12]];
  PRINT @@_3d_list;
}

Can I make nested container Accumulators?

ListAccum can contain ListAccum.
MapAccum and GroupByAccum can contain any container accumulator except HeapAccum.
ArrayAccum is always nested.

Here is an example:

CREATE QUERY nestedMap() FOR GRAPH anyGraph
{
  MapAccum<String, MapAccum<int, String>> @@testMap;

  @@testMap += ("m1" -> (0 -> "value1"));
  @@testMap += ("m1" -> (1 -> "value2"));
  @@testMap += ("m2" -> (2 -> "value3"));

  IF @@testMap.containsKey("m1") THEN
    PRINT @@testMap.get("m1");
  END;
   //for map, we can get it's value, and then, get the value's key.
  PRINT @@testMap.get("m1").get(0);
}

Testing and Debugging

How can I validate a loading job?

The full syntax for an (offline) loading job is the following:

RUN JOB [-DRYRUN] [-n [ first_line_num ,] last_line_num ] job_name

The -DRYRUN option will read input files and process data as instructed by the job, but it does not store data in the graph store.

-n 50 means read lines 1 to 50. -n 10,50 means read lines 10 to 50. The special symbol $ is interpreted as "last line", so -n 10,$ means reads from line 10 to the end.

Where are the logs?

The following command lists the log locations of the log files:

gadmin log

If the platform has been installed with default file locations, so that <TigerGraph_root_dir> = /home/tigergraph/tigergraph, then the output would be the following:

GPE : /home/tigergraph/tigergraph/logs/gpe/gpe1.out
GPE : /home/tigergraph/tigergraph/logs/GPE_1_1/log.INFO
GSE : /home/tigergraph/tigergraph/logs/gse/gse1.out
GSE : /home/tigergraph/tigergraph/logs/GSE_1_1/log.INFO
RESTPP : /home/tigergraph/tigergraph/logs/restpp/restpp1.out
RESTPP : /home/tigergraph/tigergraph/logs/RESTPP_1_1/log.INFO
RESTPP : /home/tigergraph/tigergraph/logs/RESTPP-LOADER_1_1/log.INFO
GSQL : /home/tigergraph/tigergraph/logs/gsql_server_log/GSQL_LOG

As of v2.4, the GSQL log files have been moved in order to keep all logs in a standard directory.

GPE: general system performance logs. GSE: Graph services logs. RESTPP: REST API call logs. GSQL: General GSQL logs.

Where are the log files of loading runs?

What are in the log files?

GPE Logs - Graph Processing Engine Logs GSE Logs - Graph Storage Engine Logs GSQL Logs - System & Query Logs RESTPP Logs - API call Logs NGINX Logs - HTTP Request Logs VIS Logs - GraphStudio Logs

I can’t seem to load any more data. What’s the matter?

[Warning] License limit exceeded. The system is running in limited capacity mode.

In Limited Capacity mode, additional data may not be inserted. Data may be queried and deleted.