Overview

TigerGraph In-Database Graph Data Science Library is a collection of expertly written GSQL queries, each of which implements a standard graph algorithm. Each algorithm is ready to be installed and used, either as a stand-alone query or as a building block of a larger analytics application.

We renamed our library (formerly known as GSQL Graph Algorithm Library) to emphasize our focus on graph data science. As the world’s only scalable graph analytics platform, TigerGraph is committed to providing the best graph analytics framework for data scientists.

GSQL running on the TigerGraph platform is particularly well-suited for graph algorithms for several reasons:

  • Turing-complete with full support for imperative and procedural programming, ideal for algorithmic computation.

  • Parallel and Distributed Processing, enabling computations on larger graphs.

  • User-Extensible. Because the algorithms are written in standard GSQL and compiled by the user, they are easy to modify and customize.

  • Open-Source. Users can study the GSQL implementations to learn by example, and they can develop and submit additions to the library.

Library Structure

You can download the library from Github: https://github.com/tigergraph/gsql-graph-algorithm

The library contains two folders: algorithms and graphs.

algorithms

The algorithms folder contains the GSQL implementation of all the graph algorithms offered by the library. Within the algorithms folder are six subfolders that group the algorithms into six classes:

graphs

The graphs folder contains small sample graphs that you can use to experiment with the algorithms. In this document, we use the test graphs to show you the expected result for each algorithm. The graphs are small enough that you can manually calculate and sometimes intuitively see what the answers should be.

Release Branches

Starting with TigerGraph product version 2.6, the Library has release branches:

  • Product version branches (2.6, 3.0, etc.) are snapshots created shortly after a product version is released. They contain the best version of the graph algorithm library at the time of that product version's initial release. They will not be updated, except to fix bugs.

  • Master branch: the newest released version. This should be at least as new as the newest. It may contain new or improved algorithms.

  • Other branches are development branches.

It is possible to run newer algorithms on an older product version, as long as the algorithm does not rely on features available only in newer product versions.

Run an algorithm

All GSQL graph algorithms are schema-free, which means they are ready to use with any graph, regardless of the graph's data model or schema. The algorithms have run-time input parameters for the vertex type(s), edge type(s), and attributes which the user wishes to use.

To use an algorithm, the algorithm (GSQL query) must first be installed. If your database is on a distributed cluster, you should use the DISTRIBUTED option when installing the query to install it in Distributed Query Mode.

Running an algorithm is the same as running a GSQL query. For example, if you selected the JSON option for pageRank, you could run it from GSQL as below:

GSQL > RUN QUERY tg_pageRank("Page","Links_to",_,30,_,50,_,_,_)

Installing a query also creates a REST endpoint. The same query could be run thus:

curl -X GET 'http://127.0.0.1:9000/query/alg_graph/pageRank?v_type=Page&e_type=Links_to&max_iter=30&top_k=50'

GSQL lets you run queries from within other queries. This means you can use a library algorithm as a building block for more complex analytics.

Last updated