TigerGraph ML Workbench

TigerGraph ML Workbench is an R&D platform for data scientists and AI/ML practitioners to develop Graph Neural Network (GNN) models with production-scale graph data stored in TigerGraph. It provides robust and efficient Python-level data pipelines that stream graph data from the TigerGraph database to the user’s ML system, handles common data processing tasks such as splitting graph data sets for training, validation, and testing, and offers various subgraph sampling methods.

TigerGraph ML Workbench is designed to work with your existing ML framework and infrastructure. It is compatible with popular ML frameworks such as PyTorch Geometric (PyG) and Deep Graph Library (DGL). Additionally, it can be plugged into your existing on-prem infrastructure, or run in the cloud with Amazon SageMaker and Google Vertex AI.

High-level architecture

The TigerGraph ML Workbench contains three major components:

  • Graph Data Processing Service (GDPS)

  • TGML, the Python client for GDPS

  • TigerLab, a JupyterLab-based IDE

Figure 1. High-level production architecture

Figure 2. Architecture flow diagram

The Graph Data Processing Service (GDPS)

GDPS runs on the same machine as the core TigerGraph database and performs graph machine learning operations such as graph data sampling, feature extraction, data preparation, and data caching, and it sends the resulting data to your ML development environment. It also exposes REST endpoints that the tgml package calls to perform these operations.
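
As a rough illustration of what sampling graph data can involve, the sketch below performs uniform multi-hop neighbor sampling on a small in-memory adjacency list. It is a generic Python example and not the GDPS implementation or its REST API; the function name `sample_neighborhood`, its parameters, and the toy graph are all assumptions made for illustration.

```python
import random


def sample_neighborhood(adjacency, seeds, num_neighbors, num_hops, seed=0):
    """Uniformly sample up to `num_neighbors` neighbors per vertex for `num_hops` hops.

    `adjacency` maps a vertex ID to a list of neighbor IDs (hypothetical format).
    Returns the set of visited vertices and the list of sampled (source, target) edges.
    """
    rng = random.Random(seed)
    visited = set(seeds)
    frontier = list(seeds)
    edges = []
    for _ in range(num_hops):
        next_frontier = []
        for v in frontier:
            neighbors = adjacency.get(v, [])
            k = min(num_neighbors, len(neighbors))
            # Sample without replacement among the neighbors of v.
            for u in rng.sample(neighbors, k):
                edges.append((v, u))
                if u not in visited:
                    visited.add(u)
                    next_frontier.append(u)
        frontier = next_frontier
    return visited, edges


# Toy graph used only for this example.
graph = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0], 4: [1], 5: [2]}
vertices, edges = sample_neighborhood(graph, seeds=[0], num_neighbors=2, num_hops=2)
```

Capping the number of neighbors per hop keeps each sampled subgraph small regardless of how large the full graph is, which is the usual motivation for sampling before training a GNN on production-scale data.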

TGML

TGML is a Python package installed on the computer or server where you want to perform your machine learning training. The tgml package provides utilities such as vertex set splitting for training, validation, and testing, as well as graph data loaders for both PyTorch Geometric (PyG) and Deep Graph Library (DGL). As tgml is a Python package, it can be installed anywhere Python is used.
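
To make the idea of vertex set splitting concrete, here is a minimal sketch in plain Python. It does not use or reproduce the tgml API; the function name `split_vertices`, its parameters, and the default fractions are hypothetical and chosen only to show the technique.

```python
import random


def split_vertices(vertex_ids, train_frac=0.8, valid_frac=0.1, seed=42):
    """Randomly partition vertex IDs into training, validation, and test sets.

    The test set receives whatever remains after the training and
    validation fractions are taken.
    """
    if train_frac < 0 or valid_frac < 0 or train_frac + valid_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    ids = list(vertex_ids)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_valid = int(len(ids) * valid_frac)
    return {
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_valid],
        "test": ids[n_train + n_valid:],
    }


# Example: split 100 hypothetical vertex IDs 80/10/10.
splits = split_vertices(range(100))
```

Because the three sets are slices of one shuffled list, every vertex lands in exactly one set, which avoids leakage between training and evaluation data.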

TigerLab

TigerLab is a JupyterLab-based development environment with TigerGraph-specific utilities and components, such as a server manager and a link to GraphStudio. In addition, Python libraries such as PyTorch Geometric, DGL, and TGML come pre-installed, so you don’t have to worry about setting up the right Python environment.

Graph neural networks and their applications

GNNs tend to outperform other machine learning techniques when there are well-defined relationships in the data, because they directly model the connectivity of your graph data. Recent research has shown GNNs to be successful across various business domains and applications. With TigerGraph ML Workbench, you can easily explore the potential of GNNs for your own domain. Below are some papers and resources to spark ideas in a range of applications and industries:

Recommendation Engines

Pinterest introduced PinSAGE[1], an architecture that can serve real-time recommendations to their users, resulting in a 10-30% improvement compared to other deep learning methods when evaluated in A/B testing.

Supply Chain

Amazon released a GNN architecture[2] that incorporates temporal information with GNNs for demand forecasting. The method models interactions between products and their sellers on Amazon in a graph, resulting in a 16% improvement over other state-of-the-art forecasting methods.

Healthcare

AstraZeneca has used graph neural networks such as GraphSAGE to generate knowledge graph embeddings for predicting drug-drug interactions, including potential synergies between drugs and polypharmacy side effects[3]. Additionally, the possibility of repurposing drugs to treat COVID-19 has been studied using a drug repurposing knowledge graph and GNNs[4].

Financial Institutions

Graph convolutional networks (GCNs) have been studied for predicting money-laundering behavior in Bitcoin transaction networks, and have been shown to perform well compared to other approaches[5].

If you are interested in learning more about the fundamental research on different variants of graph neural networks, here is a list of helpful publications:


1. Ying, Rex et al. “Graph Convolutional Neural Networks for Web-Scale Recommender Systems”, Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018.
2. Ankit Gandhi, Aakankasha, Sivaramakrishnan Kaveri, Vineet Chaoji, “Spatio-temporal multi-graph networks for demand forecasting in online marketplaces”
3. Benedek Rozemberczki, Stephen Bonner, Andriy Nikolov, Michael Ughetto, Sebastian Nilsson, Eliseo Papa, “A Unified View of Relational Deep Learning for Drug Pair Scoring”, CoRR, November 2021.
4. Hsieh, K., Wang, Y., Chen, L. et al. “Drug repurposing for COVID-19 using graph neural network and harmonizing multiple evidence”, Sci Rep 11, 23179, 2021.
5. Mark Weber, Giacomo Domeniconi, Jie Chen, Daniel Karl I. Weidele, Claudio Bellei, Tom Robinson, and Charles E. Leiserson, “Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks for Financial Forensics”, In Proceedings of ACM Conference (KDD ’19 Workshop on Anomaly Detection in Finance), 2019.