Encryption for Data at Rest and Data in Motion
TigerGraph supports secure data-in-flight communication using the SSL/TLS encryption protocol. This applies to any outward-facing channel, including GSQL clients, RESTPP endpoints, and the GraphStudio web interface. When SSL/TLS is enabled, HTTPS takes the place of HTTP for RESTPP and GraphStudio connections.
You should have basic knowledge about how SSL works:
What the SSL certificate and key are used for
That an SSL certificate is bound to a domain
How an SSL certificate chain works
A good primer on SSL is available at https://httpd.apache.org/docs/2.4/ssl/ssl_intro.html
TigerGraph uses the Nginx web server, so SSL configuration makes use of some built-in support in Nginx.
http://nginx.org/en/docs/http/configuring_https_servers.html
The two main options for obtaining an SSL Certificate are to generate your own self-signed certificate or to purchase a certificate from a trusted Certificate Authority. Regardless of which method you choose, your certificate should be chained to a trusted root certificate embedded in your browser. The options and details for producing a trusted SSL certificate are beyond the scope of this document. The focus of this document is how to configure your TigerGraph system to use the certificate to enable SSL.
First, obtain an SSL certificate from a trusted agent of your choice. Certificate vendors will provide clear instructions for ordering a certificate and then for installing it on your system.
Then you can configure the certificate with gadmin config entry ssl.
There are multiple ways to create a self-signed certificate. One example is shown below.
For simplicity, the method below will use the root certificate directly as the HTTPS server certificate. This method is satisfactory for testing but should not be used for a production system.
In the example below, the Common Name value should be your server hostname, since HTTPS certificates are bound to domain names.
For security reasons, the certificates can only be used with permission 600 or less.
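One way to produce such a test certificate is with the openssl command-line tool. The following is a minimal sketch, not necessarily the exact commands this guide originally showed. The variable names cert_dir and cert_host and the file names server.key and server.crt are illustrative assumptions; set cert_host to your server hostname.

```shell
# Hypothetical names: cert_dir, cert_host, server.key, and server.crt are
# illustrative choices, not values required by TigerGraph.
cert_dir="${cert_dir:-$(mktemp -d)}"   # where the key and certificate are written
cert_host="${cert_host:-localhost}"    # replace with your server hostname

# Generate a 2048-bit RSA key and a self-signed certificate valid for 365 days.
# -nodes leaves the private key unencrypted so the web server can read it.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout "$cert_dir/server.key" \
  -out "$cert_dir/server.crt" \
  -days 365 \
  -subj "/CN=$cert_host"

# Restrict both files to the owner (permission 600).
chmod 600 "$cert_dir/server.key" "$cert_dir/server.crt"
```

The -subj option sets the Common Name without prompting, which keeps the command usable in scripts.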
With the self-signed certificate successfully generated, you can configure it with gadmin, so that all HTTP traffic is protected with SSL.
TigerGraph's SSL only accepts PEM-encoded certificates. If you have a certificate encoded in other formats (e.g. DER), you need to convert it to a PEM-encoded certificate first.
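If your certificate is DER-encoded, openssl can convert it. This is a sketch under the assumption that the file holds a single X.509 certificate; der_to_pem is a hypothetical helper name, and the file names in the usage line are placeholders.

```shell
# der_to_pem is a hypothetical helper name; the openssl options are standard.
# Usage: der_to_pem <input.der> <output.pem>
der_to_pem() {
  # Read a DER-encoded X.509 certificate and write it back out as PEM.
  openssl x509 -inform DER -in "$1" -outform PEM -out "$2"
}
```

For example, `der_to_pem server.der server.crt` writes a PEM file that begins with a `-----BEGIN CERTIFICATE-----` line.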
After saving the settings, apply the configuration settings.
Then restart the following services: gsql, nginx, ts3, and gui.
Now you may test the connection.
A direct curl request to the server will fail due to certificate verification failure:
In v1.2, the default TCP/IP port for Nginx has changed from 44240 to 14240, to avoid possible port conflicts with Zookeeper.
You may use the -k option to turn off the verification, but it is unsafe and not recommended.
To successfully make requests with curl, you will need to specify the certificate by using the --cacert parameter:
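A request of that kind might look like the sketch below. check_https is a hypothetical helper name, and the root URL path is only an assumption used to exercise the TLS handshake; substitute the endpoint you actually want to call. The hostname you pass must match the certificate's Common Name, or verification will still fail.

```shell
# check_https is a hypothetical helper; pass your own certificate file and
# server hostname.
# Usage: check_https <ca_cert_file> <server_hostname>
check_https() {
  # --cacert tells curl to trust the given certificate when verifying the server.
  # Port 14240 is the default Nginx port mentioned above.
  curl --cacert "$1" "https://$2:14240/"
}
```

For example: `check_https server.crt myhost.example.com`, where both arguments are placeholders.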
The TigerGraph graph data store uses a proprietary encoding scheme which both compresses the data and obscures the data unless the user knows the encoding/decoding scheme. In addition, the TigerGraph system supports integration with industry-standard methods for encrypting data when stored on disk ("data at rest").
Data at rest encryption can be applied at many different levels. A user can choose to use one or more levels.
File system encryption employs advanced encryption algorithms. Some tools allow the user to select from a menu of encryption algorithms. It can be done either in kernel mode or user mode. To run in kernel mode, superuser permission is required.
Since Linux 2.6, device-mapper has provided a generic way to create virtual layers of block devices, including transparent encryption of blocks using the kernel crypto API.
In Ubuntu, full-disk encryption is an option during the OS installation process. For other Linux distributions, the disk can be encrypted with dm-crypt.
A commonly used utility is eCryptfs, which is licensed under the GPL and is built into the kernels of some distributions, such as Ubuntu.
If root privilege is not available, a workaround is to use FUSE (Filesystem in User Space) to create a user-level filesystem running on top of the host operating system. While the performance may not be as good as running in kernel mode, there are more options available for customization and tuning.
In this example, we use dm-crypt to provide kernel-mode file system encryption. The dm-crypt utility is widely available and offers a choice of encryption algorithms. It also can be set to encrypt various units of storage – full disk, partitions, logical volumes, or files.
The basic idea of this solution is to create a file, map an encrypted file system to it, and mount it as a storage directory for TigerGraph with R/W permission only to authorized users.
Before you start, you will need a Linux machine on which
you have root permission,
the TigerGraph system has not yet been installed,
and you have sufficient disk space for the TigerGraph data you wish to encrypt. This may be on your local disk or on a separate disk you have mounted.
Install cryptsetup (cryptsetup is included with Ubuntu, but other OS users may need to install it with yum).
Install the TigerGraph system.
Grant sudo privilege to the TigerGraph OS user.
Stop all TigerGraph services with the following commands:
gadmin stop all -y
gadmin stop admin -y
Acting as the tigergraph OS user, run the following export commands to set variables. Replace the placeholders enclosed in angle brackets <...> with the values of your choice:
Create a file for TigerGraph data storage.
Change the permission of the file so that only the owner of the file (that is, only the tigergraph user who created the file in the previous step) will be able to access it:
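The two steps above, creating the storage file and restricting its permissions, can be sketched as follows. The names backing_file and backing_size are hypothetical stand-ins for the path and size you chose; the 64M default is only there to keep the example quick and is far too small for real data.

```shell
# Hypothetical names: backing_file and backing_size are illustrative; use the
# path and size you chose for your own deployment.
backing_file="${backing_file:-$(mktemp -d)/tigergraph_store.img}"
backing_size="${backing_size:-64M}"   # far too small for real data; pick your own size

# Allocate a sparse file of the requested size to hold the encrypted volume.
truncate -s "$backing_size" "$backing_file"

# Allow only the owner to read and write the file.
chmod 600 "$backing_file"
```

A sparse file reserves no disk blocks up front, so make sure the underlying disk really has room for the size you request.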
Associate a loopback device with the file:
Encrypt the storage on the device. cryptsetup uses the Linux device mapper to create, in this case, $encrypted_file_path. Initialize the volume interactively, entering the password you assigned to $encryption_password when prompted:
If you are automating the process with a script running in a root TTY session, you may use the following command instead:
Open the partition, and create a mapping to $encrypted_file_path :
If you are automating the process with a script running in a root TTY session, you may use the following command instead:
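The loopback, format, and open steps above can be sketched with losetup and cryptsetup as shown here. This is one possible way to do it, not necessarily the commands this guide originally listed. All function names are hypothetical, and every function needs root privileges. The batch variants read the passphrase from standard input instead of prompting.

```shell
# Hypothetical helper names throughout. All commands need root privileges.

# Attach the backing file to the first free loop device and print its name.
attach_loop_device() {    # usage: loop_dev=$(attach_loop_device <backing_file>)
  sudo losetup --find --show "$1"
}

# Initialize a LUKS volume; cryptsetup prompts for the passphrase and verifies it (-y).
format_volume() {         # usage: format_volume <loop_device>
  sudo cryptsetup -y luksFormat "$1"
}

# Open the volume as /dev/mapper/<mapping_name>; prompts for the passphrase.
open_volume() {           # usage: open_volume <loop_device> <mapping_name>
  sudo cryptsetup luksOpen "$1" "$2"
}

# Non-interactive variants for scripts: the passphrase is piped in on stdin
# (--key-file=-) and --batch-mode suppresses the confirmation question.
format_volume_batch() {   # usage: format_volume_batch <loop_device> <passphrase>
  printf '%s' "$2" | sudo cryptsetup --batch-mode luksFormat --key-file=- "$1"
}

open_volume_batch() {     # usage: open_volume_batch <loop_device> <mapping_name> <passphrase>
  printf '%s' "$3" | sudo cryptsetup luksOpen --key-file=- "$1" "$2"
}
```

Note that luksFormat destroys any data already on the device, so double-check the loop device name before running it.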
Clear the password from bash variables and bash history.
The following commands may also clear your earlier bash history. Alternatively, you may edit ~/.bash_history to selectively delete the related entries.
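A minimal bash sketch of this cleanup is shown below. clear_password_traces is a hypothetical helper name; the history builtin is specific to bash.

```shell
# clear_password_traces is a hypothetical helper name. Bash-specific.
clear_password_traces() {
  history -c                  # empty the in-memory history of this session
  history -w                  # overwrite the history file ($HISTFILE) with the now-empty list
  unset encryption_password   # drop the variable from the shell environment
}
```

Because history -w rewrites the whole history file, this removes older entries too, which is the side effect described above.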
Create a file system and verify its status:
Mount the new file system to /mnt/secretfs:
Change the permission to 700 so that only $db_user has access to the file system:
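The file system, mount, and permission steps above can be sketched as follows. The helper names are hypothetical, and ext4 as the file system type is an assumption; the mapping name you pass should be the one you used when opening the volume. All commands need root privileges.

```shell
# Hypothetical helper names; ext4 is an assumed file system type.
# All commands need root privileges.

make_filesystem() {       # usage: make_filesystem <mapping_name>
  sudo mkfs.ext4 "/dev/mapper/$1"
  sudo cryptsetup status "$1"     # reports the cipher, key size, and backing device
}

mount_filesystem() {      # usage: mount_filesystem <mapping_name> <db_user>
  sudo mkdir -p /mnt/secretfs
  sudo mount "/dev/mapper/$1" /mnt/secretfs
  sudo chown "$2" /mnt/secretfs   # make the database OS user the owner
  sudo chmod 700 /mnt/secretfs    # owner-only access
}
```

For example: `make_filesystem <mapping_name>` followed by `mount_filesystem <mapping_name> "$db_user"`.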
Move the original TigerGraph files to the encrypted filesystem and make a symbolic link. If you wish to encrypt only the TigerGraph data store (called gstore), use the following commands:
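Moving gstore and linking it back can be sketched like this. relocate_gstore is a hypothetical helper name; run it as the tigergraph OS user while TigerGraph services are stopped.

```shell
# relocate_gstore is a hypothetical helper name.
# Usage: relocate_gstore <tigergraph_data_root> <encrypted_mount_point>
relocate_gstore() {
  # Move the data store onto the encrypted file system, then leave a
  # symbolic link at the original location so TigerGraph still finds it.
  mv "$1/gstore" "$2/gstore" && ln -s "$2/gstore" "$1/gstore"
}
```

For example: `relocate_gstore "$tigergraph_data_root" /mnt/secretfs`. The link is created only if the move succeeds.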
There are other TigerGraph files which you might also consider to be sensitive and wish to encrypt. These include the dictionary, kafka data files, and log files. You could selectively identify files to protect, or you could encrypt the entire TigerGraph folder (App/Data/Log/TempRoot). In this case, simply move $tigergraph_data_root instead of $tigergraph_data_root/gstore.
TigerGraph data is now stored in an encrypted filesystem. It will be automatically decrypted when the tigergraph user (and only this user) accesses it.
To deploy this encryption solution automatically, you may:
Chain all the steps into a bash script.
Remove every "sudo", since the script will run as root.
Run the script as the root user after TigerGraph installation.
The setup scripts contain your encryption password. To follow good security procedures, do not leave your password in plaintext format in any files on your disk. Either remove the setup scripts or edit out the password.
Encryption is usually CPU-bound rather than I/O-bound. If CPU usage remains below 100%, encryption should not cause much performance slowdown. A performance test using both small and large queries supports this prediction: for small (~1 sec) and large (~100 sec) queries, there is a ~5% slowdown due to filesystem encryption.
We used the TPC-H dataset with scale factor 10 (http://www.tpc.org/tpch/). The data size is 23GB after loading into TigerGraph. The write test (data loading) was done by running a loading job and then killing the GPE with SIGTERM (to exit gracefully) to ensure that all kafka data is consumed. The read test (GSE cold start) measures the time from "gadmin start gse" until "online" appears in "gadmin status gse".
Major cloud service providers often provide their own methodologies for encrypting data at rest. For Amazon EC2, we recommend users start by reading the AWS Security Blog: How to Protect Data at Rest with Amazon EC2 Instance Store Encryption.
In this section, we provide a simple example for configuring file system encryption for a TigerGraph system running on Amazon EC2. The steps are based on those given in How to Protect Data at Rest with Amazon EC2 Instance Store Encryption, with some additions and modifications.
The basic idea of this solution is to create a file, map an encrypted file system to it, and mount it as a storage directory for TigerGraph with permission only to authorized users.
Angle brackets <...> are used to mark placeholders which you should replace with your own values (without the angle brackets).
Make sure you have installed and configured AWS CLI with keys locally.
If you don't have a KMS key, you can create it first:
From the IAM console, choose Encryption keys from the navigation pane.
Select Create Key, and type in <your-key-alias>.
For Step 2 and Step 3, see https://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html for advice.
In Step 4: Define Key Usage Permissions, select <your-role-name>.
The role now has permission to use the key.
In this section, you launch a new EC2 instance with the new IAM role and a bootstrap script that executes the steps to encrypt the file system.
The script in this section requires root permission, and it cannot be run manually through an ssh tunnel or by an unprivileged user.
In the EC2 console, launch a new instance (see this tutorial for more details). Choose Amazon Linux AMI 2017.09.1 (HVM), SSD Volume Type. (If you are not using an Amazon Linux AMI, add a script that installs python, pip, and the AWS CLI at the beginning.)
In Step 3: Configure Instance Details:
In IAM role, choose <your-role-name>.
In User Data, paste the following code block after replacing the placeholders with your values and appending the TigerGraph installation script.
It may take a few minutes for the script to complete after system launch.
Then, you should be able to launch one or more EC2 machines with an encrypted folder under /mnt/secretfs that only OS user tigergraph can access.
Encryption is usually CPU-bound rather than I/O-bound. If CPU usage is below 100%, TigerGraph tests show no significant performance degradation.
Encryption levels for data at rest:

| Encryption Level | Description | TigerGraph Support |
| --- | --- | --- |
| Hardware | Use specialized hard disks which perform automatic encryption on write and decryption on read (by authorized OS users) | Invisible to TigerGraph |
| Kernel-level file system | Use Linux built-in utilities to encrypt data. Root privilege required. | Invisible to TigerGraph |
| User-level file system | Use Linux built-in utilities and customized libraries to encrypt data. Root privilege is not required. | Invisible to TigerGraph |
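
Performance test results (TPC-H, scale factor 10):

| Storage | GSE Cold Start (read) | Load Data (write) |
| --- | --- | --- |
| original | 45s | 809s |
| encrypted | 47s | 854s |
| % slowdown | 4.4% | 5.8% |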