Prepare your environment - Pattern Matching tutorial
This section of the pattern matching tutorial walks you through creating the graph and loading graph data, so you have the right setup to move forward with the tutorial.
We will use the LDBC Social Network Benchmark (LDBC SNB) data set. This data set models a typical online forum where users post messages and discuss topics. It comes with a data generator, which allows you to generate data at different scale factors. Scale factor 1 generates roughly 1GB of raw data, scale factor 10 generates roughly 10GB of raw data, and so on.
Figure 1 shows the schema (from the LDBC SNB specification). It models the activities and relationships of social forum participants. For example, a forum Member can publish Posts on a Forum, and other Members of the Forum can make a Comment on the Post or on someone else’s Comment.
A Person’s home location is a hierarchy (Continent>Country>City), and a person can be affiliated with a University or a Company. A Tag can be used to classify a Forum, a Message, or a Person’s interests. Tags can further be classified by Tag_Class.
The relationships between entities are modeled as directed edges, with one exception: Person KNOWS Person is modeled as an undirected edge.
For example, Person connects to Tag by the hasInterest edge. Forum connects to Person by two different edges, Has_Member and Has_Moderator.
Prerequisites
- A running TigerGraph instance. See Get Started with Docker to quickly provision a container running the latest version of TigerGraph.
- You have full privileges to create graphs, load data, and drop graphs. It is recommended that you go through this tutorial with the default user tigergraph, which has the superuser role with all supported privileges.
Download raw data
We have generated a data set with scale factor 1 (approximately 1GB). You can download it from the following link: https://s3-us-west-1.amazonaws.com/tigergraph-benchmark-dataset/LDBC/SF-1/ldbc_snb_data-sf1.tar.gz
wget https://s3-us-west-1.amazonaws.com/tigergraph-benchmark-dataset/LDBC/SF-1/ldbc_snb_data-sf1.tar.gz
After the download completes, run the tar command below to extract the archive.
tar -xzf ldbc_snb_data-sf1.tar.gz
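A partially downloaded archive fails midway through extraction, so it can help to test the archive first. The following is a minimal sketch, not part of the tutorial's scripts: extract_checked is a hypothetical helper name, and it assumes a tar that supports the -t, -z, and -C options.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the tutorial): list the archive
# contents first, and extract only if the listing succeeds.
extract_checked() {
  local archive="$1"
  local dest="${2:-.}"
  # tar -tzf reads the whole gzip stream, so a truncated or corrupt
  # download is reported here instead of halfway through extraction.
  if ! tar -tzf "$archive" > /dev/null 2>&1; then
    echo "archive is missing or corrupt: $archive" >&2
    return 1
  fi
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
}
```

For example, extract_checked ldbc_snb_data-sf1.tar.gz . extracts into the current directory, which has the same effect as the tar command above when the archive is intact.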
After decompressing the file, you see a folder named ldbc_snb_data.
Inside are two sub-folders:
- social_network
- substitution_parameters
The raw data is in the social_network folder.
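Before moving on, you may want to confirm that the extracted folder has the layout described above. This is a rough sketch under stated assumptions: check_ldbc_layout is a hypothetical helper, and it assumes the raw files in social_network use the .csv extension, as the loading output later in this section suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify both sub-folders exist and report how many
# CSV files sit directly inside social_network.
check_ldbc_layout() {
  local root="${1:-ldbc_snb_data}"
  local sub
  for sub in social_network substitution_parameters; do
    if [ ! -d "$root/$sub" ]; then
      echo "missing: $root/$sub" >&2
      return 1
    fi
  done
  local count
  count=$(find "$root/social_network" -maxdepth 1 -type f -name '*.csv' | wc -l)
  # $((count)) normalizes any padding that wc adds on some systems.
  echo "csv files in $root/social_network: $((count))"
}
```

Run it as check_ldbc_layout ldbc_snb_data from the directory where you extracted the archive. A non-zero exit status means one of the two sub-folders is missing.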
Run script to create schema and loading jobs
Download setup_schema.gsql, which is a script that defines the schema and creates the loading job.
Use the export command to set the environment variable LDBC_SNB_DATA_DIR to the raw data folder you extracted in the previous section.
In our example below, the raw data is in /home/tigergraph/ldbc_snb_data/social_network.
Then, start your TigerGraph services if needed. Finally, run the setup_schema.gsql script to create your LDBC Social Network graph.
# Point LDBC_SNB_DATA_DIR at your raw data directory
export LDBC_SNB_DATA_DIR=/home/tigergraph/ldbc_snb_data/social_network/
# Start all TigerGraph services
gadmin start all
# Set up the schema and loading job
gsql setup_schema.gsql
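If LDBC_SNB_DATA_DIR is unset or mistyped, the problem may only surface later, when the loading job cannot find its files. The sketch below is a hypothetical pre-flight check (require_data_dir is not part of the tutorial's scripts) that you could run before the gsql command.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: fail early if LDBC_SNB_DATA_DIR is
# unset, empty, or does not point at an existing directory.
require_data_dir() {
  if [ -z "${LDBC_SNB_DATA_DIR:-}" ]; then
    echo "LDBC_SNB_DATA_DIR is not set" >&2
    return 1
  fi
  if [ ! -d "$LDBC_SNB_DATA_DIR" ]; then
    echo "LDBC_SNB_DATA_DIR is not a directory: $LDBC_SNB_DATA_DIR" >&2
    return 1
  fi
  echo "LDBC_SNB_DATA_DIR ok: $LDBC_SNB_DATA_DIR"
}
```

One way to use it is require_data_dir && gsql setup_schema.gsql, so that the schema script runs only when the check passes.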
Run loading job
Download the loading job script, load_data.sh, and invoke it on the command line. It runs the loading job that setup_schema.gsql created in the previous step, using the files specified in that job.
bash ./load_data.sh
tigergraph/gsql_102$ ./load_data.sh
[Tip: Use "CTRL + C" to stop displaying the loading status update, then use "SHOW LOADING STATUS jobid" to track the loading progress again]
[Tip: Manage loading jobs with "ABORT/RESUME LOADING JOB jobid"]
Starting the following job, i.e.
JobName: load_ldbc_snb, jobid: ldbc_snb.load_ldbc_snb.file.m1.1558053156447
Loading log: '/mnt/data/tigergraph/logs/restpp/restpp_loader_logs/ldbc_snb/ldbc_snb.load_ldbc_snb.file.m1.1558053156447.log'
Job "ldbc_snb.load_ldbc_snb.file.m1.1558053156447" loading status
[FINISHED] m1 ( Finished: 31 / Total: 31 )
[LOADED]
+----------------------------------------------------------------------------------------------------------------------------------+
| FILENAME | LOADED LINES | AVG SPEED | DURATION|
| /mnt/data/download/ldbc_snb_data/social_network/comment_0_0.csv | 2052170 | 281 kl/s | 7.28 s|
| /mnt/data/download/ldbc_snb_data/social_network/comment_has_creator_person_0_0.csv | 2052170 | 251 kl/s | 8.17 s|
| /mnt/data/download/ldbc_snb_data/social_network/comment_has_tag_tag_0_0.csv | 2698394 | 422 kl/s | 6.38 s|
| /mnt/data/download/ldbc_snb_data/social_network/comment_is_located_in_place_0_0.csv | 2052170 | 291 kl/s | 7.04 s|
| /mnt/data/download/ldbc_snb_data/social_network/comment_reply_of_comment_0_0.csv | 1040750 | 253 kl/s | 4.11 s|
| /mnt/data/download/ldbc_snb_data/social_network/comment_reply_of_post_0_0.csv | 1011421 | 248 kl/s | 4.07 s|
| /mnt/data/download/ldbc_snb_data/social_network/forum_0_0.csv | 90493 | 87 kl/s | 1.03 s|
| /mnt/data/download/ldbc_snb_data/social_network/forum_container_of_post_0_0.csv | 1003606 | 240 kl/s | 4.18 s|
| /mnt/data/download/ldbc_snb_data/social_network/forum_has_member_person_0_0.csv | 1611870 | 431 kl/s | 3.74 s|
| /mnt/data/download/ldbc_snb_data/social_network/forum_has_moderator_person_0_0.csv | 90493 | 89 kl/s | 1.01 s|
| /mnt/data/download/ldbc_snb_data/social_network/forum_has_tag_tag_0_0.csv | 309767 | 297 kl/s | 1.04 s|
| /mnt/data/download/ldbc_snb_data/social_network/organisation_0_0.csv | 7956 | 7 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/organisation_is_located_in_place_0_0.csv | 7956 | 7 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_0_0.csv | 9893 | 9 kl/s | 1.05 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_has_interest_tag_0_0.csv | 229167 | 223 kl/s | 1.03 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_is_located_in_place_0_0.csv | 9893 | 9 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_knows_person_0_0.csv | 180624 | 169 kl/s | 1.06 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_likes_comment_0_0.csv | 1438419 | 449 kl/s | 3.20 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_likes_post_0_0.csv | 751678 | 331 kl/s | 2.27 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_study_at_organisation_0_0.csv | 7950 | 7 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/person_work_at_organisation_0_0.csv | 21655 | 21 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/place_0_0.csv | 1461 | 1 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/place_is_part_of_place_0_0.csv | 1455 | 1 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/post_0_0.csv | 1003606 | 195 kl/s | 5.14 s|
| /mnt/data/download/ldbc_snb_data/social_network/post_has_creator_person_0_0.csv | 1003606 | 320 kl/s | 3.13 s|
| /mnt/data/download/ldbc_snb_data/social_network/post_has_tag_tag_0_0.csv | 713259 | 341 kl/s | 2.09 s|
| /mnt/data/download/ldbc_snb_data/social_network/post_is_located_in_place_0_0.csv | 1003606 | 327 kl/s | 3.07 s|
| /mnt/data/download/ldbc_snb_data/social_network/tag_0_0.csv | 16081 | 16 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/tag_has_type_tagclass_0_0.csv | 16081 | 16 kl/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/tagclass_0_0.csv | 72 | 71 l/s | 1.00 s|
| /mnt/data/download/ldbc_snb_data/social_network/tagclass_is_subclass_of_tagclass_0_0.csv | 71 | 70 l/s | 1.00 s|
+----------------------------------------------------------------------------------------------------------------------------------+
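To sanity-check a run, you can total the LOADED LINES column of the status table. The sketch below assumes you have copied the final table into a text file (load_status.txt is a hypothetical name) and that each row keeps the pipe-separated layout shown above. sum_loaded_lines is an illustrative helper, not a TigerGraph command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a saved status table on stdin and sum the
# LOADED LINES column. Rows are expected to look like
#   | <file> | <loaded lines> | <avg speed> | <duration>|
sum_loaded_lines() {
  # With "|" as the separator, field 3 holds the loaded line count.
  # The header and border lines are skipped because their field 3 is
  # not purely numeric.
  awk -F'|' '
    $3 ~ /^[[:space:]]*[0-9]+[[:space:]]*$/ { files++; total += $3 }
    END { printf "files: %d, loaded lines: %d\n", files, total }
  '
}
```

Invoke it as sum_loaded_lines < load_status.txt. For the run shown above, the reported file count should match the 31 files in the FINISHED line.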
Next steps
After the loading job finishes running, your tutorial setup is complete, and you are ready to start learning about One-hop patterns.