After creating a graph schema, the next step is to load data into it. The task here is to instruct the GSQL loader how to associate ("map") the fields in a set of data files to the attributes in your vertex types and edge types of the graph schema we just defined.

You should have the two data files person.csv and friendship.csv on your local disk. It’s not necessary that they are in the same folder with you.

If you need to exit the GSQL shell for any reason, you can do so by typing "quit" without the quotes. Type gsql to enter again.

The loading job below assumes that your data files are in the folder /home/tigergraph. If they are elsewhere, then in the loading job script below replace /home/tigergraph/person.csv and /home/tigergraph/friendship.csv with their corresponding file paths respectively. Enter the following set of commands in the GSQL shell.

 In the default Docker installation described in Get Started with Docker, these files are in the folder /home/tigergraph/tutorial/3.x/gsql101/
USE GRAPH Social
BEGIN
DEFINE FILENAME file1="/home/tigergraph/tutorial/3.x/gsql101/person.csv";
DEFINE FILENAME file2="/home/tigergraph/tutorial/3.x/gsql101/friendship.csv";

LOAD file1 TO VERTEX Person VALUES ($"name",$"name", $"age",$"gender", $"state") USING header="true", separator=","; LOAD file2 TO EDGE Friendship VALUES ($0, $1,$2) USING header="true", separator=",";
}
END

Let’s walk through the commands:

• USE GRAPH Social: Tells GSQL which graph you want to work with.

• BEGIN …​ END: Indicates multiple-line mode. The GSQL shell will treat everything between these markers as a single statement. These are only needed for interactive mode. If you run GSQL statements that are stored in a command file, the command interpreter will study your whole file, so it doesn’t need the BEGIN and END hints.

• CREATE LOADING JOB: One loading job can describe the mappings from multiple files to multiple graph objects. Each file must be assigned to a filename variable. The field labels can be either by name or by position. By-name labelling requires a header line in the source file. By-position labelling uses integers to indicate source column position 0, 1,…​ In the example above, the first LOAD statement refers to the source file columns by name, whereas the second LOAD statement refers to the source file columns by position. Note the following details:

• The column "name" in file1 gets mapped to two fields, both the PRIMARY_ID and the "name" attribute of the Person vertex.

• In file1, gender comes before age. In the Person vertex, gender comes after age. When loading, state your attributes in the order needed by the target object (in this case, the Person vertex).

• Each LOAD statement has a USING clause. Here it tells GSQL that both files contain a header (whether we choose to use the names or not, GSQL still needs to know whether to consider the first line as data or not). It also says the column separator is a comma. GSQL can handle any single-character separator, not just commas.

When you run the CREATE LOADING JOB statement, GSQL checks for syntax errors and checks that you have data files in the locations specified. If it detects no errors, it compiles and saves your job.

GSQL shell
GSQL > USE GRAPH Social
Using graph 'Social'
GSQL > BEGIN
GSQL >    DEFINE FILENAME file1="/home/tigergraph/person.csv";
GSQL >    DEFINE FILENAME file2="/home/tigergraph/friendship.csv";
GSQL >
GSQL >    LOAD file1 TO VERTEX Person VALUES ($"name",$"name", $"age",$"gender", $"state") USING header="true", separator=","; GSQL > LOAD file2 TO EDGE Friendship VALUES ($0, $1,$2) USING header="true", separator=",";
GSQL > }
GSQL > END

GSQL command

The result is shown below.

GSQL shell
Starting the following job, i.e.