flatten_json_array()

This function parses a JSON array of primitive values or JSON objects, as well as JSON objects in a column of a CSV file.

When it flattens a JSON array of primitive values, it returns all the values in the array, each value creating its own record. When it flattens a JSON array of JSON objects, it returns all the values of fields specified in the parameters; each parameter is a column, and each object creates a record.

Syntax

  • flatten_json_array(array_name)

  • flatten_json_array (array_name, sub_obj_1, …, sub_obj_n)

  • flatten_json_array (column_name, sub_obj_1, …, sub_obj_n)

Parameters

The function can be used to flatten a JSON array of primitive values (strings, numbers and booleans), a JSON array of JSON objects or a JSON object in a CSV file.

When it flattens an array of primitive values, it only takes one parameter:

array_name

Required. The name of the array field in the JSON object.

When it flattens an array of JSON objects, it takes multiple parameters:

array_name

Required. The name of the array field in the JSON object.

sub_obj_1…​n

Required. The name of the fields in the JSON object.

When it flattens a of JSON object in a CSV column, it takes multiple parameters:

array_name

Required. The name of the column containing the JSON object values.

sub_obj_1…​n

Required. The name of the fields in the JSON object.

Examples

Flatten a JSON array of primitive values

When loading JSON lines, each line is a JSON object. If a value field of the JSON object is a JSON array of primitive values , you can use flatten_json_array() to flatten the array:

  • Data file

  • Loading job

  • Temporary table

For example, if we have the following JSON line file. The plug-in field is a JSON array of primitive values:

encoding2.json
{"plug-ins" : ["C", "c++"],"encoding" : "UTF-6","indent" : { "length" : 3, "use_space": false}}

The function flattens the array, and loads each item in the array into a temporary table. Each value in the array creates a record.

CREATE VERTEX Encoding (PRIMARY_ID id STRING, length FLOAT DEFAULT 10)
CREATE UNDIRECTED EDGE Encoding_Edge (FROM encoding, TO encoding)
CREATE GRAPH Encoding_Graph (*)

CREATE LOADING JOB json_flatten FOR GRAPH Encoding_Graph {
    LOAD "encoding2.json" TO TEMP_TABLE t2 (name, length)
      VALUES (flatten_json_array($"plug-ins"), $"indent":"length") USING JSON_FILE ="true";
    LOAD TEMP_TABLE t2
      TO VERTEX encoding VALUES ($"name", $"length");
}
RUN LOADING JOB json_flatten

The data and loading job creates the following temporary table:

id length

C

3

c++

3

Flatten a JSON array of JSON objects

When loading JSON lines, each line is a JSON object. If a value field of the JSON object is a JSON array of JSON objects , you can use flatten_json_array() to flatten the array:

When splitting a JSON array of JSON objects, the primitive values in the array are skipped and only JSON objects are processed.

  • Data file

  • Loading job

For example, if we have the following JSON line file. There is a plug-in field in each JSON object, and it is an array.

encoding3.json
{"encoding":"UTF-1","indent":{"use_space": "dontloadme"}, "plug-ins" : [null, true, false, {"lang":"golang","prop":{"age":"noidea"}}]}
{"encoding": "UTF-8", "plug-ins" : [{"lang": "pascal", "score":"1.0", "prop":{"age":"old"}}, {"lang":"c++", "score":2.0}],"indent":{"length" :12,"use_space": true}}
{"encoding": "UTF-7",  "plug-ins" : [{"lang":"java", "score":2.22}, {"lang":"python", "score":3.0},{"lang":"go", "score":4.0, "prop":{"age":"new"}}],"indent" : { "length" : 30, "use_space": true }}
{"plug-ins" : ["C", "c++"],"encoding" : "UTF-6","indent" : { "length" : 3, "use_space": false}}

The flatten_json_array() function in the destination clause returns three values to the three columns in the temporary table. The lang field corresponds to the name column; the score field corresponds to the score column; the age field of the prop field corresponds to the prop_age column. These columns are subsequently loaded into to the vertices.

The fourth line in the original file does not generate any value, because the plug-in field does not contain any JSON object.

json_flatten_array_test.gsql
CREATE VERTEX Encoding3 (PRIMARY_ID id STRING, score FLOAT default -1.0, age STRING default "Unknown", length INT default -1)
CREATE UNDIRECTED EDGE Encoding3_Edge (FROM Encoding3, TO Encoding3)
CREATE GRAPH Encoding_Graph (*)

CREATE LOADING JOB json_flatten_array FOR GRAPH Encoding_Graph {
  LOAD "encoding3.json" TO TEMP_TABLE t3 (name, score, prop_age, indent_length)
    VALUES (flatten_json_array($"plug-ins", $"lang", $"score", $"prop":"age"), $"indent":"length")
    USING JSON_FILE="true";
  LOAD TEMP_TABLE t3
    TO VERTEX encoding3 VALUES ($"name", $"score", $"prop_age", $"indent_length");
}
RUN LOADING JOB json_flatten_array

Flatten a JSON object in a CSV file

The function can also be used to flatten a JSON object in a CSV file.

  • Data file

  • Loading job

  • Temporary table

If we have the following CSV file. The second column in the CSV file is a JSON object we want to flatten.

encoding.csv
golang|{"prop":{"age":"noidea"}}
pascal|{"score":"1.0", "prop":{"age":"old"}}
c++|{"score":2.0, "indent":{"length":12, "use_space": true}}
java|{"score":2.22, "prop":{"age":"new"}, "indent":{"use_space":"true", "length":2}}
python|{ "prop":{"compiled":"false"}, "indent":{"length":4}, "score":3.0}
go|{"score":4.0, "prop":{"age":"new"}}

flatten_json_array() can be used in this case without the USING JSON_FILE="true" clause:

json_flatten_cvs.gsql
CREATE VERTEX Encoding3 (PRIMARY_ID id STRING, score FLOAT default -1.0, age STRING default "Unknown", length INT default -1)
CREATE UNDIRECTED EDGE Encoding3_Edge (FROM Encoding3, TO Encoding3)
CREATE GRAPH Encoding_Graph (*)

CREATE LOADING JOB json_flatten_cvs FOR GRAPH Encoding_Graph {
  LOAD "encoding.csv" TO TEMP_TABLE t4 (name, score, prop_age, indent_length)
    VALUES ($0,flatten_json_array($1, $"score", $"prop":"age", $"indent":"length"))
    USING SEPARATOR="|";
  LOAD TEMP_TABLE t4
    TO VERTEX Encoding3 VALUES ($"name", $"score", $"prop_age", $"indent_length");
}
RUN LOADING JOB json_flatten_cvs

The example generates the temporary table shown below:

id score age length

golang

-1 (default)

noidea

-1 (default)

pascal

1

old

-1 (default)

c++

2

unknown (default)

12

java

2.22

new

2

python

3

unknown (default)

4

go

4

new

-1 (default)