flatten_json_array()

This function parses a JSON array of primitive values or JSON objects, as well as JSON objects in a column of a CSV file.

When it flattens a JSON array of primitive values, it returns all the values in the array, each value creating its own record. When it flattens a JSON array of JSON objects, it returns all the values of fields specified in the parameters; each parameter is a column, and each object creates a record.

Syntax

flatten_json_array(array_name)
flatten_json_array (array_name, sub_obj_1, …, sub_obj_n)
flatten_json_array (column_name, sub_obj_1, …, sub_obj_n)

Parameters

The function can be used to flatten a JSON array of primitive values (strings, numbers and booleans), a JSON array of JSON objects or a JSON object in a CSV file.

When it flattens an array of primitive values, it only takes one parameter:

array_name: Required. The name of the array field in the JSON object.

When it flattens an array of JSON objects, it takes multiple parameters:

array_name: Required. The name of the array field in the JSON object.
sub_obj_1…n: Required. The name of the fields in the JSON object.

When it flattens a of JSON object in a CSV column, it takes multiple parameters:

array_name: Required. The name of the column containing the JSON object values.
sub_obj_1…n: Required. The name of the fields in the JSON object.

Examples

Flatten a JSON array of primitive values

When loading JSON lines, each line is a JSON object. If a value field of the JSON object is a JSON array of primitive values , you can use flatten_json_array() to flatten the array:

Data file
Loading job
Temporary table

For example, if we have the following JSON line file. The plug-in field is a JSON array of primitive values:

encoding2.json

{"plug-ins" : ["C", "c++"],"encoding" : "UTF-6","indent" : { "length" : 3, "use_space": false}}

The function flattens the array, and loads each item in the array into a temporary table. Each value in the array creates a record.

CREATE VERTEX Encoding (PRIMARY_ID id STRING, length FLOAT DEFAULT 10)
CREATE UNDIRECTED EDGE Encoding_Edge (FROM encoding, TO encoding)
CREATE GRAPH Encoding_Graph (*)

CREATE LOADING JOB json_flatten FOR GRAPH Encoding_Graph {
    LOAD "encoding2.json" TO TEMP_TABLE t2 (name, length)
      VALUES (flatten_json_array($"plug-ins"), $"indent":"length") USING JSON_FILE ="true";
    LOAD TEMP_TABLE t2
      TO VERTEX encoding VALUES ($"name", $"length");
}
RUN LOADING JOB json_flatten

The data and loading job creates the following temporary table:

id	length
C	3
c++	3

length

c++

Flatten a JSON array of JSON objects

When loading JSON lines, each line is a JSON object. If a value field of the JSON object is a JSON array of JSON objects , you can use flatten_json_array() to flatten the array:

When splitting a JSON array of JSON objects, the primitive values in the array are skipped and only JSON objects are processed.

Data file
Loading job

For example, if we have the following JSON line file. There is a plug-in field in each JSON object, and it is an array.

encoding3.json

{"encoding":"UTF-1","indent":{"use_space": "dontloadme"}, "plug-ins" : [null, true, false, {"lang":"golang","prop":{"age":"noidea"}}]}
{"encoding": "UTF-8", "plug-ins" : [{"lang": "pascal", "score":"1.0", "prop":{"age":"old"}}, {"lang":"c++", "score":2.0}],"indent":{"length" :12,"use_space": true}}
{"encoding": "UTF-7",  "plug-ins" : [{"lang":"java", "score":2.22}, {"lang":"python", "score":3.0},{"lang":"go", "score":4.0, "prop":{"age":"new"}}],"indent" : { "length" : 30, "use_space": true }}
{"plug-ins" : ["C", "c++"],"encoding" : "UTF-6","indent" : { "length" : 3, "use_space": false}}

The flatten_json_array() function in the destination clause returns three values to the three columns in the temporary table. The lang field corresponds to the name column; the score field corresponds to the score column; the age field of the prop field corresponds to the prop_age column. These columns are subsequently loaded into to the vertices.

The fourth line in the original file does not generate any value, because the plug-in field does not contain any JSON object.

json_flatten_array_test.gsql

CREATE VERTEX Encoding3 (PRIMARY_ID id STRING, score FLOAT default -1.0, age STRING default "Unknown", length INT default -1)
CREATE UNDIRECTED EDGE Encoding3_Edge (FROM Encoding3, TO Encoding3)
CREATE GRAPH Encoding_Graph (*)

CREATE LOADING JOB json_flatten_array FOR GRAPH Encoding_Graph {
  LOAD "encoding3.json" TO TEMP_TABLE t3 (name, score, prop_age, indent_length)
    VALUES (flatten_json_array($"plug-ins", $"lang", $"score", $"prop":"age"), $"indent":"length")
    USING JSON_FILE="true";
  LOAD TEMP_TABLE t3
    TO VERTEX encoding3 VALUES ($"name", $"score", $"prop_age", $"indent_length");
}
RUN LOADING JOB json_flatten_array

Flatten a JSON object in a CSV file

The function can also be used to flatten a JSON object in a CSV file.

Data file
Loading job
Temporary table

If we have the following CSV file. The second column in the CSV file is a JSON object we want to flatten.

encoding.csv

golang|{"prop":{"age":"noidea"}}
pascal|{"score":"1.0", "prop":{"age":"old"}}
c++|{"score":2.0, "indent":{"length":12, "use_space": true}}
java|{"score":2.22, "prop":{"age":"new"}, "indent":{"use_space":"true", "length":2}}
python|{ "prop":{"compiled":"false"}, "indent":{"length":4}, "score":3.0}
go|{"score":4.0, "prop":{"age":"new"}}

flatten_json_array() can be used in this case without the USING JSON_FILE="true" clause:

json_flatten_cvs.gsql

CREATE VERTEX Encoding3 (PRIMARY_ID id STRING, score FLOAT default -1.0, age STRING default "Unknown", length INT default -1)
CREATE UNDIRECTED EDGE Encoding3_Edge (FROM Encoding3, TO Encoding3)
CREATE GRAPH Encoding_Graph (*)

CREATE LOADING JOB json_flatten_cvs FOR GRAPH Encoding_Graph {
  LOAD "encoding.csv" TO TEMP_TABLE t4 (name, score, prop_age, indent_length)
    VALUES ($0,flatten_json_array($1, $"score", $"prop":"age", $"indent":"length"))
    USING SEPARATOR="|";
  LOAD TEMP_TABLE t4
    TO VERTEX Encoding3 VALUES ($"name", $"score", $"prop_age", $"indent_length");
}
RUN LOADING JOB json_flatten_cvs

The example generates the temporary table shown below:

id	score	age	length
golang	-1 (default)	noidea	-1 (default)
pascal	1	old	-1 (default)
c++	2	unknown (default)	12
java	2.22	new	2
python	3	unknown (default)	4
go	4	new	-1 (default)

score

age

length

golang

-1 (default)

noidea

-1 (default)

pascal

old

-1 (default)

c++

unknown (default)

java

2.22

new

python

unknown (default)

new

-1 (default)