flatten_json_array()
This function parses a JSON array of primitive values or JSON objects, as well as JSON objects in a column of a CSV file.
When it flattens a JSON array of primitive values, it returns all the values in the array, each value creating its own record. When it flattens a JSON array of JSON objects, it returns all the values of fields specified in the parameters; each parameter is a column, and each object creates a record.
Syntax
-
flatten_json_array(array_name)
-
flatten_json_array (array_name, sub_obj_1, …, sub_obj_n)
-
flatten_json_array (column_name, sub_obj_1, …, sub_obj_n)
Parameters
The function can be used to flatten a JSON array of primitive values (strings, numbers and booleans), a JSON array of JSON objects or a JSON object in a CSV file.
When it flattens an array of primitive values, it only takes one parameter:
- array_name
-
Required. The name of the array field in the JSON object.
When it flattens an array of JSON objects, it takes multiple parameters:
- array_name
-
Required. The name of the array field in the JSON object.
- sub_obj_1…n
-
Required. The name of the fields in the JSON object.
When it flattens a of JSON object in a CSV column, it takes multiple parameters:
- array_name
-
Required. The name of the column containing the JSON object values.
- sub_obj_1…n
-
Required. The name of the fields in the JSON object.
Examples
Flatten a JSON array of primitive values
When loading JSON lines, each line is a JSON object.
If a value field of the JSON object is a JSON array of primitive values , you can use flatten_json_array()
to flatten the array:
For example, if we have the following JSON line file.
The plug-in
field is a JSON array of primitive values:
{"plug-ins" : ["C", "c++"],"encoding" : "UTF-6","indent" : { "length" : 3, "use_space": false}}
The function flattens the array, and loads each item in the array into a temporary table. Each value in the array creates a record.
CREATE VERTEX Encoding (PRIMARY_ID id STRING, length FLOAT DEFAULT 10)
CREATE UNDIRECTED EDGE Encoding_Edge (FROM encoding, TO encoding)
CREATE GRAPH Encoding_Graph (*)
CREATE LOADING JOB json_flatten FOR GRAPH Encoding_Graph {
LOAD "encoding2.json" TO TEMP_TABLE t2 (name, length)
VALUES (flatten_json_array($"plug-ins"), $"indent":"length") USING JSON_FILE ="true";
LOAD TEMP_TABLE t2
TO VERTEX encoding VALUES ($"name", $"length");
}
RUN LOADING JOB json_flatten
The data and loading job creates the following temporary table:
id | length |
---|---|
C |
3 |
c++ |
3 |
Flatten a JSON array of JSON objects
When loading JSON lines, each line is a JSON object.
If a value field of the JSON object is a JSON array of JSON objects , you can use flatten_json_array()
to flatten the array:
When splitting a JSON array of JSON objects, the primitive values in the array are skipped and only JSON objects are processed.
For example, if we have the following
JSON line file.
There is a plug-in
field in each JSON object, and it is an array.
{"encoding":"UTF-1","indent":{"use_space": "dontloadme"}, "plug-ins" : [null, true, false, {"lang":"golang","prop":{"age":"noidea"}}]}
{"encoding": "UTF-8", "plug-ins" : [{"lang": "pascal", "score":"1.0", "prop":{"age":"old"}}, {"lang":"c++", "score":2.0}],"indent":{"length" :12,"use_space": true}}
{"encoding": "UTF-7", "plug-ins" : [{"lang":"java", "score":2.22}, {"lang":"python", "score":3.0},{"lang":"go", "score":4.0, "prop":{"age":"new"}}],"indent" : { "length" : 30, "use_space": true }}
{"plug-ins" : ["C", "c++"],"encoding" : "UTF-6","indent" : { "length" : 3, "use_space": false}}
The flatten_json_array()
function in the destination clause returns three values to the three columns in the temporary table.
The lang
field corresponds to the name
column; the score
field corresponds to the score
column; the age
field of the prop
field corresponds to the prop_age
column.
These columns are subsequently loaded into to the vertices.
The fourth line in the original file does not generate any value, because the plug-in
field does not contain any JSON object.
CREATE VERTEX Encoding3 (PRIMARY_ID id STRING, score FLOAT default -1.0, age STRING default "Unknown", length INT default -1)
CREATE UNDIRECTED EDGE Encoding3_Edge (FROM Encoding3, TO Encoding3)
CREATE GRAPH Encoding_Graph (*)
CREATE LOADING JOB json_flatten_array FOR GRAPH Encoding_Graph {
LOAD "encoding3.json" TO TEMP_TABLE t3 (name, score, prop_age, indent_length)
VALUES (flatten_json_array($"plug-ins", $"lang", $"score", $"prop":"age"), $"indent":"length")
USING JSON_FILE="true";
LOAD TEMP_TABLE t3
TO VERTEX encoding3 VALUES ($"name", $"score", $"prop_age", $"indent_length");
}
RUN LOADING JOB json_flatten_array
Flatten a JSON object in a CSV file
The function can also be used to flatten a JSON object in a CSV file.
If we have the following CSV file. The second column in the CSV file is a JSON object we want to flatten.
golang|{"prop":{"age":"noidea"}}
pascal|{"score":"1.0", "prop":{"age":"old"}}
c++|{"score":2.0, "indent":{"length":12, "use_space": true}}
java|{"score":2.22, "prop":{"age":"new"}, "indent":{"use_space":"true", "length":2}}
python|{ "prop":{"compiled":"false"}, "indent":{"length":4}, "score":3.0}
go|{"score":4.0, "prop":{"age":"new"}}
flatten_json_array()
can be used in this case without the USING JSON_FILE="true"
clause:
CREATE VERTEX Encoding3 (PRIMARY_ID id STRING, score FLOAT default -1.0, age STRING default "Unknown", length INT default -1)
CREATE UNDIRECTED EDGE Encoding3_Edge (FROM Encoding3, TO Encoding3)
CREATE GRAPH Encoding_Graph (*)
CREATE LOADING JOB json_flatten_cvs FOR GRAPH Encoding_Graph {
LOAD "encoding.csv" TO TEMP_TABLE t4 (name, score, prop_age, indent_length)
VALUES ($0,flatten_json_array($1, $"score", $"prop":"age", $"indent":"length"))
USING SEPARATOR="|";
LOAD TEMP_TABLE t4
TO VERTEX Encoding3 VALUES ($"name", $"score", $"prop_age", $"indent_length");
}
RUN LOADING JOB json_flatten_cvs
The example generates the temporary table shown below:
id | score | age | length |
---|---|---|---|
golang |
-1 (default) |
noidea |
-1 (default) |
pascal |
1 |
old |
-1 (default) |
c++ |
2 |
unknown (default) |
12 |
java |
2.22 |
new |
2 |
python |
3 |
unknown (default) |
4 |
go |
4 |
new |
-1 (default) |