k-Nearest Neighbors (Batch Version)

This algorithm is a batch version of the k-Nearest Neighbors, Cosine Neighbor Similarity, single vertex. It makes a prediction for every vertex whose label is not known (i.e., the attribute for the known label is empty), based on its k nearest neighbors' labels.

Specifications

tg_knn_cosine_all( SET<STRING> v_type_set,  SET<STRING> e_type_set, SET<STRING> reverse_e_type_set,
  STRING weight_attribute, STRING label, INT top_k, BOOL print_results = TRUE,
  STRING file_path = "", STRING attr = "")

Parameters

Parameter Description Default Value

SET<STRING> v_type_set

The vertex types to calculate the distance for.

(empty set of strings)

SET<STRING> e_type_set

The edge types to use

(empty set of strings)

SET<STRING> reverse_e_type_set

The reverse edge types to use

(empty set of strings)

STRING weight_attribute

If not empty, use this edge attribute as the edge weight.

(empty string)

STRING label

If not empty, read an existing label from this attribute.

(empty string)

INT top_k

The number of nearest neighbors to consider

N/A

BOOL print_results

If true, print output in JSON format to the standard output.

True

STRING filepath

If not empty, write output to this file.

(empty string)

STRING attr

If not empty, store the predicted label to this vertex attribute.

(empty string)

Output

Returns the predicted label for the vertices whose label attribute is empty.

The result is available in three forms:

  • streamed out in JSON format

  • written to a file in tabular format, or

  • stored as a vertex attribute value.

The result size is equal to \$V\$, the number of vertices in the graph.

Time complexity

This algorithm has a complexity of \$O(E^2 / V)\$, where \$E\$ is the number of edges and \$V\$ is the number of vertices.

Example

For the movie graph shown in the single vertex version, run knn_cosine_all, using topK=3. Then you get the following result:

  {
    "Source": [
      {
        "v_id": "Jing",
        "v_type": "Person",
        "attributes": {
          "name": "Jing",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": "a"
        }
      },
      {
        "v_id": "Neil",
        "v_type": "Person",
        "attributes": {
          "name": "Neil",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": "b"
        }
      },
      {
        "v_id": "Elena",
        "v_type": "Person",
        "attributes": {
          "name": "Elena",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": ""
        }
      }
    ]
  }
]