k-Nearest Neighbors (Batch Version)

This algorithm is the batch version of k-Nearest Neighbors, Cosine Neighbor Similarity (single vertex). It predicts a label for every vertex whose label is not known (i.e., whose known-label attribute is empty), based on the labels of its k nearest neighbors.
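The idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the GSQL implementation: it assumes that similarity is the cosine of two vertices' edge-weight vectors and that the prediction is a majority vote among the top k most similar labeled vertices. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two sparse weight vectors (dict: neighbor -> weight)."""
    dot = sum(w * b[n] for n, w in a.items() if n in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_cosine_all(edges, labels, top_k):
    """Predict a label for every vertex whose known label is empty."""
    labeled = sorted(v for v, lab in labels.items() if lab)
    predictions = {}
    for v, lab in labels.items():
        if lab:
            continue  # label already known, nothing to predict
        scored = [(cosine_similarity(edges.get(v, {}), edges.get(u, {})), u)
                  for u in labeled]
        # Keep only labeled vertices that share at least one neighbor with v.
        scored = [(s, u) for s, u in scored if s > 0]
        # Highest similarity first; ties broken by vertex id for determinism.
        scored.sort(key=lambda p: (-p[0], p[1]))
        votes = Counter(labels[u] for _, u in scored[:top_k])
        predictions[v] = votes.most_common(1)[0][0] if votes else ""
    return predictions


# Hypothetical toy data: person -> {movie: rating}
edges = {
    "u1": {"m1": 5.0, "m2": 4.0},
    "u2": {"m1": 4.0, "m2": 5.0},
    "u3": {"m3": 5.0},
    "u4": {"m1": 5.0, "m2": 5.0},  # unlabeled
    "u5": {"m4": 1.0},             # unlabeled, shares no movie with anyone
}
labels = {"u1": "a", "u2": "a", "u3": "b", "u4": "", "u5": ""}

print(knn_cosine_all(edges, labels, top_k=3))
```

In this sketch, a vertex that shares no neighbor with any labeled vertex receives an empty prediction.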

Specifications

tg_knn_cosine_all(SET<STRING> v_type, SET<STRING> e_type, SET<STRING> re_type,
  STRING weight, STRING label, INT top_k, BOOL print_accum = TRUE,
  STRING file_path = "", STRING attr = "")

Parameters

| Parameter | Description | Default Value |
|---|---|---|
| SET<STRING> v_type | The vertex types to calculate the distance for. | (empty set of strings) |
| SET<STRING> e_type | The edge types to use. | (empty set of strings) |
| SET<STRING> re_type | The reverse edge types to use. | (empty set of strings) |
| STRING weight | If not empty, use this edge attribute as the edge weight. | (empty string) |
| STRING label | If not empty, read an existing label from this attribute. | (empty string) |
| INT top_k | The number of nearest neighbors to consider. | N/A |
| BOOL print_accum | If true, print output in JSON format to the standard output. | TRUE |
| STRING file_path | If not empty, write output to this file. | (empty string) |
| STRING attr | If not empty, store the predicted label to this vertex attribute. | (empty string) |

Output

Returns the predicted label for the vertices whose label attribute is empty.

The result is available in three forms:

  • streamed out in JSON format

  • written to a file in tabular format, or

  • stored as a vertex attribute value.

The result size is equal to $V$, the number of vertices in the graph.

Time complexity

This algorithm has a complexity of $O(E^2 / V)$, where $E$ is the number of edges and $V$ is the number of vertices.
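One informal way to read this bound is sketched below. It rests on two assumptions that are not stated in this section: that each vertex is compared only with vertices reachable in two hops (those that share a neighbor with it), and that the average degree is roughly $E / V$.

```latex
\underbrace{V}_{\text{vertices}}
\times
\underbrace{\left(\frac{E}{V}\right)^{2}}_{\text{two-hop work per vertex}}
= \frac{E^{2}}{V}
```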

Example

For the movie graph shown in the single vertex version, run tg_knn_cosine_all with top_k = 3. The result is:

[
  {
    "Source": [
      {
        "v_id": "Jing",
        "v_type": "Person",
        "attributes": {
          "name": "Jing",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": "a"
        }
      },
      {
        "v_id": "Neil",
        "v_type": "Person",
        "attributes": {
          "name": "Neil",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": "b"
        }
      },
      {
        "v_id": "Elena",
        "v_type": "Person",
        "attributes": {
          "name": "Elena",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": ""
        }
      }
    ]
  }
]
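A result of this shape can be reduced to a simple vertex-to-label mapping. The following is a minimal sketch, assuming the response is a JSON array of objects keyed by "Source" and that the prediction is held in the "@predicted_label" field, as in the example above. The helper name predicted_labels is hypothetical.

```python
import json

# The example response from above, abbreviated to the fields used here.
raw = """
[
  {
    "Source": [
      {"v_id": "Jing", "v_type": "Person",
       "attributes": {"name": "Jing", "known_label": "", "@predicted_label": "a"}},
      {"v_id": "Neil", "v_type": "Person",
       "attributes": {"name": "Neil", "known_label": "", "@predicted_label": "b"}},
      {"v_id": "Elena", "v_type": "Person",
       "attributes": {"name": "Elena", "known_label": "", "@predicted_label": ""}}
    ]
  }
]
"""


def predicted_labels(response, key="Source", attr="@predicted_label"):
    """Map each vertex id to its predicted label (empty string if none)."""
    result = {}
    for block in response:
        for vertex in block.get(key, []):
            result[vertex["v_id"]] = vertex["attributes"].get(attr, "")
    return result


labels_by_vertex = predicted_labels(json.loads(raw))
print(labels_by_vertex)
```

An empty string in the mapping means no label was predicted for that vertex, as for Elena in the example.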