k-Nearest Neighbors (Batch Version)

This algorithm is a batch version of the k-Nearest Neighbors, Cosine Neighbor Similarity, single vertex. It makes a prediction for every vertex whose label is not known (i.e., the attribute for the known label is empty), based on its k nearest neighbors' labels.

Specifications

tg_knn_cosine_all(SET<STRING> v_type, SET<STRING> e_type, SET<STRING> re_type,
  STRING weight, STRING label, INT top_k, BOOL print_accum = TRUE,
  STRING file_path = "", STRING attr = "")

Time complexity

This algorithm has a complexity of \$O(E^2 / V)\$, where \$E\$ is the number of edges and \$V\$ is the number of vertices.

Characteristic Value

Result

The predicted label for the vertices whose label attribute is empty.

The result is available in three forms:

  • streamed out in JSON format

  • written to a file in tabular format, or

  • stored as a vertex attribute value.

Input Parameters

  • SET<STRING> v_type: Vertex types to calculate distance to source vertex for

  • SET<STRING> e_type: Edge types to traverse

  • SET<STRING> re_type: Reverse edge types to traverse

  • STRING weight: Edge attribute to use as the weight of the edge

  • STRING label: Vertex attribute to recognize as the label of the vertex

  • INT top_k: number of nearest neighbors to consider

  • BOOL print_accum: Boolean value that indicates whether to output to console in JSON

  • STRING filepath: If provided, the algorithm will output to this file path in CSV format

  • STRING attr: Vertex attribute to save the predicted label as.

Result Size

V = number of vertices

Graph Types

Undirected or directed edges, weighted edges

Example

For the movie graph shown in the single vertex version, run knn_cosine_all, using topK=3. Then you get the following result:

  {
    "Source": [
      {
        "v_id": "Jing",
        "v_type": "Person",
        "attributes": {
          "name": "Jing",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": "a"
        }
      },
      {
        "v_id": "Neil",
        "v_type": "Person",
        "attributes": {
          "name": "Neil",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": "b"
        }
      },
      {
        "v_id": "Elena",
        "v_type": "Person",
        "attributes": {
          "name": "Elena",
          "known_label": "",
          "predicted_label": "",
          "@predicted_label": ""
        }
      }
    ]
  }
]