1 of 1

k-Nearest Neighbors (Cross-Validation Version)

k-Nearest Neighbors (kNN) is often used for machine learning. You can choose the value for topK based on your experience, or using cross-validation to optimize the hyperparameters. In our library, Leave-one-out cross-validation for selecting optimal k is provided. Given a k value, we run the algorithm repeatedly using every vertex with a known label as the source vertex and predict its label. We assess the accuracy of the predictions for each value of k, and then repeat for different values of k in the given range. The goal is to find the value of k with highest predicting accuracy in the given range, for that dataset.

Specifications

tg_knn_cosine_cv(SET<STRING> v_type, SET<STRING> e_type, SET<STRING> re_type, 
STRING weight, STRING label, INT min_k, INT max_k) RETURNS (INT)

Example

Run knn_cosine_cv with min_k=2, max_k = 5. The JOSN result:

[
  {
    "@@correct_rate_list": [
      0.33333,
      0.33333,
      0.33333,
      0.33333
    ]
  },
  {
    "best_k": 2
  }
]

k-Nearest Neighbors (Cross-Validation Version)

Specifications

tg_knn_cosine_cv(SET<STRING> v_type, SET<STRING> e_type, SET<STRING> re_type, 
STRING weight, STRING label, INT min_k, INT max_k) RETURNS (INT)

Example

Run knn_cosine_cv with min_k=2, max_k = 5. The JOSN result:

[
  {
    "@@correct_rate_list": [
      0.33333,
      0.33333,
      0.33333,
      0.33333
    ]
  },
  {
    "best_k": 2
  }
]

SET<STRING> v_type: Vertex types to calculate distance to source vertex for
SET<STRING> e_type: Edge types to traverse
SET<STRING> re_type: Reverse edge types to traverse
STRING weight: Edge attribute to use as the weight of the edge
STRING label: Vertex attribute to recognize as the label of the vertex
INT min_k: lower bound of k (inclusive)
INT max_k: upper bound of k (inclusive)