k-Nearest Neighbors (Cross-Validation Version)
k-Nearest Neighbors (kNN) is a widely used machine learning algorithm that predicts a vertex's label from the labels of its k most similar neighbors.
You can choose the value of k (topK)
based on your experience, or use cross-validation to optimize this hyperparameter.
Our library provides leave-one-out cross-validation for selecting the optimal k. Given a value of k, we run the algorithm repeatedly, using each vertex with a known label in turn as the source vertex and predicting its label from the other labeled vertices. We assess the accuracy of these predictions, and then repeat the process for every value of k in the given range.
The goal is to find the value of k in the given range that yields the highest prediction accuracy for that dataset.
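The leave-one-out procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: each vertex is represented as a dictionary mapping neighbor IDs to edge weights, similarity is cosine similarity over those dictionaries, prediction is a simple majority vote, and when several values of k tie on accuracy the smallest one is kept. The function names `cosine`, `predict`, `loo_accuracy`, and `select_k` are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {neighbor: weight} dicts."""
    dot = sum(w * b[n] for n, w in a.items() if n in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(source, k, features, labels):
    """Majority vote among the k labeled vertices most similar to `source`.

    The source vertex itself is left out, which is the "leave-one-out" step.
    """
    others = [v for v in labels if v != source]
    others.sort(key=lambda v: cosine(features[source], features[v]), reverse=True)
    votes = Counter(labels[v] for v in others[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(k, features, labels):
    """Fraction of labeled vertices whose label is predicted correctly for this k."""
    correct = sum(predict(v, k, features, labels) == labels[v] for v in labels)
    return correct / len(labels)


def select_k(features, labels, min_k, max_k):
    """Return (best_k, accuracy) over the inclusive range [min_k, max_k].

    Assumption: on ties, the smallest k is kept.
    """
    best, best_acc = min_k, -1.0
    for k in range(min_k, max_k + 1):
        acc = loo_accuracy(k, features, labels)
        if acc > best_acc:
            best, best_acc = k, acc
    return best, best_acc
```

Because every labeled vertex is used as a source once per value of k, the cost grows with both the number of labeled vertices and the width of the k range, so a narrow range is the cheaper choice on large graphs.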
Specifications
tg_knn_cosine_cv( SET<STRING> v_type_set, SET<STRING> e_type_set, SET<STRING> reverse_e_type_set,
STRING weight_attribute, STRING label, INT min_k, INT max_k) RETURNS (INT)
Parameters
Parameter                       Description                                                           Default Value
SET<STRING> v_type_set          The vertex types to calculate the distance to the source vertex for.  (empty set of strings)
SET<STRING> e_type_set          The edge types to use.                                                (empty set of strings)
SET<STRING> reverse_e_type_set  The reverse edge types to use.                                        (empty set of strings)
STRING weight_attribute         If not empty, use this edge attribute as the edge weight.             (empty string)
STRING label                    If not empty, read an existing label from this attribute.             (empty string)
INT min_k                       The lower bound of k (inclusive).                                     N/A
INT max_k                       The upper bound of k (inclusive).                                     N/A
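As a rough illustration of how these parameters fit together, the following Python sketch assembles and sanity-checks an argument dictionary keyed by the parameter names in the specification. The helper `build_knn_cv_params` is hypothetical, as are the vertex type, edge type, and attribute names in the usage lines; the checks on `min_k` and `max_k` are assumptions based on both bounds being inclusive, not documented behavior of the query.

```python
def build_knn_cv_params(v_types, e_types, reverse_e_types, label,
                        min_k, max_k, weight_attribute=""):
    """Assemble arguments for tg_knn_cosine_cv as a plain dictionary.

    Hypothetical helper: it only validates and packages values and does
    not contact a database.
    """
    if min_k < 1:
        raise ValueError("min_k must be at least 1")
    if max_k < min_k:
        raise ValueError("max_k must be greater than or equal to min_k")
    return {
        # Set-valued parameters are stored as sorted lists for stable output.
        "v_type_set": sorted(v_types),
        "e_type_set": sorted(e_types),
        "reverse_e_type_set": sorted(reverse_e_types),
        "weight_attribute": weight_attribute,  # empty string means unweighted
        "label": label,
        "min_k": min_k,
        "max_k": max_k,
    }


# Hypothetical schema names, for illustration only.
params = build_knn_cv_params(
    v_types={"Person"},
    e_types={"Likes"},
    reverse_e_types={"Reverse_Likes"},
    label="known_label",
    min_k=1,
    max_k=5,
)
```

A dictionary like this could then be passed to whichever client you use to run installed queries; how set-valued parameters are encoded depends on that client and is not shown here.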