# k-Nearest Neighbors (Cross-Validation Version)

k-Nearest Neighbors (kNN) is often used for machine learning. You can choose the value for `topK` based on your experience, or using cross-validation to optimize the hyperparameters. In our library, Leave-one-out cross-validation for selecting optimal k is provided. Given a k value, we run the algorithm repeatedly using every vertex with a known label as the source vertex and predict its label. We assess the accuracy of the predictions for each value of k, and then repeat for different values of k in the given range. The goal is to find the value of k with highest predicting accuracy in the given range, for that dataset.

## Specifications

``````tg_knn_cosine_cv(SET<STRING> v_type, SET<STRING> e_type, SET<STRING> re_type,
STRING weight, STRING label, INT min_k, INT max_k) RETURNS (INT)``````

### Time complexity

This algorithm has a complexity of $O(max\_k*\frac{E^2}{V})$, where $E$ is the number of edges and $V$ is the number of vertices.

Characteristic Value

Result

A list of prediction accuracy for every k value in the given range, and

the value of k with the highest predicting accuracy in the given range.

The result is available in JSON format

Input Parameters

• `SET<STRING> v_type`: Vertex types to calculate distance to source vertex for

• `SET<STRING> e_type`: Edge types to traverse

• `SET<STRING> re_type`: Reverse edge types to traverse

• `STRING weight`: Edge attribute to use as the weight of the edge

• `STRING label`: Vertex attribute to recognize as the label of the vertex

• `INT min_k`: lower bound of k (inclusive)

• `INT max_k`: upper bound of k (inclusive)

Result Size

max_k-min_k+1

Graph Types

Undirected or directed edges, weighted edges

## Example

Run knn_cosine_cv with min_k=2, max_k = 5. The JOSN result:

``````[
{
"@@correct_rate_list": [
0.33333,
0.33333,
0.33333,
0.33333
]
},
{
"best_k": 2
}
]``````