Summary: | This paper analyzes how the choice of distance metric affects k-Nearest Neighbor (KNN) classification of agarwood oil compounds. KNN models were developed while varying the distance metric. The input is the abundances (%) of agarwood oil compounds, and the output is the agarwood oil quality, either high or low. The data were divided into training and testing datasets at a ratio of 80% to 20%. The training dataset was used to develop KNN models for K = 1 through K = 5, and the testing dataset was used to test the developed models. During training, the distance metric was varied among Euclidean, City-block, Cosine, and Correlation, and the performance of each was recorded. All analyses were performed in MATLAB R2014b. Among the four distance metrics, Euclidean and City-block yielded 100% accuracy on both the training and testing datasets, followed by Cosine and Correlation with 89.5% accuracy. The accuracy of every distance metric was above 80%, indicating a good KNN model. This finding demonstrates the capability of KNN to differentiate agarwood oil compounds into high and low quality. The results contribute to further research on agarwood oil grading systems. © 2017 IEEE.
|
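The procedure described in the summary (a KNN classifier evaluated for K = 1 to 5 under four distance metrics) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB R2014b implementation. The function names, the three-feature abundance vectors, and the labels below are hypothetical and chosen only for illustration. Cosine and correlation distance are taken here as one minus the cosine similarity and one minus the Pearson correlation, which is an assumption about the definitions used in the study.

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two abundance vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cityblock(a, b):
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine(a, b):
    # One minus the cosine of the angle between the vectors.
    # Undefined for an all-zero vector (division by zero).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def correlation(a, b):
    # One minus the Pearson correlation: cosine distance of the
    # mean-centred vectors. Undefined for a constant vector.
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])

METRICS = {
    "euclidean": euclidean,
    "cityblock": cityblock,
    "cosine": cosine,
    "correlation": correlation,
}

def knn_predict(train_x, train_y, query, k, metric):
    # Rank training samples by distance to the query, then take a
    # majority vote over the labels of the k nearest samples.
    dist = METRICS[metric]
    ranked = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def accuracy(train_x, train_y, test_x, test_y, k, metric):
    # Fraction of test samples whose predicted label matches the truth.
    hits = sum(
        knn_predict(train_x, train_y, x, k, metric) == y
        for x, y in zip(test_x, test_y)
    )
    return hits / len(test_x)

# Hypothetical abundances (%) of three compounds; NOT data from the study.
TRAIN_X = [
    [12.0, 3.0, 0.5], [11.0, 3.5, 0.8], [13.0, 2.5, 0.4],
    [2.0, 8.0, 5.0], [1.5, 9.0, 4.5], [2.5, 7.5, 5.5],
]
TRAIN_Y = ["high", "high", "high", "low", "low", "low"]
TEST_X = [[12.5, 3.2, 0.6], [2.2, 8.2, 4.8]]
TEST_Y = ["high", "low"]

if __name__ == "__main__":
    # Sweep the metric and K as the summary describes.
    for name in METRICS:
        for k in range(1, 6):
            acc = accuracy(TRAIN_X, TRAIN_Y, TEST_X, TEST_Y, k, name)
            print(f"{name:12s} K={k} accuracy={acc:.2f}")
```

The toy samples are deliberately well separated, so the sketch says nothing about the accuracies reported in the summary. It only shows how the metric and K enter the classifier as interchangeable parameters.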