KNNOR: An Oversampling Technique for Imbalanced Datasets

Published in Applied Soft Computing, 2022

This paper presents an approach to class imbalance in datasets, a problem that degrades the predictive performance of Machine Learning (ML) models. The proposed K-Nearest Neighbor OveRsampling approach (KNNOR) uses each minority data point and its k nearest neighbors to characterize the compactness of the minority class and to generate synthetic data points. This mitigates the effect of noise and outliers while producing a balanced representation of the classes.
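The idea of generating synthetic points from a minority point and its k nearest neighbors can be illustrated with a minimal sketch. This is not the KNNOR algorithm as published and not the API of the KNNOR library. It is a simplified neighbor-interpolation oversampler written only for illustration. The function name `knn_interpolation_oversample`, the `keep_fraction` parameter, and the use of the distance to the k-th neighbor as a compactness score are all assumptions made for this example.

```python
import math
import random


def knn_interpolation_oversample(minority, n_new, k=3, keep_fraction=1.0, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Illustrative sketch only; names and parameters are hypothetical.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    k = min(k, len(minority) - 1)
    rng = random.Random(seed)

    # For every minority point, find its k nearest minority neighbors
    # as (distance, index) pairs sorted by increasing distance.
    neighbors = []
    for i, p in enumerate(minority):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(minority) if j != i
        )
        neighbors.append(dists[:k])

    # Rank points by the distance to their k-th neighbor. A small distance
    # means the point lies in a compact region (assumed compactness score).
    order = sorted(range(len(minority)), key=lambda i: neighbors[i][-1][0])

    # Keep only the most compact points as seeds, so that isolated points
    # (possible noise or outliers) are not used as starting points.
    order = order[: max(1, int(len(order) * keep_fraction))]

    synthetic = []
    while len(synthetic) < n_new:
        i = order[len(synthetic) % len(order)]  # cycle through the seeds
        _, j = rng.choice(neighbors[i])         # pick one of its neighbors
        alpha = rng.random()                    # interpolation weight in [0, 1)
        p, q = minority[i], minority[j]
        # The new point lies on the segment between the seed and its neighbor.
        synthetic.append(tuple(a + alpha * (b - a) for a, b in zip(p, q)))
    return synthetic
```

For example, calling `knn_interpolation_oversample(points, n_new=6, k=2, keep_fraction=0.75)` on a list of 2-D tuples returns six new tuples, each lying between a compact minority point and one of its two nearest neighbors. Restricting the seeds to compact points is one simple way to keep isolated points from spawning further noise; the paper should be consulted for the actual procedure.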

KNNOR is compared with ten contemporary oversampling techniques and is reported to consistently outperform them in improving classifier accuracy across various imbalanced datasets. The technique is easy to implement and is available as an open-source Python library for the broader ML community.

Recommended citation: Islam, A., Belhaouari, S. B., Rehman, A. U., & Bensmail, H. (2022). "KNNOR: An Oversampling Technique for Imbalanced Datasets." Applied Soft Computing, 115, 108288. https://doi.org/10.1016/j.asoc.2021.108288