Title Missing Data Imputation via Optimization Approach: An Application to K-Means Clustering of Extreme Temperature
Authors Labita, Geovert John D. and Tubo, Bernadette F.
Publication date 2024/06
Journal Reliability: Theory & Application (RT&A)
Volume 19
Issue No. 2 (78)
Pages 115-123
Publisher Gnedenko Forum Publications
Abstract This paper introduces an optimization approach to impute missing data within the K-means cluster analysis framework. The proposed method was applied to Philippine climate data covering the previous 18 years (2006-2023), with the goal of classifying the regions according to average annual temperature, including the maximum and minimum. The dataset contains missing values that result from periods of measurement failure at the weather stations, and the lost readings cannot be recovered. As a result, the regional groupings are greatly affected. This paper adapts a modified method of missing value imputation suitable for climate data clustering, inspired by the work of Bertsimas et al. (2017). The proposed methodology imputes missing values within an observation by finding the value that minimizes the distance between the observation and a cluster centroid, with the Mahalanobis distance used as the similarity measure. The clustering outcomes obtained through this optimization approach were compared with those from other imputation techniques, namely mean imputation, the Expectation-Maximization algorithm, and MICE. The derived clusters were assessed using the silhouette coefficient as the performance metric. Results revealed that the proposed imputation gave the highest silhouette scores, which means that most of the observations were clustered more appropriately than with the other imputation algorithms. Moreover, most of the areas showing features of extreme conditions were found to be located in the middle part of the country.
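The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single covariance matrix shared by all clusters, fixed centroids, and hypothetical function names (`impute_to_centroid`, `impute_best_cluster`). Under those assumptions, the missing entries that minimize the squared Mahalanobis distance to a centroid have a closed form in terms of the precision matrix P = Σ⁻¹: x_m = μ_m − P_mm⁻¹ P_mo (x_o − μ_o), where m and o index the missing and observed coordinates.

```python
import numpy as np

def impute_to_centroid(x, mu, cov):
    """Fill NaN entries of x with the values minimizing the squared
    Mahalanobis distance to centroid mu (cov is an assumed shared covariance)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    miss = np.isnan(x)
    filled = x.copy()
    if not miss.any():
        return filled
    obs = ~miss
    if not obs.any():
        # Nothing observed: the centroid itself gives distance zero.
        filled[:] = mu
        return filled
    P = np.linalg.inv(cov)                 # precision matrix
    P_mm = P[np.ix_(miss, miss)]
    P_mo = P[np.ix_(miss, obs)]
    # Closed-form minimizer of the quadratic form over the missing block.
    filled[miss] = mu[miss] - np.linalg.solve(P_mm, P_mo @ (x[obs] - mu[obs]))
    return filled

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance between a complete vector x and mu."""
    d = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(d @ np.linalg.solve(cov, d))

def impute_best_cluster(x, centroids, cov):
    """Impute against every centroid and keep the closest completion.
    Returns the filled vector and the index of the chosen centroid."""
    candidates = [impute_to_centroid(x, mu, cov) for mu in centroids]
    dists = [mahalanobis_sq(c, mu, cov) for c, mu in zip(candidates, centroids)]
    k = int(np.argmin(dists))
    return candidates[k], k
```

In a full K-means loop, a step like this would typically alternate with recomputing the centroids from the completed observations until the assignments stabilize. How the paper handles the covariance estimate and the iteration scheme is not stated in the abstract.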
Index terms / Keywords Optimization, K-Means, Mahalanobis
DOI
URL