Title |
Missing Data Imputation via Optimization Approach: An Application to K-Means Clustering of Extreme Temperature |
Authors |
Labita, Geovert John D. and Tubo, Bernadette F. |
Publication date |
2024/06 |
Journal |
Reliability: Theory & Application (RT&A) |
Volume |
Volume 19 |
Issue |
No. 2 (78) |
Pages |
115-123 |
Publisher |
Gnedenko Forum Publications |
Abstract |
This paper introduces an optimization approach to impute missing data within the K-means cluster
analysis framework. The proposed method has been applied to Philippine climate data over the
previous 18 years (2006-2023) with the goal of classifying the regions according to average annual
temperature, including the maximum and minimum. The dataset contains missing values that
resulted from the weather stations' measurement failures over some periods, and these values cannot be
recovered. As a result, the regional groupings are greatly affected. This paper adapts a modified
method of missing value imputation suitable for climate data clustering, inspired by the work of
Bertsimas et al. (2017). The proposed methodology focuses on imputing missing values within
observations by finding the value that minimizes the distance between the observation and a cluster
centroid, with the Mahalanobis distance used as the similarity measure. The clustering
outcomes obtained through this optimization approach were then compared with those from other
imputation techniques, namely Mean Imputation, the Expectation-Maximization algorithm, and MICE.
The assessment of the derived clusters was conducted using the silhouette coefficient as the
performance metric. Results revealed that the proposed imputation gave the highest silhouette scores,
which means that most of the observations were clustered appropriately compared with the
results from the other imputation algorithms. Moreover, most of the areas showing
features of extreme conditions were found to be located in the middle part of the country. |
Index terms / Keywords |
Optimization, K-Means, Mahalanobis |
DOI |
|
URL |
|
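
The imputation step described in the abstract (choosing the missing entries that minimize the Mahalanobis distance between an observation and a cluster centroid) can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption: a single shared covariance matrix `cov`, missing entries encoded as NaN, and the hypothetical helper names `impute_toward_centroid`, `mahalanobis_sq`, and `impute_best_centroid`. For a fixed centroid the minimizer has a closed form: with P the inverse covariance, m the missing indices and o the observed ones, x_m = mu_m - P_mm^{-1} P_mo (x_o - mu_o).

```python
import numpy as np


def impute_toward_centroid(x, centroid, cov):
    """Fill NaN entries of x with the values that minimize the squared
    Mahalanobis distance to `centroid` under covariance `cov`.

    Illustrative sketch only; not the paper's exact procedure.
    """
    x = np.asarray(x, dtype=float)
    centroid = np.asarray(centroid, dtype=float)
    miss = np.isnan(x)
    filled = x.copy()
    if not miss.any():
        return filled
    obs = ~miss
    precision = np.linalg.inv(np.asarray(cov, dtype=float))
    p_mm = precision[np.ix_(miss, miss)]  # block for the missing coordinates
    p_mo = precision[np.ix_(miss, obs)]   # cross block, missing vs observed
    rhs = p_mo @ (x[obs] - centroid[obs])
    # Setting the gradient of the quadratic form to zero gives this solution.
    filled[miss] = centroid[miss] - np.linalg.solve(p_mm, rhs)
    return filled


def mahalanobis_sq(x, centroid, cov):
    """Squared Mahalanobis distance between a complete x and a centroid."""
    d = np.asarray(x, dtype=float) - np.asarray(centroid, dtype=float)
    return float(d @ np.linalg.solve(np.asarray(cov, dtype=float), d))


def impute_best_centroid(x, centroids, cov):
    """Impute x against every centroid and keep the closest result.

    Returns the imputed vector and the index of the chosen centroid.
    """
    candidates = [impute_toward_centroid(x, c, cov) for c in centroids]
    dists = [mahalanobis_sq(f, c, cov) for f, c in zip(candidates, centroids)]
    k = int(np.argmin(dists))
    return candidates[k], k
```

In a full pipeline, a step like this would presumably alternate with the usual K-means centroid updates until the assignments stabilize; that outer loop is omitted here because the abstract does not specify it.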