The Importance of Feature Selection Methods for the Error Prediction Process of a Digital Twin

Abstract

Building a digital twin involves simultaneously creating a model that serves as a vehicle for transporting data within the information life cycle. To create such a model, there should be a well-defined feature space. Because of the "curse of dimensionality", the complexity of the model increases exponentially as the number of features grows, while its accuracy decreases. In this study, the importance of the methods chosen for dimensionality reduction when setting up a model that can predict the error on a digital twin is presented with an exemplary implementation. Four different dimension reduction methods, PCA, Conventional PCA, WPCA, and MARS, were applied to a dataset with 89016 observation values and 590 different attributes in order to predict the error via a non-linear SVM with a polynomial kernel. According to the results, the WPCA and MARS methods predicted the error more successfully than the others. The feature extraction solutions that the methods provide therefore affected the performance of the designed models.


Editor: H. Kemal İlter, Ankara Yıldırım Beyazıt University, Turkey
Received: August 19, 2018, Accepted: October 18, 2018, Published: November 10, 2018

Copyright: © 2018 IMISC Özdemir et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.