Speaker
Description
This paper investigates the effect of pretraining and fine-tuning on a multi-modal dataset. The dataset used in this study was accumulated at a garbage disposal facility for facility control and consists of 25,000 sequential images and their corresponding sensor values. The main task on this dataset is to classify the state of garbage incineration from an input image, for use in combustion state control. For tasks of this kind, pretraining on an unlabeled dataset and then fine-tuning on a small labeled dataset is a typical and effective way to reduce the cost of creating supervised data. We investigated and compared a range of pretraining methods based on sensor values and on autoencoders to find an effective pretraining scheme. Moreover, we compared several sensor selection methods for sensor-based pretraining. We report the performance of the fine-tuned models with frozen and unfrozen pretrained parameters and under the different sensor selection methods, and discuss the results.
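The difference between fine-tuning with frozen and unfrozen pretrained parameters can be illustrated with a toy sketch. It is not the model used in the paper, whose architecture the abstract does not describe. It assumes a one-unit linear encoder, random vectors standing in for images, a single synthetic sensor channel, and labels derived from that sensor. All names (pretrain_on_sensor, fine_tune, freeze_encoder) are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pretrain_on_sensor(images, sensors, dim, lr=0.05, epochs=50):
    """Pretraining: fit encoder weights w so that w.x regresses the sensor value."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, s in zip(images, sensors):
            err = dot(w, x) - s  # gradient of 0.5 * (w.x - s)^2 w.r.t. w.x
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def fine_tune(w, images, labels, freeze_encoder, lr=0.1, epochs=200):
    """Fine-tuning: train a logistic head (c, b) on the encoder output h = w.x.

    With freeze_encoder=True only the head is updated; otherwise the
    pretrained encoder weights w are updated as well.
    """
    w = list(w)  # copy so the caller's pretrained weights are untouched
    c, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(images, labels):
            h = dot(w, x)
            z = max(-30.0, min(30.0, c * h + b))  # clip the logit for numerical safety
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the cross-entropy loss w.r.t. the logit
            if not freeze_encoder:
                w = [wi - lr * g * c * xi for wi, xi in zip(w, x)]
            c -= lr * g * h
            b -= lr * g
    return w, c, b


# Synthetic stand-in data (assumption): 200 unlabeled "images" with one sensor
# channel, of which only the first 20 also carry a class label.
random.seed(0)
DIM = 4
true_w = [0.8, -0.5, 0.3, 0.1]
images = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(200)]
sensors = [dot(true_w, x) + random.gauss(0, 0.05) for x in images]
labeled = images[:20]
labels = [1 if s > 0 else 0 for s in sensors[:20]]

w_pre = pretrain_on_sensor(images, sensors, DIM)
w_frozen, _, _ = fine_tune(w_pre, labeled, labels, freeze_encoder=True)
w_unfrozen, _, _ = fine_tune(w_pre, labeled, labels, freeze_encoder=False)
```

In the frozen case the encoder weights returned by fine_tune are identical to the pretrained ones and only the small classification head is learned. In the unfrozen case the encoder also adapts to the labeled data.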