Estimation theory and Neural Networks revisited: REKF and RSVSF as optimization techniques for Deep-Learning

Deep-Learning has become a leading strategy for artificial intelligence and is being applied in many fields because of its excellent performance, which has surpassed human cognitive abilities in a number of classification and control problems (Ciregan, Meier, & Schmidhuber, 2012; Mnih et al., 2015). However, training Deep-Learning networks is usually slow and requires high-performance computing capable of handling large datasets. Optimizing the training method can speed up learning and yield higher performance for the same number of training epochs (cycles). This paper considers the use of estimation theory for training large neural networks, in particular Deep-Learning networks. Two estimation strategies, namely the Extended Kalman Filter (EKF) and the Smooth Variable Structure Filter (SVSF), have been revised (subsequently referred to as REKF and RSVSF, respectively) and used for network training. They are applied to several benchmark datasets and comparatively evaluated.
