Learning Sub-Patterns in Piecewise Continuous Functions
Abstract
Most stochastic gradient descent algorithms can optimize neural networks that
are sub-differentiable in their parameters; however, this requires the
network's activation functions to exhibit a degree of continuity, which limits
the model's uniform approximation capacity to continuous functions. This paper
focuses on the case where the discontinuities arise from
distinct sub-patterns, each defined on different parts of the input space. We
propose a new discontinuous deep neural network model trainable via a decoupled
two-step procedure that avoids passing gradient updates through the network's
single, strategically placed discontinuous unit. We provide approximation
guarantees for our architecture in the space of bounded continuous functions
and universal approximation guarantees in the space of piecewise continuous
functions, which we introduce herein. We present a novel semi-supervised
two-step training procedure for our discontinuous deep learning model, tailored
to its structure, and we provide theoretical support for its effectiveness. The
performance of our model, trained with the proposed procedure, is evaluated
experimentally on both real-world financial datasets and synthetic datasets.