On the Role of Noise in the Sample Complexity of Learning Recurrent
Neural Networks: Exponential Gaps for Long Sequences
Abstract
We consider the class of noisy multi-layered sigmoid recurrent neural
networks with $w$ (unbounded) weights for classification of sequences of length
$T$, where independent noise distributed according to $\mathcal{N}(0,\sigma^2)$
is added to the output of each neuron in the network. Our main result shows
that the sample complexity of PAC learning this class can be bounded by
$O(w\log(T/\sigma))$. For the non-noisy version of the same class (i.e.,
$\sigma=0$), we prove a lower bound of $\Omega(wT)$ on the sample complexity.
Our results indicate an exponential gap in the dependence of the sample
complexity on $T$ for noisy versus non-noisy networks. Moreover, given the mild
logarithmic dependence of the upper bound on $1/\sigma$, this gap still holds
even for numerically negligible values of $\sigma$.
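To make the scale of this separation concrete, the following is a rough
illustrative calculation; the specific values $T = 10^{6}$ and
$\sigma = 10^{-9}$ are hypothetical choices of ours, not figures from the
results above. Taking $\log$ to be the natural logarithm,
\[
\frac{T}{\sigma} \;=\; \frac{10^{6}}{10^{-9}} \;=\; 10^{15},
\qquad
w\log\frac{T}{\sigma} \;=\; w\ln 10^{15} \;\approx\; 34.5\,w,
\qquad
\text{versus}
\qquad
wT \;=\; 10^{6}\,w.
\]
Even at this numerically negligible noise level, doubling $T$ increases the
noisy upper bound only additively, by $w\log 2$, whereas the non-noisy lower
bound $wT$ doubles; this is the sense in which the gap in $T$ is exponential.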