Basis Scaling and Double Pruning for Efficient Inference in Network-Based Transfer Learning
Abstract
Network-based transfer learning allows the reuse of deep learning features
with limited data, but the resulting models can be unnecessarily large.
Although network pruning can improve inference efficiency, existing algorithms
usually require fine-tuning, which may not be suitable for small datasets. In
this paper, using the singular value decomposition, we decompose a
convolutional layer into two layers: a convolutional layer with the orthonormal
basis vectors as the filters, and a "BasisScalingConv" layer that rescales the
features and transforms them back to the original space. Because the filters in
each decomposed layer are linearly independent, pruning with the proposed basis
scaling factors and the Taylor approximation of importance can be more
effective, and fine-tuning of individual weights becomes unnecessary.
Furthermore, as the numbers of input and
output channels of the original convolutional layer remain unchanged after
basis pruning, it is applicable to virtually all architectures and can be
combined with existing pruning algorithms for double pruning to further
increase the pruning capability. When transferring knowledge from ImageNet
pre-trained models to different target domains, we achieve pruning ratios of up
to 74.6% on CIFAR-10 and 98.9% on MNIST in model parameters, with less than 1%
reduction in classification accuracy.
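For intuition, the following is a minimal sketch in PyTorch of the SVD-based decomposition of a convolutional layer described above. The function name `decompose_conv`, the use of a 1x1 convolution for the reconstruction step, and the folding of the singular values into that layer are illustrative assumptions, not the paper's exact BasisScalingConv implementation (which treats the scaling factors as explicit parameters used for importance estimation and pruning).

```python
import torch
import torch.nn as nn

def decompose_conv(conv: nn.Conv2d) -> nn.Sequential:
    """Split `conv` into a basis convolution followed by a scaling/
    reconstruction layer so that the composition reproduces conv(x)."""
    out_c, in_c, kh, kw = conv.weight.shape
    # Flatten the filters into a (out_c, in_c*kh*kw) matrix and factorize it.
    w = conv.weight.detach().reshape(out_c, -1)
    u, s, vt = torch.linalg.svd(w, full_matrices=False)  # w = u @ diag(s) @ vt
    rank = s.numel()

    # Layer 1: the orthonormal basis vectors (rows of vt) become the filters.
    basis_conv = nn.Conv2d(in_c, rank, (kh, kw), stride=conv.stride,
                           padding=conv.padding, dilation=conv.dilation,
                           bias=False)
    basis_conv.weight.data = vt.reshape(rank, in_c, kh, kw)

    # Layer 2: rescale each basis response by its singular value (the basis
    # scaling factor) and project back to the original output channels with
    # a 1x1 convolution built from u; the original bias is reattached here.
    recon = nn.Conv2d(rank, out_c, 1, bias=conv.bias is not None)
    recon.weight.data = (u * s).reshape(out_c, rank, 1, 1)
    if conv.bias is not None:
        recon.bias.data = conv.bias.detach().clone()

    return nn.Sequential(basis_conv, recon)

# Sanity check: the two-layer decomposition matches the original layer.
conv = nn.Conv2d(16, 32, 3, padding=1)
x = torch.randn(1, 16, 8, 8)
print(torch.allclose(conv(x), decompose_conv(conv)(x), atol=1e-4))
```

In this sketch, pruning a basis vector corresponds to dropping one singular value together with the matching row of vt and column of u, which shrinks only the internal rank; the numbers of input and output channels seen by neighboring layers are unchanged, which is why basis pruning can be combined with existing channel-pruning algorithms for double pruning.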