Abstract
- Although artists' actions in photo retouching appear highly nonlinear and very difficult to characterize analytically, we find that the net effect of interactively editing a mundane image to a desired appearance can be modeled, in most cases, by a parametric, monotonically non-decreasing global tone mapping function on the luminance axis and a global affine transform in the chrominance plane, both weighted by saliency. This allows us to simplify the machine learning problem of mimicking artists in photo retouching to that of constructing a deep artful image transform (DAIT) using convolutional neural networks (CNNs). The CNN design of DAIT aims to learn the image-dependent parameters of the luminance tone mapping function and the affine chrominance transform, rather than the end-to-end pixel-level mapping learned by mainstream methods of image restoration and enhancement. The proposed DAIT approach reduces the computational complexity of the neural network by two orders of magnitude, which, as a side benefit, also improves robustness and generalization capability at the inference stage. The high throughput and robustness of DAIT also lend themselves readily to real-time video enhancement after simple temporal processing. Experiments and a Turing-type test are conducted to evaluate the proposed method against its competitors.
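
To make the model above concrete, the following is a minimal sketch of how a monotone luminance tone curve and an affine chrominance transform might be applied to one image. The abstract does not specify the parametric form, so everything here is an illustrative assumption: the piecewise-linear curve built from non-negative increments, the per-pixel blending used for saliency weighting, and the names `tone_curve`, `apply_transform`, `increments`, `A`, `b`, and `saliency`. In DAIT the parameters would be predicted by a CNN; here they are simply passed in by hand.

```python
import numpy as np


def tone_curve(increments):
    """Build a monotonically non-decreasing curve on [0, 1].

    Assumption: the curve is piecewise linear, defined by non-negative
    increments between equally spaced knots. Clamping the increments at
    zero and accumulating them guarantees monotonicity.
    """
    inc = np.maximum(np.asarray(increments, dtype=np.float64), 0.0)
    knots_y = np.concatenate([[0.0], np.cumsum(inc)])
    total = knots_y[-1]
    if total > 0:
        knots_y = knots_y / total  # normalize so the curve ends at 1
    knots_x = np.linspace(0.0, 1.0, len(knots_y))
    return knots_x, knots_y


def apply_transform(luma, chroma, increments, A, b, saliency):
    """Apply a global tone curve and a global affine chroma transform.

    luma:     (H, W) array in [0, 1]
    chroma:   (H, W, 2) array holding the two chrominance channels
    A, b:     hypothetical 2x2 matrix and 2-vector of the affine transform
    saliency: (H, W) weights in [0, 1]; assumed here to blend the
              transformed value with the original value per pixel
    """
    knots_x, knots_y = tone_curve(increments)
    luma_mapped = np.interp(luma, knots_x, knots_y)

    # Affine map applied to every pixel's 2-D chrominance vector.
    chroma_mapped = chroma @ A.T + b

    w = saliency
    luma_out = w * luma_mapped + (1.0 - w) * luma
    chroma_out = w[..., None] * chroma_mapped + (1.0 - w[..., None]) * chroma
    return luma_out, chroma_out
```

With equal increments, an identity matrix `A`, and a zero offset `b`, the sketch leaves the image unchanged, which is a convenient sanity check. Predicting a handful of such parameters per image, instead of one output value per pixel, is the kind of reduction the abstract credits for the lower computational cost.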