Preprint

Approximation Rates in Besov Norms and Sample-Complexity of Kolmogorov-Arnold Networks with Residual Connections

Abstract

Inspired by the Kolmogorov-Arnold superposition theorem, Kolmogorov-Arnold Networks (KANs) have recently emerged as an improved backbone for many deep learning frameworks, promising more adaptivity than their multilayer perceptron (MLP) predecessor by allowing for trainable spline-based activation functions. In this paper, we probe the theoretical foundations of the KAN architecture by showing that it can approximate any Besov function in $B^{s}_{p,q}(\mathcal{X})$ on a bounded open, or even fractal, domain $\mathcal{X} \subseteq \mathbb{R}^d$ at the optimal approximation rate with respect to any weaker Besov norm $B^{\alpha}_{p,q}(\mathcal{X})$, where $\alpha < s$. We complement our approximation result with a statistical guarantee by bounding the pseudodimension of the relevant class of residual KANs (Res-KANs). As an application of the latter, we directly deduce a dimension-free estimate on the sample complexity of a residual KAN model when learning a function of Besov regularity from $N$ i.i.d. noiseless samples, showing that KANs can learn the smooth maps which they can approximate.
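
For intuition, the sketch below shows what a single residual KAN layer can look like in practice: every edge from input coordinate $i$ to output coordinate $j$ carries its own learnable B-spline activation, and the layer output adds a skip path. This is an illustrative NumPy toy rather than the construction analyzed in the paper; the uniform knot grid, the `ResKANLayer` name, and the linear skip weights `W_skip` are assumptions made only for this example.

```python
import numpy as np

def bspline_basis(x, grid, k=3):
    """Evaluate all order-k B-spline basis functions on knot vector `grid` at points x.

    x    : (n,) array of inputs
    grid : (g,) strictly increasing knots
    Returns an (n, g - k - 1) array of basis values (Cox-de Boor recursion).
    """
    # degree-0 (piecewise-constant) bases on half-open intervals [t_i, t_{i+1})
    b = ((x[:, None] >= grid[None, :-1]) & (x[:, None] < grid[None, 1:])).astype(float)
    for d in range(1, k + 1):
        left = (x[:, None] - grid[None, :-(d + 1)]) / (grid[None, d:-1] - grid[None, :-(d + 1)])
        right = (grid[None, d + 1:] - x[:, None]) / (grid[None, d + 1:] - grid[None, 1:-d])
        b = left * b[:, :-1] + right * b[:, 1:]
    return b

class ResKANLayer:
    """One Kolmogorov-Arnold layer with a residual path (hypothetical toy, not the paper's model).

    Each edge (i -> j) carries its own learnable spline phi_{j,i}; the layer output is
    sum_i phi_{j,i}(x_i) plus a linear skip connection standing in for the residual part.
    """

    def __init__(self, d_in, d_out, grid_size=8, k=3, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        h = (x_max - x_min) / grid_size
        # knots extended k steps beyond [x_min, x_max] so every point has full basis support
        self.grid = np.linspace(x_min - k * h, x_max + k * h, grid_size + 2 * k + 1)
        self.k = k
        n_basis = len(self.grid) - k - 1
        # spline coefficients: one vector of basis weights per edge (j, i)
        self.coef = rng.normal(scale=0.1, size=(d_out, d_in, n_basis))
        # skip-connection weights (the "Res" path in this sketch)
        self.W_skip = rng.normal(scale=0.1, size=(d_out, d_in))

    def __call__(self, x):
        """x: (n, d_in) -> (n, d_out); inputs are clipped to the spline's domain."""
        x = np.clip(x, self.grid[self.k], self.grid[-self.k - 1] - 1e-9)
        basis = np.stack(
            [bspline_basis(x[:, i], self.grid, self.k) for i in range(x.shape[1])], axis=1
        )  # (n, d_in, n_basis)
        # spline part: sum_i phi_{j,i}(x_i) = sum_i sum_b coef[j, i, b] * B_b(x_i)
        spline_out = np.einsum("nib,jib->nj", basis, self.coef)
        return spline_out + x @ self.W_skip.T  # additive residual path

# Usage sketch: a two-input, one-output residual KAN layer on random points in [-1, 1]^2.
layer = ResKANLayer(d_in=2, d_out=1)
x = np.random.uniform(-1.0, 1.0, size=(5, 2))
print(layer(x).shape)  # (5, 1)
```

Stacking such layers, and training the spline coefficients, gives the kind of adaptive architecture the abstract refers to; the paper's approximation and pseudodimension bounds concern a specific residual KAN class rather than this particular parameterization.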

Authors

Kratsios A; Kim BJ; Furuya T

Publication date

August 6, 2025

DOI

10.48550/arxiv.2504.15110

Preprint server

arXiv