Preprint
Mixture of Experts Softens the Curse of Dimensionality in Operator Learning
Abstract
We study the approximation-theoretic implications of mixture-of-experts architectures for operator learning, where the complexity of a single large neural operator is distributed across many small neural operators (NOs), and each input is routed to exactly one NO via a decision tree. We analyze how this tree-based routing and expert decomposition affect approximation power, sample complexity, and stability. Our main result is a distributed …
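Illustrative sketch (not the paper's implementation): the abstract describes hard, tree-based routing in which each input is dispatched to exactly one small neural operator. The minimal Python/NumPy sketch below shows one way such routing can be organized; the names (ToyExpert, TreeRouter), the use of fixed random splits, and the toy feature-vector encoding of inputs are all assumptions made for illustration only.

```python
# Minimal sketch of hard, tree-based routing over small "expert" models.
# Assumptions: inputs are toy feature vectors, experts are stand-in random maps,
# and the decision tree uses fixed random axis-aligned splits.
import numpy as np


class ToyExpert:
    """Stand-in for a small neural operator: a fixed random nonlinear map."""
    def __init__(self, dim, rng):
        self.W = rng.standard_normal((dim, dim)) / np.sqrt(dim)

    def __call__(self, x):
        return np.tanh(self.W @ x)


class TreeRouter:
    """Depth-d binary decision tree; each leaf owns exactly one expert."""
    def __init__(self, dim, depth, rng):
        self.depth = depth
        # One (coordinate, threshold) split per internal node, heap-ordered.
        self.coords = rng.integers(0, dim, size=2**depth - 1)
        self.thresholds = rng.standard_normal(2**depth - 1)
        self.experts = [ToyExpert(dim, rng) for _ in range(2**depth)]

    def route(self, x):
        """Walk the tree: at each node, compare one coordinate to a threshold."""
        node = 0
        for _ in range(self.depth):
            go_right = x[self.coords[node]] > self.thresholds[node]
            node = 2 * node + 1 + int(go_right)
        return node - (2**self.depth - 1)  # leaf index = expert index

    def __call__(self, x):
        # Hard routing: exactly one expert evaluates each input.
        return self.experts[self.route(x)](x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = TreeRouter(dim=16, depth=3, rng=rng)   # 2**3 = 8 experts
    x = rng.standard_normal(16)                    # toy encoding of an input function
    print("routed to expert", model.route(x), "output norm", np.linalg.norm(model(x)))
```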
Authors
Kratsios A; Furuya T; Benitez JAL; Lassas M; de Hoop M
Publication date
December 1, 2025
DOI
10.48550/arXiv.2404.09101
Preprint server
arXiv