Conference

Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes

Abstract

We prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in R^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound. The upper bound is based on a novel technique for distribution learning built around a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in R^d has a small-sized sample compression scheme.
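
For reference, the two sample complexity bounds stated in the abstract can be written in display form as follows; the names n_general and n_axis-aligned are notation introduced here for readability (not from the paper), and Θ̃ / Õ suppress polylogarithmic factors.

% Sample complexity bounds restated from the abstract.
% n_general and n_axis-aligned are illustrative labels, not notation from the paper.

% Mixtures of k general Gaussians in R^d (necessary and sufficient):
\[
  n_{\text{general}}(k, d, \varepsilon) \;=\; \widetilde{\Theta}\!\left(\frac{k d^{2}}{\varepsilon^{2}}\right)
\]

% Mixtures of k axis-aligned Gaussians in R^d (upper bound, matching a known lower bound):
\[
  n_{\text{axis-aligned}}(k, d, \varepsilon) \;=\; \widetilde{O}\!\left(\frac{k d}{\varepsilon^{2}}\right)
\]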

Authors

Ashtiani H; Ben-David S; Harvey NJA; Liaw C; Mehrabian A; Plan Y

Volume

2018-December

Pagination

pp. 3412-3421

Publication Date

January 1, 2018

Conference proceedings

Advances in Neural Information Processing Systems

ISSN

1049-5258
