The Cardinality Bound on the Information Bottleneck Representations is Tight
Abstract
The information bottleneck (IB) method aims to find compressed
representations of a variable $X$ that retain the most relevant information
about a target variable $Y$. We show that for a wide family of distributions --
namely, when $Y$ is generated by $X$ through a Hamming channel, under mild
conditions -- the optimal IB representations require an alphabet strictly
larger than that of $X$. This implies that, despite suggestions to the contrary in
several recent works, the cardinality bound first identified by Witsenhausen and
Wyner in 1975 is tight.
At the core of our finding is the observation that the IB function in this
setting is not strictly concave, just as in the deterministic case, even though
the joint distribution of $X$ and $Y$ has full support. Finally, we provide a
complete characterization of the IB function, as well as of the optimal
representations, for the Hamming case.
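For context, the IB function referred to above is commonly formulated as follows;
the notation ($F$, $R$, $Z$) is ours and may differ from the paper's:
\[
  F(R) \;=\; \max_{\substack{P_{Z\mid X}\,:\; Z - X - Y,\\ I(X;Z)\,\le\, R}} I(Y;Z),
\]
where the maximization is over (possibly stochastic) representations $Z$ of $X$
such that $Z - X - Y$ forms a Markov chain. The cardinality bound in question, as
it is usually stated, guarantees that this maximum is attained by some $Z$ whose
alphabet satisfies $|\mathcal{Z}| \le |\mathcal{X}| + 1$; the abstract's claim is
that an alphabet strictly larger than $\mathcal{X}$ can indeed be necessary, so
the bound cannot be improved in general.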