Comprehensible Counterfactual Explanation on Kolmogorov-Smirnov Test
Abstract
The Kolmogorov-Smirnov (KS) test is widely used in many applications, such
as anomaly detection, astronomy, database security and AI systems. One
challenge that remains unaddressed is how to obtain an explanation of why a test
set fails the KS test. In this paper, we tackle the problem of producing
counterfactual explanations for test data failing the KS test. Concept-wise, we
propose the notion of most comprehensible counterfactual explanations, which
accommodates both the KS test data and the user domain knowledge in producing
explanations. Computation-wise, we develop an efficient algorithm MOCHE (for
MOst CompreHensible Explanation) that avoids enumerating and checking an
exponential number of subsets of the test set failing the KS test. MOCHE is not
only guaranteed to produce the most comprehensible counterfactual explanations,
but is also orders of magnitude faster than the baselines. Experiment-wise, we
present a systematic empirical study on a series of real-world benchmark
datasets to verify the effectiveness, efficiency and scalability of most
comprehensible counterfactual explanations and MOCHE.
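
To make the computational challenge concrete, the following is a minimal sketch of the kind of exhaustive subset search that the abstract says MOCHE avoids. It is not MOCHE and does not model user domain knowledge or comprehensibility. It assumes, for illustration only, that a counterfactual explanation is a set of test points whose removal lets the remaining test set pass a two-sample KS test against a reference set, and it uses the standard asymptotic critical value for the test. The names ks_statistic, passes_ks and brute_force_explanation are hypothetical.

```python
import math
from itertools import combinations


def ks_statistic(ref, test):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    points = sorted(set(ref) | set(test))
    n, m = len(ref), len(test)
    return max(
        abs(sum(r <= x for r in ref) / n - sum(t <= x for t in test) / m)
        for x in points
    )


def passes_ks(ref, test, alpha=0.05):
    """Pass iff D <= c(alpha) * sqrt((n + m) / (n * m)) (asymptotic threshold)."""
    n, m = len(ref), len(test)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return ks_statistic(ref, test) <= c_alpha * math.sqrt((n + m) / (n * m))


def brute_force_explanation(ref, test, alpha=0.05):
    """Assumed baseline: smallest tuple of test indices whose removal makes
    the remaining test set pass. Worst case checks exponentially many subsets."""
    if passes_ks(ref, test, alpha):
        return ()  # nothing to explain
    for k in range(1, len(test)):  # always keep at least one test point
        for removed in combinations(range(len(test)), k):
            kept = [v for i, v in enumerate(test) if i not in removed]
            if passes_ks(ref, kept, alpha):
                return removed
    return None  # no subset removal makes the test pass


if __name__ == "__main__":
    reference = list(range(1, 21))
    failing = [5, 10, 15, 100, 101, 102, 103, 104, 105, 106]
    print(brute_force_explanation(reference, failing))
```

Even on this toy input the search tries every subset of each size before it finds one that works, which illustrates why enumeration does not scale to realistic test sets.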