Exact and Consistent Interpretation of Piecewise Linear Models Hidden behind APIs: A Closed Form Solution
Abstract
More and more AI services are provided through cloud APIs, with the predictive
models hidden behind those APIs. To build trust with users and reduce potential
application risk, it is important to interpret how such predictive models
hidden behind APIs make their decisions. The biggest challenge in interpreting
such predictions is that neither the model parameters nor the training data are
accessible. Existing methods interpret the predictions of a model hidden behind an
API by heuristically probing the response of the API with perturbed input
instances. However, these methods do not provide any guarantee on the exactness
and consistency of their interpretations. In this paper, we propose an elegant
closed form solution named OpenAPI to compute exact and consistent
interpretations for the family of Piecewise Linear Models (PLM), which includes
many popular classification models. The main idea is to first construct a set
of overdetermined linear equation systems with a small set of perturbed
instances and the predictions made by the model on those instances. Then, we
solve the equation systems to identify the decision features that are
responsible for the prediction on an input instance. Our extensive experiments
clearly demonstrate the exactness and consistency of our method.
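
The equation-system idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the hidden model is a hypothetical small ReLU network with a softmax output (so the log-odds between two classes are linear inside each linear region), and the names `api_predict` and `interpret`, the perturbation radius, the number of extra queries, and the numerical tolerance are all chosen here for illustration. A floating-point tolerance stands in for the exact consistency check of the overdetermined system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a model hidden behind an API: a tiny ReLU network
# with a softmax output. Its logits are piecewise linear in the input.
d, h, c = 4, 8, 3
W1, b1 = rng.normal(size=(h, d)), rng.normal(size=h)
W2, b2 = rng.normal(size=(c, h)), rng.normal(size=c)

def api_predict(x):
    """Return class probabilities only; the parameters stay hidden from callers."""
    z = W2 @ np.maximum(W1 @ x + b1, 0.0) + b2
    e = np.exp(z - z.max())
    return e / e.sum()

def interpret(x, predict, n_extra=4, radius=1.0, tol=1e-6, max_halvings=20):
    """Recover the local log-odds weights and biases of class 0 vs. each other class."""
    n_features = x.size
    for _ in range(max_halvings):
        # Query x plus (n_features + n_extra) perturbed instances from a small
        # hypercube around x, so there are more equations than unknowns.
        noise = rng.uniform(-radius, radius, size=(n_features + n_extra, n_features))
        pts = np.vstack([x, x + noise])
        probs = np.array([predict(p) for p in pts])
        # Inside one linear region: log(p_0 / p_k) = (w_0 - w_k) . x + (b_0 - b_k).
        targets = np.log(probs[:, [0]]) - np.log(probs[:, 1:])
        A = np.hstack([pts, np.ones((len(pts), 1))])  # unknowns: weights, then bias
        sol, *_ = np.linalg.lstsq(A, targets, rcond=None)
        # A (near-)zero residual of the overdetermined system suggests that all
        # queried points lie in the same linear region as x.
        if np.max(np.abs(A @ sol - targets)) < tol * radius:
            return sol[:-1].T, sol[-1]  # shapes (c - 1, d) and (c - 1,)
        radius /= 2.0  # otherwise shrink the neighbourhood and query again
    raise RuntimeError("no consistent linear region found")

# Usage: interpret one input using nothing but API responses.
x = np.array([0.5, -1.0, 0.3, 0.8])
weights, bias = interpret(x, api_predict)
print(weights)  # row k - 1 holds the recovered weights for class 0 vs. class k
```

In this sketch, the features with the largest absolute recovered weights are the ones that most influence the prediction at `x`. Shrinking the neighbourhood until the system becomes consistent is one simple way to keep all probes inside a single linear region; other strategies are possible.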