### abstract ###
% After building a classifier with modern tools of machine learning we typically have a black box at hand that is able to predict well for unseen data
Thus, we get an answer to the question  what  is the most likely label of a given unseen data point
However, most methods will provide no answer  why  the model predicted the particular label for a single instance and what features were most influential for that particular instance
The only method that is currently able to provide such explanations are decision trees
This paper proposes a procedure which (based on a set of assumptions) allows to explain the decisions of  any  classification method
### introduction ###
Automatic nonlinear classification is a common and powerful tool in data analysis
Machine learning research has created methods that are practically useful and that can classify unseen data after being trained on a limited training set of labeled examples
Nevertheless, most of the algorithms do not  explain  their decision
However in practical data analysis it is essential to obtain an instance based explanation, i e we would like to gain an understanding what input features made the nonlinear machine give its answer for each individual data point
Typically, explanations are provided jointly for all instances of the training set, for example feature selection methods  (including Automatic Relevance Determination) find out which inputs are salient for a good generalization  CITATION
While this can give a coarse impression about the global usefulness of each input dimension, it is still an ensemble view and does not provide an answer on an instance basis
In the neural network literature also solely an ensemble view was taken in algorithms like input pruning  CITATION
The only classification which does provide individual explanations are decision trees  CITATION
This paper proposes a simple framework that provides local explanation vectors applicable to  any  classification method in order to help understanding prediction results for single data instances
The local explanation yields the features being relevant for the prediction at the very points of interest in the data space and is able to spot local peculiarities which are neglected in the global view eg due to cancellation effects
The paper is organized as follows: We define local explanation vectors as class probability gradients in Section  and give an illustration for Gaussian Process Classification (GPC)
Some methods output a prediction without a direct probability interpretation
For these we propose in Section  a way to estimate local explanations
In Section  we will apply our methodology to learn distinguishing properties of Iris flowers by estimating explanation vectors for a k-NN classifier applied to the classic Iris data set
Section  will discuss how our approach applied to a SVM classifier allows us to explain how digits "two" are distinguished from digit "8" in the USPS data set
In Section  we discuss a more real-world application scenario where the proposed explanation capabilities prove useful in drug discovery: Human experts regularly decide how to modify existing lead compounds in order to obtain new compounds with improved properties
Models capable of explaining predictions can help in the process of choosing promising modifications
Our automatically generated explanations match with chemical domain knowledge about toxifying functional groups of the compounds in question
Section  contrasts our approach with related work and Section  discusses characteristic properties and limitations of our approach, before we conclude the paper in Section
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                iris_knn
