Model explainability with keras and iml
Model Explainability
Neural networks are becoming one of the most widely used machine learning methods, not only for image analysis and natural language processing but also for more “classic” problems such as loan grading. However, the tooling around neural network explainability is still severely lacking.
While there is no theoretical reason that makes neural networks harder to explain than other nonlinear machine learning methods like random forests or gradient boosting, the most popular deep learning packages for R do not come with any built-in support for interpretability tools such as feature importance. With topics such as “cheating” machine learning systems and algorithmic bias getting more recognition, it becomes ever more important to understand the reasoning behind a model’s predictions.
“Cheating” models exploit structures in the training data or loss function that provide unintended clues. A now famous example of this effect is a classifier that recognizes huskies correctly only because of their snowy surroundings (see http://iphome.hhi.de/samek/pdf/LapNCOMM19.pdf for more examples). It should be noted that using this information is not necessarily wrong - the context should bias your estimate of a dog’s breed to account for circumstances such as cold weather - but if the surroundings are the only input actually used, we can be sure that the model will be easily fooled in the real world. Algorithmic bias is in essence also caused by problematic selections of training data. If the input data is biased in its labelling, so will be the outputs of the resulting models. An image classification algorithm fed with unweighted stock photos would probably have a hard time ever predicting a woman to be a doctor. In some jurisdictions such as the EU, companies are now required to explain their models’ decisions (see http://www.privacy-regulation.eu/en/recital-71-GDPR.htm).
iml and keras
Christoph Molnar recently released the iml package for R to address explainability in a model-agnostic way. Out of the box, iml already supports several machine learning methods (especially those wrapped by mlr and caret), but so far lacked support for keras. Exago data scientists recently submitted a pull request to iml to support keras models out of the box, which will be included in the next release (https://github.com/christophM/iml/pull/74). Let’s take this new support for a test drive.
Example: The Lending Club loan data
We use the Lending Club loan data from Kaggle to train a small neural network that predicts whether a given loan will be paid back or default. We use only a few of the more than 100 features in the data for this example and focus on loans that have a status of “Charged Off” or “Fully Paid”. Additionally, we convert the sub grade levels to a numeric scale and make sure all of our variables are on a similar scale, since neural network training does not cope well with variables on very different scales. Here we opt for a simple manual scaling. We convert the term variable into a simple numeric “longterm” that indicates whether the loan has the longer of the two possible terms (60 months). We filter by home ownership and keep only the major categories mortgage, rent, and own.
Data preparation
library(dplyr)
library(data.table)

# read the Kaggle Lending Club data and keep only finished loans from the
# three major home ownership categories
loans = fread("~/Downloads/lending-club-loan-data/loan.csv.gz") %>% as_data_frame
loans = filter(loans, loan_status %in% c("Charged Off", "Fully Paid") &
                 home_ownership %in% c("MORTGAGE", "RENT", "OWN"))

# convert the sub grade to a numeric scale and bring all variables into a
# roughly comparable range; "longterm" flags the longer 60 month term
sub_grade_levels = c("A1", "A2", "A3", "A4", "A5", "B1", "B2", "B3", "B4", "B5", "C1", "C2", "C3", "C4", "C5", "D1", "D2", "D3", "D4", "D5", "E1", "E2", "E3", "E4", "E5", "F1", "F2", "F3", "F4", "F5", "G1", "G2", "G3", "G4", "G5")
loans = loans %>% mutate(sub_grade = as.numeric(factor(sub_grade, levels = sub_grade_levels)) / 10,
                         loan_amnt = loan_amnt / 10000,
                         int_rate = int_rate / 10,
                         annual_inc = pmin(200000, annual_inc) / 100000,
                         longterm = as.numeric(term == "60 months")) %>%
  select(loan_status, loan_amnt, longterm, int_rate, sub_grade, home_ownership, annual_inc, verification_status)

head(loans)
## # A tibble: 6 x 8
##   loan_status loan_amnt longterm int_rate sub_grade home_ownership
##   <chr>           <dbl>    <dbl>    <dbl>     <dbl> <chr>
## 1 Fully Paid      3            0    2.24        2   MORTGAGE
## 2 Fully Paid      4            1    1.61        1.4 MORTGAGE
## 3 Fully Paid      2            0    0.756       0.3 MORTGAGE
## 4 Fully Paid      0.45         0    1.13        0.8 RENT
## 5 Fully Paid      0.842        0    2.73        2.5 MORTGAGE
## 6 Fully Paid      2            1    1.80        1.6 RENT
## # … with 2 more variables: annual_inc <dbl>, verification_status <chr>
In the next step we build our input matrix (tensor) using model.matrix to dummy encode the factor variables (we use make.names here to sanitise the column names) and a simple 1/0 encoding for our target.
# dummy encode the factor variables without an intercept column
x = model.matrix(~ loan_amnt + longterm + int_rate + sub_grade + home_ownership +
                   annual_inc + verification_status - 1, data = loans)
colnames(x) = make.names(colnames(x))  # sanitise the column names
y = if_else(loans$loan_status == "Fully Paid", 1, 0)  # 1 = fully paid, 0 = charged off
head(x)
Building a neural network
Now we are ready to define our model architecture. We use a simple three-layer network with tanh nonlinearities and a sigmoid output. We let keras set aside a 25% validation split to check for possible overfitting and fit the model using RMSprop.
# install iml from github
set.seed(42)
library("iml")
library("keras")

# a small three-layer network: two tanh hidden layers and a sigmoid output
model = keras_model_sequential() %>%
  layer_dense(units = 4, activation = 'tanh', input_shape = ncol(x)) %>%
  layer_dense(units = 3, activation = 'tanh') %>%
  layer_dense(units = 1, activation = 'sigmoid') %>%
  compile(loss = 'binary_crossentropy',
          optimizer = optimizer_rmsprop(),
          metrics = c('accuracy'))

# train with a 25% validation split to watch for overfitting
model %>% fit(x = x, y = y, epochs = 10, batch_size = 5000, validation_split = 0.25)
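Before turning to explainability, a quick look at the fitted model’s predictions can serve as a baseline for the permutation-based importance we compute next. This is a minimal sanity check, not part of the original workflow:

# predicted probabilities of full repayment and accuracy at a 0.5 threshold
# (a quick sanity check, not part of the iml workflow)
probs = predict(model, x)
mean((probs > 0.5) == y)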
Now let’s take a look at feature importance. The iml package calculates feature importance using permutation-based error. The basic idea is to randomly scramble the feature columns one by one and then observe how much the model’s error increases. We use only a sample of 10,000 rows to speed things up. To get the feature importance and plot it, we simply create a Predictor object by passing the model and the x and y data to the constructor. Because iml works on data.frames and keras on matrices, we convert our x matrix to a data.frame. The keras support in the latest version of iml on github will take care of converting this back to a matrix.
Feature importance
# sample 10,000 rows to speed things up
samp = sample(nrow(x), 10000)
predictor = Predictor$new(model, data = as.data.frame(x[samp, ]), y = y[samp], type = "prob")
imp = FeatureImp$new(predictor, loss = "mae")  # permutation importance with MAE loss
plot(imp)
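To make the permutation idea concrete, here is a rough manual sketch of the calculation: shuffle one column at a time and report the ratio of the permuted error to the original error. This is a simplified illustration (a single permutation per feature, no repetitions), not iml’s actual implementation:

# Simplified sketch of permutation importance (illustration only, not iml's code):
# shuffle each feature column in turn and compare the mean absolute error
# to the error on the unshuffled data.
x_samp = x[samp, ]
y_samp = y[samp]
base_mae = mean(abs(y_samp - predict(model, x_samp)))

perm_importance = sapply(colnames(x_samp), function(col) {
  x_perm = x_samp
  x_perm[, col] = sample(x_perm[, col])   # scramble only this feature
  mean(abs(y_samp - predict(model, x_perm))) / base_mae
})
sort(perm_importance, decreasing = TRUE)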
We can immediately see that the most important feature for this model is the grade of the loan, which agrees with intuition: the grade is essentially a model of loan risk. Interestingly, the second most important feature is the duration of the loan, followed by the dummy encoded home ownership feature. While it is already quite enlightening to get a breakdown of how the individual features contribute to the model globally, this does not tell us why the model makes a particular prediction for a single observation.
Shapley values
To explain a single observation with a local decomposition, iml offers Shapley values, a game theoretic approach that divvies up the feature contributions to the predicted outcome by running simulations with scrambled features. We take a look at the first example in our data.
# local Shapley decomposition for the first loan in the data
shapley = Shapley$new(predictor, x.interest = as.data.frame(x)[1, ])
plot(shapley)
This loan was actually paid back in full, and our model assigns a relatively high score of 0.71 to this example. The largest contributions to this high score come from the fact that this applicant has a mortgage and an annual income of 100k USD (encoded as 1 in our data preparation). The strongest factors causing our model to lower the predicted score are a loan amount of 30k (encoded as 3), and the fact that Lending Club assigned a grade of only D5 (encoded as 2).
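For intuition, the sampling idea behind these Shapley values can be sketched as follows: for one feature, repeatedly compare the model’s prediction when that feature (and a random subset of the other features) is taken from the observation of interest versus from a randomly drawn background row, and average the differences. The function below is a simplified, hypothetical illustration of this idea, not iml’s actual implementation:

# Rough sketch of the sampling behind Shapley values for a single feature
# (illustration only; iml's Shapley$new does this for all features at once)
shapley_one_feature = function(model, X, x_interest, feature, n_sim = 100) {
  contributions = replicate(n_sim, {
    z = X[sample(nrow(X), 1), ]        # random background observation
    perm = sample(colnames(X))         # random feature ordering
    before = perm[seq_len(which(perm == feature) - 1)]
    x_with = z
    x_with[c(before, feature)] = x_interest[c(before, feature)]
    x_without = z
    x_without[before] = x_interest[before]
    predict(model, rbind(x_with)) - predict(model, rbind(x_without))
  })
  mean(contributions)                  # estimated Shapley value for this feature
}

# e.g. estimated contribution of the loan grade to the first loan's prediction
shapley_one_feature(model, x[samp, ], x[1, ], "sub_grade")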
Even more enlightening than getting a breakdown of a successful prediction are misclassified examples. The next loan was actually charged off, but our model assigns it a high probability of being paid back. According to the Shapley plot, the largest contributor to this high score was that the loan was graded as A1 (sub_grade = 0.1) in our encoding. The only factor lowering the score was the high interest rate.
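The call that produces this second plot mirrors the one above; only the row changes. The row index below is a placeholder, since the text does not state which observation the misclassified loan is:

# Shapley decomposition for the misclassified loan; the row index is a
# placeholder, the text does not specify which observation this is
i = 2
shapley_missed = Shapley$new(predictor, x.interest = as.data.frame(x)[i, ])
plot(shapley_missed)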
Next steps
One problem with creating a matrix of the independent variables upfront and letting iml work on it is that we lose the grouping of dummy encoded factor columns. iml will scramble these independently (in our example: RENT and MORTGAGE), leading to nonsense combinations such as both columns being 0 or 1 for the same loan. In the next article in this series we will show how to deal with that.