An introduction to AI explainability: why should you trust your model?
Explainable artificial intelligence (XAI) is the process of understanding how and why a machine learning model makes its predictions. It can also help machine learning (ML) developers and data scientists to better understand and interpret a model’s behavior. A major challenge for machine learning systems is that their effectiveness in real-world applications is limited by the current inability of the machine to explain its decisions and actions to human users. There are many real-world applications of artificial intelligence: industries such as finance, legal, military, transportation, security, and medicine all make use of AI.
This blog post will provide a high-level overview of explainable artificial intelligence as a concept and summarize a few of the key techniques involved.
Explainable AI is used to describe an AI model, its expected impact, and potential biases. One reason explainability is important is that it is a subfield of responsible AI (along with fairness, privacy, and security), which is the practice of ensuring that AI benefits society and doesn’t reinforce unfair bias. Explainability is one way to support fairness: understanding the reasons behind a model’s predictions can help ensure models are treating all users fairly. Explainable machine learning models are important outside the context of fairness as well, for example to explain why a fraud detection model flagged a particular transaction as fraudulent. When using machine learning models in any automation or decision-making process, it is essential to understand why and how they are making predictions. The healthcare industry is starting to use AI systems for use cases such as diagnosing cancer and other illnesses. For medical diagnoses with life and death consequences, it is crucial for medical practitioners to understand the decision-making process of the models.
In data science and machine learning, transparency is becoming more important when designing and training models. AI explainability is a trending topic on Google, and for good reason. It is crucial for an organization and its stakeholders to build trust and confidence when putting AI models into production.
AI explainability is a broad concept that relates to a variety of other concepts depending on who is involved. For example, an executive who is looking at AI investments may see explainability as involving transparency, accountability, or governance. Explainability is also related to, but not the same as, interpretability. Model interpretability refers to how easy it is for humans to understand the processes involved in model predictions; it has to do with how accurately a machine learning model can associate a cause with an effect. Explainability has to do with the ability of the parameters, often hidden in deep nets, to justify the result. Some models, like logistic regressions, are relatively straightforward and therefore highly interpretable. As more features are added, or as more complicated machine learning models such as deep learning are used, interpretability becomes more difficult. Explainability relates to being able to understand and explain these more complicated models.
“Black box” models vs. interpretable models
When discussing AI models and explainability, it is common to hear the models referred to as “black box” models. But what is a “black box” model?
When it comes to machine learning models, more complex problems require complex algorithms. Typically, complex algorithms are not as interpretable. A complex model is built on non-linear, non-smooth relationships, and requires long computation times. A highly interpretable model is built on linear and smooth relationships and is easy to compute.
Deep neural networks can have millions of parameters, far more than a human can reason about directly. Therefore, to understand those types of machine learning models there are two options:
1) Build interpretable models (a.k.a. “model based”)
2) Derive explanations of complex models (a.k.a. “post-hoc”)
Interpretable models are also known as “Glassbox” models, because you can look inside the model, know what is going on, and understand model performance. Examples of interpretable models are decision trees and logistic regressions (linear classifiers).
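To make the “Glassbox” idea concrete, here is a minimal sketch of why a logistic regression is easy to read: each feature contributes its weight times its value to the log-odds, so a prediction can be broken down term by term. The feature names, weights, and bias below are hypothetical values chosen for illustration, not the output of a trained model.

```python
import math

# Hypothetical weights for a toy fraud-scoring logistic regression (illustrative only).
WEIGHTS = {"amount_zscore": 1.2, "foreign_merchant": 0.8, "account_age_years": -0.5}
BIAS = -1.0


def explain_prediction(features):
    """Return the predicted probability and each feature's additive contribution to the log-odds."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    log_odds = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions


probability, contributions = explain_prediction(
    {"amount_zscore": 2.0, "foreign_merchant": 1.0, "account_age_years": 0.5}
)

# List the features from largest to smallest absolute contribution.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.2f}")
print(f"probability of fraud: {probability:.3f}")
```

Because the explanation is just the model’s own arithmetic, no separate explainer is needed; that is what distinguishes this option from the post-hoc one.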
For the “post-hoc” option, there are two approaches: the black-box approach and the white-box approach.
With the white-box approach, we have access to the model internals, meaning that we can access gradients or weights of a neural network.
With the black-box approach, we don’t know anything about the model. We only use the relationship between inputs and outputs (the response function) to derive explanations.
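The difference between the two approaches can be sketched on a toy model. In the snippet below, the white-box explainer reads the weights to compute an exact gradient, while the black-box explainer only calls the response function and estimates the same sensitivities with finite differences. The model, its weights, and the function names are all assumptions made for this illustration.

```python
import math

WEIGHTS = [0.9, -0.4]  # hypothetical model internals, visible only to the white-box explainer
BIAS = 0.1


def predict(x):
    """The response function: the only thing a black-box explainer may call."""
    z = BIAS + sum(w * v for w, v in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def white_box_sensitivity(x):
    """Uses the weights directly: d sigmoid(z) / d x_i = p * (1 - p) * w_i."""
    p = predict(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def black_box_sensitivity(predict_fn, x, eps=1e-5):
    """Central finite differences, using only inputs and outputs."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((predict_fn(up) - predict_fn(down)) / (2 * eps))
    return grads


x = [1.0, 2.0]
white = white_box_sensitivity(x)
black = black_box_sensitivity(predict, x)
print("white-box:", white)
print("black-box:", black)
```

On a model this small the two answers agree closely; the practical difference is that the black-box version works on any model that exposes a prediction function, at the cost of extra model calls.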
The advancement of AI and machine learning has led to the development of new and creative ways to use data. Many machine learning algorithms have become so complex that they are incomprehensible even to AI experts, which is why they are often referred to as “black box” models. The field of explainable AI also draws on psychology to study what makes a good model explanation and which explanation types work best for humans.
How does explainable artificial intelligence work?
Some of the main questions that explainable models seek to answer are:
- What are the most important features?
- How can you explain a single prediction?
- How can you explain the whole model?
The image below depicts the flow of an explainable model
XAI is a concept, but it is also a set of best practices, design principles, and techniques/methodologies.
Best practices: XAI leverages some of the best rules that data scientists use to help others understand how models are trained. Knowing how a model was trained, and on what training dataset, helps users understand when it makes sense to use a model.
Design principles: Machine learning engineers increasingly focus on simplifying how AI systems are built so that they are inherently easier to understand.
Methodology: A variety of explainable AI methods have been developed. A detailed treatment is beyond the scope of this article, but below is a list of common techniques:
- Feature Importance: This technique focuses on which features have the biggest impact on predictions. There are several ways to compute feature importance such as permutation importance, which is fast to compute and widely used.
- Individual Conditional Expectation (ICE): ICE is a plot that shows how a change in an individual feature changes the outcome of each individual prediction (one line per prediction).
- Partial Dependence Plots (PDP): Similar to ICE, PDPs show how a feature affects predictions, but averaged over all instances rather than drawn as one line per prediction. They can also plot the joint effects of 2 features on the output.
- Shapley Values (SHAP Values): Shapley values are used to break down a single prediction. SHAP (SHapley Additive exPlanations) values show the impact of having a certain value for a given feature in comparison to the prediction we would make if that feature took some baseline value.
- Approximation (Surrogate) Models: With this technique, you train a black-box model, and then train a surrogate model which explains the predictions of the black-box model. The surrogate model is interpretable and acts as the explainer model. You keep the original training inputs and use the predictions made by the black-box algorithm as the targets for the surrogate.
- Local Interpretable Model-agnostic Explanations (LIME): LIME is a technique, available as a Python library, that tries to solve for model interpretability by producing locally faithful explanations. Instead of training one interpretable model to approximate the whole black-box model, LIME focuses on training local explainable models to explain individual predictions. We want the explanation to reflect the behavior of the model “around” the instance that we predict. This is called “local fidelity”.
Learn more about decrypting your machine learning model using LIME in this tutorial.
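As a rough illustration of three of the techniques listed above, the sketch below implements permutation importance, ICE curves, and a partial dependence curve using only the Python standard library. The `black_box` function is a hypothetical stand-in for a trained model, and the data is synthetic; in practice you would typically rely on a dedicated library rather than this hand-rolled version.

```python
import random


def black_box(row):
    """Hypothetical stand-in for a trained model: strong on x0, weak on x1, ignores x2."""
    return 3.0 * row[0] + 0.5 * row[1] ** 2


def mean_squared_error(model, X, y):
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(X)


def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Importance of feature j = average increase in error after shuffling column j."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            increases.append(mean_squared_error(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def ice_curves(model, X, j, grid):
    """One curve per row: the prediction as feature j is swept over the grid."""
    return [[model(r[:j] + [g] + r[j + 1:]) for g in grid] for r in X]


def partial_dependence(model, X, j, grid):
    """The partial dependence curve is the pointwise average of the ICE curves."""
    curves = ice_curves(model, X, j, grid)
    return [sum(c[k] for c in curves) / len(curves) for k in range(len(grid))]


# Synthetic data: 200 rows, 3 features drawn uniformly from [-1, 1].
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [black_box(r) for r in X]  # noise-free targets keep the sketch simple

importances = permutation_importance(black_box, X, y)
pdp_x0 = partial_dependence(black_box, X, 0, grid=[-1.0, 0.0, 1.0])
print("permutation importances:", importances)
print("partial dependence of x0:", pdp_x0)
```

All three functions only call the model on inputs and read its outputs, so they fall under the black-box, post-hoc approach described earlier. Note that the feature the toy model ignores gets an importance of zero, since shuffling it never changes a prediction.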
When it comes to machine learning-powered predictive analytics, it is important for users and decision-makers alike to understand how and why models are making decisions. It is crucial for an organization to have a full understanding of the AI decision-making processes and not to trust black-box models blindly. Explainable AI can help humans understand and explain machine learning algorithms, deep learning, and neural networks. It is one of the key requirements for implementing responsible AI.