January 17th, 2019

What is Your AI Thinking? Part 1

Category: Data Science, Driverless AI, Explainable AI, Financial Services, Machine Learning Interpretability

Explaining AI to the Business Person

Explainable AI is in the news, and for good reason. Financial services companies have cited the difficulty of explaining AI-based decisions as one of the critical roadblocks to further adoption of AI in their industry. Moreover, interpretability, fairness, and transparency of data-driven decision support systems, whether based on AI and machine learning or on more traditional statistical or rule-based approaches, are serious regulatory mandates in banking, insurance, healthcare, and other industries. Some of the major regulatory statutes potentially governing these industries' use of AI include the Civil Rights Acts of 1964 and 1991, the Americans with Disabilities Act, the Genetic Information Nondiscrimination Act, the Health Insurance Portability and Accountability Act, the Equal Credit Opportunity Act, the Fair Credit Reporting Act, the Fair Housing Act, Federal Reserve SR 11-7, and European Union (EU) General Data Protection Regulation (GDPR) Article 22. So, if you are a decision maker using or planning to use AI in your company, understanding how to create trust in AI is going to be crucial to your success. This post provides a high-level overview of the new field of interpretable machine learning and some of the most promising related techniques.

Gentle Introduction to Interpretable Machine Learning

As a business person using machine learning, you need to understand a few fundamental concepts in order to have intelligent discussions with executives and to make smart choices when using machine learning models to make decisions.

First of all, let's clarify the terms "AI" and "machine learning". AI is the broader field, encompassing the study and practice of enabling computers to have human-level (or better) intelligence. Machine learning is currently the most practical and popular subfield of AI, in which computer systems learn from past data how to make decisions about questions such as whether someone will pay their credit card bill or what diagnosis a medical patient should receive. The fad today is to use "AI" and "machine learning" somewhat interchangeably, and that's what we will do in the rest of this post.

Second, there is an entire field of research dedicated to machine learning interpretability, and "interpretability" itself is a loosely defined (or over-defined) umbrella term that encompasses at least:

  • Directly transparent “white-box” models
  • Explanation of “black-box” models to enhance transparency
  • Debugging models to increase trust
  • Ensuring fairness in algorithmic decision-making
  • Model documentation

Third, there is plenty of real work already happening in this field. There are research areas known as FAT (Fairness, Accountability, and Transparency) and XAI (explainable AI), and numerous other outlets surfacing applicable technologies, such as the 2018 AI in Financial Services workshop at the NeurIPS conference. Depending on your industry, you will likely hear or see other terms, conferences, workshops, or white papers on similar topics. Generally speaking, researchers and product vendors are already working on the problem of how to build understanding and trust in AI and machine learning systems. Bottom line: a lot of brilliant people are thinking about this today, and some of their work is already usable! (If you are reasonably technical and interested in this topic in more depth, I suggest reading An Introduction to Machine Learning Interpretability: An Applied Perspective on Fairness, Accountability, Transparency, and Explainable AI by Patrick Hall and Navdeep Gill, published by O'Reilly Media.)

Some Techniques

To actually get started with machine learning interpretability, I would begin by understanding the goal. You want to use AI and machine learning models because they will help you make more objective, data-driven, higher-value decisions that can help your business. You need to trust that these models are making the right choices because you, and people like you, are ultimately accountable for the decisions. Generally speaking, business people feel they can trust a machine learning model when they have an understanding of why that model is making a particular prediction and when it behaves as expected in realistic test or simulation scenarios. When a model deviates from our knowledge of our business, we also want to understand why. Below are some of the best-known techniques you can use to develop understanding and trust in AI so that you can scale your AI adoption.

General Approaches

Exploratory Data Analysis (EDA, i.e. “Know Thy Data”) – The mantra of garbage in, garbage out still applies. The model learns from the data you put into it. The better you know what you are feeding your model, the more likely you are to understand and make sense of the outputs. You can use a host of visualization tools to gain a better understanding of your data. The best ones will highlight those areas of the data that could cause problems. These problem areas include incomplete data, distant outliers, missing values, strong correlations between variables, and other data quality issues.
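
To make this concrete, here is a minimal sketch of such checks, assuming your training data sits in a pandas DataFrame (the column names and values below are made up for illustration). It surfaces missing values, strong correlations, and extreme outliers:

```python
import numpy as np
import pandas as pd

# Hypothetical training data; in practice, load your own table here.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "savings_balance": rng.lognormal(8, 1, 1000),
    "credit_history_months": rng.integers(0, 240, 1000),
    "missed_payment": rng.integers(0, 2, 1000),
})
df.loc[::50, "savings_balance"] = np.nan  # inject some missing values

# Basic profile: ranges, means, and obvious anomalies
print(df.describe())

# Missing values per column
print(df.isna().sum())

# Strong correlations between input variables
print(df.corr())

# Rough outlier check: values far beyond the interquartile range
q1, q3 = df["savings_balance"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[df["savings_balance"] > q3 + 3 * iqr]
print(f"{len(outliers)} extreme savings balances")
```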

Accurate and Interpretable Models – In general, the more directly interpretable your machine learning model is to begin with, the less explanation, compliance, documentation, and potential regulation headaches you will have in the future. Some interpretable modeling techniques are oldies but goodies, like statistical regression models and decision trees. These can be great to get started with! For those looking for more cutting-edge and accurate interpretable models, you might want to check out techniques with names like scalable Bayesian rule lists or monotonic gradient boosting machines.
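
As a hedged illustration of the latter, scikit-learn's histogram-based gradient boosting accepts monotonic constraints, which lets you encode domain knowledge such as "a larger savings balance should never lower the predicted probability of repayment." The data and column meanings below are invented for the sketch:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Hypothetical credit data: two inputs and a repayment label.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.lognormal(8, 1, 5000),      # savings balance
    rng.integers(0, 240, 5000),     # credit history in months
])
y = (X[:, 0] > np.median(X[:, 0])).astype(int)  # toy target for the sketch

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# monotonic_cst: +1 forces a non-decreasing relationship for that feature,
# -1 forces non-increasing, 0 leaves it unconstrained.
model = HistGradientBoostingClassifier(monotonic_cst=[1, 1], random_state=0)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

The tradeoff is that a constrained model may fit the data slightly less closely, but its behavior is far easier to defend to a regulator or business stakeholder.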

Global Explanations

A complex machine learning model is like a multidimensional topographic map with highs and lows where different variables have different influence at various locations. Global explanatory techniques help you get a 30,000 ft. perspective on that landscape.

Variable Importance Chart – This allows you to see which variables were the most important across all of the model's predictions taken as a whole. You may be surprised by which variables had the most impact, but at least you can see what those are and compare them against your domain knowledge and how you would make the decision. If you find there are only a few variables that really matter, that could also help you build a simpler model with fewer inputs, which is typically faster for production use cases and easier to understand.
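
One common way to produce such a chart is permutation importance: shuffle one input at a time and measure how much the model's performance drops. A minimal sketch on toy data, using scikit-learn, might look like this:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy data standing in for your business problem.
X, y = make_classification(n_samples=2000, n_features=6,
                           n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does shuffling each input hurt the model?
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature_{i}: {result.importances_mean[i]:.3f}")
```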

Partial Dependence Chart – Once you know which variables are the main drivers of the business problem you're modeling, partial dependence charts can show you how those variables behave inside the model. Partial dependence plots show the model's average prediction across the range of values of an input variable, averaging out the effects of the other variables. For instance, these charts would allow you to verify that, according to your model, as customers' savings account balances increase, their overall probability of paying their credit card bill also increases.
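
A rough sketch of how such a chart can be produced, here with scikit-learn's partial dependence utilities on toy data standing in for a credit portfolio:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import PartialDependenceDisplay

# Toy stand-in for credit data; feature 0 plays the role of savings balance.
X, y = make_classification(n_samples=2000, n_features=5,
                           n_informative=3, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Average model prediction as feature 0 sweeps across its range.
PartialDependenceDisplay.from_estimator(model, X, features=[0])
plt.savefig("partial_dependence_feature0.png")  # or plt.show() interactively
```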

Surrogate Tree Models – When you can’t understand the sophisticated version of a model, a simple version can help explain what is going on. Decision trees are particularly suitable for this as they show the splits and decision points in a way that is easy for people to understand, like a flowchart. For example, how does the model decide what drug dosage to provide? First, it looks at the person’s age, then their gender, and then at related conditions. This simple representation helps model stakeholders and users understand the fundamental decisions the model is making and this flowchart view can help you understand the interactions between important variables found by your machine learning model. (These interactions are important insights too and might be really hard for people to find on their own).
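
A simple way to build such a surrogate, sketched below on toy data, is to fit a shallow decision tree to the black-box model's own predictions and then print the resulting flowchart of splits:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor, export_text

# The "black-box" model we want to summarize.
X, y = make_classification(n_samples=2000, n_features=5,
                           n_informative=3, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Surrogate: a shallow tree fit to the black box's predicted probabilities.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict_proba(X)[:, 1])

# Human-readable flowchart of the surrogate's splits.
print(export_text(surrogate,
                  feature_names=[f"feature_{i}" for i in range(5)]))
```

Keep the surrogate shallow; its job is to summarize the black box in human terms, not to replace it.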

Local Explanations and Reason Codes

The next tool helps you zoom into a local area of your data and model and really get a handle on what the model is doing for an individual or a small group. Your goal is to understand why a model provided any single prediction. In particular, you want to understand the factors that influenced the decision.

Many new explanation techniques focus on understanding small areas of the data and their corresponding predictions. The methods you should look for have names like LIME and Shapley values (SHAP). The vital thing to know is that these techniques will help you, or your data team, explain what is going on in a small area of your input data and model outputs, and will help generate the information you need for trustworthy real-time decision making. LIME, for example, zooms in on a small area of the model and describes that area with a simple trend line, which is relatively easy to understand. This provides a simple view of which variables had the most significant influence on that area of the model. In practice, this information is often used to create reason codes.
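
As a hedged sketch of what a local explanation looks like in code, the snippet below uses the open-source shap package (assuming it is installed; its API details can vary between versions) to list each variable's contribution to a single prediction from a toy model:

```python
import shap  # open-source SHAP package; pip install shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Toy model standing in for a credit-decision model.
X, y = make_classification(n_samples=2000, n_features=5,
                           n_informative=3, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Local explanation: per-variable contribution to one individual's prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])  # explain the first individual
for name, contribution in zip([f"feature_{i}" for i in range(5)],
                              shap_values[0]):
    print(f"{name}: {contribution:+.3f}")
```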

Reason Codes – When an AI system provides a predictive score based on the model, it should also offer individual reason codes. Reason codes are required in industries like financial services when providing credit decisions and can be very helpful to clinicians when using AI-driven diagnostic assistants. These automated data points show how the key variables influenced the model’s prediction for just a single individual. For example, when deciding to decline a new credit card for one specific consumer, the reason codes could show that missing a recent payment, short credit history, and low credit score were the critical factors in the decision.
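
Exact reason-code formats vary by product and regulation, but the underlying idea is often just sorting the local contributions for one individual and reporting the variables that pushed hardest toward the adverse decision. A minimal, hypothetical sketch:

```python
import numpy as np

# Hypothetical per-variable contributions for one declined applicant,
# e.g. taken from a SHAP or similar local explanation (negative values
# pushed the prediction toward "decline").
feature_names = ["recent_missed_payment", "credit_history_length",
                 "credit_score", "savings_balance", "income"]
contributions = np.array([-0.42, -0.31, -0.28, 0.05, 0.12])

# Reason codes: the variables that pushed hardest toward the adverse decision.
order = np.argsort(contributions)  # most negative first
top_reasons = [feature_names[i] for i in order[:3] if contributions[i] < 0]
print("Reason codes:", top_reasons)
```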

Up until this point, we have covered an introduction to interpretable AI and machine learning as well as some of the best-known techniques you can use to explain AI systems. In part 2 of this blog series, we explore more techniques for enhancing trust in AI and machine learning models and systems.

Continue reading part 2 of this blog series here.
