July 8th, 2019

Toward AutoML for Regulated Industry with H2O Driverless AI

Category: AutoML, Data Science, Driverless AI, Explainable AI, Machine Learning Interpretability

Predictive models in financial services must comply with a complex regime of regulations including the Equal Credit Opportunity Act (ECOA), the Fair Credit Reporting Act (FCRA), and the Federal Reserve’s SR 11-7 Guidance on Model Risk Management. Among many other requirements, these and other applicable regulations stipulate that predictive models must be interpretable, exhibit minimal disparate impact, and be carefully documented and monitored. Can the productivity and accuracy advantages of new automatic machine learning (AutoML) predictive modeling technologies be leveraged in these highly regulated spaces? While H2O cannot provide compliance advice, we think the answer is likely “yes”. Here are some quick pointers on how you could get started using H2O Driverless AI.

Interpretable and Constrained Models

H2O Driverless AI is an AutoML system that visualizes data, engineers features, trains models, and explains models all with minimal user input. AutoML systems are great for boosting productivity and maximizing accuracy because they try exhaustive sets of features, modeling algorithms, and hyperparameters that human data scientists usually just don’t have time to consider. Because of their trial-and-error abilities, AutoML systems can create extremely complex models, and all that complexity can be a barrier to interpretability. Fortunately, Driverless AI provides a number of ways for users to take back control and train interpretable models.

Constraining Interactions and Feature Engineering

Figure 1 displays main system settings that, when combined with expert settings like those in Figure 2, will train a monotonic and reproducible model. Reproducibility, often a fundamentally necessary quality of regulated models, is ensured by clicking the REPRODUCIBLE button highlighted in Figure 1. The INTERPRETABILITY knob, also highlighted in Figure 1, is the main control for the simplicity or complexity of automatic feature engineering in Driverless AI. Check out the most recent INTERPRETABILITY documentation to see exactly how INTERPRETABILITY settings affect feature engineering. Also, when the INTERPRETABILITY knob is set to 7 or higher, monotonicity constraints are used in XGBoost. Monotonicity means that as an input feature’s value increases, the model output moves in only one direction: either it can only increase, or it can only decrease. Monotonicity is often a desired property in regulated models, and it is a powerful constraint for making models more interpretable in general.
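To make the definition of monotonicity concrete, here is a minimal sketch of a spot check: it sweeps one feature over a grid while holding the rest of a row fixed and confirms the score never moves in the wrong direction. The `predict` functions and feature names are hypothetical stand-ins, not part of the Driverless AI API.

```python
from typing import Callable, Dict, Sequence

Row = Dict[str, float]


def is_monotonic(predict: Callable[[Row], float], row: Row, feature: str,
                 grid: Sequence[float], increasing: bool = True) -> bool:
    """Return True if predict() never moves in the wrong direction as
    `feature` sweeps over the sorted grid, with all other inputs held fixed."""
    scores = [predict({**row, feature: value}) for value in sorted(grid)]
    pairs = zip(scores, scores[1:])
    if increasing:
        return all(b >= a for a, b in pairs)
    return all(b <= a for a, b in pairs)


# Hypothetical toy scorers: risk rises steadily with debt ratio in the first,
# while the second dips and then rises, which violates the constraint.
def monotone_model(r: Row) -> float:
    return 0.1 + 0.5 * r["debt_ratio"]


def non_monotone_model(r: Row) -> float:
    return (r["debt_ratio"] - 0.5) ** 2


applicant = {"debt_ratio": 0.3, "age": 40.0}
grid = [i / 10 for i in range(11)]
monotone_ok = is_monotonic(monotone_model, applicant, "debt_ratio", grid)
non_monotone_ok = is_monotonic(non_monotone_model, applicant, "debt_ratio", grid)
```

A check like this on a handful of rows is only a sanity test; a constraint enforced during training, as described above, holds for every input rather than just the rows you happened to try.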

The automated feature engineering process in Driverless AI can create many different types of complex features. Currently, the system can try dozens of feature transformations on the original data to increase model accuracy. If any of these transformations seem objectionable, you can turn them off (or “blacklist” them) using the expert settings menu or the system configuration file. Additional expert settings, such as those related to the Driverless AI Feature Brain and Interaction Depth hyperparameters, can prevent the system from using any complex features from past models and can restrict the number of original features combined to create any new feature.
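To see why capping interaction depth matters for interpretability, the sketch below counts how many feature combinations become candidates as the depth limit grows. It is purely illustrative: the column names are hypothetical, and this is not how Driverless AI enumerates features internally.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def candidate_interactions(features: Sequence[str],
                           max_depth: int) -> List[Tuple[str, ...]]:
    """List every combination of 2..max_depth original features."""
    combos: List[Tuple[str, ...]] = []
    for depth in range(2, max_depth + 1):
        combos.extend(combinations(features, depth))
    return combos


features = ["income", "debt", "age", "tenure"]  # hypothetical column names
depth_2 = candidate_interactions(features, max_depth=2)
depth_3 = candidate_interactions(features, max_depth=3)
print(len(depth_2), len(depth_3))  # prints "6 10"
```

Even with four columns, raising the limit from two to three adds four more candidate features, each of which a reviewer would need to be able to explain.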

Figure 1: Highlighted main system settings that enable monotonicity and reproducibility.

Interpretable Models in Driverless AI

Driverless AI offers several types of interpretable models, including generalized linear models (GLMs) and higher-capacity, but directly interpretable, models such as RuleFit and monotonic gradient boosting machines (GBMs). By default, Driverless AI will use an ensemble of several types of models, which is likely undesirable from an interpretability perspective. However, you can use the expert settings to manually enable and disable model types and set the number of models included in any ensembles. In Figure 2, all model types are disabled except XGBoost GBMs, and the ensemble level hyperparameter is set so that only a single model is trained, with no ensembling. It would also be possible to disable all models except one linear model or one RuleFit model using settings similar to those displayed in Figure 2.

Figure 2: Highlighted expert settings that would enable the training of a single XGBoost GBM.

Compliant Mode

Driverless AI also offers a one-click compliant mode setting. Compliant mode switches on numerous interpretability settings such as using only single, interpretable models and severely restricting feature engineering and feature interactions. For more details, see the most recent pipeline mode documentation.

Disparate Impact Testing

Assuring minimal disparate impact is another typical aspect of regulated predictive modeling. Upcoming versions of Driverless AI will enable disparate impact testing with numerous disparity formulas and user-defined disparity thresholds. Figure 3 displays several metrics for a straightforward analysis of disparate impact across genders. Because disparate impact testing is integrated with modeling in Driverless AI, users can also select the model with the least disparate impact from numerous alternative models for deployment. Like most features in Driverless AI, disparate impact analysis can also be conducted and customized using the Python API.

Figure 3: Basic disparate impact testing across genders.
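As one example of a disparity formula paired with a user-defined threshold, here is a minimal sketch of the adverse impact ratio: the favorable-outcome rate of a protected group divided by that of a reference group. The decision data is made up, and the 0.8 threshold is an assumption that mirrors the common four-fifths rule of thumb rather than a Driverless AI default.

```python
from typing import Sequence


def selection_rate(decisions: Sequence[int]) -> float:
    """Fraction of favorable outcomes (1 = approved, 0 = denied)."""
    return sum(decisions) / len(decisions)


def adverse_impact_ratio(protected: Sequence[int],
                         reference: Sequence[int]) -> float:
    """Selection rate of the protected group divided by that of the reference group."""
    return selection_rate(protected) / selection_rate(reference)


# Hypothetical model decisions for two groups.
reference_group = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]  # 8 of 10 approved
protected_group = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]  # 6 of 10 approved

air = adverse_impact_ratio(protected_group, reference_group)
threshold = 0.8  # user-defined; 0.8 mirrors the four-fifths rule of thumb
flagged = air < threshold  # True here, since 0.6 / 0.8 is about 0.75
```

Computing the same ratio for several candidate models is one simple way to compare them on disparate impact before choosing which to deploy.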

Generating Adverse Action Notices

Adverse action notices are a set of possible reasons that explain why a lender or employer (or a few other types of regulated organizations) has taken negative action against an applicant or customer. If machine learning is used for specific employment or lending purposes in the U.S., it must be able to generate adverse action notices for any prediction, and those adverse action notices must be specific to a given applicant or customer. Driverless AI provides the raw data for generating customer- or applicant-specific adverse action notices. Specific information for each model decision is provided through several techniques, including leave-one-covariate-out (LOCO) feature importance, local interpretable model-agnostic explanations (LIME), individual conditional expectation (ICE), and Shapley additive explanations (SHAP).
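Of the techniques listed above, LOCO is the simplest to sketch. The version below is a simplified variant that assumes "leaving out" a feature means replacing it with a baseline value and re-scoring; the model, feature names, and baseline are hypothetical, and the Driverless AI implementation may differ.

```python
from typing import Callable, Dict

Row = Dict[str, float]


def loco_contributions(predict: Callable[[Row], float], row: Row,
                       baseline: Row) -> Dict[str, float]:
    """For each feature, score the row with that feature replaced by its
    baseline value and report how much the original prediction changes."""
    full_score = predict(row)
    contributions = {}
    for name in row:
        without = {**row, name: baseline[name]}
        contributions[name] = full_score - predict(without)
    return contributions


def toy_model(r: Row) -> float:  # hypothetical linear risk scorer
    return 0.2 + 0.6 * r["utilization"] + 0.05 * r["late_payments"]


baseline = {"utilization": 0.3, "late_payments": 0.0}  # e.g. training averages
applicant = {"utilization": 0.9, "late_payments": 2.0}
contribs = loco_contributions(toy_model, applicant, baseline)
```

For this applicant, high utilization accounts for most of the elevated score, which is exactly the kind of row-level detail an adverse action notice needs.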

Figure 4: Locally-accurate Shapley contributions which can be used to rank the features that led to any model outcome.

Figure 4 displays highly accurate Tree SHAP values for a customer at high risk of default in a sample data set. The grey bars are the drivers of the model decision for this specific individual and the green bars are the overall importance of the corresponding feature. These values are available for inspection in a dashboard or in a spreadsheet and are also available when scoring new, unseen data using the Driverless AI Python API and Python scoring package.
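Once per-applicant contributions like these are exported, turning them into ranked reasons is mostly a sorting exercise. The sketch below assumes a positive contribution pushes the prediction toward the adverse outcome; the feature names and values are hypothetical, not taken from Figure 4.

```python
from typing import Dict, List


def top_reason_codes(contributions: Dict[str, float], n: int = 3) -> List[str]:
    """Return up to n features that pushed this prediction hardest toward the
    adverse outcome (positive contribution = higher predicted risk)."""
    adverse = [(name, value) for name, value in contributions.items() if value > 0]
    adverse.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in adverse[:n]]


# Hypothetical per-applicant contributions, e.g. from a Shapley export.
applicant_contribs = {
    "most_recent_payment_status": 0.21,
    "credit_limit": -0.04,
    "utilization": 0.12,
    "age": 0.01,
    "months_on_book": -0.08,
}
reasons = top_reason_codes(applicant_contribs, n=2)
# reasons == ["most_recent_payment_status", "utilization"]
```

Mapping each feature name to approved notice language, and deciding which features may appear in a notice at all, remains a compliance decision outside the scope of this sketch.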

Model Documentation

In an effort to simplify model documentation, Driverless AI creates numerous text and graphical artifacts automatically with every model it trains. The text and charts are grouped into two main aspects of the software, AutoDoc and the machine learning interpretability (MLI) module.


AutoDoc

As its name suggests, AutoDoc automatically records valuable information for every model trained in Driverless AI. As displayed in Figure 5, recorded information currently includes data dictionaries, methodologies, alternative models, partial dependence plots, and more. AutoDoc is currently available in Word format so that you can either edit the generated document directly or copy and paste the pieces you need into your model documentation template.


Figure 5: Table of contents for the automatically generated report that accompanies each Driverless AI model.

Machine Learning Interpretability (MLI)

The MLI module creates several charts and tables that are often necessary for the documentation of newer machine learning models, such as ICE, LIME, and surrogate decision trees. Figure 6 is a cross-validated surrogate decision tree which forms an accurate and stable summary flowchart of a more complex Driverless AI model. All information from the MLI dashboard is available as static PNG images or Excel spreadsheets for easy incorporation into your model documentation template.

Figure 6: An approximate overall flowchart of a Driverless AI model constructed with a surrogate decision tree.

Model Monitoring

Currently, Driverless AI offers standalone Python and Java packages for scoring new data in real time with your selected model. These scoring pipelines can be used from REST endpoints, in common cloud deployment architectures like AWS Lambda, or incorporated into your own custom applications. Today, the Java scoring pipeline, known as a model-optimized Java object (MOJO), allows for monitoring of scoring latency, analytical errors during scoring, and data drift when deployed on the MOJO REST Server.

In upcoming versions of Driverless AI and H2O, we’re focusing on more robust model monitoring capabilities that will capture all relevant model metrics and metadata in real-time and generate alerts based on drift from training measurements. These planned features will also allow for model accuracy degradation monitoring once the actual labels are received so that model retraining can be triggered automatically based on model performance.
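To illustrate what an alert based on drift from training measurements might look like, here is a minimal sketch using the population stability index (PSI), a common drift statistic. PSI is our choice for the example rather than a statement about what Driverless AI computes, and the bin edges, data, and 0.25 alert threshold are all assumptions.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values in each bin defined by the interior cut points `edges`."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[sum(v > e for e in edges)] += 1
    return [c / len(values) for c in counts]


def population_stability_index(train: Sequence[float], live: Sequence[float],
                               edges: Sequence[float], eps: float = 1e-6) -> float:
    """PSI = sum over bins of (live - train) * ln(live / train).
    eps keeps the logarithm finite when a bin is empty."""
    expected = bin_fractions(train, edges)
    actual = bin_fractions(live, edges)
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))


edges = [0.25, 0.5, 0.75]
train_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
shifted_scores = [0.6, 0.7, 0.8, 0.9, 0.8, 0.9, 0.7, 0.95]

psi_same = population_stability_index(train_scores, train_scores, edges)
psi_shifted = population_stability_index(train_scores, shifted_scores, edges)
alert = psi_shifted > 0.25  # user-defined threshold; 0.25 is a common rule of thumb
```

The same comparison can be run per input feature as well as on model scores, so that an alert points to which input has moved away from its training distribution.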


H2O Driverless AI offers cutting-edge automated machine learning with features for adverse action reporting, disparate impact testing, automated model documentation, and model monitoring. With help from our customers and community, H2O is committed to further development of functionality for the responsible and transparent use of automated machine learning.

About the Authors

Patrick Hall

Patrick Hall is a senior director for data science products at H2O.ai where he focuses mainly on model interpretability. Patrick is also currently an adjunct professor in the Department of Decision Sciences at George Washington University, where he teaches graduate classes in data mining and machine learning. Prior to joining H2O.ai, Patrick held global customer-facing roles and R&D roles at SAS Institute. He holds multiple patents in automated market segmentation using clustering and deep neural networks. Patrick was the 11th person worldwide to become a Cloudera-certified data scientist. He studied computational chemistry at the University of Illinois before graduating from the Institute for Advanced Analytics at North Carolina State University.

Navdeep Gill

Navdeep Gill is a Senior Data Scientist/Software Engineer at H2O.ai where he focuses mainly on machine learning interpretability and previously focused on GPU-accelerated machine learning, automated machine learning, and the core H2O-3 platform. Prior to joining H2O.ai, Navdeep worked at Cisco focusing on data science and software development. Before that, Navdeep was a researcher/analyst in several neuroscience labs at the following institutions: California State University, East Bay; University of California, San Francisco; and Smith Kettlewell Eye Research Institute. Navdeep graduated from California State University, East Bay with an M.S. in Computational Statistics, a B.S. in Statistics, and a B.A. in Psychology (minor in Mathematics).
