November 14th, 2019

Novel Ways To Use Driverless AI

Category: Driverless AI, Machine Learning Interpretability

I am biased when I write that Driverless AI is amazing, but what’s more amazing is how I see customers using it. As a Sales Engineer, my job has been to help our customers and prospects use our flagship product. In return, they give us valuable feedback and talk about how they used it. 

Feedback is gold to us. Driverless AI has evolved into its current iteration because of feedback. Customers and prospects tell us what they like and what they want to see in the product. A few GitHub requests later, it shows up in the product within a few weeks (Makers Gonna Make, thanks Dev team)!

Relative Feature Importance Chart

One of the big items in Driverless AI is its feature engineering. Our ‘secret sauce’ extracts more model performance when combined with model tuning and selection. You can see the list of new features in the lower middle part of the UI. This feature importance list is updated with every model iteration. If ensembling is on, the final list is the ensemble feature importance list. I had a prospect who already had a model in production and had identified its top 5 features. They loaded the same training data set into Driverless AI, hit run, and inspected the top 5 features that it created.

Not all of the production model's top 5 features appeared in the Driverless AI top 5; they ended up in the top 10. In their place were new interactions between the original top 5 features! Driverless AI also showed that the production model could be squeezed for more performance. The prospect was quite impressed!

Benchmarking

Some customers and prospects like to use Driverless AI as a benchmarking tool. They use it to come up with a model and feature pipeline with a specific performance metric (AUC, MAPE, etc.) and then try to beat it with their own code.

They spend time looking through the log and trace files to see how Driverless AI tuned the hyperparameters of a specific model. Then they look at the feature engineering and see what interactions it came up with. The last step is for them to use their domain expertise to code up their custom pipeline from scratch. Sometimes they beat Driverless AI, sometimes they don’t!
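For readers who want to picture the comparison step, here is a minimal sketch in plain Python. It assumes you already have held-out labels plus predictions from both the benchmark model and your own pipeline. The labels and scores below are made up for illustration and were not exported from Driverless AI, and the `auc` helper is a simple pairwise implementation rather than anything the product ships.

```python
def auc(y_true, y_score):
    """Area under the ROC curve, computed by comparing every
    positive example's score against every negative example's score."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up hold-out labels and predictions, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
benchmark_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]  # stand-in for the benchmark model
custom_scores = [0.20, 0.30, 0.60, 0.90, 0.10, 0.50]     # stand-in for a hand-built pipeline

benchmark_auc = auc(y_true, benchmark_scores)
custom_auc = auc(y_true, custom_scores)

print(f"benchmark AUC: {benchmark_auc:.3f}")
print(f"custom AUC:    {custom_auc:.3f}")
print("custom pipeline wins" if custom_auc > benchmark_auc else "benchmark holds")
```

The important part is that both models are scored on the same hold-out rows with the same metric; otherwise the "beat the benchmark" comparison is not apples to apples.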

MLI

It goes without saying that our MLI module is invaluable for model explainability. The dashboard is packed with Shapley, K-Lime, and Disparate Impact Analysis. It allows you to download a scoring pipeline so you can get the reason codes for each prediction. It’s a very complex but feature-rich module, and our customers love it.

One quiet superpower it has is the ability to take a training and prediction set from another black box model and run MLI on it! You can generate reason codes, use K-Lime or K-Sup, and look inside your model to make sure it makes sense!

The best part? The dashboard, scoring pipeline, and the ability to check other models outside of Driverless AI come with every licensed instance!

Want to learn more about Driverless AI capabilities?  Check out this free MLI tutorial.

AutoDoc

Some business managers and data scientists sit up in their chairs when I demo this feature. Their eyes light up and the questions start. “Can I add my own logos?” Yes. “Does it show the final ensemble model?” Yes. “Can this report be generated for every experiment?” Yes, if you want it to. 

AutoDoc is a hit because it saves time. A LOT OF TIME. More often than not, large companies need to document the models they have in production. This can be for many reasons (e.g., regulatory), and writing the docs can be a time-consuming process. Click on “Download Autoreport” after the Experiment finishes and 85% of your work is ready. The best part? We include AutoDoc in every Driverless AI license.

Scoring Pipelines

Last but not least, the scoring pipelines. Some prospects and customers love this feature. In the early days of Machine Learning, people wrestled with building and running models in production. While we still wrestle with this problem today, we’ve since ‘streamlined’ the process into pipelines. 

A pipeline is a chain of code steps. One step might munge or transform data into the same shape that the model was trained on. It then passes the data to another step that contains a tuned model for inferencing/scoring, and out come the predictions.
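To make the idea concrete, here is a minimal sketch in plain Python. It is not the Driverless AI scoring pipeline: the function names, the features (including `age_x_income`), and the weights are all hypothetical, picked only to show a transformation step feeding a scoring step.

```python
import math


def transform(row):
    """Munge a raw record into the shape the model was trained on.
    The feature names here are hypothetical."""
    age = float(row["age"])
    log_income = math.log1p(float(row["income"]))
    return {
        "age": age,
        "log_income": log_income,
        "age_x_income": age * log_income,  # a simple interaction feature
    }


# Hypothetical coefficients standing in for a tuned model.
WEIGHTS = {"age": 0.02, "log_income": 0.3, "age_x_income": -0.001}
BIAS = -3.0


def score(features):
    """Apply the stand-in model (a logistic function) to transformed features."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def pipeline(rows):
    """Run each raw record through the transform step, then the scoring step."""
    return [score(transform(row)) for row in rows]


# Raw scoring data goes in, predictions come out.
predictions = pipeline([
    {"age": "35", "income": "52000"},
    {"age": "61", "income": "18000"},
])
print(predictions)
```

The caller only ever sees `pipeline`: the transformations and the model travel together, which is what keeps scoring data in the same shape as the training data.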

You can, of course, write all this by hand (and data scientists often do), but it takes a great deal of time! With a click of a button, Driverless AI lets you generate a Python, C++, or Java pipeline. It takes care of all the feature transformations, model tuning, and selection. All you need to do is pass in your scoring data, and out come the predictions. If you couple it with the MLI scoring pipeline, you can also get reason codes along with your predictions.

End Notes

Over time, I’ve noticed that everyone who touches H2O Driverless AI gravitates to a specific feature of it. The main reason? It saves them time. Need to try new use cases? Give them to Driverless AI to run through while you work on other things. Or are you a startup that’s looking to be nimble and build your competitive advantage? Let Driverless AI build your models and generate your scoring pipelines.

The possibilities are endless. What will you Make today?

Want to give it a try? Check out the free 2-hour Driverless AI test drive with the tutorials.

About the Author

Thomas Ott

Tom is a Senior Customer Solutions Engineer for H2O.ai. He started his career in Engineering working for large transportation firms around the USA before transitioning to the startup world in 2014. He comes to H2O.ai from his own Data Science consultancy and RapidMiner. A gritty and seasoned Sales Engineer, he worked across Product Marketing and Community Development and learned how to solve complex customer problems. He's a natural born tinkerer that loves figuring things out and getting it done.
