June 21st, 2017

Scalable Automatic Machine Learning: Introducing H2O’s AutoML

Category: AutoML, Ensembles, H2O Release, Technical

Prepared by: Erin LeDell, Navdeep Gill & Ray Peck
In recent years, the demand for machine learning experts has outpaced the supply, despite the surge of people entering the field. To address this gap, there have been big strides in the development of user-friendly machine learning software that can be used by non-experts and experts alike. The first steps toward simplifying machine learning involved developing simple, unified interfaces to a variety of machine learning algorithms (e.g. H2O).
Although H2O has made it easy for non-experts to experiment with machine learning, a fair bit of knowledge and background in data science is still required to produce high-performing machine learning models. Deep Neural Networks in particular are notoriously difficult for a non-expert to tune properly. We have designed an easy-to-use interface that automates the process of training a large, diverse selection of candidate models and then training a stacked ensemble on the resulting models (which often leads to an even better model). Making its debut in the latest “Preview Release” of H2O, version 3.12.0.1 (aka “Vapnik”), we introduce H2O’s AutoML for Scalable Automatic Machine Learning.
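To make the stacking idea concrete, here is a minimal pure-Python sketch. It is not H2O's Stacked Ensemble implementation: it assumes just two base models and a one-parameter "metalearner" that picks the least-squares blend weight, and the predictions and function names are made up for illustration. In practice the metalearner is typically fit on held-out or cross-validated predictions rather than on the data the base models were trained on.

```python
def fit_blend_weight(preds_a, preds_b, targets):
    """Least-squares weight w for the blend w * a + (1 - w) * b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(preds_a, preds_b, targets))
    den = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    # If both models predict identically, any weight gives the same blend.
    return num / den if den else 0.5


def blend(preds_a, preds_b, w):
    """Combine two prediction lists with weight w on the first model."""
    return [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]


def mse(preds, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)


# Hypothetical held-out predictions from two base models.
targets = [1, 0, 1, 0]
model_a = [0.9, 0.4, 0.6, 0.1]
model_b = [0.6, 0.1, 0.9, 0.4]

w = fit_blend_weight(model_a, model_b, targets)
stacked = blend(model_a, model_b, w)
print(mse(model_a, targets), mse(model_b, targets), mse(stacked, targets))
```

The two base models make different mistakes here, so the blend has a lower error than either one alone, which is the intuition behind combining a diverse set of models.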
H2O’s AutoML can be used for automating a large part of the machine learning workflow, which includes automatic training and tuning of many models within a user-specified time limit. Instead of a fixed time constraint, the user can also choose a stopping criterion based on a performance metric. Stacked Ensembles will be automatically trained on the collection of individual models to produce a highly predictive ensemble model which, in most cases, will be the top-performing model in the AutoML Leaderboard.
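The two stopping rules described above can be sketched in plain Python. This is a conceptual illustration, not how H2O schedules its models: the function name, the `stopping_rounds` and `stopping_tolerance` arguments, and the injectable `clock` are assumptions made for this sketch, and the scores are invented.

```python
import time


def budgeted_search(candidates, evaluate, max_runtime_secs=None,
                    stopping_rounds=3, stopping_tolerance=1e-3,
                    clock=time.monotonic):
    """Evaluate candidates until the time budget or the metric rule ends the run.

    Returns (score, candidate) pairs sorted best-first (higher score is better).
    """
    start = clock()
    results = []
    best = float("-inf")
    rounds_without_gain = 0
    for cand in candidates:
        # Time-based stop: check the budget before starting another model.
        if max_runtime_secs is not None and clock() - start >= max_runtime_secs:
            break
        score = evaluate(cand)
        results.append((score, cand))
        # Metric-based stop: count consecutive models that do not beat the
        # best score so far by more than the tolerance.
        if score > best + stopping_tolerance:
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
        best = max(best, score)
        if rounds_without_gain >= stopping_rounds:
            break
    return sorted(results, key=lambda r: r[0], reverse=True)


# Hypothetical validation scores for six candidate models.
scores = [0.70, 0.75, 0.751, 0.7505, 0.7502, 0.90]
board = budgeted_search(range(len(scores)), lambda i: scores[i],
                        stopping_rounds=3, stopping_tolerance=0.01)
print(board[0])
```

In this run the search ends after three consecutive models fail to improve the best score by more than 0.01, so the sixth candidate is never trained. That is the trade-off of a metric-based rule: it saves time but can miss a late improvement.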

AutoML Interface

We provide a simple function that performs a process that would typically require many lines of code. This frees up users to focus on other tasks in the data science pipeline, such as data preprocessing, feature engineering and model deployment.
R:

library(h2o)
# x: predictor column names, y: response column name, train: an H2OFrame
aml <- h2o.automl(x = x, y = y, training_frame = train,
                  max_runtime_secs = 3600)

Python:

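from h2o.automl import H2OAutoML
# x: predictor column names, y: response column name, train: an H2OFrame
aml = H2OAutoML(max_runtime_secs = 3600)
aml.train(x = x, y = y, training_frame = train)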

Flow (H2O’s Web GUI):
[Screenshot: Run AutoML]

AutoML Leaderboard

Each AutoML run returns a “Leaderboard” of models, ranked by a default performance metric. Here is an example leaderboard for a binary classification task:
[Image: example leaderboard listing each model ID with its AUC]
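As a rough illustration of how such a ranking works, the following pure-Python sketch computes AUC for a few models on the same labeled data and sorts them best-first. It is not H2O's leaderboard code, and the model IDs, labels and scores are invented for the example.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leaderboard(labels, model_scores):
    """Return (model_id, auc) rows sorted from highest to lowest AUC."""
    rows = [(model_id, auc(labels, scores))
            for model_id, scores in model_scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical binary labels and predicted probabilities from three models.
labels = [1, 0, 1, 0, 1]
model_scores = {
    "model_c": [0.5, 0.5, 0.5, 0.5, 0.5],
    "model_b": [0.6, 0.5, 0.4, 0.3, 0.9],
    "model_a": [0.9, 0.2, 0.8, 0.4, 0.7],
}

board = leaderboard(labels, model_scores)
for model_id, score in board:
    print(f"{model_id}  {score:.4f}")
```

The pairwise definition of AUC used here is quadratic in the number of rows, which is fine for a small example but would be replaced by a rank-based computation on real data.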
More information and full R and Python code examples are available on the H2O 3.12.0.1 AutoML docs page in the H2O User Guide.
