November 13th, 2019

Useful Machine Learning Sessions from the H2O World New York

Category: H2O World, Machine Learning Interpretability, Makers

Conferences help us learn new skills and build new relationships and networks along the way. H2O World is one such interactive community event, featuring advancements in AI, machine learning, and explainable AI. It is a platform where people connect with the community and learn from industry-recognized speakers how to harness the full value of AI, machine learning, explainable AI, deep learning, and data science.

The recent edition of H2O World was held in New York and was attended by a who’s who of the data science industry. With a stellar lineup of speakers, experts, and Kaggle Grandmasters, the event was a huge success. Several talks and training sessions were conducted during the event. This article lists some of the important talks related to Driverless AI and H2O-3 from this H2O World conference. These sessions are a great way to understand how Driverless AI and libraries like H2O can be used to solve critical business problems with AI.

H2O.ai Products and Solutions

H2O.ai offers a range of AI and data science platforms: the industry-leading open-source H2O platform, integrations for Apache Spark with Sparkling Water, acceleration on NVIDIA GPUs with H2O4GPU, and the award-winning H2O Driverless AI platform, which delivers an expert data scientist in a box. H2O.ai provides leading AI technologies and products for every organization.

Follow along 💻

If you want to follow along, you can take the 21-day trial of Driverless AI, called Test Drive. It eliminates the need to download software, and you can learn how to build machine learning models using H2O Driverless AI on the AWS cloud.

There is also a great set of tutorials to help you get started with Driverless AI and H2O-3 through a step-by-step process. You can find them here: https://h2oai.github.io/tutorials/

Intro to Driverless AI

If you are thinking of getting your feet wet with Driverless AI, these videos are just what you need. David Whiting, Head of Global Training at H2O.ai, gives a hands-on demonstration using the Credit Card dataset to predict which customers will default on their payments. This is a classic classification problem and a great use case for machine learning.

Time Series in H2O Driverless AI

Time series is a unique field in predictive modeling where standard feature engineering techniques and models are employed to get the most accurate results.

Dmitry Larko, a Senior Data Scientist at H2O.ai, gives a brief overview of time series data with Driverless AI. Dmitry showcases the capabilities of Driverless AI’s time series recipe while also covering certain validation, feature engineering, feature selection, and modeling strategies.
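To make the validation and feature engineering ideas concrete, here is a minimal sketch of two common time series techniques: lag features and a time-ordered validation split. It is not the Driverless AI time series recipe. The function names, the sales figures, and the lag choices are all made up for illustration.

```python
from statistics import mean


def add_lag_features(series, lags):
    """Build (features, target) rows, where features are earlier values of the series."""
    max_lag = max(lags)
    rows = []
    # Start at max_lag so that every requested lag has a value to look back to.
    for t in range(max_lag, len(series)):
        features = [series[t - lag] for lag in lags]
        rows.append((features, series[t]))
    return rows


def time_ordered_split(rows, validation_fraction=0.25):
    """Hold out the most recent rows for validation; the rows are never shuffled."""
    cut = int(len(rows) * (1 - validation_fraction))
    return rows[:cut], rows[cut:]


# Hypothetical weekly sales figures, oldest first.
sales = [10, 12, 13, 15, 14, 16, 18, 17, 19, 21]

rows = add_lag_features(sales, lags=[1, 2])
train, valid = time_ordered_split(rows)

# Naive baseline: predict each target with its lag-1 feature (the previous value).
mae = mean(abs(target - features[0]) for features, target in valid)
print(len(train), len(valid), mae)
```

Splitting by time rather than at random keeps future observations out of the training set, which is the main reason time series validation differs from the usual shuffled split.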

Scalable Automatic Machine Learning with H2O

In this presentation, Erin LeDell, Chief Machine Learning Scientist, H2O.ai, provides a history and overview of the field of “Automatic Machine Learning” (AutoML), followed by a detailed look inside H2O’s open-source AutoML algorithm. The presentation also showcases a demo of H2O AutoML in R and Python, including a handful of code examples to get one started in automatic machine learning quickly.

Get Started with Driverless AI Recipes

Recipes are customizations and extensions to the Driverless AI platform. These can be custom machine learning models, transformers, or scorers (classification or regression), written in Python. Data scientists can bring their own recipes or leverage the open-source recipes contributed by the community and curated by H2O.ai data science experts.

Michelle Tanco is a Customer Solutions Engineer & Data Scientist at H2O.ai. In this hands-on session, she goes into the details of the recipes feature in Driverless AI and shows how to create a recipe and integrate it with the Driverless AI platform.
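As a rough illustration of the kind of logic a custom scorer might hold, here is a standalone Python sketch of a cost-weighted error metric for binary classification. The class name, method name, and cost values are hypothetical, and the sketch does not reproduce the Driverless AI recipe interface. A real recipe would follow the platform’s own templates.

```python
class CostWeightedErrorScorer:
    """Hypothetical scorer: average misclassification cost per row (lower is better)."""

    def __init__(self, false_positive_cost=1.0, false_negative_cost=5.0):
        # Assumed costs: a missed positive is treated as five times worse than a false alarm.
        self.false_positive_cost = false_positive_cost
        self.false_negative_cost = false_negative_cost

    def score(self, actual, predicted):
        if len(actual) != len(predicted):
            raise ValueError("actual and predicted must have the same length")
        if not actual:
            raise ValueError("cannot score an empty set of predictions")
        total = 0.0
        for y, y_hat in zip(actual, predicted):
            if y_hat == 1 and y == 0:
                total += self.false_positive_cost
            elif y_hat == 0 and y == 1:
                total += self.false_negative_cost
        return total / len(actual)


scorer = CostWeightedErrorScorer()
actual = [1, 0, 1, 0]
predicted = [1, 1, 0, 0]  # one false positive and one false negative
cost = scorer.score(actual, predicted)
print(cost)
```

A metric like this is useful when the business cost of the two error types differs, which a default accuracy score would not capture.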

TensorBoard integration and Image Recognition with Driverless AI

Yauhen Babakhin, a Data Scientist at H2O, shows us the capabilities of Driverless AI with image data. He demos an image recognition task using Driverless AI, where the goal is to classify an image into one of six given categories.

NLP with H2O Driverless AI

Sudalai Rajkumar (aka SRK) is a Senior Data Scientist at H2O.ai. In his talk, he sheds light on the natural language capabilities of Driverless AI for text classification and regression problems. He discusses how Driverless AI can use textual data to address a whole new set of problems in the text space, such as automatic document classification, sentiment analysis, and emotion detection.

AutoDoc with H2O Driverless AI

Megan Kurka, Customer Data Scientist at H2O, talks about Driverless AI’s AutoDoc. AutoDoc automatically documents and explains the processes used by the Driverless AI platform. In a way, it frees the user from summarizing their workflow while building machine learning models. The documentation includes details about the data used, the validation schema selected, model and feature tuning, and the final model created.

data.table for R and Python

Matt Dowle is the main author of the famous data.table package in R. data.table provides a high-performance version of R’s data.frame with syntax and feature enhancements for ease of use, convenience, and programming speed. In this presentation, Matt talks about the data.table package in R in detail. H2O has ported data.table to Python as datatable (or Pydatatable), which, like R’s data.table, is completely open source.

The Case for Model Debugging

Patrick Hall is the Senior Director of Product at H2O.ai, where he focuses mainly on model interpretability. This presentation outlines several standard and newer model debugging techniques and proposes several potential remediation methods for any discovered bugs. Discussed debugging techniques include adversarial examples, benchmark models, partial dependence and individual conditional expectation, random attacks, Shapley explanations of predictions and residuals, and models of residuals. Proposed remediation approaches include alternate models, editing of deployable model artifacts, missing value injection, prediction assertions, and regularization methods.
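Two of the debugging techniques named above, individual conditional expectation (ICE) and partial dependence, can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the talk. The toy scoring function stands in for a trained model, and all names and numbers are hypothetical.

```python
from statistics import mean


def ice_curves(predict, rows, feature_index, grid):
    """One curve per row: the prediction as the chosen feature sweeps over the grid."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)  # copy so the original row is left untouched
            modified[feature_index] = value
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, rows, feature_index, grid):
    """Partial dependence is the pointwise average of the ICE curves."""
    curves = ice_curves(predict, rows, feature_index, grid)
    return [mean(points) for points in zip(*curves)]


def toy_model(row):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    income, debt = row
    return 2.0 * debt - 0.5 * income


rows = [[4.0, 1.0], [6.0, 3.0]]  # each row is [income, debt]
grid = [0.0, 2.0]                # debt values to sweep

ice = ice_curves(toy_model, rows, feature_index=1, grid=grid)
pd_curve = partial_dependence(toy_model, rows, feature_index=1, grid=grid)
print(ice)
print(pd_curve)
```

Comparing individual ICE curves against the averaged curve is one way to spot rows where the model behaves differently from the overall trend.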

About the Author

Parul Pandey

Parul is a Data Science Evangelist here at H2O.ai. She combines data science, evangelism, and community in her work. She is also a Kaggle Grandmaster in the Notebooks category and was one of LinkedIn’s Top Voices in the Software Development category in 2019.
