July 16th, 2015

useR! Aalborg 2015 conference

matt_dowle

The H2O team spent most of the useR! Aalborg 2015 conference at the booth giving demos and discussing H2O. Amy had a 16-node EC2 cluster running with 8 cores per node, for a total of 128 cores. The demo consisted of loading large files in parallel and then running our distributed machine learning algorithms in parallel.
At an R conference, most people wanted to script H2O from R, which is of course built in (as is Python), but we also conveyed the benefits that our user interface, Flow, can provide in this space (even for programmers) by automating and accelerating common tasks. We enjoyed discussing future directions with the attendees and bouncing ideas off them. There is nothing like seeing people’s first reaction to the product, live and in person! As an open source platform, H2O thrives on suggestions and contributions from our community.
All components of H2O are developed in the open on GitHub.

H2O contributed 3 talks:

Matt Dowle on Scalable Radix Sorting

Matt Dowle presented the details and benchmarks of the fast and stable radix sort implementation in data.table:::forderv. On 500 million random numerics (4 GB), base R takes approximately 22 minutes versus 2 minutes for forder. He discussed the pros and cons of most-significant-digit (forwards) and least-significant-digit (backwards) radix sorting, as well as how the method applies to all types: integer with large range (>1e5), numeric and character. We hope to find a sponsor from the R core team to help us include this method in base R, where it could benefit the community automatically. The work builds on articles by Terdiman, 2000 and Herf, 2001 and is joint work with Arun Srinivasan.
Slides: Fast, stable and scalable true radix sorting with Matt Dowle at useR! Aalborg
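
To make the least-significant-digit idea concrete, here is a minimal sketch in Python. It is an illustration only, not the data.table implementation: the function names `float_key` and `radix_order` are made up for this example, and it assumes the input contains no NaN values. It uses the bit-flipping trick described in Herf's article to turn doubles into unsigned integers that sort in the same order, then makes eight stable counting-sort passes, one per byte, and returns an ordering rather than sorted values.

```python
import struct


def float_key(x):
    """Map a double to a 64-bit unsigned int whose integer order matches numeric order."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    if bits >> 63:
        # Negative number: flip every bit so larger magnitudes sort first.
        return bits ^ 0xFFFFFFFFFFFFFFFF
    # Non-negative number: set the sign bit so it sorts after all negatives.
    return bits | 0x8000000000000000


def radix_order(values):
    """Return the stable ascending ordering (0-based indices) of a list of floats."""
    keys = [float_key(v) for v in values]
    order = list(range(len(values)))
    # Least-significant byte first; each pass is a stable counting sort.
    for shift in range(0, 64, 8):
        counts = [0] * 256
        for i in order:
            counts[(keys[i] >> shift) & 0xFF] += 1
        # Turn counts into starting offsets for each byte value.
        total = 0
        for b in range(256):
            counts[b], total = total, total + counts[b]
        out = [0] * len(order)
        for i in order:
            b = (keys[i] >> shift) & 0xFF
            out[counts[b]] = i
            counts[b] += 1
        order = out
    return order


values = [3.5, -1.0, 2.0, -7.25, 2.0, 0.0]
print(radix_order(values))  # [3, 1, 5, 2, 4, 0]
```

Because every pass is stable, the two equal values (indices 2 and 4) keep their original relative order. The number of passes is fixed by the key width rather than by the number of elements, which is what makes radix sorting attractive at this scale.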

matt_dowle
Photo courtesy of Flickr user Rhaen

Erin LeDell on h2oEnsemble

Erin presented an overview of scalable ensemble learning in R using the h2oEnsemble R package. Practitioners may prefer ensemble algorithms when model performance is valued above other factors such as model complexity or training time. This R interface provides easy access to scalable ensemble learning using H2O. The H2O Ensemble software implements the Super Learner, or stacking, ensemble algorithm, using distributed base learning algorithms from the open source machine learning platform, H2O. The following base learner algorithms are currently supported in h2oEnsemble: generalized linear models with elastic net regularization, gradient boosting (GBM) with regression and classification trees, random forest, and deep learning (multi-layer feed-forward neural networks). Erin provided code examples and some simple benchmarks.
Slides: h2oensemble with Erin Ledell at useR! Aalborg
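
For readers new to stacking, the idea can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not the h2oEnsemble API: the names `fit_mean`, `fit_line` and `super_learner` are hypothetical, the two base learners are deliberately trivial, and the metalearner is simplified to a single convex weight between exactly two learners. What it does show is the core recipe: collect cross-validated predictions from each base learner, fit the metalearner on those predictions, then refit the base learners on all the data.

```python
def fit_mean(xs, ys):
    """Toy base learner: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Toy base learner: simple least-squares line on one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def super_learner(xs, ys, learners, folds=5):
    """Stack exactly two learners with a convex weight chosen on CV predictions."""
    n = len(xs)
    # Level-one data: each row holds the held-out prediction of each learner.
    z = [[0.0, 0.0] for _ in range(n)]
    for k in range(folds):
        train = [i for i in range(n) if i % folds != k]
        held_out = [i for i in range(n) if i % folds == k]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        for j, fit in enumerate(learners):
            model = fit(tx, ty)
            for i in held_out:
                z[i][j] = model(xs[i])
    # Metalearner: w minimising squared error of w*p0 + (1-w)*p1, clipped to [0, 1].
    num = sum((y - p[1]) * (p[0] - p[1]) for p, y in zip(z, ys))
    den = sum((p[0] - p[1]) ** 2 for p in z)
    w = min(1.0, max(0.0, num / den)) if den else 0.5
    # Refit both base learners on all the data for the final ensemble.
    m0, m1 = (fit(xs, ys) for fit in learners)
    return (lambda x: w * m0(x) + (1 - w) * m1(x)), w


xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
predict, weight_on_mean = super_learner(xs, ys, [fit_mean, fit_line])
print(weight_on_mean, predict(25.0))
```

On this perfectly linear toy data the metalearner puts essentially all of its weight on the line learner. Fitting the weights on held-out predictions, rather than on in-sample fits, is what keeps the ensemble from simply rewarding the base learner that overfits the most.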

erin_ledell
Photo courtesy of Flickr user Rhaen

Amy Wang on H2O Architecture

Amy presented H2O at the useR! sponsor talk and went over the architecture of our product. Her live demo showed the speed and scale of H2O through an R interface. On top of reading in data and aggregating columnar data at lightning-fast speed, H2O also comes with a suite of sophisticated models with all the parameters exposed to the front end for ease of use. This attracted discussion at our booth even as the conference came to a close and we began packing up our banners. Many academics expressed interest in using H2O to teach students machine learning algorithms, while people in industry discussed partnerships and use cases. The emphasis of the talk was to encourage R users to try H2O and to build a community of users with interesting questions, ideas, and feedback who can ultimately help provide a better open source H2O experience for everyone.
Slides: H2O Overview with Amy Wang at useR! Aalborg

amy_wang
Photo courtesy of Matt Dowle

Matt also stopped by Copenhagen to give a talk at the R Summit. You can find his R Summit slides on our SlideShare.

Want to try one of the demos we ran at the useR! booth?

Check out our GitHub page for instructions, scripts, and datasets.
Click here for R demos
Special thanks to the useR! organizing committee and all the people who stopped by our booth!
