March 27th, 2014

Google-scale Machine Learning & Deep Learning get a principal platform in Apache Mahout with Spark and H2O


H2O’s vision is direct and simple: scaling machine learning to power intelligent applications. Our focus is distributed machine learning and a fully featured set of industrial-grade algorithms.
Apache Mahout is where people learn their chops in Machine Learning. Like R, it’s the “hello world” first place where many new users get exposed to algorithms on big data. Making that experience beautiful, accessible and value-driven will make machine learning ubiquitous and Mahout a movement to rival the success & utility of, say, Lucene and Hadoop.
Apache Spark has great developer momentum, and its in-memory design makes it ideal for implementing and extending algorithms.
Our vision and motivation is to re-ignite the community & double down on the identical founding visions of Mahout and H2O. Under one umbrella, Mahout can power intelligent applications for enterprises and users.
Creating great software is hard; creating passionate communities is harder. Our belief is that a product is not complete without its community. This convergence will make Mahout the principal platform for integrating multiple ways of mining insights from data.
These are exciting times for Mahout. These initiatives will drive momentum to Mahout as the umbrella platform for Machine Learning. Its success will drive wide-scale adoption of scalable machine learning algorithms in the enterprise & H2O is committed to that unified vision. Spark is a terrific in-memory platform for that. Stratosphere will be another. Scala, R, Python, JS, Java and the Matrix APIs make it a polyglot modeling & programming universe. This will be fun.
We are excited at the possibilities of this convergence. We have long been fans of Mahout’s vision and how it captured the imagination of machine learning enthusiasts over the years. (We still fondly recollect Isabel’s spirited talk at ApacheCon years ago!) What is needed is a real product, hacker and open source developer culture. The R community has also been looking for a package that solves distributed (in-memory) frames & offers parallel implementations of the algorithms behind them. Our team has executed on lots of these inspirations fast & furiously in open source over the past two years. We hope to enrich & fulfill the day-to-day workflows of Machine Learning users worldwide through Apache Mahout.
It all starts with the end (ML) user experience and how we can make it better.
