March 27th, 2014

Google-scale Machine Learning & Deep Learning gets principal platform in Apache Mahout with Spark and H2O

H2O’s vision is direct and simple: scaling machine learning to power intelligent applications. Our focus is distributed machine learning and a fully featured set of industrial-grade algorithms.
Apache Mahout is where people learn their chops in machine learning. Like R, it’s the “hello world” where many new users first get exposed to algorithms on big data. Making that experience beautiful, accessible and value-driven will make machine learning ubiquitous and Mahout a movement to rival the success and utility of, say, Lucene and Hadoop.
Apache Spark has great developer momentum, and its in-memory design makes it ideal for implementing and extending algorithms.
Our vision and motivation is to re-ignite the community and double down on the identical founding visions of Mahout and H2O. Under one umbrella, Mahout can power intelligent applications for enterprises and users.
Creating great software is hard; creating passionate communities is harder. Our belief is that a product is not complete without its community. This convergence will make Mahout the principal platform for integrating multiple ways of mining insights from data.
These are exciting times for Mahout. These initiatives will drive momentum to Mahout as the umbrella platform for machine learning. Its success will drive wide-scale adoption of scalable machine learning algorithms in the enterprise, and H2O is committed to that unified vision. Spark is a terrific in-memory platform for that. Stratosphere will be another. Scala, R, Python, JS, Java and the Matrix APIs make it a polyglot modeling and programming universe. This will be fun.
We are excited at the possibilities of this convergence. We are fans of Mahout’s vision and how it has captured the imagination of machine learning enthusiasts over the years. (We still fondly recollect Isabel’s spirited talk at ApacheCon years ago!) What is needed is a real product, a hacker ethos and an open source developer culture. The R community has also been looking for a package that solves distributed in-memory frames and offers parallel packages for the algorithms behind them. Our team has executed on a lot of these inspirations fast and furiously in open source over the past two years. We hope to enrich and fulfill the day-to-day workflows of machine learning users worldwide through Apache Mahout.
It all starts with the end (ML) user experience and how we can make it better.
