October 25th, 2013

Strata NYC & Hadoop World: How to Stop Worrying and Start Modeling Big Data with Better Algorithms and H2O

How to Stop Worrying and Start Modeling Big Data with Better Algorithms and H2O
Srisatish Ambati (0xdata Inc), Cliff Click (0xdata Inc)
5:05pm Tuesday, 10/29/2013
Data Science Beekman Parlor – Sutton North
Data modeling has been constrained by scale; sampling still rules the day for ad hoc analytics. Scale brings much-needed change to the modeling world. In this talk we present the predictive power of using sophisticated algorithms on big datasets. With large data sizes comes the particularly hard problem of unbalanced data with multiple asymmetrically rare classes. Missing features pose unique problems for most classification and regression algorithms, and proper handling can lead to greater predictive power. In the race for better predictions, H2O makes practical techniques accessible to everyone through an easy-to-use software product.
H2O is an open source math and machine learning engine for big data that brings distribution and parallelism to powerful algorithms while keeping the widely used languages of R and JSON as an API. It integrates neatly into the popular data ecosystems of Hadoop, Amazon S3, NoSQL, and SQL. We briefly discuss design choices in the implementation of Distributed Random Forest and Generalized Linear Modeling, and in bringing speed and scale to R, the vox populi of data science. We take a peek at the elegant Lego-like infrastructure that brings fine-grained parallelism to math over simple distributed arrays.
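To make the idea of fine-grained parallelism over distributed arrays concrete, here is a minimal sketch in Python. It is not H2O code and does not use H2O's API: the function names (`split_into_chunks`, `map_chunk`, `combine`, `distributed_mean`) are hypothetical, and local threads stand in for cluster nodes. It only illustrates the general pattern of computing a partial result per array chunk and merging the partials with an associative combine step.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(values, chunk_size):
    """Split a list into contiguous chunks (a stand-in for array pieces held on different nodes)."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def map_chunk(chunk):
    """Map step: compute a partial (sum, count) for a single chunk."""
    return (sum(chunk), len(chunk))


def combine(left, right):
    """Reduce step: merge two partial results. Associative, so merge order does not matter."""
    return (left[0] + right[0], left[1] + right[1])


def distributed_mean(values, chunk_size=4, workers=2):
    """Compute the mean by mapping over chunks in parallel and reducing the partials."""
    if not values:
        raise ValueError("cannot take the mean of an empty array")
    chunks = split_into_chunks(values, chunk_size)
    # Threads are an assumption for illustration; a real engine would run
    # the map step where each chunk is stored.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total, count = reduce(combine, partials, (0.0, 0))
    return total / count


if __name__ == "__main__":
    data = [float(x) for x in range(1, 11)]
    print(distributed_mean(data))
```

Because the combine step is associative, partial results can be merged in any grouping, which is what lets this kind of computation scale out across many small chunks.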
A short data-hacking demo presents the life cycle of data science: powerful data manipulation via R at scale, interactive summarization over large datasets, modeling using Elastic Net (GLM), grid search for the best parameters, and low-latency scoring.
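For readers unfamiliar with the Elastic Net, the commonly used (glmnet-style) form of the penalized GLM objective is shown below. The abstract does not state H2O's exact parameterization, so treat the symbols here as assumptions: $\ell$ is the per-observation loss (for example, negative log-likelihood), $\lambda \ge 0$ is the overall penalty strength, and $\alpha \in [0, 1]$ mixes the lasso ($\alpha = 1$) and ridge ($\alpha = 0$) penalties.

```latex
\hat{\beta} = \arg\min_{\beta} \;
  \frac{1}{N} \sum_{i=1}^{N} \ell\!\left(y_i,\, x_i^{\top}\beta\right)
  + \lambda \left( \alpha \,\lVert \beta \rVert_1
  + \frac{1-\alpha}{2}\, \lVert \beta \rVert_2^2 \right)
```

Under this formulation, a grid search for the best parameters would typically range over pairs of $(\alpha, \lambda)$ and keep the pair with the best validation score.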
