July 9th, 2013

Running analysis on the right data!


All in the day:
Anqi Fu, our wickedly smart Math & Data Science hacker-intern from Stanford this summer, was characterizing GLMNet in R on sparse data and comparing it with other tools. We were using a data set for predicting two-bedroom median rent by neighborhood, from huduser.org.
DATA: http://www.huduser.org/portal/datasets/fmr/CensusRentData/index.html

She found the analysis brisk and surprisingly fast... until we got around to checking the data matrix and the factor
call. Most of the data was missing! So she exclaimed:
[Credits to Addletters.org & Matt Groening for the Simpsons]
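
A quick look at how much of each column is actually populated, before fitting anything, would have caught this sooner. Here is a minimal sketch of that kind of check, written in Python with only the standard library. The column names, the toy table, and the set of tokens counted as "missing" are all made up for illustration; this is not the HUD rent data and not the R workflow Anqi used.

```python
import csv
import io

# Tokens treated as "missing". This set is an assumption; adjust it per data set.
MISSING_TOKENS = {"", "NA", "NaN", "null", "."}


def missing_fraction_by_column(rows):
    """Return {column: fraction of missing cells} for a list of dict rows."""
    if not rows:
        return {}
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name in counts:
            value = row.get(name)
            # csv.DictReader yields None for cells absent from a short row.
            if value is None or value.strip() in MISSING_TOKENS:
                counts[name] += 1
    return {name: count / len(rows) for name, count in counts.items()}


# Hypothetical toy table standing in for a real rent data set.
SAMPLE_CSV = """neighborhood,median_rent,units
A,1200,40
B,NA,
C,,55
D,NA,NA
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = missing_fraction_by_column(rows)
for name, fraction in report.items():
    print(f"{name}: {fraction:.0%} missing")
```

If a column that the model depends on turns out to be mostly empty, a "surprisingly fast" fit is a warning sign rather than good news.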

The results of her work, “Characterizing GLMNet on Sparse Matrices”, will have to wait for a future post!
