Time Series Forecasting Best Practices
by Jo-fai Chow October 15, 2021 H2O AI Hybrid Cloud Technical Posts Time Series

Earlier this year, my colleague Vishal Sharma gave a talk about time series forecasting best practices. The talk was well received, so we decided to turn it into a blog post. Below are some of the highlights from his talk. You can also follow the two software demos and try them yourself using our H2O AI […]

Improving NLP Model Performance with Context-Aware Feature Extraction
by Jo-fai Chow October 8, 2021 H2O AI Hybrid Cloud NLP Technical Posts

I would like to share with you a simple yet very effective trick to improve feature engineering for text analytics. After reading this article, you will be able to follow the exact steps and try it yourself using our H2O AI Hybrid Cloud. First of all, let’s have a look at the off-the-shelf natural language […]

Combining the power of KNIME and H2O.ai in a single integrated workflow
by Bruna Smith October 14, 2020 AutoML Community H2O Driverless AI Partners Technical Posts Tutorials

KNIME and H2O.ai, the two data science pioneers known for their open source platforms, have partnered to further democratize AI. Our approaches are about being open, transparent, and pushing the leading edge of AI. We believe strongly that AI is not for the select few but for everyone. We are taking another step in democratizing […]

Parallel Grid Search in H2O
by Erika Kamholz February 4, 2020 Data Science H2O Machine Learning Open Source Python R R-Bloggers Recommendations Technical Technical Posts

H2O-3 is, at its core, a platform for distributed, in-memory computing. The machine learning algorithms are implemented on top of this distributed computation platform. At H2O.ai, we design every operation, be it data transformation, training of machine learning models, or even parsing, to utilize the distributed computation model. In order to work with big data […]

An Overview of Python’s Datatable package
by Patrick Moran June 4, 2019 Data Science H2O H2O Driverless AI Python Technical Technical Posts

This blog originally appeared on […] “There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” (Eric Schmidt) If you are an R user, chances are that you have already been using the data.table package. data.table is an extension of the data.frame class in R. It’s also […]

Building AI/ML models on Lending Club Data, with H2O.ai - Part 1
by Saurabh Kumar March 28, 2019 Beginners Community Data Journalism Data Science Technical Posts Tutorials

Lending Club publishes its basic loan databases to the public and a full version to its customers (anonymized, of course). You can find the download page from this link (screenshot below): The publicly downloadable loan data has various attributes: more than 150 columns with categorical, numeric, text and date fields. It also has a ‘loan_status’ text column […]

Finally, You Can Plot H2O Decision Trees in R
by Jo-fai Chow January 15, 2019 Data Science Machine Learning R Technical Technical Posts Tutorials

The main objective of this post is creating and plotting decision trees (like the one below) for models created in H2O. Figure 1. Decision Tree Visualization in R. Decision Trees with H2O: With release H2O-3 (a.k.a. open source H2O or simply H2O) added to its family of tree-based algorithms (which already included DRF, […]

The Making of H2O Driverless AI – Automatic Machine Learning
by Arno Candel December 5, 2018 AutoML Community H2O Driverless AI H2O World H2O4GPU Makers Technical Technical Posts

It is my pleasure to share with you some never-before-exposed nuggets and insights from the making of H2O Driverless AI, our latest automatic machine learning product on our mission to democratize AI. This has been truly a team effort, and I couldn’t be more proud of our brilliant makers who continue to relentlessly […]

H2O announces GPU Open Analytics Initiative with MapD & Continuum
by Vinod Iyengar May 8, 2017 Community GPU Technical Technical Posts

H2O.ai, Continuum Analytics, and MapD Technologies have announced the formation of the GPU Open Analytics Initiative (GOAI) to create common data frameworks enabling developers and statistical researchers to accelerate data science on GPUs. GOAI will foster the development of a data science ecosystem on GPUs by allowing resident applications to interchange data seamlessly and efficiently. […]
