Introducing DatatableTon – Python Datatable Tutorials & Exercises
by Jo-fai Chow September 20, 2021 datatable Open Source Python Tutorials

Datatable is a Python library for manipulating tabular data. It supports out-of-memory datasets and multi-threaded data processing, and has a flexible API. If this reminds you of R’s data.table, you are spot on, because Python’s datatable package is closely related to and inspired by the R library. Version 1.0.0 was released on July 1, 2021, and it’s probably […]

Introducing H2O Wave
by Jo-fai Chow December 15, 2020 Open Source Product Updates Python Wave

For almost a decade, H2O.ai has worked to build open source and commercial products that are on the leading edge of innovation in machine learning, from AutoML to Explainable AI. We are thrilled to announce the release of what we believe to be the future of AI Applications: H2O Wave. Wave is an open source, […]

Summary of a Responsible Machine Learning Workflow
by Patrick Moran March 20, 2020 Data Science Deep Learning Machine Learning Machine Learning Interpretability Neural Networks Python Responsible AI

A paper resulting from a collaboration between H2O.ai and BLDS, LLC was recently published in a special “Machine Learning with Python” issue of the journal Information (https://www.mdpi.com/2078-2489/11/3/137). In “A Responsible Machine Learning Workflow with Focus on Interpretable Models, Post-hoc Explanation, and Discrimination Testing,” coauthors Navdeep Gill, Patrick Hall, Kim Montgomery, and Nicholas Schmidt compare model accuracy […]

Blink: Data to AI/ML Production Pipeline Code in Just a Few Clicks
by Erika Kamholz February 11, 2020 H2O Driverless AI Machine Learning Python Technical

You have the data and now want to build a really good AI/ML model and deliver it to production. There are three options available today: Write the code yourself in a Jupyter notebook, RStudio, etc., for training/validation and DevOps model handoff, and do the feature engineering yourself as well. Build your own features as above, […]

Parallel Grid Search in H2O
by Erika Kamholz February 4, 2020 Data Science H2O Machine Learning Open Source Python R R-Bloggers Recommendations Technical Technical Posts

H2O-3 is, at its core, a platform for distributed, in-memory computing. The machine learning algorithms are implemented on top of this distributed computation platform. At H2O.ai, we design every operation, be it data transformation, training of machine learning models, or even parsing, to utilize the distributed computation model. In order to work with big data […]

How H2O propels data scientists ahead of itself: enhancing Driverless AI models with advanced options, recipes and visualizations
by Jo-fai Chow January 6, 2020 Data Science H2O Driverless AI Python R Recipes

H2O.ai engineers continually innovate and introduce new techniques by adopting the latest research, working on cutting-edge use cases, and participating in and winning machine learning competitions like Kaggle. But thanks to the explosion of AI research and applications, even the most advanced automated machine learning platform, like H2O Driverless AI, cannot come with all the bells and whistles to satisfy every […]

An Overview of Python’s Datatable package
by Patrick Moran June 4, 2019 Data Science H2O H2O Driverless AI Python Technical Technical Posts

This blog originally appeared on Towardsdatascience.com. “There were 5 Exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days”: Eric Schmidt. If you are an R user, chances are that you have already been using the data.table package. Data.table is an extension of R’s data.frame. It’s also […]

H2O New Year releases
by Jo-fai Chow January 18, 2019 H2O H2O Release Python R

There were two releases in quick succession. First, on December 21st, there was a minor (fix) release, 3.22.0.3. It was immediately followed by a more major release (still on the 3.22 branch), codenamed Xu after mathematician Jinchao Xu, whose work focuses on deep neural networks, among many other fields of research. Of course, the […]

How This AI Tool Breathes New Life Into Data Science
by Saurabh Kumar October 16, 2018 Beginners Data Journalism Data Science Deep Learning Driverless Explainable AI GPU H2O Driverless AI Machine Learning NLP Python R Technical

Ask any data scientist in your workplace: any supervised learning ML/AI project will go through many steps and iterations before it can be put into production, starting with the question “Are we solving a regression or a classification problem?” Then comes data collection and curation: Are there outliers? What is the distribution? What do […]

Stacked Ensembles and Word2Vec now available in H2O!
by Erin LeDell February 8, 2017 Data Munging Ensembles H2O Release NLP Python R Technical

Prepared by: Erin LeDell and Navdeep Gill. Stacked Ensembles: H2O’s new Stacked Ensemble method is a supervised ensemble machine learning algorithm that finds the optimal combination of a collection of prediction algorithms using a process called stacking or “Super Learning.” This method currently supports regression and binary classification, and multiclass support is planned for a […]
