Blogs

H2O Driverless AI is an automatic machine learning platform designed to create highly accurate modeling pipelines from tabular training data. The predictive performance of a pipeline is a function of both the training data and the parameters of the pipeline (details of feature engineering and modeling). During an experiment, Driverless AI automatically tunes these parameters by scoring candidate pipelines on held-out (“validation”) data. This validation data is either provided by the user (for experts) or automatically created (random, time-based or fold-based) by Driverless AI. Once a final pipeline has been created, it should be scored on yet another ...
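The three ways of creating held-out data mentioned above can be illustrated with a minimal sketch. This is not Driverless AI's implementation: the function names, the `t` and `fold` keys, and the 20% holdout fraction are all assumptions chosen for illustration, and "fold-based" is read here as "rows sharing a fold id are held out together".

```python
import random
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]

def random_split(rows: List[Row], holdout_fraction: float = 0.2,
                 seed: int = 42) -> Tuple[List[Row], List[Row]]:
    """Hold out a random subset of rows for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    holdout = set(indices[:int(len(rows) * holdout_fraction)])
    train = [r for i, r in enumerate(rows) if i not in holdout]
    valid = [r for i, r in enumerate(rows) if i in holdout]
    return train, valid

def time_based_split(rows: List[Row], time_key: Callable[[Row], int],
                     holdout_fraction: float = 0.2) -> Tuple[List[Row], List[Row]]:
    """Hold out the most recent rows so validation lies after training in time."""
    ordered = sorted(rows, key=time_key)
    cut = len(ordered) - int(len(ordered) * holdout_fraction)
    return ordered[:cut], ordered[cut:]

def fold_based_splits(rows: List[Row],
                      fold_key: Callable[[Row], int]) -> List[Tuple[List[Row], List[Row]]]:
    """One (train, valid) pair per distinct fold id; each fold is held out once."""
    fold_ids = sorted({fold_key(r) for r in rows})
    return [
        ([r for r in rows if fold_key(r) != f], [r for r in rows if fold_key(r) == f])
        for f in fold_ids
    ]

# Hypothetical toy data: 10 rows with a time index and a fold id.
rows = [{"t": i, "fold": i % 3} for i in range(10)]

rand_train, rand_valid = random_split(rows)
time_train, time_valid = time_based_split(rows, time_key=lambda r: r["t"])
fold_pairs = fold_based_splits(rows, fold_key=lambda r: r["fold"])
```

In the time-based variant every validation row is newer than every training row, which mimics scoring on data the pipeline has not yet seen.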
H2O Driverless AI allows you to customize every experiment in great detail via the expert settings. The most important controls, however, are the three knobs for accuracy, time and interpretability. A higher accuracy setting results in a better estimate of the model generalization performance, usually through using more data, more holdout sets, more parameter tuning rounds and other advanced techniques. A higher time setting means the experiment is given more time to converge to an optimal solution. A higher interpretability setting reduces the model’s complexity through less feature engineering and simpler models. In general, a setting of 1/1/10 will lead ...

Driverless AI Scorer Tips

A core capability of H2O Driverless AI is the creation of automatic machine learning modeling pipelines for supervised problems. In addition to the data and the target column to be predicted, the user can pick a scorer. A scorer is a function that takes actual and predicted values for a dataset and returns a number. Looking at this single number is the most common way to estimate the generalization performance of a predictive model on unseen data by comparing the model’s predictions on the dataset with its actual values. There are more detailed ways to estimate the performance of a machine learning model such as residual plots (available on the Diagnostics page ...
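To make the definition of a scorer concrete, here is a minimal sketch of one: root mean squared error for a regression problem. The function name `rmse_scorer` and the sample values are illustrative assumptions, not part of the Driverless AI API.

```python
import math
from typing import Sequence

def rmse_scorer(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Take actual and predicted values and return a single number (RMSE)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot score an empty dataset")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical values for three rows of a dataset.
actual = [3.0, 5.0, 2.0]
predicted = [2.0, 5.0, 4.0]
score = rmse_scorer(actual, predicted)  # lower is better for RMSE
```

Any function with this shape (two aligned sequences in, one number out) fits the description; whether higher or lower is better depends on the scorer chosen.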
H2O Driverless AI handles time-series forecasting problems out of the box. All you need to do when starting a time-series experiment is to provide a regular columnar dataset containing your features. Then pick a target column and also pick a “time column” - a designated column containing time stamps for every record (row) such as “April 10 2019 09:13:41” or “2019/04/10”. If you have a test set for which you want predictions for every record, make sure to provide future time stamps and features as well. In most cases, that’s it. You can launch the experiment and let Driverless AI do the rest. It will even auto-detect multiple time series in the same dataset ...
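As an illustration of what a usable time column looks like, the following sketch parses the two time stamp styles quoted above and checks that the test set lies in the future relative to the training data. It uses only the Python standard library; the helper `parse_time_stamp` and the sample values are assumptions for illustration and say nothing about how Driverless AI itself parses time stamps.

```python
from datetime import datetime
from typing import Sequence

# Formats matching "April 10 2019 09:13:41" and "2019/04/10".
# Note: %B matches month names in the current locale (English assumed here).
TIME_FORMATS = ("%B %d %Y %H:%M:%S", "%Y/%m/%d")

def parse_time_stamp(text: str, formats: Sequence[str] = TIME_FORMATS) -> datetime:
    """Try each known format in turn and return the first successful parse."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized time stamp: {text!r}")

# Hypothetical time columns for a training set and a test set.
train_times = [parse_time_stamp(s) for s in
               ("2019/04/08", "2019/04/09", "April 10 2019 09:13:41")]
test_times = [parse_time_stamp(s) for s in ("2019/04/11", "2019/04/12")]

# A forecasting test set should contain only time stamps after the training data.
test_is_in_future = min(test_times) > max(train_times)
```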
Given training data and a target column to predict, H2O Driverless AI produces an end-to-end pipeline tuned for high predictive performance (and/or high interpretability) for general classification and regression tasks. The pipeline has only one purpose: to take a test set, row by row, and turn its feature values into predictions. A typical pipeline creates dozens or even hundreds of derived features from the user-given dataset. Those transformations are often based on precomputed lookup tables and parameterized mathematical operations that were selected and optimized during training. It then feeds all these derived features to one or several machine learning ...
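The row-by-row flow described above (derived features from lookup tables and parameterized operations, fed to a model) can be sketched in a few lines. Everything here is hypothetical: the lookup table, the log transform, the weights and the column names `city` and `income` are invented for illustration and do not reflect what a real Driverless AI pipeline contains.

```python
import math
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]

# Hypothetical lookup table, imagined as precomputed during training.
CITY_LOOKUP = {"berlin": 0.30, "paris": 0.55}
CITY_DEFAULT = 0.40  # fallback for values not seen in training

# Hypothetical parameters of a simple logistic model fitted during training.
WEIGHTS = [2.0, 0.1]
BIAS = -1.0

def derive_features(row: Row) -> List[float]:
    """Turn raw feature values into derived features."""
    return [
        CITY_LOOKUP.get(str(row["city"]), CITY_DEFAULT),  # lookup-table feature
        math.log1p(float(row["income"])),                 # mathematical transform
    ]

def predict_row(row: Row) -> float:
    """Score one row: derived features in, predicted probability out."""
    features = derive_features(row)
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))

# Scoring a test set row by row.
test_rows: List[Row] = [
    {"city": "berlin", "income": 52000.0},
    {"city": "oslo", "income": 61000.0},
]
predictions = [predict_row(r) for r in test_rows]
```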