June 12th, 2018

Time is Money! Automate Your Time-Series Forecasts with Driverless AI

RSS icon RSS Category: Driverless AI
Details about H2o ai experiment

Time-series forecasting is one of the most common and important tasks in business analytics. There are many real-world applications like sales, weather, stock market, energy demand, just to name a few. We strongly believe that automation can help our users deliver business value in a timely manner. Therefore, once again we translated our Kaggle Grand Masters’ time-series recipes into our automatic machine learning platform Driverless AI (version 1.2). This blog post introduces the new time-series functionality with a simple sales forecasting example.
The key features/recipes that make automation possible are:

  • Automatic handling of time groups (e.g. different stores and departments)
  • Robust time-series validation
    • Accounts for gaps and forecast horizon
    • Uses past information only (i.e. no data leakage)
  • Time-series specific feature engineering recipes
    • Date features like day of week, day of month etc.
    • AutoRegressive features like optimal lag and lag-features interaction
    • Different types of exponentially weighted moving averages
    • Aggregation of past information (different time groups and time intervals)
    • Target transformations and differentiation
  • Integration with existing feature engineering functions (recipes and optimization)
  • Automatic pipelines generation (see this blog post)

A Typical Example: Sales Forecasting

Below is a typical example of sales forecasting based on Walmart competition on Kaggle. In order to frame it as a machine learning problem, we formulate the historical sales data and additional attributes as shown below:
Raw data:
Table for store sales
Data formulated for machine learning:
Table for departement_strore
Once you have your data prepared in tabular format (see raw data above), Driverless AI can formulate it for machine learning and sort out the rest. If this is your very first session, the Driverless AI assistant (new feature in version 1.2) will guide you through the journey.
Alert for driverless ai
Similar to previous Driverless AI examples, users need to select the dataset for training/test and define the target. For time-series, users need to define the time column (by choosing AUTO or selecting the date column manually). If weighted scoring is required (like the Walmart Kaggle competition), users can select the column with specific weights for different samples.
Details about H2o ai experiment
If users prefer to use automatic handling of time groups, they can leave the setting for time groups columns as AUTO.
simple settings
Expert users can define specific time groups and change other settings as shown below.
Data about simple settings
Once the experiment is finished, users can make new predictions and download the scoring pipeline just like any other Driverless AI experiments.
Walmart demo data
Seeing is believing. Try Driverless AI yourself today. Sign up here for a free 21-day trial license.
Until next time,
Joe
Bonus fact: The masterminds behind our time-series recipes are Marios Michailidis and Mathias Müller so internally we call this feature AutoM&M.

About the Author

Jo-Fai Chow

Jo-fai (or Joe) has multiple roles (data scientist / evangelist / community manager) at H2O.ai. Since joining the company in 2016, Joe has delivered H2O talks/workshops in 40+ cities around Europe, US, and Asia. Nowadays, he is best known as the H2O #360Selfie guy. He is also the co-organiser of H2O's EMEA meetup groups including London Artificial Intelligence & Deep Learning - one of the biggest data science communities in the world with more than 11,000 members.

Leave a Reply

What are we buying today?

Note: this is a guest blog post by Shrinidhi Narasimhan. It’s 2021 and recommendation engines are

July 5, 2021 - by Rohan Rao
The Emergence of Automated Machine Learning in Industry

This post was originally published by K-Tech, Centre of Excellence for Data Science and AI,

June 30, 2021 - by Parul Pandey
What does it take to win a Kaggle competition? Let’s hear it from the winner himself.

In this series of interviews, I present the stories of established Data Scientists and Kaggle

June 14, 2021 - by Parul Pandey
Snowflake on H2O.ai
H2O Integrates with Snowflake Snowpark/Java UDFs: How to better leverage the Snowflake Data Marketplace and deploy In-Database

One of the goals of machine learning is to find unknown predictive features, even hidden

June 9, 2021 - by Eric Gudgion
Getting the best out of H2O.ai’s academic program

“H2O.ai provides impressively scalable implementations of many of the important machine learning tools in a

May 19, 2021 - by Ana Visneski and Jo-Fai Chow
Regístrese para su prueba gratuita y podrá explorar H2O AI Hybrid Cloud

Recientemente, lanzamos nuestra prueba gratuita de 14 días de H2O AI Hybrid Cloud, lo que

May 17, 2021 - by Ana Visneski and Jo-Fai Chow

Start your 14-day free trial today