Are All Your AI and ML Models Wrong?

Published: May 05, 2020


Written by: James Orton

We are living in unprecedented times. Our society and economy are experiencing shocks beyond anything we have seen in living memory. Beyond the human cost, there is a data science and machine learning elephant in the room (hopefully 2 meters away): Are your predictive models still doing the job you expect them to do?

The challenge here is greatly complicated by:

Huge government spending already committed, with further fiscal stimulus likely in the future, instantly altering market dynamics.
Short- and long-term shocks to the economy from social distancing measures. Will consumer behavior be altered permanently in some scenarios?
An uneven impact, both across types of businesses and across people and their families, varying with each state’s individual regulations. We Victorians are starting to feel hard done by as other states start to relax their lockdowns.
The feedback delay. In this fast-moving environment, we won’t know immediately what has changed, which will make both accurate modeling and monitoring model performance challenging.
New data might look very different from past data as people’s behavior changes. For example, people who have never sought financial assistance or credit in the past may now be applying for it.
Australia’s last recession was 28 years ago. This lack of relevant data could create a small data problem and further difficulty building stable models.

What do you do? One option is to do nothing: accept higher error rates in your predictive models and AI systems, curl up in a dark room and hope things return to “normal”. But this is a very risky approach. Just take a look at the media, where many are predicting a “new normal”.

It might be much better to be proactive and work out what we can do and what we can’t. Sure, the devastating impact of COVID-19 is unprecedented, but we see similar changes on a smaller scale all the time: when Virgin Australia entered (rather than exited) the domestic flying market, for instance, or more recently with the advent of the neo-banks and fintechs. What can we learn from these and other shocks to the system?

I think we need to look beyond performance on the test set, search for ways to rapidly develop AI models, and apply model monitoring techniques at scale. You can check some of H2O Driverless AI’s AutoML capabilities here.
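To make the monitoring idea a little more concrete, here is a minimal sketch of one way to check whether new data looks different from the data a model was trained on. It uses the Population Stability Index (PSI), which is my choice for illustration rather than anything specific to a particular product. The function name, the 10-bin setting, and the simulated "loan amount" numbers are all assumptions, and the 0.1 and 0.25 cut-offs mentioned in the comments are only a commonly quoted rule of thumb. A useful property under feedback delay is that the check needs only the model inputs, not the outcomes.

```python
import bisect
import math
import random
import statistics


def population_stability_index(baseline, recent, n_bins=10, eps=1e-6):
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    # Cut points come from baseline quantiles, so each bin holds roughly
    # an equal share of the baseline data (n_bins - 1 cut points).
    edges = statistics.quantiles(baseline, n=n_bins)

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Values beyond the baseline range fall into the first or last bin.
            counts[bisect.bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    base = bin_fractions(baseline)
    new = bin_fractions(recent)
    return sum((n - b) * math.log(n / b) for b, n in zip(base, new))


# Simulated data (purely hypothetical): requested loan amounts in $k.
rng = random.Random(0)
pre_shock = [rng.gauss(50, 10) for _ in range(5000)]
same_regime = [rng.gauss(50, 10) for _ in range(5000)]
post_shock = [rng.gauss(65, 15) for _ in range(5000)]

psi_same = population_stability_index(pre_shock, same_regime)
psi_shifted = population_stability_index(pre_shock, post_shock)

# Rule of thumb (an assumption, not a standard): below about 0.1 is usually
# read as stable, above about 0.25 as a shift worth investigating.
print(f"PSI, same regime: {psi_same:.3f}")
print(f"PSI, after shock: {psi_shifted:.3f}")
```

Run per feature on a schedule, a check like this can flag a change in the population long before enough outcomes arrive to measure model error directly. It will not tell you whether the relationship between inputs and outcomes has changed, so it complements rather than replaces tracking actual performance once the feedback comes in.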

I am interested in hearing others’ thoughts. Are you rebuilding?