May 12th, 2021

How Much is My Property Worth?

Category: Community, Deep Learning, Explainable AI, H2O, Open Source, R

Note: this is a guest blog post by Jaafar Almusaad.

This is the million-dollar question – both figuratively and literally.

Traditionally, qualified property valuers are tasked with answering this question. The process is lengthy and costly and, more critically, inconsistent and largely subjective. Mind you, valuation is an “art,” not a “science.”

In reality, “qualified” valuers often end up having different “opinions” about the value of properties, and it’s up to the customer to pick the “opinion” that better serves their interest.

To address this issue, AVMs (Automated Valuation Models) have been developed. The aim was to use data and hand-crafted computer algorithms to estimate the market value of properties instantaneously and consistently. However, a major caveat with AVMs is that the biases of the people who design them can propagate to the final product, potentially resulting in biased valuations.

A more recent approach is to rely on Machine Learning and AI. Indeed, AI has outperformed human experts in many fields, and more data is now available than we can ever digest.

I was curious whether AI could replace current AVMs, so I curated a dataset from different sources. It comprises tens of millions of property transactions, along with data about locations in the UK (safety, income, education, etc.).

I typically use R and the data.table library for my data workflows, and these work flawlessly with H2O-3. So, I used H2O to train Deep Learning models to predict property values.

As you may realize, Deep Learning models can be very accurate, but they are pretty much a “black box.” Therefore, it’s crucial to validate the models, not only mathematically but also with human “common sense.” H2O has a whole arsenal of tools for validating and interpreting trained AI models. In addition to the common statistical metrics (RMSE, Deviance, etc.), H2O has recently included Residual Analysis. In layman’s terms, residual analysis takes an observation with a known target value, predicts the value, and compares the prediction with the actual value. The difference between the two is the residual, and a residual of zero means a perfect prediction. H2O presents the analysis as a plot for additional convenience.
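
To make the idea of a residual concrete, here is a minimal sketch in plain Python. It is not H2O's implementation; the prices are made-up numbers and the function names `residuals` and `rmse` are my own, chosen for illustration only.

```python
import math

def residuals(actual, predicted):
    """Return one residual (actual minus predicted) per observation."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]

def rmse(actual, predicted):
    """Root mean squared error: the square root of the mean squared residual."""
    res = residuals(actual, predicted)
    return math.sqrt(sum(r * r for r in res) / len(res))

# Hypothetical sale prices (GBP) and model predictions, for illustration only.
actual_prices = [250_000, 310_000, 180_000, 420_000]
predicted_prices = [245_000, 325_000, 180_000, 400_000]

for a, p, r in zip(actual_prices, predicted_prices,
                   residuals(actual_prices, predicted_prices)):
    print(f"actual={a:>7} predicted={p:>7} residual={r:>+7}")
print(f"RMSE: {rmse(actual_prices, predicted_prices):.0f}")
```

The third property has a residual of zero, i.e. a perfect prediction. Plotting residuals against predicted values is a common way to spot price ranges where a model is systematically too high or too low.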

Another couple of tools that I find very helpful are Variable Importance and Partial Dependence Plots (PDPs). As the name suggests, Variable Importance tells us which variables (i.e., predictors) are the most influential. In my case, since the variables are well defined and understood, Variable Importance helps validate the models further. For instance, we know (from experience) that the total floor area has the biggest impact on property value. Reassuringly, floor area appears at the top of the chart. In other words, the model's ranking agrees with domain knowledge, which is a good sign.
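
One generic way to build intuition for variable importance is permutation importance: shuffle one variable at a time and measure how much the prediction error grows. The sketch below illustrates that idea only. It is an assumption on my part as a teaching device, not a description of how H2O computes importance for Deep Learning models, and `toy_model`, the rows, and the `house_number` column are all hypothetical.

```python
import random

def mean_abs_error(predict, rows, targets):
    """Average absolute difference between predictions and known targets."""
    return sum(abs(predict(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, features, repeats=5, seed=0):
    """Average increase in error after shuffling each feature's column."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    baseline = mean_abs_error(predict, rows, targets)
    importance = {}
    for feature in features:
        increases = []
        for _ in range(repeats):
            shuffled = [row[feature] for row in rows]
            rng.shuffle(shuffled)
            # Copy each row, replacing only the feature being tested.
            permuted = [dict(row, **{feature: v}) for row, v in zip(rows, shuffled)]
            increases.append(mean_abs_error(predict, permuted, targets) - baseline)
        importance[feature] = sum(increases) / repeats
    return importance

# Hypothetical stand-in for a trained model; it ignores house_number entirely.
def toy_model(row):
    return 2_000 * row["floor_area"] + 10_000 * row["bedrooms"]

rows = [
    {"floor_area": 45, "bedrooms": 1, "house_number": 7},
    {"floor_area": 60, "bedrooms": 2, "house_number": 12},
    {"floor_area": 85, "bedrooms": 3, "house_number": 3},
    {"floor_area": 110, "bedrooms": 3, "house_number": 28},
    {"floor_area": 150, "bedrooms": 4, "house_number": 19},
    {"floor_area": 200, "bedrooms": 5, "house_number": 41},
]
targets = [toy_model(r) for r in rows]  # the toy model is exact on this data

importance = permutation_importance(
    toy_model, rows, targets, ["floor_area", "bedrooms", "house_number"]
)
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.0f}")
```

A variable the model never uses, such as `house_number` here, scores exactly zero because shuffling it leaves every prediction unchanged.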

However, if we want to dive deeper and see exactly how floor area affects the price, we can use a Partial Dependence Plot. A PDP analyzes the impact of a specific variable (e.g., floor area) on the target (e.g., price). Under the hood, it sets the variable in question to a range of values and, for each value, averages the model's predictions over the rest of the data, so the effect of the other variables is averaged out. In our case, we can see that the relationship between floor area and price is pretty much linear, which is what we would expect.
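
The averaging behind a partial dependence curve can be sketched in a few lines of plain Python. This is a simplified illustration, not H2O's code: `toy_model` is a hypothetical stand-in for a trained model, and the rows and grid values are invented.

```python
def partial_dependence(predict, rows, feature, grid):
    """For each grid value, force `feature` to that value in every row,
    predict, and average the predictions over all rows."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the original data is untouched
            modified[feature] = value  # override only the feature of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve

# Hypothetical stand-in for a trained model (price in GBP).
def toy_model(row):
    return 2_000 * row["floor_area"] + 10_000 * row["bedrooms"]

rows = [
    {"floor_area": 60, "bedrooms": 1},
    {"floor_area": 90, "bedrooms": 3},
    {"floor_area": 120, "bedrooms": 2},
]

curve = partial_dependence(toy_model, rows, "floor_area", [50, 100, 150])
for area, avg_price in curve:
    print(f"floor_area={area:>3} m2 -> average predicted price {avg_price:,.0f}")
```

Because the toy model is linear in floor area, the curve rises by the same amount for each step in the grid. A real model's curve can bend or flatten, and that shape is what the plot is meant to reveal.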

As my dataset continues to grow, both in the number of observations and in the number of features, I find myself spending more time on the technical side when I should be focusing on growing the business. Thankfully, H2O has automated most of the work with their Driverless AI, which I am considering exploring in the next phase.

AccuVal, the property valuation platform, is publicly and freely available at https://accuval.co.uk/

Community Contributions

Please let me know if you want to talk about your H2O use cases. We welcome all kinds of community contributions (blog posts, tech talks, apps, etc.).

About the Author

Jo-Fai Chow

Jo-fai (or Joe) has multiple roles (data scientist / evangelist / community manager) at H2O.ai. Since joining the company in 2016, Joe has delivered H2O talks and workshops in 40+ cities around Europe, the US, and Asia. Nowadays, he is best known as the H2O #360Selfie guy. He is also the co-organiser of H2O's EMEA meetup groups, including London Artificial Intelligence & Deep Learning, one of the biggest data science communities in the world with more than 11,000 members.
