May 3rd, 2021

What it takes to become a World No 1 on Kaggle

Category: Data Science, Kaggle, Machine Learning, Makers

In conversation with Guanshuo Xu: A Data Scientist, Kaggle Competitions Grandmaster, and a Ph.D. in Electrical Engineering.

In this series of interviews, I present the stories of established Data Scientists and Kaggle Grandmasters at H2O.ai, who share their journey, inspirations, and accomplishments. The intention behind these interviews is to motivate and encourage others who want to understand what it takes to be a Kaggle Grandmaster.

In this article, I shall be sharing my interaction with Guanshuo Xu. He is a Kaggle Competitions Grandmaster and a Data Scientist at H2O.ai. Guanshuo obtained his Ph.D. in Electrical & Electronics Engineering at the New Jersey Institute of Technology, focusing on machine learning-based image forensics and steganalysis.

Guanshuo is a man of many accomplishments. His methods for real-world image tampering detection and localization won second place in the First IEEE Image Forensics Challenge. His deep neural network architecture outperformed traditional feature-based methods for the first time in image steganalysis. More recently, Guanshuo also achieved the world number one rank in the Competitions tier on Kaggle with wins in the Alaska2 Image Steganalysis and RSNA STR Pulmonary Embolism Detection competitions.

Here is also a link to Guanshuo’s interview at CTDS.show where he discusses his achievements on Kaggle.


 

In this interview, we shall learn more about his academic background, his passion for Kaggle, and his journey to the number one title. Here is an excerpt from my conversation with Guanshuo:

You have a Ph.D. in Electrical Engineering. Did it somehow influence your decision to take up Machine Learning as a career?

Guanshuo: Yes, my doctoral research used machine learning techniques to solve problems like image tampering detection and hidden data detection. For example, my last Ph.D. research project was to use deep neural nets on image steganalysis. So my education and research are directly related to machine learning. Hence, machine learning was a natural choice of career for me.

How did you start with Kaggle, and what kept you motivated throughout your journey to Grandmaster?

Guanshuo: From the time I discovered Kaggle, I have been addicted to it. Some of the factors that keep me competing on Kaggle are the combined satisfaction of winning competitions and prize money, learning new techniques, widening and deepening my understanding of machine learning, and building surprisingly effective models.

How does it feel to be World No 1 in Competitions? Does that bring in an extra amount of pressure while competing?

The top 5 Kagglers in the Competitions category at the time of writing | Source: Kaggle’s website

Guanshuo: Honestly speaking, there is a lot more pressure to maintain the number one rank than to achieve it. This is because it requires “smoother” performance. Sometimes I have to participate in more competitions simultaneously than I used to.

How do you typically approach a Kaggle problem? 

A glimpse of Guanshuo’s competitions profile | Source: https://www.kaggle.com/wowfattie/competitions

Guanshuo: My approach varies based on the type of problem and the goal of the competition. Nowadays, I often spend days or even weeks understanding the data and the problem and thinking through a solution, which includes, for instance, guessing the distribution of the private test data, designing a proper validation scheme, and planning detailed modeling steps. Once I have a decent picture of the overall approach, I start coding and modeling. This process helps me gain more understanding and make corrections or adjustments, if necessary, to the overall approach.

Could you give us a sneak peek into your toolkit, such as your favorite programming language, IDE, algorithms, etc.?

Guanshuo: As far as my toolkit is concerned, I mostly use gedit, Python, and PyTorch for deep learning.

The Data Science domain is rapidly evolving. How do you manage to keep up with all the latest developments?

Guanshuo: I get to know about most of the new stuff and technologies through Kaggle, my colleagues, or even by mere googling. As far as new developments in machine learning are concerned, it depends on the actual needs. I tend to filter out anything not instantly helpful and maybe keep an eye on the potentially exciting stuff. Then I get back to it as and when needed. 

A word of advice for the Data Science aspirants who have just started or wish to start their Data Science journey?

A virtual panel where Guanshuo, along with fellow H2O.ai Kaggle Grandmasters, shared his insights on Kaggle

Guanshuo: It basically depends on each person’s background and interests. However, finding a suitable platform to learn and develop skills can make things much easier in general. Additionally, taking part in Kaggle competitions can prove to be a helpful resource.


 

Achieving the world number one rank is no mean feat, and Guanshuo’s relentless attitude and hard work deserve all the credit. A peek into his various winning solutions on Kaggle showcases his structured approach, which is an essential element of problem-solving.

About the Author

Parul Pandey

Parul is a Data Science Evangelist here at H2O.ai. She combines Data Science, evangelism, and community in her work. She is also a Kaggle Grandmaster in the Notebooks category and was one of LinkedIn’s Top Voices in the Software Development category in 2019.
