September 3rd, 2008

Death by Decompilation

RSS icon RSS Category: Personal
Fallback Featured Image

I recently got involved in trying to improve performance of a large customer application.  Azul’s JVMs come with an always-on profiling tool – RTPM – so it was easy to spot where the time was going.  The name has been changed to protect the guilty but the name might as well have been “addToSet”.  A quick check of the JIT’d assembly showed that all time was spent in spinning down a long array (an ArrayList actually), checking for duplicates.  In the end, no dup gets found and the new element gets appended.  Adding N elements gives us a classic O(N^2) algorithm.  The array was already >32000 elements in length at the time we profiled it, so time is proportional to (32K^2) ~ 1billion.  Ugly stuff.
Naturally (grumble) this code is in a large 3rd party not-open-source application, where the owning corporation has been recently purchased by an even larger Faceless Corporation.  The “proper approach” would be to file a performance bug and wait and wait and …
We voted to try and work around the problem by doing a little decompilation, hacking the ArrayList into a HashSet, and inserting a new jar file in the bootclasspath.  So we pulled this class file out of the pile-o-jar’s and hit it with the standard decompiler jad (www.kpdus.com/jad.html).  Presto!  About 2000 lines of fairly legible Java code.  Plus a pile-o-warnings about un-decompilable stuff (Ut-oh!) mostly around nested try/catch/finally clauses and inner classes.  I surf the Java code looking for other uses of the offending ArrayList – there are definitely some – and start to see places where jad bailed out.  When I turn javac loose on the new source file, I get lots of error messages.
No problem thinks I, I’ll just get another decompiler.  I immediately google-up cavaj (cavaj-java-decompiler.en.softonic.com).  A few moments later I get another Java file – and whoops!  It’s header mentions jad by name down to the exact same revision.  Sure enough this decompiler is a GUI front-end for jad complete with identically broken Java code.  Turns out there are a number of such GUI wrappers for jad.  So I search some more, discarding mocha (www.brouhaha.com/~eric/software/mocha) for being abandoned, ClassCracker (mayon.actewagl.net.au/index.html) because the demo version didn’t produce any Java code at all, and IceBreaker (www.breakertech.com) because their web site was down.  I also tried Java Decompiler (java.decompiler.free.fr) but it also produced loads of syntax errors although in different places than jad.  DJ (members.fortunecity.com/neshkov/dj.html) also choked on my class file but just barely.  Very long forward jumps spanning nested try/finally blocks would confuse it – but the inner classes came out looking nice.  A colleague also tried JDec (sourceforge.net/projects/jdec) with limited success.  I also stumbled across this nice list of decompilers from google.
In the end, I hand-integrated the code from JDec & DJ to get something that would compile.  I then hand-refactored the code to insert (probably re-insert) generics and Java6 style iterators – this step was entirely mechanical on my part but no decompiler I tried did it.  The resulting code was much smaller and more readable… and made a bug in the code more obvious (there are “before” & “after” Sets that periodically get unioned and during the union the dup-elimination code is broken and allows dups).  The actual replacement of ArrayList with a HashSet took only a few minutes.
In summary: decompiling a single class file to reasonable-looking java code took about 6 hrs (and 7 decompilers looked at and 5 actually turned loose on the class file), while the refactoring to correct the egregious performance bug took maybe 10-minutes (yanked all the search loops and just used ‘contains’ and ‘addAll’) and included fixing the dups-allowed bug as a pleasant side-effect.
Is it time for another Java decompiler bake-off?  Has somebody got a recent comparison of decompilers floating about?
Thanks,
Cliff

Leave a Reply

AI-Driven Predictive Maintenance with H2O Hybrid Cloud

According to a study conducted by Wall Street Journal, unplanned downtime costs industrial manufacturers an

August 2, 2021 - by Parul Pandey
What are we buying today?

Note: this is a guest blog post by Shrinidhi Narasimhan. It’s 2021 and recommendation engines are

July 5, 2021 - by Rohan Rao
The Emergence of Automated Machine Learning in Industry

This post was originally published by K-Tech, Centre of Excellence for Data Science and AI,

June 30, 2021 - by Parul Pandey
What does it take to win a Kaggle competition? Let’s hear it from the winner himself.

In this series of interviews, I present the stories of established Data Scientists and Kaggle

June 14, 2021 - by Parul Pandey
Snowflake on H2O.ai
H2O Integrates with Snowflake Snowpark/Java UDFs: How to better leverage the Snowflake Data Marketplace and deploy In-Database

One of the goals of machine learning is to find unknown predictive features, even hidden

June 9, 2021 - by Eric Gudgion
Getting the best out of H2O.ai’s academic program

“H2O.ai provides impressively scalable implementations of many of the important machine learning tools in a

May 19, 2021 - by Ana Visneski and Jo-Fai Chow

Start your 14-day free trial today