September 18th, 2008

A Plea For Programs

Category: Personal

[Update 9/21/2008: I’ve got a simple sample program for people to port, and I have at least some code for Clojure, JavaScript/Rhino, JPC and JRuby.  Scala, at least, is still missing.  Thanks!]
I would like some non-Java programs that compile to Java bytecode, to use for performance testing in a talk I’m giving this coming Friday (my bad for starting this late), and I’m hoping my gentle readers can supply some.  I’d like programs in different languages, but ones that are easy to set up and run.  I’m going to do internal JVM profiling, so I’m not all that concerned with the output or “Foo-per-second” results.  Ideally, my programs would be:

  • Non-Java.  Clojure, Scala, JPython, JRuby all come to mind.  The more variation, the merrier!
  • Easy setup.  I’m not an expert in any of these, so the resulting program has to be easy to set up and run.  Preferably a simple “java -cp Weirdo.jar FunnyProgram” command line.
  • Plain JVM.  Note that the ‘java’ command has to be there; I intend to use Azul Systems’ JVM for profiling, and we have our own ‘java’ launcher.  Any kind of odd-ball jar or class files should be fine.
  • Long enough.  The program has to run for several minutes at least, without “babysitting”.  Long enough for the JIT to settle down (if it’s going to), and long enough for decent profiling.
  • Little I/O.  Besides DBs being a pain to set up, I’m really looking for CPU-bound programs.  Plain file I/O is fine, if the files are provided and can be scripted easily (e.g. “java -cp Weirdo.jar FunnyProgram < BigInput.dat > /dev/null”).
  • Be multi-threaded.  Not a requirement, but a definite nice-to-have.  Several of these languages support alternative threading & coherency models and I’d like to test these features.
  • Be Open Source, so I can post the collection for others to compare against.  This is NOT a hard requirement; I’m fine with keeping private anything you ask to be kept private.  Performance profiling data will be released, as that is what the talk is about!  (I’m also fine with signing NDAs but that’s probably not going to be an issue with this crowd).
  • An example: A multi-threaded Mandelbrot program would be fine, computing a 1000×1000 grid of points centered around (1.0,1.0) with a spread of (1.0,1.0) – so fill in the grid (0.5,0.5) to (1.5,1.5), using your choice of thread controls.
  • Please include any names, so I can give credit where credit is due.
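
To make the Mandelbrot example above concrete, here is a minimal Java sketch of the shape I have in mind for a port.  The class name, the iteration cap and the one-task-per-row split are assumptions made for illustration, not part of the request; a real submission would need a heavier workload (or an outer repeat loop) to run for minutes, since points in this region escape after only a few iterations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: class name, MAX_ITER and the row-per-task split are assumptions.
class MandelbrotSketch {
    static final double MIN = 0.5;     // grid spans (0.5,0.5) to (1.5,1.5)
    static final double SPREAD = 1.0;
    static final int MAX_ITER = 1000;  // assumed iteration cap

    // Iterations of z = z*z + c before |z| exceeds 2, capped at maxIter.
    static int escapeCount(double cr, double ci, int maxIter) {
        double zr = 0.0, zi = 0.0;
        int n = 0;
        while (n < maxIter && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            n++;
        }
        return n;
    }

    // Fill a size x size grid (size >= 2), one task per row on a fixed thread pool.
    static int[][] compute(int size, int threads) {
        int[][] grid = new int[size][size];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> rows = new ArrayList<>();
            for (int row = 0; row < size; row++) {
                final int y = row;
                rows.add(pool.submit(() -> {
                    double ci = MIN + SPREAD * y / (size - 1);
                    for (int x = 0; x < size; x++) {
                        double cr = MIN + SPREAD * x / (size - 1);
                        grid[y][x] = escapeCount(cr, ci, MAX_ITER);
                    }
                }));
            }
            for (Future<?> f : rows) f.get();  // wait for every row to finish
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return grid;
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        int[][] grid = compute(1000, threads);  // the 1000x1000 grid from the example
        long checksum = 0;
        for (int[] row : grid) for (int v : row) checksum += v;
        System.out.println("checksum " + checksum);
    }
}
```

A fixed thread pool is just the plainest Java choice; the point of the exercise is for ports to swap in their own language’s thread controls.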

I hope to discover things like:

  • How closely does “plain code” match the JVM/JIT’s expectations?  How well does the JIT turn “plain code” into machine instructions?  I hope to present the JIT’d code for sample language constructs and detailed profiling data.
  • How well does the function-call logic match the JVM/JIT’s expectations?  Can trivial functions be inlined?  What’s the cost of a not-inlined function-call?
  • Other interesting costs?  (e.g., endless new-Class churning, endless new-bytecode churning causing endless JIT’ing; endless new weak-ref or finalizer creation causing GC grief, etc)
  • How well do the alternative threading & coherency models scale?  Can Mandelbrot run on a thousand CPUs?  (I expect: trivially yes).  How about programs with more interesting coherency requirements?
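
As a Java baseline for the function-call question above, a port could mirror something like the following sketch: the same arithmetic done once through a trivial function and once written out by hand.  All names and loop counts here are assumptions for illustration.  On HotSpot, running with “-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining” should report whether the trivial call was inlined; other JVMs may need different flags.

```java
// Hypothetical micro-probe: class name, method names and loop counts are assumptions.
class InlineProbe {
    // A trivial function that a JIT would normally be expected to inline.
    static int addOne(int x) {
        return x + 1;
    }

    // Sums addOne(i) for i in [0, iters) through a real call site.
    static long viaCall(int iters) {
        long sum = 0;
        for (int i = 0; i < iters; i++) sum += addOne(i);
        return sum;
    }

    // Same arithmetic with the call written out by hand, for comparison.
    static long byHand(int iters) {
        long sum = 0;
        for (int i = 0; i < iters; i++) sum += i + 1;
        return sum;
    }

    public static void main(String[] args) {
        int iters = 100_000_000;
        // Repeat several rounds so the JIT has a chance to settle down.
        for (int round = 0; round < 10; round++) {
            long t0 = System.nanoTime();
            long a = viaCall(iters);
            long t1 = System.nanoTime();
            long b = byHand(iters);
            long t2 = System.nanoTime();
            System.out.printf("round %d: call %d ms, by hand %d ms (sums %d, %d)%n",
                    round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a, b);
        }
    }
}
```

Wall-clock timings like these are only a rough hint; the JIT’d code and the profile are what I’d actually look at.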

I put a sample Java program here, if you’d like to port something really simple.  The inner loop of this program looks like: “for( i=0; i<1000000; i++ ) { sum += ((int)(sum^i)/i); }”.  The JIT’d assembly code from HotSpot’s server compiler looks like this, unrolled a few times:

2.83%  243  0x12d93878  add4  r5, r4, 1            // tmp=i+1; unrolled 8 times, this is #1
0.06%    5  0x12d9387c  xor   r3, r5, r1           // sum in r1, tmp in r5
0.06%    5  0x12d93880  beq   r5, 0, 0x012d93b40   // zero check before divide
0.35%   30  0x12d93884  div4  r0, r3, r5           // divide, notice cycles on next op
2.64%  227  0x12d93888  add4  r1, r0, r1           // sum += (sum ^ tmp)/tmp

As expected, there’s a pretty direct mapping from the source code to the machine code.  I’d like to see how other JVM-based languages stack up here. Email me directly with small programs, or post links here.
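
For anyone who wants a starting point for a port, a minimal harness around the inner loop quoted above might look like the sketch below.  This is not the sample program itself: the class name, the int type for sum, the outer pass count and the command-line argument are all assumptions.  It also starts i at 1, which is my assumption here, so that the very first iteration does not divide by zero.

```java
// Hypothetical harness: names, types and the pass count are assumptions, not the original sample.
class SimpleLoopSketch {
    // One pass of the quoted inner loop over i in [1, limit).
    // Starting at 1 (an assumption) avoids dividing by zero on the first iteration.
    static int pass(int sum, int limit) {
        for (int i = 1; i < limit; i++) {
            sum += (sum ^ i) / i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed knob: the number of outer passes, so the run lasts long enough to profile.
        int passes = args.length > 0 ? Integer.parseInt(args[0]) : 20_000;
        int sum = 0;
        for (int p = 0; p < passes; p++) {
            sum = pass(sum, 1_000_000);
        }
        // Print the result so the loop cannot be discarded as dead code.
        System.out.println(sum);
    }
}
```

Raise or lower the pass count until the run lasts the several minutes asked for above.
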
Thanks!
Cliff
