January 7th, 2008

Adding Transactions to the Non-BlockingHashMap

RSS icon RSS Category: Personal
Fallback Featured Image

I went to the winter meeting of DaCapo, a small group of senior academics and industrial scientists mostly focussed on GC & Memory-related issues (but with fairly wide interests).  Much of the work is not yet published; work-in-progress stuff (lack of rigorous experiments, or badly broken implementations, or Big-Ideas-No-Code state, etc).  So I won’t repeat what I saw here directly, contact the authors directly if you’re interested in something.  The DaCapo group has decided to come out with a Java Server benchmark, similar to their popular DaCapo Benchmark Suite.  This will be something that’s much much easier to run than SpecJAppServer (that won’t be hard! SJAS is a total nightmare to setup) and much more realistic than SpecJBB2005.
Of the DaCapo material that’s already published, I like Tracking Bad Apples the best.  I’m going to float this one around Azul and see if we want implement it.  I also stumbled over this gem while perusing Emery Burger’s page.
Adding Transactions to the Non-BlockingHashMap
One of the other things that came out was the result of a quick conversation between Rick Hudson, Eliot Moss and myself: adding software transactions to the NonBlockingHashMap.  A transaction would mean that a single thread could do a series of reads & modifications to the HashMap and either abort them all, or commit them all atomically.  If committed, no other thread would see any partial updates and this thread is assured that no other thread modified any K/V pair read during the transaction.  Transactional support would go a long ways towards making a (fast, scalable, non-blocking) in-memory database (A C I but no D).  One of the reasons this idea looks so tractable, is that all the accesses to the transacted memory are already wrapped in function calls – there’s no requirement for a “below the JVM” or “inside the C compiler” implementation.  It can all be done in pure Java.
The basic idea here is to add another State on the Value slot, representing a “transaction in progress”.  Non-transacting threads seeing this state will be required to do more work: at least 1 more layer of indirection to get the actual value, and at least 1 flag check to decide if they need to report the before- or after- transaction-committed value.  The State would be implemented as a variant of box’ing like I do now: an instanceof test reports that the Value is really a Box, and the reader of the Box now reads through the Box to get the actual value.
I believe I can implement this in the State Machine that drives the NBHM, or at least I have a back-of-the-envelope FSM that handles 2 conflicting transactions.  Transactions would be non-blocking  (so no live-lock – always some thread gets to complete a transaction) – but not fair (I couldn’t find a Wikipedia article on the notion of fairness, although there are plenty of articles with the word fairness in their title).  It’s possible for a thread with long transaction to be continuously aborted by other threads with doing repeated short conflicting transactions.  In DB’s, I suspect fairness is a big deal.  In non-blocking algorithms it’s tough to implement: the endless winners need to stop winning at some point and help the endless loser to win every now and then.  The problem is, the winners are different threads than the loser and can’t actually execute the loser’s code directly (other than the little bits inside the NBHM calls).  If I block the winners to let the loser get through, then the algorithm becomes … well…, blocking.  I can ‘pause’ the winners for a fixed period of time and retain the non-blocking title, although this seems messy.  Does somebody have a better idea here?
Another, API-related question: should the transactional calls be under a completely separate API, or should I just have “begin_transaction”, “commit”, and “abort” calls, and assume all the calls in the middle are being transacted over?  What happens if a thread starts a transaction, then does a non-transactional conflicting write (abort the transaction?)?  Do I allow more than one thread to work on the same transaction?  i.e., make a “transaction object” and allow it to be passed around, and require it in transactional calls?

Leave a Reply

What are we buying today?

Note: this is a guest blog post by Shrinidhi Narasimhan. It’s 2021 and recommendation engines are

July 5, 2021 - by Rohan Rao
The Emergence of Automated Machine Learning in Industry

This post was originally published by K-Tech, Centre of Excellence for Data Science and AI,

June 30, 2021 - by Parul Pandey
What does it take to win a Kaggle competition? Let’s hear it from the winner himself.

In this series of interviews, I present the stories of established Data Scientists and Kaggle

June 14, 2021 - by Parul Pandey
Snowflake on H2O.ai
H2O Integrates with Snowflake Snowpark/Java UDFs: How to better leverage the Snowflake Data Marketplace and deploy In-Database

One of the goals of machine learning is to find unknown predictive features, even hidden

June 9, 2021 - by Eric Gudgion
Getting the best out of H2O.ai’s academic program

“H2O.ai provides impressively scalable implementations of many of the important machine learning tools in a

May 19, 2021 - by Ana Visneski and Jo-Fai Chow
Regístrese para su prueba gratuita y podrá explorar H2O AI Hybrid Cloud

Recientemente, lanzamos nuestra prueba gratuita de 14 días de H2O AI Hybrid Cloud, lo que

May 17, 2021 - by Ana Visneski and Jo-Fai Chow

Start your 14-day free trial today