ROOT and sorting

From: George Heintzelman (gah@bnl.gov)
Date: Tue Jan 04 2000 - 17:05:53 MET


Hi Rooters,

Lately it's become evident to me that one of the serious problems with 
ROOT is the way it handles sorting. The use everywhere of the generic 
virtual function TObject::Compare is, it seems to me, the root (no pun 
intended) of the problem. There are three major issues:

1) TCollections can hold a list of arbitrary TObjects. This, together 
with the use of the virtual Compare, means that inside the ObjCompare 
routine, you can have inconsistent behavior, e.g., according to one 
object's Compare routine, it is bigger than the other, but according to 
that other object's compare routine IT is bigger than the first one. 
I'm not sure exactly what will happen inside QSort if this condition 
obtains (need to study my algorithms more), but at best you're going to 
get inconsistent and unintuitive sorting results and at worst QSort may 
never terminate.

This condition is extremely hard to foresee and prevent in the present 
scheme, because the Compare routines may have been written by two 
people at very distinct times, who may not even know each other, and 
sometimes without looking at each others source code (eg, using a 
concrete ROOT object). In contrast, if you had a single routine always 
doing the comparison, you could reasonably expect the programmer of 
that routine to make sure that it was correctly anti-commutative and 
failures could be blamed on the user. :)

Now, you may say that sorting lists of arbitrary TObject-descendants is 
not a good idea in the first place, and you're probably right (which is 
why this should have used templates, but I digress). Even so, with 
lists of things which derive from a common base, you could still run 
into this problem if the hierarchy is more than minimally complex. And 
even if you restrict yourself to lists of a single well defined type, 
there are still two more problems.

2) If you want to be able to sort according to different criteria at 
different times, you have to put in some facility using a data member 
(probably static to avoid potential inconsistencies) to change the 
choice of comparision with a switch inside Compare. This is clunky at 
best, and certainly error-prone, and leads inexorably to the last point:

3) If you want to sort according to a criterion that hasn't been 
thought of before, you have to modify the source code of the object 
you're sorting (or copy all of those object over to a new derived class 
that you just created just for the purpose of sorting them... ugh). 
This may not be straightforward, especially if the objects you are 
using come from a third party (such as the ROOT team). To think about 
this, ask yourself how you would go about sorting a TList of TH1F's by 
the number of entries in each histogram. With the current scheme I and 
probably many others would just hack together my own little bubble sort 
and do it myself -- throwing code reuse out the window.

------------------------------------------------------------------------

Lest this seem like unconstructive carping without solutions, let me 
propose my solution:

TList::Sort() and cousins should take a parameter. This parameter 
should be a functor (function object -- ie, an object which overrides 
operator().) or pointer to function taking two TObject pointers and 
returning +1 0 or -1. I prefer the functor option myself, not least 
because it allows you to use inheritance to promote code reuse, and 
also because it would help make the sorting options documentable under 
the Root documentation system. (You could add a functor constructor 
from pointer to function as well, if you wanted. This could even let 
you transparently encapsulate CINT interactive functions if you did it 
right, though if you're performing a significant sort that's probably a 
performance problem.) For backwards compatibility, if no parameter is 
given, Sort would use the object's Compare function.

What do others think about this?

George Heintzelman



This archive was generated by hypermail 2b29 : Tue Jan 02 2001 - 11:50:16 MET