Re: ROOT and sorting

From: Pasha Murat (630)840-8237 FNAL (630)859-3463 home (murat@murat.fnal.gov)
Date: Tue Jan 04 2000 - 17:42:52 MET


Hi George,

it would be nice if you give an example of inhomogenious collection 
you've been trying to sort. Without the example it is hard to see what 
is the problem you're trying to solve.
					-best, Pasha

George Heintzelman wrote:
> 
> Hi Rooters,
> 
> Lately it's become evident to me that one of the serious problems with
> ROOT is the way it handles sorting. The use everywhere of the generic
> virtual function TObject::Compare is, it seems to me, the root (no pun
> intended) of the problem. There are three major issues:
> 
> 1) TCollections can hold a list of arbitrary TObjects. This, together
> with the use of the virtual Compare, means that inside the ObjCompare
> routine, you can have inconsistent behavior, e.g., according to one
> object's Compare routine, it is bigger than the other, but according to
> that other object's compare routine IT is bigger than the first one.
> I'm not sure exactly what will happen inside QSort if this condition
> obtains (need to study my algorithms more), but at best you're going to
> get inconsistent and unintuitive sorting results and at worst QSort may
> never terminate.
> 
> This condition is extremely hard to foresee and prevent in the present
> scheme, because the Compare routines may have been written by two
> people at very distinct times, who may not even know each other, and
> sometimes without looking at each others source code (eg, using a
> concrete ROOT object). In contrast, if you had a single routine always
> doing the comparison, you could reasonably expect the programmer of
> that routine to make sure that it was correctly anti-commutative and
> failures could be blamed on the user. :)
> 
> Now, you may say that sorting lists of arbitrary TObject-descendants is
> not a good idea in the first place, and you're probably right (which is
> why this should have used templates, but I digress). Even so, with
> lists of things which derive from a common base, you could still run
> into this problem if the hierarchy is more than minimally complex. And
> even if you restrict yourself to lists of a single well defined type,
> there are still two more problems.
> 
> 2) If you want to be able to sort according to different criteria at
> different times, you have to put in some facility using a data member
> (probably static to avoid potential inconsistencies) to change the
> choice of comparision with a switch inside Compare. This is clunky at
> best, and certainly error-prone, and leads inexorably to the last point:
> 
> 3) If you want to sort according to a criterion that hasn't been
> thought of before, you have to modify the source code of the object
> you're sorting (or copy all of those object over to a new derived class
> that you just created just for the purpose of sorting them... ugh).
> This may not be straightforward, especially if the objects you are
> using come from a third party (such as the ROOT team). To think about
> this, ask yourself how you would go about sorting a TList of TH1F's by
> the number of entries in each histogram. With the current scheme I and
> probably many others would just hack together my own little bubble sort
> and do it myself -- throwing code reuse out the window.
> 
> ------------------------------------------------------------------------
> 
> Lest this seem like unconstructive carping without solutions, let me
> propose my solution:
> 
> TList::Sort() and cousins should take a parameter. This parameter
> should be a functor (function object -- ie, an object which overrides
> operator().) or pointer to function taking two TObject pointers and
> returning +1 0 or -1. I prefer the functor option myself, not least
> because it allows you to use inheritance to promote code reuse, and
> also because it would help make the sorting options documentable under
> the Root documentation system. (You could add a functor constructor
> from pointer to function as well, if you wanted. This could even let
> you transparently encapsulate CINT interactive functions if you did it
> right, though if you're performing a significant sort that's probably a
> performance problem.) For backwards compatibility, if no parameter is
> given, Sort would use the object's Compare function.
> 
> What do others think about this?
> 
> George Heintzelman



This archive was generated by hypermail 2b29 : Tue Jan 02 2001 - 11:50:16 MET