class TBtree: public TSeqCollection


TBtree

B-tree class. TBtree inherits from the TSeqCollection ABC.



B-tree Implementation notes

This implements B-trees with several refinements. Most of them can be found in Knuth Vol 3, but some were developed to adapt to restrictions imposed by C++. First, a restatement of Knuth's properties that a B-tree must satisfy, assuming we make the enhancement he suggests in the paragraph at the bottom of page 476. Instead of storing null pointers to non-existent nodes (which Knuth calls the leaves) we utilize the space to store keys. Therefore, what Knuth calls level (l-1) is the bottom of our tree, and we call the nodes at this level LeafNodes. Other nodes are called InnerNodes. The other enhancement we have adopted is in the paragraph at the bottom of page 477: overflow control.

The following are modifications of Knuth's properties on page 478:

  1. Every InnerNode has at most Order keys, and at most Order+1 sub-trees.
  2. Every LeafNode has at most 2*(Order+1) keys.
  3. An InnerNode with k keys has k+1 sub-trees.
  4. Every InnerNode that is not the root has at least InnerLowWaterMark keys.
  5. Every LeafNode that is not the root has at least LeafLowWaterMark keys.
  6. If the root is a LeafNode, it has at least one key.
  7. If the root is an InnerNode, it has at least one key and two sub-trees.
  8. All LeafNodes are the same distance from the root as all the other LeafNodes.
  9. For InnerNode n with key n[i].key, then sub-tree n[i-1].tree contains all keys < n[i].key, and sub-tree n[i].tree contains all keys >= n[i].key.
  10. Order is at least 3.

The values of InnerLowWaterMark and LeafLowWaterMark may actually be set by the user when the tree is initialized, but currently they are set automatically to:

        InnerLowWaterMark = ceiling(Order/2)
        LeafLowWaterMark  = Order - 1
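
As a rough illustration, the limits stated above can be tabulated for a given order. The sketch below is standalone C++ and does not use ROOT. `BtreeLimits` and `limitsForOrder` are hypothetical names, and the formulas are copied from these notes rather than from the TBtree source, which may compute its marks differently.

```cpp
// Hypothetical helper, not part of TBtree: the size limits as stated in the notes.
struct BtreeLimits {
    int maxInnerKeys;       // property 1: at most Order keys
    int maxInnerSubtrees;   // property 1: at most Order+1 sub-trees
    int maxLeafKeys;        // property 2: at most 2*(Order+1) keys
    int innerLowWaterMark;  // ceiling(Order/2)
    int leafLowWaterMark;   // Order - 1
};

BtreeLimits limitsForOrder(int order) {
    BtreeLimits l;
    l.maxInnerKeys      = order;
    l.maxInnerSubtrees  = order + 1;
    l.maxLeafKeys       = 2 * (order + 1);
    l.innerLowWaterMark = (order + 1) / 2;  // integer form of ceiling(order/2)
    l.leafLowWaterMark  = order - 1;
    return l;
}
```

Under these formulas the default order of 3 gives at most 3 keys per InnerNode, at most 8 keys per LeafNode, and a low water mark of 2 for both kinds of node.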

If the tree is only filled, then all the nodes will be at least 2/3 full. They will almost all be exactly 2/3 full if the elements are added to the tree in order (either increasing or decreasing). [Knuth says McCreight's experiments showed almost 100% memory utilization. I don't see how that can be given the algorithms that Knuth gives. McCreight must have used a different scheme for balancing. [No, he used a different scheme for splitting: he did a two-way split instead of the three way split as we do here. Which means that McCreight does better on insertion of ordered data, but we should do better on insertion of random data.]]

It must also be noted that B-trees were designed for DISK access algorithms, not necessarily in-memory sorting, as we intend it to be used here. However, if the order is kept small (< 6?) any inefficiency is negligible for in-memory sorting. Knuth points out that balanced trees are actually preferable for memory sorting. I'm not sure that I believe this, but it's interesting. Also, deleting elements from balanced binary trees, being beyond the scope of Knuth's book (p. 465), is beyond my scope. B-trees are good enough.

A B-tree is declared to be of a certain ORDER (3 by default). This number determines the number of keys contained in any interior node of the tree. Each interior node will contain ORDER keys, and therefore ORDER+1 pointers to sub-trees. The keys are numbered and indexed 1 to ORDER while the pointers are numbered and indexed 0 to ORDER. The 0th ptr points to the sub-tree of all elements that are less than key[1]. Ptr[1] points to the sub-tree that contains all the elements greater than key[1] and less than key[2], and so on. The array of pointers and keys is allocated as ORDER+1 pairs of keys and nodes, meaning that one key field (key[0]) is not used and therefore wasted. Given that the number of interior nodes is small, that this waste allows fewer cases of special code, and that it is useful in certain of the methods, it was felt to be a worthwhile waste.
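
The indexing scheme is perhaps easiest to see in a search step. The following is a minimal standalone sketch, assuming integer keys. `InnerSketch` and `subtreeIndexFor` are hypothetical names and are not TBtree members. It follows property 9 above: keys smaller than key[i] belong to ptr[i-1], and keys greater than or equal to key[i] belong to ptr[i].

```cpp
#include <vector>

// Hypothetical inner node: key[0] is the unused slot, keys live in key[1..last].
struct InnerSketch {
    std::vector<int> key;  // key[0] is never read
    int last;              // index of the last key in use, 1 <= last <= ORDER
};

// Returns the index i (0..last) of the sub-tree pointer to follow for searchKey.
int subtreeIndexFor(const InnerSketch& n, int searchKey) {
    int i = 0;
    // Move right while the next key is not greater than the search key.
    while (i < n.last && searchKey >= n.key[i + 1]) {
        ++i;
    }
    return i;
}
```

With keys 10, 20 and 30 stored at indices 1 to 3, a search for 5 follows pointer 0 and a search for 15 follows pointer 1. Because the loop only ever reads key[i + 1], the unused key[0] slot needs no special case.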

The size of the exterior nodes (leaf nodes) does not need to be related to the size of the interior nodes at all. Since leaf nodes contain only keys, they may be as large or small as we like independent of the size of the interior nodes. For no particular reason other than it seems like a good idea, we will allocate 2*(ORDER+1) keys in each leaf node, and they will be numbered and indexed from 0 to 2*ORDER+1. It does have the advantage of keeping the size of the leaf and interior arrays the same, so that if we find allocation and de-allocation of these arrays expensive, we can modify their allocation to use a garbage ring, or something.

Both of these numbers will be run-time constants associated with each tree (each tree at run-time can be of a different order). The variable "order" is the order of the tree, and the inclusive upper limit on the indices of the keys in the interior nodes. The variable "order2" is the inclusive upper limit on the indices of the leaf nodes, and is designed

  (1) to keep the sizes of the two kinds of nodes the same;
  (2) to keep the expressions involving the arrays of keys looking somewhat the same:

                            lower limit    upper limit
        for inner nodes:         1            order
        for leaf nodes:          0            order2

Remember that index 0 of the inner nodes is special.

Currently, order2 = 2*(order+1).

Picture: (also see Knuth Vol 3 pg 478)

              +--+--+--+--+--+--...
              |  |  |  |  |  |
    parent--->|  |     |     |
              |  |     |     |
              +*-+*-+*-+--+--+--...
               |  |  |
          +----+  |  +-----+
          |       +-----+  |
          V             |  V
          +----------+  |  +----------+
          |          |  |  |          |
    this->|          |  |  |          |<--sib
          +----------+  |  +----------+
                        V
                       data

It is conceptually VERY convenient to think of the data as being the very first element of the sib node. Any primitive that tells sib to perform some action on n nodes should include this 'hidden' element. For InnerNodes, the hidden element has (physical) index 0 in the array, and in LeafNodes, the hidden element has (virtual) index -1 in the array. Therefore, there are two 'size' primitives for nodes:

  Psize - the physical size: how many elements are contained in the array in the node.
  Vsize - the 'virtual' size; if the node is pointed to by element 0 of the parent node, then Vsize == Psize; otherwise the element in the parent item that points to this node 'belongs' to this node, and Vsize == Psize+1.
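
The relationship between the two sizes can be written down directly. This is only a sketch of the rule as stated here: `NodeSketch` and its fields are hypothetical, and the real node classes presumably derive this information from the parent pointer rather than from a stored flag.

```cpp
// Hypothetical node summary, not a TBtree class.
struct NodeSketch {
    int  psize;                 // number of elements physically stored in the node
    bool pointedToByElement0;   // true if element 0 of the parent points to this node
};

// Vsize counts the parent's element as belonging to this node,
// unless the node hangs off element 0 of the parent.
int vsize(const NodeSketch& n) {
    return n.pointedToByElement0 ? n.psize : n.psize + 1;
}
```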

Parent nodes are always InnerNodes.

These are the primitive operations on Nodes:

  Append(elt) - adds an element to the end of the array of elements in a node. It must never be called where appending the element would fill the node.
  Split() - divide a node in two, and create two new nodes.
  SplitWith(sib) - create a third node between this node and the sib node, divvying up the elements of their arrays.
  PushLeft(n) - move n elements into the left sibling.
  PushRight(n) - move n elements into the right sibling.
  BalanceWithRight() - even up the number of elements in the two nodes.
  BalanceWithLeft() - ditto.

To allow this implementation of btrees to also be an implementation of sorted arrays/lists, the overhead is included to allow O(log n) access of elements by their rank (`give me the 5th largest element'). Therefore, each Item keeps track of the number of keys in and below it in the tree (remember, each item's tree is all keys to the RIGHT of the item's own key).

        [ [ < 0 1 2 3 > 4 < 5 6 7 > 8 < 9 10 11 12 > ] 13 [ < 14 15 16 > 17 < 18 19 20 > ] ]
            4 1 1 1 1   4   1 1 1   5   1 1  1  1      7    3 1  1  1    4    1  1  1
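
To show how such counts make rank lookup cheap, here is a small standalone sketch of a single descent step. It assumes zero-based ranks and takes the per-item counts of one node as a plain vector. `RankStep` and `locateRank` are hypothetical names, and this is not the code TBtree uses. Repeating the step once per level is what gives the O(log n) access mentioned above.

```cpp
#include <cstddef>
#include <vector>

struct RankStep {
    int item;    // index of the item whose key or sub-tree holds the rank, or -1
    int offset;  // zero-based position of the rank among that item's keys
};

// nofKeys[i] is the number of keys in and below item i of one node.
RankStep locateRank(const std::vector<int>& nofKeys, int rank) {
    if (rank < 0) {
        return {-1, 0};
    }
    for (std::size_t i = 0; i < nofKeys.size(); ++i) {
        if (rank < nofKeys[i]) {
            return {static_cast<int>(i), rank};
        }
        rank -= nofKeys[i];  // skip every key held by this item
    }
    return {-1, 0};  // rank is past the last key of this node
}
```

Taking the counts 4, 4 and 5 of the first inner node in the example, rank 8 lands on item 2 with offset 0, which is the item's own key 8. Rank 6 lands on item 1 with offset 2: offset 0 would be the key 4 itself, so offset 2 is the second key of its sub-tree, namely 6.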

Function Members (Methods)

public:
TBtree(Int_t ordern = 3)
virtual ~TBtree()
void TObject::AbstractMethod(const char* method) const
virtual void Add(TObject* obj)
virtual void AddAfter(const TObject*, TObject* obj)
virtual void TCollection::AddAll(const TCollection* col)
virtual void AddAt(TObject* obj, Int_t)
virtual void AddBefore(const TObject*, TObject* obj)
virtual void AddFirst(TObject* obj)
virtual void AddLast(TObject* obj)
void TCollection::AddVector(TObject* obj1)
virtual TObject* After(const TObject* obj) const
virtual void TObject::AppendPad(Option_t* option = "")
Bool_t TCollection::AssertClass(TClass* cl) const
virtual TObject* At(Int_t i) const
virtual TObject* Before(const TObject* obj) const
virtual void TCollection::Browse(TBrowser* b)
Int_t TCollection::Capacity() const
static TClass* Class()
virtual const char* TObject::ClassName() const
virtual void Clear(Option_t* option = "")
virtual TObject* TCollection::Clone(const char* newname = "") const
virtual Int_t TCollection::Compare(const TObject* obj) const
Bool_t TCollection::Contains(const char* name) const
Bool_t TCollection::Contains(const TObject* obj) const
virtual void TObject::Copy(TObject& object) const
virtual void Delete(Option_t* option = "")
virtual Int_t TObject::DistancetoPrimitive(Int_t px, Int_t py)
virtual void TCollection::Draw(Option_t* option = "")
virtual void TObject::DrawClass() const // *MENU*
virtual TObject* TObject::DrawClone(Option_t* option = "") const // *MENU*
virtual void TCollection::Dump() const
static void TCollection::EmptyGarbageCollection()
virtual void TObject::Error(const char* method, const char* msgfmt) const
virtual void TObject::Execute(const char* method, const char* params, Int_t* error = 0)
virtual void TObject::Execute(TMethod* method, TObjArray* params, Int_t* error = 0)
virtual void TObject::ExecuteEvent(Int_t event, Int_t px, Int_t py)
virtual void TObject::Fatal(const char* method, const char* msgfmt) const
virtual TObject* FindObject(const char* name) const
virtual TObject* FindObject(const TObject* obj) const
virtual TObject* First() const
static void TCollection::GarbageCollect(TObject* obj)
static TCollection* TCollection::GetCurrentCollection()
virtual Option_t* TObject::GetDrawOption() const
static Long_t TObject::GetDtorOnly()
virtual Int_t TCollection::GetEntries() const
virtual const char* TObject::GetIconName() const
virtual Int_t TSeqCollection::GetLast() const
virtual const char* TCollection::GetName() const
virtual char* TObject::GetObjectInfo(Int_t px, Int_t py) const
virtual TObject** GetObjectRef(const TObject*) const
static Bool_t TObject::GetObjectStat()
virtual Option_t* TObject::GetOption() const
virtual Int_t TCollection::GetSize() const
virtual const char* TObject::GetTitle() const
virtual UInt_t TObject::GetUniqueID() const
virtual Int_t TCollection::GrowBy(Int_t delta) const
virtual Bool_t TObject::HandleTimer(TTimer* timer)
virtual ULong_t TCollection::Hash() const
virtual Int_t TSeqCollection::IndexOf(const TObject* obj) const
virtual void TObject::Info(const char* method, const char* msgfmt) const
virtual Bool_t TObject::InheritsFrom(const char* classname) const
virtual Bool_t TObject::InheritsFrom(const TClass* cl) const
virtual void TObject::Inspect() const // *MENU*
void TObject::InvertBit(UInt_t f)
virtual TClass* IsA() const
Bool_t TCollection::IsArgNull(const char* where, const TObject* obj) const
virtual Bool_t TCollection::IsEmpty() const
virtual Bool_t TObject::IsEqual(const TObject* obj) const
virtual Bool_t TCollection::IsFolder() const
Bool_t TObject::IsOnHeap() const
Bool_t TCollection::IsOwner() const
virtual Bool_t TCollection::IsSortable() const
virtual Bool_t TSeqCollection::IsSorted() const
Bool_t TObject::IsZombie() const
virtual TObject* Last() const
Int_t TSeqCollection::LastIndex() const
virtual void TCollection::ls(Option_t* option = "") const
virtual TIterator* MakeIterator(Bool_t dir = kIterForward) const
virtual TIterator* TCollection::MakeReverseIterator() const
void TObject::MayNotUse(const char* method) const
Long64_t TSeqCollection::Merge(TCollection* list)
virtual Bool_t TObject::Notify()
static Int_t TSeqCollection::ObjCompare(TObject* a, TObject* b)
void TObject::Obsolete(const char* method, const char* asOfVers, const char* removedFromVers) const
static void TObject::operator delete(void* ptr)
static void TObject::operator delete(void* ptr, void* vp)
static void TObject::operator delete[](void* ptr)
static void TObject::operator delete[](void* ptr, void* vp)
void* TObject::operator new(size_t sz)
void* TObject::operator new(size_t sz, void* vp)
void* TObject::operator new[](size_t sz)
void* TObject::operator new[](size_t sz, void* vp)
TObject* TCollection::operator()(const char* name) const
TObject& TObject::operator=(const TObject& rhs)
TObject* operator[](Int_t i) const
Int_t Order()
virtual void TCollection::Paint(Option_t* option = "")
virtual void TObject::Pop()
virtual void TCollection::Print(Option_t* option = "") const
virtual void TCollection::Print(Option_t* option, Int_t recurse) const
virtual void TCollection::Print(Option_t* option, const char* wildcard, Int_t recurse = 1) const
virtual void TCollection::Print(Option_t* option, TPRegexp& regexp, Int_t recurse = 1) const
static void TSeqCollection::QSort(TObject** a, Int_t first, Int_t last)
static void TSeqCollection::QSort(TObject** a, TObject** b, Int_t first, Int_t last)
static void TSeqCollection::QSort(TObject** a, Int_t nBs, TObject*** b, Int_t first, Int_t last)
Int_t Rank(const TObject* obj) const
virtual Int_t TObject::Read(const char* name)
virtual void TCollection::RecursiveRemove(TObject* obj)
virtual TObject* Remove(TObject* obj)
virtual void TSeqCollection::RemoveAfter(TObject* after)
void TCollection::RemoveAll()
virtual void TCollection::RemoveAll(TCollection* col)
virtual TObject* TSeqCollection::RemoveAt(Int_t idx)
virtual void TSeqCollection::RemoveBefore(TObject* before)
virtual void TSeqCollection::RemoveFirst()
virtual void TSeqCollection::RemoveLast()
void TObject::ResetBit(UInt_t f)
virtual void TObject::SaveAs(const char* filename = "", Option_t* option = "") const // *MENU*
virtual void TObject::SavePrimitive(ostream& out, Option_t* option = "")
void TObject::SetBit(UInt_t f)
void TObject::SetBit(UInt_t f, Bool_t set)
void TCollection::SetCurrentCollection()
virtual void TObject::SetDrawOption(Option_t* option = "") // *MENU*
static void TObject::SetDtorOnly(void* obj)
void TCollection::SetName(const char* name)
static void TObject::SetObjectStat(Bool_t stat)
virtual void TCollection::SetOwner(Bool_t enable = kTRUE)
virtual void TObject::SetUniqueID(UInt_t uid)
virtual void ShowMembers(TMemberInspector&)
static void TCollection::StartGarbageCollection()
virtual void Streamer(TBuffer&)
void StreamerNVirtual(TBuffer& ClassDef_StreamerNVirtual_b)
virtual void TObject::SysError(const char* method, const char* msgfmt) const
Bool_t TObject::TestBit(UInt_t f) const
Int_t TObject::TestBits(UInt_t f) const
void TSeqCollection::UnSort()
virtual void TObject::UseCurrentStyle()
virtual void TObject::Warning(const char* method, const char* msgfmt) const
virtual Int_t TCollection::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0)
virtual Int_t TCollection::Write(const char* name = 0, Int_t option = 0, Int_t bufsize = 0) const
protected:
virtual void TSeqCollection::Changed()
void DecrNofKeys()
virtual void TObject::DoError(int level, const char* location, const char* fmt, va_list va) const
virtual const char* TCollection::GetCollectionEntryName(TObject* entry) const
Int_t IdxAdd(const TObject& obj)
void IncrNofKeys()
void TObject::MakeZombie()
virtual void TCollection::PrintCollectionEntry(TObject* entry, Option_t* option, Int_t recurse) const
virtual void TCollection::PrintCollectionHeader(Option_t* option) const
private:
void Init(Int_t i)
void RootIsEmpty()
void RootIsFull()

Data Members

protected:
TString   TCollection::fName        name of the collection
Int_t     TCollection::fSize        number of elements in collection
Bool_t    TSeqCollection::fSorted   true if collection has been sorted
private:
Int_t     fInnerLowWaterMark   inner node low water mark
Int_t     fInnerMaxIndex       maximum inner node index
Int_t     fLeafLowWaterMark    leaf low water mark
Int_t     fLeafMaxIndex        maximum leaf index
Int_t     fOrder               the order of the tree (should be > 2)
Int_t     fOrder2              order*2+1 (assumes a memory access is
TBtNode*  fRoot                root node of btree

Function documentation

TBtree(Int_t ordern = 3)
 Create a B-tree of certain order (by default 3).
~TBtree()
 Delete B-tree. Objects are not deleted unless the TBtree is the
 owner (set via SetOwner()).
void Add(TObject* obj)
 Add object to B-tree.
TObject * After(const TObject* obj) const
 Cannot use this method since B-tree decides order.
TObject * Before(const TObject* obj) const
 May not use this method since B-tree decides order.
void Clear(Option_t* option = "")
 Remove all objects from B-tree. Does NOT delete objects unless the TBtree
 is the owner (set via SetOwner()).
void Delete(Option_t* option = "")
 Remove all objects from B-tree AND delete all heap based objects.
TObject * FindObject(const char* name) const
 Find object using its name (see object's GetName()). Requires sequential
 search of complete tree till object is found.
TObject * FindObject(const TObject* obj) const
 Find object using the object's Compare() member function.
Int_t IdxAdd(const TObject& obj)
 Add object and return its index in the tree.
void Init(Int_t i)
 Initialize a B-tree.
TIterator * MakeIterator(Bool_t dir = kIterForward) const
 Returns a B-tree iterator.
Int_t Rank(const TObject* obj) const
 Returns the rank of the object in the tree.
TObject * Remove(TObject* obj)
 Remove an object from the tree.
void RootIsFull()
 The root of the tree is full. Create an InnerNode that
 points to it, and then inform the InnerNode that it is full.
void RootIsEmpty()
 If root is empty clean up its space.
void Streamer(TBuffer& )
 Stream all objects in the btree to or from the I/O buffer.
TObject * operator[](Int_t i) const
TObject * At(Int_t i) const
TObject * First() const
TObject * Last() const
void IncrNofKeys()
{ fSize++; }
void DecrNofKeys()
{ fSize--; }
TObject ** GetObjectRef(const TObject* ) const
{ return 0; }
void AddFirst(TObject* obj)
{ Add(obj); }
void AddLast(TObject* obj)
{ Add(obj); }
void AddAt(TObject* obj, Int_t )
{ Add(obj); }
void AddAfter(const TObject* , TObject* obj)
{ Add(obj); }
void AddBefore(const TObject* , TObject* obj)
{ Add(obj); }
Int_t Order()
{ return fOrder; }