Logo ROOT  
Reference Guide
 
Loading...
Searching...
No Matches
TTreeCache.cxx
Go to the documentation of this file.
1// @(#)root/tree:$Id$
2// Author: Rene Brun 04/06/2006
3
4/*************************************************************************
5 * Copyright (C) 1995-2018, Rene Brun and Fons Rademakers. *
6 * All rights reserved. *
7 * *
8 * For the licensing terms see $ROOTSYS/LICENSE. *
9 * For the list of contributors see $ROOTSYS/README/CREDITS. *
10 *************************************************************************/
11
12/** \class TTreeCache
13\ingroup tree
14\brief A cache to speed-up the reading of ROOT datasets
15
16# A cache to speed-up the reading of ROOT datasets
17
18## Table of Contents
19- [Motivation](\ref motivation)
20- [General Description](\ref description)
21- [Changes in behaviour](\ref changesbehaviour)
22- [Self-optimization](\ref cachemisses)
23- [Examples of usage](\ref examples)
24- [Check performance and stats](\ref checkPerf)
25
26\anchor motivation
27## Motivation: why having a cache is needed?
28
29When writing a TTree, the branch buffers are kept in memory.
30A typical branch buffersize (before compression) is typically 32 KBytes.
31After compression, the zipped buffer may be just a few Kbytes.
32The branch buffers cannot be much larger in case of TTrees with several
33hundred or thousand branches.
34
35When writing, this does not generate a performance problem because branch
36buffers are always written sequentially and, thanks to OS optimisations,
37content is flushed to the output file when a few MBytes of data are available.
38On the other hand, when reading, one may hit performance problems because of
39latencies e.g imposed by network.
40For example in a WAN with 10ms latency, reading 1000 buffers of 10 KBytes each
41with no cache will imply 10s penalty where a local read of the 10 MBytes would
42take about 1 second.
43
44The TreeCache tries to prefetch all the buffers for the selected branches
45in order to transfer a few multi-Megabytes large buffers instead of many
46multi-kilobytes small buffers. In addition, TTreeCache can sort the blocks to
47be read in increasing order such that the file is read sequentially.
48
49Systems like xrootd, dCache or httpd take advantage of the TTreeCache in
50reading ahead as much data as they can and return to the application
51the maximum data specified in the cache and have the next chunk of data ready
52when the next request comes.
53
54### Are there cases for which the usage of TTreeCache is detrimental for performance?
55Yes, some corner cases. For example, when reading only a small fraction of all
56entries such that not all branch buffers are read.
57
58\anchor description
59## General Description
60This class acts as a file cache, registering automatically the baskets from
61the branches being processed via direct manipulation of TTrees or with tools
62such as TTree::Draw, TTree::Process, TSelector, TTreeReader and RDataFrame
63when in the learning phase. The learning phase is by default 100 entries.
64It can be changed via TTreeCache::SetLearnEntries.
65
66The usage of a TTreeCache can considerably improve the runtime performance at
67the price of a modest investment in memory, in particular when the TTree is
68accessed remotely, e.g. via a high latency network.
69
70For each TTree being processed a TTreeCache object is created.
71This object is automatically deleted when the Tree is deleted or
72when the file is deleted.
73The user can change the size of the cache with the TTree::SetCacheSize method
74(by default the size is 30 Megabytes). This feature can be controlled with the
75environment variable `ROOT_TTREECACHE_SIZE` or the TTreeCache.Size option.
76The entry range for which the cache is active can also be set with the
77SetEntryRange method.
78
79\anchor changesbehaviour
80## Changes of behavior when using TChain and TEventList
81
82The usage of TChain or TEventList have influence on the behaviour of the cache:
83
84- Special case of a TChain
85 Once the training is done on the first Tree, the list of branches
86 in the cache is kept for the following files.
87
88- Special case of a TEventlist
89 if the Tree or TChain has a TEventlist, only the buffers
90 referenced by the list are put in the cache.
91
92The learning phase is started or restarted when:
93 - TTree automatically creates a cache.
94 - TTree::SetCacheSize is called with a non-zero size and a cache
95 did not previously exist
96 - TTreeCache::StartLearningPhase is called.
97 - TTreeCache::SetEntryRange is called
98 * and the learning is not yet finished
99 * and has not been set to manual
100 * and the new minimun entry is different.
101
102The learning period is stopped (and prefetching is started) when:
103 - TTreeCache::StopLearningPhase is called.
104 - An entry outside the 'learning' range is requested
105 The 'learning range is from fEntryMin (default to 0) to
106 fEntryMin + fgLearnEntries.
107 - A 'cached' TChain switches over to a new file.
108
109
110\anchor cachemisses
111## Self-optimization in presence of cache misses
112
113The TTreeCache can optimize its behavior on a cache miss. When
114miss optimization is enabled (see the SetOptimizeMisses method),
115it tracks all branches utilized after the learning phase which caused a cache
116miss.
117When one cache miss occurs, all the utilized branches are be prefetched
118for that event. This optimization utilizes the observation that infrequently
119accessed branches are often accessed together.
120An example scenario where such behavior is desirable, is an analysis where
121a set of collections are read only for a few events in which a certain
122condition is respected, e.g. a trigger fired.
123
124### Additional memory and CPU usage when optimizing for cache misses
125When this mode is enabled, the memory dedicated to the cache can increase
126by at most a factor two in the case of cache miss.
127Additionally, on the first miss of an event, we must iterate through all the
128"active branches" for the miss cache and find the correct basket.
129This can be potentially a CPU-expensive operation compared to, e.g., the
130latency of a SSD. This is why the miss cache is currently disabled by default.
131
132\anchor examples
133## Example usages of TTreeCache
134
135A few use cases are discussed below. A cache may be created with automatic
136sizing when a TTree is used:
137
138In some applications, e.g. central processing workflows of experiments, the list
139of branches to read is known a priori. For these cases, the TTreeCache can be
140instructed about the branches which will be read via explicit calls to the TTree
141or TTreeCache interfaces.
142In less streamlined applications such as analysis, predicting the branches which
143will be read can be difficult. In such cases, ROOT I/O flags used branches
144automatically when a branch buffer is read during the learning phase.
145
146In the examples below, portions of analysis code are shown.
147The few statements involving the TreeCache are marked with `//<<<`
148
149### ROOT::RDataFrame and TTreeReader Examples
150
151If you use RDataFrame or TTreeReader, the system will automatically cache the
152best set of branches: no action is required by the user.
153
154### TTree::Draw Example
155
156The TreeCache is automatically used by TTree::Draw. The method knows
157which branches are used in the query and it puts automatically these branches
158in the cache. The entry range is also inferred automatically.
159
160### TTree::Process and TSelectors Examples
161
162The user must enable the cache and tell the system which branches to cache
163and also specify the entry range. It is important to specify the entry range
164in case only a subset of the events is processed to avoid wasteful caching.
165
166#### Reading all branches
167
168~~~ {.cpp}
169 TTree *T;
170 f->GetObject(T, "mytree");
171 auto nentries = T->GetEntries();
172 auto cachesize = 10000000U; // 10 MBytes
173 T->SetCacheSize(cachesize); //<<<
174 T->AddBranchToCache("*", true); //<<< add all branches to the cache
175 T->Process("myselector.C+");
176 // In the TSelector::Process function we read all branches
177 T->GetEntry(i);
178 // ... Here the entry is processed
179~~~
180
181#### Reading a subset of all branches
182
183In the Process function we read a subset of the branches.
184Only the branches used in the first entry will be put in the cache
185~~~ {.cpp}
186 TTree *T;
187 f->GetObject(T, "mytree");
188 // We want to process only the 200 first entries
189 auto nentries=200UL;
190 auto efirst = 0;
191 auto elast = efirst+nentries;
192 auto cachesize = 10000000U; // 10 MBytes
193 TTreeCache::SetLearnEntries(1); //<<< we can take the decision after 1 entry
194 T->SetCacheSize(cachesize); //<<<
195 T->SetCacheEntryRange(efirst,elast); //<<<
196 T->Process("myselector.C+","",nentries,efirst);
197 // In the TSelector::Process we read only 2 branches
198 auto b1 = T->GetBranch("branch1");
199 b1->GetEntry(i);
200 if (somecondition) return;
201 auto b2 = T->GetBranch("branch2");
202 b2->GetEntry(i);
203 ... Here the entry is processed
204~~~
205### Custom event loop
206
207#### Always using the same two branches
208
209In this example, exactly two branches are always used: those need to be
210prefetched.
211~~~ {.cpp}
212 TTree *T;
213 f->GetObject(T, "mytree");
214 auto b1 = T->GetBranch("branch1");
215 auto b2 = T->GetBranch("branch2");
216 auto nentries = T->GetEntries();
217 auto cachesize = 10000000U; //10 MBytes
218 T->SetCacheSize(cachesize); //<<<
219 T->AddBranchToCache(b1, true); //<<< add branch1 and branch2 to the cache
220 T->AddBranchToCache(b2, true); //<<<
221 T->StopCacheLearningPhase(); //<<< we do not need the system to guess anything
222 for (auto i : TSeqL(nentries)) {
223 T->LoadTree(i); //<<< important call when calling TBranch::GetEntry after
224 b1->GetEntry(i);
225 if (some condition not met) continue;
226 b2->GetEntry(i);
227 if (some condition not met) continue;
228 // Here we read the full event only in some rare cases.
229 // There is no point in caching the other branches as it might be
230 // more economical to read only the branch buffers really used.
231 T->GetEntry(i);
232 ... Here the entry is processed
233 }
234~~~
235#### Always using at least the same two branches
236
237In this example, two branches are always used: in addition, some analysis
238functions are invoked and those may trigger the reading of other branches which
239are a priori not known.
240There is no point in prefetching branches that will be used very rarely: we can
241rely on the system to cache the right branches.
242~~~ {.cpp}
243 TTree *T;
244 f->GetObject(T, "mytree");
245 auto nentries = T->GetEntries();
246 auto cachesize = 10000000; //10 MBytes
247 T->SetCacheSize(cachesize); //<<<
248 T->SetCacheLearnEntries(5); //<<< we can take the decision after 5 entries
249 auto b1 = T->GetBranch("branch1");
250 auto b2 = T->GetBranch("branch2");
251 for (auto i : TSeqL(nentries)) {
252 T->LoadTree(i);
253 b1->GetEntry(i);
254 if (some condition not met) continue;
255 b2->GetEntry(i);
256 // At this point we may call a user function where a few more branches
257 // will be read conditionally. These branches will be put in the cache
258 // if they have been used in the first 10 entries
259 if (some condition not met) continue;
260 // Here we read the full event only in some rare cases.
261 // There is no point in caching the other branches as it might be
262 // more economical to read only the branch buffers really used.
263 T->GetEntry(i);
264 .. process the rare but interesting cases.
265 ... Here the entry is processed
266 }
267~~~
268
269\anchor checkPerf
270## How can the usage and performance of TTreeCache be verified?
271
272Once the event loop terminated, the number of effective system reads for a
273given file can be checked with a code like the following:
274~~~ {.cpp}
275 printf("Reading %lld bytes in %d transactions\n",myTFilePtr->GetBytesRead(), f->GetReadCalls());
276~~~
277
278Another handy command is:
279~~~ {.cpp}
280myTreeOrChain.GetTree()->PrintCacheStats();
281~~~
282
283*/
284
285#include "TSystem.h"
286#include "TEnv.h"
287#include "TTreeCache.h"
288#include "TChain.h"
289#include "TList.h"
290#include "TBranch.h"
291#include "TBranchElement.h"
292#include "TEventList.h"
293#include "TObjArray.h"
294#include "TObjString.h"
295#include "TRegexp.h"
296#include "TLeaf.h"
297#include "TFriendElement.h"
298#include "TFile.h"
299#include "TMath.h"
300#include "TBranchCacheInfo.h"
301#include "TVirtualPerfStats.h"
302#include <limits.h>
303
305
307
308////////////////////////////////////////////////////////////////////////////////
309/// Default Constructor.
310
311TTreeCache::TTreeCache() : TFileCacheRead(), fPrefillType(GetConfiguredPrefillType())
312{
313}
314
315////////////////////////////////////////////////////////////////////////////////
316/// Constructor.
317
319 : TFileCacheRead(tree->GetCurrentFile(), buffersize, tree), fEntryMax(tree->GetEntriesFast()), fEntryNext(0),
320 fBrNames(new TList), fTree(tree), fPrefillType(GetConfiguredPrefillType())
321{
323 Int_t nleaves = tree->GetListOfLeaves()->GetEntriesFast();
324 fBranches = new TObjArray(nleaves);
325}
326
327////////////////////////////////////////////////////////////////////////////////
328/// Destructor. (in general called by the TFile destructor)
329
331{
332 // Informe the TFile that we have been deleted (in case
333 // we are deleted explicitly by legacy user code).
334 if (fFile) fFile->SetCacheRead(0, fTree);
335
336 delete fBranches;
337 if (fBrNames) {fBrNames->Delete(); delete fBrNames; fBrNames=0;}
338}
339
340////////////////////////////////////////////////////////////////////////////////
341/// Add a branch discovered by actual usage to the list of branches to be stored
342/// in the cache this function is called by TBranch::GetBasket
343/// If we are not longer in the training phase this is an error.
344/// Returns:
345/// - 0 branch added or already included
346/// - -1 on error
347
348Int_t TTreeCache::LearnBranch(TBranch *b, Bool_t subbranches /*= kFALSE*/)
349{
350 if (!fIsLearning) {
351 return -1;
352 }
353
354 // Reject branch that are not from the cached tree.
355 if (!b || fTree->GetTree() != b->GetTree()) return -1;
356
357 // Is this the first addition of a branch (and we are learning and we are in
358 // the expected TTree), then prefill the cache. (We expect that in future
359 // release the Prefill-ing will be the default so we test for that inside the
360 // LearnPrefill call).
362
363 return AddBranch(b, subbranches);
364}
365
366////////////////////////////////////////////////////////////////////////////////
367/// Add a branch to the list of branches to be stored in the cache
368/// this function is called by the user via TTree::AddBranchToCache.
369/// The branch is added even if we are outside of the training phase.
370/// Returns:
371/// - 0 branch added or already included
372/// - -1 on error
373
374Int_t TTreeCache::AddBranch(TBranch *b, Bool_t subbranches /*= kFALSE*/)
375{
376 // Reject branch that are not from the cached tree.
377 if (!b || fTree->GetTree() != b->GetTree()) return -1;
378
379 //Is branch already in the cache?
380 Bool_t isNew = kTRUE;
381 for (int i=0;i<fNbranches;i++) {
382 if (fBranches->UncheckedAt(i) == b) {isNew = kFALSE; break;}
383 }
384 if (isNew) {
385 fTree = b->GetTree();
387 const char *bname = b->GetName();
388 if (fTree->IsA() == TChain::Class()) {
389 // If we have a TChain, we will need to use the branch name
390 // and we better disambiguate them (see atlasFlushed.root for example)
391 // in order to cache all the requested branches.
392 // We do not do this all the time as GetMother is slow (it contains
393 // a linear search from list of top level branch).
394 TString build;
395 const char *mothername = b->GetMother()->GetName();
396 if (b != b->GetMother() && mothername[strlen(mothername)-1] != '.') {
397 // Maybe we ought to prefix the name to avoid ambiguity.
398 auto bem = dynamic_cast<TBranchElement*>(b->GetMother());
399 if (bem->GetType() < 3) {
400 // Not a collection.
401 build = mothername;
402 build.Append(".");
403 if (strncmp(bname,build.Data(),build.Length()) != 0) {
404 build.Append(bname);
405 bname = build.Data();
406 }
407 }
408 }
409 }
410 fBrNames->Add(new TObjString(bname));
411 fNbranches++;
412 if (gDebug > 0) printf("Entry: %lld, registering branch: %s\n",b->GetTree()->GetReadEntry(),b->GetName());
413 }
414
415 // process subbranches
416 Int_t res = 0;
417 if (subbranches) {
418 TObjArray *lb = b->GetListOfBranches();
419 Int_t nb = lb->GetEntriesFast();
420 for (Int_t j = 0; j < nb; j++) {
421 TBranch* branch = (TBranch*) lb->UncheckedAt(j);
422 if (!branch) continue;
423 if (AddBranch(branch, subbranches)<0) {
424 res = -1;
425 }
426 }
427 }
428 return res;
429}
430
431////////////////////////////////////////////////////////////////////////////////
432/// Add a branch to the list of branches to be stored in the cache
433/// this is to be used by user (thats why we pass the name of the branch).
434/// It works in exactly the same way as TTree::SetBranchStatus so you
435/// probably want to look over there for details about the use of bname
436/// with regular expressions.
437/// The branches are taken with respect to the Owner of this TTreeCache
438/// (i.e. the original Tree)
439/// NB: if bname="*" all branches are put in the cache and the learning phase stopped
440/// Returns:
441/// - 0 branch added or already included
442/// - -1 on error
443
444Int_t TTreeCache::AddBranch(const char *bname, Bool_t subbranches /*= kFALSE*/)
445{
446 TBranch *branch, *bcount;
447 TLeaf *leaf, *leafcount;
448
449 Int_t i;
450 Int_t nleaves = (fTree->GetListOfLeaves())->GetEntriesFast();
451 TRegexp re(bname,kTRUE);
452 Int_t nb = 0;
453 Int_t res = 0;
454
455 // first pass, loop on all branches
456 // for leafcount branches activate/deactivate in function of status
457 Bool_t all = kFALSE;
458 if (!strcmp(bname,"*")) all = kTRUE;
459 for (i=0;i<nleaves;i++) {
460 leaf = (TLeaf*)(fTree->GetListOfLeaves())->UncheckedAt(i);
461 branch = (TBranch*)leaf->GetBranch();
462 TString s = branch->GetName();
463 if (!all) { //Regexp gives wrong result for [] in name
464 TString longname;
465 longname.Form("%s.%s",fTree->GetName(),branch->GetName());
466 if (strcmp(bname,branch->GetName())
467 && longname != bname
468 && s.Index(re) == kNPOS) continue;
469 }
470 nb++;
471 if (AddBranch(branch, subbranches)<0) {
472 res = -1;
473 }
474 leafcount = leaf->GetLeafCount();
475 if (leafcount && !all) {
476 bcount = leafcount->GetBranch();
477 if (AddBranch(bcount, subbranches)<0) {
478 res = -1;
479 }
480 }
481 }
482 if (nb==0 && strchr(bname,'*')==0) {
483 branch = fTree->GetBranch(bname);
484 if (branch) {
485 if (AddBranch(branch, subbranches)<0) {
486 res = -1;
487 }
488 ++nb;
489 }
490 }
491
492 //search in list of friends
493 UInt_t foundInFriend = 0;
494 if (fTree->GetListOfFriends()) {
495 TIter nextf(fTree->GetListOfFriends());
496 TFriendElement *fe;
498 while ((fe = (TFriendElement*)nextf())) {
499 TTree *t = fe->GetTree();
500 if (t==0) continue;
501
502 // If the alias is present replace it with the real name.
503 char *subbranch = (char*)strstr(bname,fe->GetName());
504 if (subbranch!=bname) subbranch = 0;
505 if (subbranch) {
506 subbranch += strlen(fe->GetName());
507 if ( *subbranch != '.' ) subbranch = 0;
508 else subbranch ++;
509 }
510 if (subbranch) {
511 name.Form("%s.%s",t->GetName(),subbranch);
512 if (name != bname && AddBranch(name, subbranches)<0) {
513 res = -1;
514 }
515 ++foundInFriend;
516 }
517 }
518 }
519 if (!nb && !foundInFriend) {
520 if (gDebug > 0) printf("AddBranch: unknown branch -> %s \n", bname);
521 Error("AddBranch", "unknown branch -> %s", bname);
522 return -1;
523 }
524 //if all branches are selected stop the learning phase
525 if (*bname == '*') {
526 fEntryNext = -1; // We are likely to have change the set of branches, so for the [re-]reading of the cluster.
528 }
529 return res;
530}
531
532////////////////////////////////////////////////////////////////////////////////
533/// Remove a branch to the list of branches to be stored in the cache
534/// this function is called by TBranch::GetBasket.
535/// Returns:
536/// - 0 branch dropped or not in cache
537/// - -1 on error
538
539Int_t TTreeCache::DropBranch(TBranch *b, Bool_t subbranches /*= kFALSE*/)
540{
541 if (!fIsLearning) {
542 return -1;
543 }
544
545 // Reject branch that are not from the cached tree.
546 if (!b || fTree->GetTree() != b->GetTree()) return -1;
547
548 //Is branch already in the cache?
549 if (fBranches->Remove(b)) {
550 --fNbranches;
551 if (gDebug > 0) printf("Entry: %lld, un-registering branch: %s\n",b->GetTree()->GetReadEntry(),b->GetName());
552 }
553 delete fBrNames->Remove(fBrNames->FindObject(b->GetName()));
554
555 // process subbranches
556 Int_t res = 0;
557 if (subbranches) {
558 TObjArray *lb = b->GetListOfBranches();
559 Int_t nb = lb->GetEntriesFast();
560 for (Int_t j = 0; j < nb; j++) {
561 TBranch* branch = (TBranch*) lb->UncheckedAt(j);
562 if (!branch) continue;
563 if (DropBranch(branch, subbranches)<0) {
564 res = -1;
565 }
566 }
567 }
568 return res;
569}
570
571////////////////////////////////////////////////////////////////////////////////
572/// Remove a branch to the list of branches to be stored in the cache
573/// this is to be used by user (thats why we pass the name of the branch).
574/// It works in exactly the same way as TTree::SetBranchStatus so you
575/// probably want to look over there for details about the use of bname
576/// with regular expressions.
577/// The branches are taken with respect to the Owner of this TTreeCache
578/// (i.e. the original Tree)
579/// NB: if bname="*" all branches are put in the cache and the learning phase stopped
580/// Returns:
581/// - 0 branch dropped or not in cache
582/// - -1 on error
583
584Int_t TTreeCache::DropBranch(const char *bname, Bool_t subbranches /*= kFALSE*/)
585{
586 TBranch *branch, *bcount;
587 TLeaf *leaf, *leafcount;
588
589 Int_t i;
590 Int_t nleaves = (fTree->GetListOfLeaves())->GetEntriesFast();
591 TRegexp re(bname,kTRUE);
592 Int_t nb = 0;
593 Int_t res = 0;
594
595 // first pass, loop on all branches
596 // for leafcount branches activate/deactivate in function of status
597 Bool_t all = kFALSE;
598 if (!strcmp(bname,"*")) all = kTRUE;
599 for (i=0;i<nleaves;i++) {
600 leaf = (TLeaf*)(fTree->GetListOfLeaves())->UncheckedAt(i);
601 branch = (TBranch*)leaf->GetBranch();
602 TString s = branch->GetName();
603 if (!all) { //Regexp gives wrong result for [] in name
604 TString longname;
605 longname.Form("%s.%s",fTree->GetName(),branch->GetName());
606 if (strcmp(bname,branch->GetName())
607 && longname != bname
608 && s.Index(re) == kNPOS) continue;
609 }
610 nb++;
611 if (DropBranch(branch, subbranches)<0) {
612 res = -1;
613 }
614 leafcount = leaf->GetLeafCount();
615 if (leafcount && !all) {
616 bcount = leafcount->GetBranch();
617 if (DropBranch(bcount, subbranches)<0) {
618 res = -1;
619 }
620 }
621 }
622 if (nb==0 && strchr(bname,'*')==0) {
623 branch = fTree->GetBranch(bname);
624 if (branch) {
625 if (DropBranch(branch, subbranches)<0) {
626 res = -1;
627 }
628 ++nb;
629 }
630 }
631
632 //search in list of friends
633 UInt_t foundInFriend = 0;
634 if (fTree->GetListOfFriends()) {
635 TIter nextf(fTree->GetListOfFriends());
636 TFriendElement *fe;
638 while ((fe = (TFriendElement*)nextf())) {
639 TTree *t = fe->GetTree();
640 if (t==0) continue;
641
642 // If the alias is present replace it with the real name.
643 char *subbranch = (char*)strstr(bname,fe->GetName());
644 if (subbranch!=bname) subbranch = 0;
645 if (subbranch) {
646 subbranch += strlen(fe->GetName());
647 if ( *subbranch != '.' ) subbranch = 0;
648 else subbranch ++;
649 }
650 if (subbranch) {
651 name.Form("%s.%s",t->GetName(),subbranch);
652 if (DropBranch(name, subbranches)<0) {
653 res = -1;
654 }
655 ++foundInFriend;
656 }
657 }
658 }
659 if (!nb && !foundInFriend) {
660 if (gDebug > 0) printf("DropBranch: unknown branch -> %s \n", bname);
661 Error("DropBranch", "unknown branch -> %s", bname);
662 return -1;
663 }
664 //if all branches are selected stop the learning phase
665 if (*bname == '*') {
666 fEntryNext = -1; // We are likely to have change the set of branches, so for the [re-]reading of the cluster.
667 }
668 return res;
669}
670
671////////////////////////////////////////////////////////////////////////////////
672/// Start of methods for the miss cache.
673////////////////////////////////////////////////////////////////////////////////
674
675////////////////////////////////////////////////////////////////////////////////
676/// Enable / disable the miss cache.
677///
678/// The first time this is called on a TTreeCache object, the corresponding
679/// data structures will be allocated. Subsequent enable / disables will
680/// simply turn the functionality on/off.
682{
683
684 if (opt && !fMissCache) {
686 }
687 fOptimizeMisses = opt;
688}
689
690////////////////////////////////////////////////////////////////////////////////
691/// Reset all the miss cache training.
692///
693/// The contents of the miss cache will be emptied as well as the list of
694/// branches used.
696{
697
698 fLastMiss = -1;
699 fFirstMiss = -1;
700
701 if (!fMissCache) {
702 fMissCache.reset(new MissCache());
703 }
704 fMissCache->clear();
705}
706
707////////////////////////////////////////////////////////////////////////////////
708/// For the event currently being fetched into the miss cache, find the IO
709/// (offset / length tuple) to pull in the current basket for a given branch.
710///
711/// Returns:
712/// - IOPos describing the IO operation necessary for the basket on this branch
713/// - On failure, IOPos.length will be set to 0.
715{
716 if (R__unlikely(b.GetDirectory() == 0)) {
717 // printf("Branch at %p has no valid directory.\n", &b);
718 return IOPos{0, 0};
719 }
720 if (R__unlikely(b.GetDirectory()->GetFile() != fFile)) {
721 // printf("Branch at %p is in wrong file (branch file %p, my file %p).\n", &b, b.GetDirectory()->GetFile(),
722 // fFile);
723 return IOPos{0, 0};
724 }
725
726 // printf("Trying to find a basket for branch %p\n", &b);
727 // Pull in metadata about branch; make sure it is valid
728 Int_t *lbaskets = b.GetBasketBytes();
729 Long64_t *entries = b.GetBasketEntry();
730 if (R__unlikely(!lbaskets || !entries)) {
731 // printf("No baskets or entries.\n");
732 return IOPos{0, 0};
733 }
734 // Int_t blistsize = b.GetListOfBaskets()->GetSize();
735 Int_t blistsize = b.GetWriteBasket();
736 if (R__unlikely(blistsize <= 0)) {
737 // printf("Basket list is size 0.\n");
738 return IOPos{0, 0};
739 }
740
741 // Search for the basket that contains the event of interest. Unlike the primary cache, we
742 // are only interested in a single basket per branch - we don't try to fill the cache.
743 Long64_t basketOffset = TMath::BinarySearch(blistsize, entries, entry);
744 if (basketOffset < 0) { // No entry found.
745 // printf("No entry offset found for entry %ld\n", fTree->GetReadEntry());
746 return IOPos{0, 0};
747 }
748
749 // Check to see if there's already a copy of this basket in memory. If so, don't fetch it
750 if ((basketOffset < blistsize) && b.GetListOfBaskets()->UncheckedAt(basketOffset)) {
751
752 // printf("Basket is already in memory.\n");
753 return IOPos{0, 0};
754 }
755
756 Long64_t pos = b.GetBasketSeek(basketOffset);
757 Int_t len = lbaskets[basketOffset];
758 if (R__unlikely(pos <= 0 || len <= 0)) {
759 /*printf("Basket returned was invalid (basketOffset=%ld, pos=%ld, len=%d).\n", basketOffset, pos, len);
760 for (int idx=0; idx<blistsize; idx++) {
761 printf("Basket entry %d, first event %d, pos %ld\n", idx, entries[idx], b.GetBasketSeek(idx));
762 }*/
763 return IOPos{0, 0};
764 } // Sanity check
765 // Do not cache a basket if it is bigger than the cache size!
767 // printf("Basket size is greater than the cache size.\n");
768 return IOPos{0, 0};
769 }
770
771 return {pos, len};
772}
773
774////////////////////////////////////////////////////////////////////////////////
775/// Given a particular IO description (offset / length) representing a 'miss' of
776/// the TTreeCache's primary cache, calculate all the corresponding IO that
777/// should be performed.
778///
779/// `all` indicates that this function should search the set of _all_ branches
780/// in this TTree. When set to false, we only search through branches that
781/// have previously incurred a miss.
782///
783/// Returns:
784/// - TBranch pointer corresponding to the basket that will be retrieved by
785/// this IO operation.
786/// - If no corresponding branch could be found (or an error occurs), this
787/// returns nullptr.
789{
790 if (R__unlikely((pos < 0) || (len < 0))) {
791 return nullptr;
792 }
793
794 int count = all ? (fTree->GetListOfLeaves())->GetEntriesFast() : fMissCache->fBranches.size();
795 fMissCache->fEntries.reserve(count);
796 fMissCache->fEntries.clear();
797 Bool_t found_request = kFALSE;
798 TBranch *resultBranch = nullptr;
799 Long64_t entry = fTree->GetReadEntry();
800
801 std::vector<std::pair<size_t, Int_t>> basketsInfo;
802 auto perfStats = GetTree()->GetPerfStats();
803
804 // printf("Will search %d branches for basket at %ld.\n", count, pos);
805 for (int i = 0; i < count; i++) {
806 TBranch *b =
807 all ? static_cast<TBranch *>(static_cast<TLeaf *>((fTree->GetListOfLeaves())->UncheckedAt(i))->GetBranch())
808 : fMissCache->fBranches[i];
809 IOPos iopos = FindBranchBasketPos(*b, entry);
810 if (iopos.fLen == 0) { // Error indicator
811 continue;
812 }
813 if (iopos.fPos == pos && iopos.fLen == len) {
814 found_request = kTRUE;
815 resultBranch = b;
816 // Note that we continue to iterate; fills up the rest of the entries in the cache.
817 }
818 // At this point, we are ready to push back a new offset
819 fMissCache->fEntries.emplace_back(std::move(iopos));
820
821 if (R__unlikely(perfStats)) {
822 Int_t blistsize = b->GetWriteBasket();
823 Int_t basketNumber = -1;
824 for (Int_t bn = 0; bn < blistsize; ++bn) {
825 if (iopos.fPos == b->GetBasketSeek(bn)) {
826 basketNumber = bn;
827 break;
828 }
829 }
830 if (basketNumber >= 0)
831 basketsInfo.emplace_back((size_t)i, basketNumber);
832 }
833 }
834 if (R__unlikely(!found_request)) {
835 // We have gone through all the branches in this file and the requested basket
836 // doesn't appear to be in any of them. Likely a logic error / bug.
837 fMissCache->fEntries.clear();
838 }
839 if (R__unlikely(perfStats)) {
840 for (auto &info : basketsInfo) {
841 perfStats->SetLoadedMiss(info.first, info.second);
842 }
843 }
844 return resultBranch;
845}
846
847////////////////////////////////////////////////////////////////////////////////
848///
849/// Process a cache miss; (pos, len) isn't in the buffer.
850///
851/// The first time we have a miss, we buffer as many baskets we can (up to the
852/// maximum size of the TTreeCache) in memory from all branches that are not in
853/// the prefetch list.
854///
855/// Subsequent times, we fetch all the buffers corresponding to branches that
856/// had previously seen misses. If it turns out the (pos, len) isn't in the
857/// list of branches, we treat this as if it was the first miss.
858///
859/// Returns true if we were able to pull the data into the miss cache.
860///
862{
863
864 Bool_t firstMiss = kFALSE;
865 if (fFirstMiss == -1) {
867 firstMiss = kTRUE;
868 }
870 // The first time this is executed, we try to pull in as much data as we can.
871 TBranch *b = CalculateMissEntries(pos, len, firstMiss);
872 if (!b) {
873 if (!firstMiss) {
874 // TODO: this recalculates for *all* branches, throwing away the above work.
876 }
877 if (!b) {
878 // printf("ProcessMiss: pos %ld does not appear to correspond to a buffer in this file.\n", pos);
879 // We have gone through all the branches in this file and the requested basket
880 // doesn't appear to be in any of them. Likely a logic error / bug.
881 fMissCache->fEntries.clear();
882 return kFALSE;
883 }
884 }
885 // TODO: this should be a set.
886 fMissCache->fBranches.push_back(b);
887
888 // OK, sort the entries
889 std::sort(fMissCache->fEntries.begin(), fMissCache->fEntries.end());
890
891 // Now, fetch the buffer.
892 std::vector<Long64_t> positions;
893 positions.reserve(fMissCache->fEntries.size());
894 std::vector<Int_t> lengths;
895 lengths.reserve(fMissCache->fEntries.size());
896 ULong64_t cumulative = 0;
897 for (auto &mcentry : fMissCache->fEntries) {
898 positions.push_back(mcentry.fIO.fPos);
899 lengths.push_back(mcentry.fIO.fLen);
900 mcentry.fIndex = cumulative;
901 cumulative += mcentry.fIO.fLen;
902 }
903 fMissCache->fData.reserve(cumulative);
904 // printf("Reading %lu bytes into miss cache for %lu entries.\n", cumulative, fEntries->size());
905 fNMissReadPref += fMissCache->fEntries.size();
906 fFile->ReadBuffers(&(fMissCache->fData[0]), &(positions[0]), &(lengths[0]), fMissCache->fEntries.size());
908
909 return kTRUE;
910}
911
912////////////////////////////////////////////////////////////////////////////////
913/// Given an IO operation (pos, len) that was a cache miss in the primary TTC,
914/// try the operation again with the miss cache.
915///
916/// Returns true if the IO operation was successful and the contents of buf
917/// were populated with the requested data.
918///
920{
921
922 if (!fOptimizeMisses) {
923 return kFALSE;
924 }
925 if (R__unlikely((pos < 0) || (len < 0))) {
926 return kFALSE;
927 }
928
929 // printf("Checking the miss cache for offset=%ld, length=%d\n", pos, len);
930
931 // First, binary search to see if the desired basket is already cached.
932 MissCache::Entry mcentry{IOPos{pos, len}};
933 auto iter = std::lower_bound(fMissCache->fEntries.begin(), fMissCache->fEntries.end(), mcentry);
934
935 if (iter != fMissCache->fEntries.end()) {
936 if (len > iter->fIO.fLen) {
938 return kFALSE;
939 }
940 auto offset = iter->fIndex;
941 memcpy(buf, &(fMissCache->fData[offset]), len);
942 // printf("Returning data from pos=%ld in miss cache.\n", offset);
943 ++fNMissReadOk;
944 return kTRUE;
945 }
946
947 // printf("Data not in miss cache.\n");
948
949 // Update the cache, looking for this (pos, len).
950 if (!ProcessMiss(pos, len)) {
951 // printf("Unable to pull data into miss cache.\n");
953 return kFALSE;
954 }
955
956 // OK, we updated the cache with as much information as possible. Search again for
957 // the entry we want.
958 iter = std::lower_bound(fMissCache->fEntries.begin(), fMissCache->fEntries.end(), mcentry);
959
960 if (iter != fMissCache->fEntries.end()) {
961 auto offset = iter->fIndex;
962 // printf("Expecting data at offset %ld in miss cache.\n", offset);
963 memcpy(buf, &(fMissCache->fData[offset]), len);
964 ++fNMissReadOk;
965 return kTRUE;
966 }
967
968 // This must be a logic bug. ProcessMiss should return false if (pos, len)
969 // wasn't put into fEntries.
971 return kFALSE;
972}
973
974////////////////////////////////////////////////////////////////////////////////
975/// End of methods for miss cache.
976////////////////////////////////////////////////////////////////////////////////
977
978namespace {
979struct BasketRanges {
980 struct Range {
981 Long64_t fMin; ///< Inclusive minimum
982 Long64_t fMax; ///< Inclusive maximum
983
984 Range() : fMin(-1), fMax(-1) {}
985
986 void UpdateMin(Long64_t min)
987 {
988 if (fMin == -1 || min < fMin)
989 fMin = min;
990 }
991
992 void UpdateMax(Long64_t max)
993 {
994 if (fMax == -1 || fMax < max)
995 fMax = max;
996 }
997
998 Bool_t Contains(Long64_t entry) { return (fMin <= entry && entry <= fMax); }
999 };
1000
1001 std::vector<Range> fRanges;
1002 std::map<Long64_t,size_t> fMinimums;
1003 std::map<Long64_t,size_t> fMaximums;
1004
1005 BasketRanges(size_t nBranches) { fRanges.resize(nBranches); }
1006
1007 void Update(size_t branchNumber, Long64_t min, Long64_t max)
1008 {
1009 Range &range = fRanges.at(branchNumber);
1010 auto old(range);
1011
1012 range.UpdateMin(min);
1013 range.UpdateMax(max);
1014
1015 if (old.fMax != range.fMax) {
1016 if (old.fMax != -1) {
1017 auto maxIter = fMaximums.find(old.fMax);
1018 if (maxIter != fMaximums.end()) {
1019 if (maxIter->second == 1) {
1020 fMaximums.erase(maxIter);
1021 } else {
1022 --(maxIter->second);
1023 }
1024 }
1025 }
1026 ++(fMaximums[max]);
1027 }
1028 }
1029
1030 void Update(size_t branchNumber, size_t basketNumber, Long64_t *entries, size_t nb, size_t max)
1031 {
1032 Update(branchNumber, entries[basketNumber],
1033 (basketNumber < (nb - 1)) ? (entries[basketNumber + 1] - 1) : max - 1);
1034 }
1035
1036 // Check that fMaximums and fMinimums are properly set
1037 bool CheckAllIncludeRange()
1038 {
1039 Range result;
1040 for (const auto &r : fRanges) {
1041 if (result.fMin == -1 || result.fMin < r.fMin) {
1042 if (r.fMin != -1)
1043 result.fMin = r.fMin;
1044 }
1045 if (result.fMax == -1 || r.fMax < result.fMax) {
1046 if (r.fMax != -1)
1047 result.fMax = r.fMax;
1048 }
1049 }
1050 // if (result.fMax < result.fMin) {
1051 // // No overlapping range.
1052 // }
1053
1054 Range allIncludedRange(AllIncludedRange());
1055
1056 return (result.fMin == allIncludedRange.fMin && result.fMax == allIncludedRange.fMax);
1057 }
1058
1059 // This returns a Range object where fMin is the maximum of all the minimun entry
1060 // number loaded for each branch and fMax is the minimum of all the maximum entry
1061 // number loaded for each branch.
1062 // As such it is valid to have fMin > fMax, this is the case where there
1063 // are no overlap between the branch's range. For example for 2 branches
1064 // where we have for one the entry [50,99] and for the other [0,49] then
1065 // we will have fMin = max(50,0) = 50 and fMax = min(99,49) = 49
1066 Range AllIncludedRange()
1067 {
1068 Range result;
1069 if (!fMinimums.empty())
1070 result.fMin = fMinimums.rbegin()->first;
1071 if (!fMaximums.empty())
1072 result.fMax = fMaximums.begin()->first;
1073 return result;
1074 }
1075
1076 // Returns the number of branches with at least one baskets registered.
1077 UInt_t BranchesRegistered()
1078 {
1079 UInt_t result = 0;
1080 for (const auto &r : fRanges) {
1081 if (r.fMin != -1 && r.fMax != -1)
1082 ++result;
1083 }
1084 return result;
1085 }
1086
1087 // Returns true if at least one of the branch's range contains
1088 // the entry.
1089 Bool_t Contains(Long64_t entry)
1090 {
1091 for (const auto &r : fRanges) {
1092 if (r.fMin != -1 && r.fMax != -1)
1093 if (r.fMin <= entry && entry <= r.fMax)
1094 return kTRUE;
1095 }
1096 return kFALSE;
1097 }
1098
1099 void Print()
1100 {
1101 for (size_t i = 0; i < fRanges.size(); ++i) {
1102 if (fRanges[i].fMin != -1 || fRanges[i].fMax != -1)
1103 Printf("Range #%zu : %lld to %lld", i, fRanges[i].fMin, fRanges[i].fMax);
1104 }
1105 }
1106};
1107} // Anonymous namespace.
1108
1109////////////////////////////////////////////////////////////////////////////////
1110/// Fill the cache buffer with the branches in the cache.
1111
1113{
1114
1115 if (fNbranches <= 0) return kFALSE;
1117 Long64_t entry = tree->GetReadEntry();
1118 Long64_t fEntryCurrentMax = 0;
1119
1120 if (entry != -1 && (entry < fEntryMin || fEntryMax < entry))
1121 return kFALSE;
1122
1123 if (fEnablePrefetching) { // Prefetching mode
1124 if (fIsLearning) { // Learning mode
1125 if (fEntryNext >= 0 && entry >= fEntryNext) {
1126 // entry is outside the learn range, need to stop the learning
1127 // phase. Doing so may trigger a recursive call to FillBuffer in
1128 // the process of filling both prefetching buffers
1130 fIsManual = kFALSE;
1131 }
1132 }
1133 if (fIsLearning) { // Learning mode
1134 // The learning phase should start from the minimum entry in the cache
1135 entry = fEntryMin;
1136 }
1137 if (fFirstTime) {
1138 //try to detect if it is normal or reverse read
1139 fFirstEntry = entry;
1140 }
1141 else {
1142 if (fFirstEntry == entry) return kFALSE;
1143 // Set the read direction
1144 if (!fReadDirectionSet) {
1145 if (entry < fFirstEntry) {
1148 }
1149 else if (entry > fFirstEntry) {
1152 }
1153 }
1154
1155 if (fReverseRead) {
1156 // Reverse reading with prefetching
1157 if (fEntryCurrent >0 && entry < fEntryNext) {
1158 // We can prefetch the next buffer
1159 if (entry >= fEntryCurrent) {
1160 entry = fEntryCurrent - tree->GetAutoFlush() * fFillTimes;
1161 }
1162 if (entry < 0) entry = 0;
1163 }
1164 else if (fEntryCurrent >= 0) {
1165 // We are still reading from the oldest buffer, no need to prefetch a new one
1166 return kFALSE;
1167 }
1168 if (entry < 0) return kFALSE;
1170 }
1171 else {
1172 // Normal reading with prefetching
1173 if (fEnablePrefetching) {
1174 if (entry < 0 && fEntryNext > 0) {
1175 entry = fEntryCurrent;
1176 } else if (entry >= fEntryCurrent) {
1177 if (entry < fEntryNext) {
1178 entry = fEntryNext;
1179 }
1180 }
1181 else {
1182 // We are still reading from the oldest buffer,
1183 // no need to prefetch a new one
1184 return kFALSE;
1185 }
1187 }
1188 }
1189 }
1190 }
1191
1192 // Set to true to enable all debug output without having to set gDebug
1193 // Replace this once we have a per module and/or per class debugging level/setting.
1194 static constexpr bool showMore = kFALSE;
1195
1196 static const auto PrintAllCacheInfo = [](TObjArray *branches) {
1197 for (Int_t i = 0; i < branches->GetEntries(); i++) {
1198 TBranch *b = (TBranch *)branches->UncheckedAt(i);
1199 b->PrintCacheInfo();
1200 }
1201 };
1202
1203 if (showMore || gDebug > 6)
1204 Info("FillBuffer", "***** Called for entry %lld", entry);
1205
1206 if (!fIsLearning && fEntryCurrent <= entry && entry < fEntryNext) {
1207 // Check if all the basket in the cache have already be used and
1208 // thus we can reuse the cache.
1209 Bool_t allUsed = kTRUE;
1210 for (Int_t i = 0; i < fNbranches; ++i) {
1212 if (!b->fCacheInfo.AllUsed()) {
1213 allUsed = kFALSE;
1214 break;
1215 }
1216 }
1217 if (allUsed) {
1218 fEntryNext = entry;
1219 if (showMore || gDebug > 5)
1220 Info("FillBuffer", "All baskets used already, so refresh the cache early at entry %lld", entry);
1221 }
1222 if (gDebug > 8)
1223 PrintAllCacheInfo(fBranches);
1224 }
1225
1226 // If the entry is in the range we previously prefetched, there is
1227 // no point in retrying. Note that this will also return false
1228 // during the training phase (fEntryNext is then set intentional to
1229 // the end of the training phase).
1230 if (fEntryCurrent <= entry && entry < fEntryNext) return kFALSE;
1231
1232 // Triggered by the user, not the learning phase
1233 if (entry == -1)
1234 entry = 0;
1235
1236 Bool_t resetBranchInfo = kFALSE;
1237 if (entry < fCurrentClusterStart || fNextClusterStart <= entry) {
1238 // We are moving on to another set of clusters.
1239 resetBranchInfo = kTRUE;
1240 if (showMore || gDebug > 6)
1241 Info("FillBuffer", "*** Will reset the branch information about baskets");
1242 } else if (showMore || gDebug > 6) {
1243 Info("FillBuffer", "*** Info we have on the set of baskets");
1244 PrintAllCacheInfo(fBranches);
1245 }
1246
1247 fEntryCurrentMax = fEntryCurrent;
1248 TTree::TClusterIterator clusterIter = tree->GetClusterIterator(entry);
1249
1250 auto entryCurrent = clusterIter();
1251 auto entryNext = clusterIter.GetNextEntry();
1252
1253 if (entryNext < fEntryMin || fEntryMax < entryCurrent) {
1254 // There is no overlap between the cluster we found [entryCurrent, entryNext[
1255 // and the authorized range [fEntryMin, fEntryMax]
1256 // so we have nothing to do
1257 return kFALSE;
1258 }
1259
1260 // If there is overlap between the found cluster and the authorized range
1261 // update the cache data members with the information about the current cluster.
1262 fEntryCurrent = entryCurrent;
1263 fEntryNext = entryNext;
1264
1265 auto firstClusterEnd = fEntryNext;
1266 if (showMore || gDebug > 6)
1267 Info("FillBuffer", "Looking at cluster spanning from %lld to %lld", fEntryCurrent, fEntryNext);
1268
1270 if (fEntryMax <= 0) fEntryMax = tree->GetEntries();
1272
1273 if ( fEnablePrefetching ) {
1274 if ( entry == fEntryMax ) {
1275 // We are at the end, no need to do anything else
1276 return kFALSE;
1277 }
1278 }
1279
1280 if (resetBranchInfo) {
1281 // We earlier thought we were onto the next set of clusters.
1282 if (fCurrentClusterStart != -1 || fNextClusterStart != -1) {
1283 if (!(fEntryCurrent < fCurrentClusterStart || fEntryCurrent >= fNextClusterStart)) {
1284 Error("FillBuffer", "Inconsistency: fCurrentClusterStart=%lld fEntryCurrent=%lld fNextClusterStart=%lld "
1285 "but fEntryCurrent should not be in between the two",
1287 }
1288 }
1289
1290 // Start the next cluster set.
1292 fNextClusterStart = firstClusterEnd;
1293 }
1294
1295 // Check if owner has a TEventList set. If yes we optimize for this
1296 // Special case reading only the baskets containing entries in the
1297 // list.
1298 TEventList *elist = fTree->GetEventList();
1299 Long64_t chainOffset = 0;
1300 if (elist) {
1301 if (fTree->IsA() ==TChain::Class()) {
1302 TChain *chain = (TChain*)fTree;
1303 Int_t t = chain->GetTreeNumber();
1304 chainOffset = chain->GetTreeOffset()[t];
1305 }
1306 }
1307
1308 //clear cache buffer
1309 Int_t ntotCurrentBuf = 0;
1310 if (fEnablePrefetching){ //prefetching mode
1311 if (fFirstBuffer) {
1313 ntotCurrentBuf = fNtot;
1314 }
1315 else {
1317 ntotCurrentBuf = fBNtot;
1318 }
1319 }
1320 else {
1322 ntotCurrentBuf = fNtot;
1323 }
1324
1325 //store baskets
1326 BasketRanges ranges((showMore || gDebug > 6) ? fNbranches : 0);
1327 BasketRanges reqRanges(fNbranches);
1328 BasketRanges memRanges((showMore || gDebug > 6) ? fNbranches : 0);
1329 Int_t clusterIterations = 0;
1330 Long64_t minEntry = fEntryCurrent;
1331 Int_t prevNtot;
1332 Long64_t maxReadEntry = minEntry; // If we are stopped before the end of the 2nd pass, this marker will where we need to start next time.
1333 Int_t nReadPrefRequest = 0;
1334 auto perfStats = GetTree()->GetPerfStats();
1335
1336 struct collectionInfo {
1337 Int_t fClusterStart{-1}; // First basket belonging to the current cluster
1338 Int_t fCurrent{-1}; // Currently visited basket
1339 Bool_t fLoadedOnce{kFALSE};
1340
1341 void Rewind() { fCurrent = (fClusterStart >= 0) ? fClusterStart : 0; }
1342 };
1343 std::vector<collectionInfo> cursor(fNbranches);
1344
1345 // Main loop to fill the cache, inside each loop we will loop over
1346 // all the cached branch and collect the baskets within the 'current'
1347 // range/cluster. If there is still space in the cache after that, we
1348 // will do another iteration to add one more cluster to the cache.
1349 // i.e. essentially loop over the clusters.
1350 do {
1351 prevNtot = ntotCurrentBuf;
1352 Long64_t lowestMaxEntry = fEntryMax; // The lowest maximum entry in the TTreeCache for each branch for each pass.
1353
1354 Bool_t reachedEnd = kFALSE;
1355 Bool_t skippedFirst = kFALSE;
1356 Bool_t oncePerBranch = kFALSE;
1357 Int_t nDistinctLoad = 0;
1358 Bool_t progress = kTRUE;
1359 enum ENarrow {
1360 kFull = 0,
1361 kNarrow = 1
1362 };
1363 enum EPass {
1364 kStart = 1,
1365 kRegular = 2,
1366 kRewind = 3
1367 };
1368
1369 auto CollectBaskets = [this, elist, chainOffset, entry, clusterIterations, resetBranchInfo, perfStats,
1370 &cursor, &lowestMaxEntry, &maxReadEntry, &minEntry,
1371 &reachedEnd, &skippedFirst, &oncePerBranch, &nDistinctLoad, &progress,
1372 &ranges, &memRanges, &reqRanges,
1373 &ntotCurrentBuf, &nReadPrefRequest](EPass pass, ENarrow narrow, Long64_t maxCollectEntry) {
1374 // The first pass we add one basket per branches around the requested entry
1375 // then in the second pass we add the other baskets of the cluster.
1376 // This is to support the case where the cache is too small to hold a full cluster.
1377 Int_t nReachedEnd = 0;
1378 Int_t nSkipped = 0;
1379 auto oldnReadPrefRequest = nReadPrefRequest;
1380 std::vector<Int_t> potentialVetoes;
1381
1382 if (showMore || gDebug > 7)
1383 Info("CollectBaskets", "Called with pass=%d narrow=%d maxCollectEntry=%lld", pass, narrow, maxCollectEntry);
1384
1385 Bool_t filled = kFALSE;
1386 for (Int_t i = 0; i < fNbranches; ++i) {
1388 if (b->GetDirectory()==0 || b->TestBit(TBranch::kDoNotProcess))
1389 continue;
1390 if (b->GetDirectory()->GetFile() != fFile)
1391 continue;
1392 potentialVetoes.clear();
1393 if (pass == kStart && !cursor[i].fLoadedOnce && resetBranchInfo) {
1394 // First check if we have any cluster that is currently in the
1395 // cache but was not used and would be reloaded in the next
1396 // cluster.
1397 b->fCacheInfo.GetUnused(potentialVetoes);
1398 if (showMore || gDebug > 7) {
1399 TString vetolist;
1400 for(auto v : potentialVetoes) {
1401 vetolist += v;
1402 vetolist.Append(' ');
1403 }
1404 if (!potentialVetoes.empty())
1405 Info("FillBuffer", "*** Potential Vetos for branch #%d: %s", i, vetolist.Data());
1406 }
1407 b->fCacheInfo.Reset();
1408 }
1409 Int_t nb = b->GetMaxBaskets();
1410 Int_t *lbaskets = b->GetBasketBytes();
1411 Long64_t *entries = b->GetBasketEntry();
1412 if (!lbaskets || !entries)
1413 continue;
1414 //we have found the branch. We now register all its baskets
1415 // from the requested offset to the basket below fEntryMax
1416 Int_t blistsize = b->GetListOfBaskets()->GetSize();
1417
1418 auto maxOfBasket = [this, nb, entries](int j) {
1419 return ((j < (nb - 1)) ? (entries[j + 1] - 1) : fEntryMax - 1);
1420 };
1421
1422 if (pass == kRewind)
1423 cursor[i].Rewind();
1424 else if (cursor[i].fCurrent == -1) {
1425 auto start = TMath::BinarySearch(b->GetWriteBasket() + 1, entries, minEntry);
1426 cursor[i].fCurrent = (start < 0) ? 0 : start;
1427 }
1428 for (auto &j = cursor[i].fCurrent; j < nb; j++) {
1429 // This basket has already been read, skip it
1430
1431 if (j < blistsize && b->GetListOfBaskets()->UncheckedAt(j)) {
1432
1433 if (showMore || gDebug > 6) {
1434 ranges.Update(i, entries[j], maxOfBasket(j));
1435 memRanges.Update(i, entries[j], maxOfBasket(j));
1436 }
1437 if (entries[j] <= entry && entry <= maxOfBasket(j)) {
1438 b->fCacheInfo.SetIsInCache(j);
1439 b->fCacheInfo.SetUsed(j);
1440 if (narrow) {
1441 // In narrow mode, we would select 'only' this basket,
1442 // so we are done for this round, let's 'consume' this
1443 // basket and go.
1444 ++nReachedEnd;
1445 ++j;
1446 break;
1447 }
1448 }
1449 continue;
1450 }
1451
1452 // Important: do not try to read maxCollectEntry, otherwise we might jump to the next autoflush
1453 if (entries[j] >= maxCollectEntry) {
1454 ++nReachedEnd;
1455 break; // break out of the for each branch loop.
1456 }
1457
1458 Long64_t pos = b->GetBasketSeek(j);
1459 Int_t len = lbaskets[j];
1460 if (pos <= 0 || len <= 0)
1461 continue;
1462 if (len > fBufferSizeMin) {
1463 // Do not cache a basket if it is bigger than the cache size!
1464 if ((showMore || gDebug > 7) &&
1465 (!(entries[j] < minEntry && (j < nb - 1 && entries[j + 1] <= minEntry))))
1466 Info("FillBuffer", "Skipping branch %s basket %d is too large for the cache: %d > %d",
1467 b->GetName(), j, len, fBufferSizeMin);
1468 continue;
1469 }
1470
1471 if (nReadPrefRequest && entries[j] > (reqRanges.AllIncludedRange().fMax + 1)) {
1472 // There is a gap between this basket and the max of the 'lowest' already loaded basket
1473 // If we are tight in memory, reading this basket may prevent reading the basket (for the other branches)
1474 // that covers this gap, forcing those baskets to be read uncached (because the cache wont be reloaded
1475 // until we use this basket).
1476 // eg. We could end up with the cache contain
1477 // b1: [428, 514[ // 'this' basket and we can assume [321 to 428[ is already in memory
1478 // b2: [400, 424[
1479 // and when reading entry 425 we will read b2's basket uncached.
1480
1481 if (showMore || gDebug > 8)
1482 Info("FillBuffer", "Skipping for now due to gap %d/%d with %lld > %lld", i, j, entries[j],
1483 (reqRanges.AllIncludedRange().fMax + 1));
1484 break; // Without consuming the basket.
1485 }
1486
1487 if (entries[j] < minEntry && (j<nb-1 && entries[j+1] <= minEntry))
1488 continue;
1489
1490 // We are within the range
1491 if (cursor[i].fClusterStart == -1)
1492 cursor[i].fClusterStart = j;
1493
1494 if (elist) {
1495 Long64_t emax = fEntryMax;
1496 if (j<nb-1)
1497 emax = entries[j + 1] - 1;
1498 if (!elist->ContainsRange(entries[j]+chainOffset,emax+chainOffset))
1499 continue;
1500 }
1501
1502 if (b->fCacheInfo.HasBeenUsed(j) || b->fCacheInfo.IsInCache(j) || b->fCacheInfo.IsVetoed(j)) {
1503 // We already cached and used this basket during this cluster range,
1504 // let's not redo it
1505 if (showMore || gDebug > 7)
1506 Info("FillBuffer", "Skipping basket to avoid redo: %d/%d veto: %d", i, j, b->fCacheInfo.IsVetoed(j));
1507 continue;
1508 }
1509
1510 if (std::find(std::begin(potentialVetoes), std::end(potentialVetoes), j) != std::end(potentialVetoes)) {
1511 // This basket was in the previous cache/cluster and was not used,
1512 // let's not read it again. I.e. we bet that it will continue to not
1513 // be used. At worst it will be used and thus read by itself.
1514 // Usually in this situation the basket is large so the penalty for
1515 // (re)reading it uselessly is high and the penalty to read it by
1516 // itself is 'small' (i.e. size bigger than latency).
1517 b->fCacheInfo.Veto(j);
1518 if (showMore || gDebug > 7)
1519 Info("FillBuffer", "Veto-ing cluster %d [%lld,%lld[ in branch %s #%d", j, entries[j],
1520 maxOfBasket(j) + 1, b->GetName(), i);
1521 continue;
1522 }
1523
1524 if (narrow) {
1525 if ((((entries[j] > entry)) || (j < nb - 1 && entries[j + 1] <= entry))) {
1526 // Keep only the basket that contains the entry
1527 if (j == cursor[i].fClusterStart && entry > entries[j])
1528 ++nSkipped;
1529 if (entries[j] > entry)
1530 break;
1531 else
1532 continue;
1533 }
1534 }
1535
1536 if ((ntotCurrentBuf + len) > fBufferSizeMin) {
1537 // Humm ... we are going to go over the requested size.
1538 if (clusterIterations > 0 && cursor[i].fLoadedOnce) {
1539 // We already have a full cluster and now we would go over the requested
1540 // size, let's stop caching (and make sure we start next time from the
1541 // end of the previous cluster).
1542 if (showMore || gDebug > 5) {
1543 Info(
1544 "FillBuffer",
1545 "Breaking early because %d is greater than %d at cluster iteration %d will restart at %lld",
1546 (ntotCurrentBuf + len), fBufferSizeMin, clusterIterations, minEntry);
1547 }
1548 fEntryNext = minEntry;
1549 filled = kTRUE;
1550 break;
1551 } else {
1552 if (pass == kStart || !cursor[i].fLoadedOnce) {
1553 if ((ntotCurrentBuf + len) > 4 * fBufferSizeMin) {
1554 // Okay, so we have not even made one pass and we already have
1555 // accumulated request for more than twice the memory size ...
1556 // So stop for now, and will restart at the same point, hoping
1557 // that the basket will still be in memory and not asked again ..
1558 fEntryNext = maxReadEntry;
1559
1560 if (showMore || gDebug > 5) {
1561 Info("FillBuffer", "Breaking early because %d is greater than 4*%d at cluster iteration "
1562 "%d pass %d will restart at %lld",
1563 (ntotCurrentBuf + len), fBufferSizeMin, clusterIterations, pass, fEntryNext);
1564 }
1565 filled = kTRUE;
1566 break;
1567 }
1568 } else {
1569 // We have made one pass through the branches and thus already
1570 // requested one basket per branch, let's stop prefetching
1571 // now.
1572 if ((ntotCurrentBuf + len) > 2 * fBufferSizeMin) {
1573 fEntryNext = maxReadEntry;
1574 if (showMore || gDebug > 5) {
1575 Info("FillBuffer", "Breaking early because %d is greater than 2*%d at cluster iteration "
1576 "%d pass %d will restart at %lld",
1577 (ntotCurrentBuf + len), fBufferSizeMin, clusterIterations, pass, fEntryNext);
1578 }
1579 filled = kTRUE;
1580 break;
1581 }
1582 }
1583 }
1584 }
1585
1586 ++nReadPrefRequest;
1587
1588 reqRanges.Update(i, j, entries, nb, fEntryMax);
1589 if (showMore || gDebug > 6)
1590 ranges.Update(i, j, entries, nb, fEntryMax);
1591
1592 b->fCacheInfo.SetIsInCache(j);
1593
1594 if (showMore || gDebug > 6)
1595 Info("FillBuffer", "*** Registering branch %d basket %d %s", i, j, b->GetName());
1596
1597 if (!cursor[i].fLoadedOnce) {
1598 cursor[i].fLoadedOnce = kTRUE;
1599 ++nDistinctLoad;
1600 }
1601 if (R__unlikely(perfStats)) {
1602 perfStats->SetLoaded(i, j);
1603 }
1604
1605 // Actual registering the basket for loading from the file.
1606 if (fEnablePrefetching){
1607 if (fFirstBuffer) {
1609 ntotCurrentBuf = fNtot;
1610 }
1611 else {
1613 ntotCurrentBuf = fBNtot;
1614 }
1615 }
1616 else {
1618 ntotCurrentBuf = fNtot;
1619 }
1620
1621 if ( ( j < (nb-1) ) && entries[j+1] > maxReadEntry ) {
1622 // Info("FillBuffer","maxCollectEntry incremented from %lld to %lld", maxReadEntry, entries[j+1]);
1623 maxReadEntry = entries[j+1];
1624 }
1625 if (ntotCurrentBuf > 4 * fBufferSizeMin) {
1626 // Humm something wrong happened.
1627 Warning("FillBuffer", "There is more data in this cluster (starting at entry %lld to %lld, "
1628 "current=%lld) than usual ... with %d %.3f%% of the branches we already have "
1629 "%d bytes (instead of %d)",
1630 fEntryCurrent, fEntryNext, entries[j], i, (100.0 * i) / ((float)fNbranches), ntotCurrentBuf,
1632 }
1633 if (pass == kStart) {
1634 // In the first pass, we record one basket per branch and move on to the next branch.
1635 auto high = maxOfBasket(j);
1636 if (high < lowestMaxEntry)
1637 lowestMaxEntry = high;
1638 // 'Consume' the baskets (i.e. avoid looking at it during a subsequent pass)
1639 ++j;
1640 break;
1641 } else if ((j + 1) == nb || entries[j + 1] >= maxReadEntry || entries[j + 1] >= lowestMaxEntry) {
1642 // In the other pass, load the baskets until we get to the maximum loaded so far.
1643 auto high = maxOfBasket(j);
1644 if (high < lowestMaxEntry)
1645 lowestMaxEntry = high;
1646 // 'Consume' the baskets (i.e. avoid looking at it during a subsequent pass)
1647 ++j;
1648 break;
1649 }
1650 }
1651
1652 if (cursor[i].fCurrent == nb) {
1653 ++nReachedEnd;
1654 }
1655
1656 if (gDebug > 0)
1657 Info("CollectBaskets",
1658 "Entry: %lld, registering baskets branch %s, fEntryNext=%lld, fNseek=%d, ntotCurrentBuf=%d",
1659 minEntry, ((TBranch *)fBranches->UncheckedAt(i))->GetName(), fEntryNext, fNseek, ntotCurrentBuf);
1660 }
1661 reachedEnd = (nReachedEnd == fNbranches);
1662 skippedFirst = (nSkipped > 0);
1663 oncePerBranch = (nDistinctLoad == fNbranches);
1664 progress = nReadPrefRequest - oldnReadPrefRequest;
1665 return filled;
1666 };
1667
1668 // First collect all the basket containing the request entry.
1669 bool full = kFALSE;
1670
1671 full = CollectBaskets(kStart, kNarrow, fEntryNext);
1672
1673 // Then fill out from all but the 'largest' branch to even out
1674 // the range across branches;
1675 while (!full && !reachedEnd && progress) { // used to be restricted to !oncePerBranch
1676 full = CollectBaskets(kStart, kFull, std::min(maxReadEntry, fEntryNext));
1677 }
1678
1679 resetBranchInfo = kFALSE; // Make sure the 2nd cluster iteration does not erase the info.
1680
1681 // Then fill out to the end of the cluster.
1682 if (!full && !fReverseRead) {
1683 do {
1684 full = CollectBaskets(kRegular, kFull, fEntryNext);
1685 } while (!full && !reachedEnd && progress);
1686 }
1687
1688 // The restart from the start of the cluster.
1689 if (!full && skippedFirst) {
1690 full = CollectBaskets(kRewind, kFull, fEntryNext);
1691 while (!full && !reachedEnd && progress) {
1692 full = CollectBaskets(kRegular, kFull, fEntryNext);
1693 }
1694 }
1695
1696 clusterIterations++;
1697
1698 minEntry = clusterIter.Next();
1699 if (fIsLearning) {
1700 fFillTimes++;
1701 }
1702
1703 // Continue as long as we still make progress (prevNtot < ntotCurrentBuf), that the next entry range to be looked
1704 // at,
1705 // which start at 'minEntry', is not past the end of the requested range (minEntry < fEntryMax)
1706 // and we guess that we not going to go over the requested amount of memory by asking for another set
1707 // of entries (fBufferSizeMin > ((Long64_t)ntotCurrentBuf*(clusterIterations+1))/clusterIterations).
1708 // ntotCurrentBuf / clusterIterations is the average size we are accumulated so far at each loop.
1709 // and thus (ntotCurrentBuf / clusterIterations) * (clusterIterations+1) is a good guess at what the next total
1710 // size
1711 // would be if we run the loop one more time. ntotCurrentBuf and clusterIterations are Int_t but can sometimes
1712 // be 'large' (i.e. 30Mb * 300 intervals) and can overflow the numerical limit of Int_t (i.e. become
1713 // artificially negative). To avoid this issue we promote ntotCurrentBuf to a long long (64 bits rather than 32
1714 // bits)
1715 if (!((fBufferSizeMin > ((Long64_t)ntotCurrentBuf * (clusterIterations + 1)) / clusterIterations) &&
1716 (prevNtot < ntotCurrentBuf) && (minEntry < fEntryMax))) {
1717 if (showMore || gDebug > 6)
1718 Info("FillBuffer", "Breaking because %d <= %lld || (%d >= %d) || %lld >= %lld", fBufferSizeMin,
1719 ((Long64_t)ntotCurrentBuf * (clusterIterations + 1)) / clusterIterations, prevNtot, ntotCurrentBuf,
1720 minEntry, fEntryMax);
1721 break;
1722 }
1723
1724 //for the reverse reading case
1725 if (!fIsLearning && fReverseRead) {
1726 if (clusterIterations >= fFillTimes)
1727 break;
1728 if (minEntry >= fEntryCurrentMax && fEntryCurrentMax > 0)
1729 break;
1730 }
1731 fEntryNext = clusterIter.GetNextEntry();
1734 } while (kTRUE);
1735
1736 if (showMore || gDebug > 6) {
1737 Info("FillBuffer", "Mem ranges");
1738 memRanges.Print();
1739 Info("FillBuffer", "Combined ranges");
1740 ranges.Print();
1741 Info("FillBuffer", "Requested ranges");
1742 reqRanges.Print();
1743 PrintAllCacheInfo(fBranches);
1744 }
1745
1746 if (nReadPrefRequest == 0) {
1747 // Nothing was added in the cache. This usually indicates that the baskets
1748 // contains the requested entry are either already in memory or are too large
1749 // on their own to fit in the cache.
1750 if (showMore || gDebug > 5) {
1751 Info("FillBuffer", "For entry %lld, nothing was added to the cache.", entry);
1752 }
1753 } else if (fEntryNext < firstClusterEnd && !reqRanges.Contains(entry)) {
1754 // Something went very wrong and even-though we searched for the baskets
1755 // holding 'entry' we somehow ended up with a range of entries that does
1756 // validate. So we must have been unable to find or fit the needed basket.
1757 // And thus even-though, we know the corresponding baskets wont be in the cache,
1758 // Let's make it official that 'entry' is within the range of this TTreeCache ('s search.)
1759
1760 // Without this, the next read will be flagged as 'out-of-range' and then we start at
1761 // the exact same point as this FillBuffer execution resulting in both the requested
1762 // entry still not being part of the cache **and** the beginning of the cluster being
1763 // read **again**.
1764
1765 if (showMore || gDebug > 5) {
1766 Error("FillBuffer", "Reset the next entry because the currently loaded range does not contains the request "
1767 "entry: %lld. fEntryNext updated from %lld to %lld. %d",
1768 entry, fEntryNext, firstClusterEnd, nReadPrefRequest);
1769 reqRanges.Print();
1770 }
1771
1772 fEntryNext = firstClusterEnd;
1773 } else {
1774 if (showMore || gDebug > 5) {
1775 Info("FillBuffer", "Complete adding %d baskets from %d branches taking in memory %d out of %d",
1776 nReadPrefRequest, reqRanges.BranchesRegistered(), ntotCurrentBuf, fBufferSizeMin);
1777 }
1778 }
1779
1780 fNReadPref += nReadPrefRequest;
1781 if (fEnablePrefetching) {
1782 if (fIsLearning) {
1784 }
1785 if (!fIsLearning && fFirstTime){
1786 // First time we add autoFlush entries , after fFillTimes * autoFlush
1787 // only in reverse prefetching mode
1789 }
1790 }
1792 return kTRUE;
1793}
1794
1795////////////////////////////////////////////////////////////////////////////////
1796/// Return the desired prefill type from the environment or resource variable
1797/// - 0 - No prefill
1798/// - 1 - All branches
1799
1801{
1802 const char *stcp;
1803 Int_t s = 0;
1804
1805 if (!(stcp = gSystem->Getenv("ROOT_TTREECACHE_PREFILL")) || !*stcp) {
1806 s = gEnv->GetValue("TTreeCache.Prefill", 1);
1807 } else {
1808 s = TString(stcp).Atoi();
1809 }
1810
1811 return static_cast<TTreeCache::EPrefillType>(s);
1812}
1813
1814////////////////////////////////////////////////////////////////////////////////
1815/// Give the total efficiency of the primary cache... defined as the ratio
1816/// of blocks found in the cache vs. the number of blocks prefetched
1817/// ( it could be more than 1 if we read the same block from the cache more
1818/// than once )
1819///
1820/// Note: This should eb used at the end of the processing or we will
1821/// get incomplete stats
1822
1824{
1825 if ( !fNReadPref )
1826 return 0;
1827
1828 return ((Double_t)fNReadOk / (Double_t)fNReadPref);
1829}
1830
1831////////////////////////////////////////////////////////////////////////////////
1832/// The total efficiency of the 'miss cache' - defined as the ratio
1833/// of blocks found in the cache versus the number of blocks prefetched
1834
1836{
1837 if (!fNMissReadPref) {
1838 return 0;
1839 }
1840 return static_cast<double>(fNMissReadOk) / static_cast<double>(fNMissReadPref);
1841}
1842
1843////////////////////////////////////////////////////////////////////////////////
1844/// This will indicate a sort of relative efficiency... a ratio of the
1845/// reads found in the cache to the number of reads so far
1846
1848{
1849 if ( !fNReadOk && !fNReadMiss )
1850 return 0;
1851
1852 return ((Double_t)fNReadOk / (Double_t)(fNReadOk + fNReadMiss));
1853}
1854
1855////////////////////////////////////////////////////////////////////////////////
1856/// Relative efficiency of the 'miss cache' - ratio of the reads found in cache
1857/// to the number of reads so far.
1858
1860{
1861 if (!fNMissReadOk && !fNMissReadMiss) {
1862 return 0;
1863 }
1864
1865 return static_cast<double>(fNMissReadOk) / static_cast<double>(fNMissReadOk + fNMissReadMiss);
1866}
1867
1868////////////////////////////////////////////////////////////////////////////////
1869/// Static function returning the number of entries used to train the cache
1870/// see SetLearnEntries
1871
1873{
1874 return fgLearnEntries;
1875}
1876
1877////////////////////////////////////////////////////////////////////////////////
1878/// Print cache statistics. Like:
1879///
1880/// ~~~ {.cpp}
1881/// ******TreeCache statistics for file: cms2.root ******
1882/// Number of branches in the cache ...: 1093
1883/// Cache Efficiency ..................: 0.997372
1884/// Cache Efficiency Rel...............: 1.000000
1885/// Learn entries......................: 100
1886/// Reading............................: 72761843 bytes in 7 transactions
1887/// Readahead..........................: 256000 bytes with overhead = 0 bytes
1888/// Average transaction................: 10394.549000 Kbytes
1889/// Number of blocks in current cache..: 210, total size: 6280352
1890/// ~~~
1891///
1892/// - if option = "a" the list of blocks in the cache is printed
1893/// see also class TTreePerfStats.
1894/// - if option contains 'cachedbranches', the list of branches being
1895/// cached is printed.
1896
1898{
1899 TString opt = option;
1900 opt.ToLower();
1901 printf("******TreeCache statistics for tree: %s in file: %s ******\n",fTree ? fTree->GetName() : "no tree set",fFile ? fFile->GetName() : "no file set");
1902 if (fNbranches <= 0) return;
1903 printf("Number of branches in the cache ...: %d\n",fNbranches);
1904 printf("Cache Efficiency ..................: %f\n",GetEfficiency());
1905 printf("Cache Efficiency Rel...............: %f\n",GetEfficiencyRel());
1906 printf("Secondary Efficiency ..............: %f\n", GetMissEfficiency());
1907 printf("Secondary Efficiency Rel ..........: %f\n", GetMissEfficiencyRel());
1908 printf("Learn entries......................: %d\n",TTreeCache::GetLearnEntries());
1909 if ( opt.Contains("cachedbranches") ) {
1910 opt.ReplaceAll("cachedbranches","");
1911 printf("Cached branches....................:\n");
1912 const TObjArray *cachedBranches = this->GetCachedBranches();
1913 Int_t nbranches = cachedBranches->GetEntriesFast();
1914 for (Int_t i = 0; i < nbranches; ++i) {
1915 TBranch* branch = (TBranch*) cachedBranches->UncheckedAt(i);
1916 printf("Branch name........................: %s\n",branch->GetName());
1917 }
1918 }
1920}
1921
1922////////////////////////////////////////////////////////////////////////////////
1923/// Old method ReadBuffer before the addition of the prefetch mechanism.
1924
1926 //Is request already in the cache?
1927 if (TFileCacheRead::ReadBuffer(buf,pos,len) == 1){
1928 fNReadOk++;
1929 return 1;
1930 }
1931
1932 static const auto recordMiss = [](TVirtualPerfStats *perfStats, TObjArray *branches, Bool_t bufferFilled,
1933 Long64_t basketpos) {
1934 if (gDebug > 6)
1935 ::Info("TTreeCache::ReadBufferNormal", "Cache miss after an %s FillBuffer: pos=%lld",
1936 bufferFilled ? "active" : "inactive", basketpos);
1937 for (Int_t i = 0; i < branches->GetEntries(); ++i) {
1938 TBranch *b = (TBranch *)branches->UncheckedAt(i);
1939 Int_t blistsize = b->GetListOfBaskets()->GetSize();
1940 for (Int_t j = 0; j < blistsize; ++j) {
1941 if (basketpos == b->GetBasketSeek(j)) {
1942 if (gDebug > 6)
1943 ::Info("TTreeCache::ReadBufferNormal", " Missing basket: %d for %s", j, b->GetName());
1944 perfStats->SetMissed(i, j);
1945 }
1946 }
1947 }
1948 };
1949
1950 //not found in cache. Do we need to fill the cache?
1951 Bool_t bufferFilled = FillBuffer();
1952 if (bufferFilled) {
1953 Int_t res = TFileCacheRead::ReadBuffer(buf,pos,len);
1954
1955 if (res == 1)
1956 fNReadOk++;
1957 else if (res == 0) {
1958 fNReadMiss++;
1959 auto perfStats = GetTree()->GetPerfStats();
1960 if (perfStats)
1961 recordMiss(perfStats, fBranches, bufferFilled, pos);
1962 }
1963
1964 return res;
1965 }
1966
1967 if (CheckMissCache(buf, pos, len)) {
1968 return 1;
1969 }
1970
1971 fNReadMiss++;
1972 auto perfStats = GetTree()->GetPerfStats();
1973 if (perfStats)
1974 recordMiss(perfStats, fBranches, bufferFilled, pos);
1975
1976 return 0;
1977}
1978
1979////////////////////////////////////////////////////////////////////////////////
1980/// Used to read a chunk from a block previously fetched. It will call FillBuffer
1981/// even if the cache lookup succeeds, because it will try to prefetch the next block
1982/// as soon as we start reading from the current block.
1983
1985{
1986 if (TFileCacheRead::ReadBuffer(buf, pos, len) == 1){
1987 //call FillBuffer to prefetch next block if necessary
1988 //(if we are currently reading from the last block available)
1989 FillBuffer();
1990 fNReadOk++;
1991 return 1;
1992 }
1993
1994 //keep on prefetching until request is satisfied
1995 // try to prefetch a couple of times and if request is still not satisfied then
1996 // fall back to normal reading without prefetching for the current request
1997 Int_t counter = 0;
1998 while (1) {
1999 if(TFileCacheRead::ReadBuffer(buf, pos, len)) {
2000 break;
2001 }
2002 FillBuffer();
2003 fNReadMiss++;
2004 counter++;
2005 if (counter>1) {
2006 return 0;
2007 }
2008 }
2009
2010 fNReadOk++;
2011 return 1;
2012}
2013
2014////////////////////////////////////////////////////////////////////////////////
2015/// Read buffer at position pos if the request is in the list of
2016/// prefetched blocks read from fBuffer.
2017/// Otherwise try to fill the cache from the list of selected branches,
2018/// and recheck if pos is now in the list.
2019/// Returns:
2020/// - -1 in case of read failure,
2021/// - 0 in case not in cache,
2022/// - 1 in case read from cache.
2023/// This function overloads TFileCacheRead::ReadBuffer.
2024
2026{
2027 if (!fEnabled) return 0;
2028
2030 return TTreeCache::ReadBufferPrefetch(buf, pos, len);
2031 else
2032 return TTreeCache::ReadBufferNormal(buf, pos, len);
2033}
2034
2035////////////////////////////////////////////////////////////////////////////////
2036/// This will simply clear the cache
2037
2039{
2040 for (Int_t i = 0; i < fNbranches; ++i) {
2042 if (b->GetDirectory()==0 || b->TestBit(TBranch::kDoNotProcess))
2043 continue;
2044 if (b->GetDirectory()->GetFile() != fFile)
2045 continue;
2046 b->fCacheInfo.Reset();
2047 }
2048 fEntryCurrent = -1;
2049 fEntryNext = -1;
2051 fNextClusterStart = -1;
2052
2054
2055 if (fEnablePrefetching) {
2056 fFirstTime = kTRUE;
2058 }
2059}
2060
2061////////////////////////////////////////////////////////////////////////////////
2062/// Change the underlying buffer size of the cache.
2063/// If the change of size means some cache content is lost, or if the buffer
2064/// is now larger, setup for a cache refill the next time there is a read
2065/// Returns:
2066/// - 0 if the buffer content is still available
2067/// - 1 if some or all of the buffer content has been made unavailable
2068/// - -1 on error
2069
2071{
2072 Int_t prevsize = GetBufferSize();
2073 Int_t res = TFileCacheRead::SetBufferSize(buffersize);
2074 if (res < 0) {
2075 return res;
2076 }
2077
2078 if (res == 0 && buffersize <= prevsize) {
2079 return res;
2080 }
2081
2082 // if content was removed from the buffer, or the buffer was enlarged then
2083 // empty the prefetch lists and prime to fill the cache again
2084
2086 if (fEnablePrefetching) {
2088 }
2089
2090 fEntryCurrent = -1;
2091 if (!fIsLearning) {
2092 fEntryNext = -1;
2093 }
2094
2095 return 1;
2096}
2097
2098////////////////////////////////////////////////////////////////////////////////
2099/// Set the minimum and maximum entry number to be processed
2100/// this information helps to optimize the number of baskets to read
2101/// when prefetching the branch buffers.
2102
2104{
2105 // This is called by TTreePlayer::Process in an automatic way...
2106 // don't restart it if the user has specified the branches.
2107 Bool_t needLearningStart = (fEntryMin != emin) && fIsLearning && !fIsManual;
2108
2109 fEntryMin = emin;
2110 fEntryMax = emax;
2112 if (gDebug > 0)
2113 Info("SetEntryRange", "fEntryMin=%lld, fEntryMax=%lld, fEntryNext=%lld",
2115
2116 if (needLearningStart) {
2117 // Restart learning
2119 }
2120}
2121
2122////////////////////////////////////////////////////////////////////////////////
2123/// Change the file that is being cached.
2124
2126{
2127 // The infinite recursion is 'broken' by the fact that
2128 // TFile::SetCacheRead remove the entry from fCacheReadMap _before_
2129 // calling SetFile (and also by setting fFile to zero before the calling).
2130 if (fFile) {
2131 TFile *prevFile = fFile;
2132 fFile = 0;
2133 prevFile->SetCacheRead(0, fTree, action);
2134 }
2136}
2137
2138////////////////////////////////////////////////////////////////////////////////
2139/// Static function to set the number of entries to be used in learning mode
2140/// The default value for n is 10. n must be >= 1
2141
2143{
2144 if (n < 1) n = 1;
2145 fgLearnEntries = n;
2146}
2147
2148////////////////////////////////////////////////////////////////////////////////
2149/// Set whether the learning period is started with a prefilling of the
2150/// cache and which type of prefilling is used.
2151/// The two value currently supported are:
2152/// - TTreeCache::kNoPrefill disable the prefilling
2153/// - TTreeCache::kAllBranches fill the cache with baskets from all branches.
2154/// The default prefilling behavior can be controlled by setting
2155/// TTreeCache.Prefill or the environment variable ROOT_TTREECACHE_PREFILL.
2156
2158{
2160}
2161
2162////////////////////////////////////////////////////////////////////////////////
2163/// The name should be enough to explain the method.
2164/// The only additional comments is that the cache is cleaned before
2165/// the new learning phase.
2166
2168{
2170 fIsManual = kFALSE;
2171 fNbranches = 0;
2172 if (fBrNames) fBrNames->Delete();
2174 fEntryCurrent = -1;
2175}
2176
2177////////////////////////////////////////////////////////////////////////////////
2178/// This is the counterpart of StartLearningPhase() and can be used to stop
2179/// the learning phase. It's useful when the user knows exactly what branches
2180/// they are going to use.
2181/// For the moment it's just a call to FillBuffer() since that method
2182/// will create the buffer lists from the specified branches.
2183
2185{
2186 if (fIsLearning) {
2187 // This will force FillBuffer to read the buffers.
2188 fEntryNext = -1;
2190 }
2191 fIsManual = kTRUE;
2192
2193 auto perfStats = GetTree()->GetPerfStats();
2194 if (perfStats)
2195 perfStats->UpdateBranchIndices(fBranches);
2196
2197 //fill the buffers only once during learning
2198 if (fEnablePrefetching && !fOneTime) {
2200 FillBuffer();
2201 fOneTime = kTRUE;
2202 }
2203}
2204
2205////////////////////////////////////////////////////////////////////////////////
2206/// Update pointer to current Tree and recompute pointers to the branches in the cache.
2207
2209{
2210
2211 fTree = tree;
2212
2213 fEntryMin = 0;
2215
2216 fEntryCurrent = -1;
2217
2218 if (fBrNames->GetEntries() == 0 && fIsLearning) {
2219 // We still need to learn.
2221 } else {
2222 // We learnt from a previous file.
2224 fEntryNext = -1;
2225 }
2226 fNbranches = 0;
2227
2228 TIter next(fBrNames);
2229 TObjString *os;
2230 while ((os = (TObjString*)next())) {
2231 TBranch *b = fTree->GetBranch(os->GetName());
2232 if (!b) {
2233 continue;
2234 }
2236 fNbranches++;
2237 }
2238
2239 auto perfStats = GetTree()->GetPerfStats();
2240 if (perfStats)
2241 perfStats->UpdateBranchIndices(fBranches);
2242}
2243
2244////////////////////////////////////////////////////////////////////////////////
2245/// Perform an initial prefetch, attempting to read as much of the learning
2246/// phase baskets for all branches at once
2247
2249{
2250 // This is meant for the learning phase
2251 if (!fIsLearning) return;
2252
2253 // This should be called before reading entries, otherwise we'll
2254 // always exit here, since TBranch adds itself before reading
2255 if (fNbranches > 0) return;
2256
2257 // Is the LearnPrefill enabled (using an Int_t here to allow for future
2258 // extension to alternative Prefilling).
2259 if (fPrefillType == kNoPrefill) return;
2260
2261 Long64_t entry = fTree ? fTree->GetReadEntry() : 0;
2262
2263 // Return early if we are out of the requested range.
2264 if (entry < fEntryMin || entry > fEntryMax) return;
2265
2267
2268
2269 // Force only the learn entries to be cached by temporarily setting min/max
2270 // to the learning phase entry range
2271 // But save all the old values, so we can restore everything to how it was
2272 Long64_t eminOld = fEntryMin;
2273 Long64_t emaxOld = fEntryMax;
2274 Long64_t ecurrentOld = fEntryCurrent;
2275 Long64_t enextOld = fEntryNext;
2276 auto currentClusterStartOld = fCurrentClusterStart;
2277 auto nextClusterStartOld = fNextClusterStart;
2278
2279 fEntryMin = std::max(fEntryMin, fEntryCurrent);
2280 fEntryMax = std::min(fEntryMax, fEntryNext);
2281
2282 // We check earlier that we are within the authorized range, but
2283 // we might still be out of the (default) learning range and since
2284 // this is called before any branch is added to the cache, this means
2285 // that the user's first GetEntry is this one which is outside of the
2286 // learning range ... so let's do something sensible-ish.
2287 // Note: we probably should also fix the learning range but we may
2288 // or may not have enough information to know if we can move it
2289 // (for example fEntryMin (eminOld right now) might be the default or user provided)
2290 if (entry < fEntryMin) fEntryMin = entry;
2291 if (entry > fEntryMax) fEntryMax = entry;
2292
2293 // Add all branches to be cached. This also sets fIsManual, stops learning,
2294 // and makes fEntryNext = -1 (which forces a cache fill, which is good)
2295 AddBranch("*");
2296 fIsManual = kFALSE; // AddBranch sets fIsManual, so we reset it
2297
2298 // Now, fill the buffer with the learning phase entry range
2299 FillBuffer();
2300
2301 // Leave everything the way we found it
2303 DropBranch("*"); // This doesn't work unless we're already learning
2304
2305 // Restore entry values
2306 fEntryMin = eminOld;
2307 fEntryMax = emaxOld;
2308 fEntryCurrent = ecurrentOld;
2309 fEntryNext = enextOld;
2310 fCurrentClusterStart = currentClusterStartOld;
2311 fNextClusterStart = nextClusterStartOld;
2312
2314}
#define R__unlikely(expr)
Definition RConfig.hxx:601
#define b(i)
Definition RSha256.hxx:100
bool Bool_t
Definition RtypesCore.h:63
int Int_t
Definition RtypesCore.h:45
unsigned int UInt_t
Definition RtypesCore.h:46
constexpr Bool_t kFALSE
Definition RtypesCore.h:101
constexpr Ssiz_t kNPOS
Definition RtypesCore.h:124
long long Long64_t
Definition RtypesCore.h:80
unsigned long long ULong64_t
Definition RtypesCore.h:81
constexpr Bool_t kTRUE
Definition RtypesCore.h:100
const char Option_t
Definition RtypesCore.h:66
#define ClassImp(name)
Definition Rtypes.h:377
R__EXTERN TEnv * gEnv
Definition TEnv.h:170
Option_t Option_t option
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t cursor
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h offset
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t r
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t result
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t UChar_t len
Option_t Option_t TPoint TPoint const char GetTextMagnitude GetFillStyle GetLineColor GetLineWidth GetMarkerStyle GetTextAlign GetTextColor GetTextSize void char Point_t Rectangle_t WindowAttributes_t Float_t Float_t Float_t Int_t Int_t UInt_t UInt_t Rectangle_t Int_t Int_t Window_t TString Int_t GCValues_t GetPrimarySelectionOwner GetDisplay GetScreen GetColormap GetNativeEvent const char const char dpyName wid window const char font_name cursor keysym reg const char only_if_exist regb h Point_t winding char text const char depth char const char Int_t count const char ColorStruct_t color const char Pixmap_t Pixmap_t PictureAttributes_t attr const char char ret_data h unsigned char height h Atom_t Int_t ULong_t ULong_t unsigned char prop_list Atom_t Atom_t Atom_t Time_t type
char name[80]
Definition TGX11.cxx:110
Int_t gDebug
Definition TROOT.cxx:585
void Printf(const char *fmt,...)
Formats a string in a circular formatting buffer and prints the string.
Definition TString.cxx:2481
R__EXTERN TSystem * gSystem
Definition TSystem.h:560
A Branch for the case of an object.
A TTree is a list of TBranches.
Definition TBranch.h:89
@ kDoNotProcess
Active bit for branches.
Definition TBranch.h:101
void PrintCacheInfo() const
Print the information we have about which basket is currently cached and whether they have been 'used...
Definition TBranch.cxx:2391
Long64_t fEntries
Number of entries.
Definition TBranch.h:130
A chain is a collection of files containing TTree objects.
Definition TChain.h:33
virtual Int_t GetTreeNumber() const
Definition TChain.h:120
static TClass * Class()
Long64_t * GetTreeOffset() const
Definition TChain.h:121
virtual Int_t GetEntries() const
virtual Int_t GetValue(const char *name, Int_t dflt) const
Returns the integer value for a resource.
Definition TEnv.cxx:491
A TEventList object is a list of selected events (entries) in a TTree.
Definition TEventList.h:31
virtual Bool_t ContainsRange(Long64_t entrymin, Long64_t entrymax)
Return TRUE if list contains entries from entrymin to entrymax included.
A cache when reading files over the network.
virtual Int_t SetBufferSize(Int_t buffersize)
Sets the buffer size.
virtual Int_t ReadBuffer(char *buf, Long64_t pos, Int_t len)
Read buffer at position pos.
virtual void SecondPrefetch(Long64_t, Int_t)
Bool_t fEnablePrefetching
reading by prefetching asynchronously
Int_t fNtot
Total size of prefetched blocks.
virtual void Prefetch(Long64_t pos, Int_t len)
Add block of length len at position pos in the list of blocks to be prefetched.
Int_t fBufferSizeMin
Original size of fBuffer.
Bool_t fIsTransferred
True when fBuffer contains something valid.
TFile * fFile
Pointer to file.
Int_t fNseek
Number of blocks to be prefetched.
void Print(Option_t *option="") const override
Print cache statistics.
virtual Int_t GetBufferSize() const
virtual void SetFile(TFile *file, TFile::ECacheAction action=TFile::kDisconnect)
Set the file using this cache and reset the current blocks (if any).
A ROOT file is a suite of consecutive data records (TKey instances) with a well defined format.
Definition TFile.h:51
virtual void SetCacheRead(TFileCacheRead *cache, TObject *tree=nullptr, ECacheAction action=kDisconnect)
Set a pointer to the read cache.
Definition TFile.cxx:2329
virtual Bool_t ReadBuffers(char *buf, Long64_t *pos, Int_t *len, Int_t nbuf)
Read the nbuf blocks described in arrays pos and len.
Definition TFile.cxx:1785
ECacheAction
TTreeCache flushing semantics.
Definition TFile.h:68
A TFriendElement TF describes a TTree object TF in a file.
virtual TTree * GetTree()
Return pointer to friend TTree.
A TLeaf describes individual elements of a TBranch See TBranch structure in TTree.
Definition TLeaf.h:57
virtual TLeaf * GetLeafCount() const
If this leaf stores a variable-sized array or a multi-dimensional array whose last dimension has vari...
Definition TLeaf.h:121
TBranch * GetBranch() const
Definition TLeaf.h:116
A doubly linked list.
Definition TList.h:38
TObject * FindObject(const char *name) const override
Find an object in this list using its name.
Definition TList.cxx:578
void Add(TObject *obj) override
Definition TList.h:81
TObject * Remove(TObject *obj) override
Remove object from the list.
Definition TList.cxx:822
void Delete(Option_t *option="") override
Remove all objects from the list AND delete all heap based objects.
Definition TList.cxx:470
const char * GetName() const override
Returns name of object.
Definition TNamed.h:47
An array of TObjects.
Definition TObjArray.h:31
Int_t GetEntriesFast() const
Definition TObjArray.h:58
void AddAt(TObject *obj, Int_t idx) override
Add object at position ids.
virtual void AddAtAndExpand(TObject *obj, Int_t idx)
Add object at position idx.
TObject * UncheckedAt(Int_t i) const
Definition TObjArray.h:84
TObject * Remove(TObject *obj) override
Remove object from array.
Collectable string class.
Definition TObjString.h:28
const char * GetName() const override
Returns name of object.
Definition TObjString.h:38
virtual void Warning(const char *method, const char *msgfmt,...) const
Issue warning message.
Definition TObject.cxx:956
virtual void Error(const char *method, const char *msgfmt,...) const
Issue error message.
Definition TObject.cxx:970
virtual void Info(const char *method, const char *msgfmt,...) const
Issue info message.
Definition TObject.cxx:944
Regular expression class.
Definition TRegexp.h:31
Basic string class.
Definition TString.h:139
Ssiz_t Length() const
Definition TString.h:421
void ToLower()
Change string to lower-case.
Definition TString.cxx:1170
Int_t Atoi() const
Return integer value of string.
Definition TString.cxx:1966
const char * Data() const
Definition TString.h:380
TString & ReplaceAll(const TString &s1, const TString &s2)
Definition TString.h:704
TString & Append(const char *cs)
Definition TString.h:576
void Form(const char *fmt,...)
Formats a string using a printf style format descriptor.
Definition TString.cxx:2334
Bool_t Contains(const char *pat, ECaseCompare cmp=kExact) const
Definition TString.h:636
Ssiz_t Index(const char *pat, Ssiz_t i=0, ECaseCompare cmp=kExact) const
Definition TString.h:651
virtual const char * Getenv(const char *env)
Get environment variable.
Definition TSystem.cxx:1666
A cache to speed-up the reading of ROOT datasets.
Definition TTreeCache.h:32
virtual void SetLearnPrefill(EPrefillType type=kNoPrefill)
Set whether the learning period is started with a prefilling of the cache and which type of prefillin...
virtual void UpdateBranches(TTree *tree)
Update pointer to current Tree and recompute pointers to the branches in the cache.
void SetOptimizeMisses(Bool_t opt)
Start of methods for the miss cache.
Double_t GetEfficiencyRel() const
This will indicate a sort of relative efficiency... a ratio of the reads found in the cache to the nu...
Bool_t fReverseRead
! reading in reverse mode
Definition TTreeCache.h:58
Bool_t fLearnPrefilling
! true if we are in the process of executing LearnPrefill
Definition TTreeCache.h:68
Long64_t fLastMiss
! set to the event # of the last miss.
Definition TTreeCache.h:74
Int_t fNMissReadPref
Number of blocks read into the secondary ("miss") cache.
Definition TTreeCache.h:50
Int_t fNMissReadOk
Number of blocks read, not found in the primary cache, and found in the secondary cache.
Definition TTreeCache.h:46
Bool_t fOneTime
! used in the learning phase
Definition TTreeCache.h:57
Long64_t fEntryMin
! first entry in the cache
Definition TTreeCache.h:38
virtual Bool_t FillBuffer()
Fill the cache buffer with the branches in the cache.
Bool_t ProcessMiss(Long64_t pos, int len)
! Given a file read not in the miss cache, handle (possibly) loading the data.
static void SetLearnEntries(Int_t n=10)
Static function to set the number of entries to be used in learning mode The default value for n is 1...
Int_t LearnBranch(TBranch *b, Bool_t subgbranches=kFALSE) override
Add a branch discovered by actual usage to the list of branches to be stored in the cache this functi...
Long64_t fEntryNext
! next entry number where cache must be filled
Definition TTreeCache.h:41
Int_t fNReadMiss
Number of blocks read and not found in the cache.
Definition TTreeCache.h:47
Double_t GetEfficiency() const
Give the total efficiency of the primary cache... defined as the ratio of blocks found in the cache v...
Bool_t fEnabled
! cache enabled for cached reading
Definition TTreeCache.h:63
Bool_t fIsLearning
! true if cache is in learning mode
Definition TTreeCache.h:54
TTree * GetTree() const
Definition TTreeCache.h:149
Long64_t fNextClusterStart
! End+1 of the cluster(s) where the current content was picked out
Definition TTreeCache.h:43
const TObjArray * GetCachedBranches() const
Definition TTreeCache.h:139
virtual ~TTreeCache()
Destructor. (in general called by the TFile destructor)
Bool_t CheckMissCache(char *buf, Long64_t pos, int len)
Check the miss cache for a particular buffer, fetching if deemed necessary.
Int_t fNMissReadMiss
Number of blocks read and not found in either cache.
Definition TTreeCache.h:48
IOPos FindBranchBasketPos(TBranch &, Long64_t entry)
Given a branch and an entry, determine the file location (offset / size) of the corresponding basket.
virtual void SetEntryRange(Long64_t emin, Long64_t emax)
Set the minimum and maximum entry number to be processed this information helps to optimize the numbe...
TTreeCache()
Default Constructor.
std::unique_ptr< MissCache > fMissCache
! Cache contents for misses
Definition TTreeCache.h:105
Bool_t fFirstTime
! save the fact that we processes the first entry
Definition TTreeCache.h:60
void StartLearningPhase()
The name should be enough to explain the method.
Bool_t fReadDirectionSet
! read direction established
Definition TTreeCache.h:62
TBranch * CalculateMissEntries(Long64_t, int, bool)
Given an file read, try to determine the corresponding branch.
Long64_t fCurrentClusterStart
! Start of the cluster(s) where the current content was picked out
Definition TTreeCache.h:42
Double_t GetMissEfficiency() const
The total efficiency of the 'miss cache' - defined as the ratio of blocks found in the cache versus t...
virtual Int_t ReadBufferNormal(char *buf, Long64_t pos, Int_t len)
Old method ReadBuffer before the addition of the prefetch mechanism.
virtual void ResetCache()
This will simply clear the cache.
Bool_t fIsManual
! true if cache is StopLearningPhase was used
Definition TTreeCache.h:55
EPrefillType fPrefillType
Whether a pre-filling is enabled (and if applicable which type)
Definition TTreeCache.h:64
virtual void LearnPrefill()
Perform an initial prefetch, attempting to read as much of the learning phase baskets for all branche...
Long64_t fEntryMax
! last entry in the cache
Definition TTreeCache.h:39
static Int_t fgLearnEntries
number of entries used for learning mode
Definition TTreeCache.h:65
virtual Int_t DropBranch(TBranch *b, Bool_t subbranches=kFALSE)
Remove a branch to the list of branches to be stored in the cache this function is called by TBranch:...
virtual Int_t ReadBufferPrefetch(char *buf, Long64_t pos, Int_t len)
Used to read a chunk from a block previously fetched.
Long64_t fEntryCurrent
! current lowest entry number in the cache
Definition TTreeCache.h:40
void ResetMissCache()
Reset all the miss cache training.
Long64_t fFirstMiss
! set to the event # of the first miss.
Definition TTreeCache.h:73
Int_t ReadBuffer(char *buf, Long64_t pos, Int_t len) override
Read buffer at position pos if the request is in the list of prefetched blocks read from fBuffer.
Double_t GetMissEfficiencyRel() const
Relative efficiency of the 'miss cache' - ratio of the reads found in cache to the number of reads so...
Int_t SetBufferSize(Int_t buffersize) override
Change the underlying buffer size of the cache.
Long64_t fFirstEntry
! save the value of the first entry
Definition TTreeCache.h:61
Int_t fFillTimes
! how many times we can fill the current buffer
Definition TTreeCache.h:59
Int_t fNReadPref
Number of blocks that were prefetched.
Definition TTreeCache.h:49
TTree * fTree
! pointer to the current Tree
Definition TTreeCache.h:53
EPrefillType GetConfiguredPrefillType() const
Return the desired prefill type from the environment or resource variable.
Bool_t fFirstBuffer
! true if first buffer is used for prefetching
Definition TTreeCache.h:56
static Int_t GetLearnEntries()
Static function returning the number of entries used to train the cache see SetLearnEntries.
virtual void StopLearningPhase()
This is the counterpart of StartLearningPhase() and can be used to stop the learning phase.
Bool_t fOptimizeMisses
! true if we should optimize cache misses.
Definition TTreeCache.h:72
void SetFile(TFile *file, TFile::ECacheAction action=TFile::kDisconnect) override
Change the file that is being cached.
Int_t fNbranches
! Number of branches in the cache
Definition TTreeCache.h:44
Int_t fNReadOk
Number of blocks read and found in the cache.
Definition TTreeCache.h:45
void Print(Option_t *option="") const override
Print cache statistics.
Int_t AddBranch(TBranch *b, Bool_t subgbranches=kFALSE) override
Add a branch to the list of branches to be stored in the cache this function is called by the user vi...
TObjArray * fBranches
! List of branches to be stored in the cache
Definition TTreeCache.h:51
TList * fBrNames
! list of branch names in the cache
Definition TTreeCache.h:52
Helper class to iterate over cluster of baskets.
Definition TTree.h:267
Long64_t Next()
Move on to the next cluster and return the starting entry of this next cluster.
Definition TTree.cxx:649
Long64_t GetNextEntry()
Definition TTree.h:304
A TTree represents a columnar dataset.
Definition TTree.h:79
virtual TVirtualPerfStats * GetPerfStats() const
Definition TTree.h:503
virtual TBranch * GetBranch(const char *name)
Return pointer to the branch with the given name in this tree or its friends.
Definition TTree.cxx:5285
virtual TObjArray * GetListOfLeaves()
Definition TTree.h:486
virtual Long64_t GetEntries() const
Definition TTree.h:460
virtual Long64_t GetReadEntry() const
Definition TTree.h:506
virtual TTree * GetTree() const
Definition TTree.h:514
TEventList * GetEventList() const
Definition TTree.h:470
TClass * IsA() const override
Definition TTree.h:654
virtual TList * GetListOfFriends() const
Definition TTree.h:487
Provides the interface for the PROOF internal performance measurement and event tracing.
virtual void SetMissed(TBranch *b, size_t basketNumber)=0
virtual void UpdateBranchIndices(TObjArray *branches)=0
const Int_t n
Definition legend1.C:16
void Print(std::ostream &os, const OptionType &opt)
Long64_t BinarySearch(Long64_t n, const T *array, T value)
Binary search in an array of n values to locate value.
Definition TMathBase.h:347
Definition file.py:1
Definition tree.py:1
Int_t fLen
Position in file of cache entry.
Definition TTreeCache.h:82
Ta Range(0, 0, 1, 1)