Re: Cloning trees with TChain: drastic performance loss

From: Alexander Mann <amann_at_uni-goettingen.de>
Date: Mon, 16 May 2011 18:27:06 +0200

Hi Philippe,

oh, indeed, thanks for spotting this!

Is one of those variants

 > collChain->GetEntry(jentry);
 > collChain->GetTree()->GetEntry(ientry);

recommended over the other?

cu,
Alexander

On 05/16/2011 06:06 PM, Philippe Canal wrote:
> Hi Alexander,
>
> Instead of
>
> Long64_t ientry = collChain->LoadTree(jentry);
> collChain->GetEntry(ientry);
>
> you meant:
>
> Long64_t ientry = collChain->LoadTree(jentry);
> collChain->GetEntry(jentry);
>
> or
>
> Long64_t ientry = collChain->LoadTree(jentry);
> collChain->GetTree()->GetEntry(ientry);
>
> The value returned by LoadTree is the entry number within the current
> TTree and must be used only on the TBranch object or on the underlying
> TTree.
>
> The code as-is is not only slow but also only looks at the data in the
> first TTree.
>
> Cheers,
> Philippe.
>
> On 5/16/11 10:58 AM, Alexander Mann wrote:
>>
>> Hi Philippe,
>>
>> I don't see anything that could trigger such a behaviour, but maybe I
>> am missing something. This is the main loop of my code:
>>
>> // connect branches
>> TBranch *b_runno;
>> collChain->SetBranchAddress("RunNumber", &runno, &b_runno);
>> TBranch *b_evno;
>> collChain->SetBranchAddress("EventNumber", &evno, &b_evno);
>>
>> //Create a new file + a clone of old tree in new file
>> TFile *newfile = new TFile(outFile.c_str(), "recreate");
>> TTree *newtree = collChain->CloneTree(0);
>>
>> // do cloning
>> time_t stimee = time(NULL);
>> cout << "Start cloning tree at " << stimee << "..." << endl;
>> long stepsize = nentries / 100;
>> long stepleft = stepsize;
>> long evwritten = 0;
>> for (Long64_t jentry = 0; jentry < nentries; ++jentry) {
>>   stepleft = 1;
>>   if (--stepleft == 0) {
>>     stepleft = stepsize;
>>     cout << jentry << "/" << nentries << " (" << evwritten << ", "
>>          << currtime() << ")\r";
>>     fflush(stdout);
>>   }
>>   Long64_t ientry = collChain->LoadTree(jentry);
>>   collChain->GetEntry(ientry);
>>   if (vec_evnos[runno].find(evno) != vec_evnos[runno].end()) {
>>     newtree->Fill();
>>     ++evwritten;
>>   }
>> }
>> cout << endl;
>>
>> I am using ROOT 5.26.00e and running over quasi-local files. If you
>> prefer I can send you a test case including the complete macro.
>>
>> cu,
>> Alexander
>>
>>
>> On 05/16/2011 05:46 PM, Philippe Canal wrote:
>>> Hi Alexander,
>>>
>>> LoadTree should be 'immediate' most of the time. The behavior you
>>> describe usually indicates that some other portion of the code is
>>> directly or indirectly doing another call to LoadTree with a totally
>>> different entry number (0 or GetEntries()-1 are the most typical).
>>> This results in having to open the file and load the TTree object
>>> for every entry (and this is a 'slow' operation).
>>>
>>> Cheers,
>>> Philippe.
>>>
>>> On 5/16/11 10:41 AM, Alexander Mann wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am using a slightly modified version of some standard ROOT example
>>>> code from
>>>>
>>>> http://root.cern.ch/root/html/tutorials/tree/copytree3.C.html
>>>>
>>>> to skim TTrees. The main difference is that I am using TChain instead
>>>> of TTree, and therefore I am using
>>>>
>>>> Long64_t ientry = collChain->LoadTree(jentry);
>>>> collChain->GetEntry(ientry);
>>>>
>>>> This works fine, but after ~500 events (which is probably all from the
>>>> first file in the TChain) the speed abruptly decreases from about 27
>>>> events/s to 0.05 events/s, spent in equal parts in LoadTree and
>>>> GetEntry as it seems.
>>>>
>>>> Is there something I can do about this?
>>>>
>>>> cu,
>>>> Alexander
>>>>
>>
Received on Mon May 16 2011 - 18:27:12 CEST
