RE: [ROOT] Optimizing performance in Windows

From: Ed Oltman (eoltman@imago.com)
Date: Mon Feb 25 2002 - 16:03:59 MET


Hi Rene,

The root file I am processing is 133,440 KB.  Here's what the ntuple looks
like:

root [1] nt.Print()
****************************************************************************
**
*Tree    :nt        : Raw
*
*Entries : 10009775 : Total =       361019458 bytes  File  Size =  137536093
*
*        :          : Tree compression factor =   2.63
*
****************************************************************************
**
*Br    0 :x         :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   20655625
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.94
*
*...........................................................................
.*
*Br    1 :y         :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   20856422
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.92
*
*...........................................................................
.*
*Br    2 :tof       :
*
*Entries : 10009775 : Total  Size=   40090988 bytes  File Size  =   24044984
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.67
*
*...........................................................................
.*
*Br    3 :z         :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   12689049
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   3.16
*
*...........................................................................
.*
*Br    4 :v         :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =     157586
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression= 254.42
*
*...........................................................................
.*
*Br    5 :delta     :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =    9622126
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   4.17
*
*...........................................................................
.*
*Br    6 :pulse     :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   14472837
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   2.77
*
*...........................................................................
.*
*Br    7 :vap       :
*
*Entries : 10009775 : Total  Size=   40090988 bytes  File Size  =     151613
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression= 264.43
*
*...........................................................................
.*
*Br    8 :erate     :
*
*Entries : 10009775 : Total  Size=   40093494 bytes  File Size  =   34702827
*
*Baskets :     1253 : Basket Size=      32000 bytes  Compression=   1.16
*
*...........................................................................
.*
root [2]


To spin through the ntuple once it takes about 1 minute of real time (54 sec
CPU)

When I create an output ntuple, it takes about 3 minutes real time (2:30
CPU).  Root
page faults about 400-450 times/sec and the memory increases by about 1.6
MB/sec. Once the
output ntuple was made the process size was 320 MB (originally 40 MB)  The
output ntuple
looks like:

root [9] ntCorr->Print()
****************************************************************************
**
*Tree    :ntCorr    : 1008CC16
*
*Entries :  8595694 : Total =       275456000 bytes  File  Size =          0
*
*        :          : Tree compression factor =   1.00
*
****************************************************************************
**
*Br    0 :x         :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    1 :y         :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    2 :z         :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    3 :tofc      :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    4 :tofr      :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    5 :v         :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    6 :erate     :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*
*Br    7 :bv        :
*
*Entries :  8595694 : Total  Size=   34432000 bytes  File Size  =          0
*
*Baskets :     1076 : Basket Size=      32000 bytes  Compression=   1.00
*
*...........................................................................
.*

When I would spin through this (memory resident) ntuple (15% fewer entries
than input)
it only took 11 seconds (CPU and real)..


Here is the output from ROOTSYS\test\event.exe
[I have a Dell Precision 330: P4 1.5 GHz w/1GB RAM]

==> Step 1
C:\ROOT\test>Event 2000 0 0 1
event:0, rtime=0.032000 s
event:100, rtime=0.609001 s
event:200, rtime=0.608999 s
event:300, rtime=0.610001 s
event:400, rtime=0.608999 s
event:500, rtime=0.610001 s
event:600, rtime=0.608999 s
event:700, rtime=0.608999 s
event:800, rtime=0.610001 s
event:900, rtime=0.593000 s
event:1000, rtime=0.610001 s
event:1100, rtime=0.608999 s
event:1200, rtime=0.610001 s
event:1300, rtime=0.608999 s
event:1400, rtime=0.594000 s
event:1500, rtime=0.609001 s
event:1600, rtime=0.608999 s
event:1700, rtime=0.610001 s
event:1800, rtime=0.608999 s
event:1900, rtime=0.610001 s
****************************************************************************
**
*Tree    :T         : An example of a ROOT tree
*
*Entries :     2000 : Total =       113596283 bytes  File  Size =  113596283
*
*        :          : Tree compression factor =   1.00
*
****************************************************************************
**
*Br    0 :event     :
*
*Entries :     2000 : Total  Size=  113560552 bytes  File Size  =  113560552
 *
*Baskets :     2000 : Basket Size=      64000 bytes  Compression=   1.00
*
*...........................................................................
.*

2000 events and 113414552 bytes processed.
RealTime=12.282000 seconds, CpuTime=12.000000 seconds
compression level=0, split=0, arg4=1
You write 9.234209 Mbytes/Realtime seconds
You write 9.451213 Mbytes/Cputime seconds

==>Step 2
C:\ROOT\test>Event 2000 0 0 20
event:0, rtime=0.063000 s
event:100, rtime=0.202999 s
event:200, rtime=0.203001 s
event:300, rtime=0.202999 s
event:400, rtime=0.187000 s
event:500, rtime=0.204000 s
event:600, rtime=0.202999 s
event:700, rtime=0.187000 s
event:800, rtime=0.202999 s
event:900, rtime=0.188002 s
event:1000, rtime=0.202999 s
event:1100, rtime=0.203001 s
event:1200, rtime=0.188000 s
event:1300, rtime=0.202999 s
event:1400, rtime=0.203001 s
event:1500, rtime=0.186998 s
event:1600, rtime=0.203001 s
event:1700, rtime=0.204000 s
event:1800, rtime=0.187000 s
event:1900, rtime=0.202999 s

2000 events and 110950706 bytes processed.
RealTime=4.030998 seconds, CpuTime=4.000000 seconds
You read 27.524375 Mbytes/Realtime seconds
You read 27.737677 Mbytes/Cputime seconds

==>Step 3
C:\ROOT\test>Event 2000 0 0 20
event:0, rtime=0.063002 s
event:100, rtime=0.202999 s
event:200, rtime=0.203001 s
event:300, rtime=0.202999 s
event:400, rtime=0.188000 s
event:500, rtime=0.203001 s
event:600, rtime=0.186998 s
event:700, rtime=0.203001 s
event:800, rtime=0.188000 s
event:900, rtime=0.203001 s
event:1000, rtime=0.202999 s
event:1100, rtime=0.188000 s
event:1200, rtime=0.203001 s
event:1300, rtime=0.202999 s
event:1400, rtime=0.187000 s
event:1500, rtime=0.204000 s
event:1600, rtime=0.187000 s
event:1700, rtime=0.202999 s
event:1800, rtime=0.188000 s
event:1900, rtime=0.203001 s

2000 events and 110950706 bytes processed.
RealTime=4.016001 seconds, CpuTime=4.000000 seconds
You read 27.627163 Mbytes/Realtime seconds
You read 27.737677 Mbytes/Cputime seconds


I hope this information is helpful..


Ed


> -----Original Message-----
> From: Rene Brun [mailto:brun@pcbrun.cern.ch]
> Sent: Saturday, February 23, 2002 2:48 AM
> To: Ed Oltman
> Cc: Roottalk@Pcroot. Cern. Ch
> Subject: Re: [ROOT] Optimizing performance in Windows
>
>
> Hi Ed,
>
> You do not mention the size of your file nor the time it takes
> to process the file (Real and Cpu time). To help me to understand
> your problem, could you run the standard ROOT test program
> in $ROOTSYS/test?
> cd to this directory. Build Event (make or nmake)
> then run the following sequence
>   Event 2000 0 0 1
>   Event 2000 0 0 20
>   Event 2000 0 0 20
>
> Send me the output of the 3 steps.
> The first step should produce a 113 Mbytes file
> The following steps read the file.
> To give you an indication, here is the output on my small laptop
> (DELL P III 600 Mhz with 256 Mbytes RAM) running Linux.
>
> ==>step 1
> (pcnotebrun) #[182] Event 2000 0 0 1
> event:0, rtime=0.640000 s
> event:100, rtime=0.720000 s
> event:200, rtime=0.710000 s
> event:300, rtime=0.730000 s
> event:400, rtime=0.710000 s
> event:500, rtime=0.720000 s
> event:600, rtime=0.730000 s
> event:700, rtime=0.730000 s
> event:800, rtime=0.730000 s
> event:900, rtime=0.740000 s
> event:1000, rtime=0.720000 s
> event:1100, rtime=0.740000 s
> event:1200, rtime=0.740000 s
> event:1300, rtime=0.740000 s
> event:1400, rtime=0.740000 s
> event:1500, rtime=0.750000 s
> event:1600, rtime=0.740000 s
> event:1700, rtime=0.740000 s
> event:1800, rtime=0.720000 s
> event:1900, rtime=0.740000 s
> ******************************************************************
> ************
> *Tree    :T         : An example of a ROOT tree
> *
> *Entries :     2000 : Total =       113630291 bytes  File  Size =
> 113630291 *
> *        :          : Tree compression factor =   1.00
> *
> ******************************************************************
> ************
> *Br    0 :event     :
> *
> *Entries :     2000 : Total  Size=  113594552 bytes  File Size  =
> 113594552 *
> *Baskets :     2000 : Basket Size=      64000 bytes  Compression=   1.00
> *
> *.................................................................
> ...........*
>
> 2000 events and 113448552 bytes processed.
> RealTime=18.320000 seconds, CpuTime=14.470000 seconds
> compression level=0, split=0, arg4=1
> You write 6.192607 Mbytes/Realtime seconds
> You write 7.840260 Mbytes/Cputime seconds
>
> ==step 2
> (pcnotebrun) #[183] Event 2000 0 0 20
> event:0, rtime=0.070000 s
> event:100, rtime=0.250000 s
> event:200, rtime=0.250000 s
> event:300, rtime=0.220000 s
> event:400, rtime=0.230000 s
> event:500, rtime=0.230000 s
> event:600, rtime=0.500000 s
> event:700, rtime=1.150000 s
> event:800, rtime=0.240000 s
> event:900, rtime=0.250000 s
> event:1000, rtime=0.440000 s
> event:1100, rtime=0.460000 s
> event:1200, rtime=0.310000 s
> event:1300, rtime=0.260000 s
> event:1400, rtime=0.450000 s
> event:1500, rtime=0.440000 s
> event:1600, rtime=0.240000 s
> event:1700, rtime=0.520000 s
> event:1800, rtime=0.460000 s
> event:1900, rtime=0.510000 s
>
> 2000 events and 113448552 bytes processed.
> RealTime=8.510000 seconds, CpuTime=4.390000 seconds
> You read 13.331205 Mbytes/Realtime seconds
> You read 25.842495 Mbytes/Cputime seconds
>
>
> ==>step 3
> (pcnotebrun) #[185] Event 2000 0 0 20
> event:0, rtime=0.070000 s
> event:100, rtime=0.210000 s
> event:200, rtime=0.190000 s
> event:300, rtime=0.180000 s
> event:400, rtime=0.190000 s
> event:500, rtime=0.190000 s
> event:600, rtime=0.190000 s
> event:700, rtime=0.190000 s
> event:800, rtime=0.180000 s
> event:900, rtime=0.190000 s
> event:1000, rtime=0.190000 s
> event:1100, rtime=0.190000 s
> event:1200, rtime=0.180000 s
> event:1300, rtime=0.190000 s
> event:1400, rtime=0.190000 s
> event:1500, rtime=0.180000 s
> event:1600, rtime=0.200000 s
> event:1700, rtime=0.180000 s
> event:1800, rtime=0.190000 s
> event:1900, rtime=0.190000 s
>
> 2000 events and 113448552 bytes processed.
> RealTime=3.850000 seconds, CpuTime=3.750000 seconds
> You read 29.467157 Mbytes/Realtime seconds
> You read 30.252948 Mbytes/Cputime seconds
>
> Could you also send me the result of TTree::Print on your ntuple?
>
> Rene Brun
>
> On Fri, 22 Feb 2002, Ed Oltman wrote:
>
> > Hello,
> >
> >   I would like to know if anybody has any advice on performance
> optimizing
> > on Win2K - I am using qtRoot "version 3.02/06 25 January 2002".  My
> > application analyzed a root file which consists of a TNtuple.
> The analysis
> > requires spinning though the TNtuple multiple times.  The
> TNtuple has 8-10
> > branches and can contain up to 10's of millions of entries.
> Note: I have a
> > gigabyte of physical memory.  I have 2 questions:
> >
> > 1. When I first spin through the ntuple, it takes a long time -
> I can see
> > that my root process is doing lots of I/O - fine.  when I do a
> .ls I see the
> > input ntuple "in memory" that is, it shows up on a line
> starting w/ OBJ:.
> > However my process memory is about the same size as it was
> before I read in
> > the TNtuple.  When I spin through the TNtuple subsequent times, it takes
> > just as long - I my process once again logs lots of I/O.  How
> can I read in
> > the ntuple in such a way that it lives in memory?
> >
> > 2.  At some point, I generate an output TNtuple.  When I do
> this, my root
> > process starts page faulting like crazy but eventually the
> process memory
> > size creeps up to the required size.  [Note: when I spin through the
> > TNtuple, its quite a bit faster!]  How can I configure root to
> use start out
> > with a larger allocated memory so that I don't need to page fault.
> >
> > Thanks..
> >
> > Ed Oltman
> >
> >
>
>
>
>



This archive was generated by hypermail 2b29 : Sat Jan 04 2003 - 23:50:42 MET