Hi,
> I have more details about the memory problem/leak.
I don't see a leak - a leak would mean that the memory is lost. You simply ask the system to allocate more memory that it has.
> 33k * 33k -> Works; almost no memory allocated (NOTE: size in memory
> is 2**15 * 2**15 * sizeof(double) > 4 GiB)
What you observe is indeed an integer overflow: TH1 stores the number of cells in a member of type Int_t. I've added Lorenzo who is maintaining it.
Anyway - I believe TH2 are not the optimal solution for your case. A few more questions:
> One more detail, the TH2I matrices are build symmetric(or if the
> matrix is not build symmetric we have to produce ProjectionY too).
So - are they symmetrical or not? Do you have rare cases where they are not symmetrical? Or are they "almost" symmetrical? I don't understand how to read your statement...
> ProjectionX('..",p_low,p_high) selecting a peak
>
> and two
> ProjectionX('..',b1_low,b1_high) and ProjectionX('..',b2_low,b2_high)
How do you determine p_low, p_high? Simply by subtracting / adding a predetermined constant from the peak center, i.e. is the constant fixed for the whole matrix? Or do you adapt the width e.g. based on the distance to the nearest peak?
Cheers, Axel.
On 2009-11-07 14:00, Plamen Boutachkov wrote:
> Hi Axel,
>
> I have more details about the memory problem/leak.
> Andrey Blazhev and Norbert Braun from Cologne, who are part of our group
> working on the analysis looked into the allocated memory from large TH2D
> histogram on a 64bit machine with 24GB of RAM. They reported:
> 20k * 30k -> 900 MB allocated
> 30k * 30k -> Gives the known exception
> 33k * 33k -> Works; almost no memory allocated (NOTE: size in memory is 2**15 * 2**15 * sizeof(double) > 4 GiB)
> 50k * 30k -> similar to the 20k * 30k case which means that the Integer value is wrapped around.
>
> In the analysis we had problems with crashes then many large histograms
> were created in memory. As a solution we reduced the number of large
> histograms until we had no bad_alloc error. Nevertheless we observed
> "fake" peaks in some of our histograms. They were high peaks with width
> of one channel (which can not be created from our detectors).
>
> I implement THnSparse for all 2D histograms in our analysis. Now the
> output root file which contains all histograms is 4 times larger compare
> to the output file which was generated when TH2I were used, supporting
> the suggestion that we had a memory leak when creating many TH2I
> histograms before.
>
> The question is how can we ensure that we still do not have a memory
> leak? (That all peaks in the histograms are real).
>
> I can track the memory allocated for histograms looking at how many
> channels we allocated in TH1I and with GetSparseFractionMem for the
> THnSparse histograms. Can this be a solution?
>
> On you question: we first project the full range
> ProjectionX('..',-1,-1)
> Then for each peak in the spectrum we make 3 ProjectionX:
>
> ProjectionX('..",p_low,p_high) selecting a peak
>
> and two
> ProjectionX('..',b1_low,b1_high) and ProjectionX('..',b2_low,b2_high)
> selecting the background around the peak.
>
> Then we subtract ProjectionX('..",p_low,p_high) -
> w1*ProjectionX('..',b1_low,b1_high)-w2*ProjectionX('..',b2_low,b2_high)
>
> from the peaks we see in the obtained spectra we decide were to gate
> next and repeat the above procedure. As there are many possible
> coincidences we need look at, the time it takes to generate the final
> spectra is of importance.
>
> One more detail, the TH2I matrices are build symmetric(or if the matrix
> is not build symmetric we have to produce ProjectionY too).
>
> Considering this is a TTree or THnSparse histogram better?
>
> Cheers,
> Plamen
>
> On Fri, 2009-11-06 at 16:15 +0100, Axel Naumann wrote:
>> Hi Plamen,
>>
>> thanks, now we have a clearer idea how to help :-) Follow-up question:
>> do you care about the full y range when scanning for a coincidence with
>> x, or only for the n closest bins?
>>
>> I can suggest two options: you can use a THnSparse, as suggested by
>> David. That one only needs memory for *filled* bins. Or you can use a
>> TTree, which will allow you to keep only a relevant part of the data in
>> memory. We can discuss its layout if you want; you might want to create
>> branches for e.g. all x bins.
>>
>> Let us know!
>>
>> Cheers, Axel.
>>
>> Plamen Boutachkov wrote on 11/06/2009 03:50 PM:
>>> Hi Axel,
>>>
>>> We are analyzing an experiment in which we study the gamma decay of
>>> unstable nuclei. For each nucleus we record the gamma rays emitted in a
>>> coincidence with a hyper pure germanium detector array.
>>>
>>> We fill the TH2I(5001,0.,5000.,5001,0.,5000.) histograms with all
>>> possible pairs (E_i, E_j) of observed gamma rays energies in an event of
>>> interest. Latter on we make multiple gates on one of the axis of the
>>> matrix and look for the gamma rays in coincidence.
>>>
>>> There are a couple of these TH2I histograms per nucleus (created with
>>> different conditions) and about 10 nuclei which we would like to study.
>>> It takes about 1h to sort through all the data in order to produce the
>>> TH2I histograms.
>>>
>>> We need the high number of channels to conserve the resolution of the
>>> gamma detectors. On other hand there are many channels with few or zero
>>> counts in the TH2I histograms.
>>>
>>> Cheers,
>>> Plamen
>>>
>>>
>>>
>>> On Fri, 2009-11-06 at 15:03 +0100, Axel Naumann wrote:
>>>> Hi Plamen,
>>>>
>>>> reaching these limits usually means that your approach is wrong. What
>>>> are you trying to do?
>>>>
>>>> Cheers, Axel.
>>>>
>>>> Plamen Boutachkov wrote on 11/06/2009 02:57 PM:
>>>>> Hi David,
>>>>>
>>>>> I do not know enough about the root internal memory management. I was
>>>>> hoping that the swap of the machines will help.
>>>>>
>>>>> But then if the limit is the physical RAM of the machine why I can
>>>>> create
>>>>> TH2D *t = new TH2D("test","t",70000,0.,1.,70000,0.,1.)
>>>>> 70000*70000* 8 B = 36GB
>>>>>
>>>>> On the same machine 32 bit 4GB RAM I tried:
>>>>> TH2I *t = new TH2I("test","t",100000,0.,1.,100000,0.,1.)
>>>>> a 37GB histogram, it worked.
>>>>>
>>>>> Is there a difference how one and many histograms are handled in
>>>>> memory?
>>>>> Should one track the total memory spend for histograms and make sure it
>>>>> is bellow a given number?
>>>>>
>>>>> Thank you for the suggestion of THnSparse.
>>>>>
>>>>> Plamen
>>>>>
>>>>> On Fri, 2009-11-06 at 14:10 +0100, David Gonzalez Maline wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Just some simple maths would help.
>>>>>>
>>>>>> 5001 xbins * 5001 ybins * 4bytes each bin * 40 histograms ~= 3.72 GB of
>>>>>> memory...
>>>>>>
>>>>>> Obviously in the second machine you don't get such an error until you
>>>>>> change the number of bins to a much bigger number. Consider using
>>>>>> THnSparse instead and/or reducing the number of bins.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>
>>>>>> Plamen Boutachkov wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I need to create a large number of TH2I, this leads to a crash. Here
>>>>>>> is an example script which reproduces the problem:
>>>>>>>
>>>>>>> {
>>>>>>> Int_t nHistograms = 40 ;
>>>>>>> TH2I *h[nHistograms] ;
>>>>>>>
>>>>>>> for(Int_t nIndex = 0; nIndex < nHistograms ; ++nIndex ) {
>>>>>>> TString sHistogramNname = Form("h%d",nIndex);
>>>>>>> cout<<"Histogram Number:"<< nIndex << endl;
>>>>>>>
>>>>>>> h[nIndex] = new TH2I(sHistogramNname.Data(),sHistogramNname.Data()
>>>>>>> ,5001,0.,5000.
>>>>>>> ,5001,0.,5000.);
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> executing the script gives:
>>>>>>>
>>>>>>> Histogram Number:0
>>>>>>> ...
>>>>>>> Histogram Number:29
>>>>>>> Histogram Number:30
>>>>>>> Error: Symbol G__exception is not defined in current scope
>>>>>>> scripts/test.C:11:
>>>>>>> Error: type G__exception not defined FILE:/tmp/test.C LINE:11
>>>>>>> *** Interpreter error recovered ***
>>>>>>>
>>>>>>> If I include the above loop in a stand alone program the error is:
>>>>>>> Histogram Number:0
>>>>>>> ...
>>>>>>> Histogram Number:30
>>>>>>> terminate called after throwing an instance of 'std::bad_alloc'
>>>>>>> what(): std::bad_alloc
>>>>>>> Aborted
>>>>>>>
>>>>>>>
>>>>>>> I can produce a similar error by executing:
>>>>>>>
>>>>>>> TH2D *t = new TH2D("test","t",30000,0.,1.,30000,0.,1.)
>>>>>>> Error: Symbol G__exception is not defined in current scope (tmpfile):1:
>>>>>>> Error: type G__exception not defined FILE:(tmpfile) LINE:1
>>>>>>>
>>>>>>> or
>>>>>>>
>>>>>>> root [0] TH2D *t = new TH2D("test","t",40000,0.,1.,40000,0.,1.)
>>>>>>> Error: Symbol G__exception is not defined in current scope (tmpfile):1:
>>>>>>> Error: type G__exception not defined FILE:(tmpfile) LINE:1
>>>>>>>
>>>>>>> On other hand creating a larger histograms:
>>>>>>>
>>>>>>> root [0] TH2D *t = new TH2D("test","t",50000,0.,1.,50000,0.,1.)
>>>>>>> or
>>>>>>> root [0] TH2D *t = new TH2D("test","t",60000,0.,1.,60000,0.,1.)
>>>>>>> or
>>>>>>> root [0] TH2D *t = new TH2D("test","t",70000,0.,1.,70000,0.,1.)
>>>>>>> works.
>>>>>>>
>>>>>>> I am running ROOT 5.24/00b on a 32bit machine with 3GB of memory.
>>>>>>>
>>>>>>> I tried the same on a 64bit machine, 24GB of memory, root Version
>>>>>>> 5.18/00b
>>>>>>> executing
>>>>>>>
>>>>>>> root [0] TH2D *t = new TH2D("test","t",30000,0.,1.,30000,0.,1.)
>>>>>>>
>>>>>>> does not give an error, but
>>>>>>>
>>>>>>> root [0] TH2D *t = new TH2D("test","t",40000,0.,1.,40000,0.,1.)
>>>>>>> Error: Symbol G__exception is not defined in current scope (tmpfile):1:
>>>>>>> Error: type G__exception not defined FILE:(tmpfile) LINE:1
>>>>>>>
>>>>>>> Regards,
>>>>>>> Plamen
>>>>>>>
>>>>>>> --
>>>>>>> -----------------------------
>>>>>>> Dr. Plamen Boutachkov
>>>>>>> Kernstruktur, GSI Darmstadt
>>>>>>> Planckstr. 1
>>>>>>> D-64291 Darmstadt
>>>>>>> Germany
>>>>>>>
>>>>>>> Tel: +49 6159-71-2436
>>>>>>> Fax: +49-6159-71-2898
>>>>>>> -----------------------------
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>>>
>
>
Received on Sat Nov 07 2009 - 16:42:12 CET
This archive was generated by hypermail 2.2.0 : Sun Nov 08 2009 - 05:50:14 CET