I am comparing the speed of accessing data in a ROOT file with that of a custom database file used in our company. The data file contains intraday price data for 9700 stocks.

The ROOT file is organized as follows. Each stock has one tree, named by the stock's ticker. The tree has a few branches (Date, time, price, size, ...). This ROOT file covers one month only, i.e. each tree contains one month's worth of data for one stock.

The test is a loop which goes through all 9700 stocks and extracts (Date, time, price, size) for a given date in the month. Here is the pseudo-code:

------------------------------------------------------------
for each stock
 ** tree = (TTree*) root_file.Get(stock),
    activate the Date branch,
    find the indexes for the desired date,
    set up containers for the requested data fields (using STL's vector),
    activate branches (date, time, price, size),
    for all the targeted indexes
        tree->GetEntry(i),
        put the data into the container.
    root_file.Delete(stock)
------------------------------------------------------------

The result: it takes 76.6 seconds to finish the above loop. In comparison, the same task takes 23.2 seconds with our internal database format.

The line marked ** in the above loop may be the bottleneck. So I tried to use:

    tree = (TTree *) treeMap[stock]->ReadObj();

where treeMap is an STL map<string, TKey*> which maps the stock ticker to its corresponding TKey structure in memory. This improves the time a little, to 68.7 seconds, which is still about a factor of three slower than accessing our custom database.

------------------------------------------------------------

So, the questions are: Are Get() and ReadObj() very expensive? Could anyone suggest other ways to organize and access the data in ROOT?

Thanks,
HP
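For concreteness, the "find the indexes for the desired date" step in the pseudo-code above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code used for the timings: it assumes the Date branch values for one stock have already been read into a vector, that they are in ascending order, and that dates are encoded as integers such as 20021104. The function name entryRangeForDate is hypothetical and is not part of ROOT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ROOT): given the Date values of one
// stock's tree, already read into memory, return the half-open range of
// entry numbers [first, last) whose date equals `target`.
// Assumption: `dates` is sorted in ascending order (entries stored
// chronologically) and dates are plain integers such as 20021104.
std::pair<std::size_t, std::size_t>
entryRangeForDate(const std::vector<int>& dates, int target)
{
    // Make the sortedness assumption explicit in debug builds.
    assert(std::is_sorted(dates.begin(), dates.end()));

    // Binary search for the first and one-past-last matching entries.
    auto range = std::equal_range(dates.begin(), dates.end(), target);

    return { static_cast<std::size_t>(range.first - dates.begin()),
             static_cast<std::size_t>(range.second - dates.begin()) };
}
```

The returned range would then drive the tree->GetEntry(i) loop, with only the requested branches activated. An empty range (first == last) means the stock has no entries for that date.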