Table of Contents
- ROOT Files
- Accessing a ROOT File
- The Current Directory
- Reading and Writing Objects
- Objects in Memory and Objects on Disk
- Saving Histograms to Disk
- Histograms and the Current Directory
- Saving Objects to Disk
- Saving Collections to Disk
Today, a huge amount of data is stored in files on our PCs and on the Internet. To achieve maximum compression, binary formats are used, so the files cannot simply be opened with a text editor to read their content; a program is needed to decode them. Quite often, the very same program is used both to save and to read the data, but it is also possible (and advisable) for other programs to be able to do the same. This happens when the binary format is public and well documented, but it may also happen with proprietary formats that have become de facto standards. One of the most important problems of the information era is that programs evolve very rapidly, and may also disappear, so that correctly decoding a binary file is not always trivial. This is often the case for old files written in binary formats that are not publicly documented, and it is a really serious risk for formats implemented in custom applications.
As a solution to these issues, ROOT provides a machine-independent, compressed binary file format that includes both the data and its description. It also provides an open-source automated tool that generates the data description (or "dictionary") when saving data, and generates C++ classes corresponding to this description when reading the data back. The dictionary is used to build and load the C++ code that reads the binary objects saved in the ROOT file and stores them in instances of the automatically generated C++ classes.
ROOT files can be structured into "directories", in exactly the same way as your operating system organizes files into folders. ROOT directories may contain other directories, so a ROOT file is more similar to a file system than to an ordinary file.
In addition, future class versions that are not backward compatible will not prevent the user from reading old data. ROOT was written with the needs of high-energy physics experiments in mind: these have a life cycle of 10-20 years, and their data are re-analyzed for many years after their conclusion, despite possible changes in format.
Finally, ROOT was designed with the enormous amount of data produced by high-energy physics experiments in mind. ROOT therefore allows you to save and access terabytes of data in a highly optimized way. Because the total computing time for a given task depends both on the CPU speed and on the data access time (which includes accessing and caching information from main memory and from disk), the quick data access allowed by ROOT effectively improves the performance of data analysis.
TFile objects are used to access ROOT files. To create a TFile object corresponding to a new file called "Event.root" (.root is the preferred extension):
TFile *MyFile = new TFile("Event.root","NEW");
Other options are CREATE (same as NEW), RECREATE (i.e. replace), UPDATE and READ. To check that the file was successfully opened:
if ( MyFile->IsOpen() ) printf("File opened successfully\n");
Once a TFile object has been created it becomes the default file for all I/O. This default is held in the global variable gFile, which can be updated at any time to change the default, e.g.:
gFile = MyFile;
Having finished with a TFile, its Close message should be sent:
MyFile->Close();
or simply delete the TFile object:
delete MyFile;
ROOT will automatically close any files still open when the session ends.
When you create a TFile object, it becomes the current directory. Therefore, the last file to be opened is always the current directory. To check your current directory you can type in the interpreter:
root[] gDirectory->pwd()
Rint:/
This means that the current directory is the ROOT session (Rint). When you create a file and repeat the command, the file is now the current directory.
root[] TFile f1("AFile1.root");
root[] gDirectory->pwd()
AFile1.root:/
If you create two files, the last becomes the current directory.
root[] TFile f2("AFile2.root");
root[] gDirectory->pwd()
AFile2.root:/
To switch back to the first file, or to switch to any file in general, you can use the TDirectory::cd method. The next command changes the current directory back to the first file.
root[] f1.cd();
root[] gDirectory->pwd()
AFile1.root:/
Note that even if you open the file in "READ" mode, it still becomes the current directory. CINT also offers shortcuts for gDirectory->pwd() and gDirectory->ls(). You can type:
root[] .pwd
AFile1.root:/
root[] .ls
TFile** AFile1.root
TFile* AFile1.root
To return to the home directory where we were before:
root[] gROOT->cd()
(unsigned char)1
root[] gROOT->pwd()
Rint:/
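The rule that the most recently opened file becomes the current directory, and that cd() switches back to an earlier one, can be pictured with a small standalone model. This is a sketch in plain C++, not ROOT code: ToyDir, path(), current() and replaySession() are hypothetical names invented for the illustration, and the model ignores everything about a directory except its name.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-in for a ROOT directory. This is NOT the TDirectory/TFile API,
// only a model of the "current directory" rule described in the text.
class ToyDir {
public:
    // Opening a directory makes it the current one.
    explicit ToyDir(std::string name) : name_(std::move(name)) { current_ = this; }

    // Toy simplification: if the current directory goes away, nothing is current.
    ~ToyDir() { if (current_ == this) current_ = nullptr; }

    ToyDir(const ToyDir&) = delete;
    ToyDir& operator=(const ToyDir&) = delete;

    // Explicitly make this directory the current one.
    void cd() { current_ = this; }

    // Path in the "name:/" form that pwd() prints in the sessions above.
    std::string path() const { return name_ + ":/"; }

    // The directory that is current right now (may be null in this toy).
    static ToyDir* current() { return current_; }

private:
    std::string name_;
    static ToyDir* current_;
};

ToyDir* ToyDir::current_ = nullptr;

// Replays the session from the text and returns the final current path.
std::string replaySession() {
    ToyDir session("Rint");     // the interactive session itself
    ToyDir f1("AFile1.root");   // current is now AFile1.root
    ToyDir f2("AFile2.root");   // the last file opened wins
    assert(ToyDir::current()->path() == "AFile2.root:/");
    f1.cd();                    // switch back to the first file
    return ToyDir::current()->path();
}
```

Calling replaySession() walks through the same steps as the interpreter session: two files are opened in turn, and cd() on the first one makes it current again.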
Once a file has been opened, objects can be written by sending them their Write message, e.g.:
MyObject->Write("MyObject_1");
writes a copy of MyObject to the current directory of the current file with the named key "MyObject_1". If MyObject does not inherit from TObject, it has no Write method; instead you can do
gDirectory->WriteObject(MyObject,"MyObject_1");
A subsequent call to
MyFile->GetObject("MyObject_1",MyObject);
can be used to read it. If the file has a key named "MyObject_1" and it contains an object whose type is the same as the type pointed to by MyObject, or inherits from this type, MyObject will be updated with the address of the object containing the information on disk. Otherwise MyObject is set to 0. Sending a TFile its Write message:
MyFile->Write();
results in all objects currently attached to the TFile or any of its sub-directories being written out. Each is asked its name (a call to GetName), which is used as its key name. For objects that have no name, the object's class name is used (with a version number to ensure the key is unique).
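The rule for reading an object back (the pointer is set only when the key exists and the stored type is the requested type or derives from it, and is set to 0 otherwise) can be made concrete with a standalone model. The sketch below is plain C++ rather than ROOT: ToyFile, put(), getObject() and the small class hierarchy are hypothetical, and unlike ROOT the toy hands back a pointer to the stored object instead of building a new one from disk.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class hierarchy, used only for this illustration.
struct Stored { virtual ~Stored() = default; };
struct Hist : Stored {};
struct Profile : Hist {};
struct Ntuple : Stored {};

// Toy "file": key name -> stored object. Not the TFile API.
class ToyFile {
public:
    void put(const std::string& key, std::unique_ptr<Stored> obj) {
        keys_[key] = std::move(obj);
    }

    // Sets ptr to the stored object if the key exists and the object's type
    // is T or derives from T; otherwise sets ptr to nullptr.
    template <typename T>
    void getObject(const std::string& key, T*& ptr) const {
        auto it = keys_.find(key);
        ptr = (it == keys_.end()) ? nullptr
                                  : dynamic_cast<T*>(it->second.get());
    }

private:
    std::map<std::string, std::unique_ptr<Stored>> keys_;
};

// Returns true when the three cases from the text behave as described.
bool demoLookup() {
    ToyFile file;
    file.put("MyObject_1", std::unique_ptr<Stored>(new Profile));

    Hist* asHist = nullptr;
    file.getObject("MyObject_1", asHist);    // Profile derives from Hist
    assert(asHist != nullptr);

    Ntuple placeholder;
    Ntuple* asNtuple = &placeholder;
    file.getObject("MyObject_1", asNtuple);  // unrelated type: reset to null

    Hist* missing = asHist;
    file.getObject("NoSuchKey", missing);    // unknown key: reset to null

    return asNtuple == nullptr && missing == nullptr;
}
```

The practical consequence is the same as in the text: always test the pointer after the call, because a type mismatch or a missing key leaves it null rather than raising an error.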
The TFile::ls() method has an option to list the objects on disk ("-d") or the objects in memory ("-m"). If no option is given it lists both, first the objects in memory, then the objects on disk. For example:
root[] TFile *f = new TFile("hsimple.root");
root[] gDirectory->ls("-m")
TFile** hsimple.root
TFile* hsimple.root
Remember that gDirectory is the current directory and at this time is equivalent to "f". The listing correctly shows that no objects are in memory. The next command lists the objects on disk in the current directory.
root[] gDirectory->ls("-d")
TFile** hsimple.root
TFile* hsimple.root
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
To bring an object from disk into memory, we have to use it or "Get" it explicitly. When we use the object, ROOT gets it for us. Any reference to hprof will read it from the file. For example drawing hprof will read it from the file and create an object in memory. Here we draw the profile histogram, and then we list the contents.
root[] hprof->Draw()
<TCanvas::MakeDefCanvas>: created default TCanvas with name c1
root[] f->ls()
TFile** hsimple.root
TFile* hsimple.root
OBJ: TProfile hprof Profile of pz versus px : 0
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
We now see a new line that starts with OBJ. This means that an object of class TProfile, called hprof, has been added in memory to this directory. This new hprof in memory is independent from the hprof on disk. If we make changes to the hprof in memory, they are not propagated to the hprof on disk. A new version of hprof will be saved once we call Write. You may wonder why hprof is added to the objects in the current directory. hprof is of the class TProfile, which inherits from TH1D, which inherits from TH1. TH1 is the basic histogram. All histograms and trees are created in the current directory (also see "Histograms and the Current Directory"). The reference to "all histograms" includes objects of any class descending directly or indirectly from TH1. Hence, our TProfile hprof is created in the current directory f.
There was another side effect when we called the TH1::Draw method. CINT printed this statement:
<TCanvas::MakeDefCanvas>: created default TCanvas with name c1
It tells us that a TCanvas was created and named c1. This is where ROOT is being nice: it creates a canvas for drawing the histogram if no canvas was named in the draw command and no active canvas exists. The newly created canvas, however, is NOT listed in the contents of the current directory. Why is that? The canvas is not added to the current directory, because by default ONLY histograms and trees are added to the object list of the current directory. Actually, TEventList objects are also added to the current directory, but at this time, we don't have to worry about those. If the canvas is not in the current directory then where is it? Because it is a canvas, it was added to the list of canvases. This list can be obtained by the command gROOT->GetListOfCanvases()->ls(). The ls() will print the contents of the list. In our list, we have one canvas called c1. It has a TFrame, a TProfile, and a TPaveStats.
root[] gROOT->GetListOfCanvases()->ls()
Canvas Name=c1 Title=c1 Option=
TCanvas fXlowNDC=0 fYlowNDC=0 fWNDC=1 fHNDC=1 Name= c1 Title= c1 Option=
TFrame X1= -4.000000 Y1=0.000000 X2=4.000000 Y2=19.384882
OBJ: TProfile hprof Profile of pz versus px : 0
TPaveText X1=-4.900000 Y1=20.475282 X2=-0.950000 Y2=21.686837 title
TPaveStats X1=2.800000 Y1=17.446395 X2=4.800000 Y2=21.323371 stats
Let's proceed with our example and draw one more histogram, and we see one more OBJ entry.
root[] hpx->Draw()
root[] f->ls()
TFile** hsimple.root
TFile* hsimple.root
OBJ: TProfile hprof Profile of pz versus px : 0
OBJ: TH1F hpx This is the px distribution : 0
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
TFile::ls() loops over the list of objects in memory and the list of objects on disk. In both cases, it calls the ls() method of each object. The implementation of the ls method is specific to the class of the object. All of these objects are descendants of TObject and inherit the TObject::ls() implementation. The histogram classes are descendants of TNamed, which in turn is a descendant of TObject. In this case, TNamed::ls() is executed, and it prints the name of the class, and the name and title of the object. Each directory keeps a list of its objects in memory. You can get this list with TDirectory::GetList(). To see the contents of the in-memory list you can do:
root[] f->GetList()->ls()
OBJ: TProfile hprof Profile of pz versus px : 0
OBJ: TH1F hpx This is the px distribution : 0
Since the file f is the current directory (gDirectory), this will yield the same result:
root[] gDirectory->GetList()->ls()
OBJ: TProfile hprof Profile of pz versus px : 0
OBJ: TH1F hpx This is the px distribution : 0
At this time, the objects in memory (OBJ) are identical to the objects on disk (KEY). Let's change that by adding a fill to the hpx we have in memory, for example:
root[] hpx->Fill(0)
Now the hpx in memory is different from the histogram (hpx) on disk. Only one version of the object can be in memory; on disk, however, we can store multiple versions of the object. The TFile::Write method will write the list of objects in the current directory to disk. It will add a new version of hpx and hprof.
root[] f->Write()
root[] f->ls()
TFile** hsimple.root
TFile* hsimple.root
OBJ: TProfile hprof Profile of pz versus px : 0
OBJ: TH1F hpx This is the px distribution : 0
KEY: TH1F hpx;2 This is the px distribution
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;2 Profile of pz versus px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
The TFile::Write method wrote the entire list of objects in the current directory to the file. You can see that it added two new keys, hpx;2 and hprof;2, to the file. Unlike memory, a file is capable of storing multiple objects with the same name. Their cycle number, the number after the semicolon, differentiates objects on disk with the same name. If you wanted to save only hpx to the file, but not the entire list of objects, you could use the TH1::Write method of hpx:
root[] hpx->Write()
A call to obj->Write without any parameters will call obj->GetName() to find the name of the object and use it to create a key with the same name. You can specify a new name by giving it as a parameter to the Write method.
root[] hpx->Write("newName")
If you want to re-write the same object, with the same key, use the overwrite option.
root[] hpx->Write("",TObject::kOverwrite)
If you give a new name and use kOverwrite, the object on disk with the matching name is overwritten if such an object exists. If not, a new object with the new name will be created.
root[] hpx->Write("newName",TObject::kOverwrite)
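The cycle numbering described above (each plain write of a name adds a key with the next cycle number, while names that are not written keep their existing cycles) can be pictured with a small standalone model. This is a sketch in plain C++, not ROOT's implementation: ToyKeys, write(), writeAll() and replayWrite() are hypothetical names, and the overwrite option is deliberately not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model of the keys in a file: for each name, the highest cycle written
// so far. NOT ROOT's implementation; overwrite options are not modelled.
class ToyKeys {
public:
    // Writes one object under `name` and returns the new "name;cycle" key.
    std::string write(const std::string& name) {
        int cycle = ++highest_[name];  // the first write of a name gives cycle 1
        return name + ";" + std::to_string(cycle);
    }

    // Writes every in-memory object, as a whole-file write would.
    std::vector<std::string> writeAll(const std::vector<std::string>& inMemory) {
        std::vector<std::string> written;
        for (const std::string& name : inMemory) written.push_back(write(name));
        return written;
    }

    // Highest cycle on "disk" for `name`, or 0 if there is no such key.
    int highestCycle(const std::string& name) const {
        auto it = highest_.find(name);
        return it == highest_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> highest_;
};

// Replays the example: four keys at cycle 1, then hprof and hpx (the only
// objects in memory) are written again.
std::vector<std::string> replayWrite() {
    ToyKeys file;
    for (const char* name : {"hpx", "hpxpy", "hprof", "ntuple"}) file.write(name);
    std::vector<std::string> added = file.writeAll({"hprof", "hpx"});
    assert(file.highestCycle("hpxpy") == 1);  // not in memory, so untouched
    return added;
}
```

In this model replayWrite() returns the two keys that the session listing shows as new, hprof;2 and hpx;2, while hpxpy and ntuple stay at cycle 1 because they were never brought into memory.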
The Write method did not affect the objects in memory at all. However, if the file is closed, the directory is emptied and the objects on the list are deleted.
root[] f->Close()
root[] f->ls()
TFile** hsimple.root
TFile* hsimple.root
In the code snippet above, you can see that the directory is now empty. If you followed along so far, you can see that c1, which was displaying hpx, is now blank. Furthermore, hpx no longer exists.
root[] hpx->Draw()
Error: No symbol hpx in current scope
This is important to remember: do not close the file until you are done with the objects, or any attempt to reference the objects will fail.
When a histogram is created, it is added by default to the list of objects in the current directory. You can get the list of histograms in a directory and retrieve a pointer to a specific histogram.
TH1F *h = (TH1F*)gDirectory->Get("myHist");
// or
TH1F *h = (TH1F*)gDirectory->GetList()->FindObject("myHist");
The method TDirectory::GetList() returns a TList of objects in the directory. You can change the directory of a histogram with the SetDirectory method.
h->SetDirectory(newDir);
If the parameter is 0, the histogram is no longer associated with a directory.
h->SetDirectory(0);
Once a histogram is removed from the directory, it will no longer be deleted when the directory is closed. It is now your responsibility to delete this histogram object once you are finished with it. To change the default that automatically adds the histogram to the current directory, you can call the static function:
TH1::AddDirectory(kFALSE);
In this case, you will need to do all the bookkeeping for all the created histograms.
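The ownership rules described above (a directory deletes the objects attached to it when it is closed, while an object removed from the directory becomes the caller's responsibility) can be sketched with a standalone model. The code below is plain C++, not ROOT: ToyDirectory, ToyHist, create(), detach() and survivorAfterClose() are hypothetical names, and std::unique_ptr stands in for the manual delete that the text asks of you.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical histogram type, for illustration only.
struct ToyHist {
    std::string name;
};

// Toy directory that owns the objects attached to it. Not the TDirectory API.
class ToyDirectory {
public:
    // Creates a histogram attached to (and owned by) this directory.
    ToyHist* create(const std::string& name) {
        objects_.push_back(std::unique_ptr<ToyHist>(new ToyHist{name}));
        return objects_.back().get();
    }

    // Looks up an in-memory object by name; nullptr if it is not attached.
    ToyHist* find(const std::string& name) const {
        for (const auto& obj : objects_)
            if (obj->name == name) return obj.get();
        return nullptr;
    }

    // Detaches an object: ownership moves to the caller.
    std::unique_ptr<ToyHist> detach(const std::string& name) {
        for (auto it = objects_.begin(); it != objects_.end(); ++it) {
            if ((*it)->name == name) {
                std::unique_ptr<ToyHist> released = std::move(*it);
                objects_.erase(it);
                return released;
            }
        }
        return nullptr;
    }

    // Closing deletes every object that is still attached.
    void close() { objects_.clear(); }

private:
    std::vector<std::unique_ptr<ToyHist>> objects_;
};

// Detaches one histogram, closes the directory, and reports which one survived.
std::string survivorAfterClose() {
    ToyDirectory dir;
    dir.create("hpx");
    dir.create("hprof");
    std::unique_ptr<ToyHist> mine = dir.detach("hprof");  // now ours to manage
    dir.close();                                          // hpx is deleted here
    assert(dir.find("hpx") == nullptr);
    return mine ? mine->name : std::string();
}
```

In the model, hprof outlives the directory only because it was detached before close(); hpx, which stayed attached, is gone, which mirrors the failed hpx->Draw() after f->Close() earlier in the text.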
In addition to histograms and trees, you can save any object in a ROOT file. For example, to save a canvas to the ROOT file you can use either TObject::Write() or TDirectory::WriteTObject(). The call:
c1->Write();
is equivalent to:
f->WriteTObject(c1);
For objects that do not inherit from TObject use (ptr and the key name are placeholders):
f->WriteObject(ptr,"nameofobject");
For example:
root[] TFile *f = new TFile("hsimple.root","UPDATE")
root[] hpx->Draw()
<TCanvas::MakeDefCanvas>: created default TCanvas with name c1
root[] c1->Write()
root[] f->ls()
TFile** hsimple.root
TFile* hsimple.root
OBJ: TH1F hpx This is the px distribution : 0
KEY: TH1F hpx;2 This is the px distribution
KEY: TH1F hpx;1 This is the px distribution
KEY: TH2F hpxpy;1 py vs px
KEY: TProfile hprof;2 Profile of pz versus px
KEY: TProfile hprof;1 Profile of pz versus px
KEY: TNtuple ntuple;1 Demo ntuple
KEY: TCanvas c1;1 c1
STL collection classes can be written to a TFile in exactly the same way as other objects that do not inherit from TObject. ROOT collection classes are different: they inherit from TCollection and hence inherit the TCollection::Write() method. When you call TCollection::Write(), each object in the container is written individually into its own key in the file. To write all objects into one key, you can specify the name of the key and use the option TObject::kSingleKey. For example:
root[] TList * list = new TList;
root[] TNamed * n1, * n2;
root[] n1 = new TNamed("name1","title1");
root[] n2 = new TNamed("name2","title2");
root[] list->Add(n1);
root[] list->Add(n2);
root[] gFile->WriteObject(list,"list",TObject::kSingleKey)
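The difference between the two ways of writing a ROOT collection (one key per element by default, or a single key holding the whole collection) can be made concrete with a standalone model. This is a sketch in plain C++, not ROOT: KeyTable, writeCollection() and demoCollectionKeys() are hypothetical names, and a key is reduced to a name plus the names of the elements stored under it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy key table: key name -> names of the elements stored under that key.
using KeyTable = std::map<std::string, std::vector<std::string>>;

// Models writing a collection of named elements.
//   singleKey == false: every element gets its own key, named after itself.
//   singleKey == true : one key, named keyName, holds the whole collection.
KeyTable writeCollection(const std::vector<std::string>& elements,
                         const std::string& keyName,
                         bool singleKey) {
    KeyTable keys;
    if (singleKey) {
        keys[keyName] = elements;
    } else {
        for (const std::string& element : elements)
            keys[element] = std::vector<std::string>(1, element);
    }
    return keys;
}

// Replays the example list (name1, name2) both ways and returns the number
// of keys produced by the single-key write.
int demoCollectionKeys() {
    const std::vector<std::string> list = {"name1", "name2"};
    KeyTable separate = writeCollection(list, "list", false);
    KeyTable single = writeCollection(list, "list", true);
    assert(separate.size() == 2);  // keys "name1" and "name2"
    return static_cast<int>(single.size());
}
```

With the single-key form the collection is read back as a whole under the name "list"; with the default form each element must be fetched by its own name.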