At this point, ROOT has been around for more than a quarter of a century – and TTree for just as long. As you might imagine, the computing landscape today looks vastly different from 25 years ago. Just to set the scene: when ROOT was first released, there was no C++ standard yet, and parallel (let alone distributed) computing really wasn’t a thing. On the hardware side, modern storage technologies such as SSDs and object stores were still unheard of, and let’s not forget the evolution of networking technologies! Naturally, TTree wasn’t designed and implemented with these things in mind. Of course, over the years a lot of effort has been put into improving the performance and stability of TTree to make it as compatible with modern computing practices as possible. However, there are limits to what can be achieved here, especially given that backwards- and forwards-compatibility are two major requirements for ROOT’s I/O system. With the High-Luminosity LHC on the horizon, which is expected to produce 90% of the total amount of LHC data [2], we need to think about more optimized ways to store physics data. The challenge is that this data is unique in the sense that events (or, in computer science terms, “entries” or “rows”) are statistically independent of each other. At the same time, one event typically contains many (complex) data structures, of which we often only need a small subset at a time, and we found that standard technologies are not well-tuned for this type of data storage [3]. That is why we decided to combine our years of experience with TTree and various industry best practices and invest in the next generation of high-energy physics data storage. Enter RNTuple!
For the past four years, a lot of effort has been put into making RNTuple the best it can be. We are working closely with the experiments to make sure that RNTuple can support their data models across all relevant stages in the production pipeline. Simultaneously, we want to make sure that it is as optimized as possible. This means making sure that the data stored in RNTuple is as compact as possible, and at the same time coming up with ways in which we can make reading and writing RNTuples to and from memory as fast as possible. To give you an idea of where we’re currently at, the plot below shows the average on-disk event size for ATLAS’s DAOD_PHYS data model [4], comparing TTree and RNTuple. With RNTuple, we could potentially save 20-35% of storage space, and in turn reduce the consumed network bandwidth when reading the data from a remote location. When we’re talking about exabytes of event data, this is quite significant!
Besides storage efficiency, we’re also seeing very promising results when it comes to read throughput. The two plots below show the number of events processed per second for two different types of tasks, comparing ATLAS DAOD_PHYSLITE data sets stored in TTree and RNTuple (stored on an SSD). As you can see, RNTuple is remarkably faster than TTree, and similar observations are made for other data sets [1], [5].
Beyond performance, we have also been working hard on RNTuple’s interface and supported features. This includes compatibility with RDataFrame, being able to read and write C++ STL types as well as user-defined types and various other features to support existing experiment frameworks.
Yes! To be able to read and write RNTuples, the first thing you’ll need is a ROOT installation that has the ROOT 7 experimental features enabled. This is the case for the default LXPLUS installation, which runs ROOT’s (at the time of writing) latest release, 6.30.02! If you are running ROOT in a different way, you can easily check whether ROOT 7 is enabled for your installation by running root-config --has-root7 in your terminal. If this returns yes, you’re all set! If you get a no, you will need to use a different installation of ROOT that has these features enabled. Check out the ROOT installation page to get it. We strongly recommend using the most recent release in order to get the latest and greatest from RNTuple.
Now, on to the fun part: using RNTuple! Of course, you could write a new RNTuple completely from scratch, using fields and data that you come up with. This is done using the RNTupleWriter interface. Reading an RNTuple is then naturally done through the RNTupleReader. To get an idea of what this looks like in practice, check out for example this tutorial.
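To make this concrete, here is a minimal sketch of writing and then reading back a single-field RNTuple. It is based on the experimental API at the time of writing (the ROOT::Experimental namespace and header locations may change in later releases); the tutorial linked above remains the authoritative example.

#include <ROOT/RNTuple.hxx>
#include <ROOT/RNTupleModel.hxx>
#include <iostream>

void rntuple_sketch()
{
   using namespace ROOT::Experimental;
   {
      // Define the data model: one field per column
      auto model = RNTupleModel::Create();
      auto fldPt = model->MakeField<float>("pt");
      // Write a few entries; the data set is committed when the writer goes out of scope
      auto writer = RNTupleWriter::Recreate(std::move(model), "Events", "my_first_rntuple.root");
      for (int i = 0; i < 10; ++i) {
         *fldPt = 0.5f * i;
         writer->Fill();
      }
   }
   // Read the values back through a typed view
   auto reader = RNTupleReader::Open("Events", "my_first_rntuple.root");
   auto viewPt = reader->GetView<float>("pt");
   for (auto entryId : reader->GetEntryRange())
      std::cout << viewPt(entryId) << std::endl;
}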
Of course, it would be more interesting to try out RNTuple with real data, for
example with data from an analysis ntuple that is currently stored as a TTree.
Well, good news! RNTuple also comes with an RNTupleImporter
class that allows you to automatically convert your TTrees to RNTuples. This
can be as simple as executing the following two lines in the ROOT prompt. The
input file containing the source TTree is read remotely, meaning you can
directly copy-paste these lines into your ROOT prompt. Of course, it’s entirely
possible to use your own existing TTrees.
root [0] auto importer = ROOT::Experimental::RNTupleImporter::Create(
"http://root.cern/files/HiggsTauTauReduced/GluGluToHToTauTau.root",
"Events",
"my_rntuple.root")
root [1] importer->Import()
This will convert your TTree (called Events here) into an RNTuple, also called Events, and write it to my_rntuple.root. Easy enough, but maybe you want more control over this newly created RNTuple. For example, you might want to change its name, or set the compression settings to something other than the default. This (and more) can all be tweaked! Check out the reference or this tutorial to see what options are possible.
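To illustrate, here is a sketch of what such tweaking could look like. The setters used here (SetNTupleName() and SetWriteOptions()) and the compression value are assumptions based on the reference linked above – double-check the exact names and options against the documentation of your ROOT version.

#include <ROOT/RNTupleImporter.hxx>

void import_tuned()
{
   auto importer = ROOT::Experimental::RNTupleImporter::Create(
      "http://root.cern/files/HiggsTauTauReduced/GluGluToHToTauTau.root",
      "Events",
      "my_rntuple.root");

   // Assumed setter: store the RNTuple under a name different from the source TTree
   importer->SetNTupleName("MyEvents");

   // Assumed setter: pick a non-default compression (505 = ZSTD, level 5)
   ROOT::Experimental::RNTupleWriteOptions options;
   options.SetCompression(505);
   importer->SetWriteOptions(options);

   importer->Import();
}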
Now, I already mentioned that we have been working on RNTuple compatibility with RDataFrame. Currently, with just one line change, you will be able to use your existing analysis code with data stored in RNTuple:
// Change this:
ROOT::RDataFrame df("Events", "http://root.cern/files/HiggsTauTauReduced/GluGluToHToTauTau.root");
// To this to use the RNTuple you just imported into "my_rntuple.root":
ROOT::RDataFrame df = ROOT::RDF::Experimental::FromRNTuple("Events", "my_rntuple.root");
// Use your existing analysis as-is!
💡 The automatic detection of RNTuples in RDataFrame is currently available in ROOT’s master branch and will be available in ROOT 6.32.00!
So, what’s next? Performance is always one of our main concerns. We are currently working on parallelizing the writing of RNTuples. In addition, we are working on what we like to call “interface ergonomics”, i.e. the way developers will interact with RNTuple. Be aware that this means that the RNTuple interfaces might still change a little in the coming months! Next to all of this, we are preparing for larger-scale performance testing to see in what areas we could further improve. Another area of work for the near future will be in the direction of data set combinatorics – that is, finding smart(er) ways of accessing and combining existing RNTuple data. And of course, we will continue to work with the experiments to make sure the transition to RNTuple will be as smooth as possible.
To wrap things up, things are looking good for RNTuple, and while there is still enough work to be done, we’re excited and eager to make RNTuple as good as it can be! If you want to know more about the evolution and performance of RNTuple, be sure to check out the references below, as well as our other publications. If you are eager to dive deeper into the specifics of the RNTuple binary format, you can read the specification here. Finally, reach out to us on the forum if you have any questions or if you would like to contribute to RNTuple or ROOT in general!
[1] J. Blomer, P. Canal, A. Naumann, and D. Piparo, “Evolution of the ROOT Tree I/O,” EPJ Web Conf., vol. 245, 2020, doi: 10.1051/epjconf/202024502030.
[2] ATLAS Collaboration, “ATLAS Software and Computing HL-LHC Roadmap,” CERN, Geneva, CERN-LHCC-2022-005, LHCC-G-182, 2022. Accessed: May 02, 2023. [Online]. Available: http://cds.cern.ch/record/2802918.
[3] J. Blomer, “A quantitative review of data formats for HEP analyses,” J. Phys. Conf. Ser., vol. 1085, p. 032020, Sep. 2018, doi: 10.1088/1742-6596/1085/3/032020.
[4] J. Elmsheuser et al., “Evolution of the ATLAS analysis model for Run-3 and prospects for HL-LHC,” EPJ Web Conf., vol. 245, 2020, doi: 10.1051/epjconf/202024506014.
[5] J. Lopez-Gomez and J. Blomer, “RNTuple performance: Status and Outlook.” arXiv, Apr. 07, 2022. doi: 10.48550/arXiv.2204.09043.
ROOT now uses the web-based TCanvas implementation by default in the ROOT master version. It has been present in ROOT for a while (since 2017) and is already used in the web-based TBrowser, which you have probably seen.
What has changed? Now, when starting a ROOT session and displaying any object in a TCanvas, the default system web browser will start and the object will be drawn there using the JavaScript ROOT functionality. The look and feel for basic objects, like histograms and graphs, will not change much – all the drawing options and styles are supported as in the original graphics. You can compare the two following screenshots made with the same macro – one is the original ROOT graphics, the other is the web-based one.
What are the benefits of using the web-based canvas?

- The global gPad pointer is no longer as problematic – a lot of the interactivity in the original ROOT graphics was built around this global pointer, which made it difficult to use several canvases at the same time.
- For remote sessions that require ssh tunnels, we provide a simple rootssh utility which fully automates the configuration of such tunnels.

What about image production in batch?
For the moment we keep the old functionality for image production, e.g. when running ROOT with the -b flag. The web-based canvas will be used for PNG/JPEG/SVG image creation when adding the --web flag when running ROOT. Since image generation involves running a web browser in headless mode, it takes time – approximately 1 second per image. We plan to provide a special API to produce many images with one call, which should significantly improve performance.
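A minimal sketch of such a batch job: a plain macro that saves its canvas as a PNG, to be run along the lines of root -b --web -q makeplot.C (file and histogram names here are purely illustrative).

// makeplot.C
void makeplot()
{
   auto c = new TCanvas("c", "c", 800, 600);
   TH1F h("h", "demo;x;entries", 100, -4., 4.);
   h.FillRandom("gaus", 10000);
   h.DrawCopy();
   // With --web active, the image is rendered through the headless web browser
   c->SaveAs("demo.png");
}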
What are the drawbacks?
You will probably encounter minimal differences between drawing with native ROOT graphics and drawing in the web browser. We do our best to make them as similar as possible – and you can help us by reporting any problems. Some very special usages of TExec objects (do you know about them?) will probably not work as expected in web-based canvases. With a little help from us this can be fixed and adjusted. For sophisticated use-cases with complex user-defined objects one could consider implementing JavaScript-based painters for them.
We encourage all users to try this functionality and give us feedback!
2D scatter plots are a very popular way to represent scientific data. Many scientific plotting packages have this functionality. For many years ROOT itself has offered this kind of visualization via:

- The option P to draw a TGraph: a marker is drawn at each point position, but all markers will have the same size and the same color.
- The COL option of TTree::Draw(): tree.Draw("e1:e2:e3","","col") produces a 2D scatter plot of e1 vs e2, with e3 mapped on the current color palette. That’s a bit better, as it allows drawing three variables on a 2D plot. But one needs to create a TTree or a TNtuple, which is a bit heavy when the data are already stored in simple vectors.

Therefore there was a need for a new class able to produce, in a simple way, this popular multi-variable way to visualize data.
In order to fulfill these requirements a new class, TScatter, has been implemented. It is able to draw a scatter plot of four variables on a single plot. The first two variables are the x and y coordinates of the markers, the third one is mapped on the current color map, and the fourth one on the marker size. Note that it is recommended to use a transparent color map as markers will, most of the time, overlap. The code to produce a scatter plot with the new class TScatter is as simple as:
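(A minimal sketch follows; it assumes the constructor TScatter(n, x, y, color, size) and the “A” draw option – check the TScatter reference of your ROOT version for the exact signature.)

#include <TCanvas.h>
#include <TColor.h>
#include <TRandom.h>
#include <TScatter.h>
#include <TStyle.h>

void scatter_sketch()
{
   auto canvas = new TCanvas();
   gStyle->SetPalette(kBird, nullptr, 0.6); // semi-transparent palette, since markers tend to overlap

   const int n = 100;
   double x[n], y[n], col[n], size[n];
   TRandom r;
   for (int i = 0; i < n; i++) {
      x[i] = r.Gaus(0., 1.);        // marker x position
      y[i] = r.Gaus(0., 1.);        // marker y position
      col[i] = r.Uniform(0., 10.);  // third variable: mapped on the color palette
      size[i] = r.Uniform(0.5, 2.); // fourth variable: mapped on the marker size
   }

   auto scatter = new TScatter(n, x, y, col, size);
   scatter->SetMarkerStyle(20);
   scatter->SetTitle("Scatter plot;X;Y");
   scatter->Draw("A");
}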
There is a natural tendency to look at compilation or conceptual errors as unwanted accidents or mistakes that only happen rarely, because of one’s own inexperience, and that surely will not happen next time. As such, we are not explicitly prepared nor trained to deal with them systematically. We just tackle them as a contingency and try to solve them quickly with whatever tools are at hand. Yet experience tells us that errors (in programming, in mathematics, in judgement biases) are not the exception, but rather the rule.
In fact, most of the time in (robust) development is spent on debugging and troubleshooting, either passively, by looking at whatever problem pops up, or actively, by creating from the start a robust software architecture that prevents errors (via strong typing, smart pointers, ordered structure and abstraction, good documentation, …), as well as a suite of tests that catches them in the future in as many scenarios as possible. It is not uncommon to write a piece of analysis software in 5 hours, but then spend 5 days tracking down why the heck it’s giving wrong results, or crashing once every 100 times, or, even more worryingly, silently leading you to wrong scientific conclusions or to errors in other links of your analysis chain, far away from the original source and thus hard to trace back.
Yet, despite their direct impact on your workflow and scientific robustness, many of us physicists are not trained to deal with errors with the proper tools, and we still deploy inefficient and manual ways to hack them away “as quickly as possible”, hoping (with uncertainty and fear) that they “won’t come back”. Because we will encounter errors much more frequently than we might think in the first place, it makes sense to invest some “initial setup time” to create a robust platform for tackling and fixing these in a systematic way. Rather than reacting with insecurity to them, or keeping them in the back of the mind as a passive or transient threat/accident, let’s assume they will rather be the norm and an important key player in our development, a learning tool that will appear continuously and is worth optimizing for. In the same way, one no longer writes by pen when sending 10000 letters, as one would for only 10. In this situation, it’s interesting to partly shift your paradigm from “troubleshooting your code” to “code for troubleshooting”, i.e. developing the instruments to quickly detect the mistakes you will surely make.
Integrated Development Environments (IDEs) are very powerful tools to detect errors (thanks e.g. to Clang), trace them back to the right point in the source code, and even automatically suggest the solution. ROOT scripts, as well as standalone C++ programs relying on ROOT libraries, can be integrated with minimum effort into these IDEs. Examples of the steps to follow are nicely explained in older blog posts for the Visual Studio and Eclipse IDEs, as well as in the Twiki and other blogs. In this post, I will focus on a third option, the open-source QtCreator IDE.
While optimized for Qt applications, QtCreator is totally generic, open-source, and it can compile and run any C++ program, CMake project, Makefile, etc. You don’t even need to know what Qt means. You don’t need to use qmake nor native Qt project files either (in fact, I prefer to always use CMake to get rid of any dependence on Qt). Your project will be equally compilable from a terminal with Make / CMake as via QtCreator, which just acts as a non-invasive interface.
You can find (usually outdated) versions of QtCreator in your package manager, but I recommend using the online installer, which then periodically checks for updates at program start. If you prefer not to open a user account with them, you can use the offline installer. While installing, I recommend deactivating all Qt library options, newer CMake versions or Ninja. You will just need QtCreator.
Go to “Tools”, “Options”, “Kits”. Click for example on one of the auto-detected kits in the dialog. You can define your custom one, which will appear as “Manual” in the tree view. I recommend setting up a “Manual” kit, where “Qt version” is set to “None” (in case it was set), and in “CMake generator”, “Ninja” is changed to “Unix Makefiles” (if you are not in Windows). If you prefer to use the “Ninja” generator, make sure that it is installed in your system. Finally, click “Ok”.
Beware: in OSx, the “Tools”, “Options” menu is instead under “AppName”, “Preferences”.
You can open any CMake project you have on your computer by clicking on “File”, “Open File or Project”. Find then the main folder where your project’s source code is located. Usually, there will be a “CMakeLists.txt” file in the main directory. Double-click then on this one, rather than in any other “CMakeLists.txt” that might appear in the subdirectories of this same project. If you prefer the command line, you can run directly as qtcreator my/folder/CMakeLists.txt &
. If you installed QtCreator in a local folder, you might need to run something like: /opt/Qt/Tools/QtCreator/bin/qtcreator my/folder/CMakeLists.txt &
.
If you’d rather use Makefiles, that’s also supported via the Import menu, by clicking on “File”, “New File or Project”, “Import Project”, “Import Existing Project”, “Choose”, and then selecting the source files you want to see in your tree (or just click on select all and deactivate those that are images, etc.). The Makefile will be automatically detected behind the scenes. You can edit the number of threads (-j) later on in the project’s “Build settings”.
Let me load the simplest CMake example into QtCreator. After “Open File or Project”, a “Kit dialog” will appear. The button “Manage Kits” on the top left allows you to check what compiler is associated with each kit (as explained above); click “Ok” to close. Select the “build kit” you prefer, and then click on “Details”. There, you can specify in what folder to build your program. Under “Tools”, “Options”, “Build”, “Default Properties” you can set up a default directory for your builds.
Once you click on “Configure Project”, CMake will be automatically run. In the “Projects” pane, you can tune any CMake flag as needed, as well as specify command line arguments when running. The “Build” hammer icon on the left compiles your project (make), and the “Run” play icon executes it.
You will not need to re-do all these configuration steps later on for this project, as QtCreator will store these settings in a file called “CMakeLists.txt.user” and recognize it automatically the next time you open the project.
Let’s assume now that you have forgotten what class std::cout
corresponds to. Luckily, Qt has an in-built (offline) help support system. For a first-time configuration, you will just need to download the “Help Book” of your library, in this case the std
library from cppreference or via your package manager (sudo apt install cppreference-doc-en-qch
). Then, in “Tools”, “Options”, “Help”, “Documentation”, you can add the downloaded (or /usr/share/
installed) “.qch” file.
Once this is set, you can either Ctrl+Click on your function or object to immediately go to the source code definition (file will open in another tab), or press F1, and the HTML documentation will appear on your right side without having to type / search anything online.
If you use and compile LLVM yourself, you can also get your Qt Help file as described here.
The ROOT framework also has a “.qch” Help Book available for download, thus you’ll be able to quickly consult any documentation using the F1 key, rather than searching online, which can be useful in case you are traveling and have no Internet access.
You can not only check the documentation with F1, but fully open the HTML reference guide by clicking on the big “Help” icon (left pane), as shown below.
Alternatively, you can also open the Help Books and search them using Qt Assistant. Linux apt packages are qt4-dev-tools
or qt5-assistant
, and the executables are assistant-qt4
and assistant
, respectively. (qt6
version is not yet in the package manager.) You will have to add the .qch
file to its database by going to “Edit”, “Preferences”, “Documentation”, “Add”.
And what if you already use other IDEs or operating systems? In addition to inline HTML searching, the building of the (ROOT) doxygen documentation can be configured to output a format that is compatible with macOS - Xcode, Windows - Visual Studio, or Eclipse. ROOT only provides the Qt help files (.qch) for download for the moment, but you can build the documentation yourself adapting those flags in the Doxyfile.
Grown over many years and standards, larger software projects have plenty of legacy code that is not as safe as what someone would write today. Unsurprisingly, there are still some bugs here and there, and instabilities that haven’t been solved. Some of these bugs and potential style improvements can be detected thanks to the Clang analyzer, which performs code analysis based on configurable settings.
QtCreator bundles perfectly with the Clang analyzer; see the left pane, “Debug” icon, then the “Debugger” dropdown menu, “Clang-Tidy and Clazy”. It parses its output warnings and takes you directly to where the code needs to be changed. In addition, it even lets you apply “fixits” by a mouse-click: if Clang knows how to correct the problem, it will change the code automatically for you.
To give an example, analyzing the core of ROOT yields several diagnostics, and this can be quite useful for tracing problems in case you are seeing some memory leak when your application uses the ROOT libraries:
If, for example, you would like to modernize your code syntax to the latest C++ standard, you can configure the Clang settings in “Tools”, “Analyzer”, “Default checks”, and enable the modernize- option. Then, with a single click, you can change NULL to nullptr across your whole codebase.
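A hypothetical two-line illustration of what that check targets and what the fixit produces:

#include <TH1.h>
#include <cstddef>

TH1 *before = NULL;    // flagged by the modernize-use-nullptr check
TH1 *after = nullptr;  // what the one-click fixit produces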
Whether you like 4 spaces, 2 spaces, 1 tab, braces in the beginning or in the end… it does not matter what your taste is. What’s important is that you do not spend your valuable time on formatting things by hand. QtCreator can be helpful in this regard, too, if you activate the Beautifier plugin as well as install clang-format
.
For example, let’s suppose you want to submit a pull request of one of your functions to ROOT, which has its own formatting guidelines. The easiest is to copy to your project the .clang-format
configuration file from the repository or the website and then go to “Tools”, “Options”, “Beautifier”, “Clang Format”, and specify the file location. (Or if you are building ROOT itself using QtCreator, specify “File” in the dropdown menu, and it will auto-detect the one in the source tree).
You can also define a keyboard shortcut to format the file, by going to “Tools”, “Environment”, “Keyboard”, search for “format” and assign e.g. Ctrl+Alt+F.
Once that is configured, you can enable auto-formatting of your file when saving, or apply changes manually via Ctrl+Alt+F. Here is a snippet before and after applying it (the unformatted version first, the formatted one below):
int main(int argc, char* argv[])
{
int main(int argc, char *argv[]) {
For one of your projects, or even for the ROOT codebase, you might be using git for version control. QtCreator integrates seamlessly with the typical git commands, and can show you a visual diff of the current changes, as well as commit
(Alt+G, Alt+C) and push
your changes using its graphical interface, or pull
the latest version from the remote repository.
You can get the best of both worlds by using the FakeVim mode or the emacs plugin. Just give it a try ;)
And if you just need column-editing, you don’t need any of those, QtCreator supports that natively.
If you’ve built ROOT enabling the “testing” CMake flag, or if your project contains “CTests”, “Boost Tests”, etc. for ensuring that new changes you apply don’t break older functionality, QtCreator has a platform to visually run and check the results of all those tests. No need to scroll in a terminal to find which one failed.
QtCreator lets you not only find compilation errors, but also documentation errors, by interfacing with the warnings issued by doxygen. This can prove extremely useful for detecting outdated or incorrect documentation and jumping to the right spot in the source code in just one click, rather than diving through thousands of lines of output and tracing it manually.
To give it a try, take a look at building the ROOT documentation project. Follow these steps:

- Run source /path/to/ROOT/bin/thisroot.sh in the terminal and launch qtcreator from there. Alternatively, you can manually specify all the variables in the “Build” environment.
- Import the Makefile project root/documentation/doxygen into QtCreator, as explained above.

Below is a screenshot of the errors and the points in the source code found by just clicking on those issues.
I’d suggest defining a custom output parser to catch the doxygen warnings about “potential candidates” that appear when there is an ambiguous match in the signatures. To do this, go to “Tools”, “Options”, “Build&Run”, “Custom Output Parsers”, “Add”, and in “Warning”, specify the pattern (.*) at line (\d+) of file (.*) and the order 3,2,1. “Apply”, “Ok”, and in “Projects”, “Build Settings”, at the bottom, activate the newly defined “Parser”.
If you want even more verbose warnings about undocumented parameters, try setting WARN_NO_PARAMDOC to YES in the Doxyfile and EXTRACT_ALL to NO. This will reveal many more weak points of your documentation and let you focus your efforts on the right spot. And while it can be burdensome to write all this extra missing documentation, QtCreator also simplifies the task: by typing three magic characters on top of a function, it will autocomplete the whole skeleton in doxygen format. Check first that “Tools”, “Text editor”, “Completion”, “Enable Doxygen blocks” is enabled.
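The generated skeleton then looks roughly like this (the function is purely illustrative and the exact comment style depends on your settings):

/// \brief Hypothetical function, used only to illustrate the generated skeleton.
/// \param energy Deposited energy in MeV.
/// \param time Hit time in ns.
/// \return The calibrated energy.
double Calibrate(double energy, double time);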
Consider also enabling this spell-checking plugin for detecting typos in your documentation. This can be done by simply downloading the release file and unzipping it into your qtcreator folder. Then, under “Tools”, “Options”, “Spellchecker”, you can configure which dictionary or language(s) to use.
If you need to debug your ROOT scripts, or the ROOT library itself, I recommend building ROOT from its sources, but using the “Debug” flag.
To do this:

- Change CMAKE_BUILD_TYPE to Debug, as you would do from a command line.
- Enable the modules you need, as you would do with -Dmodule=ON via the command line.
- Add -j8 (or whatever other number) to the “CMake arguments”, to speed up the build.

To debug your script, go to the “Projects”, “Build & Run”, “Your-Kit-Name”, “Run” settings and specify your executable right of “Run configuration” by clicking on the dropdown menu (your own standalone application, or root.exe). Then specify your “Command line arguments” – e.g. -l -b plus the name of the script you want to run and its parameters – as well as the “Working directory”. If you want to precompile instead of interpreting with cling, consider appending the debug flag g to the command line argument (yourScript.C+g).
Click then on the “Play-Bug” icon on the left, and your script will run in “Debug” mode. Breakpoints can be set interactively on your code. F5 will pause or resume your process, as well as show you a workspace of the active variables and threads. For example, specify as “Command line arguments” -l -b hsimple.C+ -q and as working directory your-root-folder/tutorials. Open this file within QtCreator, click on the left of the line numbers to set a breakpoint, and then click on the “Play-Bug” icon on the left; the script will execute and pause when it reaches that point. You can then perform step-by-step execution using the three little arrow icons right of the “Debugger” dropdown menu. You can hover your mouse over them and a tooltip will show their function.
Below is a screenshot of another example, while debugging a deadlock in the TThread class.
Side note: if at some point your ROOT script gets very complex or long, I recommend instead creating a standalone C++ application using CMake and linking the ROOT libraries to it, as explained here.
To check for memory leaks and corruption, QtCreator offers a seamless integration with valgrind (or heob on Windows), making the backtrace of your errors fully interactive. To run it, press on the big “Debug button” on the left. Then, on the dropdown menu, change from “Debugger” to “Memcheck” and click on the small play button.
If you need extra arguments for valgrind, you will need to specify those under “Tools”, “Options”, “Analyzer”, “Valgrind”. There, I also recommend clicking on “Add” and selecting “etc/valgrind-root.supp” from your cloned repository, to suppress spurious warnings.
The resulting warnings can be easily clicked to bring you to the right spot in your code, or in the ROOT codebase, where the issue is arising from.
Often, you will also find it helpful to run the static Clang analyzer, which is able to detect many unsafe parts of your code that might be leading to memory leaks. It’s in the same dropdown menu, under “Clang-Tidy and Clazy”.
First, I recommend adding “etc/valgrind-root.supp” from your git repository under “Tools”, “Options”, “Analyzer”, “Valgrind”, “Add”. Then, on the dropdown menu, change from “Debugger” to “Memcheck” and click on the small play button.
Helgrind cannot yet be run directly from QtCreator. The workaround is to run root.exe or your own executable under valgrind from your command line, with the flags --tool=helgrind --xml=yes --xml-file=yourfile.xml. Then, you can “load” the result using the small “open” button right of the dropdown menu. The parsing tool works great and takes you to the relevant location in your code.
In case you want to optimize the performance of your code, you can select from the debugger dropdown menu between “Callgrind” or the Performance Analyzer. If you install and use callgrind
, consider installing also kcachegrind
for visualization.
There are other tricks to boost your development in a way that’s integrated with your IDE. For example, you can expose an instance from your compiled code to the ROOT interpreter by registering its pointer:

gROOT->ProcessLine(
   static_cast<TString>(
      "MyClassType* const fMyInstance = reinterpret_cast<MyClassType*>(") +
   dynamic_cast<std::ostringstream &&>(std::ostringstream("") << fMyInstance)
      .str() +
   ");");
Then, of course, you create a TGCommandPlugin window. From there, typing fMyInstance->MyMethod() will execute compiled code interactively.
This VS Studio plugin allows for a nice integration of a ROOT file browser. Maybe it will come at some point for QtCreator, too.
To summarize the setup:

- Install valgrind, callgrind, kcachegrind.
- Install the cppreference Help Book (sudo apt install cppreference-doc-en-qch).
- Define the custom output parser with the pattern (.*) at line (\d+) of file (.*) and order 3,2,1. “Apply”, “Ok”. Activate it under “Projects”, “Build Settings”, on the bottom.
- Choose a default build directory, e.g. ~/builds/.
- Set -j8 on “Projects”, “Build & Run”, “Kit-name”, “Build”, “Build Steps”, and root.exe as your executable in the run settings.

Setting up all this platform requires some initial effort, but once it is running, it will smooth your development and bug hunting, and once you get used to it, you will find it much more tiring to program without it ;) .
Fernando Hueso-González IFIC - Instituto de Física Corpuscular (CSIC / Universitat de València)
NOTE: originally this post outlined the setup of a ROOT-based project in the Eclipse IDE based on the Eclipse CDT4 CMake generator functionality. However, the CMake4eclipse plugin provides a better integration of ROOT-based projects in Eclipse. Therefore, the post was updated in September 2022 to demonstrate the new approach. The former notes can be found here.
The ROOT framework is written in C++, a language with complete manual control over memory. Therefore, developing and running your ROOT script (or Geant4 program) may sometimes lead to a crash that provides minimal information in the stack trace. The ROOT framework does not provide out-of-the-box solutions for debugging scripts. Hence, questions about debugging ROOT scripts arise in the ROOT community now and then.
Generally speaking, one does not need a special development environment to invoke a debugger on a ROOT script. Users can simply invoke the GNU Debugger (GDB) on the root.exe binary:
gdb --args root.exe -l -b -q yourRootMacro.C
Similarly, GDB can be used for debugging stand-alone ROOT and Geant4-based programs. However, this debugging experience takes place in the terminal and lacks a user interface and many useful features.
In this article, we outline an approach for robust debugging of CERN ROOT scripts and ROOT-based programs (it also applies to Geant4-based programs). We will utilize the Eclipse CDT (C/C++ Development Tooling) Integrated Development Environment (IDE), free software, coupled with the GNU debugger (GDB).
Additionally, the current approach allows users to have the ROOT and Geant4 frameworks built in both Release and Debug modes installed on the same computer. Debug binaries are great for development, allowing memory analysis and efficient debugging. Release builds, on the other hand, can be optimized for robust execution of the program and may work up to 10 times faster.
A few words about the operating system (OS). In this post, we will consider the setup on Linux-based systems. A similar approach may be replicated on macOS with the GNU toolchain, but will require a code-signing procedure. Windows is a totally different story.
The following milestones are required to complete the setup of the development environment:
In this section we demonstrate how to install the Eclipse IDE on a personal Linux computer. We will use Eclipse with the cmake4eclipse plugin – a powerful tool for CMake-based projects. Cmake4eclipse automates the project setup and allows for an automatic rebuild of the frameworks’ libraries (ROOT and/or Geant4) once their source code is changed.
Today (Aug 2022), the CMake4eclipse plugin provides better integration of CMake-based projects in Eclipse than the other available options, each of which has its own drawbacks that are a subject of a separate discussion. The following steps are required to set up the IDE workflow.
Install Eclipse IDE. Download the Eclipse installer from the official website, extract it and run. Select “Eclipse IDE for C/C++ Developers”. Refer to the screenshot and instructions below:
A. Recent Eclipse versions come with bundled Java Runtime Environment (JRE). As of July 2022, specify the built-in JRE version 11. Otherwise there will be an error accessing Eclipse help. This may be fixed in later Eclipse releases.
B. On Linux it is good practice to install software that is not included in your distribution under /opt, /usr/local/ or the home folder. In this article we will stick to the latter option and install Eclipse in the home folder under ~/Applications/ for consistency with the setup on macOS.
C. The wizard will provide a list of required packages to be installed on your system. Ensure all of the package dependencies are installed.
D. Exit the wizard. There is no need to launch Eclipse right away. We will tweak its configuration file first.
Increase the Eclipse memory limits. ROOT libraries contain thousands of source files. Usually, when indexing a ROOT-based project, memory use fluctuates around 2 GB. Eclipse memory use can be inspected with the VisualVM application.
Memory limits are specified in the eclipse.ini file located inside the Eclipse install folder. Use a text editor to update the following lines:
-Xms512m
-Xmx4096m (set to 2048m minimum or higher if available)
Here the -Xms value corresponds to the initial heap size used at Eclipse startup. The latter -Xmx value corresponds to the maximum available memory limit. The more libraries are used in your project (ROOT, Geant4), the higher the -Xmx value required by the Eclipse indexer. Indexing of the framework source files will be faster with more available RAM.
Fix Eclipse launcher. If Eclipse window does not properly minimize to the dock icon, execute following command (bug report submitted here):
echo StartupWMClass=Eclipse >> ~/.local/share/applications/epp.package.cpp.desktop
Tweak the memory limit for the Eclipse indexer. Launch Eclipse and select the default workspace location (e.g. ~/Development/eclipse-workspace). In the Eclipse menu open Window → Preferences → C/C++ → Indexer. Under “Cache Limits” set:
Limit relative to maximum heap size: 75%
Absolute limit: 4096 MB (same as for -Xmx value in eclipse.ini)
Update Eclipse and its CDT plugin. In the menu select Help → Check for updates. Follow the wizard steps. Restart Eclipse if required.
Install CMake4eclipse plugin. Project details can be found on GitHub. In the Eclipse menu select Help → Install new software. Enter following URL in the “Work with” field: https://raw.githubusercontent.com/15knots/CMake4eclipse/master/releng/comp-update/
In the modal dialog select everything but uncheck the older version of CMake4eclipse (v2). Keep only version v3. Follow the wizard steps and restart Eclipse. Refer to the screenshot below:
Tweak cmake4eclipse settings. Set default workbench for CMake4eclipse. In the Eclipse menu select Window → Preferences → C/C++ → Cmake4eclipse → Default build system → Set “Unix Makefiles”.
On the “CMake cache entries” tab, specify the C++ standard used for the build. Add corresponding CMake cache entry:
| Name | Type | Value |
| --- | --- | --- |
| CMAKE_CXX_STANDARD | STRING | 17 |
It is important for the ROOT-based programs to be compiled with the same C++ standard as the ROOT libraries. Therefore, in this guide we will explicitly set the C++ standard for ROOT, Geant4 and their based programs, more info here. We will use the C++17 standard.
Tip: if having problems with the build later, check “Force re-creation with each build”. This will trigger the CMakeLists.txt update and re-generation of the Unix makefile at every build, reflecting possible changes in the CMake cache entries (variables) and ROOT components’ source code.
Optionally, there are a few further useful tweaks to the Eclipse workflow that you can apply.
We have successfully installed and set up Eclipse with the CMake4eclipse plugin and are now ready to set up the ROOT project in the Eclipse IDE.
In this section we address the setup of the ROOT libraries as a project in the Eclipse IDE. The framework will be built with debug symbols. This allows setting breakpoints in the ROOT code and inspecting memory and variable values during the program run.
Install dependencies. Refer to this page on ROOT website to satisfy the dependencies for your particular Linux distribution.
Obtain the source code. There are a few options here.
A straightforward way is to download ROOT sources for a specific release from the ROOT website. Extract ROOT sources under the ~/Development
home folder. We will keep all the source code and Git repositories in this folder for consistency purposes.
Alternatively, if a user plans on contributing towards the ROOT repository it is recommended to fork the latest master
branch on GitHub, create a new branch in your forked repository and check it out:
mkdir -p ~/Development && cd ~/Development
git clone https://github.com/<your-username>/root
cd root
git checkout -b <your-feature-branch>
This allows for issuing Pull Requests to the original repository. More details can be found on this page.
Set up a project in Eclipse. Launch Eclipse. In the menu open File → New → Project… Expand “C/C++” and select “C++ Project” (not “C/C++ Project”).
On the next dialog, specify “root” as the project name. Uncheck “Use default location” and “Browse…” for ROOT sources location (e.g. ~/Development/root
). In “Project Type” expand “Cmake4eclipse” and select “Empty Project”. In “Toolchains” select “CMake driven”. Click “Next >”.
We are building ROOT with debug symbols. Therefore, uncheck the “Default” and “Release” build options and only keep “Debug”. Essentially this dialog box specifies the CMake -DCMAKE_BUILD_TYPE variable.
Next we provide the CMake plugin with the ROOT build options. Click “Advanced Settings…”. Go to C/C++ Build → Cmake4eclipse. Open the “CMake cache entries” tab. Use the “Add…” button on the right and input the following variable names, types and values:
| Name | Type | Value |
| --- | --- | --- |
| CMAKE_INSTALL_PREFIX | PATH | ${HOME}/Applications/root-debug |
| all | BOOL | ON |
When specifying variables of a PATH type, it is handy to use the “File System…” button. It will display the folder picker dialog and minimize the chance of specifying a wrong path. Refer to the screenshot below:
In this tutorial we build ROOT with all optional components turned on (-Dall=ON
). Find a complete list of the ROOT CMake build variables on the ROOT website and tailor the build for your needs.
Click “Apply and Close”. Click “Finish”.
Notice that Eclipse will start indexing the project. However, we will reschedule this operation later - after the build is completed. Reveal the “Progress” panel (tiny scrollbar animation in the very bottom right corner). Stop the indexer operation.
Build framework in Eclipse. Reveal the “Build Targets” tab (on the right side) and select “root” project. Right-click and select “New…” build target. Name target “install”. Click “Ok”. Expand “root” in the “Build Targets” tab and double-click the “install” target.
Build process speed depends on your computer speed and provided build variables. It may take up to a few hours to finish.
Tip: to switch between the CMake console and Linux make console, locate the “Display Selected Console” dropdown on the bottom actions panel.
Exclude build folder from indexing. Cmake4eclipse plugin performs a so-called in-source build. Meaning that the build folder is located within a project file tree. During the build ROOT header files are copied and duplicated inside the “_build” folder. To avoid indexing duplicate sources and headers, right click “root” project → Properties → C/C++ General → Paths and Symbols → Source Location. Expand the “/root” folder. Select “Filter”. Click “Edit filter…”. Add the “_build” folder to the filter. Click “Apply and Close”.
Run Eclipse indexer. We are now ready to index all ROOT source files and headers. This will create an Object-Oriented Programming (OOP) database of all ROOT object types, their methods and inheritance relations. Right click “root” project → Index → Rebuild.
Tip 1: sometimes Eclipse indexer may freeze while parsing the ./interpreter/llvm/src/tools/clang/lib/Driver/
sub-folder. If this happens, exclude the interpreter
folder from the build (this also excludes folders from the index). Highlight the interpreter
folder in the project tree. Right click, and select Resource Configurations → Exclude from build. Check “Debug” configuration. Click “Ok”. Now right click “root” project → Index → Rebuild.
Tip 2: the indexer usually takes a couple of hours to parse all of the ROOT framework source files. Computers with fast NVMe hard drives will perform this task best. For computers with SATA drives or older, I recommend keeping the ROOT sources on a RAM disk. Feel free to find my RAMDisk implementation on GitHub.
Turn off false positive errors. Even though the ROOT compilation succeeds, Eclipse code analysis tool displays semantic errors in ROOT sources. To turn them off, right-click “root” project and open Preferences → C/C++ General → Code Analysis. Select “Use Project Settings” option. Uncheck “Syntax and Semantic Errors” group. Maybe someone has a better idea how to fix that?
Eclipse carries out an in-source build, meaning that the “_build” folder is located inside the ROOT project tree. Consequently, if users want to issue pull requests to the ROOT GitHub repository, the “_build” folder needs to be added to the “.gitignore” of the local ROOT Git branch.
At this point ROOT libraries are compiled with debug symbols and Eclipse has indexed all the framework source files.
Needless to say, the Geant4 framework libraries can be built in Eclipse in exactly the same way as outlined above for the ROOT project.
ROOT scripts are originally designed to run through Cling, a modern C++ interpreter based on LLVM and Clang. To debug a ROOT script with the native Linux GNU development tools – the gcc compiler and the gdb debugger – we need to convert the ROOT script into a ROOT-based program. Having it compiled into an executable with debug symbols, we will be able to invoke a debugger on it.
Skip to the next section if you already have a ROOT-based program code ready.
Next we elaborate on how to convert a ROOT script into a CMake ROOT-based program. Generally speaking, this involves, among other things, explicitly defining all the headers used in your script (the Cling interpreter does not require that). Detailed instructions elaborating each step can be found in this template repository on GitHub. Please refer to the repository README file.
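As a rough illustration of what such a conversion boils down to, here is a hypothetical minimal example (function and file names are made up; the template repository linked above covers the CMake side):

// All headers are now listed explicitly (Cling would have resolved them on the fly)
#include <TFile.h>
#include <TH1D.h>

// The former macro body becomes a regular function...
void analyse(const char *fileName)
{
   TFile file(fileName, "RECREATE");
   TH1D h("h", "example;x;entries", 100, -5., 5.);
   h.FillRandom("gaus", 10000);
   h.Write();
}

// ...and a main() entry point is added so that gcc can produce a debuggable executable
int main(int argc, char **argv)
{
   analyse(argc > 1 ? argv[1] : "output.root");
   return 0;
}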
In this section we will set up a ROOT-based CMake project in Eclipse IDE.
Obtain the source code. Place your ROOT-based project into a desired location, e.g. ~/Development
.
Set up the CMake4Eclipse project. Similarly to the ROOT project setup, in the Eclipse menu open File → New → Project… Expand “C/C++” and select “C++ Project” (not “C/C++ Project”).
On the next dialog, specify your project name. Uncheck “Use default location” and “Browse…” for your project “CMakeLists.txt” location. In “Project Type” expand “Cmake4eclipse” and select “Empty Project”. In “Toolchains” select “CMake driven”. Click “Next >”.
For the development purpose we uncheck “Default” and “Release” build configurations and keep the “Debug” option only.
Next we need to provide the CMake plugin with corresponding build variables. Click “Advanced Settings…”. Go to C/C++ Build → Cmake4eclipse. Open the “CMake cache entries” tab.
Since we compiled and installed ROOT and Geant4 not system-wide, but in the ~/Applications home directory, we need to provide CMake with the location of the “ROOTConfig.cmake” file. Use the “Add…” button on the right to input the following variable name, type and value:
| Name | Type | Value |
| --- | --- | --- |
| ROOT_DIR | PATH | ${HOME}/Applications/root-debug/cmake |
Make sure that the path specified above is accurate.
Alternatively, paths to ROOT (and Geant4) CMake configuration files can be provided together in one variable CMAKE_PREFIX_PATH. Multiple paths are separated by the semicolon:
| Name | Type | Value |
| --- | --- | --- |
| CMAKE_PREFIX_PATH | PATH | ${HOME}/Applications/root-debug/cmake |
Set environment variables. Now we need to set the environment variables for the ROOT-based project. It is important to set the variables only for your particular ROOT-based project in Eclipse (not in the general Eclipse settings). If set for all Eclipse projects, environment variables may interfere with the subsequent rebuilds of the ROOT (or Geant4) framework.
Source the ROOT and/or Geant4 variables in the Terminal and manually plug them into the Eclipse settings. Open a Terminal and execute the following command (copy-paste as one line):
source $HOME/Applications/root-debug/bin/thisroot.sh && \
env | grep 'G4\|ROOTSYS\|^LD_LIBRARY_PATH\|^PATH'
The required environment variables are printed in the Terminal window.
Now go back to the project setup dialog in Eclipse, open C/C++ Build → Environment. Manually “Add…” each environment variable name and value into the Eclipse project settings:
Also, select “Replace native environment with specified one” option. This will isolate Eclipse project environment variables containing paths to frameworks built with debug symbols from potentially installed ROOT or Geant4 release versions on the system. Click “Finish” button.
Build project in Eclipse. Highlight your project in the Project Explorer. Right click → build.
Create a run (debug) configuration. Select your project in the project tree. In the Eclipse menu, open Run → Debug Configurations…
Select “C/C++ Application”. Press the “New launch configuration” button (on the very top left). Click the “Search Project…” button and locate the corresponding executable file.
If necessary, specify any command-line parameters on the “Arguments” tab. Click “Debug”.
This is it. Now you can enjoy full-scale debugging of your ROOT-based applications in Eclipse IDE.
In this post, we learned how to pair the Eclipse IDE with the cmake4eclipse plugin to set up an effective development environment for CERN ROOT scripts and ROOT-based programs. It was a process, so let’s draw a line and summarise what we learned today.
I hope you enjoyed this technical note. If not yet familiar, you can now continue learning fundamental Eclipse CDT hotkeys and debugging capabilities on YouTube.
For those who are interested in setting up the same development environment for a project that utilizes both - Geant4 and ROOT, please follow this link.
Feel free to leave comments below if you have any questions or recommendations.
To achieve this, the whole ROOT team was involved in a big update of the Manual and Reference Guide during one full week.
Previously, the ROOT documentation was spread over three different main manuals:
The Reference Guide and the Manual are the current valid sources of documentation. The Manual acts as a “User’s Guide” helping users to find their way into the huge amount of documentation provided in the Reference Guide.
The Old User’s Guide was outdated and no longer maintained, but nevertheless contained some valuable information we did not want to lose. It was more of a “Long Write-Up”.
The first task of the week was to make sure the Manual’s table of contents was complete: in groups of experts, we updated the existing chapters (Histograms, Graphs, Trees …) and created the new ones needed (JSROOT, ROOT I/O…), which also led to updates in the Reference Guide.
We also moved everything still valuable in the Old User’s Guide to the relevant places, updating the Manual or the Reference Guide. In the end (and after another, final round at the end of the year) we will drop the Old User’s Guide completely and instead have an accurate and complete Manual.
This new Manual allows you to contribute - for instance by letting us know when something is hard to understand (by opening an issue) or even by fixing it yourself: see the GitHub octocat at the bottom right corner of each page!
We hope you’ll enjoy the new manual, and that it’s useful for today’s new grad students!
Hacktoberfest is a yearly event that encourages participation in open source communities and projects. Would you like to help us with some (not so scary) bugs? Or maybe you have some place in our documentation that you think deserves some love? Any small but noticeable feature you would like to see in ROOT? This definitely is a nice time to give your support to the project!
You just have to submit a PR to our repository. If it gets approved, it will count towards your hacktoberfest points. Check out the Hacktoberfest website for a full list of rules. Take a look at our issues labeled “good first issue” to get started, or feel free to make a PR about your own ideas! Just make sure you’re not a bot 😅
RDataFrame is ROOT’s high-level interface for data analysis since ROOT v6.14. By now many real world analyses use it, and on top of that we see lots of non-analysis usage in the wild. Parallelism has always been a staple of its design with support for executing the event loop on all cores of a machine thanks to implicit multi-threading. Since ROOT 6.24, this aspect of RDataFrame has been enhanced further with distributed computing capabilities, allowing users to run their analysis on multi-node computing clusters through widely used frameworks. Currently the package offers support for running the application on an Apache Spark cluster, but the package design will make it possible to add many more backends in the future. For example, backends for Dask and AWS Lambda have already been implemented and demonstrated at different conferences [1][2]. They will be made available in future ROOT releases.
The main goal is to support distributed execution of any RDataFrame application. This has led to the creation of a Python package that connects the RDataFrame API (available in Python through PyROOT) and the APIs of distributed computing frameworks, which are offered in Python in the vast majority of cases. Another key goal is to offer a variety of backends, to provide a solution to a variety of use cases. This is achieved through a modular implementation that defines a generic task (representing the RDataFrame computation graph) to be executed on data. The input dataset is logically split into many ranges of entries, which will be sent to the distributed nodes for processing. Each range will then be paired to the generic task and submitted to the computing framework via a specific backend implementation. An added benefit of using RDataFrame is that the distributed tasks run C++ computations. This is made possible by PyROOT and cling.
Excellent scaling is paramount for this distributed RDataFrame implementation, to ensure you can run the RDataFrame computation graph efficiently across multiple computing nodes and different backend implementations. This has been shown since the first stages of the development of this package with a real use case analysis running on a Spark cluster [3]. More recently, a benchmark based on CERN open data has shown promising scaling performance with both Spark and Dask [4]:
We hear you asking: how does it look in code? Here is an example of an RDataFrame that is able to delegate its
computations to a Spark scheduler (requires the Python pyspark
package):
import ROOT
# Point RDataFrame calls to the Spark specific RDataFrame
RDataFrame = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame
# It still accepts the same constructor arguments as traditional RDataFrame.
# It defaults to running a Spark process on the local machine, but it is possible
# to configure the RDataFrame to connect to a preexisting cluster.
df = RDataFrame("mytree", "myfile.root")
# Continue with the traditional RDataFrame API
sum = df.Filter("x > 10").Sum("y")
h = df.Histo1D("x")
print(sum.GetValue())
h.Draw()
The only difference with respect to local RDataFrame was the usage of a Spark backend specific RDataFrame. By default, it runs on the local machine using all cores. That is, it uses the default Spark behaviour. If a cluster is available, distributed RDataFrame allows connections to the remote scheduler through an extra optional argument in the constructor. Here is an example that also shows the connection to a Dask cluster:
from dask.distributed import Client
import ROOT
# Point RDataFrame calls to the Dask specific RDataFrame
RDataFrame = ROOT.RDF.Experimental.Distributed.Dask.RDataFrame
# Create the Client object to connect to the Dask cluster
# See the Dask documentation for all the options available
client = Client("DASK_SCHEDULER_ADDRESS")
# It still accepts the same constructor arguments as traditional RDataFrame
# And supports some extra keyword arguments
df = RDataFrame("mytree", "myfile.root", npartitions = 8, daskclient = client)
In this example the npartitions
parameter tells the RDataFrame into how many ranges of entries the input dataset should
be split. Each range will then correspond to a task on a node of the cluster. The daskclient
parameter receives the
object needed to connect to the Dask scheduler. All the options are available in the
Dask documentation. The equivalent object for the Spark framework
is called SparkContext and in
general every backend will have its own way to connect to a cluster of nodes.
Once the correct RDataFrame
object has been created, there is no need to modify any other part of the program.
Distributed RDataFrame enables large-scale interactive data analysis with ROOT. This Python layer on top of RDataFrame allows steering C++ computations on a set of computing nodes and returning the final result directly to the user, so that the entire analysis can be run within the same application. It is available as an experimental feature since ROOT 6.24 with support for running on a Spark cluster, with more backends in the works: Dask will be available soon in our nightly builds. Try distributed RDataFrame with our tutorial and learn more about it in the respective RDataFrame documentation section.
[1] Vincenzo Eduardo Padulano, Enric Tejedor Saavedra. “Dask backend for distributed RDataFrame”. In: Dask Distributed Summit 2021. https://summit.dask.org/schedule/presentation/24/dask-in-high-energy-physics-community
[2] Jacek Kusnierz et al. “Distributed Parallel Analysis Engine for High Energy Physics Using AWS Lambda”. In: HPDC 2021. https://dl.acm.org/doi/10.1145/3452413.3464788
[3] Valentina Avati et al. “Declarative Big Data Analysis for High-Energy Physics: TOTEM Use Case”. In:Euro-Par 2019: Parallel Processing (2019),pp. 241–255. https://doi.org/10.1007/978-3-030-29400-7_18
[4] Vincenzo Eduardo Padulano, Enric Tejedor Saavedra. “A Python package for distributed ROOT RDataFrame analysis”. In: PyHEP 2021. https://indico.cern.ch/event/1019958/contributions/4419751
Cling is also a standalone tool, which has a growing community outside of our field. It is recognized for enabling interactivity, dynamic interoperability and rapid prototyping capabilities for C++ developers. For example, if you are typing C++ in a Jupyter notebook you are using the xeus-cling jupyter kernel. One of the major challenges is to ensure Cling’s sustainability and to foster that growing community.
Cling is built on top of LLVM and Clang. Reusing this compiler infrastructure means that Cling gets easy access to future C++ standards, new compiler features and static analysis infrastructure. Our project organization mostly followed the LLVM community standards, but the remaining LLVM-specific customizations, while kept to a minimum, are now costly for sustainability and development. For example, it is time consuming to move to newer LLVM versions and to release Cling following the LLVM release schedule.
A natural next step to mitigate some of these challenges is to move the essential parts of the infrastructure closer to the LLVM orbit. The benefits of the solid software engineering practiced by the LLVM community have been praised widely: for example, LLVM’s rigorous standards for code reviews, release cycles and integration are often raised by our “external” users. We would connect two highly knowledgeable system software engineering communities – the one around LLVM, and the one around data analysis in HEP. The success of Cling demonstrates that incrementally compiled C++ is a feature the C++ community can benefit from and the data science community needs. Finally, there are also potential synergies with projects such as clangd and lldb, which would help interactive C++ become more popular with the broader C++ audience.
In 2018 we decided to approach the issue in a more structural way. We dedicated resources from various ongoing activities in DIANA-HEP and IPCC-ROOT and in 2019 we received an NSF award supporting this goal.
In July 2020, we laid out our arguments in a “request for comment” document on the LLVM mailing lists. The encouraging community response motivated us to produce several LLVM blog posts with the intention of clarifying capabilities, design aspects and advanced feature use:
The LLVM community encouraged the general direction of moving reusable components into Clang. The “new” Clang tool is called clang-repl. The motivation behind the new name has two main aspects. Firstly, we need to ensure gradual code reuse from clang-repl to downstream Cling, and clashing class names would be yet another unnecessary complication. Secondly, some Cling features are tailored towards HEP and hard to argue for wider use; examples are the implicit auto keyword injection or connecting ROOT files to the name lookup. In that respect, having a project named Cling in the Clang repository which differs in functionality from the one in ROOT and HEP would create confusion and packaging problems. A final (bonus) argument is that ROOT will always require occasional hot fixes in both Cling and LLVM which cannot be bound to the major LLVM release schedule: it would be unreasonable to wait for the next LLVM release (or even just for the rigorous review procedure) to address such a fix.
On the 12th of May, the initial, minimally functional clang-repl landed in the LLVM repository. Hooray!
Although the acceptance of the initial clang-repl patch was a considerable success, it is essentially about initiating a new direction for the LLVM community. Such a strategic choice came out of the years-long effort within HEP to innovate, and also sustain, its technology advancement, making it accessible to a broader audience.
Several of Cling’s technical aspects are now being discussed with the LLVM community, for instance how to implement reliable error recovery and code-removal mechanisms that free the unused underlying memory. These tasks have proven to be difficult for Cling, being outside of the LLVM infrastructure. Thanks to John McCall and Richard Smith, the sketch of the feature’s technical design within LLVM is sound and we are working towards it. However, this process poses an anticipated challenge: how to advance the technology in a slightly tangential direction while feeding it back to its major field of use?
Feeding back implementation from mainline LLVM to Cling and ROOT is a non-trivial task, partially because ROOT usually uses significantly older LLVM versions. The LLVM API does not promise backward compatibility, and ROOT uses an intricate and vast API surface, which makes transitions to newer versions essentially a development task. The goal of upstreaming parts of Cling into LLVM is to reduce the used API surface. We will make the upgrade procedure faster, though still measured in months to allow for extensive testing by the experiments’ software stacks. We cannot expect that ROOT can easily adopt each release of LLVM, let alone each commit. However, we can keep ROOT closer to LLVM mainline, which makes backporting features from mainline easier.
The next piece of the puzzle is, if mainline functionality is successfully backported, how to evolve ROOT and Cling’s codebases incrementally towards it and how to ensure things work at the full scale of experiments. My personal take is that it is possible only if the two ends match by design. That is, when developing a patch against clang-repl we need to evaluate its reuse in Cling. This is easier said than done and we will need to learn through experience…
Sustainability in open source usually means having advanced users who can contribute back bug reports, code reviews, and code. Thus, a non-negligible part of this effort is outreach and community building for both clang-repl and Cling. Cling has been lucky to have people donating their time to help it move towards LLVM mainline. Here I want to thank all of them, in particular Raphael Isemann, Jonas Hahnfeld and Pratyush Das, who have each dedicated significant time to help our efforts and thereby reduce the accumulated technical debt in HEP.
The research and development efforts towards an interactive and incremental C++ in ROOT resulted in Cling, which became a cornerstone for data analysis in the field of HEP. Technical advancements in Cling enable new, previously unthought-of abilities for Clang and C++, such as template instantiations on demand, reflection, and language interoperability.
Thanks to support from CERN, USCMS, DIANA-HEP, Intel, the “technical debt” in the initial Cling implementation has been significantly reduced. Even so, much of that work is still ahead of us.
Cling is now used outside of HEP. We are excited to be working towards making it available to an even broader audience, for instance by increasing Cling’s ties with the LLVM project, while feeding back advancements from other communities to HEP through Cling and ROOT.
The author would like to thank Axel Naumann and David Lange who contributed to this post. You can find out more about our activities at https://compiler-research.org and https://root.cern/cling/
]]>Prior to the 6.20 release, a user couldn’t redefine a function, variable, or class whose definition had already been provided in a particular interpreter session. If you have used ROOT for quite some time, it’s almost certain that you have seen this already:
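For illustration, a session at the ROOT prompt used to go roughly like this (a sketch; the exact diagnostic text differs between versions):
root [0] int x = 1;
root [1] int x = 2;
error: redefinition of 'x'
note: previous definition is here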
While this behavior is expected from an ISO-compliant C++ compiler, it is not convenient for interpreted C++, where users expect behavior closer to a scripting language like Python. This issue was especially visible in Jupyter notebooks, where cells that provided a definition couldn’t be edited and re-run without restarting the C++ kernel. We knew it was annoying and we fixed it in the 6.20 release.
No. Support for redefinitions is automatically enabled for the ROOT prompt and Jupyter notebooks as of 6.20. Therefore, the following is now legal in a ROOT interpreter session:
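A minimal sketch of what the interpreter now accepts, typed line by line at the prompt:
root [0] int x = 1;
root [1] double x = 3.14;   // same name, different type: now fine
root [2] auto h = new TH1D("h", "demo", 100, 0., 1.);
root [3] auto h = new TH1D("h", "demo", 50, 0., 2.);   // re-running a definition: also fine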
However if you are using Cling standalone, this feature is considered optional and thus disabled at startup. In any case, you can manually turn it on/off as follows:
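For standalone Cling, a sketch of the toggle could look like the following. It assumes a runtime option named AllowRedefinition on cling::Interpreter, so please verify the exact API against your Cling version's headers before relying on it:
// Assumption: the interpreter exposes its redefinition switch via RuntimeOptions.
#include "cling/Interpreter/Interpreter.h"

void setRedefinitionsAllowed(cling::Interpreter &interp, bool allowed) {
   interp.getRuntimeOptions().AllowRedefinition = allowed ? 1 : 0;
}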
Formally, the ISO C++ one-definition-rule (ODR) forbids multiple definitions in order to ensure a consistent view of an entity across different translation units. The technique implemented in Cling does not, however, violate the ODR as each definition is internally enclosed in its own namespace. This ensures the uniqueness of the qualified (and mangled) name of each definition. The trick is completed by making the latest definition available in the global scope by fixing up the translation unit lookup table.
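Conceptually, the effect is as if each definition were wrapped like this (an illustration, not the code Cling actually generates):
// Each top-level definition lands in its own unique namespace, so both symbols
// keep distinct qualified and mangled names and the ODR is preserved.
namespace __cling_def_0 { int x = 1; }
namespace __cling_def_1 { double x = 3.14; }
// The translation unit's lookup table is then fixed up so that an unqualified
// `x` resolves to the latest definition:
using __cling_def_1::x;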
For more information, you can take a look at Cling issue #259, where part of the brainstorming took place. Also, you can refer to our conference paper published in the Proceedings of the 29th International Conference on Compiler Construction (CC 2020).
This feature allows a user to redefine functions, variables, or classes declared within the same interpreter session. We hope that our users will enjoy this as much as we enjoyed implementing it.
Special thanks go to Chandler Carruth, Axel Huebl et al. for providing some initial ideas on which the final design was built, and to Axel Naumann and Vassil Vassilev for reviewing the implementation and the submitted paper.
]]>ROOT’s source repository now has a latest-stable branch that will be regularly updated after each release. If you want to know more, keep reading!
We have listened to you! This is for people who want the latest, without trying out things before they are released: a new major version twice a year, and bug-fix releases in between, i.e. always the latest version that went through release validation.
Before this change, the instructions to build from source suggested cloning the latest tagged release (or patches branch). This turns out to be inconvenient, as the tag/branch names for the latest available version change after each release: e.g., as of this writing, the latest tagged release is v6-24-00, which will soon be superseded by v6-26-00. As a consequence, a user who wants to build the latest release from source had to check the tag/branch name before issuing the git clone command.
Starting with the 6.24 release, we created the latest-stable branch, which is targeted at users that regularly build ROOT from source. Furthermore, we will automatically update this branch to the latest tagged release, so users that have already checked it out get the update for free.
Before getting hands-on, keep in mind that building from source is for advanced users. The preferred method for regular users to install ROOT is via pre-compiled packages. More on that can be found in the install guide.
That said, our aim was to make this really simple for those who are used to building ROOT from source. And that’s now as simple as:
$ git clone --branch latest-stable https://github.com/root-project/root.git root_src
Then, you can follow the rest of the instructions in build from source as usual.
But we didn’t stop there. This branch will be updated regularly after each release, which means that you can easily upgrade ROOT to the latest release by simply:
$ git pull
$ cd <empty build dir>
$ # build as usual
We hope that this change saves time (and avoids issues) for users that build ROOT from source. If you face any problem while using this new branch, please feel free to report it here.
Special thanks go to Jonas Hahnfeld for our discussions on the optimal approach, and to the rest of the ROOT team for providing useful feedback.
]]>Generally we have two big releases per year - but v6.24/00 took a bit longer: we really wanted to have the upgrade to LLVM 9 in, for full C++17 support and many bug fixes!
But there’s more:
ROOT::RDF::RunGraphs can run multiple RDataFrames in parallel.
That’s an easy way to evaluate, for instance, uncertainty variations concurrently.
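A minimal sketch of how that looks in C++ (file, tree and column names are placeholders):
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RDFHelpers.hxx> // ROOT::RDF::RunGraphs

void runGraphs()
{
   ROOT::EnableImplicitMT();
   ROOT::RDataFrame nominal("events", "nominal.root");
   ROOT::RDataFrame variation("events", "variation_up.root");

   // Book results lazily; no event loop runs yet.
   auto hNominal = nominal.Histo1D("x");
   auto hVariation = variation.Histo1D("x");

   // Trigger both computation graphs concurrently instead of one after the other.
   ROOT::RDF::RunGraphs({hNominal, hVariation});
}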
And look at this:
RDataFrame = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame
It’s running your RDataFrame on a Spark cluster!
To see how it works, check the new tutorial.
Once we are happy, this will be our recommended replacement for good old PROOF.
We have converted the blazingly fast RANLUX++ implementation from x86 assembly to portable C++. And it’s just as fast as the assembly version! It will be presented at vCHEP - including a fix for a bug in the assembly version, discovered by Martin Lüscher, the original author of RANLUX.
Still in the math department, TMVA can now interface with PyTorch as a more flexible alternative to PyKeras.
RooFit has several big improvements under the hood; we expect that since v6.20, typical uses of RooFit will be accelerated by a factor of 4 to 16!
Several of you were using community-developed implementations called RooDSCBShape and RooSDSCBShape. We have integrated them (and significantly improved them!) as RooCrystalBall, so please switch!
If you want to see all the new features and all the squashed bugs then please check the release notes.
To get ROOT v6.24, you could for instance use conda, snap, MacPorts, Homebrew (soon), or some of the other ways (including downloading binaries).
If you’re using the LCG release: ROOT 6.24/00 is part of LCG100.
At the beginning of every year, the ROOT team discusses with the experiments what we will be working on during that year. We publish that, so you can track how often we manage to actually do what we plan to do! More seriously, it’s our way of inviting feedback for you to influence our priorities. So let’s do that: I tell you what we plan to work on, and you tell us what’s missing, and what’s more important than other items!
For us, for 2021, the main items are:
- distributed RDataFrame, adding for instance Dask support;
- feeding RDataFrame into machine learning through Python generators;
- running several RDataFrame computations within a single event loop;
- bookkeeping for RDataFrame results, so you can easily handle histograms from MC vs data samples;
- the new RBrowser by default.
There are a couple of “high intensity” developments ongoing for LHC’s high-luminosity future, such as RNTuple, and more “software engineering” or framework-oriented ones such as modular CMake “superbuilds”, interpreter debug symbols and optimization - but those should either “just work” or will only arrive in your ROOT in the future.
As you can see and as you probably know, some developments need lots of time:
we need to do research, we need to benchmark, collect usage feedback, compare with alternatives, etc.
And we don’t always have the helping and coding hands we need.
So even if you don’t see ROOT’s new graphics system and new histograms up there: they are expected to progress, also in 2021.
We think they are crucial to make ROOT easier to use and future-proof, but we will only blog about them once we think you should try them out.
Please let us know below if ROOT is missing a crucial feature, for you, your analysis, or your way of working!
We have this new web site, people seem to like it, but it has a big problem: The 404-page isn’t customized! As we are using Jekyll, it’s super-easy to do. We just never felt like that’s more important than whatever else we have to do :-)
If you submit a pull request against our web site repo for a custom ROOT 404 page, and we select your proposal, then we’ll be sending you a Raspberry Pi Zero WH (i.e. with wifi, bluetooth, and pre-soldered GPIO headers), leftover from an earlier ROOT workshop, for you to play with, complete with HDMI and USB OTG adapter and SD card! And yes, that totally runs ROOT, too! If you win, the whole community will be enjoying your page whenever people look at ROOT forum posts from 20 years ago!
]]>A few weeks ago, I published a blog post about ROOT File Viewer, an extension to view ROOT Files directly in VS Code. In the discussion of said post, the question of how to run ROOT Macros in VS Code arose, and I came up with an example repository to show how that can be accomplished. Let us now walk through this example repository and see how everything works.
To get started, we will clone and open the repository. This can be done from the command line with:
git clone https://github.com/AlbertoPdRF/root-on-vscode.git
cd root-on-vscode
code .
After this is done, we need to manually set the path to our local ROOT installation in two files. We can do that quickly with the editor’s convenient global search and replace functionality:
- Open the Search view with Ctrl + Shift + F or by clicking on the Search icon in the Activity Bar
- Search for /home/apdrf/programs/root-6.22/install; two occurrences should show up
- Toggle Replace by clicking on the caret on the left of the Search input box
- Type the path to your local ROOT installation in the Replace input box
- Press Ctrl + Alt + Enter or click on Replace All on the right of the input box
We are almost ready to see everything in action, so let us open the workspace and get to it! There are a few ways to do that, but probably the simplest one is to:
- Open the root-on-vscode.code-workspace file, which is located inside the .vscode folder
- Click on Open Workspace
After we have opened the workspace, a toast notification will appear at the bottom right of the editor asking us if we want to install the recommended extensions for the repository. Clicking on the Install button will install both the C/C++ and ROOT File Viewer extensions. Please note that the second one is not strictly required for everything to work; it is just listed as a recommendation for convenience.
And that is everything that is needed! Just press F5 and see it for yourself: the example hsimple.C ROOT Macro will run and the hsimple.root file will be recreated.
With everything running, we can now take advantage of some of the awesome functionalities that we mentioned before to develop our ROOT Macros, for instance clang-format to always get our code styled as we want it. But I know this all feels like magic, so if you want to know how everything works, keep reading!
What we do in this example is to define a VS Code Workspace with the necessary configuration for everything to work, so let us first see what a workspace is and how it is configured.
A VS Code Workspace is simply a collection of (one or more) folders that are opened in a VS Code instance (a window). For this example, we have defined a root-on-vscode workspace through the root-on-vscode.code-workspace JSON file located in the .vscode folder. Said file contains the following configuration:
- folders: here we define the path to our workspace folder(s), relative to the location of this same file
- settings: through this object we tell VS Code to treat files with the .C extension as C++ files, and the path where it has to search for our header files, which in this case is ROOT’s include directory
- extensions: here is where we recommend the C/C++ and ROOT File Viewer extensions to be installed for this workspace
Apart from configuring the settings of the workspace, we have also defined a launch configuration, which is how we are going to be able to run ROOT Macros directly within VS Code. This is done in the launch.json file, also located in the .vscode folder. From this file I will just mention a few of its key points:
- It uses the gdb debugger under the hood
- It launches the root.exe executable
- It passes the flag -l to avoid showing ROOT’s banner
- It passes the flag -q so the program quits after it finishes processing the macro
- It passes hsimple.C+g to tell the program which macro to run and to compile it with debugging symbols – this is what allows us to set break points through the macro
And this is basically it; the rest of the things that the repository includes are:
- A .gitignore file to not commit compilation artifacts to the repository
- The hsimple.C macro from the ROOT Tutorials
- The resulting hsimple.root file
- A README.md file with some basic information
With this blog post I just wanted to quickly illustrate how we can configure VS Code to run ROOT Macros directly in it. Doing so allows us to take advantage of some great functionalities of the editor that will make our lives way easier, and this way we can focus on what truly matters!
]]>With the coming release of ROOT v6-24-00 we are excited to launch a brand new PyTorch Interface for TMVA.
PyTorch is a Python-based scientific package supporting automatic differentiation: an open-source machine learning framework that accelerates the path from research prototyping to production deployment.
TMVA already has a PyKeras interface which we all love, especially with Keras’ simple high-level TensorFlow API. If your work involves some elementary experiments, Keras may be the go-to framework due to its plug-and-play spirit.
But things get interesting when one requires low-level control and flexibility. That’s when the argument for Keras starts to hold less water. PyTorch, on the other hand, is amazing in terms of the control, flexibility and raw power that it can provide to the user. Its lower-level approach is better suited for the more mathematically-inclined users.
PyTorch is widely used among researchers and hence has a large community around it.
ROOT + PyTorch: Allows to integrate ROOT methods which are good at handling HEP data and PyTorch for Machine Learning.
Power & Flexibility: Neural nets are not easy to develop using TMVA, as they require complex configuration strings. Even with the PyKeras interface, designing custom layers is not feasible. PyTorch offers the power and flexibility to achieve complex models with custom layers, optimizers, loss functions and training methodologies.
Ease of Debugging: PyTorch models make use of dynamic computation graphs and are based on eager execution. This makes it easier to use debugging tools like pdb.
Performance: PyTorch is extremely fast due to its highly optimized C++ backend.
Designing a simple model in PyTorch using one of its containers is straightforward. Here we use nn.Sequential:
import torch.nn as nn

model = nn.Sequential()
model.add_module('linear_1', nn.Linear(in_features=4, out_features=64))
model.add_module('relu', nn.ReLU())
model.add_module('linear_2', nn.Linear(in_features=64, out_features=2))
model.add_module('softmax', nn.Softmax(dim=1))
See PyTorch docs for more tutorials.
As we mentioned earlier, the power and flexibility come in the form of designing custom layers as well as writing a custom training loop.
import torch

loss = torch.nn.MSELoss()
optimizer = torch.optim.SGD
def train(model, train_loader, val_loader, num_epochs,
batch_size, optimizer, criterion, save_best, scheduler):
...
def predict(model, test_X, batch_size=32):
...
Defining a load_model_custom_objects dictionary with the keys "optimizer", "criterion", "train_func" and "predict_func" is the only extra step required when using the PyTorch interface in TMVA. Everything else is native PyTorch or TMVA.
load_model_custom_objects = {"optimizer": optimizer, "criterion": loss,
"train_func": train, "predict_func": predict}
In the end we book our TMVA method with the kPyTorch type and call the training method.
factory.BookMethod(dataloader, TMVA.Types.kPyTorch, 'PyTorch',
'H:!V:VarTransform=D,G:FilenameModel=model.pt:'
'NumEpochs=20:BatchSize=32')
factory.TrainAllMethods()
You can check out a more detailed tutorial here, as well as some examples in the ROOT repository. Read more about the development journey of the TMVA PyTorch interface on Anirudh’s GSoC blog.
]]>Before diving into the details, let us just quickly see the extension in action. Although the saying goes that “an image is worth a thousand words”, I think that a GIF will serve us better this time:
After seeing this, who would still want to open a terminal, launch ROOT, and create a TBrowser instead? And I haven’t even mentioned yet that for this extension to work no local installation of ROOT is required!
If you just want to install the extension already and have a good time opening all your ROOT Files, you can do so by:
- Opening Quick Open (Ctrl + P), pasting ext install albertopdrf.root-file-viewer, and pressing enter
- Searching for ROOT File Viewer directly within VS Code’s Extensions view (Ctrl + Shift + X or clicking on the Extensions icon in the Activity Bar)
- Running code --install-extension albertopdrf.root-file-viewer from the command line
And if you want to know more, just keep reading!
VS Code is a free, open source, and very popular code editor developed by Microsoft with TypeScript, a superset of JavaScript. It runs on any operating system, supports many languages, has built-in support for Git, and much more. Moreover, its functionality can easily be extended thanks to the Extension API. This is where the fun begins!
In the case of the ROOT File Viewer extension, the Custom Editor API is leveraged to handle ROOT Files. The custom editor requires two parts: a view and a document model. The view of the file is implemented through the Webview API, and the document model is a custom RootFileDocument class, which we will keep simple by implementing a CustomReadonlyEditorProvider, the RootFileEditorProvider. We could go deep into details here, but that is probably outside of the scope of this post.
JavaScript ROOT brings ROOT to the browser. It is basically a drawing and I/O library that can be used to provide interactive plots and many other ROOT core functionalities, as can be seen on the examples page.
ROOT File Viewer makes use of the HierarchyPainter object to do all the heavy lifting regarding the handling of the ROOT files and the drawing of the objects stored in them. It is configured with the tabs layout, and it gets passed the user’s VS Code theme background color so it integrates better with the editor.
The implementation of the extension boils down to gluing the two awesome tools mentioned above together and, in all honesty, I tried to keep everything as simple as possible in order to have a proof of concept up and running quickly. All the magic happens in the rootFileEditor.ts file, which contains both the implementation of the custom document and the webview.
The RootFileDocument custom document is the object that gets created each time a user opens a ROOT File. For what concerns us, it stores the path to the file that we want to create a view for.
The RootFileEditorProvider is where all the functionality is implemented, which can be summarized as:
This last point is where JavaScript ROOT comes into play, as all the custom editor provider does at this point is to create a template HTML document with an embedded script where JavaScript ROOT gets passed the ROOT File path and the customization options mentioned before. Everything else just automagically works!
If you would like, you can check out (and even contribute to) the source code on ROOT File Viewer’s GitHub repository. And, of course, you also can (and are encouraged to) open an issue if a bug arises or you have a feature suggestion!
VS Code extensions receive automatic updates, so rest assured that you won’t miss any cool future features that may come!
To wrap everything up, with ROOT File Viewer I wanted to solve a pain point that I believe exists for more people than just me. I hope that glancing over the contents of a ROOT File is quicker and more practical now that this extension exists.
Working with such awesome tools as VS Code and JavaScript ROOT has been a ton of fun, and I would definitely recommend it to the geeks out there who enjoy getting to know new technologies and like to build things for people to interact with.
And, lastly, I would like to thank you for dedicating the time to read this post and everyone who has shown their support after the launch of ROOT File Viewer!
]]>Take a look at the store listing at https://snapcraft.io/root-framework, where you can find installation instructions for some common distributions, e.g. Ubuntu, Debian, Fedora, OpenSUSE, CentOS, Arch, and more!
On Ubuntu, it is as simple as:
sudo snap install root-framework && root
You might even be able to just search for the ROOT Framework and install it in a single click!
This is a full-fat installation of ROOT, complete with its utilities such as hadd, PyROOT via Python 3.8 (with SciPy, NumPy, Pandas and Matplotlib), and JupyROOT.
You will get these bundled by default, and since the whole package is based on container technology, they cannot interfere with any of your system libraries and can be easily upgraded (automatically!), removed, and mixed alongside other ROOT installations.
Just run root in the terminal after installation and you can get to work instantly. Give root --notebook a go and try out the JupyROOT support.
As a special case, if you want PyROOT, you must run pyroot rather than python. This ensures you get the bundled version of Python in the container rather than the host system, but from there you can import ROOT normally and run your scripts. You can also pass parameters to pyroot as if it were python, e.g. pyroot -i $(root-config --tutdir)/pyroot/fillrandom.py.
There is no need to mount the $HOME directory, graphical support should work by default, and a lot of optional packages are built by default.
The goal is to provide a Docker-like experience for ROOT but blur the distinction between the container and host environment, in a way that is convenient for users.
For example, by simply adding a shortcut to start ROOT in the start menu of most systems under the science section.
Most snap packages are under a sandboxing model that might subtly interfere with a user’s regular workflow.
This means that the ROOT snap cannot just access a user’s camera or microphone, for example, since this makes little sense for ROOT. However, one notable consequence is that ROOT is limited to accessing files in the user’s home directory (aside from over the network). Furthermore, the snap will be prevented from accessing hidden files/folders in the top level of the home directory itself, such as $HOME/.ssh.
To help make this work, the $HOME variable and gSystem->HomeDirectory() will return a modified value for the user’s home directory, generally /home/example/snap/root-framework/current/. If you want to make use of rootlogon.C for the entire application, keep in mind that ROOT will look for it in $HOME, which points there instead. If you make use of the parallel installation ability mentioned below, this can be an advantage, as each installed version of the ROOT snap will have a unique $HOME, and may have different rootlogon.C files, history, etc.
for example.
The value for the current working directory works the same as normal.
If ROOT is opened with the current working directory set to to /home/example
, you can access your desktop folder as simply ./desktop
.
$ pwd
/home/james
$ echo "Hello World" >> Example.txt
$ root
------------------------------------------------------------------
| Welcome to ROOT 6.22/06 https://root.cern |
| (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Jan 08 2021, 20:08:00 |
| From tags/v6-22-06@v6-22-06 |
| Try '.help', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
root [0] .!cat Example.txt
Hello World
root [1] .!pwd
/home/james
Snaps update automatically, using delta patches to be bandwidth efficient. Due to the container properties, this should be a safe operation, so that jumping from one version of ROOT to the next can be done even if it upgrades to an entirely different compiler toolchain.
Nightly builds are produced and accessible with sudo snap install root-framework --edge. If you are already using the snap and want to swap to the edge branch, use sudo snap refresh root-framework --channel=edge.
A track in Snapcraft terms is a separate branch of a project that can be downloaded instead of the default release. The default release is called “latest”, and its stable channel will generally follow the newest stable ROOT release. As a result, users will automatically update to newer branches of ROOT. However, in some scenarios people may like to use an older release, and tracks could be used to provide this in the future.
If there is a demand to produce these tracks, please provide some feedback and it can be looked into! The following example syntax would be usable if/when tracks are declared.
sudo snap install root-framework --channel=v6-22/stable
https://snapcraft.io/docs/parallel-installs
https://snapcraft.io/docs/commands-and-aliases
This feature is still experimental, but it is possible to have both the ROOT stable snap and the ROOT nightly snap alongside each other. However to make proper use of this functionality, it helps to understand Snap aliases.
The root command itself is an alias, because the snap package is called root-framework. When installed from the Snap Store, an alias is created automatically between root and root-framework. For the other binaries such as hadd, the original names are namespaced, so hadd in the namespaced form is root-framework.hadd.
When you install extra instances of a snap, you must decide which aliases you wish to use manually.
sudo snap set system experimental.parallel-instances=true
sudo snap install root-framework
sudo snap install root-framework_nightly --edge --unaliased
While root will point to the stable version, you can run sudo snap prefer root-framework_nightly so that the next invocation of root will be from the nightly branch. You can also alias individual commands, or simply use the unaliased binary names, such as root-framework_nightly.root. It is preferable to use Snap aliases rather than Bash aliases, as a snap alias will affect all system users. The additional installations of snaps will all have their own unique $HOME values, so they can have differing rootlogon.C files and different history for each snap instance.
https://discourse.ubuntu.com/t/using-snapd-in-wsl2/12113
Officially, running snaps on WSL2 is unsupported. This is because WSL2 has a custom init system rather than systemd. Unofficially, it is possible to get it working reasonably well anyway, but this is not directly supported.
Having personally tried it, it is possible to get JupyROOT in the snap running in a web browser on Windows, and for some people, the snap on WSL2 might make sense.
Other virtual machine platforms, such as Virtualbox should be able to install Snaps without issues.
At the moment, CUDA is not supported, but this may change in the future.
OpenGL generally works fine in a Snap environment for most GPUs. Notable exceptions are the amdgpu-pro drivers, and the NVIDIA proprietary drivers on Debian and Debian derivatives (but excluding Ubuntu and Ubuntu derivatives). These issues can be resolved pending upstream work in the future.
There should not be any observable performance difference between the snap version of ROOT and any other version: a macro that takes an hour to run outside the snap should take an hour to run inside it.
Creating independent executables is not supported in the snap environment. The ABI is not stable, the compiler toolchain will be foreign to most systems, and the automatic updates would ruin this regularly even if you managed to hack it into working. If this is essential to your workflow, you are likely better suited with an alternative package.
Executing binaries from outside the snap environment from inside the environment itself will not work due to the sandboxing, and the image itself is by default inflexible, so that adding more Python modules for example involves either rebuilding the snap or using debug modes. If there are binaries and packages that might make sense inside the container, please give feedback and they can be considered to be default!
If you want to change the CMake parameters, add some extra packages or some extra Python modules, you might be pleasantly surprised with the Snapcraft build system and there are some instructions on my personal GitHub page on how to do it. https://github.com/MrCarroll/root-snap
Because the snap purposefully keeps its files away from the normal system, IDEs do not work with the ROOT snap. Consider using root --notebook to access JupyROOT for an IDE-like experience. If this is insufficient, you are likely better suited with an alternative package.
Currently the ROOT snap is only built for AMD64/x86_64.
Snaps can work on other architectures, if there is a demand for alternative architectures such as ARM64, please give feedback and it can be considered. In the meantime, the Snapcraft build system should generally work on ARM64 and various other architectures, so you might be able to compile your own.
In summary, I hope there are a lot of users for whom a Snap package of ROOT might make sense. Prior to uploading this blog post, there are already several thousand downloads, going well above my personal expectations, and the issue tracker has not crashed yet so I am hopeful that it is being successful in helping get ROOT into people’s hands.
Please feel free to give your feedback on this package. Whilst not everything will be actionable, knowing what issues people have can help guide future improvements. In particular, feedback about additional python modules, issues with the sandboxing, and performance regressions are appreciated, though any feedback at all would be very much appreciated. You can get in touch with me on the ROOT forums as @james-carroll; or feel free to report issues at https://github.com/MrCarroll/root-snap, where you can also find information on building your own custom ROOT snap.
Special thanks go to Axel Naumann for being responsive and helping to reduce the bus factor of this package; thanks to my good friend Theodore Zorbas for giving me the inspiration to tackle this project and being my guinea pig for testing it, thanks to the ROOT community for already investing significant time in making ROOT easier to package, thanks to Canonical for the Snapcraft tooling, hosting, and build servers, and thanks to GitHub for their hosting and build servers too! Between Canonical and GitHub this entire package is built and distributed free of cost.
]]>The added TRandomRanluxpp joins the group of available RNGs in ROOT:
- TRandom1 is the RANLUX generator proposed by Martin Lüscher. The implementation is based on the original description by Fred James and generates single precision values based on 24 bits of randomness.
- TRandom3 is based on the Mersenne Twister generator. By default, this implementation is used for gRandom and generates 32 bits of randomness.
- TRandomMixMax is a 61-bit matrix recursive MIXMAX generator (described in https://mixmax.hepforge.org/). It is based on a state of size 240; other sizes are available via MixMaxEngine.
There are a few more variants that are linked in the documentation.
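Using the new generator follows the usual TRandom interface. A minimal sketch (the header name is an assumption, based on the other TRandomGen-based typedefs; check the class documentation):
#include "TRandomGen.h" // assumption: TRandomRanluxpp is declared here

void useRanluxpp()
{
   TRandomRanluxpp rng;
   rng.SetSeed(42);
   double u = rng.Rndm();       // uniform in (0, 1)
   double g = rng.Gaus(0., 1.); // standard normal via the common TRandom interface
   (void)u; (void)g;
}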
There exist many properties by which to evaluate PRNGs, the easiest being the period: it captures how many random numbers can be generated before the sequence “wraps around” and repeats. For obvious reasons, the period of a PRNG cannot exceed the number of distinct states, i.e. 2^n for a state of n bits. This is one of the problems of the oldest LCGs with a state of 32 bits or less.
However, there are many more statistical properties that are at least equally important. A widely used collection of empirical tests is TestU01 by L’Ecuyer and Simard. It has to be noted that the tests can only prove the existence of defects. That is, passing all tests does not mean that a PRNG is free of errors.
As with many other design issues, PRNGs need to make a certain trade-off: fast generators with a small state will usually fail many statistical tests. On the other hand, good generators are typically slower or use a much larger state. In fact, most generators will fail one test or another, including the widely used Mersenne Twister. Notable exceptions are RANLUX and MIXMAX that pass all currently known tests.
RANLUX++ is an LCG implementation derived from the ideas of RANLUX. It aims to provide better performance while maintaining the good properties of RANLUX. In particular, this includes the foundation based on mixing of classical mechanical systems. For this reason, the randomness of the generated sequence can be theoretically understood. The formulation as LCG makes it possible to reach even better properties without impact on performance.
The generator uses a single 576 bit number as state. When invoked, it hands out the next unused bits that the generator remembers as an offset. Once all bits are exhausted, the LCG performs a multiplication and a modulo operation to advance the state. Arithmetic on 576 bit numbers is way beyond what current hardware offers. Instead the state is portably stored as 9 numbers of 64 bits each. On this representation, multiplication and the modulo operation must thus be done in software. For the multiplication, this is very similar to the way multiplications are taught in high school. The modulo operation can take advantage of the known modulus (for details, refer to section 2.1 of the paper).
For efficiency reasons, Sibidanov implemented the routines in x86 assembly. This results in impressive performance, but it is not portable across all architectures supported by ROOT. So how much slower would a portable implementation be? After implementing RANLUX++ for ROOT, it turns out: not much!
The initial version added to ROOT takes around 30ns per random double sampled.
For comparison, the implementation by Sibidanov comes in at around 8ns per number.
This was measured with 10 repetitions of a microbenchmark with a standard deviation of less than 1%.
The numbers are from a single core of an AMD Ryzen 9 3900 running CentOS 8.2.2004.
The default compiler is GCC version 8.3.1, but I will present more numbers in the following.
30ns per number is already quite impressive and twice as fast compared to TRandom1.
That implementation of RANLUX takes around 62ns per number and only generates single precision values.
It must be noted that a newer version by Lüscher was also measured at 30ns for double values.
That said, can we do even better? Yes!
The graph below shows the evolution of performance after tuning. In addition to GCC 8.3.1, I also include numbers for Clang 9.0.1 and GCC 7.5.0. While the former two are packaged for CentOS 8, I compiled GCC 7.5.0 from the official sources. The dashed horizontal line represents the assembly version by Sibidanov. Details about the individual changes are described in the following sections.
The portable code relies on loops to implement the multiplication and modulo operation in software. This induces overhead when evaluating the exit condition and incrementing the loop variable. Furthermore, the loop structure leads to jumps in the code that hinder instruction level parallelism. Fortunately loop unrolling is a well known compiler optimization to tackle this problem: it duplicates the loop body to enable other optimization passes and generate more efficient code. In case of constant loop bounds, it is even possible to fully unroll the loop.
One disadvantage of loop unrolling is the increased code size.
For this reason, compilers are conservative in their unrolling heuristics.
During experiments, I found that unrolling of certain loops consistently improves performance.
By adding #pragma directives, it is possible to help the compiler.
As can be seen in above graph, this improves performance for GCC 8.3.1 and Clang 9.0.1 by up to 30%.
GCC 7.5.0 does not benefit from the change because #pragma GCC unroll was only added in GCC 8.
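To illustrate the kind of hint involved (trip count and guards chosen for illustration, not taken from the actual RANLUX++ sources):
#include <cstdint>

void addLimbs(uint64_t *acc, const uint64_t *x)
{
// Fully unroll the fixed-length limb loop; GCC 8+ understands "#pragma GCC unroll N",
// while Clang accepts "#pragma unroll" instead.
#if defined(__clang__)
#pragma unroll
#elif defined(__GNUC__) && __GNUC__ >= 8
#pragma GCC unroll 9
#endif
   for (int i = 0; i < 9; ++i)
      acc[i] += x[i];
}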
For the multiplication of two 576 bit numbers, the portable code is constrained by the available data types:
it can at most multiply two 32 bit values at a time to stay within the 64 bits of uint64_t.
This makes it necessary to handle overflows when summing up the partial results.
By arranging the operations differently, it is possible to avoid conditional execution.
Together with full unrolling, this eliminates all jumps from the generated machine code for GCC 8.3.1.
During execution, this leads to performance improvements of up to 15% for the two versions of GCC.
There is no change for Clang because its optimizations already transformed the original code.
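The constraint can be illustrated with a small helper (illustrative only, not the actual RANLUX++ code):
#include <cstdint>

// Multiplying two 32-bit values always fits into a uint64_t, so the partial
// product itself cannot overflow; it is then split into low and high halves
// that are accumulated separately.
inline void mul32x32(uint32_t a, uint32_t b, uint32_t &hi, uint32_t &lo)
{
   uint64_t p = static_cast<uint64_t>(a) * b;
   lo = static_cast<uint32_t>(p);
   hi = static_cast<uint32_t>(p >> 32);
}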
An even faster way is to let hardware handle the multiplication of two 64 bit numbers.
This is possible with the extension type __int128 supported by both GCC and Clang.
Using this type, it is possible to multiply 64 bit numbers with 128 bit precision.
For that, the compiler can make use of special instructions available for the targeted hardware.
Afterwards the upper and lower halves can be extracted as 64-bit numbers.
The change taking advantage of __int128 improves performance of all three compilers by more than 40%!
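In code, the idea looks roughly like this (a sketch, not the actual implementation):
#include <cstdint>

// One hardware 64x64 -> 128 bit multiplication replaces several 32x32 ones;
// the upper and lower halves are then extracted as 64-bit numbers.
inline void mul64x64(uint64_t a, uint64_t b, uint64_t &hi, uint64_t &lo)
{
   unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
   lo = static_cast<uint64_t>(p);
   hi = static_cast<uint64_t>(p >> 64);
}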
Additional tests showed inferior performance with older compilers.
Part of this is related to missing loop unrolling because the #pragma is not available (see above).
An additional problem is the generation of jumping code to propagate carry bits in case of overflows.
Instead of relying on inline assembly code, I found a portable solution after refactoring the common code.
The modification results in good code for all tested compilers.
For the older GCC 7.5.0 in particular, it improves performance by more than 20%.
In this blog post, I discussed the implementation of RANLUX++ and its performance.
After tuning, the average time per number is down to around 9ns, very close to the assembly version.
It is slightly slower than TRandom3, currently used for gRandom (around 3ns per number).
However, TRandom3 generates 32 bits of randomness while TRandomRanluxpp uses 52 bits per double value.
Furthermore, RANLUX++ inherits and extends the mathematically proven properties of RANLUX.
As it also passes all tests in TestU01, it might replace the default gRandom in the future.
But you do not need to wait until then: TRandomRanluxpp will be available with version 6.24 of ROOT.
Once released, you can replace the default generator as described in the documentation.
Or use a nightly version right now via LCG or Conda!
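As for swapping the global generator, that works the same way as for any other TRandom subclass; a minimal sketch (header name as assumed above):
#include "TRandom.h"
#include "TRandomGen.h"

void switchToRanluxpp()
{
   delete gRandom;                  // dispose of the default TRandom3 instance
   gRandom = new TRandomRanluxpp(); // from now on gRandom->Rndm() uses RANLUX++
}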
We have investigated virus scanner reports on some of ROOT’s binaries. The ROOT team has invested about 2.5 work days into this investigation: we took this seriously. On top of that, CERN IT’s security team has investigated this independently, and also invested a notable amount of working hours. We were able to create binaries that were diagnosed as infected, by building hadd.exe “from scratch”. We are convinced that these reports are false positives. We found out that upgrading the compiler works around the issue.
We could report this as a false positive to the vendors whose virus scanners were flagging some of ROOT’s binaries. This has a high latency, especially when adding the latency between the virus scanner engines tweaking their patterns and Google removing root.cern from the list of flagged sites.
We could argue with Google. This does not seem like a fast track option either.
Instead, we have removed the file in question. We are upgrading the compiler on our build machines, which means that we cannot create Windows binaries for new patch releases of ROOT 6.20 and before: only newer ROOT versions can be built with the newest Visual Studio version.
Given the removal of the file reported by Google, we have asked Google for a review of root.cern. As Google puts it, “A review can take from a few days to a few weeks to complete.”
If you have ideas how to further improve the situation, please let us know by adding a comment below.
Please rest assured that we build our binaries on always up-to-date Windows installations running updated virus scanners, and that we do whatever we can to keep our binaries clean. This is the second time in 20+ years that ROOT files have been misdiagnosed as infected. To address this, we will scan binaries proactively on VirusTotal from here on, before offering them for download, to avoid false positives affecting us again.
Axel, for the ROOT team.
]]>To make it easier for you to submit issues, we have now switched from Jira to GitHub Issues.
We hope that having a GitHub account is so common these days that you’re already logged in and used to GitHub’s interfaces.
Starting today, new bugs will only be reported in GitHub. What has been reported in Jira remains in Jira; we will keep fixing these Jira issues until they are all resolved. Jira now displays a message, asking you to use GitHub for issue submission instead.
For us, that change will help in triaging issues, tracking progress, and most importantly bringing issue handling way closer to our code. Pull requests, commits, issues, code: they can all cross-reference now, giving a consistent view.
Moving to a closed-source universe like GitHub isn’t exactly obvious for us. (But then again we’re coming from Jira…) On the other hand we did not want to use CERN’s GitLab because that requires a CERN account for issue submission, and exactly that was one of the biggest problems with Jira.
We hope that the move is not too painful for you. Maybe you even appreciate it! Let us know what we can do to make bug submission easier for you.
]]>If you are a ROOT power user with no time to lose, or you just want to quickly try out that new feature you heard about, or you really need a bug fix that was merged yesterday to get those fancy plots done, I have good news for you! Thanks to the great folks behind ROOT’s conda package, not only can you install the latest ROOT stable version on your computer in under 5 minutes, but from today you can also install the bleeding-edge, unreleased development version of ROOT from yesterday with the following one-liner:
$ conda create --name root-nightly-env -c conda-forge -c https://root.cern/download/conda-nightly/latest root-nightly
That command creates a Conda environment called root-nightly-env which contains the very latest ROOT with the very latest goodies.
To activate the environment and use that ROOT version, just call
$ conda activate root-nightly-env
The usual disclaimer about nightly builds applies: minor bugs might creep in and features might not yet be production-ready.
A big thank you goes to Chris Burr for all the help in making this happen. Also check out this nice blog post from Henry Schreiner, another maintainer of the ROOT Conda package, about the ins and outs of ROOT+Conda. All available distribution channels for ROOT nightly builds are listed at https://root.cern/install/nightlies.
Let us know what you think by clicking the comment button below!
]]>