At this point, ROOT has been around for more than a quarter of a century – and TTree for just as long. As you might imagine, the computing landscape today looks vastly different from 25 years ago. Just to set the scene: when ROOT was first released, there was no C++ standard yet, and parallel (let alone distributed) computing really wasn’t a thing. On the hardware side, modern storage technologies such as SSDs and object stores were still unheard of, and let’s not forget the evolution of networking technologies! Naturally, TTree wasn’t designed and implemented with these things in mind. Of course, over the years a lot of effort has been put into improving the performance and stability of TTree to make it as compatible with modern computing practices as possible. However, there are limits to what can be achieved here, especially given that backwards- and forwards-compatibility are two major requirements for ROOT’s I/O system. With the High-Luminosity LHC on the horizon, which is expected to produce 90% of the total amount of LHC data [2], we need to think about more optimized ways to store physics data. The challenge is that this data is unique in the sense that events (or, in computer science terms, “entries” or “rows”) are statistically independent of each other. At the same time, one event typically contains many (complex) data structures, of which we often only need a small subset at a time, and we found that standard technologies are not well-tuned for this type of data storage [3]. That is why we decided to combine our years of experience with TTree and various industry best practices and invest in the next generation of high-energy physics data storage. Enter RNTuple!
For the past four years, a lot of effort has been put into making RNTuple the best it can be. We are working closely with the experiments to make sure that RNTuple can support their data models across all relevant stages in the production pipeline. Simultaneously, we want to make sure that it is as optimized as possible. This means making sure that the data stored in RNTuple is as compact as possible, and at the same time coming up with ways in which we can make reading and writing RNTuples to and from memory as fast as possible. To give you an idea of where we’re currently at, the plot below shows the average on-disk event size for ATLAS’s DAOD_PHYS data model [4], comparing TTree and RNTuple. With RNTuple, we could potentially save 20-35% of storage space, and in turn reduce the consumed network bandwidth when reading the data from a remote location. When we’re talking about exabytes of event data, this is quite significant!
Besides storage efficiency, we’re also seeing very promising results when it comes to read throughput. The two plots below show the number of events processed per second for two different types of tasks, comparing ATLAS DAOD_PHYSLITE data sets stored in TTree and RNTuple (stored on an SSD). As you can see, RNTuple is remarkably faster than TTree, and similar observations are made for other data sets [1], [5].
Beyond performance, we have also been working hard on RNTuple’s interface and supported features. This includes compatibility with RDataFrame, being able to read and write C++ STL types as well as user-defined types and various other features to support existing experiment frameworks.
Yes! To be able to read and write RNTuples, the first thing you’ll need is a ROOT installation that has the ROOT 7 experimental features enabled. This is the case for the default LXPLUS installation, which runs ROOT’s (at the time of writing) latest release, 6.30.02! If you are running ROOT in a different way, you can easily check whether ROOT 7 is enabled for your installation by running root-config --has-root7 in your terminal. If this returns yes, you’re all set! If you get a no, you will need to use a different installation of ROOT that has these features enabled. Check out the ROOT installation page to get it. We strongly recommend using the most recent release in order to get the latest and greatest from RNTuple.
Now, on to the fun part: using RNTuple! Of course, you could write a new RNTuple completely from scratch, using fields and data that you come up with. This is done using the RNTupleWriter interface. Reading an RNTuple is then naturally done through the RNTupleReader. To get an idea of what this looks like in practice, check out for example this tutorial.
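To make this concrete, here is a minimal sketch of writing and then reading back a single-field RNTuple. It is based on the experimental API at the time of writing (the ROOT::Experimental namespace and header locations may change in later releases); the tutorial linked above remains the authoritative example.

#include <ROOT/RNTuple.hxx>
#include <ROOT/RNTupleModel.hxx>
#include <iostream>

void rntuple_sketch()
{
   using namespace ROOT::Experimental;
   {
      // Define the data model: one field per column
      auto model = RNTupleModel::Create();
      auto fldPt = model->MakeField<float>("pt");
      // Write a few entries; the data set is committed when the writer goes out of scope
      auto writer = RNTupleWriter::Recreate(std::move(model), "Events", "my_first_rntuple.root");
      for (int i = 0; i < 10; ++i) {
         *fldPt = 0.5f * i;
         writer->Fill();
      }
   }
   // Read the values back through a typed view
   auto reader = RNTupleReader::Open("Events", "my_first_rntuple.root");
   auto viewPt = reader->GetView<float>("pt");
   for (auto entryId : reader->GetEntryRange())
      std::cout << viewPt(entryId) << std::endl;
}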
Of course, it would be more interesting to try out RNTuple with real data, for
example with data from an analysis ntuple that is currently stored as a TTree.
Well, good news! RNTuple also comes with an RNTupleImporter
class that allows you to automatically convert your TTrees to RNTuples. This
can be as simple as executing the following two lines in the ROOT prompt. The
input file containing the source TTree is read remotely, meaning you can
directly copy-paste these lines into your ROOT prompt. Of course, it’s entirely
possible to use your own existing TTrees.
root [0] auto importer = ROOT::Experimental::RNTupleImporter::Create(
"http://root.cern/files/HiggsTauTauReduced/GluGluToHToTauTau.root",
"Events",
"my_rntuple.root")
root [1] importer->Import()
This will convert your TTree (called Events here) into an RNTuple, also called Events, and write it to my_rntuple.root. Easy enough, but maybe you want more control over this newly created RNTuple. For example, you might want to change its name, or set the compression settings to something other than the default. This (and more) can all be tweaked! Check out the reference or this tutorial to see what options are possible.
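To illustrate, here is a sketch of what such tweaking could look like. The setters used here (SetNTupleName() and SetWriteOptions()) and the compression value are assumptions based on the reference linked above – double-check the exact names and options against the documentation of your ROOT version.

#include <ROOT/RNTupleImporter.hxx>

void import_tuned()
{
   auto importer = ROOT::Experimental::RNTupleImporter::Create(
      "http://root.cern/files/HiggsTauTauReduced/GluGluToHToTauTau.root",
      "Events",
      "my_rntuple.root");

   // Assumed setter: store the RNTuple under a name different from the source TTree
   importer->SetNTupleName("MyEvents");

   // Assumed setter: pick a non-default compression (505 = ZSTD, level 5)
   ROOT::Experimental::RNTupleWriteOptions options;
   options.SetCompression(505);
   importer->SetWriteOptions(options);

   importer->Import();
}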
Now, I already mentioned that we have been working on RNTuple compatibility with RDataFrame. Currently, with just one line change, you will be able to use your existing analysis code with data stored in RNTuple:
// Change this:
ROOT::RDataFrame df("Events", "http://root.cern/files/HiggsTauTauReduced/GluGluToHToTauTau.root");
// To this to use the RNTuple you just imported into "my_rntuple.root":
ROOT::RDataFrame df = ROOT::RDF::Experimental::FromRNTuple("Events", "my_rntuple.root");
// Use your existing analysis as-is!
💡 The automatic detection of RNTuples in RDataFrame is currently available in ROOT’s master branch and will be available in ROOT 6.32.00!
So, what’s next? Performance is always one of our main concerns. We are currently working on parallelizing the writing of RNTuples. In addition, we are working on what we like to call “interface ergonomics”, i.e. the way developers will interact with RNTuple. Be aware that this means that the RNTuple interfaces might still change a little in the coming months! Next to all of this, we are preparing for larger-scale performance testing to see in what areas we could further improve. Another area of work for the near future will be in the direction of data set combinatorics – that is, finding smart(er) ways of accessing and combining existing RNTuple data. And of course, we will continue to work with the experiments to make sure the transition to RNTuple will be as smooth as possible.
To wrap things up, things are looking good for RNTuple, and while there is still enough work to be done, we’re excited and eager to make RNTuple as good as it can be! If you want to know more about the evolution and performance of RNTuple, be sure to check out the references below, as well as our other publications. If you are eager to dive deeper into the specifics of the RNTuple binary format, you can read the specification here. Finally, reach out to us on the forum if you have any questions or if you would like to contribute to RNTuple or ROOT in general!
[1] J. Blomer, P. Canal, A. Naumann, and D. Piparo, “Evolution of the ROOT Tree I/O,” EPJ Web Conf., vol. 245, 2020, doi: 10.1051/epjconf/202024502030.
[2] ATLAS Collaboration, “ATLAS Software and Computing HL-LHC Roadmap,” CERN, Geneva, CERN-LHCC-2022-005, LHCC-G-182, 2022. Accessed: May 02, 2023. [Online]. Available: http://cds.cern.ch/record/2802918.
[3] J. Blomer, “A quantitative review of data formats for HEP analyses,” J. Phys. Conf. Ser., vol. 1085, p. 032020, Sep. 2018, doi: 10.1088/1742-6596/1085/3/032020.
[4] J. Elmsheuser et al., “Evolution of the ATLAS analysis model for Run-3 and prospects for HL-LHC,” EPJ Web Conf., vol. 245, 2020, doi: 10.1051/epjconf/202024506014.
[5] J. Lopez-Gomez and J. Blomer, “RNTuple performance: Status and Outlook.” arXiv, Apr. 07, 2022. doi: 10.48550/arXiv.2204.09043.
ROOT now uses the web-based TCanvas implementation by default in the ROOT master version. It has been present in ROOT for a while (since 2017) and is already used in the web-based TBrowser, which you have probably seen.
What has changed? Now, when starting a ROOT session and displaying any object in a TCanvas, the default system web browser will start and the object will be drawn there using the JavaScript ROOT functionality. The look and feel for basic objects, like histograms and graphs, will not change much – all the drawing options and styles are supported as in the original graphics. You can compare the two following screenshots made with the same macro – one is the original ROOT graphics, the other is the web-based one.
What are the benefits of using the web-based canvas?

- The global gPad pointer is no longer as problematic – a lot of the interactivity in the original ROOT graphics was built around this global pointer, which made it difficult to use several canvases at the same time.
- For remote sessions that require ssh tunnels, we provide a simple rootssh utility which fully automates the configuration of such tunnels.

What about image production in batch?
For the moment we keep the old functionality for image production, e.g. when running ROOT with the -b flag. The web-based canvas will be used for PNG/JPEG/SVG image creation when adding the --web flag when running ROOT. Since image generation involves running a web browser in headless mode, it takes time – approximately 1 second per image. We plan to provide a special API to produce many images with one call, which should significantly improve performance.
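A minimal sketch of such a batch job: a plain macro that saves its canvas as a PNG, to be run along the lines of root -b --web -q makeplot.C (file and histogram names here are purely illustrative).

// makeplot.C
void makeplot()
{
   auto c = new TCanvas("c", "c", 800, 600);
   TH1F h("h", "demo;x;entries", 100, -4., 4.);
   h.FillRandom("gaus", 10000);
   h.DrawCopy();
   // With --web active, the image is rendered through the headless web browser
   c->SaveAs("demo.png");
}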
What are the drawbacks?
You will probably encounter minimal differences between drawing with native ROOT graphics and drawing in the web browser. We do our best to make them as similar as possible – and you can help us by reporting any problems. Some very special usages of TExec objects (do you know about them?) will probably not work as expected in web-based canvases. With a little help from us this can be fixed and adjusted. For sophisticated use-cases with complex user-defined objects one could consider implementing JavaScript-based painters for them.
We encourage all users to try this functionality and give us feedback!
2D scatter plots are a very popular way to represent scientific data. Many scientific plotting packages have this functionality. For many years ROOT itself has offered this kind of visualization via:

- The option P to draw a TGraph: a marker is drawn at each point position, but all markers will have the same size and the same color.
- The COL option of TTree::Draw(): tree.Draw("e1:e2:e3","","col") produces a 2D scatter plot of e1 vs e2, with e3 mapped on the current color palette. That’s a bit better, as it allows drawing three variables on a 2D plot. But one needs to create a TTree or a TNtuple, which is a bit heavy when the data are already stored in simple vectors.

Therefore there was a need for a new class able to produce, in a simple way, this popular multi-variable way to visualize data.
In order to fulfill these requirements a new class, TScatter, has been implemented. It is able to draw a scatter plot of four variables on a single plot. The first two variables are the x and y coordinates of the markers, the third one is mapped on the current color map, and the fourth one on the marker size. Note that it is recommended to use a transparent color map as markers will, most of the time, overlap. The code to produce a scatter plot with the new class TScatter is as simple as:
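(A minimal sketch follows; it assumes the constructor TScatter(n, x, y, color, size) and the “A” draw option – check the TScatter reference of your ROOT version for the exact signature.)

#include <TCanvas.h>
#include <TColor.h>
#include <TRandom.h>
#include <TScatter.h>
#include <TStyle.h>

void scatter_sketch()
{
   auto canvas = new TCanvas();
   gStyle->SetPalette(kBird, nullptr, 0.6); // semi-transparent palette, since markers tend to overlap

   const int n = 100;
   double x[n], y[n], col[n], size[n];
   TRandom r;
   for (int i = 0; i < n; i++) {
      x[i] = r.Gaus(0., 1.);        // marker x position
      y[i] = r.Gaus(0., 1.);        // marker y position
      col[i] = r.Uniform(0., 10.);  // third variable: mapped on the color palette
      size[i] = r.Uniform(0.5, 2.); // fourth variable: mapped on the marker size
   }

   auto scatter = new TScatter(n, x, y, col, size);
   scatter->SetMarkerStyle(20);
   scatter->SetTitle("Scatter plot;X;Y");
   scatter->Draw("A");
}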
There is a natural tendency to look at compilation or conceptual errors as unwanted accidents or mistakes that only happen rarely, because of one’s own inexperience, and that surely will not happen next time. As such, we are not explicitly prepared nor trained to deal with them systematically. We just tackle them as a contingency and try to solve them quickly with whatever tools are at hand. Yet experience tells us that errors (in programming, in mathematics, in judgement biases) are not the exception, but rather the rule.
In fact, most of the time in (robust) development is spent on debugging and troubleshooting, either passively, by looking at whatever problem pops up, or actively, by creating from the start a robust software architecture that prevents errors (via strong typing, smart pointers, ordered structure and abstraction, good documentation, …), as well as a suite of tests that catches them in the future in as many scenarios as possible. It is not uncommon to write a piece of analysis software in 5 hours, but then spend 5 days tracking down why the heck it’s giving wrong results, or crashing once every 100 times, or, even more worryingly, silently leading you to wrong scientific conclusions or to errors in other links of your analysis chain, far away from the original source and thus hard to trace back.
Yet, despite their direct impact on your workflow and scientific robustness, many of us physicists are not trained to deal with errors with the proper tools, and we still deploy inefficient and manual ways to hack them away “as quickly as possible”, hoping (with uncertainty and fear) that they “won’t come back”. Because we will encounter errors much more frequently than we might think in the first place, it makes sense to invest some “initial setup time” to create a robust platform for tackling and fixing these in a systematic way. Rather than reacting with insecurity to them, or keeping them in the back of the mind as a passive or transient threat/accident, let’s assume they will rather be the norm and an important key player in our development, a learning tool that will appear continuously and is worth optimizing for. In the same way, one no longer writes by pen when sending 10000 letters, as one would for only 10. In this situation, it’s interesting to partly shift your paradigm from “troubleshooting your code” to “code for troubleshooting”, i.e. developing the instruments to quickly detect the mistakes you will surely make.
Integrated Development Environments (IDEs) are very powerful tools to detect errors (thanks e.g. to Clang), trace them back to the right point in the source code, and even automatically suggest the solution. ROOT scripts, as well as standalone C++ programs relying on ROOT libraries, can be integrated with minimum effort into these IDEs. Examples of the steps to follow are nicely explained in older blog posts for the Visual Studio and Eclipse IDEs, as well as in the Twiki and other blogs. In this post, I will focus on a third option, the open-source QtCreator IDE.
While optimized for Qt applications, QtCreator is totally generic, open-source, and it can compile and run any C++ program, CMake project, Makefile, etc. You don’t even need to know what Qt means. You don’t need to use qmake nor native Qt project files either (in fact, I prefer to always use CMake to get rid of any dependence on Qt). Your project will be equally compilable from a terminal with Make / CMake as via QtCreator, which just acts as a non-invasive interface.
You can find (usually outdated) versions of QtCreator in your package manager, but I recommend using the online installer, which then periodically checks for updates at program start. If you prefer not to open a user account with them, you can use the offline installer. While installing, I recommend deactivating all Qt library options, newer CMake versions or Ninja. You will just need QtCreator.
Go to “Tools”, “Options”, “Kits”. Click for example on one of the auto-detected kits in the dialog. You can define your custom one, which will appear as “Manual” in the tree view. I recommend setting up a “Manual” kit, where “Qt version” is set to “None” (in case it was set), and in “CMake generator”, “Ninja” is changed to “Unix Makefiles” (if you are not in Windows). If you prefer to use the “Ninja” generator, make sure that it is installed in your system. Finally, click “Ok”.
Beware: in OSx, the “Tools”, “Options” menu is instead under “AppName”, “Preferences”.
You can open any CMake project you have on your computer by clicking on “File”, “Open File or Project”. Find then the main folder where your project’s source code is located. Usually, there will be a “CMakeLists.txt” file in the main directory. Double-click then on this one, rather than in any other “CMakeLists.txt” that might appear in the subdirectories of this same project. If you prefer the command line, you can run directly as qtcreator my/folder/CMakeLists.txt &
. If you installed QtCreator in a local folder, you might need to run something like: /opt/Qt/Tools/QtCreator/bin/qtcreator my/folder/CMakeLists.txt &
.
If you’d rather use Makefiles, that’s also supported via the Import menu, by clicking on “File”, “New File or Project”, “Import Project”, “Import Existing Project”, “Choose”, and then selecting the source files you want to see in your tree (or just click on select all and deactivate those that are images, etc.). The Makefile will be automatically detected behind the scenes. You can edit the number of threads (-j) later on in the project’s “Build settings”.
Let me load the simplest CMake example into QtCreator. After “Open File or Project”, a “Kit dialog” will appear. The button “Manage Kits” on the top left allows you to check what compiler is associated with each kit (as explained above); click “Ok” to close. Select the “build kit” you prefer, and then click on “Details”. There, you can specify in what folder to build your program. Under “Tools”, “Options”, “Build”, “Default Properties” you can set up a default directory for your builds.
Once you click on “Configure Project”, CMake will be automatically run. In the “Projects” pane, you can tune any CMake flag as needed, as well as specify command line arguments when running. The “Build” hammer icon on the left compiles your project (make), and the “Run” play icon executes it.
You will not need to re-do all these configuration steps later on for this project, as QtCreator will store these settings in a file called “CMakeLists.txt.user” and recognize it automatically the next time you open the project.
Let’s assume now that you have forgotten what class std::cout
corresponds to. Luckily, Qt has an in-built (offline) help support system. For a first-time configuration, you will just need to download the “Help Book” of your library, in this case the std
library from cppreference or via your package manager (sudo apt install cppreference-doc-en-qch
). Then, in “Tools”, “Options”, “Help”, “Documentation”, you can add the downloaded (or /usr/share/
installed) “.qch” file.
Once this is set, you can either Ctrl+Click on your function or object to immediately go to the source code definition (file will open in another tab), or press F1, and the HTML documentation will appear on your right side without having to type / search anything online.
If you use and compile LLVM yourself, you can also get your Qt Help file as described here.
The ROOT framework also has a “.qch” Help Book available for download, thus you’ll be able to quickly consult any documentation using the F1 key, rather than searching online, which can be useful in case you are traveling and have no Internet access.
You can not only check the documentation with F1, but fully open the HTML reference guide by clicking on the big “Help” icon (left pane), as shown below.
Alternatively, you can also open the Help Books and search them using Qt Assistant. Linux apt packages are qt4-dev-tools
or qt5-assistant
, and the executables are assistant-qt4
and assistant
, respectively. (qt6
version is not yet in the package manager.) You will have to add the .qch
file to its database by going to “Edit”, “Preferences”, “Documentation”, “Add”.
And what if you already use other IDEs or operating systems? In addition to inline HTML searching, the building of the (ROOT) doxygen documentation can be configured to output a format that is compatible with macOS - Xcode, Windows - Visual Studio, or Eclipse. ROOT only provides the Qt help files (.qch) for download for the moment, but you can build the documentation yourself adapting those flags in the Doxyfile.
Grown over many years and standards, larger software projects have plenty of legacy code that is not as safe as what someone would write today. Unsurprisingly, there are still some bugs here and there, and instabilities that haven’t been solved. Some of these bugs and potential style improvements can be detected thanks to the Clang analyzer, which performs code analysis based on configurable settings.
QtCreator bundles perfectly with the Clang analyzer; see the left pane, “Debug” icon, then the “Debugger” dropdown menu, “Clang-Tidy and Clazy”. It parses its output warnings and takes you directly to where the code needs to be changed. In addition, it even lets you apply “fixits” by a mouse-click: if Clang knows how to correct the problem, it will change the code automatically for you.
To give an example, analyzing the core of ROOT yields several diagnostics, and this can be quite useful for tracing problems in case you are seeing some memory leak when your application uses the ROOT libraries:
If, for example, you would like to modernize your code syntax to the latest C++ standard, you can configure the Clang settings in “Tools”, “Analyzer”, “Default checks”, and enable the modernize- option. Then, with a single click, you can change NULL to nullptr across your whole codebase.
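A hypothetical two-line illustration of what that check targets and what the fixit produces:

#include <TH1.h>
#include <cstddef>

TH1 *before = NULL;    // flagged by the modernize-use-nullptr check
TH1 *after = nullptr;  // what the one-click fixit produces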
Whether you like 4 spaces, 2 spaces, 1 tab, braces in the beginning or in the end… it does not matter what your taste is. What’s important is that you do not spend your valuable time on formatting things by hand. QtCreator can be helpful in this regard, too, if you activate the Beautifier plugin as well as install clang-format
.
For example, let’s suppose you want to submit a pull request of one of your functions to ROOT, which has its own formatting guidelines. The easiest is to copy to your project the .clang-format
configuration file from the repository or the website and then go to “Tools”, “Options”, “Beautifier”, “Clang Format”, and specify the file location. (Or if you are building ROOT itself using QtCreator, specify “File” in the dropdown menu, and it will auto-detect the one in the source tree).
You can also define a keyboard shortcut to format the file, by going to “Tools”, “Environment”, “Keyboard”, search for “format” and assign e.g. Ctrl+Alt+F.
Once that is configured, you can enable auto-formatting of your file when saving, or apply changes manually via Ctrl+Alt+F. Here is a snippet before and after applying it (the unformatted version first, the formatted one below):
int main(int argc, char* argv[])
{
int main(int argc, char *argv[]) {
For one of your projects, or even for the ROOT codebase, you might be using git for version control. QtCreator integrates seamlessly with the typical git commands, and can show you a visual diff of the current changes, as well as commit
(Alt+G, Alt+C) and push
your changes using its graphical interface, or pull
the latest version from the remote repository.
You can get the best of both worlds by using the FakeVim mode or the emacs plugin. Just give it a try ;)
And if you just need column-editing, you don’t need any of those, QtCreator supports that natively.
If you’ve built ROOT enabling the “testing” CMake flag, or if your project contains “CTests”, “Boost Tests”, etc. for ensuring that new changes you apply don’t break older functionality, QtCreator has a platform to visually run and check the results of all those tests. No need to scroll in a terminal to find which one failed.
QtCreator lets you not only find compilation errors, but also documentation errors, by interfacing with the warnings issued by doxygen. This can prove extremely useful for detecting outdated or incorrect documentation and jumping to the right spot in the source code in just one click, rather than diving through thousands of lines of output and tracing it manually.
To give it a try, take a look at building the ROOT documentation project. Follow these steps:

- Run source /path/to/ROOT/bin/thisroot.sh in the terminal and launch qtcreator from there. Alternatively, you can manually specify all the variables in the “Build” environment.
- Import the Makefile project root/documentation/doxygen into QtCreator, as explained above.

Below is a screenshot of the errors and the points in the source code found by just clicking on those issues.
I’d suggest defining a custom output parser to catch the doxygen warnings about “potential candidates” that appear when there is an ambiguous match in the signatures. To do this, go to “Tools”, “Options”, “Build&Run”, “Custom Output Parsers”, “Add”, and in “Warning”, specify the pattern (.*) at line (\d+) of file (.*) and the order 3,2,1. “Apply”, “Ok”, and in “Projects”, “Build Settings”, at the bottom, activate the newly defined “Parser”.
If you want even more verbose warnings about undocumented parameters, try setting WARN_NO_PARAMDOC to YES in the Doxyfile and EXTRACT_ALL to NO. This will reveal many more weak points of your documentation and let you focus your efforts on the right spot. And while it can be burdensome to write all this extra missing documentation, QtCreator also simplifies the task: by typing three magic characters on top of a function, it will autocomplete the whole skeleton in doxygen format. Check first that “Tools”, “Text editor”, “Completion”, “Enable Doxygen blocks” is enabled.
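The generated skeleton then looks roughly like this (the function is purely illustrative and the exact comment style depends on your settings):

/// \brief Hypothetical function, used only to illustrate the generated skeleton.
/// \param energy Deposited energy in MeV.
/// \param time Hit time in ns.
/// \return The calibrated energy.
double Calibrate(double energy, double time);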
Consider also enabling this spell-checking plugin for detecting typos in your documentation. This can be done by simply downloading the release file and unzipping it into your qtcreator folder. Then, under “Tools”, “Options”, “Spellchecker”, you can configure which dictionary or language(s) to use.
If you need to debug your ROOT scripts, or the ROOT library itself, I recommend building ROOT from its sources, but using the “Debug” flag.
To do this:

- Change CMAKE_BUILD_TYPE to Debug, as you would do from a command line.
- Enable the modules you need, as you would do with -Dmodule=ON via the command line.
- Add -j8 (or whatever other number) to the “CMake arguments”, to speed up the build.

To debug your script, go to the “Projects”, “Build & Run”, “Your-Kit-Name”, “Run” settings and specify your executable right of “Run configuration” by clicking on the dropdown menu (your own standalone application, or root.exe). Then specify your “Command line arguments” – e.g. -l -b plus the name of the script you want to run and its parameters – as well as the “Working directory”. If you want to precompile instead of interpreting with cling, consider appending the debug flag g to the command line argument (yourScript.C+g).
Click then on the “Play-Bug” icon on the left, and your script will run in “Debug” mode. Breakpoints can be set interactively on your code. F5 will pause or resume your process, as well as show you a workspace of the active variables and threads. For example, specify as “Command line arguments” -l -b hsimple.C+ -q and as working directory your-root-folder/tutorials. Open this file within QtCreator, click on the left of the line numbers to set a breakpoint, and then click on the “Play-Bug” icon on the left; the script will execute and pause when it reaches that point. You can then perform step-by-step execution using the three little arrow icons right of the “Debugger” dropdown menu. You can hover your mouse over them and a tooltip will show their function.
Below is a screenshot of another example, while debugging a deadlock in the TThread class.
Side note: if at some point your ROOT script gets very complex or long, I recommend instead creating a standalone C++ application using CMake and linking the ROOT libraries to it, as explained here.
To check for memory leaks and corruption, QtCreator offers a seamless integration with valgrind (or heob on Windows), making the backtrace of your errors fully interactive. To run it, press on the big “Debug button” on the left. Then, on the dropdown menu, change from “Debugger” to “Memcheck” and click on the small play button.
If you need extra arguments for valgrind, you will need to specify those under “Tools”, “Options”, “Analyzer”, “Valgrind”. There, I also recommend clicking on “Add” and selecting “etc/valgrind-root.supp” from your cloned repository, to suppress spurious warnings.
The resulting warnings can be easily clicked to bring you to the right spot in your code, or in the ROOT codebase, where the issue is arising from.
Often, you will also find it helpful to run the static Clang analyzer, which is able to detect many unsafe parts of your code that might be leading to memory leaks. It’s in the same dropdown menu, under “Clang-Tidy and Clazy”.
First, I recommend adding “etc/valgrind-root.supp” from your git repository under “Tools”, “Options”, “Analyzer”, “Valgrind”, “Add”. Then, on the dropdown menu, change from “Debugger” to “Memcheck” and click on the small play button.
Helgrind cannot yet be run directly from QtCreator. The workaround is to run root.exe or your own executable under valgrind from your command line, with the flags --tool=helgrind --xml=yes --xml-file=yourfile.xml. Then, you can “load” the result using the small “open” button right of the dropdown menu. The parsing tool works great and takes you to the relevant location in your code.
In case you want to optimize the performance of your code, you can select from the debugger dropdown menu between “Callgrind” or the Performance Analyzer. If you install and use callgrind
, consider installing also kcachegrind
for visualization.
There are other tricks to boost your development in a way that’s integrated with your IDE. For example, you can expose an instance from your compiled code to the ROOT interpreter by registering its pointer:

gROOT->ProcessLine(
   static_cast<TString>(
      "MyClassType* const fMyInstance = reinterpret_cast<MyClassType*>(") +
   dynamic_cast<std::ostringstream &&>(std::ostringstream("") << fMyInstance)
      .str() +
   ");");
Then, of course, you create a TGCommandPlugin window. From there, typing fMyInstance->MyMethod() will execute compiled code interactively.
This VS Studio plugin allows for a nice integration of a ROOT file browser. Maybe it will come at some point for QtCreator, too.
To summarize the setup:

- Install valgrind, callgrind, kcachegrind.
- Install the cppreference Help Book (sudo apt install cppreference-doc-en-qch).
- Define the custom output parser with the pattern (.*) at line (\d+) of file (.*) and order 3,2,1. “Apply”, “Ok”. Activate it under “Projects”, “Build Settings”, on the bottom.
- Choose a default build directory, e.g. ~/builds/.
- Set -j8 on “Projects”, “Build & Run”, “Kit-name”, “Build”, “Build Steps”, and root.exe as your executable in the run settings.

Setting up all this platform requires some initial effort, but once it is running, it will smooth your development and bug hunting, and once you get used to it, you will find it much more tiring to program without it ;) .
Fernando Hueso-González IFIC - Instituto de Física Corpuscular (CSIC / Universitat de València)
NOTE: originally this post outlined the setup of a ROOT-based project in the Eclipse IDE based on the Eclipse CDT4 CMake generator functionality. However, the CMake4eclipse plugin provides a better integration of ROOT-based projects in Eclipse. Therefore, the post was updated in September 2022 to demonstrate the new approach. The former notes can be found here.
The ROOT framework is written in C++, a language with complete manual control over memory. Therefore, developing and running your ROOT script (or Geant4 program) may sometimes lead to a crash that provides minimal information in the stack trace. The ROOT framework does not provide out-of-the-box solutions for debugging scripts. Hence, questions about debugging ROOT scripts arise in the ROOT community now and then.
Generally speaking, one does not need a special development environment to invoke a debugger on a ROOT script. Users can simply invoke the GNU Debugger (GDB) on the root.exe binary:
gdb --args root.exe -l -b -q yourRootMacro.C
Similarly, GDB can be used for debugging stand-alone ROOT and Geant4-based programs. However, this debugging experience takes place in the terminal and lacks a user interface and many useful features.
In this article, we outline an approach for robust debugging of CERN ROOT scripts and ROOT-based programs (it also applies to Geant4-based programs). We will utilize the Eclipse CDT (C/C++ Development Tooling) Integrated Development Environment (IDE), free software, coupled with the GNU debugger (GDB).
Additionally, the current approach allows users to have the ROOT and Geant4 frameworks built in both Release and Debug modes installed on the same computer. Debug binaries are great for development, allowing memory analysis and efficient debugging. Release builds, on the other hand, can be optimized for robust execution of the program and may work up to 10 times faster.
A few words about the operating system (OS). In this post, we will consider the setup on Linux-based systems. A similar approach may be replicated on macOS with the GNU toolchain, but will require a code-signing procedure. Windows is a totally different story.
The following milestones are required to complete the setup of the development environment:
In this section we demonstrate how to install the Eclipse IDE on a personal Linux computer. We will use Eclipse with the cmake4eclipse plugin – a powerful tool for CMake-based projects. Cmake4eclipse automates the project setup and allows for an automatic rebuild of the frameworks’ libraries (ROOT and/or Geant4) once their source code is changed.
Today (Aug 2022), the CMake4eclipse plugin provides better integration of CMake-based projects in Eclipse than the other available options, each of which has its own drawbacks that are a subject of a separate discussion. The following steps are required to set up the IDE workflow.
Install Eclipse IDE. Download the Eclipse installer from the official website, extract it and run. Select “Eclipse IDE for C/C++ Developers”. Refer to the screenshot and instructions below:
A. Recent Eclipse versions come with bundled Java Runtime Environment (JRE). As of July 2022, specify the built-in JRE version 11. Otherwise there will be an error accessing Eclipse help. This may be fixed in later Eclipse releases.
B. On Linux it is good practice to install software that is not included in your distribution under /opt, /usr/local/ or the home folder. In this article we will stick to the latter option and install Eclipse in the home folder under ~/Applications/ for consistency with the setup on macOS.
C. The wizard will provide a list of required packages to be installed on your system. Ensure all of the package dependencies are installed.
D. Exit the wizard. There is no need to launch Eclipse right away. We will tweak its configuration file first.
Increase the Eclipse memory limits. ROOT libraries contain thousands of source files. Usually, when indexing a ROOT-based project, memory use fluctuates around 2 GB. Eclipse memory use can be inspected with the VisualVM application.
Memory limits are specified in the eclipse.ini file located inside the Eclipse install folder. Use a text editor to update the following lines:
-Xms512m
-Xmx4096m (set to 2048m minimum or higher if available)
Here the -Xms value corresponds to the initial heap size used at Eclipse startup. The latter -Xmx value corresponds to the maximum available memory limit. The more libraries are used in your project (ROOT, Geant4), the higher the -Xmx value required by the Eclipse indexer. Indexing of the framework source files will be faster with more available RAM.
Fix Eclipse launcher. If Eclipse window does not properly minimize to the dock icon, execute following command (bug report submitted here):
echo StartupWMClass=Eclipse >> ~/.local/share/applications/epp.package.cpp.desktop
Tweak the memory limit for the Eclipse indexer. Launch Eclipse and select the default workspace location (e.g. ~/Development/eclipse-workspace). In the Eclipse menu open Window → Preferences → C/C++ → Indexer. Under “Cache Limits” set:
Limit relative to maximum heap size: 75%
Absolute limit: 4096 MB (same as for -Xmx value in eclipse.ini)
Update Eclipse and its CDT plugin. In the menu select Help → Check for updates. Follow the wizard steps. Restart Eclipse if required.
Install CMake4eclipse plugin. Project details can be found on GitHub. In the Eclipse menu select Help → Install new software. Enter following URL in the “Work with” field: https://raw.githubusercontent.com/15knots/CMake4eclipse/master/releng/comp-update/
In the modal dialog select everything but uncheck the older version of CMake4eclipse (v2). Keep only version v3. Follow the wizard steps and restart Eclipse. Refer to the screenshot below:
Tweak cmake4eclipse settings. Set default workbench for CMake4eclipse. In the Eclipse menu select Window → Preferences → C/C++ → Cmake4eclipse → Default build system → Set “Unix Makefiles”.
On the “CMake cache entries” tab, specify the C++ standard used for the build. Add corresponding CMake cache entry:
| Name | Type | Value |
| --- | --- | --- |
| CMAKE_CXX_STANDARD | STRING | 17 |
It is important for the ROOT-based programs to be compiled with the same C++ standard as the ROOT libraries. Therefore, in this guide we will explicitly set the C++ standard for ROOT, Geant4 and their based programs, more info here. We will use the C++17 standard.
Tip: if having problems with the build later, check “Force re-creation with each build”. This will trigger the CMakeLists.txt update and re-generation of the Unix makefile at every build, reflecting possible changes in the CMake cache entries (variables) and ROOT components’ source code.
Optionally, there are a few further useful tweaks to the Eclipse workflow that you can apply.
We have successfully installed and set up Eclipse with the CMake4eclipse plugin and are now ready to set up the ROOT project in the Eclipse IDE.
In this section we address the setup of the ROOT libraries as a project in the Eclipse IDE. The framework will be built with debug symbols. This allows setting breakpoints in the ROOT code and inspecting memory and variable values during the program run.
Install dependencies. Refer to this page on ROOT website to satisfy the dependencies for your particular Linux distribution.
Obtain the source code. There are a few options here.
A straightforward way is to download ROOT sources for a specific release from the ROOT website. Extract ROOT sources under the ~/Development
home folder. We will keep all the source code and Git repositories in this folder for consistency purposes.
Alternatively, if a user plans on contributing towards the ROOT repository it is recommended to fork the latest master
branch on GitHub, create a new branch in your forked repository and check it out:
mkdir -p ~/Development && cd ~/Development
git clone https://github.com/<your-username>/root
cd root
git checkout -b <your-feature-branch>
This allows for issuing Pull Requests to the original repository. More details can be found on this page.
Set up a project in Eclipse. Launch Eclipse. In the menu open File → New → Project… Expand “C/C++” and select “C++ Project” (not “C/C++ Project”).
On the next dialog, specify “root” as the project name. Uncheck “Use default location” and “Browse…” for ROOT sources location (e.g. ~/Development/root
). In “Project Type” expand “Cmake4eclipse” and select “Empty Project”. In “Toolchains” select “CMake driven”. Click “Next >”.
We are building ROOT with debug symbols. Therefore, uncheck the “Default” and “Release” build options and only keep “Debug”. Essentially this dialog box specifies the CMake -DCMAKE_BUILD_TYPE variable.
Next we provide the CMake plugin with the ROOT build options. Click “Advanced Settings…”. Go to C/C++ Build → Cmake4eclipse. Open the “CMake cache entries” tab. Use the “Add…” button on the right and input the following variable names, types and values:
| Name | Type | Value |
| --- | --- | --- |
| CMAKE_INSTALL_PREFIX | PATH | ${HOME}/Applications/root-debug |
| all | BOOL | ON |
When specifying variables of a PATH type, it is handy to use the “File System…” button. It will display the folder picker dialog and minimize the chance of specifying a wrong path. Refer to the screenshot below:
In this tutorial we build ROOT with all optional components turned on (-Dall=ON
). Find a complete list of the ROOT CMake build variables on the ROOT website and tailor the build for your needs.
Click “Apply and Close”. Click “Finish”.
Notice that Eclipse will start indexing the project. However, we will reschedule this operation later - after the build is completed. Reveal the “Progress” panel (tiny scrollbar animation in the very bottom right corner). Stop the indexer operation.
Build framework in Eclipse. Reveal the “Build Targets” tab (on the right side) and select “root” project. Right-click and select “New…” build target. Name target “install”. Click “Ok”. Expand “root” in the “Build Targets” tab and double-click the “install” target.
Build process speed depends on your computer speed and provided build variables. It may take up to a few hours to finish.
Tip: to switch between the CMake console and Linux make console, locate the “Display Selected Console” dropdown on the bottom actions panel.
Exclude build folder from indexing. Cmake4eclipse plugin performs a so-called in-source build. Meaning that the build folder is located within a project file tree. During the build ROOT header files are copied and duplicated inside the “_build” folder. To avoid indexing duplicate sources and headers, right click “root” project → Properties → C/C++ General → Paths and Symbols → Source Location. Expand the “/root” folder. Select “Filter”. Click “Edit filter…”. Add the “_build” folder to the filter. Click “Apply and Close”.
Run Eclipse indexer. We are now ready to index all ROOT source files and headers. This will create an Object-Oriented Programming (OOP) database of all ROOT object types, their methods and inheritance relations. Right click “root” project → Index → Rebuild.
Tip 1: sometimes Eclipse indexer may freeze while parsing the ./interpreter/llvm/src/tools/clang/lib/Driver/
sub-folder. If this happens, exclude the interpreter
folder from the build (this also excludes folders from the index). Highlight the interpreter
folder in the project tree. Right click, and select Resource Configurations → Exclude from build. Check “Debug” configuration. Click “Ok”. Now right click “root” project → Index → Rebuild.
Tip 2: the indexer usually takes a couple of hours to parse all of the ROOT framework source files. Computers with fast NVMe hard drives will perform this task best. For computers with SATA drives or older, I recommend keeping the ROOT sources on a RAM disk. Feel free to find my RAMDisk implementation on GitHub.
Turn off false positive errors. Even though the ROOT compilation succeeds, Eclipse code analysis tool displays semantic errors in ROOT sources. To turn them off, right-click “root” project and open Preferences → C/C++ General → Code Analysis. Select “Use Project Settings” option. Uncheck “Syntax and Semantic Errors” group. Maybe someone has a better idea how to fix that?
Eclipse carries out an in-source build, meaning that the “_build” folder is located inside the ROOT project tree. Consequently, if users want to issue pull requests to the ROOT GitHub repository, the “_build” folder needs to be added to the “.gitignore” of the local ROOT Git branch.
At this point ROOT libraries are compiled with debug symbols and Eclipse has indexed all the framework source files.
Needless to say, the Geant4 framework libraries can be built in Eclipse in exactly the same way as outlined above for the ROOT project.
ROOT scripts are originally designed to run through Cling, a modern C++ interpreter based on LLVM and Clang. To debug a ROOT script with the native Linux GNU development tools – the gcc compiler and the gdb debugger – we need to convert the ROOT script into a ROOT-based program. Having it compiled into an executable with debug symbols, we will be able to invoke a debugger on it.
Skip to the next section if you already have a ROOT-based program code ready.
Next we elaborate on how to convert a ROOT script into a CMake ROOT-based program. Generally speaking, this involves, among other things, explicitly defining all the headers used in your script (the Cling interpreter does not require that). Detailed instructions elaborating each step can be found in this template repository on GitHub. Please refer to the repository README file.
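As a rough illustration of what such a conversion boils down to, here is a hypothetical minimal example (function and file names are made up; the template repository linked above covers the CMake side):

// All headers are now listed explicitly (Cling would have resolved them on the fly)
#include <TFile.h>
#include <TH1D.h>

// The former macro body becomes a regular function...
void analyse(const char *fileName)
{
   TFile file(fileName, "RECREATE");
   TH1D h("h", "example;x;entries", 100, -5., 5.);
   h.FillRandom("gaus", 10000);
   h.Write();
}

// ...and a main() entry point is added so that gcc can produce a debuggable executable
int main(int argc, char **argv)
{
   analyse(argc > 1 ? argv[1] : "output.root");
   return 0;
}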
In this section we will set up a ROOT-based CMake project in Eclipse IDE.
Obtain the source code. Place your ROOT-based project into a desired location, e.g. ~/Development
.
Set up the CMake4Eclipse project. Similarly to the ROOT project setup, in the Eclipse menu open File → New → Project… Expand “C/C++” and select “C++ Project” (not “C/C++ Project”).
On the next dialog, specify your project name. Uncheck “Use default location” and “Browse…” for your project “CMakeLists.txt” location. In “Project Type” expand “Cmake4eclipse” and select “Empty Project”. In “Toolchains” select “CMake driven”. Click “Next >”.
For the development purpose we uncheck “Default” and “Release” build configurations and keep the “Debug” option only.
Next we need to provide the CMake plugin with corresponding build variables. Click “Advanced Settings…”. Go to C/C++ Build → Cmake4eclipse. Open the “CMake cache entries” tab.
Since we compiled and installed ROOT and Geant4 not system-wide, but in the ~/Applications home directory, we need to provide CMake with the location of the “ROOTConfig.cmake” file. Use the “Add…” button on the right to input the following variable name, type and value:
| Name | Type | Value |
| --- | --- | --- |
| ROOT_DIR | PATH | ${HOME}/Applications/root-debug/cmake |
Make sure that the path specified above is accurate.
Alternatively, paths to ROOT (and Geant4) CMake configuration files can be provided together in one variable CMAKE_PREFIX_PATH. Multiple paths are separated by the semicolon:
| Name | Type | Value |
| --- | --- | --- |
| CMAKE_PREFIX_PATH | PATH | ${HOME}/Applications/root-debug/cmake |
Set environment variables. Now we need to set the environment variables for the ROOT-based project. It is important to set the variables only for your particular ROOT-based project in Eclipse (not in the general Eclipse settings). If set for all Eclipse projects, environment variables may interfere with the subsequent rebuilds of the ROOT (or Geant4) framework.
Source the ROOT and/or Geant4 variables in the Terminal and manually plug them into the Eclipse settings. Open a Terminal and execute the following command (copy-paste as one line):
source $HOME/Applications/root-debug/bin/thisroot.sh && \
env | grep 'G4\|ROOTSYS\|^LD_LIBRARY_PATH\|^PATH'
The required environment variables are printed in the Terminal window.
Now go back to the project setup dialog in Eclipse, open C/C++ Build → Environment. Manually “Add…” each environment variable name and value into the Eclipse project settings:
Also, select “Replace native environment with specified one” option. This will isolate Eclipse project environment variables containing paths to frameworks built with debug symbols from potentially installed ROOT or Geant4 release versions on the system. Click “Finish” button.
Build project in Eclipse. Highlight your project in the Project Explorer. Right click → build.
Create a run (debug) configuration. Select your project in the project tree. In the Eclipse menu, open Run → Debug Configurations…
Select “C/C++ Application”. Press the “New launch configuration” button (on the very top left). Click the “Search Project…” button and locate the corresponding executable file.
If necessary, specify any command-line parameters on the “Arguments” tab. Click “Debug”.
This is it. Now you can enjoy full-scale debugging of your ROOT-based applications in Eclipse IDE.
In this post, we learned how to pair the Eclipse IDE with the cmake4eclipse plugin to set up an effective development environment for CERN ROOT scripts and ROOT-based programs. It was a process, so let’s draw a line and summarise what we learned today.
I hope you enjoyed this technical note. If not yet familiar, you can now continue learning fundamental Eclipse CDT hotkeys and debugging capabilities on YouTube.
For those who are interested in setting up the same development environment for a project that utilizes both - Geant4 and ROOT, please follow this link.
Feel free to leave comments below if you have any questions or recommendations.
To achieve this, the whole ROOT team was involved in a big update of the Manual and Reference Guide during one full week.
Previously, the ROOT documentation was spread over three different main manuals:
The Reference Guide and the Manual are the current valid sources of documentation. The Manual acts as a “User’s Guide” helping users to find their way into the huge amount of documentation provided in the Reference Guide.
The Old User’s Guide was outdated and no longer maintained, but nevertheless contained some valuable information we did not want to lose. It was more of a “Long Write-Up”.
The first task of the week was to make sure the Manual’s table of contents was complete: in groups of experts, we updated the existing chapters (Histograms, Graphs, Trees …) and created the new ones needed (JSROOT, ROOT I/O…), which also led to updates in the Reference Guide.
We also moved everything still valuable in the Old User’s Guide to the relevant places, updating the Manual or the Reference Guide. In the end (and after another, final round at the end of the year) we will drop the Old User’s Guide completely and instead have an accurate and complete Manual.
This new Manual allows you to contribute - for instance by letting us know when something is hard to understand (by opening an issue) or even by fixing it yourself: see the GitHub octocat at the bottom right corner of each page!
We hope you’ll enjoy the new manual, and that it’s useful for today’s new grad students!
Hacktoberfest is a yearly event that encourages participation in open source communities and projects. Would you like to help us with some (not so scary) bugs? Or maybe you have some place in our documentation that you think deserves some love? Any small but noticeable feature you would like to see in ROOT? This definitely is a nice time to give your support to the project!
You just have to submit a PR to our repository. If it gets approved, it will count towards your hacktoberfest points. Check out the Hacktoberfest website for a full list of rules. Take a look at our issues labeled “good first issue” to get started, or feel free to make a PR about your own ideas! Just make sure you’re not a bot 😅
RDataFrame is ROOT’s high-level interface for data analysis since ROOT v6.14. By now many real world analyses use it, and on top of that we see lots of non-analysis usage in the wild. Parallelism has always been a staple of its design with support for executing the event loop on all cores of a machine thanks to implicit multi-threading. Since ROOT 6.24, this aspect of RDataFrame has been enhanced further with distributed computing capabilities, allowing users to run their analysis on multi-node computing clusters through widely used frameworks. Currently the package offers support for running the application on an Apache Spark cluster, but the package design will make it possible to add many more backends in the future. For example, backends for Dask and AWS Lambda have already been implemented and demonstrated at different conferences [1][2]. They will be made available in future ROOT releases.
The main goal is to support distributed execution of any RDataFrame application. This has led to the creation of a Python package that connects the RDataFrame API (available in Python through PyROOT) and the APIs of distributed computing frameworks, which are offered in Python in the vast majority of cases. Another key goal is to offer a variety of backends, to provide a solution to a variety of use cases. This is achieved through a modular implementation that defines a generic task (representing the RDataFrame computation graph) to be executed on data. The input dataset is logically split into many ranges of entries, which will be sent to the distributed nodes for processing. Each range will then be paired to the generic task and submitted to the computing framework via a specific backend implementation. An added benefit of using RDataFrame is that the distributed tasks run C++ computations. This is made possible by PyROOT and cling.
Excellent scaling is paramount for this distributed RDataFrame implementation, to ensure you can run the RDataFrame computation graph efficiently across multiple computing nodes and different backend implementations. This has been shown since the first stages of the development of this package with a real use case analysis running on a Spark cluster [3]. More recently, a benchmark based on CERN open data has shown promising scaling performance with both Spark and Dask [4]:
We hear you asking: how does it look in code? Here is an example of an RDataFrame that is able to delegate its
computations to a Spark scheduler (requires the Python pyspark
package):
import ROOT
# Point RDataFrame calls to the Spark specific RDataFrame
RDataFrame = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame
# It still accepts the same constructor arguments as traditional RDataFrame.
# It defaults to running a Spark process on the local machine, but it is possible
# to configure the RDataFrame to connect to a preexisting cluster.
df = RDataFrame("mytree", "myfile.root")
# Continue with the traditional RDataFrame API
sum = df.Filter("x > 10").Sum("y")
h = df.Histo1D("x")
print(sum.GetValue())
h.Draw()
The only difference with respect to local RDataFrame was the usage of a Spark backend specific RDataFrame. By default, it runs on the local machine using all cores. That is, it uses the default Spark behaviour. If a cluster is available, distributed RDataFrame allows connections to the remote scheduler through an extra optional argument in the constructor. Here is an example that also shows the connection to a Dask cluster:
from dask.distributed import Client
import ROOT
# Point RDataFrame calls to the Dask specific RDataFrame
RDataFrame = ROOT.RDF.Experimental.Distributed.Dask.RDataFrame
# Create the Client object to connect to the Dask cluster
# See the Dask documentation for all the options available
client = Client("DASK_SCHEDULER_ADDRESS")
# It still accepts the same constructor arguments as traditional RDataFrame
# And supports some extra keyword arguments
df = RDataFrame("mytree", "myfile.root", npartitions = 8, daskclient = client)
In this example the npartitions
parameter tells the RDataFrame into how many ranges of entries the input dataset should
be split. Each range will then correspond to a task on a node of the cluster. The daskclient
parameter receives the
object needed to connect to the Dask scheduler. All the options are available in the
Dask documentation. The equivalent object for the Spark framework
is called SparkContext and in
general every backend will have its own way to connect to a cluster of nodes.
Once the correct RDataFrame
object has been created, there is no need to modify any other part of the program.
Distributed RDataFrame enables large-scale interactive data analysis with ROOT. This Python layer on top of RDataFrame allows steering C++ computations on a set of computing nodes and returning the final result directly to the user, so that the entire analysis can be run within the same application. It is available as an experimental feature since ROOT 6.24 with support for running on a Spark cluster, with more backends in the works: Dask will be available soon in our nightly builds. Try distributed RDataFrame with our tutorial and learn more about it in the respective RDataFrame documentation section.
[1] Vincenzo Eduardo Padulano, Enric Tejedor Saavedra. “Dask backend for distributed RDataFrame”. In: Dask Distributed Summit 2021. https://summit.dask.org/schedule/presentation/24/dask-in-high-energy-physics-community
[2] Jacek Kusnierz et al. “Distributed Parallel Analysis Engine for High Energy Physics Using AWS Lambda”. In: HPDC 2021. https://dl.acm.org/doi/10.1145/3452413.3464788
[3] Valentina Avati et al. “Declarative Big Data Analysis for High-Energy Physics: TOTEM Use Case”. In:Euro-Par 2019: Parallel Processing (2019),pp. 241–255. https://doi.org/10.1007/978-3-030-29400-7_18
[4] Vincenzo Eduardo Padulano, Enric Tejedor Saavedra. “A Python package for distributed ROOT RDataFrame analysis”. In: PyHEP 2021. https://indico.cern.ch/event/1019958/contributions/4419751
Cling is also a standalone tool, which has a growing community outside of our field. It is recognized for enabling interactivity, dynamic interoperability and rapid prototyping capabilities for C++ developers. For example, if you are typing C++ in a Jupyter notebook you are using the xeus-cling jupyter kernel. One of the major challenges is to ensure Cling’s sustainability and to foster that growing community.
Cling is built on top of LLVM and Clang. Reusing this compiler infrastructure means that Cling gets easy access to future C++ standards, new compiler features and static analysis infrastructure. Our project organization mostly followed the LLVM community standards, but the remaining LLVM-specific customizations, while kept to a minimum, are now costly for sustainability and development. For example, it is time consuming to move to newer LLVM versions and to release Cling following the LLVM release schedule.
A natural next step to mitigate some of these challenges is to move the essential parts of the infrastructure closer to the LLVM orbit. The benefits of the solid software engineering practiced by the LLVM community have been praised widely: for example, LLVM’s rigorous standards for code reviews, release cycles and integration are often raised by our “external” users. We would connect two highly knowledgeable system software engineering communities – the one around LLVM, and the one around data analysis in HEP. The success of Cling demonstrates that incrementally compiled C++ is a feature the C++ community can benefit from and the data science community needs. Finally, there are also potential synergies with projects such as clangd and lldb, which would help interactive C++ become more popular with the broader C++ audience.
In 2018 we decided to approach the issue in a more structural way. We dedicated resources from various ongoing activities in DIANA-HEP and IPCC-ROOT and in 2019 we received an NSF award supporting this goal.
In July 2020, we laid out our arguments in a “request for comment” document on the LLVM mailing lists. The encouraging community response motivated us to produce several LLVM blog posts with the intention of clarifying capabilities, design aspects and advanced feature use:
The LLVM community encouraged the general direction of moving reusable components into Clang. The “new” Clang tool is called clang-repl. The motivation behind the new name has two main aspects. Firstly, we need to ensure gradual code reuse from clang-repl to downstream Cling, and clashing class names would be yet another unnecessary complication. Secondly, some Cling features are tailored towards HEP and hard to argue for wider use; examples are the implicit auto keyword injection or connecting ROOT files to the name lookup. In that respect, having a project named Cling in the Clang repository which differs in functionality from the one in ROOT and HEP would create confusion and packaging problems. A final (bonus) argument is that ROOT will always require occasional hot fixes in both Cling and LLVM which cannot be bound to the major LLVM release schedule: it would be unreasonable to wait for the next LLVM release (or even just for the rigorous review procedure) to address such a fix.
On the 12th of May, the initial, minimally functional clang-repl landed in the LLVM repository. Hooray!
Although the acceptance of the initial clang-repl patch was a considerable success, it is essentially about initiating a new direction for the LLVM community. Such a strategic choice came out of the years-long effort within HEP to innovate, and also sustain, its technology advancement, making it accessible to a broader audience.
Several of Cling’s technical aspects are now being discussed with the LLVM community, for instance how to implement reliable error recovery and code-removal mechanisms that free the unused underlying memory. These tasks have proven to be difficult for Cling, being outside of the LLVM infrastructure. Thanks to John McCall and Richard Smith, the sketch of the feature’s technical design within LLVM is sound and we are working towards it. However, this process poses an anticipated challenge: how to advance the technology in a slightly tangential direction while feeding it back to its major field of use?
Feeding back implementation from mainline LLVM to Cling and ROOT is a non-trivial task, partially because ROOT usually uses significantly older LLVM versions. The LLVM API does not promise backward compatibility, and ROOT uses an intricate and vast API surface, which makes transitions to newer versions essentially a development task. The goal of upstreaming parts of Cling into LLVM is to reduce the used API surface. We will make the upgrade procedure faster, though still measured in months to allow for extensive testing by the experiments’ software stacks. We cannot expect that ROOT can easily adopt each release of LLVM, let alone each commit. However, we can keep ROOT closer to LLVM mainline, which makes backporting features from mainline easier.
The next piece of the puzzle is, if mainline functionality is successfully backported, how to evolve ROOT and Cling’s codebases incrementally towards it and how to ensure things work at the full scale of experiments. My personal take is that it is possible only if the two ends match by design. That is, when developing a patch against clang-repl we need to evaluate its reuse in Cling. This is easier said than done and we will need to learn through experience…
Sustainability in open source usually means having advanced users who can contribute back bug reports, code reviews, and code. Thus, a non-negligible part of this effort is outreach and community building for both clang-repl and Cling. Cling has been lucky to have people donating their time to help it move towards LLVM mainline. Here I want to thank all of them, in particular Raphael Isemann, Jonas Hahnfeld and Pratyush Das, who have each dedicated significant time to help our efforts and thereby reduce the accumulated technical debt in HEP.
The research and development efforts towards an interactive and incremental C++ in ROOT resulted in Cling, which became a cornerstone for data analysis in the field of HEP. Technical advancements in Cling enable new, previously unthought-of abilities for Clang and C++, such as template instantiations on demand, reflection, and language interoperability.
Thanks to support from CERN, USCMS, DIANA-HEP, Intel, the “technical debt” in the initial Cling implementation has been significantly reduced. Even so, much of that work is still ahead of us.
Cling is now used outside of HEP. We are excited to be working towards making it available to an even broader audience, for instance by increasing Cling’s ties with the LLVM project, while feeding back advancements from other communities to HEP through Cling and ROOT.
The author would like to thank Axel Naumann and David Lange who contributed to this post. You can find out more about our activities at https://compiler-research.org and https://root.cern/cling/
]]>Prior to the 6.20 release, a user couldn’t redefine a function, variable, or class whose definition had already been provided in a particular interpreter session. If you have used ROOT for quite some time, it’s almost certain that you have seen this already:
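For illustration, a session at the ROOT prompt used to go roughly like this (a sketch; the exact diagnostic text differs between versions):
root [0] int x = 1;
root [1] int x = 2;
error: redefinition of 'x'
note: previous definition is here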
While this behavior is expected from an ISO-compliant C++ compiler, it is not convenient for interpreted C++, where users expect behavior closer to a scripting language like Python. This issue was especially visible in Jupyter notebooks, where cells that provided a definition couldn’t be edited and re-run without restarting the C++ kernel. We knew it was annoying and we fixed it in the 6.20 release.
No. Support for redefinitions is automatically enabled for the ROOT prompt and Jupyter notebooks as of 6.20. Therefore, the following is now legal in a ROOT interpreter session:
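A minimal sketch of what the interpreter now accepts, typed line by line at the prompt:
root [0] int x = 1;
root [1] double x = 3.14;   // same name, different type: now fine
root [2] auto h = new TH1D("h", "demo", 100, 0., 1.);
root [3] auto h = new TH1D("h", "demo", 50, 0., 2.);   // re-running a definition: also fine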
However if you are using Cling standalone, this feature is considered optional and thus disabled at startup. In any case, you can manually turn it on/off as follows:
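For standalone Cling, a sketch of the toggle could look like the following. It assumes a runtime option named AllowRedefinition on cling::Interpreter, so please verify the exact API against your Cling version's headers before relying on it:
// Assumption: the interpreter exposes its redefinition switch via RuntimeOptions.
#include "cling/Interpreter/Interpreter.h"

void setRedefinitionsAllowed(cling::Interpreter &interp, bool allowed) {
   interp.getRuntimeOptions().AllowRedefinition = allowed ? 1 : 0;
}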
Formally, the ISO C++ one-definition-rule (ODR) forbids multiple definitions in order to ensure a consistent view of an entity across different translation units. The technique implemented in Cling does not, however, violate the ODR as each definition is internally enclosed in its own namespace. This ensures the uniqueness of the qualified (and mangled) name of each definition. The trick is completed by making the latest definition available in the global scope by fixing up the translation unit lookup table.
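Conceptually, the effect is as if each definition were wrapped like this (an illustration, not the code Cling actually generates):
// Each top-level definition lands in its own unique namespace, so both symbols
// keep distinct qualified and mangled names and the ODR is preserved.
namespace __cling_def_0 { int x = 1; }
namespace __cling_def_1 { double x = 3.14; }
// The translation unit's lookup table is then fixed up so that an unqualified
// `x` resolves to the latest definition:
using __cling_def_1::x;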
For more information, you can take a look at Cling issue #259, where part of the brainstorming took place. Also, you can refer to our conference paper published in the Proceedings of the 29th International Conference on Compiler Construction (CC 2020).
This feature allows a user to redefine functions, variables, or classes declared within the same interpreter session. We hope that our users will enjoy this as much as we enjoyed implementing it.
Special thanks go to Chandler Carruth, Axel Huebl et al. for providing some initial ideas on which the final design was built, and to Axel Naumann and Vassil Vassilev for reviewing the implementation and the submitted paper.
]]>ROOT’s source repository now has a latest-stable branch that will be regularly updated after each release. If you want to know more, keep reading!
We have listened to you! This is for people who want the latest, without trying out things before they are released: a new major version twice a year, and bug-fix releases in between, i.e. always the latest version that went through release validation.
Before this change, the instructions to build from source suggested cloning the latest tagged release (or patches branch). This turns out to be inconvenient, as the tag/branch names for the latest available version change after each release: e.g., as of this writing, the latest tagged release is v6-24-00, which will soon be superseded by v6-26-00. As a consequence, a user who wants to build the latest release from source had to check the tag/branch name before issuing the git clone command.
Starting with the 6.24 release, we created the latest-stable branch, which is targeted at users that regularly build ROOT from source. Furthermore, we will automatically update this branch to the latest tagged release, so users that have already checked it out get the update for free.
Before getting hands-on, keep in mind that building from source is for advanced users. The preferred method for regular users to install ROOT is via pre-compiled packages. More on that can be found in the install guide.
That said, our aim was to make this really simple for those who are used to building ROOT from source. And that’s now as simple as:
$ git clone --branch latest-stable https://github.com/root-project/root.git root_src
Then, you can follow the rest of the instructions in build from source as usual.
But we didn’t stop there. This branch will be updated regularly after each release, which means that you can easily upgrade ROOT to the latest release by simply:
$ git pull
$ cd <empty build dir>
$ # build as usual
We hope that this change saves time (and avoids issues) for users that build ROOT from source. If you face any problem while using this new branch, please feel free to report it here.
Special thanks go to Jonas Hahnfeld for our discussions on the optimal approach, and to the rest of the ROOT team for providing useful feedback.
]]>Generally we have two big releases per year - but v6.24/00 took a bit longer: we really wanted to have the upgrade to LLVM 9 in, for full C++17 support and many bug fixes!
But there’s more:
ROOT::RDF::RunGraphs can run multiple RDataFrames in parallel.
That’s an easy way to evaluate, for instance, uncertainty variations concurrently.
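A minimal sketch of how that looks in C++ (file, tree and column names are placeholders):
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RDFHelpers.hxx> // ROOT::RDF::RunGraphs

void runGraphs()
{
   ROOT::EnableImplicitMT();
   ROOT::RDataFrame nominal("events", "nominal.root");
   ROOT::RDataFrame variation("events", "variation_up.root");

   // Book results lazily; no event loop runs yet.
   auto hNominal = nominal.Histo1D("x");
   auto hVariation = variation.Histo1D("x");

   // Trigger both computation graphs concurrently instead of one after the other.
   ROOT::RDF::RunGraphs({hNominal, hVariation});
}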
And look at this:
RDataFrame = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame
It’s running your RDataFrame on a Spark cluster!
To see how it works, check the new tutorial.
Once we are happy, this will be our recommended replacement for good old PROOF.
We have converted the blazingly fast RANLUX++ implementation from x86 assembly to portable C++. And it’s just as fast as the assembly version! It will be presented at vCHEP - including a fix for a bug in the assembly version, discovered by Martin Lüscher, the original author of RANLUX.
Still in the math department, TMVA can now interface with PyTorch as a more flexible alternative to PyKeras.
RooFit has several big improvements under the hood; we expect that since v6.20, typical uses of RooFit will be accelerated by a factor of 4 to 16!
Several of you were using community-developed implementations called RooDSCBShape and RooSDSCBShape. We have integrated them (and significantly improved them!) as RooCrystalBall, so please switch!
If you want to see all the new features and all the squashed bugs then please check the release notes.
To get ROOT v6.24, you could for instance use conda, snap, MacPorts, Homebrew (soon), or some of the other ways (including downloading binaries).
If you’re using the LCG release: ROOT 6.24/00 is part of LCG100.
At the beginning of every year, the ROOT team discusses with the experiments what we will be working on during that year. We publish that, so you can track how often we manage to actually do what we plan to do! More seriously, it’s our way of inviting feedback for you to influence our priorities. So let’s do that: I tell you what we plan to work on, and you tell us what’s missing, and what’s more important than other items!
For us, for 2021, the main items are:
- distributed RDataFrame, adding for instance Dask support;
- feeding RDataFrame into machine learning through Python generators;
- running several RDataFrame computations within a single event loop;
- bookkeeping for RDataFrame results, so you can easily handle histograms from MC vs data samples;
- the new RBrowser by default.
There are a couple of “high intensity” developments ongoing for LHC’s high-luminosity future, such as RNTuple, and more “software engineering” or framework-oriented ones such as modular CMake “superbuilds”, interpreter debug symbols and optimization - but those should either “just work” or will only arrive in your ROOT in the future.
As you can see and as you probably know, some developments need lots of time:
we need to do research, we need to benchmark, collect usage feedback, compare with alternatives, etc.
And we don’t always have the helping and coding hands we need.
So even if you don’t see ROOT’s new graphics system and new histograms up there: they are expected to progress, also in 2021.
We think they are crucial to make ROOT easier to use and future-proof, but we will only blog about them once we think you should try them out.
Please let us know below if ROOT is missing a crucial feature, for you, your analysis, or your way of working!
We have this new web site, people seem to like it, but it has a big problem: The 404-page isn’t customized! As we are using Jekyll, it’s super-easy to do. We just never felt like that’s more important than whatever else we have to do :-)
If you submit a pull request against our web site repo for a custom ROOT 404 page, and we select your proposal, then we’ll be sending you a Raspberry Pi Zero WH (i.e. with wifi, bluetooth, and pre-soldered GPIO headers), leftover from an earlier ROOT workshop, for you to play with, complete with HDMI and USB OTG adapter and SD card! And yes, that totally runs ROOT, too! If you win, the whole community will be enjoying your page whenever people look at ROOT forum posts from 20 years ago!
]]>A few weeks ago, I published a blog post about ROOT File Viewer, an extension to view ROOT Files directly in VS Code. In the discussion of said post, the question of how to run ROOT Macros in VS Code arose, and I came up with an example repository to show how that can be accomplished. Let us now walk through this example repository and see how everything works.
To get started, we will clone and open the repository. This can be done from the command line with:
git clone https://github.com/AlbertoPdRF/root-on-vscode.git
cd root-on-vscode
code .
After this is done, we need to manually set the path to our local ROOT installation in two files. We can do that quickly with the editor’s convenient global search and replace functionality:
- Open the Search view with Ctrl + Shift + F or by clicking on the Search icon in the Activity Bar
- Search for /home/apdrf/programs/root-6.22/install; two occurrences should show up
- Toggle Replace by clicking on the caret on the left of the Search input box
- Type the path to your local ROOT installation in the Replace input box
- Press Ctrl + Alt + Enter or click on Replace All on the right of the input box
We are almost ready to see everything in action, so let us open the workspace and get to it! There are a few ways to do that, but probably the simplest one is to:
- Open the root-on-vscode.code-workspace file, which is located inside the .vscode folder
- Click on Open Workspace
After we have opened the workspace, a toast notification will appear at the bottom right of the editor asking us if we want to install the recommended extensions for the repository. Clicking on the Install button will install both the C/C++ and ROOT File Viewer extensions. Please note that the second one is not strictly required for everything to work; it is just listed as a recommendation for convenience.
And that is everything that is needed! Just press F5 and see it for yourself: the example hsimple.C ROOT Macro will run and the hsimple.root file will be recreated.
With everything running, we can now take advantage of some of the awesome functionalities that we mentioned before to develop our ROOT Macros, for instance clang-format to always get our code styled as we want it. But I know this all feels like magic, so if you want to know how everything works, keep reading!
What we do in this example is to define a VS Code Workspace with the necessary configuration for everything to work, so let us first see what a workspace is and how it is configured.
A VS Code Workspace is simply a collection of (one or more) folders that are opened in a VS Code instance (a window). For this example, we have defined a root-on-vscode workspace through the root-on-vscode.code-workspace JSON file located in the .vscode folder. Said file contains the following configuration:
- folders: here we define the path to our workspace folder(s), relative to the location of this same file
- settings: through this object we tell VS Code to treat files with the .C extension as C++ files, and the path where it has to search for our header files, which in this case is ROOT’s include directory
- extensions: here is where we recommend the C/C++ and ROOT File Viewer extensions to be installed for this workspace
Apart from configuring the settings of the workspace, we have also defined a launch configuration, which is how we are going to be able to run ROOT Macros directly within VS Code. This is done in the launch.json file, also located in the .vscode folder. From this file I will just mention a few of its key points:
- It uses the gdb debugger under the hood
- It launches the root.exe executable
- It passes the flag -l to avoid showing ROOT’s banner
- It passes the flag -q so the program quits after it finishes processing the macro
- It passes hsimple.C+g to tell the program which macro to run and to compile it with debugging symbols – this is what allows us to set break points through the macro
And this is basically it; the rest of the things that the repository includes are:
- A .gitignore file to not commit compilation artifacts to the repository
- The hsimple.C macro from the ROOT Tutorials
- The resulting hsimple.root file
- A README.md file with some basic information
With this blog post I just wanted to quickly illustrate how we can configure VS Code to run ROOT Macros directly in it. Doing so allows us to take advantage of some great functionalities of the editor that will make our lives way easier, and this way we can focus on what truly matters!
]]>With the coming release of ROOT v6-24-00 we are excited to launch a brand new PyTorch Interface for TMVA.
PyTorch is a Python-based scientific package supporting automatic differentiation: an open-source machine learning framework that accelerates the path from research prototyping to production deployment.
TMVA already has a PyKeras interface which we all love, especially with Keras’ simple high-level TensorFlow API. If your work involves some elementary experiments, Keras may be the go-to framework due to its plug-and-play spirit.
But things get interesting when one requires low-level control and flexibility. That’s when the argument for Keras starts to hold less water. PyTorch, on the other hand, is amazing in terms of the control, flexibility and raw power that it can provide to the user. Its lower-level approach is better suited for the more mathematically-inclined users.
PyTorch is widely used among researchers and hence has a large community around it.
ROOT + PyTorch: Allows to integrate ROOT methods which are good at handling HEP data and PyTorch for Machine Learning.
Power & Flexibility: Neural nets are not easy to develop using TMVA, as they require complex configuration strings. Even with the PyKeras interface, designing custom layers is not feasible. PyTorch offers the power and flexibility to achieve complex models with custom layers, optimizers, loss functions and training methodologies.
Ease of Debugging: PyTorch models make use of dynamic computation graphs and are based on eager execution. This makes it easier to use debugging tools like pdb.
Performance: PyTorch is extremely fast due to its highly optimized C++ backend.
Designing a simple model in PyTorch using one of its containers is straightforward. Here we use nn.Sequential:
import torch.nn as nn

model = nn.Sequential()
model.add_module('linear_1', nn.Linear(in_features=4, out_features=64))
model.add_module('relu', nn.ReLU())
model.add_module('linear_2', nn.Linear(in_features=64, out_features=2))
model.add_module('softmax', nn.Softmax(dim=1))
See PyTorch docs for more tutorials.
As we mentioned earlier, the power and flexibility come in the form of designing custom layers as well as writing a custom training loop.
import torch

loss = torch.nn.MSELoss()
optimizer = torch.optim.SGD
def train(model, train_loader, val_loader, num_epochs,
batch_size, optimizer, criterion, save_best, scheduler):
...
def predict(model, test_X, batch_size=32):
...
Defining a load_model_custom_objects dictionary with the keys "optimizer", "criterion", "train_func" and "predict_func" is the only extra step required when using the PyTorch interface in TMVA. Everything else is native PyTorch or TMVA.
load_model_custom_objects = {"optimizer": optimizer, "criterion": loss,
"train_func": train, "predict_func": predict}
In the end we book our TMVA method with the kPyTorch type and call the training method.
factory.BookMethod(dataloader, TMVA.Types.kPyTorch, 'PyTorch',
'H:!V:VarTransform=D,G:FilenameModel=model.pt:'
'NumEpochs=20:BatchSize=32')
factory.TrainAllMethods()
You can check out a more detailed tutorial here, as well as some examples in the ROOT repository. Read more about the development journey of the TMVA PyTorch interface on Anirudh’s GSoC blog.
]]>Before diving into the details, let us just quickly see the extension in action. Although the saying goes that “an image is worth a thousand words”, I think that a GIF will serve us better this time:
After seeing this, who would still want to open a terminal, launch ROOT, and create a TBrowser instead? And I haven’t even mentioned yet that for this extension to work no local installation of ROOT is required!
If you just want to install the extension already and have a good time opening all your ROOT Files, you can do so by:
- Opening Quick Open (Ctrl + P), pasting ext install albertopdrf.root-file-viewer, and pressing enter
- Searching for ROOT File Viewer directly within VS Code’s Extensions view (Ctrl + Shift + X or clicking on the Extensions icon in the Activity Bar)
- Running code --install-extension albertopdrf.root-file-viewer from the command line
And if you want to know more, just keep reading!
VS Code is a free, open source, and very popular code editor developed by Microsoft with TypeScript, a superset of JavaScript. It runs on any operating system, supports many languages, has built-in support for Git, and much more. Moreover, its functionality can easily be extended thanks to the Extension API. This is where the fun begins!
In the case of the ROOT File Viewer extension, the Custom Editor API is leveraged to handle ROOT Files. The custom editor requires two parts: a view and a document model. The view of the file is implemented through the Webview API, and the document model is a custom RootFileDocument class, which we will keep simple by implementing a CustomReadonlyEditorProvider, the RootFileEditorProvider. We could go deep into details here, but that is probably outside of the scope of this post.
JavaScript ROOT brings ROOT to the browser. It is basically a drawing and I/O library that can be used to provide interactive plots and many other ROOT core functionalities, as can be seen on the examples page.
ROOT File Viewer makes use of the HierarchyPainter object to do all the heavy lifting regarding the handling of the ROOT files and the drawing of the objects stored in them. It is configured with the tabs layout, and it gets passed the user’s VS Code theme background color so it integrates better with the editor.
The implementation of the extension boils down to gluing the two awesome tools mentioned above together and, in all honesty, I tried to keep everything as simple as possible in order to have a proof of concept up and running quickly. All the magic happens in the rootFileEditor.ts file, which contains both the implementation of the custom document and the webview.
The RootFileDocument custom document is the object that gets created each time a user opens a ROOT File. For what concerns us, it stores the path to the file that we want to create a view for.
The RootFileEditorProvider is where all the functionality is implemented, which can be summarized as:
This last point is where JavaScript ROOT comes into play, as all the custom editor provider does at this point is to create a template HTML document with an embedded script where JavaScript ROOT gets passed the ROOT File path and the customization options mentioned before. Everything else just automagically works!
If you would like, you can check out (and even contribute to) the source code on ROOT File Viewer’s GitHub repository. And, of course, you also can (and are encouraged to) open an issue if a bug arises or you have a feature suggestion!
VS Code extensions receive automatic updates, so rest assured that you won’t miss any cool future features that may come!
To wrap everything up, with ROOT File Viewer I wanted to solve a pain point that I believe exists for more people than just me. I hope that glancing over the contents of a ROOT File is quicker and more practical now that this extension exists.
Working with such awesome tools as VS Code and JavaScript ROOT has been a ton of fun, and I would definitely recommend it to the geeks out there who enjoy getting to know new technologies and like to build things for people to interact with.
And, lastly, I would like to thank you for dedicating the time to read this post and everyone who has shown their support after the launch of ROOT File Viewer!
]]>Take a look at the store listing at https://snapcraft.io/root-framework, where you can find installation instructions for some common distributions, e.g. Ubuntu, Debian, Fedora, OpenSUSE, CentOS, Arch, and more!
On Ubuntu, it is as simple as:
sudo snap install root-framework && root
You might even be able to just search for the ROOT Framework and install it in a single click!
This is a full-fat installation of ROOT, complete with its utilities such as hadd, PyROOT via Python 3.8 (with SciPy, NumPy, Pandas and Matplotlib), and JupyROOT.
You will get these bundled by default, and since the whole package is based on container technology, they cannot interfere with any of your system libraries and can be easily upgraded (automatically!), removed, and mixed alongside other ROOT installations.
Just run root in the terminal after installation and you can get to work instantly. Give root --notebook a go and try out the JupyROOT support.
As a special case, if you want PyROOT, you must run pyroot rather than python. This ensures you get the bundled version of Python in the container rather than the host system, but from there you can import ROOT normally and run your scripts. You can also pass parameters to pyroot as if it were python, e.g. pyroot -i $(root-config --tutdir)/pyroot/fillrandom.py.
There is no need to mount the $HOME directory, graphical support should work by default, and a lot of optional packages are built by default.
The goal is to provide a Docker-like experience for ROOT but blur the distinction between the container and host environment, in a way that is convenient for users.
For example, by simply adding a shortcut to start ROOT in the start menu of most systems under the science section.
Most snap packages are under a sandboxing model that might subtly interfere with a user’s regular workflow.
This means that the ROOT snap cannot just access a user’s camera or microphone, for example, since this makes little sense for ROOT. However, one notable consequence is that ROOT is limited to accessing files in the user’s home directory (aside from over the network). Furthermore, the snap will be prevented from accessing hidden files/folders in the top level of the home directory itself, such as $HOME/.ssh.
To help make this work, the $HOME variable and gSystem->HomeDirectory() will return a modified value for the user’s home directory, generally /home/example/snap/root-framework/current/. If you want to make use of rootlogon.C for the entire application, keep in mind that ROOT will look for it in $HOME, which points there instead. If you make use of the parallel installation ability mentioned below, this can be an advantage, as each installed version of the ROOT snap will have a unique $HOME, and may have different rootlogon.C files, history, etc.
for example.
The value for the current working directory works the same as normal.
If ROOT is opened with the current working directory set to to /home/example
, you can access your desktop folder as simply ./desktop
.
$ pwd
/home/james
$ echo "Hello World" >> Example.txt
$ root
------------------------------------------------------------------
| Welcome to ROOT 6.22/06 https://root.cern |
| (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
| Built for linuxx8664gcc on Jan 08 2021, 20:08:00 |
| From tags/v6-22-06@v6-22-06 |
| Try '.help', '.demo', '.license', '.credits', '.quit'/'.q' |
------------------------------------------------------------------
root [0] .!cat Example.txt
Hello World
root [1] .!pwd
/home/james
Snaps update automatically, using delta patches to be bandwidth efficient. Due to the container properties, this should be a safe operation, so that jumping from one version of ROOT to the next can be done even if it upgrades to an entirely different compiler toolchain.
Nightly builds are produced and accessible with sudo snap install root-framework --edge. If you are already using the snap and want to swap to the edge branch, use sudo snap refresh root-framework --channel=edge.
A track in Snapcraft terms is a separate branch of a project that can be downloaded instead of the default release. The default release is called “latest”, and its stable channel will generally follow the newest stable ROOT release. As a result, users will automatically update to newer branches of ROOT. However, in some scenarios people may like to use an older release, and tracks could be used to provide this in the future.
If there is a demand to produce these tracks, please provide some feedback and it can be looked into! The following example syntax would be usable if/when tracks are declared.
sudo snap install root-framework --channel=v6-22/stable
https://snapcraft.io/docs/parallel-installs
https://snapcraft.io/docs/commands-and-aliases
This feature is still experimental, but it is possible to have both the ROOT stable snap and the ROOT nightly snap alongside each other. However to make proper use of this functionality, it helps to understand Snap aliases.
The root command itself is an alias, because the snap package is called root-framework. When installed from the Snap Store, an alias is created automatically between root and root-framework. For the other binaries such as hadd, the original names are namespaced, so hadd in the namespaced form is root-framework.hadd.
When you install extra instances of a snap, you must decide which aliases you wish to use manually.
sudo snap set system experimental.parallel-instances=true
sudo snap install root-framework
sudo snap install root-framework_nightly --edge --unaliased
While root will point to the stable version, you can run sudo snap prefer root-framework_nightly so that the next invocation of root will be from the nightly branch. You can also alias individual commands, or simply use the unaliased binary names, such as root-framework_nightly.root. It is preferable to use Snap aliases rather than Bash aliases, as a snap alias will affect all system users. The additional installations of snaps will all have their own unique $HOME values, so they can have differing rootlogon.C files and different history for each snap instance.
https://discourse.ubuntu.com/t/using-snapd-in-wsl2/12113
Officially, running snaps on WSL2 is unsupported. This is because WSL2 has a custom init system rather than systemd. Unofficially, it is possible to get it working reasonably well anyway, but this is not directly supported.
Having personally tried it, it is possible to get JupyROOT in the snap running in a web browser on Windows, and for some people, the snap on WSL2 might make sense.
Other virtual machine platforms, such as Virtualbox should be able to install Snaps without issues.
At the moment, CUDA is not supported, but this may change in the future.
OpenGL generally works fine in a Snap environment for most GPUs. Notable exceptions are the amdgpu-pro drivers, and the NVIDIA proprietary drivers on Debian and Debian derivatives (but excluding Ubuntu and Ubuntu derivatives). These issues can be resolved pending upstream work in the future.
There should not be any observable performance difference between the snap version of ROOT and any other version: a macro that takes an hour to run outside the snap should take an hour to run inside it.
Creating independent executables is not supported in the snap environment. The ABI is not stable, the compiler toolchain will be foreign to most systems, and the automatic updates would ruin this regularly even if you managed to hack it into working. If this is essential to your workflow, you are likely better suited with an alternative package.
Executing binaries from outside the snap environment from inside the environment itself will not work due to the sandboxing, and the image itself is by default inflexible, so that adding more Python modules for example involves either rebuilding the snap or using debug modes. If there are binaries and packages that might make sense inside the container, please give feedback and they can be considered to be default!
If you want to change the CMake parameters, add some extra packages or some extra Python modules, you might be pleasantly surprised with the Snapcraft build system and there are some instructions on my personal GitHub page on how to do it. https://github.com/MrCarroll/root-snap
Because the snap purposefully keeps its files away from the normal system, IDEs do not work with the ROOT snap. Consider using root --notebook to access JupyROOT for an IDE-like experience. If this is insufficient, you are likely better suited with an alternative package.
Currently the ROOT snap is only built for AMD64/x86_64.
Snaps can work on other architectures, if there is a demand for alternative architectures such as ARM64, please give feedback and it can be considered. In the meantime, the Snapcraft build system should generally work on ARM64 and various other architectures, so you might be able to compile your own.
In summary, I hope there are a lot of users for whom a Snap package of ROOT might make sense. Prior to uploading this blog post, there are already several thousand downloads, going well above my personal expectations, and the issue tracker has not crashed yet so I am hopeful that it is being successful in helping get ROOT into people’s hands.
Please feel free to give your feedback on this package. Whilst not everything will be actionable, knowing what issues people have can help guide future improvements. In particular, feedback about additional python modules, issues with the sandboxing, and performance regressions are appreciated, though any feedback at all would be very much appreciated. You can get in touch with me on the ROOT forums as @james-carroll; or feel free to report issues at https://github.com/MrCarroll/root-snap, where you can also find information on building your own custom ROOT snap.
Special thanks go to Axel Naumann for being responsive and helping to reduce the bus factor of this package; thanks to my good friend Theodore Zorbas for giving me the inspiration to tackle this project and being my guinea pig for testing it, thanks to the ROOT community for already investing significant time in making ROOT easier to package, thanks to Canonical for the Snapcraft tooling, hosting, and build servers, and thanks to GitHub for their hosting and build servers too! Between Canonical and GitHub this entire package is built and distributed free of cost.
]]>The added TRandomRanluxpp joins the group of available RNGs in ROOT:
- TRandom1 is the RANLUX generator proposed by Martin Lüscher. The implementation is based on the original description by Fred James and generates single precision values based on 24 bits of randomness.
- TRandom3 is based on the Mersenne Twister generator. By default, this implementation is used for gRandom and generates 32 bits of randomness.
- TRandomMixMax is a 61-bit matrix recursive MIXMAX generator (described in https://mixmax.hepforge.org/). It is based on a state of size 240; other sizes are available via MixMaxEngine.
There are a few more variants that are linked in the documentation.
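Using the new generator follows the usual TRandom interface. A minimal sketch (the header name is an assumption, based on the other TRandomGen-based typedefs; check the class documentation):
#include "TRandomGen.h" // assumption: TRandomRanluxpp is declared here

void useRanluxpp()
{
   TRandomRanluxpp rng;
   rng.SetSeed(42);
   double u = rng.Rndm();       // uniform in (0, 1)
   double g = rng.Gaus(0., 1.); // standard normal via the common TRandom interface
   (void)u; (void)g;
}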
There exist many properties by which to evaluate PRNGs, the easiest being the period: it captures how many random numbers can be generated before the sequence “wraps around” and repeats. For obvious reasons, the period of a PRNG cannot exceed the number of distinct states, i.e. 2^n for a state of n bits. This is one of the problems of the oldest LCGs with a state of 32 bits or less.
However, there are many more statistical properties that are at least equally important. A widely used collection of empirical tests is TestU01 by L’Ecuyer and Simard. It has to be noted that the tests can only prove the existence of defects. That is, passing all tests does not mean that a PRNG is free of errors.
As with many other design issues, PRNGs need to make a certain trade-off: fast generators with a small state will usually fail many statistical tests. On the other hand, good generators are typically slower or use a much larger state. In fact, most generators will fail one test or another, including the widely used Mersenne Twister. Notable exceptions are RANLUX and MIXMAX that pass all currently known tests.
RANLUX++ is an LCG implementation derived from the ideas of RANLUX. It aims to provide better performance while maintaining the good properties of RANLUX. In particular, this includes the foundation based on mixing of classical mechanical systems. For this reason, the randomness of the generated sequence can be theoretically understood. The formulation as LCG makes it possible to reach even better properties without impact on performance.
The generator uses a single 576 bit number as state. When invoked, it hands out the next unused bits that the generator remembers as an offset. Once all bits are exhausted, the LCG performs a multiplication and a modulo operation to advance the state. Arithmetic on 576 bit numbers is way beyond what current hardware offers. Instead the state is portably stored as 9 numbers of 64 bits each. On this representation, multiplication and the modulo operation must thus be done in software. For the multiplication, this is very similar to the way multiplications are taught in high school. The modulo operation can take advantage of the known modulus (for details, refer to section 2.1 of the paper).
For efficiency reasons, Sibidanov implemented the routines in x86 assembly. This results in impressive performance, but it is not portable across all architectures supported by ROOT. So how much slower would a portable implementation be? After implementing RANLUX++ for ROOT, it turns out: not much!
The initial version added to ROOT takes around 30ns per random double sampled.
For comparison, the implementation by Sibidanov comes in at around 8ns per number.
This was measured with 10 repetitions of a microbenchmark with a standard deviation of less than 1%.
The numbers are from a single core of an AMD Ryzen 9 3900 running CentOS 8.2.2004.
The default compiler is GCC version 8.3.1, but I will present more numbers in the following.
30ns per number is already quite impressive and twice as fast compared to TRandom1.
That implementation of RANLUX takes around 62ns per number and only generates single precision values.
It must be noted that a newer version by Lüscher was also measured at 30ns for double values.
That said, can we do even better? Yes!
The graph below shows the evolution of performance after tuning. In addition to GCC 8.3.1, I also include numbers for Clang 9.0.1 and GCC 7.5.0. While the former two are packaged for CentOS 8, I compiled GCC 7.5.0 from the official sources. The dashed horizontal line represents the assembly version by Sibidanov. Details about the individual changes are described in the following sections.
The portable code relies on loops to implement the multiplication and modulo operation in software. This induces overhead when evaluating the exit condition and incrementing the loop variable. Furthermore, the loop structure leads to jumps in the code that hinder instruction level parallelism. Fortunately loop unrolling is a well known compiler optimization to tackle this problem: it duplicates the loop body to enable other optimization passes and generate more efficient code. In case of constant loop bounds, it is even possible to fully unroll the loop.
One disadvantage of loop unrolling is the increased code size.
For this reason, compilers are conservative in their unrolling heuristics.
During experiments, I found that unrolling of certain loops consistently improves performance.
By adding #pragma directives, it is possible to help the compiler.
As can be seen in above graph, this improves performance for GCC 8.3.1 and Clang 9.0.1 by up to 30%.
GCC 7.5.0 does not benefit from the change because #pragma GCC unroll was only added in GCC 8.
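To illustrate the kind of hint involved (trip count and guards chosen for illustration, not taken from the actual RANLUX++ sources):
#include <cstdint>

void addLimbs(uint64_t *acc, const uint64_t *x)
{
// Fully unroll the fixed-length limb loop; GCC 8+ understands "#pragma GCC unroll N",
// while Clang accepts "#pragma unroll" instead.
#if defined(__clang__)
#pragma unroll
#elif defined(__GNUC__) && __GNUC__ >= 8
#pragma GCC unroll 9
#endif
   for (int i = 0; i < 9; ++i)
      acc[i] += x[i];
}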
For the multiplication of two 576 bit numbers, the portable code is constrained by the available data types:
it can at most multiply two 32 bit values at a time to stay within the 64 bits of uint64_t.
This makes it necessary to handle overflows when summing up the partial results.
By arranging the operations differently, it is possible to avoid conditional execution.
Together with full unrolling, this eliminates all jumps from the generated machine code for GCC 8.3.1.
During execution, this leads to performance improvements of up to 15% for the two versions of GCC.
There is no change for Clang because its optimizations already transformed the original code.
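The constraint can be illustrated with a small helper (illustrative only, not the actual RANLUX++ code):
#include <cstdint>

// Multiplying two 32-bit values always fits into a uint64_t, so the partial
// product itself cannot overflow; it is then split into low and high halves
// that are accumulated separately.
inline void mul32x32(uint32_t a, uint32_t b, uint32_t &hi, uint32_t &lo)
{
   uint64_t p = static_cast<uint64_t>(a) * b;
   lo = static_cast<uint32_t>(p);
   hi = static_cast<uint32_t>(p >> 32);
}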
An even faster way is to let hardware handle the multiplication of two 64 bit numbers.
This is possible with the extension type __int128 supported by both GCC and Clang.
Using this type, it is possible to multiply 64 bit numbers with 128 bit precision.
For that, the compiler can make use of special instructions available for the targeted hardware.
Afterwards the upper and lower halves can be extracted as 64-bit numbers.
The change taking advantage of __int128 improves performance of all three compilers by more than 40%!
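In code, the idea looks roughly like this (a sketch, not the actual implementation):
#include <cstdint>

// One hardware 64x64 -> 128 bit multiplication replaces several 32x32 ones;
// the upper and lower halves are then extracted as 64-bit numbers.
inline void mul64x64(uint64_t a, uint64_t b, uint64_t &hi, uint64_t &lo)
{
   unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
   lo = static_cast<uint64_t>(p);
   hi = static_cast<uint64_t>(p >> 64);
}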
Additional tests showed inferior performance with older compilers.
Part of this is related to missing loop unrolling because the #pragma is not available (see above).
An additional problem is the generation of jumping code to propagate carry bits in case of overflows.
Instead of relying on inline assembly code, I found a portable solution after refactoring the common code.
The modification results in good code for all tested compilers.
For the older GCC 7.5.0 in particular, it improves performance by more than 20%.
In this blog post, I discussed the implementation of RANLUX++ and its performance.
After tuning, the average time per number is down to around 9ns, very close to the assembly version.
It is slightly slower than TRandom3, currently used for gRandom (around 3ns per number).
However, TRandom3 generates 32 bits of randomness while TRandomRanluxpp uses 52 bits per double value.
Furthermore, RANLUX++ inherits and extends the mathematically proven properties of RANLUX.
As it also passes all tests in TestU01, it might replace the default gRandom in the future.
But you do not need to wait until then: TRandomRanluxpp will be available with version 6.24 of ROOT.
Once released, you can replace the default generator as described in the documentation.
Or use a nightly version right now via LCG or Conda!
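As for swapping the global generator, that works the same way as for any other TRandom subclass; a minimal sketch (header name as assumed above):
#include "TRandom.h"
#include "TRandomGen.h"

void switchToRanluxpp()
{
   delete gRandom;                  // dispose of the default TRandom3 instance
   gRandom = new TRandomRanluxpp(); // from now on gRandom->Rndm() uses RANLUX++
}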
We have investigated virus scanner reports on some of ROOT’s binaries. The ROOT team has invested about 2.5 work days into this investigation: we took this seriously. On top of that, CERN IT’s security team has investigated this independently, and also invested a notable amount of working hours. We were able to create binaries that were diagnosed as infected, by building hadd.exe “from scratch”. We are convinced that these reports are false positives. We found out that upgrading the compiler works around the issue.
We could report this as a false positive to the vendors whose virus scanners were flagging some of ROOT’s binaries. This has a high latency, especially when adding the latency between the virus scanner engines tweaking their patterns and Google removing root.cern from the list of flagged sites.
We could argue with Google. This does not seem like a fast track option either.
Instead, we have removed the file in question. We are upgrading the compiler on our build machines, which means that we cannot create Windows binaries for new patch releases of ROOT 6.20 and before: only newer ROOT versions can be built with the newest Visual Studio version.
Given the removal of the file reported by Google, we have asked Google for a review of root.cern. As Google puts it, “A review can take from a few days to a few weeks to complete.”
If you have ideas how to further improve the situation, please let us know by adding a comment below.
Please rest assured that we build our binaries on always up-to-date Windows installations running updated virus scanners, and that we do whatever we can to keep our binaries clean. This is the second time in 20+ years that ROOT files have been misdiagnosed as infected. To address this, we will scan binaries proactively on VirusTotal from here on, before offering them for download, to avoid false positives affecting us again.
Axel, for the ROOT team.
]]>To make it easier for you to submit issues, we have now switched from Jira to GitHub Issues.
We hope that having a GitHub account is so common these days that you’re already logged in and used to GitHub’s interfaces.
Starting today, new bugs will only be reported in GitHub. What has been reported in Jira remains in Jira; we will keep fixing these Jira issues until they are all resolved. Jira now displays a message, asking you to use GitHub for issue submission instead.
For us, that change will help in triaging issues, tracking progress, and most importantly bringing issue handling way closer to our code. Pull requests, commits, issues, code: they can all cross-reference now, giving a consistent view.
Moving to a closed-source universe like GitHub isn’t exactly obvious for us. (But then again we’re coming from Jira…) On the other hand we did not want to use CERN’s GitLab because that requires a CERN account for issue submission, and exactly that was one of the biggest problems with Jira.
We hope that the move is not too painful for you. Maybe you even appreciate it! Let us know what we can do to make bug submission easier for you.
]]>If you are a ROOT power user with no time to lose, or you just want to quickly try out that new feature you heard about, or you really need a bug fix that was merged yesterday to get those fancy plots done, I have good news for you! Thanks to the great folks behind ROOT’s conda package, not only can you install the latest ROOT stable version on your computer in under 5 minutes, but from today you can also install the bleeding-edge, unreleased development version of ROOT from yesterday with the following one-liner:
$ conda create --name root-nightly-env -c conda-forge -c https://root.cern/download/conda-nightly/latest root-nightly
That command creates a Conda environment called root-nightly-env which contains the very latest ROOT with the very latest goodies.
To activate the environment and use that ROOT version, just call
$ conda activate root-nightly-env
The usual disclaimer about nightly builds applies: minor bugs might creep in and features might not yet be production-ready.
A big thank you goes to Chris Burr for all the help in making this happen. Also check out this nice blog post from Henry Schreiner, another maintainer of the ROOT Conda package, about the ins and outs of ROOT+Conda. All available distribution channels for ROOT nightly builds are listed at https://root.cern/install/nightlies.
Let us know what you think by clicking the comment button below!
]]>