ROOT Version 6.24 Release Notes

2022-09-29

Introduction

ROOT version 6.24/00 was released on April 14, 2021.

For more information, see:

http://root.cern

The following people have contributed to this new version:

Guilherme Amadio, CERN/SFT,
Bertrand Bellenot, CERN/SFT,
Josh Bendavid, CERN/CMS,
Jakob Blomer, CERN/SFT,
Rene Brun, CERN/SFT,
Philippe Canal, FNAL,
Olivier Couet, CERN/SFT,
Anirudh Dagar, CERN-SFT/GSOC,
Hans Dembinski, TU Dortmund/LHCb,
Massimiliano Galli, CERN/SFT,
Andrei Gheata, CERN/SFT,
Hadrien Grasland, IJCLab/LAL,
Enrico Guiraud, CERN/SFT,
Claire Guyot, CERN/SFT,
Jonas Hahnfeld, CERN/SFT,
Emmanouil Michalainas, CERN/SFT,
Stephan Hageboeck, CERN/SFT,
Sergey Linev, GSI,
Javier Lopez-Gomez, CERN/SFT,
Pere Mato, CERN/SFT,
Lorenzo Moneta, CERN/SFT,
Alja Mrak-Tadel, UCSD/CMS,
Axel Naumann, CERN/SFT,
Vincenzo Eduardo Padulano, CERN/SFT and UPV,
Danilo Piparo, CERN/SFT,
Fons Rademakers, CERN/SFT,
Jonas Rembser, CERN/SFT,
Andrea Sciandra, SCIPP-UCSC/Atlas,
Oksana Shadura, UNL/CMS,
Enric Tejedor Saavedra, CERN/SFT,
Christian Tacke, GSI,
Matevz Tadel, UCSD/CMS,
Vassil Vassilev, Princeton/CMS,
Wouter Verkerke, NIKHEF/Atlas,
Stefan Wunsch, CERN/SFT

General

Deprecation and Removal

Header Dependency Reduction

As always, ROOT tries to reduce the amount of code exposed through its headers. To that end, #includes were replaced by forward declarations in several headers. This might cause compilation errors (“missing definition of type…”) in your code, if that code was relying on indirect includes, instead of including the required headers itself. Please correct that simply by including the required header directly.

Core Libraries

Due to internal changes required to comply with the deprecation of Intel TBB’s task_scheduler_init and related interfaces in recent TBB versions, as of v6.24 ROOT will not honor a maximum concurrency level set with tbb::task_scheduler_init but will require instead the usage of tbb::global_control:

  //tbb::task_scheduler_init init(2); // does not affect the number of threads ROOT will use anymore

  tbb::global_control c(tbb::global_control::max_allowed_parallelism, 2);
  ROOT::TThreadExecutor p1;  // will use 2 threads
  ROOT::TThreadExecutor p2(/*nThreads=*/8); // will still use 2 threads

Note that the preferred way to steer ROOT’s concurrency level is still through ROOT::EnableImplicitMT or by passing the appropriate parameter to executors’ constructors, as in TThreadExecutor::TThreadExecutor.

See the discussion at ROOT-11014 for more context.

Dynamic Path: ROOT_LIBRARY_PATH

A new way to set ROOT’s “Dynamic Path” was added: the environment variable ROOT_LIBRARY_PATH. On Unix it should contain a colon separated list of paths, on Windows a semicolon separated list. It is intended to be cross platform and to be specific to ROOT (and thus not interfere with the system’s shared linker). The final “Dynamic Path” is now composed of these sources in order:

  1. ROOT_LIBRARY_PATH environment variable
  2. System specific shared linker environment variables like LD_LIBRARY_PATH, LIBPATH, or PATH.
  3. Setting from rootrc
  4. ROOT’s builtin library directory
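The ordering above can be illustrated with a minimal Python sketch. It is not ROOT's implementation: the function name `compose_dynamic_path` and its parameters are hypothetical, and for simplicity it looks at all three system variables named above, whereas in practice the relevant one depends on the platform.

```python
import os

# System linker variables named in the list above; which one applies is platform specific.
SYSTEM_LINKER_VARS = ("LD_LIBRARY_PATH", "LIBPATH", "PATH")


def compose_dynamic_path(env, rootrc_setting, builtin_dir, sep=os.pathsep):
    """Return search directories in the documented priority order.

    All parameters are hypothetical inputs of this sketch, not ROOT API.
    `sep` defaults to ':' on Unix and ';' on Windows.
    """
    def split(value):
        # Split a path list and drop empty entries.
        return [p for p in (value or "").split(sep) if p]

    dirs = split(env.get("ROOT_LIBRARY_PATH"))   # 1. ROOT-specific variable
    for name in SYSTEM_LINKER_VARS:              # 2. system linker variables
        dirs += split(env.get(name))
    dirs += split(rootrc_setting)                # 3. setting from rootrc
    dirs.append(builtin_dir)                     # 4. builtin library directory
    return dirs
```

Because ROOT_LIBRARY_PATH comes first, a directory listed there takes precedence over the same library name found through the system linker variables.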

Interpreter

Multithreading

I/O Libraries

TTree Libraries

RDataFrame

New features

Behavior changes

Notable bug fixes and improvements

The full list of bug fixes for this release is available below.

Distributed computing with RDataFrame

ROOT 6.24 introduces ROOT.RDF.Experimental.Distributed, an experimental Python package that enhances RDataFrame with distributed computing capabilities. The new package allows distributing RDataFrame applications through one of the supported distributed backends. The package was designed so that different backends can be easily plugged in. Currently the Apache Spark backend is supported, and support for Dask is coming soon. The backend submodules of this package expose their own RDataFrame objects. The only change needed in user code is to substitute ROOT.RDataFrame calls with such backend-specific RDataFrames. For example:

import ROOT

# Point RDataFrame calls to the Spark specific RDataFrame
RDataFrame = ROOT.RDF.Experimental.Distributed.Spark.RDataFrame

# It still accepts the same constructor arguments as traditional RDataFrame
df = RDataFrame("mytree","myfile.root")

# Continue the application with the traditional RDataFrame API

The main goal of this package is to support running any RDataFrame application distributedly. Nonetheless, not all RDataFrame operations currently work with this package. The subset that is currently available is:

with support for more operations coming in the future.

Any distributed RDataFrame backend inherits the dependencies of the underlying software needed to distribute the applications. The Spark backend for example has the following runtime dependencies (ROOT will build just fine without, but the feature will be unavailable without these packages):

Tests for the Spark backend can be turned ON/OFF with the new build option test_distrdf_pyspark (OFF by default).

Histogram Libraries

Math Libraries

Minuit2

TMVA

RooFit Libraries

Massive speed up of RooFit’s BatchMode on CPUs with vector extensions

RooFit’s BatchMode has been around since ROOT 6.20, but to fully use vector extensions of modern CPUs, a manual compilation of ROOT was necessary, setting the required compiler flags.

Now, RooFit comes with dedicated computation libraries, each compiled for a specific CPU architecture. When RooFit is loaded for the first time, ROOT inspects the CPU capabilities, and loads the fastest supported version of this computation library. This means that RooFit can now use vector extensions such as AVX2 without being recompiled, which enables a speed up of up to 4x for certain computations. Combined with better data access patterns (~3x speed up, ROOT 6.20), computations with optimised PDFs speed up between 4x and 16x.
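The selection step can be pictured with a short Python sketch. It only illustrates the idea of choosing the fastest supported variant. The preference list and the name `pick_compute_library` are assumptions of this sketch and do not describe how ROOT queries the CPU or names its libraries.

```python
# Hypothetical preference order, fastest first; the names are placeholders.
PREFERENCE = ("avx512", "avx2", "avx", "sse4.1")


def pick_compute_library(cpu_flags):
    """Return the fastest variant whose vector extension the CPU reports."""
    for extension in PREFERENCE:
        if extension in cpu_flags:
            return extension
    return "generic"  # portable fallback without vector extensions
```

A CPU reporting `{"sse4.1", "avx", "avx2"}` would get the `avx2` variant, while a CPU without any of the listed extensions falls back to the generic one.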

The fast BatchMode now also works in combination with multi processing (NumCPU) and with binned data (RooDataHist).

See Demo notebook in SWAN, EPJ Web Conf. 245 (2020) 06007, arxiv:2012.02746.

RooBatchCompute Library

The library that contains the optimised computation functions is called RooBatchCompute. The PDFs contained in this library are highly optimized, and work is in progress on further optimization using CUDA and multi-threaded computations. If you use PDFs that are not part of the official RooFit, you are welcome to add them to RooFit by submitting a ticket or a pull request.

Benefiting from batch computations by overriding evaluateSpan()

For PDFs that are not part of RooFit, it is possible to benefit from batch computations without vector extensions. To do so, consult the RooBatchCompute readme.

Migrating PDFs that override the deprecated evaluateBatch()

In case you have created a custom PDF which overrides evaluateBatch(), please follow these steps to update your code to the newest version:

  1. Change the signature of the function both in the source and header file:
- RooSpan<double> RooGaussian::evaluateBatch(std::size_t begin, std::size_t batchSize) const
+ RooSpan<double> evaluateSpan(RooBatchCompute::RunContext& evalData, const RooArgSet* normSet) const
  2. Include RunContext.h and BracketAdapter.h.
  3. Use getValues() instead of getValBatch() to retrieve a RooSpan for the data of every value.
- auto xData = x.getValBatch(begin, batchSize);
+ auto xData = x->getValues(evalData,normSet);
  4. Retrieve the number of events by getting the maximum size of the input spans.
  size_t nEvents=0;
  for (auto& i:{xData,meanData,sigmaData})
    nEvents = std::max(nEvents,i.size());
  5. Create the output batch by calling RunContext::makeBatch()
- auto output = _batchData.makeWritableBatchUnInit(begin, batchSize);
+ auto output = evalData.makeBatch(this, nEvents);
  6. DO NOT use RooSpan::isBatch() and RooSpan::empty() methods! Instead, distinguish between scalar (RooSpan of size 1) and vector (RooSpan of size>1) parameters as shown below.
- const bool batchX = !xData.empty();
+ const bool batchX = xData.size()>1;
  7. Prepend RooBatchCompute:: to the classes that have been moved to the RooBatchCompute Library: RooSpan, BracketAdapterWithMask, BracketAdapter, RunContext. Alternatively, you can write
using namespace RooBatchCompute;
  8. Replace _rf_fast_<function> with RooBatchCompute::fast_<function> and include RooVDTHeaders.h (if applicable).
- output[i] = _rf_fast_exp(arg*arg * halfBySigmaSq);
+ output[i] = RooBatchCompute::fast_exp(arg*arg * halfBySigmaSq);
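The scalar-versus-vector logic of these steps (number of events as the largest input size, and "size > 1" as the test for a batch) can be tried out in a small Python analogue. This is a sketch only: plain lists stand in for RooSpan, and `evaluate_gauss` and `at` are hypothetical names, not RooFit API.

```python
import math


def at(span, i):
    """Read element i of a batch, or the single value of a scalar (size 1)."""
    return span[i] if len(span) > 1 else span[0]


def evaluate_gauss(x_data, mean_data, sigma_data):
    """Unnormalised Gaussian over inputs that are batches or size-1 scalars."""
    # Number of events: maximum size of the input spans.
    n_events = max(len(span) for span in (x_data, mean_data, sigma_data))
    output = [0.0] * n_events  # stand-in for the output batch
    for i in range(n_events):
        arg = at(x_data, i) - at(mean_data, i)
        sigma = at(sigma_data, i)
        half_by_sigma_sq = -0.5 / (sigma * sigma)
        output[i] = math.exp(arg * arg * half_by_sigma_sq)
    return output
```

Calling `evaluate_gauss([0.0, 1.0, 2.0], [0.0], [1.0])` treats the mean and sigma as scalars and broadcasts them over the three x values.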

Unbiased binned fits

When RooFit performs binned fits, it takes the probability density at the bin centre as a proxy for the probability in the bin. This can lead to a bias. To alleviate this, the new class RooBinSamplingPdf has been added to RooFit. Also see arxiv:2012.02746.
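The size of this bias is easy to see in a toy calculation. The Python sketch below compares the bin-centre proxy with a probability obtained by sampling the density inside the bin. The midpoint sampling and all function names are assumptions of this sketch and are not taken from RooBinSamplingPdf.

```python
import math


def prob_from_centre(pdf, lo, hi):
    """Bin probability approximated by density at the centre times bin width."""
    return pdf(0.5 * (lo + hi)) * (hi - lo)


def prob_from_sampling(pdf, lo, hi, n_samples=1000):
    """Bin probability from a midpoint-rule integral over the bin."""
    width = (hi - lo) / n_samples
    return sum(pdf(lo + (k + 0.5) * width) for k in range(n_samples)) * width


def falling_exp(x):
    """Strongly curved toy density exp(-x), normalised on [0, inf)."""
    return math.exp(-x)
```

For the bin [0, 2] the centre proxy gives about 0.74, while the integral 1 - exp(-2) is about 0.86, so a wide bin on a steeply falling distribution is noticeably underestimated.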

More accurate residual and pull distributions

When making residual or pull distributions with RooPlot::residHist or RooPlot::pullHist, the histogram is now compared with the curve’s average values within a given bin by default, ensuring that residual and pull distributions are valid for strongly curved distributions. The old default behaviour was to interpolate the curve at the bin centres, which can still be enabled by setting the useAverage parameter of RooPlot::residHist or RooPlot::pullHist to false.
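The difference between the two modes can be sketched in a few lines of Python. The helper names are hypothetical and the averaging uses a simple midpoint sampling, which is an assumption of this sketch rather than a description of RooPlot's internals. Only the `use_average` flag mirrors the useAverage parameter mentioned above.

```python
def curve_value(curve, lo, hi, use_average=True, n_samples=200):
    """Curve value compared against a bin: average over the bin, or centre value."""
    if not use_average:
        return curve(0.5 * (lo + hi))
    width = (hi - lo) / n_samples
    return sum(curve(lo + (k + 0.5) * width) for k in range(n_samples)) / n_samples


def residual_and_pull(content, error, curve, lo, hi, use_average=True):
    """Residual (data minus curve) and pull (residual divided by the data error)."""
    residual = content - curve_value(curve, lo, hi, use_average)
    return residual, residual / error
```

For a parabola x*x on the bin [0, 2], the centre value is 1 while the bin average is 4/3, so the two modes give visibly different residuals for the same bin content.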

Improved recovery from invalid parameters

When a function in RooFit is undefined (Poisson with negative mean, PDF with negative values, etc), RooFit can now pass information about the “badness” of the violation to the minimiser. The minimiser can use this to compute a gradient to find its way out of the undefined region. This can drastically improve its ability to recover when unstable fit models are used, for example RooPolynomial.

For details, see the RooFit tutorial rf612_recoverFromInvalidParameters.C and arxiv:2012.02746.
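The idea of reporting "badness" can be illustrated with a toy Poisson likelihood in Python. This is a conceptual sketch: the penalty constants and the function name are hypothetical, and how RooFit actually transports this information to the minimiser is not shown here.

```python
import math


def poisson_nll(n_obs, mu, penalty_base=1e6, penalty_slope=1e3):
    """Negative log-likelihood of a Poisson count with a graded penalty.

    For an undefined mean (mu <= 0) the returned value grows with the
    distance from the valid region, so a minimiser sees a slope that
    points back towards mu > 0 instead of a flat "invalid" plateau.
    """
    if mu <= 0:
        badness = -mu  # how far the parameter is inside the undefined region
        return penalty_base + penalty_slope * badness
    return mu - n_obs * math.log(mu) + math.lgamma(n_obs + 1)
```

With a constant penalty, mu = -2 and mu = -1 would look equally bad. With the graded penalty, the value at -1 is lower, which tells the minimiser in which direction to move.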

Modernised RooDataHist

RooDataHist was partially modernised to improve const-correctness, to reduce side effects as well as its memory footprint, and to make it ready for RooFit’s faster batch evaluations. Derived classes that directly access protected members might need to be updated. This holds especially for direct accesses to _curWeight, _curWeightErrLo, etc, which have been removed. (It doesn’t make sense to write to these members from const functions when the same information can be retrieved using an index access operator of an array.) All similar accesses in derived classes should be replaced by the getters get_curWeight() or better get_wgt(i), which were also supported in ROOT <v6.24. More details on what happened:

  processEvent(dataHist.get(i), dataHist.weight()); // Dangerous! Order of evaluation is not guaranteed.

With the modernised interface, one would use:

  processEvent(dataHist.get(i), dataHist.weight(i));

To modernise old code, one should replace patterns like h.get(i); h.func() by h.func(i);. One may #define R__SUGGEST_NEW_INTERFACE to switch on deprecation warnings for the functions in question. Similarly, the bin content can now be set using an index, making prior loading of a certain coordinate unnecessary:

   for (int i=0 ; i<hist->numEntries() ; i++) {
-    hist->get(i) ;
-    hist->set(hist->weight() / sum);
+    hist->set(i, hist->weight(i) / sum, 0.);
   }

Previously, computing a bin index from external coordinates relied on side effects:

  // In a RooDataHist subclass:
  _vars = externalCoordinates;
  auto index = calcTreeIndex();

  // Or from the outside:
  auto index = dataHist.getIndex(externalCoordinates); // Side effect: Active bin is now `index`.

Coordinates are now passed into calcTreeIndex without side effects:

  // In a subclass:
  auto index = calcTreeIndex(externalCoordinates, fast=<true/false>); // No side effect

  // From the outside:
  auto index = dataHist.getIndex(externalCoordinates); // No side effect

This will allow for marking more functions const, or for lying less about const correctness.

Fix bin volume correction logic in RooDataHist::sum()

The public member function RooDataHist::sum() has three overloads. Two of these overloads accept a sumSet parameter to not sum over all variables. These two overloads previously behaved inconsistently when the correctForBinSize or inverseBinCor flags were set. If you use the RooDataHist::sum() function in your own classes, please check that it can still be used with its new logic. The new and corrected bin correction behaviour is:

New fully parametrised Crystal Ball shape class

So far, the Crystal Ball distribution has been represented in RooFit only by the RooCBShape class, which has a Gaussian core and a single power-law tail on one side. This release introduces RooCrystalBall, which implements some common generalizations of the Crystal Ball shape:

The new RooCrystalBall class can replace the RooDSCBShape and RooSDSCBShape classes that have circulated in the community.

2D Graphics Libraries

Networking Libraries

Multithreaded support for FastCGI

When THttpServer creates a FastCGI engine, 10 worker threads are now used to process requests received via the FastCGI channel. This significantly increases performance, especially when several clients are connected.

Better security for THttpServer with webgui

If THttpServer is created for use with webgui widgets (RBrowser, RCanvas, REve), it will only provide access to the widgets via a websocket connection; any other kind of request, such as root.json or exe.json, will be refused completely. Combined with connection tokens and the https protocol, this makes the use of webgui components in public networks more secure.

Enabled WLCG Bearer Tokens support in RDavix

Bearer tokens are part of the WLCG capability-based infrastructure, a scheme that describes what the bearer is allowed to do as opposed to who that bearer is. The token discovery procedure follows the WLCG Bearer Token Discovery specification document. Short overview:

  1. If the BEARER_TOKEN environment variable is set, then the value is taken to be the token contents.
  2. If the BEARER_TOKEN_FILE environment variable is set, then its value is interpreted as a filename. The contents of the specified file are taken to be the token contents.
  3. If the XDG_RUNTIME_DIR environment variable is set, then take the token from the contents of $XDG_RUNTIME_DIR/bt_u$ID (this additional location is intended to provide improved security for shared login environments, as $XDG_RUNTIME_DIR is defined to be user-specific as opposed to a system-wide directory).
  4. Otherwise, take the token from /tmp/bt_u$ID.
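The discovery order above can be sketched in Python as follows. This is an illustration, not the RDavix code. The function names are hypothetical, $ID is assumed to be the effective user id, surrounding whitespace is stripped, and the sketch assumes that a missing file under $XDG_RUNTIME_DIR falls through to the /tmp location.

```python
import os
from pathlib import Path


def _read_token(path):
    """Return the stripped file contents, or None if unreadable or empty."""
    try:
        return Path(path).read_text().strip() or None
    except OSError:
        return None


def discover_bearer_token(env=None, uid=None, tmp_dir="/tmp"):
    """Look for a bearer token in the four documented locations, in order."""
    env = os.environ if env is None else env
    if uid is None:
        uid = os.geteuid()  # assumption: $ID is the effective user id (Unix only)

    token = env.get("BEARER_TOKEN")                 # 1. token given directly
    if token:
        return token.strip()

    token_file = env.get("BEARER_TOKEN_FILE")       # 2. file named by the variable
    if token_file:
        return _read_token(token_file)

    runtime_dir = env.get("XDG_RUNTIME_DIR")        # 3. user-specific runtime dir
    if runtime_dir:
        token = _read_token(os.path.join(runtime_dir, f"bt_u{uid}"))
        if token:
            return token

    return _read_token(os.path.join(tmp_dir, f"bt_u{uid}"))  # 4. fallback in /tmp
```

Passing `env` and `uid` explicitly, as in `discover_bearer_token({"BEARER_TOKEN": "abc"}, uid=1000)`, makes the lookup easy to exercise without touching the real environment.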

Xrootd client support

ROOT can now be built with Xrootd 5 client libraries.

GUI Libraries

RBrowser improvements

JavaScript ROOT

Major JSROOT update to version 6

Class Reference Guide

One can now select a class’s documentation for a specific version. If a class does not exist in a given version, that version is grayed out, see for instance the documentation for ROOT::Experimental::RNTupleReader.

Build, Configuration and Testing Infrastructure

The following builtins have been updated:

PyROOT

Bugs and Issues fixed in this release

Release 6.24/02

Published on June 28, 2021

RDataFrame

Bugs and Issues fixed in this release

Release 6.24/04

Published on August 26, 2021

Bugs and Issues fixed in this release

Release 6.24/06

Published on September 1, 2021

Bugs and Issues fixed in this release

Release 6.24/08

Published on September 29, 2022

Bugs and Issues fixed in this release

HEAD of the v6-24-00-patches branch

These changes will be part of a future 6.24/10.