ROOT 6.14/05 Reference Guide |
The public interface to the RDataFrame federation of classes.
T | One of the "node" base types (e.g. RLoopManager, RFilterBase). The user never specifies this type manually. |
Definition at line 99 of file RDFInterface.hxx.
Public Member Functions | |
RInterface (const RInterface &)=default | |
Copy-ctor for RInterface. More... | |
RInterface (RInterface &&)=default | |
Move-ctor for RInterface. More... | |
template<typename T = Proxied, typename std::enable_if< std::is_same< T, RLoopManager >::value, int >::type = 0> | |
RInterface (const std::shared_ptr< Proxied > &proxied) | |
Only enabled when building a RInterface<RLoopManager> More... | |
template<typename AccFun , typename MergeFun , typename R = typename TTraits::CallableTraits<AccFun>::ret_type, typename ArgTypes = typename TTraits::CallableTraits<AccFun>::arg_types, typename ArgTypesNoDecay = typename TTraits::CallableTraits<AccFun>::arg_types_nodecay, typename U = TTraits::TakeFirstParameter_t<ArgTypes>, typename T = TTraits::TakeFirstParameter_t<TTraits::RemoveFirstParameter_t<ArgTypes>>> | |
RResultPtr< U > | Aggregate (AccFun aggregator, MergeFun merger, std::string_view columnName, const U &aggIdentity) |
Execute a user-defined accumulation operation on the processed column values in each processing slot. More... | |
template<typename AccFun , typename MergeFun , typename R = typename TTraits::CallableTraits<AccFun>::ret_type, typename ArgTypes = typename TTraits::CallableTraits<AccFun>::arg_types, typename U = TTraits::TakeFirstParameter_t<ArgTypes>, typename T = TTraits::TakeFirstParameter_t<TTraits::RemoveFirstParameter_t<ArgTypes>>> | |
RResultPtr< U > | Aggregate (AccFun aggregator, MergeFun merger, std::string_view columnName="") |
Execute a user-defined accumulation operation on the processed column values in each processing slot. More... | |
RInterface< Proxied, DS_t > | Alias (std::string_view alias, std::string_view columnName) |
Allows referring to a column by a different name. More... | |
template<typename... ColumnTypes, typename Helper > | |
RResultPtr< typename Helper::Result_t > | Book (Helper &&h, const ColumnNames_t &columns={}) |
Book execution of a custom action using a user-defined helper object. More... | |
template<typename... BranchTypes> | |
RInterface< RLoopManager > | Cache (const ColumnNames_t &columnList) |
Save selected columns in memory. More... | |
RInterface< RLoopManager > | Cache (const ColumnNames_t &columnList) |
Save selected columns in memory. More... | |
RInterface< RLoopManager > | Cache (std::string_view columnNameRegexp="") |
Save selected columns in memory. More... | |
RResultPtr< ULong64_t > | Count () |
Return the number of entries processed (lazy action) More... | |
template<typename F , typename std::enable_if<!std::is_convertible< F, std::string >::value, int >::type = 0> | |
RInterface< Proxied, DS_t > | Define (std::string_view name, F expression, const ColumnNames_t &columns={}) |
Creates a custom column. More... | |
RInterface< Proxied, DS_t > | Define (std::string_view name, std::string_view expression) |
Creates a custom column. More... | |
template<typename F > | |
RInterface< Proxied, DS_t > | DefineSlot (std::string_view name, F expression, const ColumnNames_t &columns={}) |
Creates a custom column with a value dependent on the processing slot. More... | |
template<typename F > | |
RInterface< Proxied, DS_t > | DefineSlotEntry (std::string_view name, F expression, const ColumnNames_t &columns={}) |
Creates a custom column with a value dependent on the processing slot and the current entry. More... | |
template<typename FirstColumn , typename... OtherColumns, typename T > | |
RResultPtr< T > | Fill (T &&model, const ColumnNames_t &columnList) |
Return an object of type T on which T::Fill will be called once per event (lazy action) More... | |
template<typename T > | |
RResultPtr< T > | Fill (T &&model, const ColumnNames_t &bl) |
Return an object of type T on which T::Fill will be called once per event (lazy action) More... | |
template<typename F , typename std::enable_if<!std::is_convertible< F, std::string >::value, int >::type = 0> | |
RInterface< RDFDetail::RFilter< F, Proxied >, DS_t > | Filter (F f, const ColumnNames_t &columns={}, std::string_view name="") |
Append a filter to the call graph. More... | |
template<typename F , typename std::enable_if<!std::is_convertible< F, std::string >::value, int >::type = 0> | |
RInterface< RDFDetail::RFilter< F, Proxied >, DS_t > | Filter (F f, std::string_view name) |
Append a filter to the call graph. More... | |
template<typename F > | |
RInterface< RDFDetail::RFilter< F, Proxied >, DS_t > | Filter (F f, const std::initializer_list< std::string > &columns) |
Append a filter to the call graph. More... | |
RInterface< RDFDetail::RJittedFilter, DS_t > | Filter (std::string_view expression, std::string_view name="") |
Append a filter to the call graph. More... | |
template<typename F > | |
void | Foreach (F f, const ColumnNames_t &columns={}) |
Execute a user-defined function on each entry (instant action) More... | |
template<typename F > | |
void | ForeachSlot (F f, const ColumnNames_t &columns={}) |
Execute a user-defined function requiring a processing slot index on each entry (instant action) More... | |
ColumnNames_t | GetColumnNames () |
Returns the names of the available columns. More... | |
template<typename V = RDFDetail::TInferType> | |
RResultPtr<::TH1D > | Histo1D (const TH1DModel &model={"", "", 128u, 0., 0.}, std::string_view vName="") |
Fill and return a one-dimensional histogram with the values of a column (lazy action) More... | |
template<typename V = RDFDetail::TInferType> | |
RResultPtr<::TH1D > | Histo1D (std::string_view vName) |
template<typename V = RDFDetail::TInferType, typename W = RDFDetail::TInferType> | |
RResultPtr<::TH1D > | Histo1D (const TH1DModel &model, std::string_view vName, std::string_view wName) |
Fill and return a one-dimensional histogram with the weighted values of a column (lazy action) More... | |
template<typename V = RDFDetail::TInferType, typename W = RDFDetail::TInferType> | |
RResultPtr<::TH1D > | Histo1D (std::string_view vName, std::string_view wName) |
Fill and return a one-dimensional histogram with the weighted values of a column (lazy action) More... | |
template<typename V , typename W > | |
RResultPtr<::TH1D > | Histo1D (const TH1DModel &model={"", "", 128u, 0., 0.}) |
Fill and return a one-dimensional histogram with the weighted values of a column (lazy action) More... | |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType> | |
RResultPtr<::TH2D > | Histo2D (const TH2DModel &model, std::string_view v1Name="", std::string_view v2Name="") |
Fill and return a two-dimensional histogram (lazy action) More... | |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType, typename W = RDFDetail::TInferType> | |
RResultPtr<::TH2D > | Histo2D (const TH2DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view wName) |
Fill and return a weighted two-dimensional histogram (lazy action) More... | |
template<typename V1 , typename V2 , typename W > | |
RResultPtr<::TH2D > | Histo2D (const TH2DModel &model) |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType, typename V3 = RDFDetail::TInferType> | |
RResultPtr<::TH3D > | Histo3D (const TH3DModel &model, std::string_view v1Name="", std::string_view v2Name="", std::string_view v3Name="") |
Fill and return a three-dimensional histogram (lazy action) More... | |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType, typename V3 = RDFDetail::TInferType, typename W = RDFDetail::TInferType> | |
RResultPtr<::TH3D > | Histo3D (const TH3DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view v3Name, std::string_view wName) |
Fill and return a three-dimensional histogram (lazy action) More... | |
template<typename V1 , typename V2 , typename V3 , typename W > | |
RResultPtr<::TH3D > | Histo3D (const TH3DModel &model) |
template<typename T = RDFDetail::TInferType> | |
RResultPtr< RDFDetail::MaxReturnType_t< T > > | Max (std::string_view columnName="") |
Return the maximum of processed column values (lazy action) More... | |
template<typename T = RDFDetail::TInferType> | |
RResultPtr< double > | Mean (std::string_view columnName="") |
Return the mean of processed column values (lazy action) More... | |
template<typename T = RDFDetail::TInferType> | |
RResultPtr< RDFDetail::MinReturnType_t< T > > | Min (std::string_view columnName="") |
Return the minimum of processed column values (lazy action) More... | |
template<typename NewProxied > | |
operator RInterface< NewProxied > () | |
RInterface & | operator= (const RInterface &)=default |
Copy-assignment operator for RInterface. More... | |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType> | |
RResultPtr<::TProfile > | Profile1D (const TProfile1DModel &model, std::string_view v1Name="", std::string_view v2Name="") |
Fill and return a one-dimensional profile (lazy action) More... | |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType, typename W = RDFDetail::TInferType> | |
RResultPtr<::TProfile > | Profile1D (const TProfile1DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view wName) |
Fill and return a one-dimensional profile (lazy action) More... | |
template<typename V1 , typename V2 , typename W > | |
RResultPtr<::TProfile > | Profile1D (const TProfile1DModel &model) |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType, typename V3 = RDFDetail::TInferType> | |
RResultPtr<::TProfile2D > | Profile2D (const TProfile2DModel &model, std::string_view v1Name="", std::string_view v2Name="", std::string_view v3Name="") |
Fill and return a two-dimensional profile (lazy action) More... | |
template<typename V1 = RDFDetail::TInferType, typename V2 = RDFDetail::TInferType, typename V3 = RDFDetail::TInferType, typename W = RDFDetail::TInferType> | |
RResultPtr<::TProfile2D > | Profile2D (const TProfile2DModel &model, std::string_view v1Name, std::string_view v2Name, std::string_view v3Name, std::string_view wName) |
Fill and return a two-dimensional profile (lazy action) More... | |
template<typename V1 , typename V2 , typename V3 , typename W > | |
RResultPtr<::TProfile2D > | Profile2D (const TProfile2DModel &model) |
RInterface< RDFDetail::RRange< Proxied >, DS_t > | Range (unsigned int begin, unsigned int end, unsigned int stride=1) |
Creates a node that filters entries based on range: [begin, end) More... | |
RInterface< RDFDetail::RRange< Proxied >, DS_t > | Range (unsigned int end) |
Creates a node that filters entries based on range. More... | |
template<typename F , typename T = typename TTraits::CallableTraits<F>::ret_type> | |
RResultPtr< T > | Reduce (F f, std::string_view columnName="") |
Execute a user-defined reduce operation on the values of a column. More... | |
template<typename F , typename T = typename TTraits::CallableTraits<F>::ret_type> | |
RResultPtr< T > | Reduce (F f, std::string_view columnName, const T &redIdentity) |
Execute a user-defined reduce operation on the values of a column. More... | |
RResultPtr< RCutFlowReport > | Report () |
Gather filtering statistics. More... | |
template<typename... BranchTypes> | |
RResultPtr< RInterface< RLoopManager > > | Snapshot (std::string_view treename, std::string_view filename, const ColumnNames_t &columnList, const RSnapshotOptions &options=RSnapshotOptions()) |
Save selected columns to disk, in a new TTree treename in file filename. More... | |
RResultPtr< RInterface< RLoopManager > > | Snapshot (std::string_view treename, std::string_view filename, const ColumnNames_t &columnList, const RSnapshotOptions &options=RSnapshotOptions()) |
Save selected columns to disk, in a new TTree treename in file filename. More... | |
RResultPtr< RInterface< RLoopManager > > | Snapshot (std::string_view treename, std::string_view filename, std::string_view columnNameRegexp="", const RSnapshotOptions &options=RSnapshotOptions()) |
Save selected columns to disk, in a new TTree treename in file filename. More... | |
RResultPtr< RInterface< RLoopManager > > | Snapshot (std::string_view treename, std::string_view filename, std::initializer_list< std::string > columnList, const RSnapshotOptions &options=RSnapshotOptions()) |
Save selected columns to disk, in a new TTree treename in file filename. More... | |
template<typename T = RDFDetail::TInferType> | |
RResultPtr< RDFDetail::SumReturnType_t< T > > | Sum (std::string_view columnName="", const RDFDetail::SumReturnType_t< T > &initValue=RDFDetail::SumReturnType_t< T >{}) |
Return the sum of processed column values (lazy action) More... | |
template<typename T , typename COLL = std::vector<T>> | |
RResultPtr< COLL > | Take (std::string_view column="") |
Return a collection of values of a column (lazy action, returns a std::vector by default) More... | |
Protected Member Functions | |
RInterface (const std::shared_ptr< Proxied > &proxied, const std::weak_ptr< RLoopManager > &impl, const ColumnNames_t &validColumns, const std::shared_ptr< const ColumnNames_t > &datasetColumns, RDataSource *ds) | |
std::shared_ptr< RLoopManager > | GetLoopManager () |
Get the RLoopManager if reachable. If not, throw. More... | |
const std::shared_ptr< Proxied > & | GetProxiedPtr () const |
ColumnNames_t | GetValidatedColumnNames (const unsigned int nColumns, const ColumnNames_t &columns) |
Prepare the call to the GetValidatedColumnNames routine, making sure that GetBranchNames, which is expensive in terms of runtime, is called at most once. More... | |
Private Types | |
using | ColumnNames_t = RDFDetail::ColumnNames_t |
using | DS_t = DataSource |
using | RCustomColumnBase = RDFDetail::RCustomColumnBase |
using | RFilterBase = RDFDetail::RFilterBase |
using | RLoopManager = RDFDetail::RLoopManager |
using | RRangeBase = RDFDetail::RRangeBase |
Private Member Functions | |
void | AddDefaultColumns () |
template<typename... BranchTypes, std::size_t... S> | |
RInterface< RLoopManager > | CacheImpl (const ColumnNames_t &columnList, std::index_sequence< S... > s) |
Implementation of cache. More... | |
ColumnNames_t | ConvertRegexToColumns (std::string_view columnNameRegexp, std::string_view callerName) |
template<typename ActionType , typename... BranchTypes, typename ActionResultType , typename std::enable_if<!RDFInternal::TNeedJitting< BranchTypes... >::value, int >::type = 0> | |
RResultPtr< ActionResultType > | CreateAction (const ColumnNames_t &columns, const std::shared_ptr< ActionResultType > &r) |
template<typename ActionType , typename... BranchTypes, typename ActionResultType , typename std::enable_if< RDFInternal::TNeedJitting< BranchTypes... >::value, int >::type = 0> | |
RResultPtr< ActionResultType > | CreateAction (const ColumnNames_t &columns, const std::shared_ptr< ActionResultType > &r, const int nColumns=-1) |
template<typename F , typename CustomColumnType , typename RetType = typename TTraits::CallableTraits<F>::ret_type> | |
std::enable_if< std::is_default_constructible< RetType >::value, RInterface< Proxied, DS_t > >::type | DefineImpl (std::string_view name, F &&expression, const ColumnNames_t &columns) |
template<typename F , typename CustomColumnType , typename RetType = typename TTraits::CallableTraits<F>::ret_type> | |
std::enable_if<!std::is_convertible< F, std::string >::value &&!std::is_default_constructible< RetType >::value, RInterface< Proxied, DS_t > >::type | DefineImpl (std::string_view, F, const ColumnNames_t &) |
template<> | |
std::string | GetNodeTypeName () |
template<> | |
std::string | GetNodeTypeName () |
template<> | |
std::string | GetNodeTypeName () |
template<> | |
std::string | GetNodeTypeName () |
template<typename... ColumnTypes> | |
RResultPtr< RInterface< RLoopManager > > | SnapshotImpl (std::string_view treename, std::string_view filename, const ColumnNames_t &columnList, const RSnapshotOptions &options) |
Implementation of snapshot. More... | |
Static Private Member Functions | |
static std::string | GetNodeTypeName () |
Return string containing fully qualified type name of the node pointed by fProxied. More... | |
Private Attributes | |
std::shared_ptr< const ColumnNames_t > | fBranchNames |
Cache of the chain's column names. More... | |
RDataSource *const | fDataSource = nullptr |
Non-owning pointer to a data-source object. Null if no data-source. RLoopManager has ownership of the object. More... | |
const std::weak_ptr< RLoopManager > | fImplWeakPtr |
Weak pointer to the RLoopManager at the root of the graph. More... | |
const std::shared_ptr< Proxied > | fProxiedPtr |
Smart pointer to the graph node encapsulated by this RInterface. More... | |
ColumnNames_t | fValidCustomColumns |
Names of the columns Defined for this branch of the functional graph. More... | |
Friends | |
template<typename T , typename W > | |
class | RInterface |
std::string | cling::printValue (::ROOT::RDataFrame *tdf) |
#include <ROOT/RDFInterface.hxx>
|
private |
Definition at line 101 of file RDFInterface.hxx.
|
private |
Definition at line 100 of file RDFInterface.hxx.
|
private |
Definition at line 104 of file RDFInterface.hxx.
|
private |
Definition at line 102 of file RDFInterface.hxx.
|
private |
Definition at line 105 of file RDFInterface.hxx.
|
private |
Definition at line 103 of file RDFInterface.hxx.
|
default |
Copy-ctor for RInterface.
|
default |
Move-ctor for RInterface.
|
inline |
Only enabled when building a RInterface<RLoopManager>
Definition at line 143 of file RDFInterface.hxx.
|
inlineprotected |
Definition at line 1760 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1456 of file RDFInterface.hxx.
|
inline |
Execute a user-defined accumulation operation on the processed column values in each processing slot.
F | The type of the aggregator callable. Automatically deduced. |
U | The type of the aggregator variable. Must be default-constructible, copy-constructible and copy-assignable. Automatically deduced. |
T | The type of the column to apply the reduction to. Automatically deduced. |
[in] | aggregator | A callable with signature U(U,T) or void(U&,T), where T is the type of the column and U is the type of the aggregator variable |
[in] | merger | A callable with signature U(U,U) or void(std::vector<U>&) used to merge the results of the accumulations of each thread |
[in] | columnName | The column to be aggregated. If omitted, the first default column is used instead. |
[in] | aggIdentity | The aggregator variable of each thread is initialised to this value (or is default-constructed if the parameter is omitted) |
An aggregator callable takes two values, an aggregator variable and a column value. The aggregator variable is initialized to aggIdentity or default-constructed if aggIdentity is omitted. This action calls the aggregator callable for each processed entry, passing in the aggregator variable and the value of the column columnName. If the signature is U(U,T), the aggregator variable is then copy-assigned the result of the execution of the callable. Otherwise the signature of aggregator must be void(U&,T).
The merger callable is used to merge the partial accumulation results of each processing thread. It is only called in multi-thread executions. If its signature is U(U,U), the aggregator variables of each thread are merged two by two. If its signature is void(std::vector<U>& a), it is assumed that it merges all aggregators in a[0].
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
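The aggregator and merger roles described above can be illustrated with a small ROOT-free model. This is only a sketch of the documented semantics, not the RDataFrame implementation: `AggregateModel` and `slotValues` are hypothetical names, the processing slots are simulated sequentially, and only the U(U,T) and U(U,U) signatures are covered.

```cpp
#include <cstddef>
#include <vector>

// ROOT-free model of the Aggregate semantics described above.
// slotValues[i] holds the column values seen by processing slot i.
// AggregateModel is a hypothetical name, not part of RDataFrame.
template <typename U, typename T, typename AccFun, typename MergeFun>
U AggregateModel(const std::vector<std::vector<T>> &slotValues, AccFun aggregator, MergeFun merger,
                 const U &aggIdentity)
{
   // One aggregator variable per slot, each initialised to aggIdentity.
   std::vector<U> partials(slotValues.size(), aggIdentity);
   for (std::size_t slot = 0; slot < slotValues.size(); ++slot) {
      for (const T &value : slotValues[slot]) {
         // U(U,T) signature: the result is copy-assigned to the aggregator variable.
         partials[slot] = aggregator(partials[slot], value);
      }
   }
   if (partials.empty())
      return aggIdentity;
   // U(U,U) merger: partial results are merged two by two.
   // With a single slot the loop body never runs, so the merger is not called.
   U result = partials[0];
   for (std::size_t i = 1; i < partials.size(); ++i)
      result = merger(result, partials[i]);
   return result;
}
```

In this model, a sum of squares would pass `[](double acc, double x) { return acc + x * x; }` as the aggregator and `[](double a, double b) { return a + b; }` as the merger, with 0.0 as the identity.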
Definition at line 1362 of file RDFInterface.hxx.
|
inline |
Execute a user-defined accumulation operation on the processed column values in each processing slot.
F | The type of the aggregator callable. Automatically deduced. |
U | The type of the aggregator variable. Must be default-constructible, copy-constructible and copy-assignable. Automatically deduced. |
T | The type of the column to apply the reduction to. Automatically deduced. |
[in] | aggregator | A callable with signature U(U,T) or void(U&,T), where T is the type of the column and U is the type of the aggregator variable |
[in] | merger | A callable with signature U(U,U) or void(std::vector<U>&) used to merge the results of the accumulations of each thread |
[in] | columnName | The column to be aggregated. If omitted, the first default column is used instead. |
See previous Aggregate overload for more information.
Definition at line 1398 of file RDFInterface.hxx.
|
inline |
Allows referring to a column by a different name.
[in] | alias | name of the column alias |
[in] | columnName | name of the column to be aliased |
Aliasing an alias is supported.
Definition at line 359 of file RDFInterface.hxx.
|
inline |
Book execution of a custom action using a user-defined helper object.
ColumnTypes | List of types of columns used by this action. |
Helper | The type of the user-defined helper. See below for the required interface it should expose. |
This method books a custom action for execution. The behavior of the action is completely dependent on the Helper object provided by the caller. The minimum required interface for the helper is the following (more methods can be present, e.g. a constructor that takes the number of worker threads is usually useful):
See $ROOTSYS/tree/treeplayer/inc/ROOT/RDFActionHelpers.hxx for the helpers used by standard RDF actions.
Definition at line 1439 of file RDFInterface.hxx.
|
inline |
Save selected columns in memory.
[in] | columnList | Names of the columns to be cached in memory |
The content of the selected columns is saved in memory exploiting the functionality offered by the Take action. No extra copy is carried out when serving cached data to the actions and transformations requesting it.
Definition at line 503 of file RDFInterface.hxx.
|
inline |
Save selected columns in memory.
[in] | columnList | Names of the columns to be cached in memory |
The content of the selected columns is saved in memory exploiting the functionality offered by the Take action. No extra copy is carried out when serving cached data to the actions and transformations requesting it.
Definition at line 516 of file RDFInterface.hxx.
|
inline |
Save selected columns in memory.
[in] | columnNameRegexp | A regular expression to select the columns |
The existing columns are matched against the regular expression. If the string provided is empty, all columns are selected.
Definition at line 567 of file RDFInterface.hxx.
|
inlineprivate |
Implementation of cache.
Definition at line 1722 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1475 of file RDFInterface.hxx.
|
inline |
Return the number of entries processed (lazy action)
Useful e.g. for counting the number of entries passing a certain filter (see also Report). This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 716 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1542 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1562 of file RDFInterface.hxx.
|
inline |
Creates a custom column.
[in] | name | The name of the custom column. |
[in] | expression | Function, lambda expression, functor class or any other callable object producing the temporary value. Returns the value that will be assigned to the custom column. |
[in] | columns | Names of the columns/branches in input to the producer function. |
Create a custom column that will be visible from all subsequent nodes of the functional chain. The expression is only evaluated for entries that pass all the preceding filters. A new variable called name is created, accessible from subsequent transformations/actions as if it were contained in the dataset.
Use cases include:
An exception is thrown if the name of the new column is already in use.
Definition at line 266 of file RDFInterface.hxx.
|
inline |
Creates a custom column.
[in] | name | The name of the custom column. |
[in] | expression | An expression in C++ which represents the temporary value |
The expression is just-in-time compiled and used to produce the column entries. It must be valid C++ syntax in which variable names are substituted with the names of branches/columns.
Refer to the first overload of this method for the full documentation.
Definition at line 339 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1587 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1638 of file RDFInterface.hxx.
|
inline |
Creates a custom column with a value dependent on the processing slot.
[in] | name | The name of the custom column. |
[in] | expression | Function, lambda expression, functor class or any other callable object producing the temporary value. Returns the value that will be assigned to the custom column. |
[in] | columns | Names of the columns/branches in input to the producer function (excluding the slot number). |
This alternative implementation of Define is meant as a helper in writing thread-safe custom columns. The expression must be a callable of signature R(unsigned int, T1, T2, ...) where T1, T2... are the types of the columns that the expression takes as input. The first parameter is reserved for an unsigned integer representing a "slot number". RDataFrame guarantees that different threads will invoke the expression with different slot numbers - slot numbers will range from zero to ROOT::GetImplicitMTPoolSize()-1.
The following two calls are equivalent, although DefineSlot is slightly more performant:
See Define for more information.
Definition at line 294 of file RDFInterface.hxx.
|
inline |
Creates a custom column with a value dependent on the processing slot and the current entry.
[in] | name | The name of the custom column. |
[in] | expression | Function, lambda expression, functor class or any other callable object producing the temporary value. Returns the value that will be assigned to the custom column. |
[in] | columns | Names of the columns/branches in input to the producer function (excluding slot and entry). |
This alternative implementation of Define is meant as a helper in writing entry-specific, thread-safe custom columns. The expression must be a callable of signature R(unsigned int, ULong64_t, T1, T2, ...) where T1, T2... are the types of the columns that the expression takes as input. The first parameter is reserved for an unsigned integer representing a "slot number". RDataFrame guarantees that different threads will invoke the expression with different slot numbers - slot numbers will range from zero to ROOT::GetImplicitMTPoolSize()-1. The second parameter is reserved for a ULong64_t representing the current entry being processed by the current thread.
The following two Defines are equivalent, although DefineSlotEntry is slightly more performant:
See Define for more information.
Definition at line 323 of file RDFInterface.hxx.
|
inline |
Return an object of type T on which T::Fill will be called once per event (lazy action)
T must be a type that provides a copy- or move-constructor and a T::Fill method that takes as many arguments as there are column names passed in columnList. The arguments of T::Fill must have the same types as the specified columns (these types are passed as template parameters to this method).
FirstColumn | The first type of the column the values of which are used to fill the object. |
OtherColumns | A list of the other types of the columns the values of which are used to fill the object. |
T | The type of the object to fill. Automatically deduced. |
[in] | model | The model to be considered to build the new return value. |
[in] | columnList | A list containing the names of the columns that will be passed when calling Fill |
The user gives up ownership of the model object. The list of column names to be used for filling must always be specified. This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
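The requirements on T can be made concrete with a ROOT-free sketch. `WeightedSum` and `FillModel` are hypothetical names introduced here for illustration; the loop stands in for the event loop and is not how RDataFrame schedules the calls.

```cpp
#include <utility>
#include <vector>

// Hypothetical user type: copyable, with a Fill method taking two column values.
struct WeightedSum {
   double fSum = 0.;
   void Fill(double value, double weight) { fSum += value * weight; }
};

// ROOT-free model of the action: T::Fill is called once per event,
// with one argument per column named in the column list.
template <typename T>
T FillModel(T model, const std::vector<std::pair<double, double>> &events)
{
   for (const auto &ev : events)
      model.Fill(ev.first, ev.second);
   return model;
}
```

A type like `WeightedSum` has a Fill method with two parameters, so it would be paired with a column list of exactly two names whose types match those parameters.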
Definition at line 1154 of file RDFInterface.hxx.
|
inline |
Return an object of type T on which T::Fill will be called once per event (lazy action)
This overload infers the types of the columns specified in columnList at runtime and just-in-time compiles the method with these types. See previous overload for more information.
T | The type of the object to fill. Automatically deduced. |
[in] | model | The model to be considered to build the new return value. |
[in] | columnList | The name of the columns read to fill the object. |
This overload of Fill infers the type of the specified columns at runtime and just-in-time compiles the previous overload. Check the previous overload for more details on Fill.
Definition at line 1175 of file RDFInterface.hxx.
|
inline |
Append a filter to the call graph.
[in] | f | Function, lambda expression, functor class or any other callable object. It must return a bool signalling whether the event has passed the selection (true) or not (false). |
[in] | columns | Names of the columns/branches in input to the filter function. |
[in] | name | Optional name of this filter. See Report. |
Append a filter node at the point of the call graph corresponding to the object this method is called on. The callable f should not have side-effects (e.g. modification of an external or static variable) to ensure correct results when implicit multi-threading is active.
RDataFrame only evaluates filters when necessary: if multiple filters are chained one after another, they are executed in order and the first one returning false causes the event to be discarded. Even if multiple actions or transformations depend on the same filter, it is executed once per entry. If its result is requested more than once, the cached result is served.
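The "executed once per entry" behaviour can be pictured with a ROOT-free model of a filter node that remembers its last decision. `CachedFilterModel` is a hypothetical class written for this illustration under the assumption of a single double-valued input column; it does not reproduce the internals of RFilter.

```cpp
#include <functional>
#include <utility>

// ROOT-free model of a filter node that caches its decision per entry.
// CachedFilterModel is a hypothetical name, not an RDataFrame class.
class CachedFilterModel {
   std::function<bool(double)> fPredicate;
   long long fLastEntry = -1; // entry number of the cached decision
   bool fLastResult = false;
   unsigned int fNEvaluations = 0;

public:
   explicit CachedFilterModel(std::function<bool(double)> predicate) : fPredicate(std::move(predicate)) {}

   // Returns whether `entry` passes; evaluates the predicate at most once per entry.
   bool CheckEntry(long long entry, double value)
   {
      if (entry != fLastEntry) {
         fLastResult = fPredicate(value);
         fLastEntry = entry;
         ++fNEvaluations;
      }
      return fLastResult;
   }

   unsigned int GetNEvaluations() const { return fNEvaluations; }
};
```

Two downstream consumers asking about the same entry trigger a single predicate evaluation in this model, which is the property the paragraph above describes.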
Definition at line 170 of file RDFInterface.hxx.
|
inline |
Append a filter to the call graph.
[in] | f | Function, lambda expression, functor class or any other callable object. It must return a bool signalling whether the event has passed the selection (true) or not (false). |
[in] | name | Optional name of this filter. See Report. |
Refer to the first overload of this method for the full documentation.
Definition at line 194 of file RDFInterface.hxx.
|
inline |
Append a filter to the call graph.
[in] | f | Function, lambda expression, functor class or any other callable object. It must return a bool signalling whether the event has passed the selection (true) or not (false). |
[in] | columns | Names of the columns/branches in input to the filter function. |
Refer to the first overload of this method for the full documentation.
Definition at line 209 of file RDFInterface.hxx.
|
inline |
Append a filter to the call graph.
[in] | expression | The filter expression in C++ |
[in] | name | Optional name of this filter. See Report. |
The expression is just-in-time compiled and used to filter entries. It must be valid C++ syntax in which variable names are substituted with the names of branches/columns.
Refer to the first overload of this method for the full documentation.
Definition at line 224 of file RDFInterface.hxx.
|
inline |
Execute a user-defined function on each entry (instant action)
[in] | f | Function, lambda expression, functor class or any other callable object performing user defined calculations. |
[in] | columns | Names of the columns/branches in input to the user function. |
The callable f is invoked once per entry. This is an instant action: upon invocation, an event loop as well as execution of all scheduled actions is triggered. Users are responsible for the thread-safety of this callable when executing with implicit multi-threading enabled (i.e. ROOT::EnableImplicitMT).
Definition at line 622 of file RDFInterface.hxx.
|
inline |
Execute a user-defined function requiring a processing slot index on each entry (instant action)
[in] | f | Function, lambda expression, functor class or any other callable object performing user defined calculations. |
[in] | columns | Names of the columns/branches in input to the user function. |
Same as Foreach, but the user-defined function takes an extra unsigned int as its first parameter, the processing slot index. Each thread of execution is assigned a different slot index, from 0 to poolSize - 1. This is meant as a helper for writing thread-safe Foreach actions when using RDataFrame after ROOT::EnableImplicitMT(). The user-defined processing callable can follow different streams of processing indexed by the first parameter. ForeachSlot works just as well with single-thread execution: in that case slot will always be 0.
Definition at line 647 of file RDFInterface.hxx.
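The per-slot pattern described above can be sketched in plain C++. This is not ROOT code: SlotSummer and SumWithSlots are hypothetical names, and the slots are visited sequentially with a round-robin assignment purely for illustration, whereas a multi-thread run would process them concurrently. The point is that a callable taking the slot index as its first parameter can keep one accumulator per slot and so never share mutable state between slots.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a ForeachSlot-style callable: the first
// parameter is the processing slot index, the second is the column value.
struct SlotSummer {
   std::vector<double> partialSums; // one accumulator per slot, nothing shared

   explicit SlotSummer(unsigned int nSlots) : partialSums(nSlots, 0.) {}

   void operator()(unsigned int slot, double value)
   {
      assert(slot < partialSums.size());
      partialSums[slot] += value; // each slot only touches its own element
   }

   // Merge the per-slot results once all entries have been processed.
   double Total() const
   {
      double total = 0.;
      for (double s : partialSums)
         total += s;
      return total;
   }
};

// Simulates the dispatch of entries to slots (assumption: round-robin,
// sequential). With nSlots == 1 every entry goes to slot 0.
inline double SumWithSlots(const std::vector<double> &values, unsigned int nSlots)
{
   assert(nSlots > 0);
   SlotSummer summer(nSlots);
   for (std::size_t i = 0; i < values.size(); ++i)
      summer(static_cast<unsigned int>(i % nSlots), values[i]);
   return summer.Total();
}
```

Calling SumWithSlots with any number of slots yields the same total, because the partial sums are merged only after processing has finished.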
|
inline |
Returns the names of the available columns.
This is neither an action nor a transformation, just a simple utility to get the column names out of the RDataFrame nodes.
Definition at line 1307 of file RDFInterface.hxx.
|
inlineprotected |
Get the RLoopManager if reachable. If not, throw.
Definition at line 1751 of file RDFInterface.hxx.
|
inlinestaticprivate |
Return a string containing the fully qualified type name of the node pointed to by fProxied.
The method is only defined for RInterface<{RFilterBase,RCustomColumnBase,RRangeBase,RLoopManager}> as it should only be called on "upcast" RInterfaces.
|
inlineprivate |
Definition at line 1787 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1793 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1799 of file RDFInterface.hxx.
|
inlineprivate |
Definition at line 1805 of file RDFInterface.hxx.
|
inlineprotected |
Definition at line 1768 of file RDFInterface.hxx.
|
inlineprotected |
Prepare the call to the GetValidatedColumnNames routine, making sure that GetBranchNames, which is expensive in terms of runtime, is called at most once.
Definition at line 1772 of file RDFInterface.hxx.
|
inline |
Fill and return a one-dimensional histogram with the values of a column (lazy action)
V | The type of the column used to fill the histogram. |
[in] | model | The returned histogram will be constructed using this as a model. |
[in] | vName | The name of the column that will fill the histogram. |
Columns can be of a container type (e.g. std::vector<double>), in which case the histogram is filled with each one of the elements of the container. In case multiple columns of container type are provided (e.g. values and weights) they must have the same length for each one of the events (but possibly different lengths between events). This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model histogram.
Definition at line 770 of file RDFInterface.hxx.
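The container-column rule above (every element is filled, and values and weights must have matching lengths within each event) can be illustrated with a small plain C++ sketch. ToyHisto and FillEvent are hypothetical names, not ROOT classes; the toy histogram simply drops out-of-range values, which is an assumption of this sketch and not a statement about how the model histogram treats them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-binning histogram standing in for the model histogram.
struct ToyHisto {
   double xMin, xMax;
   std::vector<double> bins;

   ToyHisto(std::size_t nBins, double lo, double hi) : xMin(lo), xMax(hi), bins(nBins, 0.)
   {
      assert(nBins > 0 && hi > lo);
   }

   void Fill(double x, double w = 1.)
   {
      if (x < xMin || x >= xMax)
         return; // out-of-range values are dropped in this sketch
      auto i = static_cast<std::size_t>((x - xMin) / (xMax - xMin) * bins.size());
      if (i >= bins.size())
         i = bins.size() - 1; // guard against rounding at the upper edge
      bins[i] += w;
   }
};

// One call per event: every element of the container column is filled,
// paired with the weight stored at the same position.
inline void FillEvent(ToyHisto &h, const std::vector<double> &values, const std::vector<double> &weights)
{
   assert(values.size() == weights.size()); // same length within one event
   for (std::size_t i = 0; i < values.size(); ++i)
      h.Fill(values[i], weights[i]);
}
```

Two consecutive FillEvent calls may pass containers of different sizes, mirroring the remark that lengths may differ between events but not within one.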
|
inline |
Definition at line 786 of file RDFInterface.hxx.
|
inline |
Fill and return a one-dimensional histogram with the weighted values of a column (lazy action)
V | The type of the column used to fill the histogram. |
W | The type of the column used as weights. |
[in] | model | The returned histogram will be constructed using this as a model. |
[in] | vName | The name of the column that will fill the histogram. |
[in] | wName | The name of the column that will provide the weights. |
See the description of the first Histo1D overload for more details.
Definition at line 801 of file RDFInterface.hxx.
|
inline |
Fill and return a one-dimensional histogram with the weighted values of a column (lazy action)
V | The type of the column used to fill the histogram. |
W | The type of the column used as weights. |
[in] | vName | The name of the column that will fill the histogram. |
[in] | wName | The name of the column that will provide the weights. |
This overload uses a default model histogram TH1D("", "", 128u, 0., 0.). See the description of the first Histo1D overload for more details.
Definition at line 825 of file RDFInterface.hxx.
|
inline |
Fill and return a one-dimensional histogram with the weighted values of a column (lazy action)
V | The type of the column used to fill the histogram. |
W | The type of the column used as weights. |
[in] | model | The returned histogram will be constructed using this as a model. |
This overload will use the first two default columns as column names. See the description of the first Histo1D overload for more details.
Definition at line 839 of file RDFInterface.hxx.
|
inline |
Fill and return a two-dimensional histogram (lazy action)
V1 | The type of the column used to fill the x axis of the histogram. |
V2 | The type of the column used to fill the y axis of the histogram. |
[in] | model | The returned histogram will be constructed using this as a model. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
Columns can be of a container type (e.g. std::vector<double>), in which case the histogram is filled with each one of the elements of the container. In case multiple columns of container type are provided (e.g. values and weights) they must have the same length for each one of the events (but possibly different lengths between events). This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model histogram.
Definition at line 860 of file RDFInterface.hxx.
|
inline |
Fill and return a weighted two-dimensional histogram (lazy action)
V1 | The type of the column used to fill the x axis of the histogram. |
V2 | The type of the column used to fill the y axis of the histogram. |
W | The type of the column used for the weights of the histogram. |
[in] | model | The returned histogram will be constructed using this as a model. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
[in] | wName | The name of the column that will provide the weights. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model histogram.
Definition at line 893 of file RDFInterface.hxx.
|
inline |
Definition at line 911 of file RDFInterface.hxx.
|
inline |
Fill and return a three-dimensional histogram (lazy action)
V1 | The type of the column used to fill the x axis of the histogram. Inferred if not present. |
V2 | The type of the column used to fill the y axis of the histogram. Inferred if not present. |
V3 | The type of the column used to fill the z axis of the histogram. Inferred if not present. |
[in] | model | The returned histogram will be constructed using this as a model. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
[in] | v3Name | The name of the column that will fill the z axis. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model histogram.
Definition at line 931 of file RDFInterface.hxx.
|
inline |
Fill and return a three-dimensional histogram (lazy action)
V1 | The type of the column used to fill the x axis of the histogram. Inferred if not present. |
V2 | The type of the column used to fill the y axis of the histogram. Inferred if not present. |
V3 | The type of the column used to fill the z axis of the histogram. Inferred if not present. |
W | The type of the column used for the weights of the histogram. Inferred if not present. |
[in] | model | The returned histogram will be constructed using this as a model. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
[in] | v3Name | The name of the column that will fill the z axis. |
[in] | wName | The name of the column that will provide the weights. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model histogram.
Definition at line 966 of file RDFInterface.hxx.
|
inline |
Definition at line 985 of file RDFInterface.hxx.
|
inline |
Return the maximum of processed column values (lazy action)
T | The type of the branch/column. |
[in] | columnName | The name of the branch/column to be treated. |
If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. If the type of the column is inferred, the return type is double; otherwise it is the type of the column.
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 1216 of file RDFInterface.hxx.
|
inline |
Return the mean of processed column values (lazy action)
T | The type of the branch/column. |
[in] | columnName | The name of the branch/column to be treated. |
If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method.
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 1235 of file RDFInterface.hxx.
|
inline |
Return the minimum of processed column values (lazy action)
T | The type of the branch/column. |
[in] | columnName | The name of the branch/column to be treated. |
If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. If the type of the column is inferred, the return type is double; otherwise it is the type of the column.
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 1196 of file RDFInterface.hxx.
|
inline |
Definition at line 121 of file RDFInterface.hxx.
|
default |
Copy-assignment operator for RInterface.
|
inline |
Fill and return a one-dimensional profile (lazy action)
V1 | The type of the column the values of which are used to fill the profile. Inferred if not present. |
V2 | The type of the column the values of which are used to fill the profile. Inferred if not present. |
[in] | model | The model to be considered to build the new return value. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model profile object.
Definition at line 1003 of file RDFInterface.hxx.
|
inline |
Fill and return a one-dimensional profile (lazy action)
V1 | The type of the column the values of which are used to fill the profile. Inferred if not present. |
V2 | The type of the column the values of which are used to fill the profile. Inferred if not present. |
W | The type of the column the weights of which are used to fill the profile. Inferred if not present. |
[in] | model | The model to be considered to build the new return value. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
[in] | wName | The name of the column that will provide the weights. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model profile object.
Definition at line 1037 of file RDFInterface.hxx.
|
inline |
Definition at line 1056 of file RDFInterface.hxx.
|
inline |
Fill and return a two-dimensional profile (lazy action)
V1 | The type of the column used to fill the x axis of the histogram. Inferred if not present. |
V2 | The type of the column used to fill the y axis of the histogram. Inferred if not present. |
V3 | The type of the column used to fill the z axis of the histogram. Inferred if not present. |
[in] | model | The returned profile will be constructed using this as a model. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
[in] | v3Name | The name of the column that will fill the z axis. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model profile.
Definition at line 1076 of file RDFInterface.hxx.
|
inline |
Fill and return a two-dimensional profile (lazy action)
V1 | The type of the column used to fill the x axis of the histogram. Inferred if not present. |
V2 | The type of the column used to fill the y axis of the histogram. Inferred if not present. |
V3 | The type of the column used to fill the z axis of the histogram. Inferred if not present. |
W | The type of the column used for the weights of the histogram. Inferred if not present. |
[in] | model | The returned profile will be constructed using this as a model. |
[in] | v1Name | The name of the column that will fill the x axis. |
[in] | v2Name | The name of the column that will fill the y axis. |
[in] | v3Name | The name of the column that will fill the z axis. |
[in] | wName | The name of the column that will provide the weights. |
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation. The user gives up ownership of the model profile.
Definition at line 1112 of file RDFInterface.hxx.
|
inline |
Definition at line 1132 of file RDFInterface.hxx.
|
inline |
Creates a node that filters entries based on range: [begin, end)
[in] | begin | Initial entry number considered for this range. |
[in] | end | Final entry number (excluded) considered for this range. 0 means that the range goes until the end of the dataset. |
[in] | stride | Process one entry of the [begin, end) range every stride entries. Must be strictly greater than 0. |
Note that in case of previous Ranges and Filters the selected range refers to the transformed dataset. Ranges are only available if EnableImplicitMT has not been called. Multi-thread ranges are not supported.
Definition at line 583 of file RDFInterface.hxx.
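As an illustration of the [begin, end) and stride semantics described above, the following plain C++ sketch lists which entry numbers a range would let through. SelectRange is a hypothetical helper, not part of ROOT, and the entry numbers it returns are to be read as positions in the dataset as seen by the range node, i.e. after any previous Ranges and Filters.

```cpp
#include <cassert>
#include <vector>

// Returns the entry numbers selected by a (begin, end, stride) range on a
// dataset of nEntries entries. end == 0 means "until the end of the dataset".
inline std::vector<unsigned int>
SelectRange(unsigned int nEntries, unsigned int begin, unsigned int end, unsigned int stride = 1)
{
   assert(stride > 0); // the stride must be strictly greater than 0
   const unsigned int last = (end == 0 || end > nEntries) ? nEntries : end;
   std::vector<unsigned int> selected;
   for (unsigned int i = begin; i < last; i += stride)
      selected.push_back(i); // begin is included, last is excluded
   return selected;
}
```

For example, SelectRange(10, 2, 8, 3) keeps entries 2 and 5: entry 8 is excluded because the upper bound is not part of the range.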
|
inline |
Creates a node that filters entries based on range.
[in] | end | Final entry number (excluded) considered for this range. 0 means that the range goes until the end of the dataset. |
See the other Range overload for a detailed description.
Definition at line 607 of file RDFInterface.hxx.
|
inline |
Execute a user-defined reduce operation on the values of a column.
F | The type of the reduce callable. Automatically deduced. |
T | The type of the column to apply the reduction to. Automatically deduced. |
[in] | f | A callable with signature T(T,T) |
[in] | columnName | The column to be reduced. If omitted, the first default column is used instead. |
A reduction takes two values of a column and merges them into one (e.g. by summing them, taking the maximum, etc). This action performs the specified reduction operation on all processed column values, returning a single value of the same type. The callable f must satisfy the general requirements of a processing function besides having signature T(T,T), where T is the type of column columnName.
The returned reduced value of each thread (e.g. the initial value of a sum) is initialized to a default-constructed T object. This is commonly expected to be the neutral/identity element for the specific reduction operation f (e.g. 0 for a sum, 1 for a product). If a default-constructed T does not satisfy this requirement, users should explicitly specify an initialization value for T by calling the appropriate Reduce overload.
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 687 of file RDFInterface.hxx.
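Why the initial value must be the identity element of f can be seen in a small plain C++ sketch. ReduceWithSlots is a hypothetical function written for illustration, not the ROOT implementation; it assumes each slot starts from the same initial value and that partial results are merged with f itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each slot starts from `identity`, folds its share of the values with f,
// and the per-slot partial results are finally merged with the same f.
template <typename T, typename F>
T ReduceWithSlots(const std::vector<T> &values, F f, unsigned int nSlots, T identity = T())
{
   assert(nSlots > 0);
   std::vector<T> partial(nSlots, identity);
   for (std::size_t i = 0; i < values.size(); ++i) {
      T &acc = partial[i % nSlots]; // round-robin assignment, for illustration only
      acc = f(acc, values[i]);
   }
   T result = identity;
   for (const T &p : partial)
      result = f(result, p);
   return result;
}
```

With a sum, the default-constructed int (0) is the identity and the result is correct. With a product, starting from 0 collapses the result to 0, so the identity 1 has to be passed explicitly, which corresponds to the overload that takes an initialization value.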
|
inline |
Execute a user-defined reduce operation on the values of a column.
F | The type of the reduce callable. Automatically deduced. |
T | The type of the column to apply the reduction to. Automatically deduced. |
[in] | f | A callable with signature T(T,T) |
[in] | columnName | The column to be reduced. If omitted, the first default column is used instead. |
[in] | redIdentity | The reduced object of each thread is initialised to this value. |
See the description of the first Reduce overload for more information.
Definition at line 705 of file RDFInterface.hxx.
|
inline |
Gather filtering statistics.
Calling Report on the main RDataFrame object gathers stats for all named filters in the call graph. Calling this method on a stored chain state (i.e. a graph node different from the first) gathers the stats for all named filters in the chain section between the original RDataFrame and that node (included). Stats are gathered in the same order as the named filters have been added to the graph. An RResultPtr<RCutFlowReport> is returned to allow inspection of the effects the cuts had.
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 1282 of file RDFInterface.hxx.
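The kind of statistics a cut-flow report carries can be sketched in plain C++. CutInfo, NamedCut and RunCutFlow are hypothetical names and do not reproduce the layout of RCutFlowReport; the sketch assumes that each named filter counts the entries that reach it and the entries that pass it, and that an entry rejected by one filter never reaches the following ones.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-filter statistics, loosely modelled on a cut-flow report.
struct CutInfo {
   std::string name;
   unsigned long long all = 0;  // entries that reached this filter
   unsigned long long pass = 0; // entries that passed it
};

using NamedCut = std::pair<std::string, std::function<bool(int)>>;

// Applies the named cuts in the order they were added and returns one
// CutInfo per cut, in that same order.
inline std::vector<CutInfo> RunCutFlow(const std::vector<int> &entries, const std::vector<NamedCut> &cuts)
{
   std::vector<CutInfo> report;
   for (const auto &c : cuts) {
      CutInfo info;
      info.name = c.first;
      report.push_back(info);
   }
   for (int e : entries) {
      for (std::size_t i = 0; i < cuts.size(); ++i) {
         ++report[i].all;
         if (!cuts[i].second(e))
            break; // a rejected entry does not reach later filters
         ++report[i].pass;
      }
   }
   return report;
}
```

Reading the returned vector front to back shows how many entries survive each successive cut, which is the information a cut flow is meant to expose.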
|
inline |
Save selected columns to disk, in a new TTree treename in file filename.
BranchTypes | variadic list of branch/column types |
[in] | treename | The name of the output TTree |
[in] | filename | The name of the output TFile |
[in] | columnList | The list of names of the columns/branches to be written |
[in] | options | RSnapshotOptions struct with extra options to pass to TFile and TTree |
This function returns an RDataFrame built with the output tree as a source.
Definition at line 391 of file RDFInterface.hxx.
|
inline |
Save selected columns to disk, in a new TTree treename in file filename.
[in] | treename | The name of the output TTree |
[in] | filename | The name of the output TFile |
[in] | columnList | The list of names of the columns/branches to be written |
[in] | options | RSnapshotOptions struct with extra options to pass to TFile and TTree |
This function returns an RDataFrame built with the output tree as a source. The types of the columns are automatically inferred and do not need to be specified.
Definition at line 406 of file RDFInterface.hxx.
|
inline |
Save selected columns to disk, in a new TTree treename in file filename.
[in] | treename | The name of the output TTree |
[in] | filename | The name of the output TFile |
[in] | columnNameRegexp | The regular expression to match the column names to be selected. A '^' at the beginning and a '$' at the end of the string are implicitly assumed if they are not specified. See the documentation of TRegexp for more details. An empty string signals the selection of all columns. |
[in] | options | RSnapshotOptions struct with extra options to pass to TFile and TTree |
This function returns an RDataFrame built with the output tree as a source. The types of the columns are automatically inferred and do not need to be specified.
Definition at line 467 of file RDFInterface.hxx.
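The implicit anchoring of the column-name pattern can be illustrated with the sketch below. It uses std::regex from the C++ standard library as a stand-in for TRegexp, whose syntax differs, so it only demonstrates the anchoring and the empty-string rule. AnchorPattern and SelectColumns are hypothetical helpers, and escaped trailing characters such as "\$" are not handled.

```cpp
#include <regex>
#include <string>
#include <vector>

// Adds a leading '^' and a trailing '$' when the pattern lacks them.
inline std::string AnchorPattern(const std::string &pattern)
{
   std::string anchored = pattern;
   if (anchored.empty() || anchored.front() != '^')
      anchored.insert(anchored.begin(), '^');
   if (anchored.back() != '$')
      anchored.push_back('$');
   return anchored;
}

// Returns the column names matched by the anchored pattern.
inline std::vector<std::string> SelectColumns(const std::vector<std::string> &columns, const std::string &pattern)
{
   if (pattern.empty())
      return columns; // an empty string selects all columns
   const std::regex re(AnchorPattern(pattern));
   std::vector<std::string> selected;
   for (const auto &c : columns)
      if (std::regex_search(c, re))
         selected.push_back(c);
   return selected;
}
```

Because of the anchoring, the pattern "pt" selects only a column named exactly pt, while "pt.*" also selects names such as pt_jet.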
|
inline |
Save selected columns to disk, in a new TTree treename in file filename.
[in] | treename | The name of the output TTree |
[in] | filename | The name of the output TFile |
[in] | columnList | The list of names of the columns/branches to be written |
[in] | options | RSnapshotOptions struct with extra options to pass to TFile and TTree |
This function returns an RDataFrame built with the output tree as a source. The types of the columns are automatically inferred and do not need to be specified.
Definition at line 486 of file RDFInterface.hxx.
|
inlineprivate |
Implementation of snapshot.
[in] | treename | The name of the TTree |
[in] | filename | The name of the TFile |
[in] | columnList | The list of names of the branches to be written |
The implementation exploits Foreach. The association of the addresses to the branches takes place at the first event. This is possible because no copies are made: the address of the value passed by reference is the address of the storage of the object read or created by the TTreeReaderValue/TemporaryBranch.
Definition at line 1656 of file RDFInterface.hxx.
|
inline |
Return the sum of processed column values (lazy action)
T | The type of the branch/column. |
[in] | columnName | The name of the branch/column. |
[in] | initValue | Optional initial value for the sum. If not present, the column values must be default-constructible. |
If T is not specified, RDataFrame will infer it from the data and just-in-time compile the correct template specialization of this method. If the type of the column is inferred, the return type is double; otherwise it is the type of the column.
This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 1257 of file RDFInterface.hxx.
|
inline |
Return a collection of values of a column (lazy action, returns a std::vector by default)
T | The type of the column. |
COLL | The type of collection used to store the values. |
[in] | column | The name of the column to collect the values of. |
The collection type to be specified for C-style array columns is RVec<T>. This action is lazy: upon invocation of this method the calculation is booked but not executed. See RResultPtr documentation.
Definition at line 738 of file RDFInterface.hxx.
|
friend |
Definition at line 108 of file RDFInterface.hxx.
|
friend |
|
private |
Cache of the chain columns names.
Definition at line 115 of file RDFInterface.hxx.
|
private |
Non-owning pointer to a data-source object. Null if no data-source. RLoopManager has ownership of the object.
Definition at line 114 of file RDFInterface.hxx.
|
private |
Weak pointer to the RLoopManager at the root of the graph.
Definition at line 111 of file RDFInterface.hxx.
|
private |
Smart pointer to the graph node encapsulated by this RInterface.
Definition at line 110 of file RDFInterface.hxx.
|
private |
Names of columns Defined for this branch of the functional graph.
Definition at line 112 of file RDFInterface.hxx.