Schema evolution is the capability of ROOT I/O to read data into in-memory models that differ from, but are compatible with, the on-disk schema.
Schema evolution allows data models to evolve over time such that old data can be read into current models ("backward compatibility") and old software can read newer data models ("forward compatibility"). For instance, data model authors may over time add and reorder class members, change data types (e.g. `std::vector<float>` --> `ROOT::RVec<double>`), rename classes, etc.
ROOT applies automatic schema evolution rules for common, safe and unambiguous cases. Users can complement the automatic rules with manual schema evolution ("I/O customization rules"), where custom code snippets implement the transformation logic. If neither the automatic rules nor any of the provided I/O customization rules suffice to transform the on-disk schema into the in-memory model, ROOT reports an error and refrains from reading the data.
This document describes the schema evolution support implemented in RNTuple. For the most part, schema evolution works identically across the different ROOT I/O systems (TFile, TTree, RNTuple). The exceptions are listed in the last section of this document.
ROOT applies a number of rules to read data transparently into in-memory models that are not an exact match to the on-disk schema. The automatic rules apply recursively to compound types (classes, tuples, collections, etc.); the outer types are evolved before the inner types.
Automatic schema evolution rules transform native types as well as the shape of user-defined classes as listed in the following, exhaustive tables.
User-defined classes can automatically evolve their layout in the following ways. Note that users should increase the class version number when the layout changes.
| Layout Change | Comment |
|---|---|
| Remove member | Match by member name |
| Add member | Match by member name, new member default-initialized |
| Reorder members | Match by member name |
| Remove all base classes | |
| Add base class(es) where there were none | New base class members default-initialized |
Reordering and incremental addition or removal of base classes are currently unsupported but may be supported in future RNTuple versions.
The class shape evolution also applies to untyped records. Note that untyped records cannot have base classes.
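As a toy model of name-based matching (not ROOT's implementation), the following sketch fills a hypothetical in-memory class `TrackV2` from an on-disk record represented as a name-to-value map. Members are matched by name only, so their order is irrelevant; on-disk members without an in-memory counterpart are ignored, and new members keep their default value. All type, member and function names are illustrative assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory class, version 2. Compared to the version on disk,
// fCharge was removed, fPhi was added and fPt/fEta were reordered.
struct TrackV2 {
   float fEta = 0.f;
   float fPt = 0.f;
   float fPhi = 0.f; // new member, absent from old files
};

// Toy stand-in for an on-disk record: member name -> stored value.
using OnDiskRecord = std::map<std::string, float>;

// Fill the in-memory object by looking up each member by name.
TrackV2 ReadTrack(const OnDiskRecord &record)
{
   TrackV2 track; // every member starts with its default value
   const std::map<std::string, float TrackV2::*> members = {
      {"fEta", &TrackV2::fEta}, {"fPt", &TrackV2::fPt}, {"fPhi", &TrackV2::fPhi}};
   for (const auto &[name, member] : members) {
      const auto it = record.find(name);
      if (it != record.end())
         track.*member = it->second; // same name on disk: take the stored value
      // otherwise the member is new and keeps its default value
   }
   return track; // on-disk members unknown to TrackV2 are never looked up
}

// Usage: an old record that still carries the removed member fCharge.
inline void TrackExample()
{
   const OnDiskRecord oldRecord = {{"fPt", 12.5f}, {"fEta", -1.0f}, {"fCharge", 1.0f}};
   const TrackV2 track = ReadTrack(oldRecord);
   assert(track.fPt == 12.5f);
   assert(track.fEta == -1.0f);
   assert(track.fPhi == 0.0f);
}
```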
ROOT transparently reads into in-memory types that are different from but compatible to the on-disk type. In the following tables, `T'` denotes a type that is compatible to `T`. This includes user-defined types that are related via a renaming rule.
| In-memory type | Compatible on-disk types | Comment |
|---|---|---|
| `bool` | `char` | |
| | `std::[u]int[8,16,32,64]_t` | |
| | enum | |
| `char` | `bool` | |
| | `std::[u]int[8,16,32,64]_t` | with bounds check |
| | enum | with bounds check |
| `std::[u]int[8,16,32,64]_t` | `bool` | |
| | `char` | |
| | `std::[u]int[8,16,32,64]_t` | with bounds check |
| | enum | with bounds check |
| enum | enum of different type | with bounds check on underlying integer |
| `float` | `double` | with fp class check[^1] |
| `double` | `float` | |
| `std::atomic<T>` | `T'` | |
[^1]: The floating point class check ensures that the on-disk value and the in-memory value are of the same nature (NaN, +/-inf, zero, underflow, or normal value).
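To make the two kinds of checks more concrete, here is a minimal sketch of what a bounds-checked integer conversion and a floating point class check could look like. The helper names `CheckedIntCast` and `CheckedDoubleToFloat` are hypothetical and this is not ROOT's actual implementation; in particular, a failed check is reported here through an empty `std::optional`, which is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <optional>
#include <type_traits>

// Convert between integer types; return nothing if the value does not fit.
template <typename To, typename From>
std::optional<To> CheckedIntCast(From value)
{
   static_assert(std::is_integral_v<To> && std::is_integral_v<From>);
   const To converted = static_cast<To>(value);
   // The value fits if converting back restores it and the sign is unchanged.
   if (static_cast<From>(converted) != value)
      return std::nullopt;
   if ((value < From{}) != (converted < To{}))
      return std::nullopt;
   return converted;
}

// Narrow a double to a float; return nothing if the floating point class
// (NaN, infinite, zero, subnormal, normal) would change.
std::optional<float> CheckedDoubleToFloat(double value)
{
   // Finite values beyond the float range are rejected before converting.
   if (std::isfinite(value) && std::fabs(value) > std::numeric_limits<float>::max())
      return std::nullopt;
   const float converted = static_cast<float>(value);
   if (std::fpclassify(value) != std::fpclassify(converted))
      return std::nullopt;
   return converted;
}

// Usage: values that fit are converted, all others are rejected.
inline void CheckedConversionExample()
{
   assert(CheckedIntCast<std::uint8_t>(std::int32_t{200}).value() == 200);
   assert(!CheckedIntCast<std::int8_t>(std::int32_t{200}));   // above the int8_t range
   assert(!CheckedIntCast<std::uint32_t>(std::int32_t{-1}));  // negative into unsigned
   assert(CheckedDoubleToFloat(1.5).value() == 1.5f);
   assert(!CheckedDoubleToFloat(1e300));  // would overflow the float range
   assert(!CheckedDoubleToFloat(1e-300)); // normal double, not a normal float
}
```

Note that the class check does not require the conversion to be exact: a value such as `0.1` is rounded but stays a normal number, so it passes.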
The different variable-length collections have the same on-disk representation and thus evolve naturally into one another. However, only those transformations that are guaranteed to work at runtime will be performed. For instance, a set can always be read as a vector but a vector does not necessarily fulfil the set property.
| In-memory type | Compatible on-disk types | Comment |
|---|---|---|
| `std::vector<T>` | `ROOT::RVec<T'>` | |
| | `std::array<T', N>` | |
| | `std::[unordered_][multi]set<T'>` | |
| | `std::[unordered_][multi]map<K',V'>` | only `T` = `std::[pair,tuple]<K,V>` |
| | `std::optional<T'>` | |
| | `std::unique_ptr<T'>` | |
| | User-defined collection of `T'` | |
| | Untyped collection of `T'` | |
| `ROOT::RVec<T>` | `std::vector<T'>` | with size check |
| | `std::array<T', N>` | with size check |
| | `std::[unordered_][multi]set<T'>` | with size check |
| | `std::[unordered_][multi]map<K',V'>` | only `T` = `std::[pair,tuple]<K,V>`, with size check |
| | `std::optional<T'>` | |
| | `std::unique_ptr<T'>` | |
| | User-defined collection of `T'` | with size check |
| | Untyped collection of `T'` | with size check |
| `std::[unordered_]set<T>` | `std::[unordered_]set<T'>` | |
| | `std::[unordered_]map<K',V'>` | only `T` = `std::[pair,tuple]<K,V>` |
| `std::[unordered_]multiset<T>` | `ROOT::RVec<T'>` | |
| | `std::vector<T'>` | |
| | `std::array<T', N>` | |
| | `std::[unordered_][multi]set<T'>` | |
| | `std::[unordered_][multi]map<K',V'>` | only `T` = `std::[pair,tuple]<K,V>` |
| | User-defined collection of `T'` | |
| | Untyped collection of `T'` | |
| `std::[unordered_]map<K,V>` | `std::[unordered_]map<K',V'>` | |
| `std::[unordered_]multimap<K,V>` | `ROOT::RVec<T>` | only `T` = `std::[pair,tuple]<K,V>` |
| | `std::vector<T>` | only `T` = `std::[pair,tuple]<K,V>` |
| | `std::array<T, N>` | only `T` = `std::[pair,tuple]<K,V>` |
| | `std::[unordered_][multi]set<T>` | only `T` = `std::[pair,tuple]<K,V>` |
| | `std::[unordered_][multi]map<K',V'>` | |
| | User-defined collection of `T` | only `T` = `std::[pair,tuple]<K,V>` |
| | Untyped collection of `T` | only `T` = `std::[pair,tuple]<K,V>` |
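The asymmetry between sets and vectors can be illustrated with plain standard library containers. The sketch below is not how ROOT decides which conversions to offer (that decision is made on the types, not on the stored values); it only shows why one direction is always safe while the other is not. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Set -> vector: every set is a valid sequence, so no element is lost.
std::vector<int> ReadAsVector(const std::set<int> &onDisk)
{
   return std::vector<int>(onDisk.begin(), onDisk.end());
}

// Vector -> set: only lossless if the stored items happen to be unique.
bool IsLosslessAsSet(const std::vector<int> &onDisk)
{
   const std::set<int> asSet(onDisk.begin(), onDisk.end());
   return asSet.size() == onDisk.size();
}

// Usage: the set survives unchanged, the vector with duplicates does not.
inline void CollectionExample()
{
   const std::set<int> uniqueItems = {1, 2, 3};
   assert(ReadAsVector(uniqueItems).size() == std::size_t{3});
   const std::vector<int> withDuplicates = {1, 1, 2};
   assert(!IsLosslessAsSet(withDuplicates));
}
```

A multiset does not have this restriction, which is consistent with sequence types appearing as compatible on-disk types for `std::[unordered_]multiset<T>` in the table above.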
There is no special automatic evolution for fixed-length collections (`std::array<...>`, `std::bitset<...>`). The length of the array must not change and there is no automatic transformation from variable-length to fixed-length collections. C-style arrays and `std::array<...>` of the same type and length can be used interchangeably.
| In-memory type | Compatible on-disk types |
|---|---|
| `std::optional<T>` | `std::unique_ptr<T'>` |
| | `T'` |
| `std::unique_ptr<T>` | `std::optional<T'>` |
| | `T'` |
| In-memory type | Compatible on-disk types |
|---|---|
| `std::pair<T,U>` | `std::tuple<T',U'>` |
| `std::tuple<T,U>` | `std::pair<T',U'>` |
| Untyped record | User-defined class of compatible shape |
Note that for emulated classes, the in-memory untyped record is constructed from on-disk information.
All on-disk types `std::atomic<T'>` can be read into a `T` in-memory model.
If a class property changes from using an RNTuple streamer field to using a regular RNTuple class field, existing files with on-disk streamer fields will continue to be read as streamer fields. This can be seen as "schema evolution out of streamer fields".
ROOT I/O customization rules allow for custom code handling the transformation from the on-disk schema to the in-memory model. Customization rules are part of the class dictionary. For the exact syntax of customization rules, please refer to the ROOT manual.
Generally, customization rules consist of
`TClass::GetCheckSum()`.
For illustration purposes, here is a concrete example of a customization rule
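A rule of this kind might look as follows in a dictionary `LinkDef` file. This is a sketch only: the class `Coordinates`, its members and the version number are hypothetical, and the ROOT manual remains the reference for the exact syntax.

```cpp
// LinkDef.h fragment, processed during dictionary generation.
// Hypothetical scenario: files written with class version 3 store polar
// coordinates (fR, fPhi); the current class stores Cartesian ones (fX, fY).
#ifdef __CLING__
#pragma link C++ class Coordinates+;
#pragma read sourceClass="Coordinates" version="[3]" \
   source="float fR; float fPhi" \
   targetClass="Coordinates" target="fX,fY" \
   include="cmath" \
   code="{ fX = onfile.fR * std::cos(onfile.fPhi); fY = onfile.fR * std::sin(onfile.fPhi); }"
#endif
```

In this sketch, `source` lists the on-disk members together with the types they are read as, `target` lists the in-memory members set by the rule, and the snippet in `code` accesses the source members through `onfile`.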
At runtime, for any given target member there must be at most one applicable rule. A source member can be read into any type compatible to its on-disk type but any given source member can only be read into one type for a given target class (i.e. multiple rules for the same target/source class must not use different types for the same source member).
There are two special types of rules
Class rename rules (pure or not) are not transitive (if in-memory A can read from on-disk B and in-memory B can read from on-disk C, in-memory A cannot automatically read from on-disk C).
Note that customization rules operate on partially read objects. Customization rules are executed after all members not subject to customization rules have been read from disk. Whole-object rules are executed after other rules. Otherwise, the scheduling of rules is unspecified.
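The ordering described above can be pictured with a small toy model. It is not ROOT's implementation: the types `OnFile`, `Target` and `Rule` as well as the function `ReadObject` are hypothetical and only mimic the three phases (regular members, rules with target members, whole-object rules).

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical on-disk layout and in-memory class.
struct OnFile {
   int fId;
   float fR;
};
struct Target {
   int fId = 0;           // regular member, read directly from disk
   float fDiameter = 0.f; // target member of a customization rule
   bool fValid = false;   // set by a whole-object rule
};

// A rule sees the partially read target object and the on-disk values.
using Rule = std::function<void(Target &, const OnFile &)>;

Target ReadObject(const OnFile &onfile, const std::vector<Rule> &memberRules,
                  const std::vector<Rule> &wholeObjectRules)
{
   Target obj;
   obj.fId = onfile.fId;                     // 1. members not subject to any rule
   for (const auto &rule : memberRules)      // 2. rules with target members
      rule(obj, onfile);                     //    (their relative order is unspecified)
   for (const auto &rule : wholeObjectRules) // 3. whole-object rules run last
      rule(obj, onfile);
   return obj;
}

// Usage: the whole-object rule can rely on fId and fDiameter being set.
inline void RuleOrderExample()
{
   const Rule setDiameter = [](Target &t, const OnFile &f) { t.fDiameter = 2.0f * f.fR; };
   const Rule validate = [](Target &t, const OnFile &) { t.fValid = (t.fId > 0) && (t.fDiameter > 0.f); };
   const Target obj = ReadObject(OnFile{7, 1.5f}, {setDiameter}, {validate});
   assert(obj.fId == 7);
   assert(obj.fDiameter == 3.0f);
   assert(obj.fValid);
}
```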
The target members of I/O customization rules are exempt from automatic schema evolution (applies to the corresponding field of the target member and all its subfields). Otherwise, automatic and manual schema evolution work side by side. For instance, a renamed class is still subject to automatic schema evolution.
The source member of a customization rule is subject to the same automatic and manual schema evolution rules as if it were read normally, e.g. in an `RNTupleView`.
In contrast to RNTuple, TTree and TFile also apply the following automatic schema evolution rules
- `unique_ptr<T>` --> `T'`