ROOT/TMVA SOFIE (___System for Optimized Fast Inference code Emit___) generates C++ functions easily invokable for the fast inference of trained neural network models. It takes ONNX model files as inputs and produces C++ header files that can be included and utilized in a “plug-and-go” style.
This is a new development in TMVA and is currently in an early experimental stage. Bug reports and suggestions for improvements are warmly welcome.
Build ROOT with the CMake option `tmva-sofie` enabled (for example, by passing `-Dtmva-sofie=ON` to `cmake`).
SOFIE uses a parser-generator architecture. It provides ONNX, Keras and PyTorch parsers that translate models in their respective formats into SOFIE's internal representation.
From the ROOT command line, or in a ROOT macro, we can parse an ONNX model and generate its inference code.
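A minimal sketch of this step, assuming the ONNX parser class `RModelParser_ONNX` and the `RModel` methods `Generate` and `OutputGenerated` from the `TMVA::Experimental::SOFIE` namespace. The function names `HeaderPathFor` and `GenerateInferenceCode`, the macro `HAVE_SOFIE` and the file name `example_model.onnx` are illustrative and not part of SOFIE; the include guard only lets the file compile on builds without SOFIE.

```cpp
#include <iostream>
#include <string>

// The SOFIE headers exist only if ROOT was built with tmva-sofie enabled.
#if defined(__has_include)
#  if __has_include("TMVA/RModelParser_ONNX.hxx")
#    include "TMVA/RModel.hxx"
#    include "TMVA/RModelParser_ONNX.hxx"
#    define HAVE_SOFIE 1 // hypothetical macro, local to this sketch
#  endif
#endif

// Hypothetical helper: "./example_model.onnx" -> "./example_model.hxx".
std::string HeaderPathFor(const std::string &onnxPath)
{
   const std::string ext = ".onnx";
   const bool hasExt = onnxPath.size() >= ext.size() &&
      onnxPath.compare(onnxPath.size() - ext.size(), ext.size(), ext) == 0;
   const std::string stem =
      hasExt ? onnxPath.substr(0, onnxPath.size() - ext.size()) : onnxPath;
   return stem + ".hxx";
}

// Parse the ONNX file and write the inference header.
// Returns false if SOFIE is not available in this build.
bool GenerateInferenceCode(const std::string &onnxPath)
{
#ifdef HAVE_SOFIE
   using namespace TMVA::Experimental;
   SOFIE::RModelParser_ONNX parser;
   SOFIE::RModel model = parser.Parse(onnxPath);   // ONNX -> internal representation
   model.Generate();                               // build the C++ inference code
   model.OutputGenerated(HeaderPathFor(onnxPath)); // write the generated header
   return true;
#else
   std::cerr << "SOFIE headers not found; rebuild ROOT with tmva-sofie enabled\n";
   return false;
#endif
}
```

From the ROOT prompt, this could be called as `GenerateInferenceCode("./example_model.onnx")` once the file has been loaded.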
A C++ header file and a `.dat` file containing the model weights will be generated. You can also call `PrintRequiredInputTensors()` on the parsed model to check the required size and type of the input tensor for that particular model, and `PrintInitializedTensors()` to check the tensors (weights) already included in the model.
To use the generated inference code, include the generated header and call its inference function.
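The calling pattern can be sketched as follows. It assumes the header was written as `example_model.hxx`, that the generated code lives in a namespace called `TMVA_SOFIE_example_model`, and that its `Session` class takes the weight file name and exposes `infer(float *)`; check the generated header for the actual names and signatures. The stand-in namespace, `kInputSize`, `HAVE_GENERATED_HEADER` and `RunModel` are hypothetical and only let the sketch compile before a header has been generated.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

#if defined(__has_include)
#  if __has_include("example_model.hxx")
#    include "example_model.hxx" // header written by SOFIE
#    define HAVE_GENERATED_HEADER 1
#  endif
#endif

#ifndef HAVE_GENERATED_HEADER
// Stand-in with the interface assumed for the generated code. It is NOT what
// SOFIE emits: infer() here simply returns the sum of kInputSize inputs.
namespace TMVA_SOFIE_example_model {
constexpr std::size_t kInputSize = 4; // hypothetical input length
struct Session {
   explicit Session(const std::string & /*weightFile*/) {}
   std::vector<float> infer(float *input)
   {
      float sum = 0.f;
      for (std::size_t i = 0; i < kInputSize; ++i)
         sum += input[i];
      return {sum};
   }
};
} // namespace TMVA_SOFIE_example_model
#endif

// Create a session (which loads the weights) and run one inference.
std::vector<float> RunModel(std::vector<float> &input, const std::string &weightFile)
{
   assert(!input.empty());
   TMVA_SOFIE_example_model::Session session(weightFile);
   return session.infer(input.data());
}
```

Creating the session once and reusing it for many calls avoids reloading the weights on every inference.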
With the default settings, the weights are stored in a separate binary file. If the user instead wants them in the generated header file itself, they can pass the appropriate generation option, `Options::kNoWeightFile`, to `Generate`.
Other such options include `Options::kNoSession` (which skips generating the Session class and keeps the infer function independent). SOFIE also supports generating inference code that takes RDataFrame as input; refer to the tutorials below for examples.