{
"cells": [
{
"cell_type": "markdown",
"id": "c758264a",
"metadata": {},
"source": [
"# rf404_categories\n",
"Data and categories: working with RooCategory objects to describe discrete variables\n",
"\n",
"\n",
"\n",
"\n",
"**Author:** Wouter Verkerke \n",
"This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, March 19, 2024 at 07:16 PM."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b3c8e68a",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:10.651587Z",
"iopub.status.busy": "2024-03-19T19:16:10.651210Z",
"iopub.status.idle": "2024-03-19T19:16:10.716671Z",
"shell.execute_reply": "2024-03-19T19:16:10.708724Z"
}
},
"outputs": [],
"source": [
"%%cpp -d\n",
"#include \"RooRealVar.h\"\n",
"#include \"RooDataSet.h\"\n",
"#include \"RooPolynomial.h\"\n",
"#include \"RooCategory.h\"\n",
"#include \"Roo1DTable.h\"\n",
"#include \"RooGaussian.h\"\n",
"#include \"TCanvas.h\"\n",
"#include \"TAxis.h\"\n",
"#include \"RooPlot.h\"\n",
"#include <iostream>\n",
"using namespace RooFit;"
]
},
{
"cell_type": "markdown",
"id": "be914839",
"metadata": {},
"source": [
"Construct a category with labels\n",
"----------------------------------------------------------------"
]
},
{
"cell_type": "markdown",
"id": "ef11fa4a",
"metadata": {},
"source": [
"Define a category with labels only"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a14110a4",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:10.761991Z",
"iopub.status.busy": "2024-03-19T19:16:10.723398Z",
"iopub.status.idle": "2024-03-19T19:16:11.604052Z",
"shell.execute_reply": "2024-03-19T19:16:11.602427Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RooCategory::tagCat = Lepton(idx = 0)\n",
"\n"
]
}
],
"source": [
"RooCategory tagCat(\"tagCat\", \"Tagging category\");\n",
"tagCat.defineType(\"Lepton\");\n",
"tagCat.defineType(\"Kaon\");\n",
"tagCat.defineType(\"NetTagger-1\");\n",
"tagCat.defineType(\"NetTagger-2\");\n",
"tagCat.Print();"
]
},
{
"cell_type": "markdown",
"id": "ed82857f",
"metadata": {},
"source": [
"Construct a category with labels and indices\n",
"----------------------------------------------------------------------------------------"
]
},
{
"cell_type": "markdown",
"id": "96fed657",
"metadata": {},
"source": [
"Define a category with explicitly numbered states"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "203664f5",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:11.611412Z",
"iopub.status.busy": "2024-03-19T19:16:11.610826Z",
"iopub.status.idle": "2024-03-19T19:16:11.876753Z",
"shell.execute_reply": "2024-03-19T19:16:11.870607Z"
}
},
"outputs": [],
"source": [
"RooCategory b0flav(\"b0flav\", \"B0 flavour eigenstate\");\n",
"b0flav[\"B0\"] = -1;\n",
"b0flav[\"B0bar\"] = 1;"
]
},
{
"cell_type": "markdown",
"id": "4d966fc6",
"metadata": {},
"source": [
"Print it in \"verbose\" mode to see all states."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "06d18641",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:11.885369Z",
"iopub.status.busy": "2024-03-19T19:16:11.884909Z",
"iopub.status.idle": "2024-03-19T19:16:12.105753Z",
"shell.execute_reply": "2024-03-19T19:16:12.104702Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--- RooAbsArg ---\n",
" Value State: clean\n",
" Shape State: clean\n",
" Attributes: \n",
" Address: 0x7f6628017000\n",
" Clients: \n",
" Servers: \n",
" Proxies: \n",
"--- RooAbsCategory ---\n",
" Value = -1 \"B0)\n",
" Possible states:\n",
" B0\t-1\n",
" B0bar\t1\n"
]
}
],
"source": [
"b0flav.Print(\"V\");"
]
},
{
"cell_type": "markdown",
"id": "32e20663",
"metadata": {},
"source": [
"Alternatively, define many states at once. `defineTypes` takes\n",
"a map from `std::string` labels to integer indices."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "43218d61",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:12.117914Z",
"iopub.status.busy": "2024-03-19T19:16:12.117541Z",
"iopub.status.idle": "2024-03-19T19:16:12.521584Z",
"shell.execute_reply": "2024-03-19T19:16:12.514738Z"
}
},
"outputs": [],
"source": [
"RooCategory largeCat(\"largeCat\", \"A category with many states\");\n",
"largeCat.defineTypes({\n",
" {\"A\", 0}, {\"b\", 2}, {\"c\", 8}, {\"dee\", 4},\n",
" {\"F\", 133}, {\"g\", 15}, {\"H\", -20}\n",
"});"
]
},
{
"cell_type": "markdown",
"id": "b1b8a64b",
"metadata": {},
"source": [
"Iterate, query and set states\n",
"--------------------------------------------------------"
]
},
{
"cell_type": "markdown",
"id": "02749476",
"metadata": {},
"source": [
"One can iterate over the {name, index} pairs of a category object:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3b95d8b0",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:12.544102Z",
"iopub.status.busy": "2024-03-19T19:16:12.543713Z",
"iopub.status.idle": "2024-03-19T19:16:13.023383Z",
"shell.execute_reply": "2024-03-19T19:16:13.021992Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"This is the for loop over states of 'largeCat':\n",
"\tA\t0\n",
"\tF\t133\n",
"\tH\t-20\n",
"\tb\t2\n",
"\tc\t8\n",
"\tdee\t4\n",
"\tg\t15\n",
"\n"
]
}
],
"source": [
"std::cout << \"\\nThis is the for loop over states of 'largeCat':\";\n",
"// Each element is a (name, index) pair: .first is the label, .second is the index\n",
"for (const auto &nameAndIdx : largeCat)\n",
" std::cout << \"\\n\\t\" << nameAndIdx.first << \"\\t\" << nameAndIdx.second;\n",
"std::cout << '\\n' << std::endl;"
]
},
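{
"cell_type": "markdown",
"id": "7c1e5a90",
"metadata": {},
"source": [
"The loop above visits the states in label order (`A`, `F`, `H`, `b`, ...), not in the order in which they were defined and not by index. This is consistent with the states being kept in a `std::map` keyed by label, although that is an assumption about the implementation rather than a documented guarantee. A minimal sketch for listing the states by index instead, where `byIndex` is a helper map introduced only for this illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d9b4f63",
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"// Hypothetical helper: invert the (label, index) pairs so that std::map orders them by index\n",
"std::map<int, std::string> byIndex;\n",
"for (const auto &labelAndIndex : largeCat)\n",
"   byIndex[labelAndIndex.second] = labelAndIndex.first;\n",
"// Print index first, then label, in ascending index order\n",
"for (const auto &indexAndLabel : byIndex)\n",
"   std::cout << indexAndLabel.first << \"\\t\" << indexAndLabel.second << \"\\n\";"
]
},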
{
"cell_type": "markdown",
"id": "8698ccd3",
"metadata": {},
"source": [
"To check whether a label or an index belongs to a defined state, use:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "84ab86ed",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:13.033908Z",
"iopub.status.busy": "2024-03-19T19:16:13.033470Z",
"iopub.status.idle": "2024-03-19T19:16:13.383465Z",
"shell.execute_reply": "2024-03-19T19:16:13.382328Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Has label 'A': 1\n",
"Has index '-20': 1"
]
}
],
"source": [
"std::cout << \"Has label 'A': \" << largeCat.hasLabel(\"A\");\n",
"std::cout << \"\\nHas index '-20': \" << largeCat.hasIndex(-20);"
]
},
{
"cell_type": "markdown",
"id": "d20bc89b",
"metadata": {},
"source": [
"To retrieve names or state numbers:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "92a89751",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:13.412958Z",
"iopub.status.busy": "2024-03-19T19:16:13.412555Z",
"iopub.status.idle": "2024-03-19T19:16:13.648449Z",
"shell.execute_reply": "2024-03-19T19:16:13.631030Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Label corresponding to '2' is b\n",
"Index corresponding to 'A' is 0"
]
}
],
"source": [
"std::cout << \"\\nLabel corresponding to '2' is \" << largeCat.lookupName(2);\n",
"std::cout << \"\\nIndex corresponding to 'A' is \" << largeCat.lookupIndex(\"A\");"
]
},
{
"cell_type": "markdown",
"id": "1ff1dd86",
"metadata": {},
"source": [
"To get the current state:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "3e92b536",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:13.653093Z",
"iopub.status.busy": "2024-03-19T19:16:13.652748Z",
"iopub.status.idle": "2024-03-19T19:16:13.985622Z",
"shell.execute_reply": "2024-03-19T19:16:13.984401Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Current index is 0\n",
"Current label is A\n"
]
}
],
"source": [
"std::cout << \"\\nCurrent index is \" << largeCat.getCurrentIndex();\n",
"std::cout << \"\\nCurrent label is \" << largeCat.getCurrentLabel();\n",
"std::cout << std::endl;"
]
},
{
"cell_type": "markdown",
"id": "0dbf77cb",
"metadata": {},
"source": [
"To set the state, use either of the following; both select state `c` (index 8):"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "8ae41154",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:13.990632Z",
"iopub.status.busy": "2024-03-19T19:16:13.990234Z",
"iopub.status.idle": "2024-03-19T19:16:14.224601Z",
"shell.execute_reply": "2024-03-19T19:16:14.203239Z"
}
},
"outputs": [],
"source": [
"largeCat.setIndex(8);\n",
"largeCat.setLabel(\"c\");"
]
},
{
"cell_type": "markdown",
"id": "5a26f285",
"metadata": {},
"source": [
"Generate dummy data for tabulation demo\n",
"----------------------------------------------------------------------------"
]
},
{
"cell_type": "markdown",
"id": "6a015812",
"metadata": {},
"source": [
"Generate a dummy dataset"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "8ecd5a25",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:14.238118Z",
"iopub.status.busy": "2024-03-19T19:16:14.237743Z",
"iopub.status.idle": "2024-03-19T19:16:14.714726Z",
"shell.execute_reply": "2024-03-19T19:16:14.712318Z"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"input_line_57:3:1: warning: 'data' shadows a declaration with the same name in the 'std' namespace; use '::data' to reference this declaration\n",
"std::unique_ptr<RooDataSet> data{RooPolynomial(\"p\", \"p\", x).generate({x, b0flav, tagCat}, 10000)};\n",
"^\n"
]
}
],
"source": [
"RooRealVar x(\"x\", \"x\", 0, 10);\n",
"std::unique_ptr<RooDataSet> data{RooPolynomial(\"p\", \"p\", x).generate({x, b0flav, tagCat}, 10000)};"
]
},
{
"cell_type": "markdown",
"id": "cc36bcad",
"metadata": {},
"source": [
"Print tables of category contents of datasets\n",
"------------------------------------------------------------------------------------------"
]
},
{
"cell_type": "markdown",
"id": "060ce175",
"metadata": {},
"source": [
"Tables are the equivalent of plots for categories"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "4fbcc8bb",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:14.755519Z",
"iopub.status.busy": "2024-03-19T19:16:14.755083Z",
"iopub.status.idle": "2024-03-19T19:16:14.979492Z",
"shell.execute_reply": "2024-03-19T19:16:14.978210Z"
}
},
"outputs": [],
"source": [
"// The unqualified name 'data' is ambiguous with std::data in the interpreter; '::data' selects the dataset\n",
"Roo1DTable *btable = ::data->table(b0flav);\n",
"btable->Print();\n",
"btable->Print(\"v\");"
]
},
{
"cell_type": "markdown",
"id": "45bbc3f6",
"metadata": {},
"source": [
"Create table for subset of events matching cut expression"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "5d8512a9",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:15.033084Z",
"iopub.status.busy": "2024-03-19T19:16:15.032699Z",
"iopub.status.idle": "2024-03-19T19:16:15.268068Z",
"shell.execute_reply": "2024-03-19T19:16:15.259026Z"
}
},
"outputs": [],
"source": [
"// '::data' avoids the ambiguity with std::data\n",
"Roo1DTable *ttable = ::data->table(tagCat, \"x>8.23\");\n",
"ttable->Print();\n",
"ttable->Print(\"v\");"
]
},
{
"cell_type": "markdown",
"id": "d87a267c",
"metadata": {},
"source": [
"Create table for all (tagCat x b0flav) state combinations"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "83eb620d",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:15.299872Z",
"iopub.status.busy": "2024-03-19T19:16:15.299506Z",
"iopub.status.idle": "2024-03-19T19:16:15.533549Z",
"shell.execute_reply": "2024-03-19T19:16:15.532394Z"
}
},
"outputs": [],
"source": [
"// '::data' avoids the ambiguity with std::data\n",
"Roo1DTable *bttable = ::data->table(RooArgSet(tagCat, b0flav));\n",
"bttable->Print(\"v\");"
]
},
{
"cell_type": "markdown",
"id": "4376ccfa",
"metadata": {},
"source": [
"Retrieve the number of events from the table.\n",
"The number can be non-integer if the source dataset has weighted events."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "b9380642",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:15.544390Z",
"iopub.status.busy": "2024-03-19T19:16:15.544002Z",
"iopub.status.idle": "2024-03-19T19:16:15.788446Z",
"shell.execute_reply": "2024-03-19T19:16:15.773227Z"
}
},
"outputs": [],
"source": [
"double nb0 = btable->get(\"B0\");\n",
"std::cout << \"Number of events with B0 flavor is \" << nb0 << std::endl;"
]
},
{
"cell_type": "markdown",
"id": "d3a5231c",
"metadata": {},
"source": [
"Retrieve fraction of events with \"Lepton\" tag"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "ec260a54",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:15.804377Z",
"iopub.status.busy": "2024-03-19T19:16:15.803900Z",
"iopub.status.idle": "2024-03-19T19:16:16.024434Z",
"shell.execute_reply": "2024-03-19T19:16:16.019190Z"
}
},
"outputs": [],
"source": [
"double fracLep = ttable->getFrac(\"Lepton\");\n",
"std::cout << \"Fraction of events tagged with Lepton tag is \" << fracLep << std::endl;"
]
},
{
"cell_type": "markdown",
"id": "c1a8a2db",
"metadata": {},
"source": [
"Defining ranges for plotting, fitting on categories\n",
"------------------------------------------------------------------------------------------------------"
]
},
{
"cell_type": "markdown",
"id": "418f4097",
"metadata": {},
"source": [
"Define a named range as a comma-separated list of labels"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "ee0c952a",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:16.043054Z",
"iopub.status.busy": "2024-03-19T19:16:16.042650Z",
"iopub.status.idle": "2024-03-19T19:16:16.285526Z",
"shell.execute_reply": "2024-03-19T19:16:16.268089Z"
}
},
"outputs": [],
"source": [
"tagCat.setRange(\"good\", \"Lepton,Kaon\");"
]
},
{
"cell_type": "markdown",
"id": "3c45080f",
"metadata": {},
"source": [
"Or add state names one by one"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "0d8c41b5",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:16.307600Z",
"iopub.status.busy": "2024-03-19T19:16:16.306953Z",
"iopub.status.idle": "2024-03-19T19:16:16.528451Z",
"shell.execute_reply": "2024-03-19T19:16:16.527216Z"
}
},
"outputs": [],
"source": [
"tagCat.addToRange(\"soso\", \"NetTagger-1\");\n",
"tagCat.addToRange(\"soso\", \"NetTagger-2\");"
]
},
{
"cell_type": "markdown",
"id": "8d092812",
"metadata": {},
"source": [
"Use category range in dataset reduction specification"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "fe712113",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:16:16.533460Z",
"iopub.status.busy": "2024-03-19T19:16:16.533053Z",
"iopub.status.idle": "2024-03-19T19:16:16.780845Z",
"shell.execute_reply": "2024-03-19T19:16:16.767539Z"
}
},
"outputs": [],
"source": [
"// '::data' avoids the ambiguity with std::data\n",
"std::unique_ptr<RooAbsData> goodData{::data->reduce(CutRange(\"good\"))};\n",
"static_cast<RooDataSet &>(*goodData).table(tagCat)->Print(\"v\");"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ROOT C++",
"language": "c++",
"name": "root"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"file_extension": ".C",
"mimetype": "text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 5
}