{
"cells": [
{
"cell_type": "markdown",
"id": "0424e5e2",
"metadata": {},
"source": [
"# df012_DefinesAndFiltersAsStrings\n",
"Use just-in-time-compiled Filters and Defines for quick prototyping.\n",
"\n",
"This tutorial illustrates how to save some typing when using RDataFrame\n",
"by passing Filter and Define expressions as strings, which are just-in-time compiled at runtime.\n",
"\n",
"\n",
"\n",
"\n",
"**Author:** Guilherme Amadio (CERN) \n",
"This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, March 19, 2024 at 07:06 PM."
]
},
{
"cell_type": "markdown",
"id": "10dd574a",
"metadata": {},
"source": [
"We will calculate an approximation of pi, deliberately inefficiently, by generating\n",
"random points and doing very simple filtering and analysis on them."
]
},
{
"cell_type": "markdown",
"id": "f3d5be93",
"metadata": {},
"source": [
"We start by creating an empty dataframe where we will insert 10 million\n",
"random points in a square of side 2.0 (that is, with an inscribed circle\n",
"of radius 1.0)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "38c26857",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:06:57.349755Z",
"iopub.status.busy": "2024-03-19T19:06:57.349365Z",
"iopub.status.idle": "2024-03-19T19:06:58.341829Z",
"shell.execute_reply": "2024-03-19T19:06:58.340696Z"
}
},
"outputs": [],
"source": [
"size_t npoints = 10000000;\n",
"ROOT::RDataFrame df(npoints);"
]
},
{
"cell_type": "markdown",
"id": "46d415da",
"metadata": {},
"source": [
"Define the columns we want in the dataframe. We do not need to define p as an array,\n",
"but we do it here to demonstrate how to use jitting with RDataFrame."
]
},
{
"cell_type": "markdown",
"id": "ec964021",
"metadata": {},
"source": [
"NOTE: Although it's possible to use \"for (auto&& x : p)\" below, it will\n",
"shadow the name of the data column \"x\", and may cause compilation failures\n",
"if the local variable and the data column are of different types, or the\n",
"local x variable is declared in the global scope of the lambda function."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "6e102a9e",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:06:58.364863Z",
"iopub.status.busy": "2024-03-19T19:06:58.364458Z",
"iopub.status.idle": "2024-03-19T19:06:58.800708Z",
"shell.execute_reply": "2024-03-19T19:06:58.799550Z"
}
},
"outputs": [],
"source": [
"auto pidf = df.Define(\"x\", \"gRandom->Uniform(-1.0, 1.0)\")\n",
"              .Define(\"y\", \"gRandom->Uniform(-1.0, 1.0)\")\n",
"              .Define(\"p\", \"std::array<double, 2> v{x, y}; return v;\")\n",
"              .Define(\"r\", \"double r2 = 0.0; for (auto&& w : p) r2 += w*w; return sqrt(r2);\");"
]
},
{
"cell_type": "markdown",
"id": "5d4dbbfe",
"metadata": {},
"source": [
"Now we have a dataframe with columns x, y, p (which is a point based on x\n",
"and y), and the radius r = sqrt(x*x + y*y). In order to approximate pi, we\n",
"need to know how many of our data points fall inside the unit circle compared\n",
"with the total number of points. The ratio of the areas is\n",
"\n",
"A_circle / A_square = (pi * r * r) / (l * l) = pi / 4, where r = 1.0 and l = 2.0.\n",
"\n",
"Therefore, we can approximate pi with four times the number of points inside the\n",
"unit circle over the total number of points in our dataframe:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "6808b994",
"metadata": {
"collapsed": false,
"execution": {
"iopub.execute_input": "2024-03-19T19:06:58.825717Z",
"iopub.status.busy": "2024-03-19T19:06:58.825292Z",
"iopub.status.idle": "2024-03-19T19:07:06.009862Z",
"shell.execute_reply": "2024-03-19T19:07:06.008537Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"pi is approximately equal to 3.14146\n"
]
}
],
"source": [
"auto incircle = *(pidf.Filter(\"r <= 1.0\").Count());\n",
"\n",
"double pi_approx = 4.0 * incircle / npoints;\n",
"\n",
"std::cout << \"pi is approximately equal to \" << pi_approx << std::endl;"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "ROOT C++",
"language": "c++",
"name": "root"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"file_extension": ".C",
"mimetype": "text/x-c++src",
"name": "c++"
}
},
"nbformat": 4,
"nbformat_minor": 5
}