{ "cells": [ { "cell_type": "markdown", "id": "69b4d278", "metadata": {}, "source": [ "# testTMPIFile\n", "This macro shows the usage of TMPIFile to simulate event\n", "reconstruction and merging them in parallel.\n", "The JetEvent class is in $ROOTSYS/tutorials/io/tree/JetEvent.h,cxx\n", "\n", "To run this macro do the following:\n", "```bash\n", "mpirun -np 4 root -b -l -q testTMPIFile.C\n", "```\n", "\n", "\n", "\n", "\n", "**Author:** Taylor Childers, Yunsong Wang \n", "This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, December 30, 2025 at 02:42 PM." ] }, { "cell_type": "markdown", "id": "826dc50e", "metadata": {}, "source": [ " Definition of a helper function: " ] }, { "cell_type": "code", "execution_count": 1, "id": "f34fa11e", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:40.848209Z", "iopub.status.busy": "2025-12-30T13:42:40.848074Z", "iopub.status.idle": "2025-12-30T13:42:40.851299Z", "shell.execute_reply": "2025-12-30T13:42:40.850917Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_48:2:10: fatal error: 'TMPIFile.h' file not found\n", "#include \"TMPIFile.h\"\n", " ^~~~~~~~~~~~\n" ] } ], "source": [ "%%cpp -d\n", "\n", "#include \"TMPIFile.h\"\n", "\n", "#ifdef TMPI_SECOND_RUN\n", "\n", "#include \n", "#include \n", "\n", "/* ---------------------------------------------------------------------------\n", "\n", "The idea of TMPIFile is to run N MPI ranks where some ranks are\n", "producing data (called workers), while other ranks are collecting data and\n", "writing it to disk (called collectors). The number of collectors can be\n", "configured and this should be optimized for each workflow and data size.\n", "\n", "This example uses a typical event processing loop, where every N events the\n", "TMPIFile::Sync() function is called. This call triggers the local TTree data\n", "to be sent via MPI to the collector rank where it is merged with all the\n", "other worker rank data and written to a TFile.\n", "\n", "An MPI Sub-Communictor is created for each collector which equally distributes\n", "the remaining ranks to distribute the workers among collectors.\n", "\n", "--------------------------------------------------------------------------- */\n", "\n", "void test_tmpi()\n", "{\n", "\n", " Int_t N_collectors = 2; // specify how many collectors to run\n", " Int_t sync_rate = 2; // workers sync every sync_rate events\n", " Int_t events_per_rank = 6; // total events each rank will produce then exit\n", " Int_t sleep_mean = 5; // simulate compute time for event processing\n", " Int_t sleep_sigma = 2; // variation in compute time\n", "\n", " // using JetEvent generator to create a data structure\n", " // these parameters control this generator\n", " Int_t jetm = 25;\n", " Int_t trackm = 60;\n", " Int_t hitam = 200;\n", " Int_t hitbm = 100;\n", "\n", " std::string treename = \"test_tmpi\";\n", " std::string branchname = \"event\";\n", "\n", " // set output filename\n", " std::stringstream smpifname;\n", " smpifname << \"/tmp/merged_output_\" << getpid() << \".root\";\n", "\n", " // Create new TMPIFile, passing the filename, setting read/write permissions\n", " // and setting the number of collectors.\n", " // If MPI_INIT has not been called already, the constructor of TMPIFile\n", " // will call this.\n", " TMPIFile *newfile = new TMPIFile(smpifname.str().c_str(), \"RECREATE\", N_collectors);\n", " // set random number seed that is based on MPI rank\n", " // this avoids producing the same events in each MPI rank\n", " gRandom->SetSeed(gRandom->GetSeed() + newfile->GetMPIGlobalRank());\n", "\n", " // only print log messages in MPI Rank 0\n", " if (newfile->GetMPIGlobalRank() == 0) {\n", " Info(\"test_tmpi\", \" running with parallel ranks: %d\", newfile->GetMPIGlobalSize());\n", " Info(\"test_tmpi\", \" running with collecting ranks: %d\", N_collectors);\n", " Info(\"test_tmpi\", \" running with working ranks: %d\", (newfile->GetMPIGlobalSize() - N_collectors));\n", " Info(\"test_tmpi\", \" running with sync rate: %d\", sync_rate);\n", " Info(\"test_tmpi\", \" running with events per rank: %d\", events_per_rank);\n", " Info(\"test_tmpi\", \" running with sleep mean: %d\", sleep_mean);\n", " Info(\"test_tmpi\", \" running with sleep sigma: %d\", sleep_sigma);\n", " Info(\"test_tmpi\", \" running with seed: %d\", gRandom->GetSeed());\n", " }\n", "\n", " // print filename for each collector Rank\n", " if (newfile->IsCollector()) {\n", " Info(\"Collector\", \"[%d]\\troot output filename = %s\", newfile->GetMPIGlobalRank(), smpifname.str().c_str());\n", " }\n", "\n", " // This if statement splits the run-time functionality of\n", " // workers and collectors.\n", " if (newfile->IsCollector()) {\n", " // Run by collector ranks\n", " // This will run until all workers have exited\n", " newfile->RunCollector();\n", " } else {\n", " // Run by worker ranks\n", " // these ranks generate data to be written to TMPIFile\n", "\n", " // create a TTree to store event data\n", " TTree *tree = new TTree(treename.c_str(), \"Event example with Jets\");\n", " // set the AutoFlush rate to be the same as the sync_rate\n", " // this synchronizes the TTree branch compression\n", " tree->SetAutoFlush(sync_rate);\n", "\n", " // Create our fake event data generator\n", " JetEvent *event = new JetEvent;\n", "\n", " // add our branch to the TTree\n", " tree->Branch(branchname.c_str(), \"JetEvent\", &event, 8000, 2);\n", "\n", " // monitor timing\n", " auto sync_start = std::chrono::high_resolution_clock::now();\n", "\n", " // generate the specified number of events\n", " for (int i = 0; i < events_per_rank; i++) {\n", "\n", " auto start = std::chrono::high_resolution_clock::now();\n", " // Generate one event\n", " event->Build(jetm, trackm, hitam, hitbm);\n", "\n", " auto evt_built = std::chrono::high_resolution_clock::now();\n", " double build_time = std::chrono::duration_cast>(evt_built - start).count();\n", "\n", " Info(\"Rank\", \"[%d] [%d]\\tevt = %d;\\tbuild_time = %f\", newfile->GetMPIColor(), newfile->GetMPILocalRank(), i,\n", " build_time);\n", "\n", " // if our build time was significant, subtract that from the sleep time\n", " auto adjusted_sleep = (int)(sleep_mean - build_time);\n", " auto sleep = abs(gRandom->Gaus(adjusted_sleep, sleep_sigma));\n", "\n", " // simulate the time taken by more complicated event generation\n", " std::this_thread::sleep_for(std::chrono::seconds(int(sleep)));\n", "\n", " // Fill the tree\n", " tree->Fill();\n", "\n", " // every sync_rate events, call the TMPIFile::Sync() function\n", " // to trigger MPI collection of local data\n", " if ((i + 1) % sync_rate == 0) {\n", " // call TMPIFile::Sync()\n", " newfile->Sync();\n", "\n", " auto end = std::chrono::high_resolution_clock::now();\n", " double sync_time = std::chrono::duration_cast>(end - sync_start).count();\n", " Info(\"Rank\", \"[%d] [%d]\\tevent collection time: %f\", newfile->GetMPIColor(), newfile->GetMPILocalRank(),\n", " sync_time);\n", " sync_start = std::chrono::high_resolution_clock::now();\n", " }\n", " }\n", "\n", " // synchronize any left over events\n", " if (events_per_rank % sync_rate != 0) {\n", " newfile->Sync();\n", " }\n", " }\n", "\n", " // call Close on the file for clean exit.\n", " Info(\"Rank\", \"[%d] [%d]\\tclosing file\", newfile->GetMPIColor(), newfile->GetMPILocalRank());\n", " newfile->Close();\n", "\n", " // open file and test contents\n", " if (newfile->GetMPILocalRank() == 0) {\n", " TString filename = newfile->GetMPIFilename();\n", " Info(\"Rank\", \"[%d] [%d]\\topening file: %s\", newfile->GetMPIColor(), newfile->GetMPILocalRank(), filename.Data());\n", " TFile file(filename.Data());\n", " if (file.IsOpen()) {\n", " file.ls();\n", " TTree *tree = (TTree *)file.Get(treename.c_str());\n", " if (tree)\n", " tree->Print();\n", "\n", " Info(\"Rank\", \"[%d] [%d]\\tfile should have %d events and has %lld\", newfile->GetMPIColor(),\n", " newfile->GetMPILocalRank(), (newfile->GetMPILocalSize() - 1) * events_per_rank, tree->GetEntries());\n", " }\n", " }\n", "}" ] }, { "cell_type": "code", "execution_count": 2, "id": "7035e7fd", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:40.852565Z", "iopub.status.busy": "2025-12-30T13:42:40.852445Z", "iopub.status.idle": "2025-12-30T13:42:41.176936Z", "shell.execute_reply": "2025-12-30T13:42:41.176457Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_50:2:3: error: use of undeclared identifier 'MPI_Initialized'\n", " (MPI_Initialized(&((*(Int_t*)0x7f8528093008))))\n", " ^\n", "Error in : Error evaluating expression (MPI_Initialized(&((*(Int_t*)0x7f8528093008))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "Int_t flag;\n", "MPI_Initialized(&flag);\n", "if (!flag) {\n", " MPI_Init(NULL, NULL);\n", "}" ] }, { "cell_type": "markdown", "id": "1eb9cb22", "metadata": {}, "source": [ "Get rank and size" ] }, { "cell_type": "code", "execution_count": 3, "id": "54fb0ca7", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:41.178363Z", "iopub.status.busy": "2025-12-30T13:42:41.178233Z", "iopub.status.idle": "2025-12-30T13:42:41.390580Z", "shell.execute_reply": "2025-12-30T13:42:41.390006Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_59:2:2: warning: 'rank' shadows a declaration with the same name in the 'std' namespace; use '::rank' to reference this declaration\n", " Int_t rank, size;\n", " ^\n", "input_line_59:2:2: warning: 'size' shadows a declaration with the same name in the 'std' namespace; use '::size' to reference this declaration\n", "input_line_60:2:17: error: use of undeclared identifier 'MPI_COMM_WORLD'\n", " (MPI_Comm_rank(MPI_COMM_WORLD, &((*(Int_t*)0x7f8528002000))))\n", " ^\n", "Error in : Error evaluating expression (MPI_Comm_rank(MPI_COMM_WORLD, &((*(Int_t*)0x7f8528002000))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "Int_t rank, size;\n", "MPI_Comm_rank(MPI_COMM_WORLD, &rank);\n", "MPI_Comm_size(MPI_COMM_WORLD, &size);" ] }, { "cell_type": "markdown", "id": "462cb0df", "metadata": {}, "source": [ "Procecss 0 generates JetEvent library" ] }, { "cell_type": "code", "execution_count": 4, "id": "027097d6", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:41.392336Z", "iopub.status.busy": "2025-12-30T13:42:41.392131Z", "iopub.status.idle": "2025-12-30T13:42:41.604551Z", "shell.execute_reply": "2025-12-30T13:42:41.603132Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_61:2:6: error: reference to 'rank' is ambiguous\n", " if (rank == 0) {\n", " ^\n", "input_line_59:2:8: note: candidate found by name lookup is 'rank'\n", " Int_t rank, size;\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/type_traits:1369:12: note: candidate found by name lookup is 'std::rank'\n", " struct rank\n", " ^\n", "input_line_61:2:6: error: reference to 'rank' is ambiguous\n", " if (rank == 0) {\n", " ^\n", "input_line_59:2:8: note: candidate found by name lookup is 'rank'\n", " Int_t rank, size;\n", " ^\n", "/usr/lib/gcc/x86_64-redhat-linux/11/../../../../include/c++/11/type_traits:1369:12: note: candidate found by name lookup is 'std::rank'\n", " struct rank\n", " ^\n", "input_line_61:2:11: error: expected unqualified-id\n", " if (rank == 0) {\n", " ^\n" ] } ], "source": [ "if (rank == 0) {\n", " TString tutdir = gROOT->GetTutorialDir();\n", " gSystem->Exec(\"cp \" + tutdir + \"/tree/JetEvent* .\");\n", " gROOT->ProcessLine(\".L JetEvent.cxx+\");\n", "}" ] }, { "cell_type": "markdown", "id": "5a129710", "metadata": {}, "source": [ "Wait until it's done" ] }, { "cell_type": "code", "execution_count": 5, "id": "68be09fe", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:41.609963Z", "iopub.status.busy": "2025-12-30T13:42:41.609802Z", "iopub.status.idle": "2025-12-30T13:42:41.814601Z", "shell.execute_reply": "2025-12-30T13:42:41.814081Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_63:2:15: error: use of undeclared identifier 'MPI_COMM_WORLD'\n", " (MPI_Barrier(MPI_COMM_WORLD))\n", " ^\n", "Error in : Error evaluating expression (MPI_Barrier(MPI_COMM_WORLD))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "MPI_Barrier(MPI_COMM_WORLD);\n", "\n", "gROOT->ProcessLine(\"#define TMPI_SECOND_RUN yes\");\n", "gROOT->ProcessLine(\"#include \\\"\" __FILE__ \"\\\"\");\n", "gROOT->ProcessLine(\"testTMPIFile(true)\");" ] }, { "cell_type": "markdown", "id": "833bb84a", "metadata": {}, "source": [ "TMPIFile will do MPI_Finalize() when closing the file" ] }, { "cell_type": "code", "execution_count": 6, "id": "ed415288", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:41.816353Z", "iopub.status.busy": "2025-12-30T13:42:41.816219Z", "iopub.status.idle": "2025-12-30T13:42:42.018943Z", "shell.execute_reply": "2025-12-30T13:42:42.018464Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "input_line_65:2:3: error: use of undeclared identifier 'MPI_Finalized'\n", " (MPI_Finalized(&((*(Int_t*)0x7f8518077000))))\n", " ^\n", "Error in : Error evaluating expression (MPI_Finalized(&((*(Int_t*)0x7f8518077000))))\n", "Execution of your code was aborted.\n" ] } ], "source": [ "Int_t finalized = 0;\n", "MPI_Finalized(&finalized);\n", "if (!finalized) {\n", " MPI_Finalize();\n", "}" ] }, { "cell_type": "markdown", "id": "4317cee3", "metadata": {}, "source": [ "Draw all canvases " ] }, { "cell_type": "code", "execution_count": 7, "id": "e08dcdb1", "metadata": { "collapsed": false, "execution": { "iopub.execute_input": "2025-12-30T13:42:42.023850Z", "iopub.status.busy": "2025-12-30T13:42:42.023699Z", "iopub.status.idle": "2025-12-30T13:42:42.229700Z", "shell.execute_reply": "2025-12-30T13:42:42.228452Z" } }, "outputs": [], "source": [ "gROOT->GetListOfCanvases()->Draw()" ] } ], "metadata": { "kernelspec": { "display_name": "ROOT C++", "language": "c++", "name": "root" }, "language_info": { "codemirror_mode": "text/x-c++src", "file_extension": ".C", "mimetype": " text/x-c++src", "name": "c++" } }, "nbformat": 4, "nbformat_minor": 5 }