Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/join_type_grid/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/13][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:37.175964 123867557879616 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189791 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:37.176596 123867557879616 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.370582 123867557879616 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:37.469477 123867557879616 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.292881 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.491477 123867557879616 generateRocpd.cpp:583] writing SQL database for process 2521681 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:37.492263 123867557879616 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521681_results.db (UUID=0001fa6d-6676-7676-95c8-56b2704f2ea5)
   |-> [rocprofiler-sdk] W20260526 18:51:37.574602 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.575801 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001183 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.577449 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001633 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.587775 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008384 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.923165 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.335374 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.925533 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002352 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.925551 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.934596 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009038 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.934610 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.934617 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.934623 123867557879616 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.934727 123867557879616 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.934911 123867557879616 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.443434 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.937831 123867557879616 simple_timer.cpp:55] [rocprofv3] output generation ::     0.466993 sec
   |-> [rocprofiler-sdk] W20260526 18:51:37.937936 123867557879616 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.468415 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521681_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/13][Approximate profiling time left: 25 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:39.487558 134563496550208 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189989 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:39.488194 134563496550208 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:39.681855 134563496550208 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:39.771324 134563496550208 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.283130 sec
   |-> [rocprofiler-sdk] W20260526 18:51:39.793491 134563496550208 generateRocpd.cpp:583] writing SQL database for process 2521690 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:39.794335 134563496550208 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521690_results.db (UUID=0001fa6d-6f7e-7f7e-bba6-d794afd97207)
   |-> [rocprofiler-sdk] W20260526 18:51:39.877007 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007952 sec
   |-> [rocprofiler-sdk] W20260526 18:51:39.878230 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001206 sec
   |-> [rocprofiler-sdk] W20260526 18:51:39.879827 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001582 sec
   |-> [rocprofiler-sdk] W20260526 18:51:39.890213 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008417 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.201911 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.311681 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.204339 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002398 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.204358 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.213138 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008773 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.213152 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.213159 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.213165 134563496550208 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.213274 134563496550208 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000098 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.213483 134563496550208 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.419992 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.216436 134563496550208 simple_timer.cpp:55] [rocprofv3] output generation ::     0.443637 sec
   |-> [rocprofiler-sdk] W20260526 18:51:40.216534 134563496550208 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.445161 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521690_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/13][Approximate profiling time left: 22 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:41.745078 138776476196672 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191766 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:41.745720 138776476196672 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:41.943692 138776476196672 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:42.026292 138776476196672 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.280573 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.048419 138776476196672 generateRocpd.cpp:583] writing SQL database for process 2521708 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:42.049214 138776476196672 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521708_results.db (UUID=0001fa6d-784e-784e-a53b-b5157c66e641)
   |-> [rocprofiler-sdk] W20260526 18:51:42.131335 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007888 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.132538 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001187 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.134243 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001691 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.144890 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008457 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.442713 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.297808 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.445065 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002333 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.445082 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.453791 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008702 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.453806 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.453812 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.453819 138776476196672 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.453936 138776476196672 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000110 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.454167 138776476196672 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.405749 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.457104 138776476196672 simple_timer.cpp:55] [rocprofv3] output generation ::     0.429333 sec
   |-> [rocprofiler-sdk] W20260526 18:51:42.457202 138776476196672 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.430864 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521708_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/13][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:43.988743 140462389272384 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189374 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:43.989327 140462389272384 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.182562 140462389272384 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:44.266518 140462389272384 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277192 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.288723 140462389272384 generateRocpd.cpp:583] writing SQL database for process 2521717 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:44.289491 140462389272384 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521717_results.db (UUID=0001fa6d-8114-7114-8dee-a96933e2799a)
   |-> [rocprofiler-sdk] W20260526 18:51:44.372865 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008146 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.374098 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001215 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.376024 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001912 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.386404 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008377 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.673300 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.286881 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.675660 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002342 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.675678 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.684420 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008735 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.684434 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.684441 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.684447 140462389272384 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.684556 140462389272384 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.684774 140462389272384 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.396052 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.687711 140462389272384 simple_timer.cpp:55] [rocprofv3] output generation ::     0.419533 sec
   |-> [rocprofiler-sdk] W20260526 18:51:44.687798 140462389272384 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.421243 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521717_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/13][Approximate profiling time left: 18 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:46.210612 127248257081152 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.188411 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:46.211213 127248257081152 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.404265 127248257081152 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:46.486350 127248257081152 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275137 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.508614 127248257081152 generateRocpd.cpp:583] writing SQL database for process 2521725 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:46.509449 127248257081152 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521725_results.db (UUID=0001fa6d-89c3-79c3-b2a9-8cbd3329d770)
   |-> [rocprofiler-sdk] W20260526 18:51:46.591243 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007995 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.592452 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001191 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.594018 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001551 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.604604 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008643 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.884663 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.280044 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.887004 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002324 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.887022 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.896470 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009434 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.896485 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.896491 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.896498 127248257081152 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.896609 127248257081152 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000100 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.896825 127248257081152 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.388212 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.899746 127248257081152 simple_timer.cpp:55] [rocprofv3] output generation ::     0.411942 sec
   |-> [rocprofiler-sdk] W20260526 18:51:46.899841 127248257081152 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.413442 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521725_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/13][Approximate profiling time left: 15 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:48.410269 140554677845824 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190883 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:48.410850 140554677845824 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.604470 140554677845824 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:48.691519 140554677845824 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.280669 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.713925 140554677845824 generateRocpd.cpp:583] writing SQL database for process 2521734 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:48.714747 140554677845824 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521734_results.db (UUID=0001fa6d-9258-7258-a36d-c86cb3c87436)
   |-> [rocprofiler-sdk] W20260526 18:51:48.795602 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007670 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.796769 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001150 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.798322 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001538 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.808738 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008500 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.817250 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.008497 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.819333 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002068 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.819350 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.828151 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008793 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.828166 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.828172 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.828179 140554677845824 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.828282 140554677845824 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.828467 140554677845824 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.114543 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.831378 140554677845824 simple_timer.cpp:55] [rocprofv3] output generation ::     0.138557 sec
   |-> [rocprofiler-sdk] W20260526 18:51:48.831432 140554677845824 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.139867 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521734_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/13][Approximate profiling time left: 13 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:50.369708 125471585476416 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.192215 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:50.370295 125471585476416 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:50.568309 125471585476416 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:50.652629 125471585476416 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282333 sec
   |-> [rocprofiler-sdk] W20260526 18:51:50.675355 125471585476416 generateRocpd.cpp:583] writing SQL database for process 2521742 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:50.676189 125471585476416 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521742_results.db (UUID=0001fa6d-99fe-79fe-a331-0005b6a9bb54)
   |-> [rocprofiler-sdk] W20260526 18:51:50.758168 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008262 sec
   |-> [rocprofiler-sdk] W20260526 18:51:50.759357 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001172 sec
   |-> [rocprofiler-sdk] W20260526 18:51:50.761326 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001955 sec
   |-> [rocprofiler-sdk] W20260526 18:51:50.771774 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008506 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.181834 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.410045 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.184250 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002397 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.184268 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.193561 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009286 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.193576 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.193582 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.193589 125471585476416 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.193696 125471585476416 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000099 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.193916 125471585476416 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.518562 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.196890 125471585476416 simple_timer.cpp:55] [rocprofv3] output generation ::     0.542447 sec
   |-> [rocprofiler-sdk] W20260526 18:51:51.197010 125471585476416 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.544331 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521742_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/13][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:52.725794 138226252054336 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.188243 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:52.726372 138226252054336 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:52.920970 138226252054336 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:53.005809 138226252054336 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279437 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.028806 138226252054336 generateRocpd.cpp:583] writing SQL database for process 2521750 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:53.029618 138226252054336 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521750_results.db (UUID=0001fa6d-a336-7336-ab80-b141f0029cf8)
   |-> [rocprofiler-sdk] W20260526 18:51:53.111451 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008083 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.112664 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001196 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.114632 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001953 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.125018 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008416 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.527763 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.402723 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.530067 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002285 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.530084 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.538609 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008518 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.538624 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.538630 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.538637 138226252054336 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.538745 138226252054336 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.538956 138226252054336 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.510151 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.541875 138226252054336 simple_timer.cpp:55] [rocprofv3] output generation ::     0.533634 sec
   |-> [rocprofiler-sdk] W20260526 18:51:53.541980 138226252054336 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.536123 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521750_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/13][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:55.116475 130790193602368 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.198656 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:55.117075 130790193602368 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:55.311934 130790193602368 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:55.394491 130790193602368 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277416 sec
   |-> [rocprofiler-sdk] W20260526 18:51:55.416483 130790193602368 generateRocpd.cpp:583] writing SQL database for process 2521758 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:55.417287 130790193602368 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521758_results.db (UUID=0001fa6d-ac82-7c82-b285-181334a97155)
   |-> [rocprofiler-sdk] W20260526 18:51:55.499541 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008310 sec
   |-> [rocprofiler-sdk] W20260526 18:51:55.500764 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001208 sec
   |-> [rocprofiler-sdk] W20260526 18:51:55.502732 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001953 sec
   |-> [rocprofiler-sdk] W20260526 18:51:55.513284 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008583 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.095932 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.582631 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.099155 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.003185 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.099185 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000005 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.107932 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008735 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.107947 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.107953 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.107960 130790193602368 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.108082 130790193602368 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000115 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.108317 130790193602368 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.691834 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.111261 130790193602368 simple_timer.cpp:55] [rocprofv3] output generation ::     0.715382 sec
   |-> [rocprofiler-sdk] W20260526 18:51:56.111396 130790193602368 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.716858 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521758_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/13][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:57.646226 133503854145344 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191275 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:57.646862 133503854145344 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:57.840973 133503854145344 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:51:57.923189 133503854145344 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276328 sec
   |-> [rocprofiler-sdk] W20260526 18:51:57.945393 133503854145344 generateRocpd.cpp:583] writing SQL database for process 2521767 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:51:57.946195 133503854145344 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521767_results.db (UUID=0001fa6d-b66b-766b-bc25-0dad015b6ec9)
   |-> [rocprofiler-sdk] W20260526 18:51:58.029363 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008171 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.030594 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001213 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.032774 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002165 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.043455 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008401 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.385634 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.342162 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.387923 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002271 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.387941 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.396772 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008824 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.396787 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.396793 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.396799 133503854145344 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.396907 133503854145344 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000100 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.397129 133503854145344 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.451736 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.400041 133503854145344 simple_timer.cpp:55] [rocprofv3] output generation ::     0.475255 sec
   |-> [rocprofiler-sdk] W20260526 18:51:58.400147 133503854145344 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.476911 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521767_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/13][Approximate profiling time left: 4 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:51:59.950832 133760065036096 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190864 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:51:59.951422 133760065036096 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.147126 133760065036096 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:52:00.229018 133760065036096 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277597 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.251528 133760065036096 generateRocpd.cpp:583] writing SQL database for process 2521775 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:52:00.252300 133760065036096 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521775_results.db (UUID=0001fa6d-bf6c-7f6c-94a8-19c48e556767)
   |-> [rocprofiler-sdk] W20260526 18:52:00.335871 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008199 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.337434 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001545 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.339401 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001952 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.349822 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008412 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.681467 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.331629 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.683806 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002316 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.683824 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.693531 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009700 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.693545 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.693552 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.693559 133760065036096 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.693692 133760065036096 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000098 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.693896 133760065036096 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.442369 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.696798 133760065036096 simple_timer.cpp:55] [rocprofv3] output generation ::     0.466058 sec
   |-> [rocprofiler-sdk] W20260526 18:52:00.696891 133760065036096 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.467811 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521775_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/13][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:52:02.279403 132731292950336 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.195064 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:52:02.279993 132731292950336 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:02.475755 132731292950336 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:52:02.561892 132731292950336 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.281899 sec
   |-> [rocprofiler-sdk] W20260526 18:52:02.584768 132731292950336 generateRocpd.cpp:583] writing SQL database for process 2521783 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:52:02.585571 132731292950336 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521783_results.db (UUID=0001fa6d-c881-7881-96c3-80abdc364b94)
   |-> [rocprofiler-sdk] W20260526 18:52:02.669624 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008164 sec
   |-> [rocprofiler-sdk] W20260526 18:52:02.670844 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001204 sec
   |-> [rocprofiler-sdk] W20260526 18:52:02.672847 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001987 sec
   |-> [rocprofiler-sdk] W20260526 18:52:02.683313 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008457 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.201642 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.518313 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.204110 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002439 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.204128 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.213978 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009842 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.213994 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.214001 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.214008 132731292950336 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.214142 132731292950336 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000126 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.214391 132731292950336 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.629623 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.217535 132731292950336 simple_timer.cpp:55] [rocprofv3] output generation ::     0.653824 sec
   |-> [rocprofiler-sdk] W20260526 18:52:03.217668 132731292950336 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.655732 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521783_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 13/13][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/join_type_grid/MI200/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:52:04.779474 139676856950592 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190546 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:52:04.780091 139676856950592 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:04.973067 139676856950592 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:52:05.056464 139676856950592 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276373 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.078794 139676856950592 generateRocpd.cpp:583] writing SQL database for process 2521792 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:52:05.079614 139676856950592 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_grid/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521792_results.db (UUID=0001fa6d-d249-7249-a67c-ece1c74467a8)
   |-> [rocprofiler-sdk] W20260526 18:52:05.163811 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008119 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.165015 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001188 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.166688 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001650 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.177389 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008530 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.498227 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.320822 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.500580 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002321 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.500599 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.510041 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009434 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.510056 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.510063 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.510070 139676856950592 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.510177 139676856950592 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.510399 139676856950592 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.431605 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.513291 139676856950592 simple_timer.cpp:55] [rocprofv3] output generation ::     0.455351 sec
   |-> [rocprofiler-sdk] W20260526 18:52:05.513392 139676856950592 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.456878 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_grid/MI200/out/pmc_1/2521792_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Checking for roofline.csv in tests/workloads/join_type_grid/MI200
[roofline] Benchmark execution failed: 'L1'. Skipping roofline.
