Following counters might not be supported by rocprof: SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_MFMA_MOPS_F32
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: ['6:8']
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[ 33%] Built target fmt
[ 33%] Built target gsl_assert
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:29.463568 136588306935616 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307728 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:29.473787 136588306935616 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:29.685138 136588306935616 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:29.810951 136588306935616 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.337164 sec
   |-> [rocprofiler-sdk] W20260526 16:57:29.849954 136588306935616 generateRocpd.cpp:582] writing SQL database for process 2386485 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:29.851258 136588306935616 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386485_results.db (UUID=0000431c-31ec-71ec-a66e-5664e24c0b83)
   |-> [rocprofiler-sdk] W20260526 16:57:29.941373 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014369 sec
   |-> [rocprofiler-sdk] W20260526 16:57:29.942489 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001079 sec
   |-> [rocprofiler-sdk] W20260526 16:57:29.944970 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002453 sec
   |-> [rocprofiler-sdk] W20260526 16:57:29.949901 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003012 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.019041 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.069111 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.021891 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002820 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.021921 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.037324 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015388 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.037352 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.037364 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.037376 136588306935616 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.037586 136588306935616 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000190 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.038034 136588306935616 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.188081 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.043801 136588306935616 simple_timer.cpp:55] [rocprofv3] output generation ::     0.230291 sec
   |-> [rocprofiler-sdk] W20260526 16:57:30.043889 136588306935616 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.232854 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 2/12][Approximate profiling time left: 32 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:32.556908 128811944152896 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303807 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:32.567042 128811944152896 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:32.777819 128811944152896 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:32.906180 128811944152896 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.339139 sec
   |-> [rocprofiler-sdk] W20260526 16:57:32.945439 128811944152896 generateRocpd.cpp:582] writing SQL database for process 2386496 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:32.946737 128811944152896 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386496_results.db (UUID=0000431c-3e05-7e05-a94b-a8bfc2ad2710)
   |-> [rocprofiler-sdk] W20260526 16:57:33.033516 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014141 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.034596 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001048 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.037017 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002392 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.041933 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003060 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.104863 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.062899 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.107299 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002404 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.107328 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.123514 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016171 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.123546 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.123558 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.123570 128811944152896 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.123803 128811944152896 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000212 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.124330 128811944152896 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.178891 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.130266 128811944152896 simple_timer.cpp:55] [rocprofv3] output generation ::     0.221587 sec
   |-> [rocprofiler-sdk] W20260526 16:57:33.130352 128811944152896 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.224120 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 3/12][Approximate profiling time left: 28 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:35.622216 129695851339584 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304402 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:35.632178 129695851339584 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:35.843578 129695851339584 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:35.966433 129695851339584 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.334255 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.005631 129695851339584 generateRocpd.cpp:582] writing SQL database for process 2386507 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:36.006918 129695851339584 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386507_results.db (UUID=0000431c-49fe-79fe-83ce-839cfd8df78d)
   |-> [rocprofiler-sdk] W20260526 16:57:36.096764 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014394 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.097939 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001145 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.100542 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002575 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.105562 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003076 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.173833 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068243 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.176662 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002798 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.176691 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.192813 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016107 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.192843 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.192855 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.192868 129695851339584 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.193085 129695851339584 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000199 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.193546 129695851339584 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.187915 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.199411 129695851339584 simple_timer.cpp:55] [rocprofv3] output generation ::     0.230474 sec
   |-> [rocprofiler-sdk] W20260526 16:57:36.199506 129695851339584 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.233026 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 4/12][Approximate profiling time left: 25 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:38.702286 132769848790848 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.302296 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:38.712336 132769848790848 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:38.926101 132769848790848 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:39.048996 132769848790848 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.336660 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.088034 132769848790848 generateRocpd.cpp:582] writing SQL database for process 2386517 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:39.089315 132769848790848 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386517_results.db (UUID=0000431c-5608-7608-935d-7cb8d1326db1)
   |-> [rocprofiler-sdk] W20260526 16:57:39.177736 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014356 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.178865 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001097 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.181324 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002431 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.186285 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003046 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.258472 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.072159 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.261286 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002783 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.261315 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.277619 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016289 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.277652 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.277664 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.277677 132769848790848 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.277901 132769848790848 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000203 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.278511 132769848790848 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.190478 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.284481 132769848790848 simple_timer.cpp:55] [rocprofv3] output generation ::     0.232950 sec
   |-> [rocprofiler-sdk] W20260526 16:57:39.284575 132769848790848 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.235529 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 5/12][Approximate profiling time left: 21 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:41.779041 136522245173056 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304930 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:41.787679 136522245173056 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:41.999162 136522245173056 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:42.124816 136522245173056 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.337137 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.163851 136522245173056 generateRocpd.cpp:582] writing SQL database for process 2386527 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:42.165152 136522245173056 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386527_results.db (UUID=0000431c-620a-720a-85d8-97840dea4686)
   |-> [rocprofiler-sdk] W20260526 16:57:42.255268 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014298 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.256389 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001090 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.258549 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002132 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.263612 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003122 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.326514 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.062873 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.329372 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002828 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.329401 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.345322 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015906 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.345349 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.345361 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.345373 136522245173056 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.345565 136522245173056 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000176 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.345990 136522245173056 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.182139 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.351926 136522245173056 simple_timer.cpp:55] [rocprofv3] output generation ::     0.224560 sec
   |-> [rocprofiler-sdk] W20260526 16:57:42.352023 136522245173056 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.227158 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 6/12][Approximate profiling time left: 18 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:44.854221 130705869250368 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306300 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:44.864693 130705869250368 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.075427 130705869250368 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:45.200761 130705869250368 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.336068 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.240098 130705869250368 generateRocpd.cpp:582] writing SQL database for process 2386537 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:45.241369 130705869250368 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386537_results.db (UUID=0000431c-6e0c-7e0c-856e-2b70dabaa8b3)
   |-> [rocprofiler-sdk] W20260526 16:57:45.329416 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014171 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.330539 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001091 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.333021 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002454 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.338093 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003109 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.392081 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.053960 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.394921 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002809 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.394950 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.410273 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015308 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.410301 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.410314 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.410325 130705869250368 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.410528 130705869250368 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000182 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.410950 130705869250368 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.170853 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.416688 130705869250368 simple_timer.cpp:55] [rocprofv3] output generation ::     0.213300 sec
   |-> [rocprofiler-sdk] W20260526 16:57:45.416772 130705869250368 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.215961 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 7/12][Approximate profiling time left: 15 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:47.912190 138711107239744 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305091 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:47.921864 138711107239744 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.132704 138711107239744 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:48.257036 138711107239744 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.335172 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.296012 138711107239744 generateRocpd.cpp:582] writing SQL database for process 2386547 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:48.297298 138711107239744 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386547_results.db (UUID=0000431c-79ff-79ff-9e30-c344be97311c)
   |-> [rocprofiler-sdk] W20260526 16:57:48.385161 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014092 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.386262 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001069 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.388412 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002122 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.393414 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003084 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.444421 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.050979 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.447108 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002658 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.447137 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.462630 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015478 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.462661 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.462673 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.462685 138711107239744 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.462897 138711107239744 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000194 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.463396 138711107239744 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.167385 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.469295 138711107239744 simple_timer.cpp:55] [rocprofv3] output generation ::     0.209719 sec
   |-> [rocprofiler-sdk] W20260526 16:57:48.469379 138711107239744 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.212293 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 8/12][Approximate profiling time left: 12 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:50.950219 138048757571392 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304163 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:50.960574 138048757571392 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.171603 138048757571392 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:51.297474 138048757571392 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.336901 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.336798 138048757571392 generateRocpd.cpp:582] writing SQL database for process 2386558 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:51.338118 138048757571392 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386558_results.db (UUID=0000431c-85de-75de-8a17-dc64f4dfd514)
   |-> [rocprofiler-sdk] W20260526 16:57:51.429235 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014715 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.430401 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001127 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.433014 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002578 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.438140 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003145 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.532915 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094747 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.535770 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002825 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.535801 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.551341 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015525 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.551368 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.551381 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.551392 138048757571392 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.551604 138048757571392 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000191 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.552072 138048757571392 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.215274 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.557834 138048757571392 simple_timer.cpp:55] [rocprofv3] output generation ::     0.257609 sec
   |-> [rocprofiler-sdk] W20260526 16:57:51.557920 138048757571392 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.260395 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 9/12][Approximate profiling time left: 9 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:54.059745 130436183654208 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304238 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:54.070135 130436183654208 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.280601 130436183654208 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:54.407523 130436183654208 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.337388 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.447053 130436183654208 generateRocpd.cpp:582] writing SQL database for process 2386568 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:54.448357 130436183654208 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386568_results.db (UUID=0000431c-9204-7204-8248-f444a23c6fa9)
   |-> [rocprofiler-sdk] W20260526 16:57:54.533001 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014504 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.534062 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001031 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.536487 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002396 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.541247 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002941 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.634349 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.093074 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.637219 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002821 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.637249 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.653422 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016156 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.653450 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.653466 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.653480 130436183654208 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.653676 130436183654208 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000184 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.654184 130436183654208 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.207131 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.659987 130436183654208 simple_timer.cpp:55] [rocprofv3] output generation ::     0.249843 sec
   |-> [rocprofiler-sdk] W20260526 16:57:54.660076 130436183654208 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.252501 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 10/12][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:57:56.941514 126405896220480 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.309801 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:57:56.951489 126405896220480 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.162308 126405896220480 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:57:57.291247 126405896220480 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.339758 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.325654 126405896220480 generateRocpd.cpp:582] writing SQL database for process 2386580 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:57:57.326689 126405896220480 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386580_results.db (UUID=0000431c-9d40-7d40-9328-57030b6085f9)
   |-> [rocprofiler-sdk] W20260526 16:57:57.401371 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.011260 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.402341 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000947 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.404401 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002038 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.408578 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002513 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.512532 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.103932 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.514943 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002376 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.514966 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.527303 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.012319 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.527324 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.527333 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.527342 126405896220480 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.527501 126405896220480 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000148 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.527832 126405896220480 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.202179 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.532064 126405896220480 simple_timer.cpp:55] [rocprofv3] output generation ::     0.238342 sec
   |-> [rocprofiler-sdk] W20260526 16:57:57.532137 126405896220480 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.240806 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 11/12][Approximate profiling time left: 3 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:58:00.051958 134347152338752 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.311432 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:58:00.062071 134347152338752 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.273394 134347152338752 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:58:00.397816 134347152338752 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.335745 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.436623 134347152338752 generateRocpd.cpp:582] writing SQL database for process 2386590 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:58:00.437950 134347152338752 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386590_results.db (UUID=0000431c-a964-7964-b4ea-b23590142ef2)
   |-> [rocprofiler-sdk] W20260526 16:58:00.528989 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014915 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.530156 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001136 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.532783 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002599 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.537843 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003113 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.661773 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.123902 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.664653 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002850 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.664682 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.680350 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015653 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.680379 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.680391 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.680403 134347152338752 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.680616 134347152338752 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000193 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.681071 134347152338752 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.244449 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.686912 134347152338752 simple_timer.cpp:55] [rocprofv3] output generation ::     0.286528 sec
   |-> [rocprofiler-sdk] W20260526 16:58:00.687018 134347152338752 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.289149 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:58:03.207954 138831293124416 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307354 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:58:03.217073 138831293124416 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.428198 138831293124416 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:58:03.552092 138831293124416 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.335019 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.590938 138831293124416 generateRocpd.cpp:582] writing SQL database for process 2386613 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:58:03.592216 138831293124416 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_6_8/MI100/out/pmc_1/dl385-20-mi100-3c48/2386613_results.db (UUID=0000431c-b5bd-75bd-8694-d90b63092af2)
   |-> [rocprofiler-sdk] W20260526 16:58:03.680953 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014309 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.682101 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001106 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.684675 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002546 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.689711 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003143 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.765846 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076107 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.768660 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002784 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.768690 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.784451 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015746 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.784481 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.784493 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.784505 138831293124416 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.784716 138831293124416 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000198 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.785251 138831293124416 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.194313 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.791158 138831293124416 simple_timer.cpp:55] [rocprofv3] output generation ::     0.236574 sec
   |-> [rocprofiler-sdk] W20260526 16:58:03.791252 138831293124416 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.239110 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
No GPU kernel data collected. The workload may not have dispatched any GPU kernels.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
