Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/kernel_substr/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: ['vecCopy']
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/13][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:05.329437 129107433168704 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.192443 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:05.330058 129107433168704 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:05.529610 129107433168704 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:05.622953 129107433168704 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.292895 sec
   |-> [rocprofiler-sdk] W20260526 18:55:05.645513 129107433168704 generateRocpd.cpp:583] writing SQL database for process 2522684 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:05.646323 129107433168704 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522684_results.db (UUID=0001fa70-938d-738d-9e1b-e4ac754e58fa)
   |-> [rocprofiler-sdk] W20260526 18:55:05.729173 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007982 sec
   |-> [rocprofiler-sdk] W20260526 18:55:05.730348 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001160 sec
   |-> [rocprofiler-sdk] W20260526 18:55:05.732245 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001882 sec
   |-> [rocprofiler-sdk] W20260526 18:55:05.742565 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008364 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.067659 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.325079 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.069968 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002289 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.069985 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.079420 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009428 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.079435 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.079441 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.079447 129107433168704 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.079557 129107433168704 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000102 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.079773 129107433168704 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.434260 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.082779 129107433168704 simple_timer.cpp:55] [rocprofv3] output generation ::     0.458089 sec
   |-> [rocprofiler-sdk] W20260526 18:55:06.082884 129107433168704 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.459883 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522684_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/13][Approximate profiling time left: 25 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:07.631199 130712972099392 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191085 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:07.631819 130712972099392 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:07.823904 130712972099392 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:07.908057 130712972099392 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276238 sec
   |-> [rocprofiler-sdk] W20260526 18:55:07.930461 130712972099392 generateRocpd.cpp:583] writing SQL database for process 2522695 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:07.931272 130712972099392 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522695_results.db (UUID=0001fa70-9c8c-7c8c-9d5e-588a28728790)
   |-> [rocprofiler-sdk] W20260526 18:55:08.015241 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007918 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.016454 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001197 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.018433 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001964 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.028811 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008346 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.343958 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.315132 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.346230 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002254 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.346247 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.354891 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008636 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.354906 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.354912 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.354919 130712972099392 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.355053 130712972099392 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000125 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.355302 130712972099392 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.424841 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.358253 130712972099392 simple_timer.cpp:55] [rocprofv3] output generation ::     0.448552 sec
   |-> [rocprofiler-sdk] W20260526 18:55:08.358350 130712972099392 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.450246 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522695_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/13][Approximate profiling time left: 22 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:09.922530 124620756107072 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190816 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:09.923147 124620756107072 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.115238 124620756107072 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:10.200588 124620756107072 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277442 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.222842 124620756107072 generateRocpd.cpp:583] writing SQL database for process 2522703 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:10.223642 124620756107072 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522703_results.db (UUID=0001fa70-a580-7580-90c8-e3fe0dad83d5)
   |-> [rocprofiler-sdk] W20260526 18:55:10.306390 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007946 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.307607 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001200 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.309213 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001591 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.319579 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008403 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.619467 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.299874 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.621800 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002315 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.621817 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.631016 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009192 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.631038 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.631045 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.631057 124620756107072 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.631184 124620756107072 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000117 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.631425 124620756107072 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.408584 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.634377 124620756107072 simple_timer.cpp:55] [rocprofv3] output generation ::     0.432406 sec
   |-> [rocprofiler-sdk] W20260526 18:55:10.634474 124620756107072 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.433837 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522703_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/13][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:12.167242 129687527571264 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.193137 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:12.167856 129687527571264 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.361647 129687527571264 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:12.443622 129687527571264 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275766 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.465852 129687527571264 generateRocpd.cpp:583] writing SQL database for process 2522712 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:12.466650 129687527571264 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522712_results.db (UUID=0001fa70-ae43-7e43-a669-3a129514d36d)
   |-> [rocprofiler-sdk] W20260526 18:55:12.550659 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008054 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.551887 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001212 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.554072 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002170 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.565195 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.009010 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.862973 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.297688 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.865380 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002367 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.865398 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.883106 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.017700 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.883124 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.883130 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.883136 129687527571264 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.883289 129687527571264 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000146 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.884614 129687527571264 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.418762 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.889016 129687527571264 simple_timer.cpp:55] [rocprofv3] output generation ::     0.444022 sec
   |-> [rocprofiler-sdk] W20260526 18:55:12.889304 129687527571264 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.445633 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522712_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/13][Approximate profiling time left: 18 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:14.446093 132997897002816 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.195510 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:14.446791 132997897002816 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:14.640842 132997897002816 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:14.733309 132997897002816 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.286518 sec
   |-> [rocprofiler-sdk] W20260526 18:55:14.755751 132997897002816 generateRocpd.cpp:583] writing SQL database for process 2522728 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:14.756589 132997897002816 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522728_results.db (UUID=0001fa70-b727-7727-9229-748db593c53b)
   |-> [rocprofiler-sdk] W20260526 18:55:14.838153 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007892 sec
   |-> [rocprofiler-sdk] W20260526 18:55:14.839280 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001111 sec
   |-> [rocprofiler-sdk] W20260526 18:55:14.840845 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001551 sec
   |-> [rocprofiler-sdk] W20260526 18:55:14.851076 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008276 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.134269 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.283173 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.136658 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002346 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.136675 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.146204 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009521 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.146219 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.146225 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.146232 132997897002816 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.146350 132997897002816 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000110 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.146573 132997897002816 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.390822 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.149626 132997897002816 simple_timer.cpp:55] [rocprofv3] output generation ::     0.414737 sec
   |-> [rocprofiler-sdk] W20260526 18:55:15.149718 132997897002816 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.416358 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522728_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/13][Approximate profiling time left: 15 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:16.661627 137155953704768 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.183624 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:16.662244 137155953704768 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:16.866972 137155953704768 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:16.956477 137155953704768 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.294233 sec
   |-> [rocprofiler-sdk] W20260526 18:55:16.978489 137155953704768 generateRocpd.cpp:583] writing SQL database for process 2522738 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:16.979281 137155953704768 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522738_results.db (UUID=0001fa70-bfda-7fda-94c7-f0daf9841d0d)
   |-> [rocprofiler-sdk] W20260526 18:55:17.062706 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007784 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.063901 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001179 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.065527 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001611 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.076101 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008544 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.084695 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.008579 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.086798 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002088 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.086815 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.095172 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008350 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.095187 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.095193 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.095199 137155953704768 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.095296 137155953704768 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000089 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.095484 137155953704768 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.116995 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.098375 137155953704768 simple_timer.cpp:55] [rocprofv3] output generation ::     0.140423 sec
   |-> [rocprofiler-sdk] W20260526 18:55:17.098420 137155953704768 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.141904 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522738_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/13][Approximate profiling time left: 13 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:18.605683 138283730952000 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.188147 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:18.606272 138283730952000 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:18.799689 138283730952000 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:18.884987 138283730952000 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.278715 sec
   |-> [rocprofiler-sdk] W20260526 18:55:18.907309 138283730952000 generateRocpd.cpp:583] writing SQL database for process 2522747 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:18.908101 138283730952000 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522747_results.db (UUID=0001fa70-c76e-776e-a398-00e7fd57cb7a)
   |-> [rocprofiler-sdk] W20260526 18:55:18.991160 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008199 sec
   |-> [rocprofiler-sdk] W20260526 18:55:18.992252 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001075 sec
   |-> [rocprofiler-sdk] W20260526 18:55:18.994163 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001897 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.004584 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008469 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.414509 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.409910 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.416833 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002303 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.416851 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.425332 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008474 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.425347 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.425354 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.425361 138283730952000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.425492 138283730952000 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000123 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.425734 138283730952000 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.518425 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.428692 138283730952000 simple_timer.cpp:55] [rocprofv3] output generation ::     0.541989 sec
   |-> [rocprofiler-sdk] W20260526 18:55:19.428805 138283730952000 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.543753 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522747_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/13][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:20.954904 139428973342528 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189553 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:20.955538 139428973342528 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.147780 139428973342528 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:21.239972 139428973342528 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.284434 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.262615 139428973342528 generateRocpd.cpp:583] writing SQL database for process 2522755 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:21.263407 139428973342528 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522755_results.db (UUID=0001fa70-d09a-709a-acc7-f820d6eed437)
   |-> [rocprofiler-sdk] W20260526 18:55:21.348091 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008228 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.349340 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001227 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.351526 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002171 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.362049 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008363 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.769466 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.407402 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.772016 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002531 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.772043 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.780663 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008606 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.780680 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.780689 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.780701 139428973342528 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.780806 139428973342528 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.781028 139428973342528 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.518413 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.783979 139428973342528 simple_timer.cpp:55] [rocprofv3] output generation ::     0.542258 sec
   |-> [rocprofiler-sdk] W20260526 18:55:21.784102 139428973342528 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.544085 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522755_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/13][Approximate profiling time left: 9 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:23.359698 123862459301696 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.199235 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:23.360303 123862459301696 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:23.553817 123862459301696 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:23.641228 123862459301696 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.280925 sec
   |-> [rocprofiler-sdk] W20260526 18:55:23.663642 123862459301696 generateRocpd.cpp:583] writing SQL database for process 2522763 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:23.664456 123862459301696 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522763_results.db (UUID=0001fa70-d9f5-79f5-9c9f-6fc828dc2c26)
   |-> [rocprofiler-sdk] W20260526 18:55:23.749399 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008330 sec
   |-> [rocprofiler-sdk] W20260526 18:55:23.750641 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001225 sec
   |-> [rocprofiler-sdk] W20260526 18:55:23.752830 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002174 sec
   |-> [rocprofiler-sdk] W20260526 18:55:23.763609 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008617 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.358079 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.594405 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.360520 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002418 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.360537 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.369607 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009063 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.369623 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.369629 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.369635 123862459301696 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.369751 123862459301696 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000103 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.369977 123862459301696 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.706335 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.372889 123862459301696 simple_timer.cpp:55] [rocprofv3] output generation ::     0.730054 sec
   |-> [rocprofiler-sdk] W20260526 18:55:24.373015 123862459301696 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.731737 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522763_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/13][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:25.925290 132639022456640 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190748 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:25.925892 132639022456640 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.117533 132639022456640 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:26.209882 132639022456640 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.283990 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.232427 132639022456640 generateRocpd.cpp:583] writing SQL database for process 2522771 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:26.233217 132639022456640 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522771_results.db (UUID=0001fa70-e403-7403-b9c5-b02f8c843027)
   |-> [rocprofiler-sdk] W20260526 18:55:26.313215 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008025 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.314416 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001185 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.316533 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002102 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.326894 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008321 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.667932 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.341023 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.670154 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002203 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.670172 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.679348 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009170 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.679363 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.679369 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.679376 132639022456640 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.679487 132639022456640 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000099 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.679696 132639022456640 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.447269 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.682673 132639022456640 simple_timer.cpp:55] [rocprofv3] output generation ::     0.470981 sec
   |-> [rocprofiler-sdk] W20260526 18:55:26.682782 132639022456640 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.472858 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522771_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/13][Approximate profiling time left: 4 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:28.214584 124544340082496 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.195701 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:28.215184 124544340082496 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.409192 124544340082496 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:28.492249 124544340082496 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277066 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.514769 124544340082496 generateRocpd.cpp:583] writing SQL database for process 2522780 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:28.515588 124544340082496 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522780_results.db (UUID=0001fa70-ecef-7cef-95ae-80feb883a199)
   |-> [rocprofiler-sdk] W20260526 18:55:28.599791 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008100 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.601015 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001208 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.603001 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001971 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.613494 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008441 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.946823 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.333314 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.949222 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002375 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.949240 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.958823 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009575 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.958837 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.958844 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.958850 124544340082496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.958958 124544340082496 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000100 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.959171 124544340082496 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.444402 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.962116 124544340082496 simple_timer.cpp:55] [rocprofv3] output generation ::     0.468194 sec
   |-> [rocprofiler-sdk] W20260526 18:55:28.962210 124544340082496 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.469909 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522780_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/13][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:30.516509 124049212997440 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.197337 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:30.517158 124049212997440 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:30.710851 124049212997440 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:30.805376 124049212997440 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.288218 sec
   |-> [rocprofiler-sdk] W20260526 18:55:30.827755 124049212997440 generateRocpd.cpp:583] writing SQL database for process 2522788 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:30.828538 124049212997440 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522788_results.db (UUID=0001fa70-f5ec-75ec-b643-44968b64bc70)
   |-> [rocprofiler-sdk] W20260526 18:55:30.911687 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008210 sec
   |-> [rocprofiler-sdk] W20260526 18:55:30.912904 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001200 sec
   |-> [rocprofiler-sdk] W20260526 18:55:30.914884 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001965 sec
   |-> [rocprofiler-sdk] W20260526 18:55:30.925218 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008365 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.444111 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.518878 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.446364 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002232 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.446381 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.456250 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009862 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.456265 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.456271 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.456278 124049212997440 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.456392 124049212997440 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000102 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.456638 124049212997440 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.628883 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.459689 124049212997440 simple_timer.cpp:55] [rocprofv3] output generation ::     0.652760 sec
   |-> [rocprofiler-sdk] W20260526 18:55:31.459809 124049212997440 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.654393 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522788_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 13/13][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/kernel_substr/MI200/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:55:32.997243 137571919667008 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191243 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:55:32.997837 137571919667008 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.190865 137571919667008 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:55:33.273210 137571919667008 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275374 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.295664 137571919667008 generateRocpd.cpp:583] writing SQL database for process 2522796 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:55:33.296461 137571919667008 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_substr/MI200/out/pmc_1/smc4124-25-mi210-3c48/2522796_results.db (UUID=0001fa70-ffa2-7fa2-bfd7-baab2e502f70)
   |-> [rocprofiler-sdk] W20260526 18:55:33.378991 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008027 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.380214 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001207 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.381808 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001579 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.392264 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008473 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.716064 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.323785 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.718365 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002284 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.718382 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.728177 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009788 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.728191 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.728197 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.728204 137571919667008 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.728321 137571919667008 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000110 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.728557 137571919667008 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.432893 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.731479 137571919667008 simple_timer.cpp:55] [rocprofv3] output generation ::     0.456605 sec
   |-> [rocprofiler-sdk] W20260526 18:55:33.731575 137571919667008 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.458318 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_substr/MI200/out/pmc_1/2522796_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Checking for roofline.csv in tests/workloads/kernel_substr/MI200
[roofline] Benchmark execution failed: 'L1'. Skipping roofline.
