alias: cu_ins, block id: 10
alias: cu_pipe, block id: 11
alias: cpc, block id: 5
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ_CPC/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: ['cu_ins', 'cu_pipe', 'cpc']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[ 33%] Built target fmt
[ 33%] Built target gsl_assert
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/7][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:35.318889 134850181816128 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.183005 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:35.319519 134850181816128 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.511375 134850181816128 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:35.606406 134850181816128 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.286887 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.628650 134850181816128 generateRocpd.cpp:583] writing SQL database for process 2525008 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:03:35.629446 134850181816128 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525008_results.db (UUID=0001fa78-5bbc-7bbc-b5d3-0ba76aa654ca)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.710882 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007781 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.712123 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001225 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.713691 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001552 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.724183 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008592 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.821758 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.097560 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.824042 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002269 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.824059 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.832418 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008351 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.832433 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.832439 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.832446 134850181816128 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.832548 134850181816128 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.832773 134850181816128 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.204124 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.835611 134850181816128 simple_timer.cpp:55] [rocprofv3] output generation ::     0.227718 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:35.835677 134850181816128 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.229226 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525008_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/7][Approximate profiling time left: 10 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:37.337114 135541431680832 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.183992 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:37.337699 135541431680832 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.531164 135541431680832 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:37.612623 135541431680832 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.274925 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.634696 135541431680832 generateRocpd.cpp:583] writing SQL database for process 2525017 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:03:37.635496 135541431680832 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525017_results.db (UUID=0001fa78-639d-739d-b191-efdfd2951e2e)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.717773 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007780 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.718992 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001202 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.720695 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001688 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.731251 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008365 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.820040 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.088763 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.822360 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002304 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.822377 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.830973 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008588 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.830987 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.830994 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.831000 135541431680832 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.831108 135541431680832 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000099 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.831309 135541431680832 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.196614 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.834225 135541431680832 simple_timer.cpp:55] [rocprofv3] output generation ::     0.220258 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:37.834292 135541431680832 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.221626 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525017_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/7][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:39.325333 128635208351552 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.182047 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:39.325933 128635208351552 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.519619 128635208351552 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:39.605498 128635208351552 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279565 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.629108 128635208351552 generateRocpd.cpp:583] writing SQL database for process 2525025 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:03:39.629837 128635208351552 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525025_results.db (UUID=0001fa78-6b64-7b64-90a8-033625900a5e)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.709860 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007769 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.710969 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001093 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.712532 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001547 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.722976 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008526 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.799659 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076667 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.801898 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002222 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.801914 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.811563 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009642 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.811580 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.811586 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.811592 128635208351552 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.811698 128635208351552 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000098 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.811952 128635208351552 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.182845 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.814976 128635208351552 simple_timer.cpp:55] [rocprofv3] output generation ::     0.207970 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:39.815089 128635208351552 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.209506 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525025_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/7][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:41.338502 140472299503424 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.184086 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:41.339124 140472299503424 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.534687 140472299503424 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:41.617552 140472299503424 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.278428 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.639767 140472299503424 generateRocpd.cpp:583] writing SQL database for process 2525034 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:03:41.640602 140472299503424 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525034_results.db (UUID=0001fa78-733f-733f-af3c-ffeb4081b42b)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.722467 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007851 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.723652 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001170 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.725242 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001575 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.735532 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008340 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.792068 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.056522 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.794313 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002229 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.794330 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.803267 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008929 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.803282 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.803288 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.803294 140472299503424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.803402 140472299503424 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000099 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.803628 140472299503424 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.163861 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.806596 140472299503424 simple_timer.cpp:55] [rocprofv3] output generation ::     0.187556 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:41.806660 140472299503424 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.189060 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525034_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/7][Approximate profiling time left: 4 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:43.289230 135851595079488 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.182614 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:43.289816 135851595079488 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.483772 135851595079488 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:43.565049 135851595079488 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275233 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.587664 135851595079488 generateRocpd.cpp:583] writing SQL database for process 2525044 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:03:43.588486 135851595079488 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525044_results.db (UUID=0001fa78-7adf-7adf-b0dd-0090bd421da8)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.670499 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007726 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.671704 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001189 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.673400 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001681 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.684059 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008445 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.699574 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.015499 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.701701 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002112 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.701719 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.711012 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009286 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.711026 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.711040 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.711050 135851595079488 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.711149 135851595079488 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000093 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.711334 135851595079488 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.123671 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.714117 135851595079488 simple_timer.cpp:55] [rocprofv3] output generation ::     0.147405 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:43.714166 135851595079488 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.149069 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525044_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/7][Approximate profiling time left: 1 second]...
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:45.195896 136033633574720 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.187057 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:45.196524 136033633574720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.391244 136033633574720 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:03:45.483807 136033633574720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.287283 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.505937 136033633574720 generateRocpd.cpp:583] writing SQL database for process 2525052 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:03:45.506743 136033633574720 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525052_results.db (UUID=0001fa78-824d-724d-ac63-3e08751344d4)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.589772 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007769 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.590980 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001192 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.592591 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001596 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.602822 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008269 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.698637 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.095800 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.700850 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002198 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.700869 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.710312 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009436 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.710326 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.710332 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.710339 136033633574720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.710456 136033633574720 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000108 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.710686 136033633574720 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.204749 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.713598 136033633574720 simple_timer.cpp:55] [rocprofv3] output generation ::     0.228507 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:03:45.713663 136033633574720 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.229808 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525052_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/7][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ_CPC/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:47.202440 140682474291008 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.185998 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:47.203049 140682474291008 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.398256 140682474291008 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:47.489478 140682474291008 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.286429 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.511753 140682474291008 generateRocpd.cpp:583] writing SQL database for process 2525061 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:47.512558 140682474291008 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525061_results.db (UUID=0001fa78-8a25-7a25-bf2d-ed1282085b7d)
   |-> [rocprofiler-sdk] W20260526 19:03:47.594348 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007743 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.595534 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001170 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.597130 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001582 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.607236 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008164 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.696681 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.089430 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.698884 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002187 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.698901 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.707408 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008500 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.707422 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.707428 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.707435 140682474291008 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.707538 140682474291008 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.707769 140682474291008 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.196016 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.710708 140682474291008 simple_timer.cpp:55] [rocprofv3] output generation ::     0.219792 sec
   |-> [rocprofiler-sdk] W20260526 19:03:47.710776 140682474291008 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.221247 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ_CPC/MI200/out/pmc_1/2525061_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
