Following counters might not be supported by rocprof: SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F16
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:46:47.830860 125038026383168 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305151 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:46:47.840856 125038026383168 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.052398 125038026383168 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:46:48.188034 125038026383168 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347179 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.227316 125038026383168 generateRocpd.cpp:582] writing SQL database for process 2381881 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:46:48.228601 125038026383168 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381881_results.db (UUID=00004312-678e-778e-bf3c-458e641f3911)
   |-> [rocprofiler-sdk] W20260526 16:46:48.315941 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014039 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.317131 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001160 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.319707 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002547 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.324460 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002852 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.393346 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068855 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.395828 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002453 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.395857 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.411730 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015858 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.411758 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.411770 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.411782 125038026383168 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.412007 125038026383168 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000206 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.412447 125038026383168 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.185131 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.418217 125038026383168 simple_timer.cpp:55] [rocprofv3] output generation ::     0.227677 sec
   |-> [rocprofiler-sdk] W20260526 16:46:48.418294 125038026383168 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.230209 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381881_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:46:50.696838 133401098018624 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306665 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:46:50.706853 133401098018624 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:50.922283 133401098018624 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:46:51.053720 133401098018624 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.346867 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.092795 133401098018624 generateRocpd.cpp:582] writing SQL database for process 2381891 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:46:51.094084 133401098018624 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381891_results.db (UUID=00004312-72be-72be-998a-f59713e3c50d)
   |-> [rocprofiler-sdk] W20260526 16:46:51.179894 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014493 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.181001 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001073 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.183444 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002414 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.188218 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002973 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.251724 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063470 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.254389 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002631 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.254418 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.270143 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015711 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.270170 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.270183 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.270194 133401098018624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.270401 133401098018624 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000187 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.270830 133401098018624 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.178036 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.276906 133401098018624 simple_timer.cpp:55] [rocprofv3] output generation ::     0.220714 sec
   |-> [rocprofiler-sdk] W20260526 16:46:51.277001 133401098018624 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.223231 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381891_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 27 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:46:53.582161 138397799022400 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306838 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:46:53.591461 138397799022400 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:53.801807 138397799022400 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:46:53.933992 138397799022400 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342531 sec
   |-> [rocprofiler-sdk] W20260526 16:46:53.973331 138397799022400 generateRocpd.cpp:582] writing SQL database for process 2381902 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:46:53.974623 138397799022400 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381902_results.db (UUID=00004312-7e03-7e03-8051-d21a808701a9)
   |-> [rocprofiler-sdk] W20260526 16:46:54.061046 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014585 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.062216 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001138 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.064690 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002445 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.069657 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003072 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.137379 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.067694 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.139902 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002493 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.139932 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.156162 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016209 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.156194 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.156206 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.156218 138397799022400 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.156444 138397799022400 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000205 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.157077 138397799022400 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183747 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.163086 138397799022400 simple_timer.cpp:55] [rocprofv3] output generation ::     0.226638 sec
   |-> [rocprofiler-sdk] W20260526 16:46:54.163181 138397799022400 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.229136 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381902_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 24 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:46:56.446255 133188364889920 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305644 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:46:56.455773 133188364889920 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:56.666675 133188364889920 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:46:56.798181 133188364889920 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342408 sec
   |-> [rocprofiler-sdk] W20260526 16:46:56.837096 133188364889920 generateRocpd.cpp:582] writing SQL database for process 2381913 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:46:56.838355 133188364889920 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381913_results.db (UUID=00004312-8935-7935-be4e-da954eced122)
   |-> [rocprofiler-sdk] W20260526 16:46:56.927913 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014351 sec
   |-> [rocprofiler-sdk] W20260526 16:46:56.929063 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001116 sec
   |-> [rocprofiler-sdk] W20260526 16:46:56.931762 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002671 sec
   |-> [rocprofiler-sdk] W20260526 16:46:56.937033 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003138 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.009092 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.072031 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.011695 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002573 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.011724 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.027147 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015408 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.027175 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.027187 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.027199 133188364889920 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.027401 133188364889920 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000184 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.027828 133188364889920 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.190733 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.033597 133188364889920 simple_timer.cpp:55] [rocprofv3] output generation ::     0.232936 sec
   |-> [rocprofiler-sdk] W20260526 16:46:57.033674 133188364889920 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.235443 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381913_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:46:59.322812 125411139489600 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305614 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:46:59.333131 125411139489600 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.544837 125411139489600 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:46:59.676686 125411139489600 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343555 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.716475 125411139489600 generateRocpd.cpp:582] writing SQL database for process 2381923 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:46:59.717736 125411139489600 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381923_results.db (UUID=00004312-9471-7471-96b7-3326bee06ed5)
   |-> [rocprofiler-sdk] W20260526 16:46:59.808697 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014358 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.809854 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001127 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.812190 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002307 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.817323 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003135 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.880501 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063149 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.883227 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002697 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.883257 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.899073 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015801 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.899101 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.899113 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.899125 125411139489600 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.899337 125411139489600 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000192 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.899751 125411139489600 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183277 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.905569 125411139489600 simple_timer.cpp:55] [rocprofv3] output generation ::     0.226391 sec
   |-> [rocprofiler-sdk] W20260526 16:46:59.905664 125411139489600 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.228926 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381923_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:02.178243 124626109939520 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306061 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:02.188104 124626109939520 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.400814 124626109939520 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:02.535456 124626109939520 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347352 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.575095 124626109939520 generateRocpd.cpp:582] writing SQL database for process 2381933 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:02.576361 124626109939520 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381933_results.db (UUID=00004312-9f98-7f98-bbfc-6fa28f16177c)
   |-> [rocprofiler-sdk] W20260526 16:47:02.665962 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014159 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.667115 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001107 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.669860 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002716 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.675093 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003219 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.728663 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.053541 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.731178 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002486 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.731209 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.747504 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016281 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.747537 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.747549 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.747561 124626109939520 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.747782 124626109939520 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000207 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.748294 124626109939520 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.173199 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.754247 124626109939520 simple_timer.cpp:55] [rocprofv3] output generation ::     0.216312 sec
   |-> [rocprofiler-sdk] W20260526 16:47:02.754338 124626109939520 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.218831 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381933_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:05.052320 140530508193600 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.310615 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:05.062210 140530508193600 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.274288 140530508193600 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:05.404653 140530508193600 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342443 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.443700 140530508193600 generateRocpd.cpp:582] writing SQL database for process 2381943 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:05.444956 140530508193600 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381943_results.db (UUID=00004312-aace-7ace-a653-b53b49d093a2)
   |-> [rocprofiler-sdk] W20260526 16:47:05.530813 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013731 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.531929 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001086 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.534122 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002164 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.539098 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003129 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.589773 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.050641 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.592498 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002696 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.592527 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.608588 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016046 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.608620 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.608632 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.608644 140530508193600 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.608873 140530508193600 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000204 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.609480 140530508193600 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.165780 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.615457 140530508193600 simple_timer.cpp:55] [rocprofv3] output generation ::     0.208302 sec
   |-> [rocprofiler-sdk] W20260526 16:47:05.615559 140530508193600 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.210853 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381943_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:07.909826 131995248783168 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306231 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:07.919910 131995248783168 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.134821 131995248783168 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:08.267870 131995248783168 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347960 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.307843 131995248783168 generateRocpd.cpp:582] writing SQL database for process 2381953 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:08.309119 131995248783168 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381953_results.db (UUID=00004312-b5fc-75fc-9ecd-53fb93698a70)
   |-> [rocprofiler-sdk] W20260526 16:47:08.391856 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014200 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.392962 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001075 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.395327 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002327 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.400368 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003131 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.494660 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094264 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.497263 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002573 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.497292 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.513100 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015791 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.513127 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.513142 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.513155 131995248783168 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.513345 131995248783168 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000178 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.513798 131995248783168 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.205956 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.519837 131995248783168 simple_timer.cpp:55] [rocprofv3] output generation ::     0.249479 sec
   |-> [rocprofiler-sdk] W20260526 16:47:08.519924 131995248783168 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.252002 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381953_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:11.061386 133648586731328 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304354 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:11.071343 133648586731328 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.282215 133648586731328 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:11.413218 133648586731328 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341876 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.452429 133648586731328 generateRocpd.cpp:582] writing SQL database for process 2381963 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:11.453707 133648586731328 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381963_results.db (UUID=00004312-c24d-724d-b1b4-398b41fccfe9)
   |-> [rocprofiler-sdk] W20260526 16:47:11.540391 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014442 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.541459 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001035 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.543943 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002455 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.549054 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003154 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.641720 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.092638 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.644420 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002670 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.644448 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.660766 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016296 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.660802 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.660824 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.660846 133648586731328 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.661083 133648586731328 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000214 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.661795 133648586731328 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.209366 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.667792 133648586731328 simple_timer.cpp:55] [rocprofv3] output generation ::     0.252116 sec
   |-> [rocprofiler-sdk] W20260526 16:47:11.667897 133648586731328 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.254627 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381963_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:14.219901 134750277214016 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.313419 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:14.229712 134750277214016 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.440630 134750277214016 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:14.575084 134750277214016 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345372 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.614577 134750277214016 generateRocpd.cpp:582] writing SQL database for process 2381974 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:14.615835 134750277214016 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381974_results.db (UUID=00004312-ce9b-7e9b-8601-c7a477c71184)
   |-> [rocprofiler-sdk] W20260526 16:47:14.704625 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014958 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.705772 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001113 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.708372 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002572 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.713534 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003134 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.848236 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.134672 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.851073 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002805 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.851102 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.867010 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015893 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.867038 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.867050 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.867061 134750277214016 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.867277 134750277214016 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000198 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.867724 134750277214016 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.253148 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.873509 134750277214016 simple_timer.cpp:55] [rocprofv3] output generation ::     0.295933 sec
   |-> [rocprofiler-sdk] W20260526 16:47:14.873602 134750277214016 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.298457 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381974_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:17.197013 136541947387712 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.309707 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:17.207531 136541947387712 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.418617 136541947387712 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:17.553225 136541947387712 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345694 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.591968 136541947387712 generateRocpd.cpp:582] writing SQL database for process 2381984 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:17.593285 136541947387712 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381984_results.db (UUID=00004312-da3f-7a3f-8b24-f2bda65ca399)
   |-> [rocprofiler-sdk] W20260526 16:47:17.684638 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.015009 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.685787 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001119 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.688393 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002576 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.693605 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003189 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.818013 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.124375 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.820721 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002678 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.820751 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.836757 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015992 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.836787 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.836799 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.836811 136541947387712 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.837058 136541947387712 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000224 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.837619 136541947387712 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.245651 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.843549 136541947387712 simple_timer.cpp:55] [rocprofv3] output generation ::     0.287851 sec
   |-> [rocprofiler-sdk] W20260526 16:47:17.843647 136541947387712 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.290368 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381984_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:47:20.118647 140297539952448 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307216 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:47:20.128568 140297539952448 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.339559 140297539952448 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:47:20.473293 140297539952448 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344725 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.513247 140297539952448 generateRocpd.cpp:582] writing SQL database for process 2381996 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:47:20.514513 140297539952448 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/dl385-20-mi100-3c48/2381996_results.db (UUID=00004312-e5ab-75ab-8a94-cbac3ddb0f2c)
   |-> [rocprofiler-sdk] W20260526 16:47:20.603751 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014280 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.604869 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001086 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.607367 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002469 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.612327 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003099 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.688305 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.075949 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.690858 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002520 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.690887 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.706659 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015757 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.706688 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.706700 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.706712 140297539952448 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.706917 140297539952448 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000187 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.707382 140297539952448 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.194136 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.713367 140297539952448 simple_timer.cpp:55] [rocprofv3] output generation ::     0.237603 sec
   |-> [rocprofiler-sdk] W20260526 16:47:20.713463 140297539952448 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.240118 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/no_roof/MI100/out/pmc_1/2381996_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
