Following counters might not be supported by rocprof: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_MFMA_MOPS_F64
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:59:26.996408 140553281572672 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303045 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:59:27.006324 140553281572672 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.217386 140553281572672 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:27.350088 140553281572672 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343765 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.389565 140553281572672 generateRocpd.cpp:582] writing SQL database for process 2387031 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:27.390844 140553281572672 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387031_results.db (UUID=0000431d-fd0e-7d0e-8659-32cb469c2d72)
   |-> [rocprofiler-sdk] W20260526 16:59:27.480628 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014313 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.481751 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001091 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.484278 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002499 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.489223 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003031 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.558486 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.069235 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.561315 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002798 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.561344 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.577198 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015838 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.577225 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.577237 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.577249 140553281572672 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.577444 140553281572672 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000178 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.577859 140553281572672 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.188295 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.583639 140553281572672 simple_timer.cpp:55] [rocprofv3] output generation ::     0.231019 sec
   |-> [rocprofiler-sdk] W20260526 16:59:27.583723 140553281572672 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.233584 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387031_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:59:29.846242 127373906112320 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307540 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:59:29.856075 127373906112320 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.069052 127373906112320 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:30.200176 127373906112320 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344101 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.239153 127373906112320 generateRocpd.cpp:582] writing SQL database for process 2387041 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:30.240477 127373906112320 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387041_results.db (UUID=0000431e-082b-782b-87d3-4977944cb56a)
   |-> [rocprofiler-sdk] W20260526 16:59:30.331798 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014599 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.332961 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001132 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.335507 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002508 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.340541 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003084 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.403775 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063206 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.406657 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002852 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.406686 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.422297 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015594 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.422328 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.422340 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.422352 127373906112320 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.422554 127373906112320 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000189 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.423066 127373906112320 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183915 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.429030 127373906112320 simple_timer.cpp:55] [rocprofv3] output generation ::     0.226325 sec
   |-> [rocprofiler-sdk] W20260526 16:59:30.429123 127373906112320 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.228896 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387041_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 27 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:59:32.706227 139589549911872 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304786 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:59:32.715831 139589549911872 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:32.926965 139589549911872 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:33.059140 139589549911872 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343309 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.098193 139589549911872 generateRocpd.cpp:582] writing SQL database for process 2387052 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:33.099458 139589549911872 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387052_results.db (UUID=0000431e-135a-735a-89e4-07e5493e481a)
   |-> [rocprofiler-sdk] W20260526 16:59:33.192358 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014589 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.193493 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001103 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.196060 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002538 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.201172 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003126 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.270136 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068935 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.273025 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002858 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.273054 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.288573 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015505 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.288601 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.288613 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.288624 139589549911872 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.288829 139589549911872 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000183 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.289328 139589549911872 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.191135 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.295192 139589549911872 simple_timer.cpp:55] [rocprofv3] output generation ::     0.233597 sec
   |-> [rocprofiler-sdk] W20260526 16:59:33.295275 139589549911872 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.236084 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387052_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 24 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:59:35.576216 130220087119680 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303302 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:59:35.584689 130220087119680 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:35.796744 130220087119680 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:35.928258 130220087119680 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343569 sec
   |-> [rocprofiler-sdk] W20260526 16:59:35.963091 130220087119680 generateRocpd.cpp:582] writing SQL database for process 2387062 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:35.964437 130220087119680 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387062_results.db (UUID=0000431e-1e91-7e91-8385-eb89c9f52dbd)
   |-> [rocprofiler-sdk] W20260526 16:59:36.044640 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.011888 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.045670 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.047904 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002210 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.052374 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002701 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.109800 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.057403 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.112402 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002576 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.112429 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000005 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.125539 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.013099 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.125562 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.125572 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.125583 130220087119680 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.125740 130220087119680 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000141 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.126140 130220087119680 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.163051 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.131135 130220087119680 simple_timer.cpp:55] [rocprofv3] output generation ::     0.200084 sec
   |-> [rocprofiler-sdk] W20260526 16:59:36.131246 130220087119680 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.202784 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387062_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:59:38.404662 134055536770880 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303851 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:59:38.414659 134055536770880 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.625680 134055536770880 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:38.762833 134055536770880 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.348174 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.801760 134055536770880 generateRocpd.cpp:582] writing SQL database for process 2387073 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:38.803063 134055536770880 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387073_results.db (UUID=0000431e-299d-799d-897e-ceab2435dad2)
   |-> [rocprofiler-sdk] W20260526 16:59:38.894380 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014499 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.895500 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001089 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.897693 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002165 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.902858 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003215 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.965837 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.062950 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.968763 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002897 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.968791 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.984462 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015656 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.984493 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.984505 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.984517 134055536770880 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.984733 134055536770880 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000199 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.985319 134055536770880 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183560 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.991691 134055536770880 simple_timer.cpp:55] [rocprofv3] output generation ::     0.226357 sec
   |-> [rocprofiler-sdk] W20260526 16:59:38.991785 134055536770880 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.228900 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387073_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:59:41.257179 138732061998912 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305602 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:59:41.267383 138732061998912 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.479101 138732061998912 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:41.607861 138732061998912 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340478 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.647067 138732061998912 generateRocpd.cpp:582] writing SQL database for process 2387083 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:41.648411 138732061998912 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387083_results.db (UUID=0000431e-34c0-74c0-b4d3-5793a6acba8e)
   |-> [rocprofiler-sdk] W20260526 16:59:41.738640 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014401 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.739754 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001083 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.742317 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002535 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.747345 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003098 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.801545 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.054172 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.804384 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002809 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.804414 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.820046 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015616 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.820074 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.820086 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.820098 138732061998912 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.820302 138732061998912 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000190 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.820771 138732061998912 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.173704 sec
   |-> [rocprofiler-sdk] W20260526 16:59:41.826647 138732061998912 simple_timer.cpp:55] [rocprofv3] output generation ::     0.216300 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:41.826729 138732061998912 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.218818 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387083_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:44.100437 135393534775104 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.302641 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:44.110352 135393534775104 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.325434 135393534775104 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:44.453418 135393534775104 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343066 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.491993 135393534775104 generateRocpd.cpp:582] writing SQL database for process 2387094 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:59:44.493241 135393534775104 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387094_results.db (UUID=0000431e-3fde-7fde-967b-81a936c00e7c)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.583515 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014281 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.584655 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001110 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.586886 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002201 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.591997 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003108 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.643049 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.051023 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.645769 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002692 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.645798 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.661450 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015637 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.661477 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.661488 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.661500 135393534775104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.661704 135393534775104 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000183 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.662119 135393534775104 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.170126 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.667799 135393534775104 simple_timer.cpp:55] [rocprofv3] output generation ::     0.211886 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:44.667873 135393534775104 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.214406 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387094_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:46.941742 131798600662848 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306046 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:46.951690 131798600662848 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.167189 131798600662848 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:47.299606 131798600662848 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347916 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.339348 131798600662848 generateRocpd.cpp:582] writing SQL database for process 2387104 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:59:47.340638 131798600662848 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387104_results.db (UUID=0000431e-4af4-7af4-8ec8-ae16dacf652b)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.431496 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014722 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.432653 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001127 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.435189 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002508 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.440226 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003102 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.534654 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094399 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.537493 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002808 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.537522 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.553576 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016039 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.553605 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.553617 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.553629 131798600662848 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.553854 131798600662848 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000207 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.554430 131798600662848 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.215082 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.560598 131798600662848 simple_timer.cpp:55] [rocprofv3] output generation ::     0.258499 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:47.560701 131798600662848 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.261043 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387104_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:49.853075 130346635726656 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306124 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:49.861807 130346635726656 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.072806 130346635726656 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:50.204369 130346635726656 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342563 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.243819 130346635726656 generateRocpd.cpp:582] writing SQL database for process 2387114 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:59:50.245123 130346635726656 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387114_results.db (UUID=0000431e-5653-7653-91f5-84d635ea0486)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.335855 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014406 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.337016 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001129 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.339525 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002480 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.344664 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003149 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.437840 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.093147 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.440662 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002792 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.440691 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.456597 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015891 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.456625 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.456637 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.456649 130346635726656 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.456857 130346635726656 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000187 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.457323 130346635726656 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.213505 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.463477 130346635726656 simple_timer.cpp:55] [rocprofv3] output generation ::     0.256663 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:50.463563 130346635726656 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.259143 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387114_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:52.790189 130391659491136 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.310669 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:52.798721 130391659491136 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.010493 130391659491136 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:53.142384 130391659491136 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343663 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.180640 130391659491136 generateRocpd.cpp:582] writing SQL database for process 2387125 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:59:53.181891 130391659491136 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387125_results.db (UUID=0000431e-61c8-71c8-a2bc-b910fd4f3b69)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.273362 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.015039 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.274489 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001096 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.277075 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002558 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.282165 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003124 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.417047 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.134854 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.419886 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002808 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.419914 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.435334 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015406 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.435361 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.435373 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.435385 130391659491136 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.435597 130391659491136 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000193 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.436112 130391659491136 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.255472 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.441904 130391659491136 simple_timer.cpp:55] [rocprofv3] output generation ::     0.297060 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:53.442000 130391659491136 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.299566 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387125_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:55.767260 135332492971840 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.310454 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:55.777322 135332492971840 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:55.992294 135332492971840 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:56.125090 135332492971840 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347768 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.164325 135332492971840 generateRocpd.cpp:582] writing SQL database for process 2387135 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:59:56.165616 135332492971840 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387135_results.db (UUID=0000431e-6d69-7d69-9fa4-b2eb682b39a3)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.257128 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.015019 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.258268 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001109 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.260793 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002497 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.265896 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003122 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.390656 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.124731 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.393451 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002765 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.393480 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.409411 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015916 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.409439 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.409451 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.409463 135332492971840 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.409690 135332492971840 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000209 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.410256 135332492971840 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.245931 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.416574 135332492971840 simple_timer.cpp:55] [rocprofv3] output generation ::     0.289009 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:56.416679 135332492971840 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.291541 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387135_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:58.701342 124555138260800 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304523 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:59:58.711286 124555138260800 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:59:58.922109 124555138260800 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:59:59.051640 124555138260800 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340354 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.090905 124555138260800 generateRocpd.cpp:582] writing SQL database for process 2387145 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:59:59.092197 124555138260800 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/dl385-20-mi100-3c48/2387145_results.db (UUID=0000431e-78e5-78e5-9bab-7fcd93af6b53)
   |-> [rocprofiler-sdk] W20260526 16:59:59.203830 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014435 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.205001 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001136 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.207520 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002490 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.212556 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003091 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.289491 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076906 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.292334 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002814 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.292364 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.308509 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016130 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.308539 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.308552 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.308563 124555138260800 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.308781 124555138260800 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000199 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.309376 124555138260800 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.218472 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.316077 124555138260800 simple_timer.cpp:55] [rocprofv3] output generation ::     0.261946 sec
   |-> [rocprofiler-sdk] W20260526 16:59:59.316169 124555138260800 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.264477 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/device_inv_int/MI100/out/pmc_1/2387145_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
