Following counters might not be supported by rocprof: SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_TRANS_F64
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:30.568997 126422318931776 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.308095 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:30.578705 126422318931776 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:30.793257 126422318931776 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:30.927835 126422318931776 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.349130 sec
   |-> [rocprofiler-sdk] W20260526 16:45:30.967152 126422318931776 generateRocpd.cpp:582] writing SQL database for process 2381519 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:30.968490 126422318931776 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381519_results.db (UUID=00004311-39bd-79bd-8b0b-7bd5fbbe4d4b)
   |-> [rocprofiler-sdk] W20260526 16:45:31.056951 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014317 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.058053 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001057 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.060590 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002508 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.065551 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003046 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.134428 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068849 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.136918 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002458 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.136947 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.153222 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016260 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.153260 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.153272 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.153285 126422318931776 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.153514 126422318931776 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000212 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.154127 126422318931776 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.186976 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.160177 126422318931776 simple_timer.cpp:55] [rocprofv3] output generation ::     0.229821 sec
   |-> [rocprofiler-sdk] W20260526 16:45:31.160285 126422318931776 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.232400 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381519_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:33.416318 132884319207232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306549 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:33.426386 132884319207232 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.637484 132884319207232 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:33.769925 132884319207232 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343540 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.809095 132884319207232 generateRocpd.cpp:582] writing SQL database for process 2381529 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:33.810359 132884319207232 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381529_results.db (UUID=00004311-44de-74de-bdae-2d46c1ad4b90)
   |-> [rocprofiler-sdk] W20260526 16:45:33.899587 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014429 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.900681 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001063 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.903111 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002403 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.908072 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003044 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.971068 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.062968 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.973910 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002812 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.973938 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.989917 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015964 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.989947 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.989959 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.989971 132884319207232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.990192 132884319207232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000195 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.990717 132884319207232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.181623 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.996590 132884319207232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.224173 sec
   |-> [rocprofiler-sdk] W20260526 16:45:33.996675 132884319207232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.226690 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381529_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 27 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:36.265705 134784093667136 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307172 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:36.276003 134784093667136 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.490361 134784093667136 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:36.621916 134784093667136 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345913 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.660927 134784093667136 generateRocpd.cpp:582] writing SQL database for process 2381539 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:36.662230 134784093667136 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381539_results.db (UUID=00004311-4ffe-7ffe-9490-c59a9c7783fe)
   |-> [rocprofiler-sdk] W20260526 16:45:36.746568 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013951 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.747696 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001098 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.750044 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002317 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.754846 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002909 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.822601 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.067727 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.825478 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002848 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.825507 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.841352 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015831 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.841388 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.841400 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.841412 134784093667136 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.841635 134784093667136 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000210 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.842207 134784093667136 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.181280 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.848089 134784093667136 simple_timer.cpp:55] [rocprofv3] output generation ::     0.223654 sec
   |-> [rocprofiler-sdk] W20260526 16:45:36.848177 134784093667136 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.226210 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381539_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 24 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:39.123604 135794008936256 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306587 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:39.133492 135794008936256 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.343871 135794008936256 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:39.475032 135794008936256 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341540 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.514522 135794008936256 generateRocpd.cpp:582] writing SQL database for process 2381550 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:39.515809 135794008936256 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381550_results.db (UUID=00004311-5b29-7b29-a70e-52ebe412cba5)
   |-> [rocprofiler-sdk] W20260526 16:45:39.600885 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014539 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.602087 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001172 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.604641 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002525 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.609237 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002832 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.680869 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.071603 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.683450 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002552 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.683479 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.699253 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015758 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.699288 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.699305 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.699319 135794008936256 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.699524 135794008936256 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000190 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.700077 135794008936256 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.185556 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.706034 135794008936256 simple_timer.cpp:55] [rocprofv3] output generation ::     0.228511 sec
   |-> [rocprofiler-sdk] W20260526 16:45:39.706126 135794008936256 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.231042 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381550_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:41.985480 140453931216704 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.301446 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:41.995436 140453931216704 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.206618 140453931216704 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:42.336007 140453931216704 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340572 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.375276 140453931216704 generateRocpd.cpp:582] writing SQL database for process 2381561 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:42.376590 140453931216704 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381561_results.db (UUID=00004311-665c-765c-a593-26e365aec479)
   |-> [rocprofiler-sdk] W20260526 16:45:42.465858 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014212 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.466969 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001079 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.469212 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002200 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.474321 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003147 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.537207 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.062857 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.539716 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002480 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.539746 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.555316 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015555 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.555351 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.555364 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.555375 140453931216704 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.555605 140453931216704 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000208 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.556204 140453931216704 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.180928 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.562167 140453931216704 simple_timer.cpp:55] [rocprofv3] output generation ::     0.223647 sec
   |-> [rocprofiler-sdk] W20260526 16:45:42.562257 140453931216704 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.226198 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381561_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:44.815880 137574950317888 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303008 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:44.825686 137574950317888 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.031991 137574950317888 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:45.161845 137574950317888 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.336159 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.200501 137574950317888 generateRocpd.cpp:582] writing SQL database for process 2381572 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:45.201784 137574950317888 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381572_results.db (UUID=00004311-7169-7169-acfa-a1203dba86a7)
   |-> [rocprofiler-sdk] W20260526 16:45:45.287018 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013868 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.288165 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001117 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.290592 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002399 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.295519 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003040 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.349238 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.053683 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.351964 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002694 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.352004 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.367240 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015221 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.367270 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.367282 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.367294 137574950317888 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.367517 137574950317888 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000202 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.368073 137574950317888 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.167572 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.373992 137574950317888 simple_timer.cpp:55] [rocprofv3] output generation ::     0.209640 sec
   |-> [rocprofiler-sdk] W20260526 16:45:45.374074 137574950317888 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.212177 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381572_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:47.649516 128537606897472 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.314824 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:47.659422 128537606897472 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:47.870819 128537606897472 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:48.003493 128537606897472 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344071 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.042713 128537606897472 generateRocpd.cpp:582] writing SQL database for process 2381582 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:48.044001 128537606897472 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381582_results.db (UUID=00004311-7c6f-7c6f-8ed5-65195f267b4d)
   |-> [rocprofiler-sdk] W20260526 16:45:48.126613 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013973 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.127729 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001083 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.129805 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002050 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.134553 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002946 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.187109 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.052528 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.189735 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002597 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.189763 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.204784 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015006 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.204813 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.204825 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.204836 128537606897472 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.205060 128537606897472 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000202 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.205539 128537606897472 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.162827 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.211038 128537606897472 simple_timer.cpp:55] [rocprofv3] output generation ::     0.205099 sec
   |-> [rocprofiler-sdk] W20260526 16:45:48.211111 128537606897472 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.207568 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381582_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:50.479424 134621426118464 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305865 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:50.489188 134621426118464 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:50.699628 134621426118464 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:50.829877 134621426118464 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340689 sec
   |-> [rocprofiler-sdk] W20260526 16:45:50.869283 134621426118464 generateRocpd.cpp:582] writing SQL database for process 2381592 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:50.870549 134621426118464 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381592_results.db (UUID=00004311-8786-7786-a9d7-d54460121dc0)
   |-> [rocprofiler-sdk] W20260526 16:45:50.957396 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014241 sec
   |-> [rocprofiler-sdk] W20260526 16:45:50.958496 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001066 sec
   |-> [rocprofiler-sdk] W20260526 16:45:50.960987 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002463 sec
   |-> [rocprofiler-sdk] W20260526 16:45:50.966001 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003106 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.060360 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094330 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.063176 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002786 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.063205 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.078629 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015409 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.078657 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.078669 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.078681 134621426118464 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.078901 134621426118464 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000208 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.079364 134621426118464 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.210081 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.085421 134621426118464 simple_timer.cpp:55] [rocprofv3] output generation ::     0.253053 sec
   |-> [rocprofiler-sdk] W20260526 16:45:51.085508 134621426118464 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.255581 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381592_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:53.363039 131872789512000 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.308531 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:53.372736 131872789512000 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.584095 131872789512000 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:53.715522 131872789512000 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342786 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.754910 131872789512000 generateRocpd.cpp:582] writing SQL database for process 2381603 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:53.756186 131872789512000 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381603_results.db (UUID=00004311-92c6-72c6-a452-979655fcd0bf)
   |-> [rocprofiler-sdk] W20260526 16:45:53.846541 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014554 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.847760 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001187 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.850356 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002568 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.855419 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003095 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.948152 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.092703 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.950835 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002652 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.950863 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.966885 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016008 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.966914 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.966926 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.966938 131872789512000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.967164 131872789512000 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000205 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.967781 131872789512000 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.212872 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.973804 131872789512000 simple_timer.cpp:55] [rocprofv3] output generation ::     0.255786 sec
   |-> [rocprofiler-sdk] W20260526 16:45:53.973907 131872789512000 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.258332 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381603_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:56.278909 128464817205056 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.310466 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:56.288599 128464817205056 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.498853 128464817205056 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:56.631864 128464817205056 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343265 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.672145 128464817205056 generateRocpd.cpp:582] writing SQL database for process 2381613 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:56.673395 128464817205056 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381613_results.db (UUID=00004311-9e29-7e29-bcdf-921098459ac1)
   |-> [rocprofiler-sdk] W20260526 16:45:56.763918 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014912 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.765081 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001132 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.767648 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002534 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.772833 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003175 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.907513 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.134652 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.910186 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002640 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.910215 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.926484 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016254 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.926518 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.926531 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.926543 128464817205056 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.926776 128464817205056 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000213 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.927390 128464817205056 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.255246 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.933532 128464817205056 simple_timer.cpp:55] [rocprofv3] output generation ::     0.299145 sec
   |-> [rocprofiler-sdk] W20260526 16:45:56.933635 128464817205056 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.301722 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381613_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:45:59.261105 126571942604608 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.312956 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:45:59.270904 126571942604608 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.482028 126571942604608 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:45:59.616152 126571942604608 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345249 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.655048 126571942604608 generateRocpd.cpp:582] writing SQL database for process 2381623 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:45:59.656364 126571942604608 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381623_results.db (UUID=00004311-a9cc-79cc-8ca9-3335d080eb4c)
   |-> [rocprofiler-sdk] W20260526 16:45:59.741457 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014247 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.742614 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001122 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.745184 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002542 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.749790 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002805 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.873750 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.123927 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.876431 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002651 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.876460 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.892234 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015759 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.892266 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.892279 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.892291 126571942604608 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.892508 126571942604608 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000204 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.893189 126571942604608 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.238142 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.899093 126571942604608 simple_timer.cpp:55] [rocprofv3] output generation ::     0.280439 sec
   |-> [rocprofiler-sdk] W20260526 16:45:59.899189 126571942604608 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.282989 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381623_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:46:02.166843 125581514223424 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306484 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:46:02.176686 125581514223424 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.388507 125581514223424 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:46:02.521770 125581514223424 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345084 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.560768 125581514223424 generateRocpd.cpp:582] writing SQL database for process 2381633 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:46:02.562037 125581514223424 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/dl385-20-mi100-3c48/2381633_results.db (UUID=00004311-b52d-752d-b53a-d91c9f026749)
   |-> [rocprofiler-sdk] W20260526 16:46:02.648347 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014548 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.649455 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001078 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.652009 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002525 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.656795 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002952 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.732963 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076140 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.735655 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002641 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.735699 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.751529 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015812 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.751563 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.751576 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.751588 125581514223424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.751804 125581514223424 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000203 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.752369 125581514223424 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.191601 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.758229 125581514223424 simple_timer.cpp:55] [rocprofv3] output generation ::     0.233946 sec
   |-> [rocprofiler-sdk] W20260526 16:46:02.758324 125581514223424 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.236501 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/vcopy/MI100/out/pmc_1/2381633_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
