I’ve been working on extracting a cycle-accurate timing diagram from GPGPU-Sim while simulating a CUDA vector addition kernel on an NVIDIA Turing RTX 2060 model. My goal is to analyze warp execution timing, instruction issue cycles, and memory access latencies so that I can create a pipeline timing diagram similar to what you’d see in architectural studies.
How should gpgpusim.config be set up for this?