What time units does NVIDIA Visual Profiler use? For example, does it report seconds, milliseconds, microseconds, or nanoseconds?
Time units are displayed wherever a time value appears. For example, in the timeline the unit is shown in the bar at the top. As you zoom in or out, the unit may change between s (seconds), ms (milliseconds), us (microseconds), and so on.
2-1. When horizontal bars for cudaMalloc or cudaMemory appear, is the CPU in a run state, or is it just waiting?
For CUDA API calls, each bar represents the entire duration of the call, from the moment CUDA starts processing it until it finishes. The CPU is not necessarily busy for that whole duration.
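To make the difference between the length of an API bar and actual CPU work concrete, here is a minimal sketch that times a single `cudaDeviceSynchronize()` call two ways: wall-clock time (roughly what the bar's length corresponds to) and process CPU time. The names `spin_kernel`, `Timing`, and `time_sync_call` are made up for illustration, and `std::clock()` counts CPU time for the whole process, so treat the numbers as indicative only. Whether the host thread spins or sleeps while it waits depends on the runtime's scheduling policy (for example the `cudaDeviceScheduleBlockingSync` device flag), so the two values may be close or far apart.

```cuda
#include <chrono>
#include <ctime>
#include <cuda_runtime.h>

// Hypothetical kernel: keeps one GPU thread busy for roughly `cycles` clock ticks.
__global__ void spin_kernel(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Illustrative result type: host-side timings for one API call.
struct Timing {
    double wall_ms;  // elapsed wall-clock time of the call
    double cpu_ms;   // CPU time consumed by the process during the call
    bool ok;         // true if launch and synchronize both succeeded
};

// Launches the kernel, then measures cudaDeviceSynchronize() from the host.
Timing time_sync_call(long long cycles) {
    Timing t{0.0, 0.0, false};

    spin_kernel<<<1, 1>>>(cycles);  // asynchronous: control returns to the host right away
    if (cudaGetLastError() != cudaSuccess) return t;

    auto wall_start = std::chrono::steady_clock::now();
    std::clock_t cpu_start = std::clock();
    cudaError_t err = cudaDeviceSynchronize();  // host waits here until the kernel is done
    std::clock_t cpu_end = std::clock();
    auto wall_end = std::chrono::steady_clock::now();

    t.wall_ms = std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
    t.cpu_ms = 1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    t.ok = (err == cudaSuccess);
    return t;
}
```

Calling `time_sync_call(100000000)` from `main` and printing both fields gives a rough sense of how much of the call's duration was spent waiting rather than computing on the CPU.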
2-2. Do MemCpy(HtoD) and MemCpy(DtoH) represent the actual data transfers, or something else?
Yes. The activities shown under the CUDA device and context are the ones executed on the CUDA device. The MemCpy traces represent the data transfers.
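For reference, the sketch below shows the kind of code that produces these two traces: one `cudaMemcpy` with `cudaMemcpyHostToDevice` and one with `cudaMemcpyDeviceToHost`. The function name `round_trip` and the buffer contents are assumptions chosen for illustration. When profiled, each call should appear once as an API bar on the host side and once as a MemCpy (HtoD or DtoH) activity under the device context.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Copies `n` floats to the device and back again.
// Returns true if both copies succeed and the data survives the round trip.
bool round_trip(std::size_t n) {
    std::vector<float> src(n), dst(n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) src[i] = static_cast<float>(i);

    float* d_buf = nullptr;
    if (cudaMalloc(&d_buf, n * sizeof(float)) != cudaSuccess) return false;

    // Host to device: expected to show up as a MemCpy (HtoD) activity.
    cudaError_t err = cudaMemcpy(d_buf, src.data(), n * sizeof(float),
                                 cudaMemcpyHostToDevice);

    // Device to host: expected to show up as a MemCpy (DtoH) activity.
    if (err == cudaSuccess) {
        err = cudaMemcpy(dst.data(), d_buf, n * sizeof(float),
                         cudaMemcpyDeviceToHost);
    }

    cudaFree(d_buf);
    return err == cudaSuccess && src == dst;
}
```

Calling `round_trip(1 << 20)` from `main` under the profiler is enough to see both transfer directions in the timeline.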