How to read data during kernel execution

Mikhail_M · November 19, 2022, 11:36am

I want to read data on the CPU while my GPU kernel is executing, but this looks like an almost impossible task: 1) it doesn’t work through cudaMemcpyAsync, and 2) it works only sometimes through cudaHostAlloc(… cudaHostAllocMapped). At least that is the case on Windows 10; people report that 1) works on Linux: visual studio - Why GPU memory updates are not read? - Stack Overflow

What I mean by “sometimes” in 2) is that the code from cuda - How can I check the progress of matrix multiplication? - Stack Overflow breaks very easily. It breaks after removing printf from the kernel, after replacing “for (int i = 0; i < 10; i++)” with “for (int i = 0; i < 10000000; i++)” or “while (true)”, or after adding an additional file to the project (even though the added code is never called).

Is it a flawed idea to read data from a kernel in real time? Is it hard for the drivers to support?

I used Windows 10 Pro 19044.2251, GF 2060 mobile, driver 517.00, CUDA Toolkit 10.2 and Visual Studio 2019 16.0.2. I then installed CUDA Toolkit 11.7 instead; nothing improved, and kernel debugging broke as well. Previously the debugger stopped at a breakpoint in the kernel as long as the kernel was not launched in a different stream (i.e. it worked for kernel<<<1, 128>>>(params); and not for kernel<<<1, 128, 0, stream2>>>(params);). Debugging still works in another project and in the project from the second link.
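
For context, approach 1) could be sketched roughly as below. This is not the code from either linked question. The kernel count_kernel, the helper poll_with_memcpy and the spin length are made up for illustration. It assumes a GPU that can overlap a device-to-host copy with a running kernel, and it uses pinned host memory plus two non-blocking streams so that the copy is not serialized behind the kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: a single thread publishes its step count to device memory.
__global__ void count_kernel(volatile int *progress, int total_steps)
{
    for (int i = 1; i <= total_steps; ++i) {
        long long start = clock64();
        while (clock64() - start < 20000000LL) { }  // spin to simulate one step of work
        *progress = i;      // volatile store, so the write is not held in a register
        __threadfence();    // order this store before anything the thread writes later
    }
}

// Runs the kernel in one stream and polls the counter through a second stream.
// Returns the final counter value, or -1 if a CUDA call fails.
int poll_with_memcpy(int total_steps)
{
    int *d_progress = nullptr;
    int *h_progress = nullptr;
    cudaStream_t compute, copy;

    if (cudaMalloc((void **)&d_progress, sizeof(int)) != cudaSuccess) return -1;
    if (cudaMallocHost((void **)&h_progress, sizeof(int)) != cudaSuccess) return -1;  // pinned
    cudaMemset(d_progress, 0, sizeof(int));
    cudaDeviceSynchronize();  // the streams below do not wait on the default stream
    cudaStreamCreateWithFlags(&compute, cudaStreamNonBlocking);
    cudaStreamCreateWithFlags(&copy, cudaStreamNonBlocking);

    count_kernel<<<1, 1, 0, compute>>>(d_progress, total_steps);

    int last = 0;
    // cudaStreamQuery reports cudaErrorNotReady while the kernel is still running.
    while (cudaStreamQuery(compute) == cudaErrorNotReady) {
        cudaMemcpyAsync(h_progress, d_progress, sizeof(int),
                        cudaMemcpyDeviceToHost, copy);
        if (cudaStreamSynchronize(copy) != cudaSuccess) break;
        if (*h_progress != last) {
            last = *h_progress;
            printf("progress: %d / %d\n", last, total_steps);
        }
    }

    // The kernel has finished (or failed); fetch the final value.
    cudaError_t err = cudaStreamSynchronize(compute);
    if (err == cudaSuccess)
        err = cudaMemcpy(h_progress, d_progress, sizeof(int), cudaMemcpyDeviceToHost);
    int result = (err == cudaSuccess) ? *h_progress : -1;

    cudaStreamDestroy(compute);
    cudaStreamDestroy(copy);
    cudaFreeHost(h_progress);
    cudaFree(d_progress);
    return result;
}

int main()
{
    int final_value = poll_with_memcpy(10);
    printf("final: %d\n", final_value);
    return final_value == 10 ? 0 : 1;
}
```

How many intermediate values get printed depends on timing, so the loop may skip steps. Only the final value is checked.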

Robert_Crovella · November 19, 2022, 2:32pm

Reading data from a running kernel is pretty easy on Linux. Your code works on Linux (for me and for another person, as indicated in the comments on your question). I’m not aware of anything needed beyond the code you posted.

On Windows WDDM, it is “more difficult”. Some of the reasons for that are indicated in the question you linked. The second kernel listed there does not have any printf in it, and it works fine, as attested by myself and others historically, even on Windows WDDM.
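
A minimal sketch of the mapped (zero-copy) pattern follows. It is written for illustration and is not the code from the linked answer. The names progress_kernel and poll_mapped and the spin length are hypothetical. The kernel writes its progress straight into pinned host memory obtained from cudaHostAlloc with cudaHostAllocMapped, and the host reads that location through a volatile pointer while the kernel runs. The loop polls cudaStreamQuery to detect completion. Whether that call also gets the driver to submit a batched launch promptly on WDDM is an assumption here, not a guarantee.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: a single thread writes its step count into mapped host memory.
__global__ void progress_kernel(volatile int *progress, int total_steps)
{
    for (int i = 1; i <= total_steps; ++i) {
        long long start = clock64();
        while (clock64() - start < 20000000LL) { }  // spin to simulate one step of work
        *progress = i;             // volatile store goes out to the mapped host buffer
        __threadfence_system();    // order the store as seen by the host as well
    }
}

// Launches the kernel and polls the mapped counter from the host.
// Returns the final counter value, or -1 if a CUDA call fails.
int poll_mapped(int total_steps)
{
    int *h_progress = nullptr;  // host view of the buffer
    int *d_progress = nullptr;  // device view of the same buffer

    // Older setups need this before the context is created; the result is ignored
    // here because newer setups map pinned memory by default.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    if (cudaHostAlloc((void **)&h_progress, sizeof(int), cudaHostAllocMapped)
            != cudaSuccess) return -1;
    *h_progress = 0;
    if (cudaHostGetDevicePointer((void **)&d_progress, h_progress, 0)
            != cudaSuccess) {
        cudaFreeHost(h_progress);
        return -1;
    }

    progress_kernel<<<1, 1>>>(d_progress, total_steps);

    // Read through a volatile pointer so the compiler reloads the value every time.
    volatile int *watch = h_progress;
    int last = 0;
    // cudaStreamQuery(0) reports cudaErrorNotReady while the kernel is still running.
    while (cudaStreamQuery(0) == cudaErrorNotReady) {
        int now = *watch;
        if (now != last) {
            last = now;
            printf("progress: %d / %d\n", last, total_steps);
        }
    }

    // Wait for completion and pick up the final value.
    int result = (cudaDeviceSynchronize() == cudaSuccess) ? *watch : -1;
    cudaFreeHost(h_progress);
    return result;
}

int main()
{
    int final_value = poll_mapped(10);
    printf("final: %d\n", final_value);
    return final_value == 10 ? 0 : 1;
}
```

The host loop can miss intermediate values if the kernel advances faster than the loop polls, so only the final value is checked.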
