Kernel Version: 4.4.32
NVIDIA Driver Version: 375.39
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P4            On   | 0000:14:00.0     Off |                  Off |
| N/A   46C    P0    25W /  75W |    113MiB /  8116MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
lspci info:
14:00.0 3D controller: NVIDIA Corporation Device 1bb3 (rev a1)
	Subsystem: NVIDIA Corporation Device 11d8
	Flags: bus master, fast devsel, latency 0, IRQ 194
	Memory at c4000000 (32-bit, non-prefetchable)
	Memory at 383fe0000000 (64-bit, prefetchable)
	Memory at 383ff0000000 (64-bit, prefetchable)
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel driver in use: nvidia
The sample code is as follows:
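#include <string.h>
#include <stdio.h>

struct DataElement
{
    int value;
};

// Device kernel: overwrites the value stored in managed memory.
__global__
void Kernel(DataElement *elem) {
    //printf("On device: value=%d\n", elem->value);
    elem->value = 20;
}

// Launches a single thread and waits for the kernel to finish.
void launch(DataElement *elem) {
    Kernel<<< 1, 1 >>>(elem);
    cudaDeviceSynchronize();
}

int main(void)
{
    DataElement *e;
    // Allocate unified (managed) memory accessible from both host and device.
    cudaMallocManaged((void**)&e, sizeof(DataElement));
    e->value = 10;  // first host-side write to the managed allocation
    launch(e);
    printf("On host: value=%d\n", e->value);
    cudaFree(e);
    cudaDeviceReset();
}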
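
The sample above ignores every CUDA return code, so a failed allocation or launch would go unnoticed. The following is a minimal sketch of the same program with error checking added, which may help narrow down which call misbehaves. The CUDA_CHECK macro is a hypothetical helper that is not part of the original sample, and device index 0 is assumed because nvidia-smi lists a single GPU. The query of cudaDevAttrManagedMemory is likewise an addition for diagnosis only.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not in the original sample): print the failing call
// and the CUDA error string, then exit.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__, __LINE__,  \
                    #call, cudaGetErrorString(err_));                      \
            exit(EXIT_FAILURE);                                            \
        }                                                                  \
    } while (0)

struct DataElement {
    int value;
};

// Same kernel as the original sample.
__global__ void Kernel(DataElement *elem) {
    elem->value = 20;
}

int main(void) {
    // Assumption: device 0 is the Tesla P4 shown by nvidia-smi.
    int managed = 0;
    CUDA_CHECK(cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, 0));
    printf("Managed memory supported: %d\n", managed);

    DataElement *e = NULL;
    CUDA_CHECK(cudaMallocManaged((void **)&e, sizeof(DataElement)));
    e->value = 10;

    Kernel<<<1, 1>>>(e);
    CUDA_CHECK(cudaGetLastError());       // reports launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    printf("On host: value=%d\n", e->value);
    CUDA_CHECK(cudaFree(e));
    CUDA_CHECK(cudaDeviceReset());
    return 0;
}
```

If every call succeeds, the last line printed should show value=20. Any other outcome is reported with the file, line, and failing call, which can be included in a follow-up.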