Hello,
I just wrote my first CUDA program, which queries the device capabilities and such. It’s kinda fun to watch. I do wonder what other people's devices report, so if you think it’s fun too, here is your chance: download TestProgram.exe, run it, and copy and paste the text here. (Just right-click the MS-DOS prompt window, choose “Select All” and then “Copy”, then paste it here in this forum.)
(TestProgram.exe is a debug version with asserts here and there, so it should be quite safe to run!) (This also gives me a chance to try out my coding skills on your hardware, and to see if it works on your hardware and if CUDA is good.)
TestProgram can be downloaded from this folder:
http://www.skybuck.org/CUDA/DeviceCapabilities/
TestProgram direct link:
http://www.skybuck.org/CUDA/DeviceCapabilities/TestProgram.exe
Here’s my Asus GT 520 silent output:
program started
mCudaDriverLibrary.Initialized true.
mCudaDriverVersionManagement.VersionAvailable true.
mCudaDriverVersionManagement.Version: 4000
mCudaDriverDeviceManagement.DeviceCountAvailable true
mCudaDriverDeviceManagement.DeviceCount: 1
mCudaDriverDeviceManagement.Device[0].Number: 0
mCudaDriverDeviceManagement.Device[0].Handle: 0
mCudaDriverDeviceManagement.Device[0].Name: GeForce GT 520
mCudaDriverDeviceManagement.Device[0].ComputeCapability.Major: 2
mCudaDriverDeviceManagement.Device[0].ComputeCapability.Minor: 1
mCudaDriverDeviceManagement.Device[0].MemorySize: 1008402432
mCudaDriverDeviceManagement.Device[0].Properties.MaxThreadsPerBlock: 1024
mCudaDriverDeviceManagement.Device[0].Properties.MaxBlockDimension[0]: 1024
mCudaDriverDeviceManagement.Device[0].Properties.MaxBlockDimension[1]: 1024
mCudaDriverDeviceManagement.Device[0].Properties.MaxBlockDimension[2]: 64
mCudaDriverDeviceManagement.Device[0].Properties.MaxGridDimension[0]: 65535
mCudaDriverDeviceManagement.Device[0].Properties.MaxGridDimension[1]: 65535
mCudaDriverDeviceManagement.Device[0].Properties.MaxGridDimension[2]: 65535
mCudaDriverDeviceManagement.Device[0].Properties.MaxSharedMemoryPerBlock: 49152
mCudaDriverDeviceManagement.Device[0].Properties.MaxConstantMemory: 65536
mCudaDriverDeviceManagement.Device[0].Properties.WarpSize: 32
mCudaDriverDeviceManagement.Device[0].Properties.MaxMemoryPitch: 2147483647
mCudaDriverDeviceManagement.Device[0].Properties.MaxRegistersPerBlock: 32768
mCudaDriverDeviceManagement.Device[0].Properties.ClockFrequency: 1620000000
mCudaDriverDeviceManagement.Device[0].Properties.TextureAlignment: 512
mCudaDriverDeviceManagement.Device[0].Attributes.MaxThreadsPerBlock: 1024
mCudaDriverDeviceManagement.Device[0].Attributes.MaxBlockDimensionX: 1024
mCudaDriverDeviceManagement.Device[0].Attributes.MaxBlockDimensionY: 1024
mCudaDriverDeviceManagement.Device[0].Attributes.MaxBlockDimensionZ: 64
mCudaDriverDeviceManagement.Device[0].Attributes.MaxGridDimensionX: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxGridDimensionY: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxGridDimensionZ: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxSharedMemoryForBlocksPerMultiProcessor: 49152
mCudaDriverDeviceManagement.Device[0].Attributes.MaxConstantMemory: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxWarpSize: 32
mCudaDriverDeviceManagement.Device[0].Attributes.MaxMemoryPitch: 2147483647
mCudaDriverDeviceManagement.Device[0].Attributes.MaxRegistersForBlocksPerMultiProcessor: 32768
mCudaDriverDeviceManagement.Device[0].Attributes.ClockFrequency: 1620000000
mCudaDriverDeviceManagement.Device[0].Attributes.TextureAlignment: 512
mCudaDriverDeviceManagement.Device[0].Attributes.MemoryCopyAndKernelExecutionOverlap: TRUE
mCudaDriverDeviceManagement.Device[0].Attributes.MultiProcessorCount: 1
mCudaDriverDeviceManagement.Device[0].Attributes.RunTimeLimitForKernels: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.IntegratedWithHostMemory: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.CanMapHostMemoryIntoCudaAddressSpace: TRUE
mCudaDriverDeviceManagement.Device[0].Attributes.ComputeMode: 0 (DEFAULT/UNRESTRICTED)
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture1DWidth: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DWidth: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DHeight: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture3DWidth: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture3DHeight: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture3DDepth: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DLayeredWidth: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DLayeredHeight: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DLayeredLayers: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DArrayWidth: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DArrayHeight: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DArraySlices: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.SurfaceAlignment: 512
mCudaDriverDeviceManagement.Device[0].Attributes.ConcurrentKernels: TRUE
mCudaDriverDeviceManagement.Device[0].Attributes.ErrorCorrectingCodesEnabled: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.PCIBusID: 2
mCudaDriverDeviceManagement.Device[0].Attributes.PCIDeviceID: 0
mCudaDriverDeviceManagement.Device[0].Attributes.UsingTCCDriver: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.MemoryClockFrequency: 600000000
mCudaDriverDeviceManagement.Device[0].Attributes.GlobalMemoryBusWidthInBits: 64
mCudaDriverDeviceManagement.Device[0].Attributes.Level2CacheSize: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxResidentThreadsPerMultiProcessor: 1536
mCudaDriverDeviceManagement.Device[0].Attributes.AsynchronousEngineCount: 1
mCudaDriverDeviceManagement.Device[0].Attributes.UnifiedAddressing: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture1DLayeredWidth: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture1DLayeredLayers: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.PCIDomainID: 0
program finished
Bye,
Skybuck.
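A note on the `Version: 4000` line: assuming the program obtains it from `cuDriverGetVersion`, the value is encoded as 1000 × major + 10 × minor, so 4000 would mean CUDA 4.0. Here is a minimal C sketch of that decoding. The struct and function names are made up for illustration; they are not part of TestProgram or of the CUDA API.

```c
/* Hypothetical helper type: a CUDA version split into major and minor. */
typedef struct {
    int major;
    int minor;
} CudaVersion;

/* Decode an integer of the form 1000 * major + 10 * minor. */
CudaVersion decode_cuda_version(int encoded)
{
    CudaVersion v;
    v.major = encoded / 1000;          /* thousands carry the major version */
    v.minor = (encoded % 1000) / 10;   /* tens carry the minor version */
    return v;
}
```

With the value from the log, `decode_cuda_version(4000)` yields major 4 and minor 0.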
Test program updated, new output:
program started
mCudaDriverLibrary.Initialized true.
mCudaDriverVersionManagement.VersionAvailable true.
mCudaDriverVersionManagement.Version: 4000
mCudaDriverDeviceManagement.DeviceCountAvailable true
mCudaDriverDeviceManagement.DeviceCount: 1
mCudaDriverDeviceManagement.Device[0].Number: 0
mCudaDriverDeviceManagement.Device[0].Handle: 0
mCudaDriverDeviceManagement.Device[0].Name: GeForce GT 520
mCudaDriverDeviceManagement.Device[0].ComputeCapability.Major: 2
mCudaDriverDeviceManagement.Device[0].ComputeCapability.Minor: 1
mCudaDriverDeviceManagement.Device[0].MemorySize: 1008402432
mCudaDriverDeviceManagement.Device[0].Properties.MaxThreadsPerBlock: 1024
mCudaDriverDeviceManagement.Device[0].Properties.MaxBlockDimension[0]: 1024
mCudaDriverDeviceManagement.Device[0].Properties.MaxBlockDimension[1]: 1024
mCudaDriverDeviceManagement.Device[0].Properties.MaxBlockDimension[2]: 64
mCudaDriverDeviceManagement.Device[0].Properties.MaxGridDimension[0]: 65535
mCudaDriverDeviceManagement.Device[0].Properties.MaxGridDimension[1]: 65535
mCudaDriverDeviceManagement.Device[0].Properties.MaxGridDimension[2]: 65535
mCudaDriverDeviceManagement.Device[0].Properties.MaxSharedMemoryPerBlock: 49152
mCudaDriverDeviceManagement.Device[0].Properties.MaxConstantMemory: 65536
mCudaDriverDeviceManagement.Device[0].Properties.WarpSize: 32
mCudaDriverDeviceManagement.Device[0].Properties.MaxMemoryPitch: 2147483647
mCudaDriverDeviceManagement.Device[0].Properties.MaxRegistersPerBlock: 32768
mCudaDriverDeviceManagement.Device[0].Properties.ClockFrequency: 1620000000
mCudaDriverDeviceManagement.Device[0].Properties.TextureAlignment: 512
mCudaDriverDeviceManagement.Device[0].Attributes.MaxThreadsPerBlock: 1024
mCudaDriverDeviceManagement.Device[0].Attributes.MaxBlockDimensionX: 1024
mCudaDriverDeviceManagement.Device[0].Attributes.MaxBlockDimensionY: 1024
mCudaDriverDeviceManagement.Device[0].Attributes.MaxBlockDimensionZ: 64
mCudaDriverDeviceManagement.Device[0].Attributes.MaxGridDimensionX: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxGridDimensionY: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxGridDimensionZ: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxSharedMemoryForBlocksPerMultiProcessor: 49152
mCudaDriverDeviceManagement.Device[0].Attributes.MaxConstantMemory: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxWarpSize: 32
mCudaDriverDeviceManagement.Device[0].Attributes.MaxMemoryPitch: 2147483647
mCudaDriverDeviceManagement.Device[0].Attributes.MaxRegistersForBlocksPerMultiProcessor: 32768
mCudaDriverDeviceManagement.Device[0].Attributes.ClockFrequency: 1620000000
mCudaDriverDeviceManagement.Device[0].Attributes.TextureAlignment: 512
mCudaDriverDeviceManagement.Device[0].Attributes.MemoryCopyAndKernelExecutionOverlap: TRUE
mCudaDriverDeviceManagement.Device[0].Attributes.MultiProcessorCount: 1
mCudaDriverDeviceManagement.Device[0].Attributes.RunTimeLimitForKernels: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.IntegratedWithHostMemory: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.CanMapHostMemoryIntoCudaAddressSpace: TRUE
mCudaDriverDeviceManagement.Device[0].Attributes.ComputeMode: 0 (DEFAULT/UNRESTRICTED)
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture1DWidth: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DWidth: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DHeight: 65535
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture3DWidth: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture3DHeight: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture3DDepth: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DLayeredWidth: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DLayeredHeight: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DLayeredLayers: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DArrayWidth: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DArrayHeight: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture2DArraySlices: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.SurfaceAlignment: 512
mCudaDriverDeviceManagement.Device[0].Attributes.ConcurrentKernels: TRUE
mCudaDriverDeviceManagement.Device[0].Attributes.ErrorCorrectingCodesEnabled: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.PCIBusID: 2
mCudaDriverDeviceManagement.Device[0].Attributes.PCIDeviceID: 0
mCudaDriverDeviceManagement.Device[0].Attributes.UsingTCCDriver: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.MemoryClockFrequency: 600000000
mCudaDriverDeviceManagement.Device[0].Attributes.GlobalMemoryBusWidthInBits: 64
mCudaDriverDeviceManagement.Device[0].Attributes.Level2CacheSize: 65536
mCudaDriverDeviceManagement.Device[0].Attributes.MaxResidentThreadsPerMultiProcessor: 1536
mCudaDriverDeviceManagement.Device[0].Attributes.AsynchronousEngineCount: 1
mCudaDriverDeviceManagement.Device[0].Attributes.UnifiedAddressing: FALSE
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture1DLayeredWidth: 16384
mCudaDriverDeviceManagement.Device[0].Attributes.MaxTexture1DLayeredLayers: 2048
mCudaDriverDeviceManagement.Device[0].Attributes.PCIDomainID: 0
mCudaDriverContext.Open( CudaDriverContextSchedulingAutomatic, mCudaDriverDeviceManagement.Device[0] ) successfull.
mCudaDriverContext.Enter successfull.
mCudaDriverContext.APIVersion: 3010
mCudaDriverContext.Handle: 11101480
mCudaDriverContext.IsOpen: TRUE
mCudaDriverContext.IsCurrent: TRUE
mCudaDriverContext.IsSameDevice: TRUE
mCudaDriverContext.CachePreference: 0
assignment test:
mCudaDriverContext.CachePreference := CudaDriverContextCachePreferenceLargerSharedMemorySmallerL1Cache(1)
mCudaDriverContext.CachePreference: 1
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitStackSize]: 1024
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitPrintfFifoSize]: 1048576
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitMallocHeapSize]: 8388608
assignment test:
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitStackSize] := 10100100
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitPrintfFifoSize] := 20100100
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitMallocHeapSize] := 30100100
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitStackSize]: 1024
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitPrintfFifoSize]: 20100352
mCudaDriverContext.ResourceLimit[CudaDriverContextResourceLimitMallocHeapSize]: 30146560
mCudaDriverContext.Leave successfull.
mCudaDriverContext.Close successfull.
program finished
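The read-back values in the second assignment test are not the values that were written. The driver is allowed to adjust a requested limit (clamping or rounding up), and the numbers in this one run are consistent with the printf FIFO size being rounded up to a multiple of 256 bytes and the malloc heap size to a multiple of 64 KiB. Those granularities are only inferred from this log and are an assumption, not documented behaviour. The stack size staying at 1024 suggests that request was not accepted at all. A small C sketch of the rounding (the function name is made up for illustration):

```c
#include <stdint.h>

/* Round value up to the next multiple of granularity (granularity > 0).
   The granularities used with it below are guesses based on this one log. */
uint64_t round_up_to_multiple(uint64_t value, uint64_t granularity)
{
    uint64_t remainder = value % granularity;
    return remainder == 0 ? value : value + (granularity - remainder);
}
```

`round_up_to_multiple(20100100, 256)` gives 20100352 and `round_up_to_multiple(30100100, 65536)` gives 30146560, matching the two read-back values in the output.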
Something seems to be missing in these numbers.
It only mentions “number of multi processors”
It does not mention “number of sub cores per multi processor”
According to the information for the GT 520 on websites, it has 48 CUDA cores, which could be considered sub-cores?
This info is not in the output of the program.
(A CUDA Driver API limitation?)
The device query tool seems to use some kind of routine or macro to figure it out:
#if CUDART_VERSION >= 2000
    shrLog(" (%2d) Multiprocessors x (%2d) CUDA Cores/MP: %d CUDA Cores\n",
           deviceProp.multiProcessorCount,
           ConvertSMVer2Cores(deviceProp.major, deviceProp.minor),
           ConvertSMVer2Cores(deviceProp.major, deviceProp.minor) * deviceProp.multiProcessorCount);
#endif
Routine or macro: ConvertSMVer2Cores
Maybe this routine is not available, but somebody tried out a number of combinations, and it seems to be:
Relationship between compute capability and CUDA cores per multiprocessor:
Capability: Cores
1.0: 8
1.1: 8
1.2: 8
1.3: 8
2.0: 32
2.1: 48
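The table above is small enough to write out by hand if the SDK routine is not at hand. The following C sketch is a stand-in written for this thread, not the SDK's actual `ConvertSMVer2Cores` source, and the function names are made up. It only knows the compute capabilities listed above and returns -1 for anything else.

```c
/* CUDA cores per multiprocessor, for the capabilities listed in the table.
   Returns -1 when the compute capability is not in that table. */
int cuda_cores_per_mp(int major, int minor)
{
    if (major == 1 && minor >= 0 && minor <= 3) return 8;   /* 1.0 .. 1.3 */
    if (major == 2 && minor == 0) return 32;                /* 2.0 */
    if (major == 2 && minor == 1) return 48;                /* 2.1 */
    return -1;
}

/* Total CUDA cores = cores per multiprocessor * multiprocessor count.
   Returns -1 when the compute capability is unknown. */
int total_cuda_cores(int major, int minor, int multiprocessor_count)
{
    int per_mp = cuda_cores_per_mp(major, minor);
    return per_mp < 0 ? -1 : per_mp * multiprocessor_count;
}
```

For the GT 520 output above (compute capability 2.1, MultiProcessorCount 1), `total_cuda_cores(2, 1, 1)` returns 48, which matches the 48 CUDA cores quoted on websites.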
Something seems to be missing in these numbers.
It only mentions “number of multi processors”
It does not mention “number of sub cores per multi processor”
Correct. For some reason, the CUDA device properties do not include this number, and you have to infer it from the compute capability. The values you mention are documented in Appendix F.3 and F.4 of the CUDA C Programming Guide.
Yeah, these numbers come from the start of F.3 and F.4, where they’re written in text form.
(It does mention that CUDA cores have integer and floating-point units… but how many? Are they shared or not, and so forth… perhaps some of this info is kept secret.)
It’s not exactly clear what is meant by, for example, F.4: “4 special floating point functions”.
Is this 4 special floating point functions per CUDA core, or is it shared by 32 CUDA cores? (compute 2.0)
Also, the 1 warp scheduler in F.3: is that per CUDA core or shared by 8 CUDA cores? (compute 1.x)
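One way to read those lists, offered as an interpretation rather than an official answer: each item in F.3 and F.4 is a count per multiprocessor, so the special function units and warp schedulers are shared by all the CUDA cores of that multiprocessor. A C sketch of the counts as the guide lists them follows. The type and function names are made up for illustration, and the values should be checked against your copy of the guide.

```c
#include <stddef.h>

/* Hypothetical record of execution units in ONE multiprocessor. */
typedef struct {
    int major;
    int minor;
    int cuda_cores;
    int special_function_units;
    int warp_schedulers;
} MultiprocessorUnits;

/* Per-multiprocessor counts as read from F.3 (1.x) and F.4 (2.x). */
static const MultiprocessorUnits kUnitsTable[] = {
    {1, 0,  8, 2, 1},
    {1, 1,  8, 2, 1},
    {1, 2,  8, 2, 1},
    {1, 3,  8, 2, 1},
    {2, 0, 32, 4, 2},
    {2, 1, 48, 8, 2},
};

/* Returns the table entry for a compute capability, or NULL if unknown. */
const MultiprocessorUnits *lookup_mp_units(int major, int minor)
{
    size_t count = sizeof(kUnitsTable) / sizeof(kUnitsTable[0]);
    for (size_t i = 0; i < count; ++i) {
        if (kUnitsTable[i].major == major && kUnitsTable[i].minor == minor) {
            return &kUnitsTable[i];
        }
    }
    return NULL;
}
```

Under this reading, a compute 2.0 multiprocessor has 4 special function units shared by its 32 CUDA cores, and a compute 1.x multiprocessor has 1 warp scheduler shared by its 8 CUDA cores.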