I have a new HP Omen Obelisk 25L running ubuntu 18.4 with a RTX 2080 GPU I am trying to set up to do some machine learning with TensorFlow.
uname -r
4.18.0-22-generic
I have followed this tutorial:
[/code]
[/code]
and have successfully completed through Step 7.2.3.
This step indicates that CUDA is installed and working as shown here:
./NVIDIA_CUDA-10.1_Samples/bin/x86_64/linux/release/deviceQuery
./NVIDIA_CUDA-10.1_Samples/bin/x86_64/linux/release/deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce RTX 2080"
CUDA Driver Version / Runtime Version 10.1 / 10.1
CUDA Capability Major/Minor version number: 7.5
Total amount of global memory: 7949 MBytes (8335458304 bytes)
(46) Multiprocessors, ( 64) CUDA Cores/MP: 2944 CUDA Cores
GPU Max Clock rate: 1710 MHz (1.71 GHz)
Memory Clock rate: 7000 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 4194304 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1024
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 3 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
However, when I try to run a simple python program that uses the GPU:
import tensorflow as tf
# Creates a graph.
with tf.device('/device:GPU:0'):
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
I see the following output:
(base) shallada@shalladaGPU:~/book$ python tftest.py
2019-06-20 15:28:35.068487: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-20 15:28:35.089702: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3192000000 Hz
2019-06-20 15:28:35.090500: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55e85b0c2d80 executing computations on platform Host. Devices:
2019-06-20 15:28:35.090520: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
2019-06-20 15:28:35.091129: I tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:CPU:0
2019-06-20 15:28:35.091731: I tensorflow/core/common_runtime/placer.cc:1059] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:CPU:0
Traceback (most recent call last):
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1317, in _run_fn
self._extend_graph()
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1352, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation a: {{node a}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
[[{{node a}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tftest.py", line 11, in <module>
print(sess.run(c))
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation a: node a (defined at tftest.py:5) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
[[node a (defined at tftest.py:5) ]]
Caused by op 'a', defined at:
File "tftest.py", line 5, in <module>
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 179, in constant_v1
allow_broadcast=False)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/constant_op.py", line 289, in _constant_impl
name=name).outputs[0]
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/home/shallada/anaconda3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Cannot assign a device for operation a: node a (defined at tftest.py:5) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
[[node a (defined at tftest.py:5) ]]
My question is: Why does TensorFlow think CUDA is not enabled, and how to I make it enabled?