Nvidia Jetson Xavier Tensorflow Error

Udit · August 3, 2023, 9:36am

Hello community!
I am trying to run a CNN model with TensorFlow on an Nvidia Jetson Xavier NX (16GB). The model ran without problems on Windows 10, but it raises errors on the Jetson. Below are the versions of the libraries and dependencies I have installed:

JetPack: 5.1-b147
CUDA: 11.8
cuDNN: 8.8.1.3-1 + CUDA 11.8
TensorFlow/Keras: 2.11
Ubuntu: 20.04.6 LTS
GCC: 9.4.0

I have attached screenshots of the errors and the CNN model for reference.

Complete Error:
NotFoundError Traceback (most recent call last)
Cell In[26], line 2
1 # Train the model
----> 2 history=model.fit(x_train, y_train, batch_size=32, epochs=100, validation_data=(x_test, y_test))

File /usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
67 filtered_tb = _process_traceback_frames(e.__traceback__)
68 # To get the full stack trace, call:
69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb

File /usr/local/lib/python3.8/dist-packages/tensorflow/python/eager/execute.py:52, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
50 try:
51 ctx.ensure_initialized()
---> 52 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
53 inputs, attrs, num_outputs)
54 except core._NotOkStatusException as e:
55 if name is not None:

NotFoundError: Graph execution error:

Detected at node ‘sequential/conv2d/Conv2D’ defined at (most recent call last):
File “/usr/lib/python3.8/runpy.py”, line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File “/usr/lib/python3.8/runpy.py”, line 87, in _run_code
exec(code, run_globals)
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel_launcher.py”, line 17, in <module>
app.launch_new_instance()
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/traitlets/config/application.py”, line 1043, in launch_instance
app.start()
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/kernelapp.py”, line 725, in start
self.io_loop.start()
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/tornado/platform/asyncio.py”, line 215, in start
self.asyncio_loop.run_forever()
File “/usr/lib/python3.8/asyncio/base_events.py”, line 570, in run_forever
self._run_once()
File “/usr/lib/python3.8/asyncio/base_events.py”, line 1859, in _run_once
handle._run()
File “/usr/lib/python3.8/asyncio/events.py”, line 81, in _run
self._context.run(self._callback, *self._args)
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/kernelbase.py”, line 513, in dispatch_queue
await self.process_one()
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/kernelbase.py”, line 502, in process_one
await dispatch(*args)
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/kernelbase.py”, line 409, in dispatch_shell
await result
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/kernelbase.py”, line 729, in execute_request
reply_content = await reply_content
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/ipkernel.py”, line 422, in do_execute
res = shell.run_cell(
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/ipykernel/zmqshell.py”, line 540, in run_cell
return super().run_cell(*args, **kwargs)
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/IPython/core/interactiveshell.py”, line 2961, in run_cell
result = self._run_cell(
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/IPython/core/interactiveshell.py”, line 3016, in _run_cell
result = runner(coro)
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/IPython/core/async_helpers.py”, line 129, in pseudo_sync_runner
coro.send(None)
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/IPython/core/interactiveshell.py”, line 3221, in run_cell_async
has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/IPython/core/interactiveshell.py”, line 3400, in run_ast_nodes
if await self.run_code(code, result, async_=asy):
File “/home/xavier/tensor_env1/lib/python3.8/site-packages/IPython/core/interactiveshell.py”, line 3460, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File “/tmp/ipykernel_16538/3312217822.py”, line 2, in <module>
history=model.fit(x_train, y_train, batch_size=32, epochs=100, validation_data=(x_test, y_test))
File “/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py”, line 65, in error_handler
return fn(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/training.py”, line 1650, in fit
tmp_logs = self.train_function(iterator)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/training.py”, line 1249, in train_function
return step_function(self, iterator)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/training.py”, line 1233, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File “/usr/local/lib/python3.8/dist-packages/keras/engine/training.py”, line 1222, in run_step
outputs = model.train_step(data)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/training.py”, line 1023, in train_step
y_pred = self(x, training=True)
File “/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py”, line 65, in error_handler
return fn(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/training.py”, line 561, in __call__
return super().__call__(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py”, line 65, in error_handler
return fn(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/base_layer.py”, line 1132, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py”, line 96, in error_handler
return fn(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/sequential.py”, line 413, in call
return super().call(inputs, training=training, mask=mask)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py”, line 511, in call
return self._run_internal_graph(inputs, training=training, mask=mask)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/functional.py”, line 668, in _run_internal_graph
outputs = node.layer(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py”, line 65, in error_handler
return fn(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/engine/base_layer.py”, line 1132, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py”, line 96, in error_handler
return fn(*args, **kwargs)
File “/usr/local/lib/python3.8/dist-packages/keras/layers/convolutional/base_conv.py”, line 283, in call
outputs = self.convolution_op(inputs, self.kernel)
File “/usr/local/lib/python3.8/dist-packages/keras/layers/convolutional/base_conv.py”, line 255, in convolution_op
return tf.nn.convolution(
Node: ‘sequential/conv2d/Conv2D’
No algorithm worked! Error messages:
Profiling failure on CUDNN engine eng1{}: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc(4677): ‘status’
Profiling failure on CUDNN engine eng28{}: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc(4677): ‘status’
Profiling failure on CUDNN engine eng0{}: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc(4677): ‘status’
Profiling failure on CUDNN engine eng3{k11=2}: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc(4677): ‘status’
[[{{node sequential/conv2d/Conv2D}}]] [Op:__inference_train_function_1375]

Thanks and regards!

AastaLLL · August 4, 2023, 1:54am

Hi,

I'm not sure whether this is a compatibility issue.
Could you share how you installed TensorFlow?
Did you use our prebuilt package from the link below?
https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html
Thanks.

Udit · August 7, 2023, 2:46pm

I reinstalled TensorFlow from the link you provided, using the wheel for my JetPack version.
This time the CNN model started training, but the output is stuck at NaN.

The same code runs absolutely fine on Windows OS. When checking whether TensorFlow is able to find the GPU, it shows the following

and on defining the model, it shows the same error

Some errors were encountered during the installation of a few libraries. I followed the link given below to sort them out:

Do I need to update the QSPI as well?

Thanks and regards!
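
For readers reproducing this: the GPU check mentioned in the post is typically done with `tf.config.list_physical_devices("GPU")`, and a NaN loss can be located in the `history` object that `model.fit` returns. The sketch below is only an illustration under those assumptions; `first_nan_epoch` and `visible_gpus` are hypothetical helper names and do not come from the original post.

```python
import math


def first_nan_epoch(losses):
    """Return the 0-based index of the first NaN loss, or None if there is none.

    `losses` is expected to be a list of floats such as history.history["loss"].
    """
    for epoch, loss in enumerate(losses):
        if math.isnan(loss):
            return epoch
    return None


def visible_gpus():
    """Return the GPUs TensorFlow can see, or [] if TensorFlow is not installed."""
    try:
        import tensorflow as tf  # imported lazily so the helper above works without it
    except ImportError:
        return []
    return tf.config.list_physical_devices("GPU")
```

Calling `first_nan_epoch(history.history["loss"])` after training shows whether the loss was NaN from the very first epoch or only became NaN later. Keras also ships `tf.keras.callbacks.TerminateOnNaN()`, which stops training as soon as a NaN loss appears.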

AastaLLL · August 8, 2023, 5:50am

Hi,

Could you share the TensorFlow version you used on Jetson and Windows?
Thanks.

Udit · August 8, 2023, 8:00am

On Jetson, it is tensorflow-2.11.0+nv23.03-cp38-cp38-linux_aarch64.whl

And for Windows, I’ve installed 2.13.0
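
The wheel filename above follows the standard Python wheel naming scheme, so its parts can be read off mechanically. The following is a minimal sketch, assuming a filename without an optional build tag; `parse_wheel_name` is a hypothetical helper, and reading the `+nv23.03` suffix as NVIDIA's release label is an assumption rather than something stated in this thread.

```python
def parse_wheel_name(filename):
    """Split a wheel filename into its name, version, and compatibility tags.

    Assumes the five-field form name-version-python-abi-platform.whl
    (no optional build tag).
    """
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    name, full_version, python_tag, abi_tag, platform_tag = parts
    # A '+' separates the public version from the local version label.
    version, _, local = full_version.partition("+")
    return {
        "name": name,
        "version": version,
        "local": local,  # assumed to be the vendor's release label
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }
```

For the Jetson wheel this yields version `2.11.0` built for CPython 3.8 on `linux_aarch64`, which makes the gap to the 2.13.0 install on Windows explicit.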

AastaLLL · August 17, 2023, 4:17am

Hi,

I just want to confirm one thing first.

Did you manually upgrade CUDA to 11.8 (the version listed at the top of this thread)?
The package is built against the default CUDA 11.4, so it might not work correctly with other CUDA versions.

Please test it with CUDA 11.4 instead.
Thanks.
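
One way to check for the mismatch described above is to compare the CUDA version TensorFlow was built against with the version installed on the device. This is a rough sketch, not an official procedure: it assumes `tf.sysconfig.get_build_info()` reports a `cuda_version` entry for this build, and `major_minor`, `same_cuda_release`, and `tensorflow_build_cuda` are hypothetical helper names.

```python
def major_minor(version):
    """Return (major, minor) as ints from a string such as '11.8' or '11.4.315'."""
    parts = version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    return int(parts[0]), int(parts[1])


def same_cuda_release(built_with, installed):
    """True if both version strings share the same major.minor CUDA release."""
    return major_minor(built_with) == major_minor(installed)


def tensorflow_build_cuda():
    """Return the CUDA version TensorFlow was compiled against, or None if unknown."""
    try:
        import tensorflow as tf  # imported lazily; not needed for the helpers above
    except ImportError:
        return None
    # The key may be missing on CPU-only builds, hence .get().
    return tf.sysconfig.get_build_info().get("cuda_version")
```

With the versions from this thread, `same_cuda_release("11.4", "11.8")` returns `False`, which is the situation the reply above asks about.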

system · September 12, 2023, 8:26am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
