Can't run LSTM based (TF-Keras) model on Jetson Nano - Function call stack: distributed_function -> distributed_function -> distributed_function

I made an LSTM-based (TF-Keras) model which I am trying to run inference with on a Jetson Nano.
This is the smaller of my models, which I chose specifically to avoid any possible memory shortage, but it seems that didn't help. A similar issue is described here: https://stackoverflow.com/questions/53972814/cudnnlstm-failed-to-call-thenrnnforward/53974172
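
For reference, a minimal sketch of the model (reconstructed from the summary printed in the log below; the exact layer arguments and input shape are my assumption):

import tensorflow as tf

# Assumed reconstruction matching the printed summary:
# two LSTM layers with 15 units returning sequences, then a Dense(1) head.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(15, return_sequences=True, input_shape=(16000, 1)),  # 1,020 params
    tf.keras.layers.LSTM(15, return_sequences=True),                          # 1,860 params
    tf.keras.layers.Dense(1),                                                 # 16 params
])
model.summary()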

I run this without problems on an AGX Xavier (the bigger model as well) with the exact same installation.

The full output when I run the inference function:

2020-02-21 10:31:47.595057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Testing folder to proceed:
../data/mixture_data/devtest/419c25653c5ab256b186cd3b5acb4559/audio_20_samples
2020-02-21 10:32:03.166261: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-21 10:32:03.206873: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:03.207033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-02-21 10:32:03.207108: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-21 10:32:03.293510: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-02-21 10:32:03.375089: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-02-21 10:32:03.487698: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-02-21 10:32:03.623978: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-02-21 10:32:03.694823: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-02-21 10:32:03.967148: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-21 10:32:03.967936: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:03.968728: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:03.968943: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-02-21 10:32:03.998702: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2020-02-21 10:32:03.999751: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x36219fd0 executing computations on platform Host. Devices:
2020-02-21 10:32:03.999810: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2020-02-21 10:32:04.102703: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:04.103007: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3605a1f0 executing computations on platform CUDA. Devices:
2020-02-21 10:32:04.103057: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3
2020-02-21 10:32:04.103667: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:04.103782: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
2020-02-21 10:32:04.103888: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-21 10:32:04.104033: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-02-21 10:32:04.104092: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-02-21 10:32:04.104144: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-02-21 10:32:04.104191: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-02-21 10:32:04.104234: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-02-21 10:32:04.104280: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-21 10:32:04.104486: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:04.104737: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:04.104809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-02-21 10:32:04.104897: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-02-21 10:32:10.552818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-21 10:32:10.552886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-02-21 10:32:10.552911: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-02-21 10:32:10.553372: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:10.553707: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:973] ARM64 does not support NUMA - returning NUMA node zero
2020-02-21 10:32:10.553901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 402 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm (LSTM)                  (None, 16000, 15)         1020      
_________________________________________________________________
lstm_1 (LSTM)                (None, 16000, 15)         1860      
_________________________________________________________________
dense (Dense)                (None, 16000, 1)          16        
=================================================================
Total params: 2,896
Trainable params: 2,896
Non-trainable params: 0
_________________________________________________________________
  0%|                                                                                                                                                                                | 0/20 [00:00<?, ?it/s]../data/mixture_data/devtest/419c25653c5ab256b186cd3b5acb4559/audio_20_samples/mixture_devtest_gunshot_016_cf479ae2c96577a0173bd1c88c137d25.wav
1it [00:05,  5.56s/it]
(117, 16000) 5.55s/it]
2020-02-21 10:32:24.442954: W tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference_cudnn_lstm_with_fallback_1476_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_2142' and '__inference_cudnn_lstm_with_fallback_1476' both implement 'lstm_07e602c2-31ae-4e8f-8683-c88d77bb8c9a' but their signatures do not match.
2020-02-21 10:32:24.755170: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-02-21 10:32:26.661392: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-21 10:32:45.210454: W tensorflow/core/common_runtime/bfc_allocator.cc:419] Allocator (GPU_0_bfc) ran out of memory trying to allocate 335.80MiB (rounded to 352109824).  Current allocation summary follows.
2020-02-21 10:32:45.269597: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (256): 	Total Chunks: 7, Chunks in use: 6. 1.8KiB allocated for chunks. 1.5KiB in use in bin. 1.0KiB client-requested in use in bin.
2020-02-21 10:32:45.270639: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (512): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.271589: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (1024): 	Total Chunks: 1, Chunks in use: 1. 1.2KiB allocated for chunks. 1.2KiB in use in bin. 1.0KiB client-requested in use in bin.
2020-02-21 10:32:45.272018: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (2048): 	Total Chunks: 3, Chunks in use: 2. 10.8KiB allocated for chunks. 7.5KiB in use in bin. 7.0KiB client-requested in use in bin.
2020-02-21 10:32:45.272418: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (4096): 	Total Chunks: 6, Chunks in use: 5. 36.0KiB allocated for chunks. 32.0KiB in use in bin. 28.3KiB client-requested in use in bin.
2020-02-21 10:32:45.273080: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (8192): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.273750: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (16384): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.274205: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (32768): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.274641: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (65536): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.275356: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (131072): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.275475: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (262144): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.276029: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (524288): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.276245: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (1048576): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.276451: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (2097152): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.276646: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (4194304): 	Total Chunks: 1, Chunks in use: 1. 7.14MiB allocated for chunks. 7.14MiB in use in bin. 7.14MiB client-requested in use in bin.
2020-02-21 10:32:45.276877: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (8388608): 	Total Chunks: 1, Chunks in use: 1. 14.28MiB allocated for chunks. 14.28MiB in use in bin. 7.14MiB client-requested in use in bin.
2020-02-21 10:32:45.277472: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (16777216): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.277710: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (33554432): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.277969: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (67108864): 	Total Chunks: 1, Chunks in use: 1. 107.12MiB allocated for chunks. 107.12MiB in use in bin. 107.12MiB client-requested in use in bin.
2020-02-21 10:32:45.278259: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (134217728): 	Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.278581: I tensorflow/core/common_runtime/bfc_allocator.cc:869] Bin (268435456): 	Total Chunks: 1, Chunks in use: 0. 274.07MiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-02-21 10:32:45.278832: I tensorflow/core/common_runtime/bfc_allocator.cc:885] Bin for 335.80MiB was 256.00MiB, Chunk State: 
2020-02-21 10:32:45.279069: I tensorflow/core/common_runtime/bfc_allocator.cc:891]   Size: 274.07MiB | Requested Size: 60B | in_use: 0 | bin_num: 20, prev:   Size: 7.0KiB | Requested Size: 6.9KiB | in_use: 1 | bin_num: -1
2020-02-21 10:32:45.279209: I tensorflow/core/common_runtime/bfc_allocator.cc:898] Next region of size 422211584
2020-02-21 10:32:45.279415: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870000 next 1 of size 1280
2020-02-21 10:32:45.279828: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870500 next 3 of size 256
2020-02-21 10:32:45.279954: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870600 next 11 of size 256
2020-02-21 10:32:45.280059: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870700 next 4 of size 256
2020-02-21 10:32:45.280205: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870800 next 5 of size 256
2020-02-21 10:32:45.280330: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870900 next 10 of size 256
2020-02-21 10:32:45.280469: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870a00 next 2 of size 256
2020-02-21 10:32:45.280603: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free  at 0xf00870b00 next 12 of size 256
2020-02-21 10:32:45.280739: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00870c00 next 6 of size 6912
2020-02-21 10:32:45.280875: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free  at 0xf00872700 next 7 of size 4096
2020-02-21 10:32:45.281011: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00873700 next 9 of size 3840
2020-02-21 10:32:45.281146: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00874600 next 13 of size 3840
2020-02-21 10:32:45.281280: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00875500 next 15 of size 7168
2020-02-21 10:32:45.281417: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf00877100 next 8 of size 14968832
2020-02-21 10:32:45.281562: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf016bd900 next 14 of size 7488000
2020-02-21 10:32:45.281699: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free  at 0xf01de1b00 next 30 of size 3328
2020-02-21 10:32:45.281834: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf01de2800 next 31 of size 4352
2020-02-21 10:32:45.281973: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf01de3900 next 29 of size 112320000
2020-02-21 10:32:45.282109: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf08901700 next 28 of size 7168
2020-02-21 10:32:45.282243: I tensorflow/core/common_runtime/bfc_allocator.cc:905] InUse at 0xf08903300 next 27 of size 7168
2020-02-21 10:32:45.282379: I tensorflow/core/common_runtime/bfc_allocator.cc:905] Free  at 0xf08904f00 next 18446744073709551615 of size 287383808
2020-02-21 10:32:45.282511: I tensorflow/core/common_runtime/bfc_allocator.cc:914]      Summary of in-use Chunks by size: 
2020-02-21 10:32:45.282736: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 6 Chunks of size 256 totalling 1.5KiB
2020-02-21 10:32:45.282820: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 1280 totalling 1.2KiB
2020-02-21 10:32:45.282892: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 2 Chunks of size 3840 totalling 7.5KiB
2020-02-21 10:32:45.282965: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 4352 totalling 4.2KiB
2020-02-21 10:32:45.283045: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 6912 totalling 6.8KiB
2020-02-21 10:32:45.283135: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 3 Chunks of size 7168 totalling 21.0KiB
2020-02-21 10:32:45.283229: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 7488000 totalling 7.14MiB
2020-02-21 10:32:45.283331: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 14968832 totalling 14.28MiB
2020-02-21 10:32:45.283438: I tensorflow/core/common_runtime/bfc_allocator.cc:917] 1 Chunks of size 112320000 totalling 107.12MiB
2020-02-21 10:32:45.283545: I tensorflow/core/common_runtime/bfc_allocator.cc:921] Sum Total of in-use chunks: 128.57MiB
2020-02-21 10:32:45.283649: I tensorflow/core/common_runtime/bfc_allocator.cc:923] total_region_allocated_bytes_: 422211584 memory_limit_: 422211584 available bytes: 0 curr_region_allocation_bytes_: 844423168
2020-02-21 10:32:45.310481: I tensorflow/core/common_runtime/bfc_allocator.cc:929] Stats: 
Limit:                   422211584
InUse:                   134820096
MaxInUse:                134820096
NumAllocs:                     119
MaxAllocSize:            112320000

2020-02-21 10:32:45.311003: W tensorflow/core/common_runtime/bfc_allocator.cc:424] **x*****************************____________________________________________________________________
2020-02-21 10:32:45.378067: E tensorflow/stream_executor/dnn.cc:588] OOM when allocating tensor with shape[352109776] and type uint8 on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
2020-02-21 10:32:45.504043: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at cudnn_rnn_ops.cc:1498 : Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 1, 15, 1, 16000, 117, 15] 
2020-02-21 10:32:45.541609: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 1, 15, 1, 16000, 117, 15] 
	 [[{{node CudnnRNN}}]]
2020-02-21 10:32:45.679916: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Internal: {{function_node __inference_cudnn_lstm_with_fallback_1476_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_2142_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_2142}} {{function_node __inference_cudnn_lstm_with_fallback_1476_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_2142_specialized_for_sequential_lstm_StatefulPartitionedCall_at___inference_distributed_function_2142}} Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 1, 15, 1, 16000, 117, 15] 
	 [[{{node CudnnRNN}}]]
	 [[sequential/lstm/StatefulPartitionedCall]]
117/1 [======================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================================] - 25s 214ms/sample
Traceback (most recent call last):
  File "get_response.py", line 100, in <module>
    model, file, file_meta, sample_rate, rolling_step)
  File "get_response.py", line 26, in get_response
    batch_size=data.shape[0], verbose=1)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 909, in predict
    use_multiprocessing=use_multiprocessing)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 462, in predict
    steps=steps, callbacks=callbacks, **kwargs)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 444, in _model_iteration
    total_epochs=1)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 123, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 86, in execution_function
    distributed_function(input_fn))
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 457, in __call__
    result = self._call(*args, **kwds)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 526, in _call
    return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds)  # pylint: disable=protected-access
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1141, in _filtered_call
    self.captured_inputs)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 511, in call
    ctx=ctx)
  File "/home/marko/.local/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError:  [_Derived_]  Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 1, 15, 1, 16000, 117, 15] 
	 [[{{node CudnnRNN}}]]
	 [[sequential/lstm/StatefulPartitionedCall]] [Op:__inference_distributed_function_2142]

Function call stack:
distributed_function -> distributed_function -> distributed_function

  0%|                                                                                                                                                                                | 0/20 [00:31<?, ?it/s]

Any help is appreciated!
Thanks!

Hi barra.495, here in the log it says it is running out of memory (OOM):

2020-02-21 10:32:45.378067: E tensorflow/stream_executor/dnn.cc:588] OOM when allocating tensor with shape[352109776] and type uint8 on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

You could try mounting additional swap memory, or reduce the size/complexity of the model.
It is recommended to run “sudo tegrastats” in the background to keep an eye on memory usage.
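
For what it's worth (just a sketch on the TF-Keras side, not something I have tested on the Nano): the traceback shows predict() being called with batch_size=data.shape[0], i.e. all 117 windows of 16,000 steps at once, and reducing the inference batch size also shrinks the cuDNN RNN workspace that failed to allocate here. x below stands for the same prepared input:

# Assumption: x is the prepared input of shape (117, 16000, 1) used in get_response.py.
# Predicting in smaller batches keeps the per-call cuDNN RNN workspace well below
# the ~400 MB the Nano's GPU allocator reports as its limit in the log above.
preds = model.predict(x, batch_size=8, verbose=1)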

Thank you very much. I fixed it by running Ubuntu from the terminal and disabling the X server at boot. With that, only about 265 MB of RAM was in use, compared to almost 2 GB when running the full desktop, and there were no memory problems after that.

Something like this: display manager - How do I disable X at boot time so that the system boots in text mode? - Ask Ubuntu

I hope this helps somebody else as well!