root@cc8b63e0b034:/workspace/src/openmpi-4.1.5# mpirun --allow-run-as-root --mca btl_vader_single_copy_mechanism none -np 2 python /usr/local/lib/python3.6/dist-packages/iva/detectnet_v2/scripts/train.py -e /workspace/tao-experiments/specs/detectnet_v2_train_peoplenet_kitti_multi.txt -r /workspace/results -k tlt_encode
2023-05-29 14:11:06.821638: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:06.821638: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2023-05-29 14:11:09.130322: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:09.130322: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:09.165408: I tensorflow/core/platform/profile_utils/cpu_utils.cc:109] CPU Frequency: 3593295000 Hz
2023-05-29 14:11:09.165411: I tensorflow/core/platform/profile_utils/cpu_utils.cc:109] CPU Frequency: 3593295000 Hz
2023-05-29 14:11:09.165514: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5f06300 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-05-29 14:11:09.165514: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x602cfd0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-05-29 14:11:09.165527: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2023-05-29 14:11:09.165527: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2023-05-29 14:11:09.166795: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-05-29 14:11:09.166898: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-05-29 14:11:10.587954: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.588226: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5e34cb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-05-29 14:11:10.588236: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 6000 Ada Generation, Compute Capability 8.9
2023-05-29 14:11:10.588389: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.588497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties:
name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505
pciBusID: 0000:21:00.0
2023-05-29 14:11:10.588520: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:10.599809: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:11:10.601676: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-05-29 14:11:10.601881: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-05-29 14:11:10.602336: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:11:10.602828: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-05-29 14:11:10.602943: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-05-29 14:11:10.603024: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.641054: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.641072: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.641430: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5dd6f40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-05-29 14:11:10.641453: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA RTX 6000 Ada Generation, Compute Capability 8.9
2023-05-29 14:11:10.641284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2023-05-29 14:11:10.641636: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.641800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 1 with properties:
name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505
pciBusID: 0000:22:00.0
2023-05-29 14:11:10.641825: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:10.656228: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:11:10.658684: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-05-29 14:11:10.659072: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-05-29 14:11:10.659703: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:11:10.660375: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-05-29 14:11:10.660518: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-05-29 14:11:10.660700: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.660943: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.661107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 1
2023-05-29 14:11:10.918803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-05-29 14:11:10.918844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215]      0
2023-05-29 14:11:10.918850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0:   N
2023-05-29 14:11:10.919114: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.919308: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:10.919438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 46288 MB memory) -> physical GPU (device: 0, name: NVIDIA RTX 6000 Ada Generation, pci bus id: 0000:21:00.0, compute capability: 8.9)
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
2023-05-29 14:11:10,920 [INFO] iva.common.logging.logging: Log file already exists at /workspace/results/status.json
2023-05-29 14:11:10,920 [INFO] root: Starting DetectNet_v2 Training job
2023-05-29 14:11:10,920 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/specs/detectnet_v2_train_peoplenet_kitti_multi.txt.
2023-05-29 14:11:10,921 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/specs/detectnet_v2_train_peoplenet_kitti_multi.txt
2023-05-29 14:11:10,967 [INFO] root: Training gridbox model.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2023-05-29 14:11:10,967 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2023-05-29 14:11:11.015638: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-05-29 14:11:11.015681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215]      1
2023-05-29 14:11:11.015688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 1:   N
2023-05-29 14:11:11.015985: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:11.016211: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:11.016368: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 46279 MB memory) -> physical GPU (device: 1, name: NVIDIA RTX 6000 Ada Generation, pci bus id: 0000:22:00.0, compute capability: 8.9)
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
2023-05-29 14:11:11,017 [INFO] iva.common.logging.logging: Log file already exists at /workspace/results/status.json
2023-05-29 14:11:11,017 [INFO] root: Starting DetectNet_v2 Training job
2023-05-29 14:11:11,017 [INFO] __main__: Loading experiment spec at /workspace/tao-experiments/specs/detectnet_v2_train_peoplenet_kitti_multi.txt.
2023-05-29 14:11:11,019 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/specs/detectnet_v2_train_peoplenet_kitti_multi.txt
2023-05-29 14:11:11,027 [INFO] root: Training gridbox model.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2023-05-29 14:11:11,027 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
2023-05-29 14:11:12,834 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2023-05-29 14:11:12,834 [INFO] __main__: Cannot iterate over exactly 47494 samples with a batch size of 24; each epoch will therefore take one extra step.
2023-05-29 14:11:12,834 [INFO] __main__: Cannot iterate over exactly 989 steps per epoch with 24 processors; each processor will therefore take one extra step per epoch.
2023-05-29 14:11:12,946 [INFO] root: Building DetectNet V2 model
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
2023-05-29 14:11:12,946 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
2023-05-29 14:11:12,947 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2023-05-29 14:11:12,961 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2023-05-29 14:11:13,820 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2023-05-29 14:11:13,820 [INFO] __main__: Cannot iterate over exactly 47494 samples with a batch size of 24; each epoch will therefore take one extra step.
2023-05-29 14:11:13,820 [INFO] __main__: Cannot iterate over exactly 989 steps per epoch with 24 processors; each processor will therefore take one extra step per epoch.
2023-05-29 14:11:13,951 [INFO] root: Building DetectNet V2 model
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
2023-05-29 14:11:13,951 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:517: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
2023-05-29 14:11:13,952 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4138: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
2023-05-29 14:11:13,968 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1834: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
2023-05-29 14:11:16,145 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
2023-05-29 14:11:16,145 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
2023-05-29 14:11:16,146 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2023-05-29 14:11:16,950 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
2023-05-29 14:11:17,780 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:174: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
2023-05-29 14:11:17,780 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
2023-05-29 14:11:17,781 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:199: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
2023-05-29 14:11:18,856 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:206: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
2023-05-29 14:11:45,669 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 3, 544, 960)  0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 272, 480) 9472        input_1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 272, 480) 256         conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 272, 480) 0           bn_conv1[0][0]
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1a_conv_1[0][0]
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1a_bn_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1a_relu_1[0][0]
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 136, 240) 4160        activation_1[0][0]
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1a_conv_2[0][0]
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 136, 240) 256         block_1a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 136, 240) 0           block_1a_bn_2[0][0]
                                                                 block_1a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 136, 240) 0           add_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_1[0][0]
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1b_bn_1[0][0]
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1b_relu_1[0][0]
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1b_conv_2[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 136, 240) 0           block_1b_bn_2[0][0]
                                                                 block_1a_relu[0][0]
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 136, 240) 0           add_2[0][0]
__________________________________________________________________________________________________
block_1c_conv_1 (Conv2D)        (None, 64, 136, 240) 36928       block_1b_relu[0][0]
__________________________________________________________________________________________________
block_1c_bn_1 (BatchNormalizati (None, 64, 136, 240) 256         block_1c_conv_1[0][0]
__________________________________________________________________________________________________
block_1c_relu_1 (Activation)    (None, 64, 136, 240) 0           block_1c_bn_1[0][0]
__________________________________________________________________________________________________
block_1c_conv_2 (Conv2D)        (None, 64, 136, 240) 36928       block_1c_relu_1[0][0]
__________________________________________________________________________________________________
block_1c_bn_2 (BatchNormalizati (None, 64, 136, 240) 256         block_1c_conv_2[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 64, 136, 240) 0           block_1c_bn_2[0][0]
                                                                 block_1b_relu[0][0]
__________________________________________________________________________________________________
block_1c_relu (Activation)      (None, 64, 136, 240) 0           add_3[0][0]
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 68, 120) 73856       block_1c_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2a_conv_1[0][0]
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2a_bn_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2a_relu_1[0][0]
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 68, 120) 8320        block_1c_relu[0][0]
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2a_conv_2[0][0]
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 68, 120) 512         block_2a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 68, 120) 0           block_2a_bn_2[0][0]
                                                                 block_2a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 68, 120) 0           add_4[0][0]
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 68, 120) 147584      block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_1[0][0]
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2b_bn_1[0][0]
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2b_relu_1[0][0]
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2b_conv_2[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 128, 68, 120) 0           block_2b_bn_2[0][0]
                                                                 block_2a_relu[0][0]
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 68, 120) 0           add_5[0][0]
__________________________________________________________________________________________________
block_2c_conv_1 (Conv2D)        (None, 128, 68, 120) 147584      block_2b_relu[0][0]
__________________________________________________________________________________________________
block_2c_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2c_conv_1[0][0]
__________________________________________________________________________________________________
block_2c_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2c_bn_1[0][0]
__________________________________________________________________________________________________
block_2c_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2c_relu_1[0][0]
__________________________________________________________________________________________________
block_2c_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2c_conv_2[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 128, 68, 120) 0           block_2c_bn_2[0][0]
                                                                 block_2b_relu[0][0]
__________________________________________________________________________________________________
block_2c_relu (Activation)      (None, 128, 68, 120) 0           add_6[0][0]
__________________________________________________________________________________________________
block_2d_conv_1 (Conv2D)        (None, 128, 68, 120) 147584      block_2c_relu[0][0]
__________________________________________________________________________________________________
block_2d_bn_1 (BatchNormalizati (None, 128, 68, 120) 512         block_2d_conv_1[0][0]
__________________________________________________________________________________________________
block_2d_relu_1 (Activation)    (None, 128, 68, 120) 0           block_2d_bn_1[0][0]
__________________________________________________________________________________________________
block_2d_conv_2 (Conv2D)        (None, 128, 68, 120) 147584      block_2d_relu_1[0][0]
__________________________________________________________________________________________________
block_2d_bn_2 (BatchNormalizati (None, 128, 68, 120) 512         block_2d_conv_2[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 128, 68, 120) 0           block_2d_bn_2[0][0]
                                                                 block_2c_relu[0][0]
__________________________________________________________________________________________________
block_2d_relu (Activation)      (None, 128, 68, 120) 0           add_7[0][0]
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 34, 60)  295168      block_2d_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_1[0][0]
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3a_bn_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu_1[0][0]
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 34, 60)  33024       block_2d_relu[0][0]
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3a_conv_2[0][0]
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 34, 60)  1024        block_3a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 256, 34, 60)  0           block_3a_bn_2[0][0]
                                                                 block_3a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 34, 60)  0           add_8[0][0]
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_1[0][0]
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3b_bn_1[0][0]
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3b_relu_1[0][0]
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3b_conv_2[0][0]
__________________________________________________________________________________________________
add_9 (Add)                     (None, 256, 34, 60)  0           block_3b_bn_2[0][0]
                                                                 block_3a_relu[0][0]
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 34, 60)  0           add_9[0][0]
__________________________________________________________________________________________________
block_3c_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3b_relu[0][0]
__________________________________________________________________________________________________
block_3c_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3c_conv_1[0][0]
__________________________________________________________________________________________________
block_3c_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3c_bn_1[0][0]
__________________________________________________________________________________________________
block_3c_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3c_relu_1[0][0]
__________________________________________________________________________________________________
block_3c_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3c_conv_2[0][0]
__________________________________________________________________________________________________
add_10 (Add)                    (None, 256, 34, 60)  0           block_3c_bn_2[0][0]
                                                                 block_3b_relu[0][0]
__________________________________________________________________________________________________
block_3c_relu (Activation)      (None, 256, 34, 60)  0           add_10[0][0]
__________________________________________________________________________________________________
block_3d_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3c_relu[0][0]
__________________________________________________________________________________________________
block_3d_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3d_conv_1[0][0]
__________________________________________________________________________________________________
block_3d_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3d_bn_1[0][0]
__________________________________________________________________________________________________
block_3d_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3d_relu_1[0][0]
__________________________________________________________________________________________________
block_3d_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3d_conv_2[0][0]
__________________________________________________________________________________________________
add_11 (Add)                    (None, 256, 34, 60)  0           block_3d_bn_2[0][0]
                                                                 block_3c_relu[0][0]
__________________________________________________________________________________________________
block_3d_relu (Activation)      (None, 256, 34, 60)  0           add_11[0][0]
__________________________________________________________________________________________________
block_3e_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3d_relu[0][0]
__________________________________________________________________________________________________
block_3e_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3e_conv_1[0][0]
__________________________________________________________________________________________________
block_3e_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3e_bn_1[0][0]
__________________________________________________________________________________________________
block_3e_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3e_relu_1[0][0]
__________________________________________________________________________________________________
block_3e_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3e_conv_2[0][0]
__________________________________________________________________________________________________
add_12 (Add)                    (None, 256, 34, 60)  0           block_3e_bn_2[0][0]
                                                                 block_3d_relu[0][0]
__________________________________________________________________________________________________
block_3e_relu (Activation)      (None, 256, 34, 60)  0           add_12[0][0]
__________________________________________________________________________________________________
block_3f_conv_1 (Conv2D)        (None, 256, 34, 60)  590080      block_3e_relu[0][0]
__________________________________________________________________________________________________
block_3f_bn_1 (BatchNormalizati (None, 256, 34, 60)  1024        block_3f_conv_1[0][0]
__________________________________________________________________________________________________
block_3f_relu_1 (Activation)    (None, 256, 34, 60)  0           block_3f_bn_1[0][0]
__________________________________________________________________________________________________
block_3f_conv_2 (Conv2D)        (None, 256, 34, 60)  590080      block_3f_relu_1[0][0]
__________________________________________________________________________________________________
block_3f_bn_2 (BatchNormalizati (None, 256, 34, 60)  1024        block_3f_conv_2[0][0]
__________________________________________________________________________________________________
add_13 (Add)                    (None, 256, 34, 60)  0           block_3f_bn_2[0][0]
                                                                 block_3e_relu[0][0]
__________________________________________________________________________________________________
block_3f_relu (Activation)      (None, 256, 34, 60)  0           add_13[0][0]
__________________________________________________________________________________________________
block_4a_conv_1 (Conv2D)        (None, 512, 34, 60)  1180160     block_3f_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_1[0][0]
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4a_bn_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu_1[0][0]
__________________________________________________________________________________________________
block_4a_conv_shortcut (Conv2D) (None, 512, 34, 60)  131584      block_3f_relu[0][0]
__________________________________________________________________________________________________
block_4a_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4a_conv_2[0][0]
__________________________________________________________________________________________________
block_4a_bn_shortcut (BatchNorm (None, 512, 34, 60)  2048        block_4a_conv_shortcut[0][0]
__________________________________________________________________________________________________
add_14 (Add)                    (None, 512, 34, 60)  0           block_4a_bn_2[0][0]
                                                                 block_4a_bn_shortcut[0][0]
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 512, 34, 60)  0           add_14[0][0]
__________________________________________________________________________________________________
block_4b_conv_1 (Conv2D)        (None, 512, 34, 60)  2359808     block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_1[0][0]
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4b_bn_1[0][0]
__________________________________________________________________________________________________
block_4b_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4b_relu_1[0][0]
__________________________________________________________________________________________________
block_4b_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4b_conv_2[0][0]
__________________________________________________________________________________________________
add_15 (Add)                    (None, 512, 34, 60)  0           block_4b_bn_2[0][0]
                                                                 block_4a_relu[0][0]
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 512, 34, 60)  0           add_15[0][0]
__________________________________________________________________________________________________
block_4c_conv_1 (Conv2D)        (None, 512, 34, 60)  2359808     block_4b_relu[0][0]
__________________________________________________________________________________________________
block_4c_bn_1 (BatchNormalizati (None, 512, 34, 60)  2048        block_4c_conv_1[0][0]
__________________________________________________________________________________________________
block_4c_relu_1 (Activation)    (None, 512, 34, 60)  0           block_4c_bn_1[0][0]
__________________________________________________________________________________________________
block_4c_conv_2 (Conv2D)        (None, 512, 34, 60)  2359808     block_4c_relu_1[0][0]
__________________________________________________________________________________________________
block_4c_bn_2 (BatchNormalizati (None, 512, 34, 60)  2048        block_4c_conv_2[0][0]
__________________________________________________________________________________________________
add_16 (Add)                    (None, 512, 34, 60)  0           block_4c_bn_2[0][0]
                                                                 block_4b_relu[0][0]
__________________________________________________________________________________________________
block_4c_relu (Activation)      (None, 512, 34, 60)  0           add_16[0][0]
__________________________________________________________________________________________________
output_bbox (Conv2D)            (None, 28, 34, 60)   14364       block_4c_relu[0][0]
__________________________________________________________________________________________________
output_cov (Conv2D)             (None, 7, 34, 60)    3591        block_4c_relu[0][0]
==================================================================================================
Total params: 21,332,579
Trainable params: 21,080,227
Non-trainable params: 252,352
__________________________________________________________________________________________________
2023-05-29 14:11:45,701 [INFO] root: DetectNet V2 model built.
2023-05-29 14:11:45,702 [INFO] root: Building rasterizer.
2023-05-29 14:11:45,702 [INFO] root: Rasterizers built.
2023-05-29 14:11:45,716 [INFO] root: Building training graph.
2023-05-29 14:11:45,717 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2023-05-29 14:11:45,717 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2023-05-29 14:11:45,717 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2023-05-29 14:11:45,717 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 64, io threads: 64, compute threads: 32, buffered batches: 4
2023-05-29 14:11:45,717 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 47494, number of sources: 1, batch size per gpu: 24, steps: 990
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
2023-05-29 14:11:45,752 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:45,786 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:45,800 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2023-05-29 14:11:45.828288: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:45.828447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties:
name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505
pciBusID: 0000:21:00.0
2023-05-29 14:11:45.828510: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:45.828629: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 1 with properties:
name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505
pciBusID: 0000:22:00.0
2023-05-29 14:11:45.828644: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:45.828686: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:11:45.828699: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-05-29 14:11:45.828718: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-05-29 14:11:45.828728: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:11:45.828740: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-05-29 14:11:45.828751: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-05-29 14:11:45.828802: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:45.828986: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:45.829140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:45.829310: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:45.829433: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0, 1
2023-05-29 14:11:45,981 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 2
2023-05-29 14:11:45,986 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2023-05-29 14:11:45,986 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:45,998 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:46,259 [INFO] __main__: Found 47494 samples in training set
2023-05-29 14:11:46,259 [INFO] root: Rasterizing tensors.
2023-05-29 14:11:46,438 [INFO] root: Tensors rasterized.
2023-05-29 14:11:49,002 [INFO] root: Training graph built.
2023-05-29 14:11:49,002 [INFO] root: Building validation graph.
2023-05-29 14:11:49,002 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2023-05-29 14:11:49,002 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2023-05-29 14:11:49,003 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2023-05-29 14:11:49,003 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 64, io threads: 128, compute threads: 64, buffered batches: 4
2023-05-29 14:11:49,003 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 11873, number of sources: 1, batch size per gpu: 24, steps: 495
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:49,012 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:49,026 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2023-05-29 14:11:49,193 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1
2023-05-29 14:11:49,197 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2023-05-29 14:11:49,197 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:49,208 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:49,374 [INFO] __main__: Found 11873 samples in validation set
2023-05-29 14:11:49,374 [INFO] root: Rasterizing tensors.
2023-05-29 14:11:49,556 [INFO] root: Tensors rasterized.
2023-05-29 14:11:49,982 [INFO] root: Validation graph built.
2023-05-29 14:11:51,405 [INFO] root: Running training loop.
2023-05-29 14:11:51,405 [INFO] __main__: Checkpoint interval: 10
2023-05-29 14:11:51,406 [INFO] __main__: Scalars logged at every 99 steps
2023-05-29 14:11:51,406 [INFO] __main__: Images logged at every 0 steps
INFO:tensorflow:Create CheckpointSaverHook.
2023-05-29 14:11:51,408 [INFO] tensorflow: Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
2023-05-29 14:11:54,910 [INFO] tensorflow: Graph was finalized.
2023-05-29 14:11:54.911551: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:54.911728: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties:
name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505
pciBusID: 0000:21:00.0
2023-05-29 14:11:54.911757: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:54.911809: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:11:54.911826: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-05-29 14:11:54.911838: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-05-29 14:11:54.911850: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:11:54.911870: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-05-29 14:11:54.911880: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-05-29 14:11:54.911968: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:54.912148: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:54.912257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0
2023-05-29 14:11:55.285505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-05-29 14:11:55.285549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215]      0
2023-05-29 14:11:55.285555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 0:   N
2023-05-29 14:11:55.285843: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:55.286067: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:55.286194: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 46288 MB memory) -> physical GPU (device: 0, name: NVIDIA RTX 6000 Ada Generation, pci bus id: 0000:21:00.0, compute capability: 8.9)
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
2023-05-29 14:11:57,333 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
__________________________________________________________________________________________________
Layer (type)                     Output Shape           Param #   Connected to
==================================================================================================
input_1 (InputLayer)             (None, 3, 544, 960)    0
conv1 (Conv2D)                   (None, 64, 272, 480)   9472      input_1[0][0]
bn_conv1 (BatchNormalization)    (None, 64, 272, 480)   256       conv1[0][0]
activation_1 (Activation)        (None, 64, 272, 480)   0         bn_conv1[0][0]
block_1a_conv_1 (Conv2D)         (None, 64, 136, 240)   36928     activation_1[0][0]
block_1a_bn_1 (BatchNormalizati  (None, 64, 136, 240)   256       block_1a_conv_1[0][0]
block_1a_relu_1 (Activation)     (None, 64, 136, 240)   0         block_1a_bn_1[0][0]
block_1a_conv_2 (Conv2D)         (None, 64, 136, 240)   36928     block_1a_relu_1[0][0]
block_1a_conv_shortcut (Conv2D)  (None, 64, 136, 240)   4160      activation_1[0][0]
block_1a_bn_2 (BatchNormalizati  (None, 64, 136, 240)   256       block_1a_conv_2[0][0]
block_1a_bn_shortcut (BatchNorm  (None, 64, 136, 240)   256       block_1a_conv_shortcut[0][0]
add_1 (Add)                      (None, 64, 136, 240)   0         block_1a_bn_2[0][0], block_1a_bn_shortcut[0][0]
block_1a_relu (Activation)       (None, 64, 136, 240)   0         add_1[0][0]
block_1b_conv_1 (Conv2D)         (None, 64, 136, 240)   36928     block_1a_relu[0][0]
block_1b_bn_1 (BatchNormalizati  (None, 64, 136, 240)   256       block_1b_conv_1[0][0]
block_1b_relu_1 (Activation)     (None, 64, 136, 240)   0         block_1b_bn_1[0][0]
block_1b_conv_2 (Conv2D)         (None, 64, 136, 240)   36928     block_1b_relu_1[0][0]
block_1b_bn_2 (BatchNormalizati  (None, 64, 136, 240)   256       block_1b_conv_2[0][0]
add_2 (Add)                      (None, 64, 136, 240)   0         block_1b_bn_2[0][0], block_1a_relu[0][0]
block_1b_relu (Activation)       (None, 64, 136, 240)   0         add_2[0][0]
block_1c_conv_1 (Conv2D)         (None, 64, 136, 240)   36928     block_1b_relu[0][0]
block_1c_bn_1 (BatchNormalizati  (None, 64, 136, 240)   256       block_1c_conv_1[0][0]
block_1c_relu_1 (Activation)     (None, 64, 136, 240)   0         block_1c_bn_1[0][0]
block_1c_conv_2 (Conv2D)         (None, 64, 136, 240)   36928     block_1c_relu_1[0][0]
block_1c_bn_2 (BatchNormalizati  (None, 64, 136, 240)   256       block_1c_conv_2[0][0]
add_3 (Add)                      (None, 64, 136, 240)   0         block_1c_bn_2[0][0], block_1b_relu[0][0]
block_1c_relu (Activation)       (None, 64, 136, 240)   0         add_3[0][0]
block_2a_conv_1 (Conv2D)         (None, 128, 68, 120)   73856     block_1c_relu[0][0]
block_2a_bn_1 (BatchNormalizati  (None, 128, 68, 120)   512       block_2a_conv_1[0][0]
block_2a_relu_1 (Activation)     (None, 128, 68, 120)   0         block_2a_bn_1[0][0]
block_2a_conv_2 (Conv2D)         (None, 128, 68, 120)   147584    block_2a_relu_1[0][0]
block_2a_conv_shortcut (Conv2D)  (None, 128, 68, 120)   8320      block_1c_relu[0][0]
block_2a_bn_2 (BatchNormalizati  (None, 128, 68, 120)   512       block_2a_conv_2[0][0]
block_2a_bn_shortcut (BatchNorm  (None, 128, 68, 120)   512       block_2a_conv_shortcut[0][0]
add_4 (Add)                      (None, 128, 68, 120)   0         block_2a_bn_2[0][0], block_2a_bn_shortcut[0][0]
block_2a_relu (Activation)       (None, 128, 68, 120)   0         add_4[0][0]
block_2b_conv_1 (Conv2D)         (None, 128, 68, 120)   147584    block_2a_relu[0][0]
block_2b_bn_1 (BatchNormalizati  (None, 128, 68, 120)   512       block_2b_conv_1[0][0]
block_2b_relu_1 (Activation)     (None, 128, 68, 120)   0         block_2b_bn_1[0][0]
block_2b_conv_2 (Conv2D)         (None, 128, 68, 120)   147584    block_2b_relu_1[0][0]
block_2b_bn_2 (BatchNormalizati  (None, 128, 68, 120)   512       block_2b_conv_2[0][0]
add_5 (Add)                      (None, 128, 68, 120)   0         block_2b_bn_2[0][0], block_2a_relu[0][0]
block_2b_relu (Activation)       (None, 128, 68, 120)   0         add_5[0][0]
block_2c_conv_1 (Conv2D)         (None, 128, 68, 120)   147584    block_2b_relu[0][0]
block_2c_bn_1 (BatchNormalizati  (None, 128, 68, 120)   512       block_2c_conv_1[0][0]
block_2c_relu_1 (Activation)     (None, 128, 68, 120)   0         block_2c_bn_1[0][0]
block_2c_conv_2 (Conv2D)         (None, 128, 68, 120)   147584    block_2c_relu_1[0][0]
block_2c_bn_2 (BatchNormalizati  (None, 128, 68, 120)   512       block_2c_conv_2[0][0]
add_6 (Add)                      (None, 128, 68, 120)   0         block_2c_bn_2[0][0], block_2b_relu[0][0]
block_2c_relu (Activation)       (None, 128, 68, 120)   0         add_6[0][0]
block_2d_conv_1 (Conv2D)         (None, 128, 68, 120)   147584    block_2c_relu[0][0]
block_2d_bn_1 (BatchNormalizati  (None, 128, 68, 120)   512       block_2d_conv_1[0][0]
block_2d_relu_1 (Activation)     (None, 128, 68, 120)   0         block_2d_bn_1[0][0]
block_2d_conv_2 (Conv2D)         (None, 128, 68, 120)   147584    block_2d_relu_1[0][0]
block_2d_bn_2 (BatchNormalizati  (None, 128, 68, 120)   512       block_2d_conv_2[0][0]
add_7 (Add)                      (None, 128, 68, 120)   0         block_2d_bn_2[0][0], block_2c_relu[0][0]
block_2d_relu (Activation)       (None, 128, 68, 120)   0         add_7[0][0]
block_3a_conv_1 (Conv2D)         (None, 256, 34, 60)    295168    block_2d_relu[0][0]
block_3a_bn_1 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3a_conv_1[0][0]
block_3a_relu_1 (Activation)     (None, 256, 34, 60)    0         block_3a_bn_1[0][0]
block_3a_conv_2 (Conv2D)         (None, 256, 34, 60)    590080    block_3a_relu_1[0][0]
block_3a_conv_shortcut (Conv2D)  (None, 256, 34, 60)    33024     block_2d_relu[0][0]
block_3a_bn_2 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3a_conv_2[0][0]
block_3a_bn_shortcut (BatchNorm  (None, 256, 34, 60)    1024      block_3a_conv_shortcut[0][0]
add_8 (Add)                      (None, 256, 34, 60)    0         block_3a_bn_2[0][0], block_3a_bn_shortcut[0][0]
block_3a_relu (Activation)       (None, 256, 34, 60)    0         add_8[0][0]
block_3b_conv_1 (Conv2D)         (None, 256, 34, 60)    590080    block_3a_relu[0][0]
block_3b_bn_1 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3b_conv_1[0][0]
block_3b_relu_1 (Activation)     (None, 256, 34, 60)    0         block_3b_bn_1[0][0]
block_3b_conv_2 (Conv2D)         (None, 256, 34, 60)    590080    block_3b_relu_1[0][0]
block_3b_bn_2 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3b_conv_2[0][0]
add_9 (Add)                      (None, 256, 34, 60)    0         block_3b_bn_2[0][0], block_3a_relu[0][0]
block_3b_relu (Activation)       (None, 256, 34, 60)    0         add_9[0][0]
block_3c_conv_1 (Conv2D)         (None, 256, 34, 60)    590080    block_3b_relu[0][0]
block_3c_bn_1 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3c_conv_1[0][0]
block_3c_relu_1 (Activation)     (None, 256, 34, 60)    0         block_3c_bn_1[0][0]
block_3c_conv_2 (Conv2D)         (None, 256, 34, 60)    590080    block_3c_relu_1[0][0]
block_3c_bn_2 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3c_conv_2[0][0]
add_10 (Add)                     (None, 256, 34, 60)    0         block_3c_bn_2[0][0], block_3b_relu[0][0]
block_3c_relu (Activation)       (None, 256, 34, 60)    0         add_10[0][0]
block_3d_conv_1 (Conv2D)         (None, 256, 34, 60)    590080    block_3c_relu[0][0]
block_3d_bn_1 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3d_conv_1[0][0]
block_3d_relu_1 (Activation)     (None, 256, 34, 60)    0         block_3d_bn_1[0][0]
block_3d_conv_2 (Conv2D)         (None, 256, 34, 60)    590080    block_3d_relu_1[0][0]
block_3d_bn_2 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3d_conv_2[0][0]
add_11 (Add)                     (None, 256, 34, 60)    0         block_3d_bn_2[0][0], block_3c_relu[0][0]
block_3d_relu (Activation)       (None, 256, 34, 60)    0         add_11[0][0]
block_3e_conv_1 (Conv2D)         (None, 256, 34, 60)    590080    block_3d_relu[0][0]
block_3e_bn_1 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3e_conv_1[0][0]
block_3e_relu_1 (Activation)     (None, 256, 34, 60)    0         block_3e_bn_1[0][0]
block_3e_conv_2 (Conv2D)         (None, 256, 34, 60)    590080    block_3e_relu_1[0][0]
block_3e_bn_2 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3e_conv_2[0][0]
add_12 (Add)                     (None, 256, 34, 60)    0         block_3e_bn_2[0][0], block_3d_relu[0][0]
block_3e_relu (Activation)       (None, 256, 34, 60)    0         add_12[0][0]
block_3f_conv_1 (Conv2D)         (None, 256, 34, 60)    590080    block_3e_relu[0][0]
block_3f_bn_1 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3f_conv_1[0][0]
block_3f_relu_1 (Activation)     (None, 256, 34, 60)    0         block_3f_bn_1[0][0]
block_3f_conv_2 (Conv2D)         (None, 256, 34, 60)    590080    block_3f_relu_1[0][0]
block_3f_bn_2 (BatchNormalizati  (None, 256, 34, 60)    1024      block_3f_conv_2[0][0]
add_13 (Add)                     (None, 256, 34, 60)    0         block_3f_bn_2[0][0], block_3e_relu[0][0]
block_3f_relu (Activation)       (None, 256, 34, 60)    0         add_13[0][0]
block_4a_conv_1 (Conv2D)         (None, 512, 34, 60)    1180160   block_3f_relu[0][0]
block_4a_bn_1 (BatchNormalizati  (None, 512, 34, 60)    2048      block_4a_conv_1[0][0]
block_4a_relu_1 (Activation)     (None, 512, 34, 60)    0         block_4a_bn_1[0][0]
block_4a_conv_2 (Conv2D)         (None, 512, 34, 60)    2359808   block_4a_relu_1[0][0]
block_4a_conv_shortcut (Conv2D)  (None, 512, 34, 60)    131584    block_3f_relu[0][0]
block_4a_bn_2 (BatchNormalizati  (None, 512, 34, 60)    2048      block_4a_conv_2[0][0]
block_4a_bn_shortcut (BatchNorm  (None, 512, 34, 60)    2048      block_4a_conv_shortcut[0][0]
add_14 (Add)                     (None, 512, 34, 60)    0         block_4a_bn_2[0][0], block_4a_bn_shortcut[0][0]
block_4a_relu (Activation)       (None, 512, 34, 60)    0         add_14[0][0]
block_4b_conv_1 (Conv2D)         (None, 512, 34, 60)    2359808   block_4a_relu[0][0]
block_4b_bn_1 (BatchNormalizati  (None, 512, 34, 60)    2048      block_4b_conv_1[0][0]
block_4b_relu_1 (Activation)     (None, 512, 34, 60)    0         block_4b_bn_1[0][0]
block_4b_conv_2 (Conv2D)         (None, 512, 34, 60)    2359808   block_4b_relu_1[0][0]
block_4b_bn_2 (BatchNormalizati  (None, 512, 34, 60)    2048      block_4b_conv_2[0][0]
add_15 (Add)                     (None, 512, 34, 60)    0         block_4b_bn_2[0][0], block_4a_relu[0][0]
block_4b_relu (Activation)       (None, 512, 34, 60)    0         add_15[0][0]
block_4c_conv_1 (Conv2D)         (None, 512, 34, 60)    2359808   block_4b_relu[0][0]
block_4c_bn_1 (BatchNormalizati  (None, 512, 34, 60)    2048      block_4c_conv_1[0][0]
block_4c_relu_1 (Activation)     (None, 512, 34, 60)    0         block_4c_bn_1[0][0]
block_4c_conv_2 (Conv2D)         (None, 512, 34, 60)    2359808   block_4c_relu_1[0][0]
block_4c_bn_2 (BatchNormalizati  (None, 512, 34, 60)    2048      block_4c_conv_2[0][0]
add_16 (Add)                     (None, 512, 34, 60)    0         block_4c_bn_2[0][0], block_4b_relu[0][0]
block_4c_relu (Activation)       (None, 512, 34, 60)    0         add_16[0][0]
output_bbox (Conv2D)             (None, 28, 34, 60)     14364     block_4c_relu[0][0]
output_cov (Conv2D)              (None, 7, 34, 60)      3591      block_4c_relu[0][0]
==================================================================================================
Total params: 21,332,579
Trainable params: 21,080,227
Non-trainable params: 252,352
__________________________________________________________________________________________________
2023-05-29 14:11:57,371 [INFO] root: DetectNet V2 model built.
2023-05-29 14:11:57,371 [INFO] root: Building rasterizer.
2023-05-29 14:11:57,372 [INFO] root: Rasterizers built.
2023-05-29 14:11:57,389 [INFO] root: Building training graph.
2023-05-29 14:11:57,390 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2023-05-29 14:11:57,390 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2023-05-29 14:11:57,390 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2023-05-29 14:11:57,390 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 64, io threads: 64, compute threads: 32, buffered batches: 4
2023-05-29 14:11:57,390 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 47494, number of sources: 1, batch size per gpu: 24, steps: 990
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
2023-05-29 14:11:57,433 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:57,474 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
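The Param # column in the summary above is plain arithmetic, which allows a quick sanity check. A small verification sketch (my own code, not TAO's; the 7x7/3x3/1x1 kernel sizes are assumptions based on the standard ResNet layout these layer names follow):

# Conv2D params = kh * kw * in_channels * out_channels + out_channels (bias);
# BatchNormalization params = 4 * channels (gamma, beta, moving mean/variance).
def conv_params(kh, kw, cin, cout):
    return kh * kw * cin * cout + cout

assert conv_params(7, 7, 3, 64) == 9472       # conv1 (assumed 7x7 stem)
assert conv_params(3, 3, 64, 64) == 36928     # block_1a_conv_1
assert conv_params(1, 1, 64, 64) == 4160      # block_1a_conv_shortcut (1x1 projection)
assert conv_params(1, 1, 512, 28) == 14364    # output_bbox
assert conv_params(1, 1, 512, 7) == 3591      # output_cov
assert 4 * 64 == 256                          # bn_conv1

The head sizes are consistent with 7 target classes: output_cov has one coverage channel per class, and output_bbox has 7 * 4 = 28 box-coordinate channels.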
2023-05-29 14:11:57,490 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2023-05-29 14:11:57.521650: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:57.521860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505 pciBusID: 0000:21:00.0
2023-05-29 14:11:57.521988: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:57.522133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 1 with properties: name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505 pciBusID: 0000:22:00.0
2023-05-29 14:11:57.522153: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:11:57.522229: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:11:57.522260: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-05-29 14:11:57.522284: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-05-29 14:11:57.522304: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:11:57.522324: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-05-29 14:11:57.522343: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-05-29 14:11:57.522434: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:57.522639: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:57.522854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:57.523054: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:11:57.523184: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 0, 1
2023-05-29 14:11:57,702 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 1 of 2
2023-05-29 14:11:57,707 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2023-05-29 14:11:57,707 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
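The "shuffle: True - shard 1 of 2" line shows how the training set is split across the two mpirun ranks: each process reads a disjoint half. In plain tf.data terms the underlying operation is sharding, sketched below as the general mechanism rather than TAO's exact dataloader code (the file name is hypothetical):

import tensorflow as tf

# Rank 1 of 2 keeps every second record, so the two processes see
# disjoint halves of the same source.
dataset = tf.data.TFRecordDataset(["train.tfrecords"])
dataset = dataset.shard(num_shards=2, index=1)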
WARNING:tensorflow:Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:57,720 [WARNING] tensorflow: Entity > could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of >. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2023-05-29 14:11:58,020 [INFO] __main__: Found 47494 samples in training set
2023-05-29 14:11:58,020 [INFO] root: Rasterizing tensors.
INFO:tensorflow:Running local_init_op.
2023-05-29 14:11:58,047 [INFO] tensorflow: Running local_init_op.
2023-05-29 14:11:58,231 [INFO] root: Tensors rasterized.
INFO:tensorflow:Done running local_init_op.
2023-05-29 14:11:58,689 [INFO] tensorflow: Done running local_init_op.
2023-05-29 14:12:01,211 [INFO] root: Training graph built.
2023-05-29 14:12:01,211 [INFO] root: Running training loop.
2023-05-29 14:12:01,211 [INFO] __main__: Checkpoint interval: 10
2023-05-29 14:12:01,212 [INFO] __main__: Scalars logged at every 99 steps
2023-05-29 14:12:01,212 [INFO] __main__: Images logged at every 0 steps
INFO:tensorflow:Graph was finalized.
2023-05-29 14:12:03,050 [INFO] tensorflow: Graph was finalized.
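The step count logged earlier ("steps: 990") follows directly from these numbers: 47494 training samples at 24 samples per GPU across 2 GPUs gives a global batch of 48, and the last partial batch still counts as a step. As a one-liner:

import math

# samples / (batch_size_per_gpu * num_gpus), rounded up
steps_per_epoch = math.ceil(47494 / (24 * 2))
print(steps_per_epoch)  # 990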
2023-05-29 14:12:03.051059: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:12:03.051242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1669] Found device 0 with properties: name: NVIDIA RTX 6000 Ada Generation major: 8 minor: 9 memoryClockRate(GHz): 2.505 pciBusID: 0000:22:00.0
2023-05-29 14:12:03.051269: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-05-29 14:12:03.051322: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:12:03.051341: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2023-05-29 14:12:03.051354: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2023-05-29 14:12:03.051367: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:12:03.051378: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2023-05-29 14:12:03.051390: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2023-05-29 14:12:03.051475: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:12:03.051659: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:12:03.051776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1797] Adding visible gpu devices: 1
2023-05-29 14:12:03.432468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1209] Device interconnect StreamExecutor with strength 1 edge matrix:
2023-05-29 14:12:03.432508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1215] 1
2023-05-29 14:12:03.432514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1228] 1: N
2023-05-29 14:12:03.432777: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:12:03.433031: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1082] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-05-29 14:12:03.433151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1354] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 46279 MB memory) -> physical GPU (device: 1, name: NVIDIA RTX 6000 Ada Generation, pci bus id: 0000:22:00.0, compute capability: 8.9)
INFO:tensorflow:Running local_init_op.
2023-05-29 14:12:05,830 [INFO] tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2023-05-29 14:12:06,267 [INFO] tensorflow: Done running local_init_op.
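The repeated "successful NUMA node read from SysFS had negative value (-1)" messages are cosmetic: on single-socket machines the kernel records no NUMA affinity for PCI devices, and TensorFlow simply falls back to node 0. The value it reads can be inspected directly; a sketch, using the bus ID of GPU 0 taken from the log above:

# Inspect the sysfs attribute TensorFlow is warning about; -1 means the
# kernel recorded no NUMA affinity for the device at 0000:21:00.0.
with open("/sys/bus/pci/devices/0000:21:00.0/numa_node") as f:
    print(f.read().strip())  # typically prints -1 on single-socket systems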
2023-05-29 14:12:12.930899: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:12:14.220481: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x8026bd0
2023-05-29 14:12:14.220811: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:12:14.510065: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:12:14.520451: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
INFO:tensorflow:Saving checkpoints for step-0.
2023-05-29 14:12:19,091 [INFO] tensorflow: Saving checkpoints for step-0.
2023-05-29 14:12:55.715127: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:12:56.412712: I tensorflow/core/kernels/cuda_solvers.cc:159] Creating CudaSolver handles for stream 0x8084110
2023-05-29 14:12:56.412883: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2023-05-29 14:12:56.470997: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2023-05-29 14:12:56.508138: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
cc8b63e0b034:189753:189771 [0] NCCL INFO Bootstrap : Using eth0:172.17.0.2<0>
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v6 symbol.
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin (v5)
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v6 symbol.
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/Plugin: Loaded coll plugin SHARP (v5)
cc8b63e0b034:189753:189771 [0] NCCL INFO cudaDriverVersion 12000
NCCL version 2.15.5+cuda11.8
cc8b63e0b034:189753:189771 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
cc8b63e0b034:189753:189771 [0] NCCL INFO P2P plugin IBext
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/IB : No device found.
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/IB : No device found.
cc8b63e0b034:189753:189771 [0] NCCL INFO NET/Socket : Using [0]eth0:172.17.0.2<0>
cc8b63e0b034:189753:189771 [0] NCCL INFO Using network Socket
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 00/04 : 0 1
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 01/04 : 0 1
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 02/04 : 0 1
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 03/04 : 0 1
cc8b63e0b034:189753:189771 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] -1/-1/-1->0->1 [2] 1/-1/-1->0->-1 [3] -1/-1/-1->0->1
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 00/0 : 0[21000] -> 1[22000] via P2P/IPC
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 01/0 : 0[21000] -> 1[22000] via P2P/IPC
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 02/0 : 0[21000] -> 1[22000] via P2P/IPC
cc8b63e0b034:189753:189771 [0] NCCL INFO Channel 03/0 : 0[21000] -> 1[22000] via P2P/IPC
cc8b63e0b034:189753:189771 [0] NCCL INFO Connected all rings
cc8b63e0b034:189753:189771 [0] NCCL INFO Connected all trees
cc8b63e0b034:189753:189771 [0] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512
cc8b63e0b034:189753:189771 [0] NCCL INFO 4 coll channels, 4 p2p channels, 2 p2p channels per peer
cc8b63e0b034:189753:189771 [0] NCCL INFO comm 0x7f13aba24500 rank 0 nranks 2 cudaDev 0 busId 21000 - Init COMPLETE
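The NCCL init completes cleanly: with no InfiniBand device in the container ("NET/IB : No device found"), bootstrap falls back to sockets over eth0, while the actual gradient traffic between the two GPUs goes over P2P/IPC, which is what you want here. The verbosity of these lines is controlled by standard NCCL environment variables; a sketch using real NCCL variables, which only take effect if set before the communicator is created (in practice, exported in the shell before mpirun):

import os

# NCCL_DEBUG=INFO produces the NCCL INFO lines seen above;
# NCCL_DEBUG_SUBSYS narrows the output to chosen subsystems.
os.environ["NCCL_DEBUG"] = "INFO"
os.environ["NCCL_DEBUG_SUBSYS"] = "INIT,NET"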