Hello everyone, I am using the NGC container nvcr.io/nvidia/tensorrt:19.07-py3 to test some TF-TRT scripts. However, it seems TensorFlow is not detecting any GPU. I get this message: Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
root@0ea1cf44ee27:/data/frozen_graphs/tensorrt-experiment# python optimize_model.py to_optimize/frozen_inference_graph.pb
Optimizing graph: to_optimize/frozen_inference_graph.pb
WARN: Output directory found, it will be overwritten
2019-08-08 19:13:06.876394: I tensorflow/core/grappler/devices.cc:60] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA support)
2019-08-08 19:13:06.877314: I tensorflow/core/grappler/clusters/single_machine.cc:359] Starting new session
2019-08-08 19:13:06.878060: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-08 19:13:06.905308: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300030000 Hz
2019-08-08 19:13:06.905860: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x15ec95e0 executing computations on platform Host. Devices:
2019-08-08 19:13:06.905890: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-08-08 19:13:11.177284: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: tf_graph
2019-08-08 19:13:11.177336: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 5078 nodes (-1029), 6515 edges (-1043), time = 2853.16ms.
2019-08-08 19:13:11.177356: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] layout: layout did nothing. time = 5.245ms.
2019-08-08 19:13:11.177367: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 5078 nodes (0), 6515 edges (0), time = 614.901ms.
WARNING: Logging before flag parsing goes to stderr.
W0808 19:13:12.603023 139966277490496 deprecation_wrapper.py:119] From optimize_model.py:39: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.
Done. Find the optimized model in optimized_model/ folder
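Note the hint in the log above: "TensorFlow was not compiled with CUDA support". A quick way to confirm whether the installed TensorFlow wheel is a CUDA build (independent of my script) is the sketch below; is_built_with_cuda and list_local_devices are standard TF APIs, nothing here is specific to my model:

```python
import tensorflow as tf
from tensorflow.python.client import device_lib

# False means the installed wheel is a CPU-only build, regardless of
# what nvidia-smi shows on the host.
print("Built with CUDA:", tf.test.is_built_with_cuda())

# A working GPU setup should list at least one device_type == "GPU" entry.
devices = device_lib.list_local_devices()
for d in devices:
    print(d.device_type, d.name)
```

If "Built with CUDA" prints False, the fix is the TensorFlow install, not the script.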
The code I am running to obtain a TensorRT-optimized graph is below:
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
import pathlib
import argparse

from frame_config import BATCH_SIZE

parser = argparse.ArgumentParser()
parser.add_argument('graph_path', help="Path to the frozen graph you want to optimize")
args = parser.parse_args()

print('Optimizing graph: {}'.format(args.graph_path))

output_dir = "optimized_model/"
batch_size = BATCH_SIZE
workspace_size = 1 << 32
precision_mode = "FP16"
output_tensors = ["num_detections", "detection_boxes", "detection_scores", "detection_classes"]

output_path = pathlib.Path(output_dir)
if output_path.exists():
    print('WARN: Output directory found, it will be overwritten')
output_path.mkdir(parents=True, exist_ok=True)

# Load the frozen GraphDef
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile(args.graph_path, 'rb') as f:
    graph_def.ParseFromString(f.read())

# Build the TensorRT-optimized graph
trt_graph_def = trt.create_inference_graph(
    graph_def, output_tensors,
    max_batch_size=batch_size,
    is_dynamic_op=True,
    max_workspace_size_bytes=workspace_size,
    minimum_segment_size=4,
    maximum_cached_engines=100,
    precision_mode=precision_mode)

# Serialize the optimized graph; open in binary mode since it is a protobuf
out_graph_name = str(output_path / 'frozen_inference_graph.pb')
with tf.io.gfile.GFile(out_graph_name, "wb") as f:
    f.write(trt_graph_def.SerializeToString())

print('Done. Find the optimized model in {} folder'.format(output_dir))
I can run inference, but it seems the GPU is not being used. When I run the nvidia-smi command I can see my GPU:
root@0ea1cf44ee27:/data/frozen_graphs/tensorrt-experiment# nvidia-smi
Thu Aug 8 19:18:27 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.40.04 Driver Version: 418.40.04 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:00:1E.0 Off | 0 |
| N/A 40C P0 26W / 300W | 0MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
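To see where TensorFlow is actually placing ops at runtime, I also tried a minimal placement check with log_device_placement enabled. This is a generic sketch (a tiny MatMul, not my model): on a CUDA build with a visible GPU the log shows ops on /device:GPU:0, while on a CPU-only build everything stays on CPU:

```python
import tensorflow as tf

# Needed on TF 2.x so the v1 Session API below works; harmless on TF 1.14
tf.compat.v1.disable_eager_execution()

# log_device_placement prints the device chosen for every op to stderr
config = tf.compat.v1.ConfigProto(log_device_placement=True)
with tf.compat.v1.Session(config=config) as sess:
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # identity matrix
    result = sess.run(tf.matmul(a, b))
    print(result)  # [[1. 2.] [3. 4.]]
```

In my case every op is placed on CPU, which matches the "not compiled with CUDA support" message above.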
Does anybody know what is happening? Thanks for your help in advance.