Hi,
I am trying to run a resnet101-based GraphDef model using nvinferserver. There seems to be an issue during initialization. It throws the following error:
ERROR: infer_trtis_server.cpp:202 TRTIS: failed to get response status, trtis_err_str:INTERNAL, err_msg:2 root error(s) found.
(0) Failed precondition: Attempting to use uninitialized value stage2_unit1_sc_bias
[[{{node stage2_unit1_sc_bias/read}}]]
[[fc1/add_1/_3]]
(1) Failed precondition: Attempting to use uninitialized value stage2_unit1_sc_bias
[[{{node stage2_unit1_sc_bias/read}}]]
0 successful operations.
0 derived errors ignored.
I face a similar error if I do not include the following two lines (the init lookup and sess.run) in the Python script:
def tf_init(tf_model_file):
    ...
    with tf.Graph().as_default() as graph:
        ...
        init = graph.get_operation_by_name("import/init")
        sess.run(init)
        ...
It throws an error:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value import/stage2_unit2_bn1_scale
[[{{node import/stage2_unit2_bn1_scale/read}}]]
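For reference, the initialization step described above can be sketched roughly as follows. This is only a minimal sketch, not the actual script: the helper name load_graph_and_init is hypothetical, and it assumes the exported GraphDef contains an op named "init" that becomes "import/init" after import, and that the session is created on that same graph.

```python
import tensorflow.compat.v1 as tf  # TF1-style graph API; also importable on TF 1.15


def load_graph_and_init(tf_model_file, init_op_name="import/init"):
    """Import a GraphDef under the "import" prefix and run its init op.

    Hypothetical helper: assumes the exported GraphDef contains an op
    named "init" that initializes all variables.
    """
    # Parse the serialized GraphDef from disk.
    graph_def = tf.GraphDef()
    with tf.io.gfile.GFile(tf_model_file, "rb") as f:
        graph_def.ParseFromString(f.read())

    # Import into a fresh graph; every node name gains the "import/" prefix.
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="import")

    # Variables hold no values until the init op has run in this session.
    sess = tf.Session(graph=graph)
    init = graph.get_operation_by_name(init_op_name)
    sess.run(init)
    return graph, sess
```

If this step does not run before inference, reading any variable fails with the FailedPreconditionError shown above.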
Thanks.
Hi,
We can run your model successfully on a Tesla P4 or T4 card. Please let me know whether the issue in your description is fixed or not.
root@27c40e1cfcbe:~/workspace/models-master# CUDA_VISIBLE_DEVICES=1 python infer.py
init
Tensor("import/data:0", shape=(?, 224, 224, 3), dtype=float32) Tensor("import/softmax:0", shape=(?, 1000), dtype=float32)
2020-11-25 10:34:46.178999: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-25 10:34:46.206023: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2394595000 Hz
2020-11-25 10:34:46.206253: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x992a5e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-25 10:34:46.206287: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-25 10:34:46.209399: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-25 10:34:46.332165: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x8f8e670 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-25 10:34:46.332226: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P4, Compute Capability 6.1
2020-11-25 10:34:46.333680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: Tesla P4 major: 6 minor: 1 memoryClockRate(GHz): 1.1135
pciBusID: 0000:04:00.0
2020-11-25 10:34:46.334261: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-25 10:34:46.336358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-25 10:34:46.338268: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-25 10:34:46.338862: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-25 10:34:46.341406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-25 10:34:46.343472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-25 10:34:46.348319: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-25 10:34:46.349768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-11-25 10:34:46.349836: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-25 10:34:46.351017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-25 10:34:46.351039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
2020-11-25 10:34:46.351061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
2020-11-25 10:34:46.352481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5328 MB memory) -> physical GPU (device: 0, name: Tesla P4, pci bus id: 0000:04:00.0, compute capability: 6.1)
init-exit
1.15.2
(1, 224, 224, 3)
2020-11-25 10:34:49.038501: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-25 10:34:49.173622: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-25 10:34:50.447730: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.27GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-11-25 10:34:50.456144: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.27GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-11-25 10:34:50.499009: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.27GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-11-25 10:34:50.507382: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.27GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
tf_processing : exit!!