Hi,
I am setting up the Triton Inference Server by following the instructions in the quickstart guide:
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/quickstart.html
My setup is as follows:
GPU: RTX 2080 Ti
CUDA version: 10.2
Driver: 450.57
cuDNN: libcudnn8_8.0.2.39-1+cuda10.2_amd64
The inference server appears to start and listen as described in the guide. However, when I test it with curl, I do not receive a 200 response from the server; I get a 400 Bad Request instead:
Command: curl -v localhost:8000/v2/health/ready
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 400 Bad Request
< Content-Length: 0
< Content-Type: text/plain
<
* Connection #0 to host localhost left intact
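For anyone who wants to reproduce the check without curl, here is a minimal Python sketch of the same request. The function name `check_ready` and its defaults are my own choices for illustration and are not part of Triton or the guide; the path is a parameter so that other endpoints can be probed the same way.

```python
import urllib.error
import urllib.request


def check_ready(base_url="http://localhost:8000", path="/v2/health/ready", timeout=5.0):
    """Return the HTTP status code the server sends for a health-check path.

    Hypothetical helper: base_url and path default to the values used in the
    curl command above. A refused connection still raises urllib.error.URLError.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is what we want.
        return err.code
```

Against my setup I would expect `check_ready()` to return 400, matching the curl output above, rather than the 200 the guide describes.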
Please help me with this issue. Thanks.
Regards,
Ghazni
=============================
== Command used to start the inference server
docker run --gpus=1 --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/mgsaeed/wd500gb/github/triton-inference-server/docs/examples/model_repository:/models nvcr.io/nvidia/tritonserver:20.07-v1-py3 tritonserver --model-repository=/models
=============================
== Triton Inference Server ==
NVIDIA Release 20.07 (build 14602913)
Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
2020-08-10 16:11:10.765523: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
I0810 16:11:10.792665 1 metrics.cc:164] found 1 GPUs supporting NVML metrics
I0810 16:11:10.798194 1 metrics.cc:173] GPU 0: GeForce RTX 2080 Ti
I0810 16:11:10.798388 1 server.cc:127] Initializing Triton Inference Server
I0810 16:11:10.955257 1 server_status.cc:55] New status tracking for model 'densenet_onnx'
I0810 16:11:10.955277 1 server_status.cc:55] New status tracking for model 'inception_graphdef'
I0810 16:11:10.955281 1 server_status.cc:55] New status tracking for model 'resnet50_netdef'
I0810 16:11:10.955285 1 server_status.cc:55] New status tracking for model 'simple'
I0810 16:11:10.955288 1 server_status.cc:55] New status tracking for model 'simple_string'
I0810 16:11:10.955312 1 model_repository_manager.cc:723] loading: simple:1
I0810 16:11:10.955387 1 model_repository_manager.cc:723] loading: simple_string:1
I0810 16:11:10.955491 1 model_repository_manager.cc:723] loading: resnet50_netdef:1
I0810 16:11:10.955541 1 model_repository_manager.cc:723] loading: inception_graphdef:1
I0810 16:11:10.955609 1 model_repository_manager.cc:723] loading: densenet_onnx:1
I0810 16:11:10.957881 1 base_backend.cc:176] Creating instance simple_0_gpu0 on GPU 0 (7.5) using model.graphdef
I0810 16:11:10.957958 1 base_backend.cc:176] Creating instance simple_string_0_gpu0 on GPU 0 (7.5) using model.graphdef
I0810 16:11:10.958099 1 base_backend.cc:176] Creating instance inception_graphdef_0_gpu0 on GPU 0 (7.5) using model.graphdef
I0810 16:11:10.984917 1 onnx_backend.cc:203] Creating instance densenet_onnx_0_gpu0 on GPU 0 (7.5) using model.onnx
2020-08-10 16:11:10.988128: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3699850000 Hz
2020-08-10 16:11:10.989941: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f2a34088380 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-10 16:11:10.989983: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-08-10 16:11:10.990153: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-08-10 16:11:10.991876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:67:00.0
2020-08-10 16:11:10.991917: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-08-10 16:11:10.991964: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-08-10 16:11:10.991997: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-10 16:11:10.992066: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-10 16:11:10.998282: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-10 16:11:10.998368: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.11
2020-08-10 16:11:10.998399: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-08-10 16:11:11.003025: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
I0810 16:11:11.064634 1 netdef_backend.cc:206] Creating instance resnet50_netdef_0_gpu0 on GPU 0 (7.5) using init_model.netdef and model.netdef
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
2020-08-10 16:11:12.082177: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-10 16:11:12.082227: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] 0
2020-08-10 16:11:12.082238: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0: N
2020-08-10 16:11:12.088323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9610 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:67:00.0, compute capability: 7.5)
2020-08-10 16:11:12.092148: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f2a34766940 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-10 16:11:12.092182: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-08-10 16:11:12.094081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:67:00.0
2020-08-10 16:11:12.094142: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-08-10 16:11:12.094158: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-08-10 16:11:12.094173: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-10 16:11:12.094187: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-10 16:11:12.094215: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-10 16:11:12.094249: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.11
2020-08-10 16:11:12.094261: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-08-10 16:11:12.097381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-08-10 16:11:12.097420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-10 16:11:12.097431: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] 0
2020-08-10 16:11:12.097440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0: N
2020-08-10 16:11:12.100445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9610 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:67:00.0, compute capability: 7.5)
2020-08-10 16:11:12.102552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:67:00.0
2020-08-10 16:11:12.102626: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
2020-08-10 16:11:12.102656: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.11
2020-08-10 16:11:12.102684: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-08-10 16:11:12.102705: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-08-10 16:11:12.102749: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-08-10 16:11:12.102770: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.11
2020-08-10 16:11:12.102789: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8
2020-08-10 16:11:12.106010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0
2020-08-10 16:11:12.106055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-10 16:11:12.106072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] 0
2020-08-10 16:11:12.106091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0: N
2020-08-10 16:11:12.109675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9610 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:67:00.0, compute capability: 7.5)
I0810 16:11:12.110803 1 model_repository_manager.cc:888] successfully loaded 'simple' version 1
I0810 16:11:12.110835 1 model_repository_manager.cc:888] successfully loaded 'simple_string' version 1
I0810 16:11:12.203993 1 model_repository_manager.cc:888] successfully loaded 'inception_graphdef' version 1
I0810 16:11:12.813593 1 model_repository_manager.cc:888] successfully loaded 'densenet_onnx' version 1
I0810 16:11:12.892981 1 model_repository_manager.cc:888] successfully loaded 'resnet50_netdef' version 1
Starting endpoints, 'inference:0' listening on
I0810 16:11:12.895164 1 grpc_server.cc:1942] Started GRPCService at 0.0.0.0:8001
I0810 16:11:12.895179 1 http_server.cc:1428] Starting HTTPService at 0.0.0.0:8000
I0810 16:11:12.936628 1 http_server.cc:1443] Starting Metrics Service at 0.0.0.0:8002