Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) Tesla T4 GPU
• DeepStream Version 5.0 (5.0-dp-20.04-devel docker)
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only) 10.2
• Issue Type( questions, new requirements, bugs) bugs
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I am using a modified version of deepstream-test5-app with a Kafka broker to send messages from, and receive messages into, the DeepStream application. After a couple of hours, the application prints error messages like the following and no longer receives any new messages.
%3|1602260329.352|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602260329.352|ERROR|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602260329.363|FAIL|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602260329.363|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602260329.363|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 2/2 brokers are down
%3|1602260929.454|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602260929.455|ERROR|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602260929.465|FAIL|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602260929.465|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602260929.465|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 2/2 brokers are down
%3|1602261229.356|FAIL|rdkafka#producer-3| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602261229.356|ERROR|rdkafka#producer-3| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602261529.561|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602261529.561|ERROR|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602261529.577|FAIL|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602261529.577|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602261529.577|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 2/2 brokers are down
%3|1602261829.353|FAIL|rdkafka#consumer-2| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602261829.353|ERROR|rdkafka#consumer-2| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602262129.670|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602262129.670|ERROR|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected
%3|1602262129.681|FAIL|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602262129.681|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 172.17.0.1:9092/1001: Receive failed: Disconnected
%3|1602262129.681|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 2/2 brokers are down
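The second field of each log line is a Unix timestamp, and the producer-1 disconnects above are spaced almost exactly 600 seconds apart (1602260329, 1602260929, 1602261529, 1602262129). That regularity may point to an idle-connection timeout somewhere between the client and the broker (as far as I know, the Kafka broker setting connections.max.idle.ms defaults to 10 minutes), although I have not confirmed that this is the cause here. Below is a minimal Python sketch for extracting these intervals from a captured log. The function name and the regular expression are my own and only assume the line layout shown above.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the log lines above:
# %<level>|<epoch.millis>|<facility>|<client>| [thrd:<broker>]: <message>
LINE_RE = re.compile(
    r"^%(?P<level>\d)\|(?P<ts>\d+\.\d+)\|(?P<fac>\w+)\|(?P<client>[^|]+)\|"
    r" \[thrd:(?P<thread>[^\]]+)\]: (?P<msg>.*)$"
)


def disconnect_intervals(lines):
    """Return {(client, broker thread): [whole seconds between FAIL lines]}."""
    stamps = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        # Only FAIL lines are counted; the paired ERROR lines are duplicates.
        if m and m.group("fac") == "FAIL":
            key = (m.group("client"), m.group("thread"))
            stamps[key].append(float(m.group("ts")))
    return {
        key: [round(later - earlier) for earlier, later in zip(ts, ts[1:])]
        for key, ts in stamps.items()
    }


# A few lines copied from the log above.
sample = [
    "%3|1602260329.352|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected",
    "%3|1602260329.363|ERROR|rdkafka#producer-1| [thrd:172.17.0.1:9092/1001]: 2/2 brokers are down",
    "%3|1602260929.454|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected",
    "%3|1602261529.561|FAIL|rdkafka#producer-1| [thrd:kafka:9092/bootstrap]: kafka:9092/bootstrap: Receive failed: Disconnected",
]

for key, gaps in disconnect_intervals(sample).items():
    print(key, gaps)
```

Running the same function over the full log would show whether the other clients (producer-3, consumer-2) follow the same period.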
FYI, the Kafka-related configuration is shown here:
[sink1]
enable=1
gpu-id=0
#Type - 1=FakeSink 2=EglSink 3=File 4=UDPSink 5=nvoverlaysink 6=MsgConvBroker
type=6
msg-conv-config=dstest5_msgconv_config.txt
disable-msgconv=1
#(0): PAYLOAD_DEEPSTREAM - Deepstream schema payload
#(1): PAYLOAD_DEEPSTREAM_MINIMAL - Deepstream schema payload minimal
#(256): PAYLOAD_RESERVED - Reserved type
#(257): PAYLOAD_CUSTOM - Custom schema payload
msg-conv-payload-type=0
msg-broker-proto-lib=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_kafka_proto.so
#Provide your msg-broker-conn-str here
msg-broker-conn-str=kafka;9092;metromind-start
topic=metromind-start
#msg-broker-conn-str=kafka;9092;metromind-start-test
#topic=metromind-start-test
#Optional:
msg-broker-config=cfg_kafka.txt
# sink type = 6 by default creates msg converter + broker.
# To use multiple brokers use this group for converter and use
# sink type = 6 with disable-msgconv = 1
[message-converter]
enable=1
msg-conv-config=dstest5_msgconv_sample_config.txt
#(0): PAYLOAD_DEEPSTREAM - Deepstream schema payload
#(1): PAYLOAD_DEEPSTREAM_MINIMAL - Deepstream schema payload minimal
#(256): PAYLOAD_RESERVED - Reserved type
#(257): PAYLOAD_CUSTOM - Custom schema payload
msg-conv-payload-type=0
# Name of library having custom implementation.
#msg-conv-msg2p-lib=<val>
# Id of component in case only selected message to parse.
#msg-conv-comp-id=<val>
# Configure this group to enable cloud message consumer.
[message-consumer0]
enable=1
proto-lib=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_kafka_proto.so
conn-str=kafka;9092
config-file=cfg_kafka.txt
subscribe-topic-list=source-control;add-source;delete-source;mp4-segment
cfg_kafka.txt is set as follows:
[message-broker]
partition-key=sensor.id
proto-cfg="socket.keepalive.enable=true;retries=5;connection.max.idle.ms=100000"
I found this link about Kafka that relates to this problem. It says Kafka will automatically reconnect whenever the connection is closed; however, in my case it seems to fail to reconnect.
• Requirement
Is there a quick way to fix this problem?
If not, how can I modify deepstream-test5-app so that it raises an error and exits when this happens?
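As a stopgap for the "exit on failure" part, one option I am considering is an external watchdog rather than a change inside the app: launch the application from a small wrapper, watch its output for the "N/N brokers are down" line, and terminate it after a few occurrences so that a supervisor (for example a Docker restart policy) can bring it back up. The following is only a sketch under assumptions: that the rdkafka messages reach the process's stderr, that three occurrences is a sensible threshold, and that all function names are my own.

```python
import re
import subprocess
import sys

# librdkafka logs "N/N brokers are down" when every known broker is unreachable.
ALL_DOWN_RE = re.compile(r"\b(\d+)/(\d+) brokers are down")


def all_brokers_down(line):
    """True if the line reports that all known brokers are down."""
    m = ALL_DOWN_RE.search(line)
    return bool(m) and m.group(1) == m.group(2)


def watch(lines, max_events=3):
    """Return True once max_events all-brokers-down lines have been seen."""
    seen = 0
    for line in lines:
        if all_brokers_down(line):
            seen += 1
            if seen >= max_events:
                return True
    return False


def run_with_watchdog(cmd, max_events=3):
    """Run cmd, echo its stderr, and stop it after repeated broker outages.

    Returns 1 if the watchdog stopped the process, else its own exit code.
    """
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE, text=True)

    def echoed():
        # Pass every stderr line through so normal logging is preserved.
        for line in proc.stderr:
            sys.stderr.write(line)
            yield line

    if watch(echoed(), max_events):
        proc.terminate()
        proc.wait()
        return 1
    return proc.wait()


# Hypothetical usage (the config file name is a placeholder):
# sys.exit(run_with_watchdog(["deepstream-test5-app", "-c", "my_test5_config.txt"]))
```

This does not address the root cause. Inside the C code, librdkafka reports the same condition to its error callback as RD_KAFKA_RESP_ERR__ALL_BROKERS_DOWN, so handling it there may be the cleaner fix if the Kafka adapter allows it.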