tgtd disconnects frequently

Hi, all

I have a problem on CentOS 6.5.

I created a SCSI target using tgtd and iSER, but the connections drop frequently. The log looks like this:

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7f1ee0 cm_id:0x81b460 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x7f1ee0 cm_id:0x0x81b460 state: CLOSE, refcnt:395

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7f06a0 cm_id:0x7d29b0 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x7f06a0 cm_id:0x0x7d29b0 state: CLOSE, refcnt:395

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7f00e0 cm_id:0x807ce0 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x7f00e0 cm_id:0x0x807ce0 state: CLOSE, refcnt:395

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7f0f00 cm_id:0x7f0c60 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x7f0f00 cm_id:0x0x7f0c60 state: CLOSE, refcnt:389

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7d71c0 cm_id:0x7ed170 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x7d71c0 cm_id:0x0x7ed170 state: CLOSE, refcnt:390

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7ff380 cm_id:0x7ff0e0 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x7ff380 cm_id:0x0x7ff0e0 state: CLOSE, refcnt:391

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x81bb10 cm_id:0x75c5e0 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x81bb10 cm_id:0x0x75c5e0 state: CLOSE, refcnt:403

Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x828100 cm_id:0x7ed800 event:10, RDMA_CM_EVENT_DISCONNECTED

Sep 15 12:16:09 almstor01 tgtd: iser_conn_close(1291) conn:0x828100 cm_id:0x0x7ed800 state: CLOSE, refcnt:416

Sep 15 12:16:10 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x7d7780 cm_id:0x7d6f70, 192.168.100.4 → 192.168.100.101, established

Sep 15 12:16:10 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x8641b0 cm_id:0x7eceb0, 192.168.200.3 → 192.168.200.101, established

Sep 15 12:16:11 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x85fee0 cm_id:0x7d8650, 192.168.100.1 → 192.168.100.101, established

Sep 15 12:16:11 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x870dc0 cm_id:0x7d8b00, 192.168.200.1 → 192.168.200.101, established

Sep 15 12:16:11 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x8619e0 cm_id:0x861740, 192.168.200.2 → 192.168.200.101, established

Sep 15 12:16:11 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x862da0 cm_id:0x862b00, 192.168.100.2 → 192.168.100.101, established

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x7f1ee0 cm_id:0x81b460

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x7f06a0 cm_id:0x7d29b0

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x7f00e0 cm_id:0x807ce0

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x7f0f00 cm_id:0x7f0c60

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x7d71c0 cm_id:0x7ed170

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x7ff380 cm_id:0x7ff0e0

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x81bb10 cm_id:0x75c5e0

Sep 15 12:16:11 almstor01 tgtd: iser_cm_timewait_exit(1646) conn:0x828100 cm_id:0x7ed800

Sep 15 12:16:11 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x7eeae0 cm_id:0x7d80d0, 192.168.100.3 → 192.168.100.101, established

Sep 15 12:16:11 almstor01 tgtd: iser_cm_conn_established(1612) conn:0x7f00c0 cm_id:0x8636e0, 192.168.200.4 → 192.168.200.101, established

What can I do to fix this issue?

Thanks.

To begin with, what is the hardware? Are you using Mellanox OFED or the inbox driver? I would suggest installing Mellanox OFED: http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers

Next, simplify and debug the issue using only one connection (the log above shows eight at the moment).

Use ibdump or tcpdump (with the sniffer flag enabled) to capture the traffic on both the sender and the receiver, load the captures into Wireshark, and follow the packets to see which side terminates the connection.

For details on how to use ibdump/tcpdump (sniffer), see the Mellanox OFED manual: http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_User_Manual_v4_4.pdf
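
Before capturing traffic, it may also help to summarize the tgtd log so you can see how many connections drop in the same second and which initiator addresses reconnect. The following is only a rough sketch in Python, not part of tgtd or OFED: the function name summarize and both regular expressions are my own, written against the line format in the log excerpt above, and they may need adjusting if your syslog format differs.

```python
import re
from collections import Counter

# Matches the tgtd iSER lines quoted in the question, for example:
# "Sep 15 12:16:09 almstor01 tgtd: iser_cm_disconnected(1632) conn:0x7f1ee0 ..."
# Assumption: standard syslog prefix "Mon DD HH:MM:SS host tgtd:".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+tgtd:\s+"
    r"(?P<func>iser_\w+)\(\d+\)"
)

# Initiator and target addresses on "established" lines.
# Accepts either "->" or the typographic arrow that appears in the excerpt.
PEER_RE = re.compile(
    r",\s*(?P<src>[\d.]+)\s*(?:->|\u2192)\s*(?P<dst>[\d.]+),\s*established"
)


def summarize(lines):
    """Return (events, initiators).

    events     counts lines per (timestamp, function name).
    initiators counts "established" lines per initiator IP address.
    Lines that do not look like tgtd iSER messages are ignored.
    """
    events = Counter()
    initiators = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        events[(m.group("ts"), m.group("func"))] += 1
        peer = PEER_RE.search(line)
        if peer:
            initiators[peer.group("src")] += 1
    return events, initiators


def report(events, initiators):
    """Build a short plain-text summary from the two counters."""
    out = []
    for (ts, func), count in sorted(events.items()):
        out.append(f"{ts}  {func:28s} x{count}")
    for ip, count in sorted(initiators.items()):
        out.append(f"initiator {ip} established {count} time(s)")
    return "\n".join(out)
```

To use it, pass any iterable of log lines, for example an open file handle on the syslog file (on CentOS 6 that is usually /var/log/messages, but check your syslog configuration), and print the result of report(*summarize(handle)). If all connections disconnect within the same second, as in the excerpt above, that is worth mentioning when you post the capture results.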