How to multiply a matrix and a vector with cublasGemmEx

Please help me use cublasGemmEx. I always get the wrong result. I want to multiply a matrix by a vector. I'm using Java with CUDA 12.0. This is the CPU loop I want to reproduce:

      for (int i = 0, index = 0; i < matrix.getRow(); i++) {
          for (int j = 0; j < matrix.getColumn(); j++, index++) {
              result.data[i] += data[j] * matrix.data[index];
          }
      }

Here data is my vector, and matrix.data holds the matrix in row-major order. All values are of type half.

I want to get the same result from cublasGemmEx.
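For reference, the loop can be run on its own as a CPU baseline. This is only a minimal sketch: the class name MatVecReference and the method multiply are made up for illustration, the matrix is assumed to be a flat row-major array, and plain float stands in for half.

```java
public class MatVecReference {
    // Hypothetical helper: result = matrix * vector, matrix stored row-major in a flat array.
    public static float[] multiply(float[] matrix, int rows, int columns, float[] vector) {
        float[] result = new float[rows];
        for (int i = 0, index = 0; i < rows; i++) {
            for (int j = 0; j < columns; j++, index++) {
                result[i] += vector[j] * matrix[index];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        float[] matrix = {1, 2, 3, 4, 5, 6}; // 2 rows, 3 columns
        float[] vector = {1, 1, 1};
        float[] result = multiply(matrix, 2, 3, vector);
        System.out.println(java.util.Arrays.toString(result)); // prints [6.0, 15.0]
    }
}
```

The GPU output, copied back to the host and converted from half to float, can then be compared against this array element by element, allowing for half-precision rounding.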

I use this:

        int M = 1;
        float alpha = 1.0f;
        float beta = 0.0f;

        int PType = cudaDataType.CUDA_R_16F;
        int CComputeType = cublasComputeType.CUBLAS_COMPUTE_16F;

        int N = matrix.getRow();
        int K = matrix.getColumn();

        int SUCCESS = JCublas2.cublasGemmEx_new(cublasHandle,
                cublasOperation.CUBLAS_OP_T, cublasOperation.CUBLAS_OP_N,
                N, M, K,
                Pointer.to(new short[]{Float.floatToFloat16(alpha)}),
                matrix.data_gpu, PType, K,
                data_gpu, PType, K,
                Pointer.to(new short[]{Float.floatToFloat16(beta)}),
                result.data_gpu, PType, N,
                CComputeType,
                cublasGemmAlgo.CUBLAS_GEMM_DEFAULT_TENSOR_OP);

        if (cublasStatus.CUBLAS_STATUS_SUCCESS != SUCCESS)
        {
            
        }
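cuBLAS treats its matrices as column-major, so a row-major array with N rows and K columns is seen as a K x N matrix with leading dimension K. One way to sanity-check the transa/m/n/k/lda/ldb/ldc values without a GPU is to emulate that convention on the CPU. The sketch below does this under simplifying assumptions: GemmLayoutCheck and gemm are hypothetical names (not part of JCublas), float stands in for half, and only the A operand can be transposed.

```java
public class GemmLayoutCheck {
    // CPU emulation of C = alpha * op(A) * B + beta * C with column-major storage.
    // op(A) is A transposed when transposeA is true; B is never transposed here.
    public static void gemm(boolean transposeA, int m, int n, int k,
                            float alpha, float[] a, int lda,
                            float[] b, int ldb,
                            float beta, float[] c, int ldc) {
        for (int col = 0; col < n; col++) {
            for (int row = 0; row < m; row++) {
                float sum = 0f;
                for (int p = 0; p < k; p++) {
                    // Column-major: element (r, s) with leading dimension ld sits at r + s * ld.
                    float aValue = transposeA ? a[p + row * lda] : a[row + p * lda];
                    sum += aValue * b[p + col * ldb];
                }
                c[row + col * ldc] = alpha * sum + beta * c[row + col * ldc];
            }
        }
    }

    public static void main(String[] args) {
        int N = 2, K = 3, M = 1;             // same roles as in the call above
        float[] matrix = {1, 2, 3, 4, 5, 6}; // row-major, N rows and K columns
        float[] vector = {1, 1, 1};
        float[] result = new float[N];
        // Same argument pattern as the cublasGemmEx call: OP_T, N, M, K, lda = K, ldb = K, ldc = N.
        gemm(true, N, M, K, 1.0f, matrix, K, vector, K, 0.0f, result, N);
        System.out.println(java.util.Arrays.toString(result)); // prints [6.0, 15.0]
    }
}
```

If this emulation matches the plain loop for a given set of arguments, the layout parameters are probably not the cause, and the remaining suspects would be things outside this snippet, such as how the half values are converted and copied to and from the device.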