The LSTM model I'm trying to replicate is this one:

```
X = tf.placeholder(tf.float32, [None, None, n_coeffs_input])
Y = tf.placeholder(tf.float32, [None, None, n_coeffs_output])
l_r = tf.placeholder(tf.float32, [])
utts_len = tf.placeholder(tf.float32, [batch_size])
lstm1 = tf.compat.v1.keras.layers.CuDNNLSTM(nodos, return_sequences=True,return_state=True)
output_lstm1 = lstm1(inputs=X)
#output_lstm1 is a list of len 3, where the first element is the output of the lstm for all times
#the second element is the last output, and the third one is the last state.
mse = tf.reduce_mean(tf.squared_difference(Y, output_lstm1[0]))
...
```
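For reference, my replication assumes the following layout for the three arrays returned by `lstm1.get_weights()` on a CuDNN-backed LSTM: a kernel of shape `(n_inputs, 4*units)`, a recurrent kernel of shape `(units, 4*units)`, and a bias of length `8*units` (two concatenated halves). This is an assumption on my part, sketched here with illustrative sizes:

```python
import numpy as np

# Illustrative sizes, not taken from the model above
n_coeffs_input, units = 3, 2

# Assumed layout of lstm1.get_weights() for a CuDNNLSTM layer:
kernel = np.zeros((n_coeffs_input, 4 * units))   # input-to-gate weights
recurrent_kernel = np.zeros((units, 4 * units))  # state-to-gate weights
bias = np.zeros(8 * units)                       # input bias and recurrent bias, concatenated

# The 4*units columns split into the gates in i, f, c, o order:
i_cols, f_cols = kernel[:, :units], kernel[:, units:2 * units]
c_cols, o_cols = kernel[:, 2 * units:3 * units], kernel[:, 3 * units:]

# CuDNN keeps two bias halves; the effective bias is their sum:
effective_bias = bias[:4 * units] + bias[4 * units:]
```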

The function I defined to replicate the LSTM uses the ifco gate order:

```
import numpy as np

def sigmoide_np(z):
    # elementwise logistic sigmoid
    return 1.0 / (1.0 + np.exp(-z))

def NewLSTM(kernel, recurrent_kernel, bias, x, previous_output, previous_state, units=1):
    # gate pre-activations: input term + recurrent term + both CuDNN bias halves
    k_gates = np.dot(np.array([x]), kernel)
    kr_gates = np.dot(previous_output, np.array([recurrent_kernel]))
    gates = k_gates + kr_gates + bias[:units * 4] + bias[units * 4:]
    # ifco order: input, forget, candidate, output
    i = sigmoide_np(gates[:, :units])
    f = sigmoide_np(gates[:, units:units * 2])
    c = np.tanh(gates[:, units * 2:units * 3])
    o = sigmoide_np(gates[:, units * 3:])
    state = np.multiply(i, c) + np.multiply(previous_state, f)
    out = o * np.tanh(state)
    return state, out
```
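To sanity-check the gate slicing for `units > 1`, here is a self-contained one-step run with `units=2` and all-zero weights. The step function below is my own standalone copy of the same math as `NewLSTM` (with `sigmoide_np` defined inline); with zero weights, `sigmoid(0) = 0.5` and `tanh(0) = 0`, so the state and output should come out exactly zero:

```python
import numpy as np

def sigmoide_np(z):
    # elementwise logistic sigmoid
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(kernel, recurrent_kernel, bias, x, prev_out, prev_state, units):
    # one LSTM step in ifco order, same math as NewLSTM above
    gates = np.dot(np.array([x]), kernel) + np.dot(prev_out, np.array([recurrent_kernel]))
    gates = gates + bias[:4 * units] + bias[4 * units:]
    i = sigmoide_np(gates[:, :units])
    f = sigmoide_np(gates[:, units:2 * units])
    c = np.tanh(gates[:, 2 * units:3 * units])
    o = sigmoide_np(gates[:, 3 * units:])
    state = i * c + prev_state * f
    return state, o * np.tanh(state)

units, n_in = 2, 1
kernel = np.zeros((n_in, 4 * units))
recurrent_kernel = np.zeros((units, 4 * units))
bias = np.zeros(8 * units)
state, out = lstm_step(kernel, recurrent_kernel, bias,
                       np.ones(n_in), np.zeros(units), np.zeros(units), units)
# with zero weights: gates = 0, i = f = o = 0.5, c = tanh(0) = 0, so state = out = 0
```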

For an LSTM with `n_coeffs_input` inputs, I compare the network output with my replication like this:

```
total_out = net.run(output_lstm1, feed_dict={X: batch_in, utts_len: [10]})
last_out = total_out[1]
last_state = total_out[2]
total_out = np.squeeze(total_out[0])
replicated_LSTM = []
x_in = np.squeeze(batch_in)
# For the first input we assume a zero last state and a zero last output
h = np.zeros((1, n_coeffs_output))
c = np.zeros((1, n_coeffs_output))
state, outI_t_1 = NewLSTM(lstm1.get_weights()[0], lstm1.get_weights()[1],
                          lstm1.get_weights()[2],
                          x_in[0], h[0, :], c[0, :])
replicated_LSTM.append(outI_t_1)
for last in range(-9, 0):
    # re-run the network on the truncated sequence to get its last output and state
    outI = net.run(output_lstm1,
                   feed_dict={X: np.reshape(batch_in[0, :last, :], (1, 10 + last, n_coeffs_input)),
                              utts_len: [10 + last]})
    last_out = outI[1]
    last_state = outI[2]
    state, outI_t_1 = NewLSTM(lstm1.get_weights()[0], lstm1.get_weights()[1],
                              lstm1.get_weights()[2],
                              x_in[last], last_out[0], last_state[0], 1)
    replicated_LSTM.append(outI_t_1)
print("output with net.run:")
for i in total_out:
    print(i)
print("Output with NewLSTM:")
for i in replicated_LSTM:
    print(i)
```
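Incidentally, instead of re-running the network on truncated sequences to fetch `last_out` and `last_state`, the replication can carry its own state forward step by step. A minimal self-contained sketch of that idea, using random illustrative weights and a standalone step function with the same math as `NewLSTM` (sizes here are hypothetical, not from my model):

```python
import numpy as np

def sigmoide_np(z):
    # elementwise logistic sigmoid
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(kernel, recurrent_kernel, bias, x, prev_out, prev_state, units):
    # one LSTM step in ifco order
    gates = np.dot(np.array([x]), kernel) + np.dot(prev_out, np.array([recurrent_kernel]))
    gates = gates + bias[:4 * units] + bias[4 * units:]
    i = sigmoide_np(gates[:, :units])
    f = sigmoide_np(gates[:, units:2 * units])
    c = np.tanh(gates[:, 2 * units:3 * units])
    o = sigmoide_np(gates[:, 3 * units:])
    state = i * c + prev_state * f
    return state, o * np.tanh(state)

rng = np.random.default_rng(0)
units, n_in, T = 2, 1, 10
kernel = rng.standard_normal((n_in, 4 * units)) * 0.1
recurrent_kernel = rng.standard_normal((units, 4 * units)) * 0.1
bias = np.zeros(8 * units)
xs = rng.standard_normal((T, n_in))

# carry (h, c) across timesteps instead of re-querying the network
h = np.zeros((1, units))
c = np.zeros((1, units))
outputs = []
for t in range(T):
    c, h = lstm_step(kernel, recurrent_kernel, bias, xs[t], h[0], c[0], units)
    outputs.append(h)
```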

The output for n_coeffs_input=1 and n_coeffs_output=1 is:

```
output with net.run:
0.6079306
0.88047564
0.9457015
0.9593791
0.9630412
0.96477175
0.9660919
0.96729606
0.96844375
0.96954805
Output with NewLSTM:
[[0.60793061]]
[[0.88047569]]
[[0.94570145]]
[[0.95937899]]
[[0.96304125]]
[[0.96477178]]
[[0.96609195]]
[[0.96729607]]
[[0.96844372]]
[[0.96954812]]
```

As you can see, the results are very close, so we could call this a good replication, but when I use n_coeffs_output=2 I get very different results:

```
output with net.run:
[[-1.6811120e-05 -7.6083797e-01]
[-5.3793984e-04 -9.5489222e-01]
[-4.9301507e-03 -9.7488159e-01]
[-2.7245175e-02 -9.5495248e-01]
[-1.0828623e-01 -9.0503311e-01]
[-2.8274482e-01 -8.2405442e-01]
[-4.7703674e-01 -7.3043454e-01]
[-6.2741107e-01 -6.1664152e-01]
[-7.3953187e-01 -4.4329706e-01]
[-8.0043334e-01 -1.7720392e-01]]
Output with NewLSTM:
[[-1.68111220e-05 -7.60837996e-01]]
[[-4.95224298e-04 -9.54892126e-01]]
[[-0.00478779 -0.97487646]]
[[-0.02719039 -0.95483796]]
[[-0.10874437 -0.90371213]]
[[-0.29537781 -0.81501876]]
[[-0.52451342 -0.69765331]]
[[-0.69437694 -0.54462647]]
[[-0.79712232 -0.31257828]]
[[-0.85020044 -0.08377024]]
```