I am trying to build an architecture with two stacked LSTMs in which the output sequence of the first LSTM is downsampled before it is fed into the second.
For example, if the first LSTM outputs a sequence of length SEQ_LEN, the second LSTM should receive a sequence of length SEQ_LEN/2, where the vector at position k is the element-wise max of the vectors at positions 2k and 2k+1 of the first LSTM's output, for each valid k.
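To make the operation concrete, here is a minimal sketch of what I mean in plain Python, not tied to any framework. The function name `pairwise_max_downsample` is just made up for illustration, and I am assuming that a trailing odd time step is simply dropped.

```python
def pairwise_max_downsample(seq):
    """Halve a sequence by taking the element-wise max of each adjacent pair.

    seq is a list of time steps, each a list of features of equal length.
    Assumption: if len(seq) is odd, the last time step is dropped.
    """
    out = []
    for k in range(len(seq) // 2):
        first, second = seq[2 * k], seq[2 * k + 1]
        # Element-wise max between the vectors at positions 2k and 2k+1.
        out.append([max(a, b) for a, b in zip(first, second)])
    return out


# Stand-in for the first LSTM's output: SEQ_LEN = 4, 2 features per step.
first_lstm_output = [[1, 5], [3, 2], [0, 0], [-1, 4]]
print(pairwise_max_downsample(first_lstm_output))  # [[3, 5], [0, 4]]
```

On tensors this should be the same as reshaping the output to (SEQ_LEN/2, 2, features) and taking the max over the axis of size 2, which amounts to a max pooling over the time dimension with window 2 and stride 2.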
I have searched the documentation and did not find anything about this type of downsampling. Did I miss a parameter or feature in the docs, or do I have to implement this logic myself?
Thank you very much for your time!