LSTM

On the Properties of Neural Machine Translation= Encoder–Decoder Approaches

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

Attention Is All You Need