First, I'm not sure whether the model includes the encoder during training.
EOS means end-of-sentence. The encoder and decoder are parts of the transformer network.
Without the encoder, at training time:
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]
Without the encoder, at testing time:
decoder input: [0]
With the encoder, at training time:
encoder input: [A, B, C, B]
target: [E, F, G, H, EOS]
decoder input: [0, E, F, G, H]
With the encoder, at testing time:
encoder input: [A, B, C, D]
decoder input: [0]
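
To make the cases above concrete, here is a minimal sketch in plain Python of how I understand them. It is not a real transformer: `make_decoder_input`, `greedy_decode`, and `toy_model` are hypothetical names made up for illustration, and I am assuming that 0 is the start token and that test-time decoding feeds each predicted token back into the decoder until EOS.

```python
START = 0      # assumed start token (the "0" in the lists above)
EOS = "EOS"    # end-of-sentence marker


def make_decoder_input(target):
    """Training time: shift the target right by one.

    Prepend START and drop the last token (EOS), so position i of the
    decoder input is the token that precedes position i of the target.
    """
    return [START] + target[:-1]


def greedy_decode(next_token, encoder_input=None, max_len=10):
    """Testing time: start from [START] and feed predictions back in.

    next_token is a hypothetical stand-in for the trained model. It takes
    (encoder_input, decoder_input) and returns the next token.
    encoder_input stays None in the without-encoder case.
    """
    decoder_input = [START]
    output = []
    for _ in range(max_len):
        token = next_token(encoder_input, decoder_input)
        output.append(token)
        if token == EOS:
            break
        decoder_input.append(token)
    return output


def toy_model(encoder_input, decoder_input):
    """Toy 'model' that replays a fixed sequence, for illustration only."""
    memorized = ["E", "F", "G", "H", EOS]
    return memorized[len(decoder_input) - 1]


target = ["E", "F", "G", "H", EOS]
print(make_decoder_input(target))
print(greedy_decode(toy_model, encoder_input=["A", "B", "C", "D"]))
```

In this sketch the only difference between the two cases is whether `encoder_input` is passed or left as `None`; the shifted decoder input and the start from `[0]` are the same in both.
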
Am I exactly right?