I tried to combine word embeddings from two different methods, GloVe and BERT, before passing them to a BiLSTM model, but it did not work for me. I am now looking for previous experiments that did the same thing. Can anyone help clarify this?
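
For concreteness, here is a minimal sketch of one common way to do such a merge: concatenating the two vectors for each word. It is an assumption about the setup, not the code that failed. BERT produces one vector per subword piece while GloVe has one vector per word, so the sketch first averages the subword vectors belonging to each word and then concatenates the result with the GloVe vector. The names `merge_embeddings`, `mean_pool`, and the `word_ids` list (one word index per subword, `None` for special tokens, similar in spirit to what Hugging Face fast tokenizers return from `word_ids()`) are hypothetical, and the vectors are toy values rather than real model output.

```python
from typing import Dict, List, Optional

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def merge_embeddings(
    words: List[str],
    glove: Dict[str, Vector],
    subword_vectors: List[Vector],
    word_ids: List[Optional[int]],
    glove_dim: int,
) -> List[Vector]:
    """Return one concatenated [GloVe ; pooled BERT] vector per word.

    Assumptions (hypothetical setup): `subword_vectors[k]` is the contextual
    vector of subword k, and `word_ids[k]` is the index of the word it belongs
    to, or None for special tokens such as [CLS] and [SEP].
    """
    merged = []
    for i, word in enumerate(words):
        # Collect every subword vector that belongs to word i.
        pieces = [v for v, wid in zip(subword_vectors, word_ids) if wid == i]
        if not pieces:
            raise ValueError(f"no subword vectors found for word {i}: {word!r}")
        contextual = mean_pool(pieces)
        # Out-of-vocabulary words get a zero GloVe vector (one possible choice).
        static = glove.get(word.lower(), [0.0] * glove_dim)
        # List concatenation: output dimension = glove_dim + BERT dimension.
        merged.append(static + contextual)
    return merged


# Toy example: "playing" is split into two subwords, "chess" into one.
words = ["playing", "chess"]
glove = {"playing": [1.0, 2.0]}  # "chess" is deliberately missing
subword_vectors = [
    [0.0, 0.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0],  # "play"
    [3.0, 3.0, 3.0],  # "##ing"
    [5.0, 5.0, 5.0],  # "chess"
    [0.0, 0.0, 0.0],  # [SEP]
]
word_ids = [None, 0, 0, 1, None]

merged = merge_embeddings(words, glove, subword_vectors, word_ids, glove_dim=2)
```

With this layout, the BiLSTM input size would need to equal the GloVe dimension plus the BERT hidden size, and the sequence length would be the number of words, not the number of subwords. A mismatch in either is a frequent source of errors in this kind of setup.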