Performance Insights of Attention-free Language Models in Sentiment Analysis: A Case Study for E-commerce Platforms in Vietnam
Published in Inventive Communication and Computational Technologies, 2024
Recommended citation: Pending
Transformer-based models have seen significant development over the last few years due to their efficiency and parallelizability when training on various data domains. However, one bottleneck of the Transformer architecture lies in the Attention mechanism and its high computational complexity. Consequently, training a Transformer network requires long training times and large computational resources. Although much research has addressed this challenge, it is also worth investigating language models that do without the Attention component. In this work, we focus on the effectiveness and efficiency of attention-free language models, namely BiLSTM, TextCNN, gMLP, and HyenaDNA, for sentiment analysis of reviews on popular e-commerce platforms in Vietnam. The findings show that Bidirectional LSTM, TextCNN, HyenaDNA, and gMLP achieve approximately 97.8%, 97%, 96.8%, and 97.5% accuracy, respectively, compared to a popular attention-based model, RoBERTa, while using 36.7, 410, 9.3, and 98 times fewer parameters, respectively. In addition, among the considered attention-free models, Bidirectional LSTM obtains the highest accuracy, but the difference relative to gMLP is tiny; moreover, gMLP achieves the highest F1 score within the attention-free model family.
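
For illustration only, below is a minimal sketch of an attention-free sentiment classifier in the spirit of the BiLSTM baseline discussed in the abstract. It is not the paper's actual implementation; the framework (PyTorch), vocabulary size, embedding and hidden dimensions, and two-class output are all assumed for the example.

```python
import torch
import torch.nn as nn

class BiLSTMSentimentClassifier(nn.Module):
    """Attention-free sentiment classifier: embedding -> BiLSTM -> linear head.
    Hyperparameters here are illustrative assumptions, not the paper's settings."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_classes=2, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded review tokens
        embedded = self.embedding(token_ids)          # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.bilstm(embedded)        # hidden: (2, batch, hidden_dim)
        # Concatenate final forward and backward hidden states as the review representation
        sentence_repr = torch.cat((hidden[0], hidden[1]), dim=1)  # (batch, 2 * hidden_dim)
        return self.classifier(sentence_repr)         # (batch, num_classes) logits

# Usage: classify a toy batch of two padded reviews of length 10
model = BiLSTMSentimentClassifier(vocab_size=10_000)
logits = model(torch.randint(1, 10_000, (2, 10)))
print(logits.shape)  # torch.Size([2, 2])
```

Unlike self-attention, whose cost grows quadratically with sequence length, the recurrent pass here scales linearly in the number of tokens, which is the efficiency trade-off the study examines.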