Comparison of Feature Extraction Methods for Conducting Sentiment Classification in Ternate Malay Language using Machine Learning Approaches

Satria Dwi Surya, Ema Utami, Ainul Yaqin

Abstract


Residents of Ternate, North Maluku, often communicate on social media in local languages, which makes it difficult for newcomers to grasp the implied meaning and emotion of the messages. This research aims to develop a natural language processing (NLP)-based emotion classification method that can be applied to Ternate Malay text datasets. Applying NLP is expected to improve the accuracy of emotion detection and classification in such text. The study trains several classification models on a Ternate Malay text dataset and compares their performance. The models are Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, Decision Tree, and Logistic Regression. Each model is trained on two vector representations of the text: Bag-of-Words (BoW) and Word2Vec. The evaluation shows that BoW+SVM performs best with 77% accuracy, followed by BoW+Random Forest (75%) and BoW+Logistic Regression (73%). The results indicate that NLP can be applied to a Ternate Malay dataset to classify emotions from text.
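As a rough illustration of the Bag-of-Words step described above, the following sketch turns short texts into word-count vectors and classifies a new text by majority vote among its nearest neighbors. It is not the authors' implementation: KNN is used only because it is the simplest of the listed models to write with the Python standard library, cosine similarity is an assumed distance measure, and the English sentences and labels are hypothetical placeholders, not samples from the Ternate Malay dataset.

```python
from collections import Counter
import math

def tokenize(text):
    # Lowercase and split on whitespace (a deliberately simple tokenizer).
    return text.lower().split()

def build_vocabulary(texts):
    # Map every distinct training token to a fixed vector index.
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}

def bow_vector(text, vocab):
    # Bag-of-Words: count occurrences of each vocabulary word; unknown words are ignored.
    vec = [0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = count
    return vec

def cosine(a, b):
    # Cosine similarity between two count vectors (0.0 if either is all zeros).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(text, train_vecs, train_labels, vocab, k=3):
    # Rank training examples by similarity and take a majority vote over the top k.
    query = bow_vector(text, vocab)
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query, pair[0]),
                    reverse=True)
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Hypothetical placeholder data (English, invented for illustration only).
train_texts = [
    "i am so happy today",
    "what a happy happy day",
    "i am very angry now",
    "this makes me angry",
    "i feel sad and alone",
    "such a sad day",
]
train_labels = ["joy", "joy", "anger", "anger", "sadness", "sadness"]

vocab = build_vocabulary(train_texts)
train_vecs = [bow_vector(text, vocab) for text in train_texts]

print(knn_predict("so happy and glad", train_vecs, train_labels, vocab))
```

The same count vectors could be passed to any of the other classifiers named in the abstract; only the prediction step would change.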


Keywords


Text Classification, Emotion, Artificial Intelligence, Machine Learning, Ternate Malay Language


DOI: https://doi.org/10.33387/protk.v11i3.7262


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.



Editorial Office:
Protek : Jurnal Ilmiah Teknik Elektro
Department of Electrical Engineering, Faculty of Engineering, Universitas Khairun.
Address: Jusuf Abdulrahman 53 Gambesi, Ternate City, Indonesia.
Email: protek@unkhair.ac.id, WhatsApp: +6282292852552