OPTIMIZING GPT AND INDOBERT FOR SENTIMENT ANALYSIS AND CONSUMER TREND PREDICTION ON LAZADA PRODUCT REVIEWS

Arif Fitra Setyawan, Rozaq Isnaini Nugraha

Abstract


Sentiment analysis has become a vital approach to understanding customer opinions expressed in textual reviews. One of the primary challenges in sentiment classification is class imbalance, where positive reviews often dominate the dataset. This imbalance biases machine learning models toward the majority class, so they underperform in detecting minority sentiments. To address this issue, this study applies the Synthetic Minority Oversampling Technique (SMOTE) and evaluates the performance of two Transformer-based models: Generative Pre-trained Transformer (GPT) as a baseline and IndoBERT as the primary model. The dataset consists of 12,704 Lazada product reviews obtained from the Kaggle platform, categorized into three sentiment classes (positive, neutral, and negative). The data was split into 80% for training and 20% for testing. After preprocessing and balancing the data with SMOTE, the fine-tuned IndoBERT model achieved the best performance with an accuracy of 88%, far above GPT, which reached only 47% accuracy in a zero-shot setting. These findings highlight the importance of addressing data imbalance and of selecting context-aware models to improve sentiment classification accuracy on Indonesian-language text.
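As an illustration of the oversampling idea named in the abstract, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: it assumes each review has already been turned into a numeric feature vector (the abstract does not say which representation was used), that every class has at least two samples, and the helper names `smote` and `balance` as well as the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base point within the same class (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(features, labels, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, count in counts.items():
        if count == target:
            continue
        minority = [x for x, y in zip(features, labels) if y == cls]
        new_points = smote(minority, target - count, k=k, seed=seed)
        out_x.extend(new_points)
        out_y.extend([cls] * len(new_points))
    return out_x, out_y


# Toy, made-up 2-D features standing in for review embeddings (assumption).
features = [
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.2],  # positive
    [1.0, 1.0], [1.2, 0.9],                                                  # negative
    [2.0, 0.0], [2.1, 0.2], [1.9, 0.1],                                      # neutral
]
labels = ["positive"] * 6 + ["negative"] * 2 + ["neutral"] * 3

bal_x, bal_y = balance(features, labels, k=2)
print(Counter(bal_y))
```

Each synthetic point lies on the line segment between a minority sample and one of its nearest same-class neighbours, which is the core of SMOTE. Oversampling is conventionally applied to the training portion only, so that the held-out 20% still reflects the original class distribution; the abstract does not state at which stage it was applied here.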






DOI: https://doi.org/10.33387/jiko.v8i2.10066
