IMPROVING INDONESIAN SPEECH EMOTION CLASSIFICATION USING MFCC AND BILSTM WITH AUDIO AUGMENTATION
Abstract
DOI: https://doi.org/10.33387/jiko.v8i3.10820


