PERFORMANCE EVALUATION OF HYBRID CLUSTERING K-MEANS AND DBSCAN WITH FEATURE WEIGHT OPTIMIZATION

Vic Devlin; Robet Robet; Octara Pribadi

doi:10.33387/jiko.v9i1.10859

Authors

Vic Devlin STMIK TIME
Robet Robet STMIK TIME
Octara Pribadi STMIK TIME

Abstract

This research evaluates the performance of a hybrid clustering model that integrates K-Means and DBSCAN, enhanced through Feature Weight Optimization (FWO) using a Genetic Algorithm (GA), to achieve more precise consumer data segmentation. Two benchmark datasets, Customer Personality Analysis (CPA) and Online Retail (OR), were utilized to examine how different clustering techniques respond to variations in data structure. The feature weighting process was optimized using GA to improve the representational contribution of each variable toward the final cluster configuration. The Silhouette Score was adopted as the primary evaluation metric to measure intra-cluster cohesion and inter-cluster separation. Experimental findings reveal that for the CPA dataset, the Hybrid + FWO method achieved the best performance with a Silhouette Score of 0.9600, while the K-Means + FWO method recorded the highest score of 0.9804 on the OR dataset. Across all scenarios, the inclusion of FWO consistently enhanced clustering stability and interpretability. These results highlight that algorithm selection must consider dataset characteristics, and that feature weight optimization is pivotal in strengthening segmentation quality and ensuring more meaningful insights in consumer behavior analytics.
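The idea of scoring a set of feature weights by the Silhouette Score, and letting a genetic algorithm search for better weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names (`weighted_distance`, `silhouette_score`, `evolve_weights`), the GA settings, and the choice to keep cluster labels fixed are all assumptions made for illustration. In a full pipeline the labels would be recomputed by the clustering algorithm (K-Means, DBSCAN, or a hybrid) for each candidate weight vector.

```python
import math
import random

def weighted_distance(a, b, weights):
    """Euclidean distance after scaling each feature by its weight."""
    return math.sqrt(sum((w * (x - y)) ** 2 for x, y, w in zip(a, b, weights)))

def silhouette_score(points, labels, weights):
    """Mean silhouette coefficient; assumes at least two clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                total = sum(weighted_distance(p, q, weights) for q in members)
                mean_dist[c] = total / len(members)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: coefficient taken as 0
            scores.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)    # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def evolve_weights(points, labels, generations=30, pop_size=12, seed=0):
    """Toy GA (hypothetical settings): elitism, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    n_features = len(points[0])

    def fitness(w):
        return silhouette_score(points, labels, w)

    # Start from uniform weights plus random candidates in [0, 1].
    population = [[1.0] * n_features] + [
        [rng.random() for _ in range(n_features)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.choice(parents), rng.choice(parents)
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            k = rng.randrange(n_features)
            child[k] = min(1.0, max(0.0, child[k] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy data (invented): feature 0 separates the two groups, feature 1 is noise.
points = [(1.0, 5.0), (1.2, 0.5), (0.9, 9.0), (5.0, 4.0), (5.1, 8.5), (4.8, 1.0)]
labels = [0, 0, 0, 1, 1, 1]

uniform_score = silhouette_score(points, labels, [1.0, 1.0])
best_weights = evolve_weights(points, labels)
best_score = silhouette_score(points, labels, best_weights)
print(f"uniform weights: {uniform_score:.4f}")
print(f"evolved weights: {best_score:.4f} with {best_weights}")
```

Because the uniform weight vector is part of the initial population and the best half of each generation is carried over unchanged, the evolved score in this sketch can never fall below the unweighted baseline. On this toy data, down-weighting the noisy second feature is what raises the score.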
