Predicting YouTube Video Viewership Using Multi-Feature Random Forest Modeling: A Case Study on the Warganet Life Official Channel

Meiza Alliansa; Nur Hafifah Matondang; Rifka Dwi Amalia

doi:10.64472/jciet.v1i2.23

Authors

Meiza Alliansa, Universitas Pembangunan Nasional “Veteran” Jakarta
Nur Hafifah Matondang, Universitas Pembangunan Nasional “Veteran” Jakarta
Rifka Dwi Amalia, Universitas Pembangunan Nasional “Veteran” Jakarta

Keywords:

YouTube analytics, viewer prediction, random forest, machine learning, CRISP-DM

Abstract

This study presents a viewer prediction model for the YouTube channel “Warganet Life Official” using the Random Forest algorithm and multi-feature engagement metrics obtained from YouTube Studio. The dataset includes impressions, likes, dislikes, shares, watch time, and subscriber changes, and was processed following the CRISP-DM framework. The model achieved its best performance under a 70:30 train–test split, with a MAPE of 12.20% and an RMSE of 204,890.42. Random Forest outperformed the Linear Regression and XGBoost baselines, supporting its suitability for modeling nonlinear engagement behavior in dynamic digital-media environments. The novelty of this work lies in its multi-feature, engagement-driven modeling applied to a large Southeast Asian entertainment channel, which offers localized evidence for viewer-performance forecasting. Theoretically, the study strengthens recent findings that multi-modal engagement metrics yield more accurate predictions of digital-media performance. Practically, a Streamlit-based prediction tool was deployed so that creators can perform real-time content evaluation and early performance diagnostics, providing actionable insights for improving content strategy and long-term channel optimization.
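As a reading aid for the two error metrics and the split ratio reported above, the following is a minimal sketch of how MAPE, RMSE, and a 70:30 partition are commonly computed. It is not the authors' code: the function names (`mape`, `rmse`, `split_70_30`) and the view counts are hypothetical and chosen only for illustration, and the split shown is a simple ordered cut, which may differ from the procedure used in the study.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the ratio would be undefined).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the target (views)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def split_70_30(rows):
    """Ordered 70:30 cut; an assumption, not necessarily the study's split."""
    cut = (len(rows) * 7) // 10
    return rows[:cut], rows[cut:]


# Hypothetical per-video view counts, not taken from the paper's dataset.
actual = [100_000, 200_000, 400_000]
predicted = [110_000, 180_000, 400_000]

print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"RMSE: {rmse(actual, predicted):.2f} views")
```

Because RMSE is expressed in views while MAPE is a relative measure, a large RMSE can coexist with a moderate MAPE when per-video view counts are themselves large.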

Author Biographies

Nur Hafifah Matondang, Universitas Pembangunan Nasional “Veteran” Jakarta

Lecturer at the Information Systems Study Program, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jakarta.

Rifka Dwi Amalia, Universitas Pembangunan Nasional “Veteran” Jakarta

Lecturer at the Information Systems Study Program, Faculty of Computer Science, Universitas Pembangunan Nasional “Veteran” Jakarta. Research interests include information systems management, IT service management, digital governance, and data-driven decision-making.

