Effective Risk Analysis and Predictive Modeling in Motor Insurance in Saudi Arabia

Authors

  • Abdullah Aldaeej, Department of Management Information Systems, College of Business Administration, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, 31441, Dammam, Saudi Arabia.
  • Hajar Aseeri, Department of Management Information Systems, College of Business Administration, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, 31441, Dammam, Saudi Arabia.
  • Atiq Siddiqui, Department of Management Information Systems, College of Business Administration, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, 31441, Dammam, Saudi Arabia.
  • Jumanah Alshehri, Department of Management Information Systems, College of Business Administration, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, 31441, Dammam, Saudi Arabia.
  • Hafsa Alabdullateef, Department of Management Information Systems, College of Business Administration, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, 31441, Dammam, Saudi Arabia.

DOI:

https://doi.org/10.48161/qaj.v6n1a2094

Keywords:

motor insurance; insurance premium; machine learning; risk factors; Saudi Arabia.

Abstract

Accurate prediction of motor insurance premiums that correspond with actual claims is critical to the sustainability of insurance companies. However, predicting premiums is challenging because of the complexity of the underlying risk factors. This study aims to identify significant risk factors and develop predictive models for motor insurance pricing in the Saudi context, using real data obtained from one of the leading insurance providers in Saudi Arabia. The dataset consists of 71,280 records and 26 features of insurance claims reported during 2023. After preprocessing, significant risk factors are identified using Analysis of Variance (ANOVA) and then used to build the prediction models. The findings reveal that vehicle body type and manufacturing country are the most influential risk factors. Four machine-learning pricing models (decision tree, neural network, generalized linear model, and random forest) are compared using R², MAE, and MSE. The random forest model consistently outperformed the other models in prediction accuracy. The study contributes to the motor insurance industry in Saudi Arabia by supporting informed risk assessment within Saudi Takaful insurance operations and by documenting the performance of prediction models for motor insurance pricing.
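
The two quantitative steps named in the abstract, ANOVA-based screening of a categorical risk factor and scoring predictions with R², MAE, and MSE, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy premium values are hypothetical, chosen only to show how the statistics are computed, and the grouping by body type is an assumed example.

```python
from statistics import mean


def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group vs. within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of categories (e.g. body types)
    n = len(all_values)    # total number of observations
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy premiums grouped by an assumed body-type factor.
premiums_by_body_type = [
    [10, 12, 14],   # e.g. sedan
    [20, 22, 24],   # e.g. SUV
    [30, 32, 34],   # e.g. truck
]
f_stat = one_way_anova_f(premiums_by_body_type)

# Hypothetical observed vs. predicted premiums for one model.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 37]
scores = {"R2": r2(y_true, y_pred),
          "MAE": mae(y_true, y_pred),
          "MSE": mse(y_true, y_pred)}

print(f_stat, scores)
```

A large F statistic suggests that mean premiums differ across the factor's categories, which is the sense in which a factor would be kept for modeling. When comparing models, a higher R² together with lower MAE and MSE on held-out data indicates better predictive accuracy.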

Published

2026-01-13

How to Cite

Aldaeej, A., Aseeri, H., Siddiqui, A., Alshehri, J., & Alabdullateef, H. (2026). Effective Risk Analysis and Predictive Modeling in Motor Insurance in Saudi Arabia. Qubahan Academic Journal, 6(1), 98–117. https://doi.org/10.48161/qaj.v6n1a2094

Section

Articles