Road Traffic Accident Prediction by Machine Learning and GIS: Case Study in Thanh Hoa Province, Vietnam

Authors

  • Ha Le Thi
    Affiliation
    Faculty of Civil Engineering, Campus in Ho Chi Minh City, University of Transport and Communications, 450-451 Le Van Viet Street, Tang Nhon Phu Ward, 700000 Ho Chi Minh City, Vietnam
  • Thao Vu Thi Phuong
    Affiliation
    Faculty of Environment, Hanoi University of Mining and Geology, Ministry of Education and Training of Vietnam, 18 Duc Thang, 129000 Bac Tu Liem, Hanoi, Vietnam
  • Thao Do Thi Phuong
    Affiliation
    Department of Geodesy - Map and land management, Hanoi University of Mining and Geology, Ministry of Education and Training of Vietnam, 18 Duc Thang, 129000 Bac Tu Liem, Hanoi, Vietnam
https://doi.org/10.3311/PPtr.39701

Abstract

Traffic accidents pose significant challenges for communities worldwide, particularly in Vietnam, where many people live on low to middle incomes and where infrastructure often struggles to keep pace with rapid mechanization. This study uses historical traffic accident data from 2020 to 2023 in Thanh Hoa province, Vietnam, as input to two machine learning models, Random Forest and Spline Regression, to predict the number of deaths and injuries from traffic accidents. A traffic accident prediction map for 2024 is produced by combining the Random Forest predictions of deaths and injuries with GIS technology. The predictions provide detailed information that can guide interventions to make traffic safer, especially in high-risk areas and during peak accident periods. The Random Forest model outperformed Spline Regression, achieving a mean absolute error of 0.012072 for deaths and 0.036323 for injuries, with R² values of 0.998663 and 0.996552, respectively. Including lagged variables and adjusting for seasonal effects further improved the accuracy of daily predictions. The study offers an approach to addressing traffic accidents in low- and middle-income countries, where prediction methods based on historical data are still not widely used, with the hope that machine learning and GIS will soon be applied in road safety management.
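
The abstract reports mean absolute error and R² for daily predictions and mentions lagged variables. As a rough illustration of what those terms mean, the following minimal Python sketch builds lag features from a daily count series and scores a naive baseline with both metrics. It is not the authors' pipeline: the series values, the lag choices (1 and 7 days), the helper names, and the lag-1 baseline are all assumptions made for the example, and no Random Forest or spline model is trained here.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def add_lag_features(series, lags=(1, 7)):
    """Pair each day's count with the counts observed `lag` days earlier.

    Returns a list of (features, target) tuples, skipping the first
    max(lags) days, for which not every lagged value exists.
    """
    max_lag = max(lags)
    rows = []
    for i in range(max_lag, len(series)):
        features = [series[i - lag] for lag in lags]
        rows.append((features, series[i]))
    return rows


# Hypothetical daily injury counts (made-up values, not from the study).
daily_injuries = [2, 0, 1, 3, 1, 0, 4, 2, 1, 1]

rows = add_lag_features(daily_injuries, lags=(1, 7))
y_true = [target for _, target in rows]
# Naive baseline (an assumption for illustration): predict yesterday's count.
y_pred = [features[0] for features, _ in rows]

mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
```

In a fuller workflow, the lagged features would be passed to a regression model instead of the naive baseline, and the same two metrics would be computed on held-out days. An R² below zero, as a poor baseline can produce, means the predictions fit worse than always predicting the mean.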

Keywords:

historical data sources, low- and middle-income countries, random forest algorithm, spline regression, traffic accident prediction

Published Online

2025-11-27

How to Cite

Le Thi, H., Vu Thi Phuong, T., Do Thi Phuong, T. (2025) “Road Traffic Accident Prediction by Machine Learning and GIS: Case Study in Thanh Hoa Province, Vietnam”, Periodica Polytechnica Transportation Engineering. https://doi.org/10.3311/PPtr.39701

Section

Articles