early warning system; machine learning model; mosquito abundance; nowcasting; Plasmodium vivax malaria
Abstract
Since a resurgence occurred in 1993, malaria has remained an endemic disease in the Republic of Korea (ROK). A major challenge is the inaccessibility of current vector mosquito abundance data due to a 2-week reporting delay, which limits timely implementation of control measures. We aimed to nowcast mosquito abundance and assess its utility by evaluating the predictive value of mosquito abundance for malaria epidemic peaks. We used machine learning models to nowcast mosquito abundance, employing gradient boosting models (GBMs), extreme gradient boosting (XGB), and an ensemble model combining both. Various meteorological factors served as predictors. The models were trained with data from mosquito collection sites between 2009 and 2021 and tested with data from 2022. To evaluate the utility of nowcasting, we calculated the effective reproduction number (Rt), which can indicate malaria epidemic peaks. Generalized linear models (GLMs) were then used to assess the impact of vector mosquito abundance on Rt. The ensemble model demonstrated the best performance in nowcasting mosquito abundance, with a root mean square error (RMSE) of 0.90 and an R-squared (R2) value of 0.85. The GBM showed an RMSE of 0.91 and R2 of 0.84, while the XGB model had an RMSE of 0.92 and R2 of 0.85. Additionally, the R2 of the GLMs predicting Rt using mosquito abundance 2 weeks in advance was >0.72 for all provinces. The mosquito abundance coefficients were also significant. We constructed reliable models to nowcast mosquito abundance. These outcomes could potentially be incorporated into a malaria early warning system. Our study provides evidence to support the development of malaria management strategies in regions where malaria remains a public health challenge.
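For readers unfamiliar with the two performance metrics, the following minimal sketch shows how RMSE and R2 can be computed for an ensemble of two models. It rests on assumptions not stated in the abstract: the ensemble is taken here as a simple unweighted mean of the two models' predictions, and all numbers are made-up illustrative values, not study data. The function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def ensemble_mean(pred_a, pred_b):
    """Assumed combination rule: unweighted average of two prediction lists."""
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]

# Illustrative values only (not from the study).
observed = [1.0, 2.0, 3.0, 4.0]
gbm_pred = [1.2, 1.8, 3.4, 3.6]   # stand-in for GBM output
xgb_pred = [1.0, 2.2, 3.0, 4.4]   # stand-in for XGB output
ensemble_pred = ensemble_mean(gbm_pred, xgb_pred)

for name, pred in [("GBM", gbm_pred), ("XGB", xgb_pred), ("Ensemble", ensemble_pred)]:
    print(f"{name}: RMSE={rmse(observed, pred):.3f}, R2={r_squared(observed, pred):.3f}")
```

In this toy example the averaged predictions have a lower RMSE than either input, which is the usual motivation for combining models. Averaging does not guarantee an improvement in general.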