Analytical tool prototype for monitoring road infrastructure projects in Colombia incorporating machine learning

Daniel David Bonilla Bonilla
Nicolás Alejandro Castellanos Roncancio
Lina María Gómez Montenegro
César Augusto Leal Coronado

Keywords

Machine learning, Project monitoring, Duration forecasting, Resource optimization, Cost forecasting, Recommendations

Abstract

This article presents a prototype analytical tool for controlling road infrastructure projects in Colombia. The tool incorporates machine learning techniques and serves both as a pedagogical resource and as an innovation in project management. Its purpose is to strengthen decision-making and training in academic and professional settings through predictive models that estimate project duration and cost, optimize resources, and generate automatic technical recommendations. Following the CRISP-DM methodology, historical data from the Infrastructure Project Manager (GPI) were collected and cleaned, and four models were developed: LightGBM for duration prediction, K-Means for resource optimization, linear regression for cost estimation, and Random Forest for recommendations. These models were integrated into an interactive interface that supports real-time use and fosters applied learning and reflective analysis. The results show high predictive accuracy and effective classification of projects into three efficiency levels. The authors conclude that incorporating data-driven analytical tools in educational and professional contexts not only improves project control and planning but also builds critical competencies in data analysis, risk mitigation, and strategic decision-making in complex scenarios.
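
As a rough illustration of the clustering step, the sketch below groups projects into three efficiency levels with a plain K-Means (Lloyd's algorithm) on a single feature. It is not the authors' implementation: the feature `duration_ratio` (actual duration divided by planned duration), the sample values, the evenly spaced initial centroids, and the level names are all assumptions made for the example, and the abstract does not state which features or libraries the prototype uses.

```python
from statistics import mean


def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on 1-D data.

    Initial centroids are spaced evenly across the data range so the
    result is deterministic (an assumption made for this sketch).
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign(cs):
        # Assignment step: index of the nearest centroid for each value.
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in values]

    for _ in range(max_iter):
        labels = assign(centroids)
        # Update step: each centroid moves to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:  # assignments stopped changing
            break
        centroids = updated

    return assign(centroids), centroids


# Hypothetical data: actual / planned duration for nine projects.
duration_ratio = [0.95, 1.00, 1.05, 1.35, 1.45, 1.50, 2.10, 2.30, 2.20]

labels, centroids = kmeans_1d(duration_ratio, k=3)

# Rank clusters by centroid: the lowest ratio is treated as the most efficient.
order = sorted(range(3), key=lambda j: centroids[j])
names = ["high", "medium", "low"]  # hypothetical level names
level_of = {cluster: names[rank] for rank, cluster in enumerate(order)}
levels = [level_of[lab] for lab in labels]

for ratio, level in zip(duration_ratio, levels):
    print(f"duration ratio {ratio:.2f} -> {level} efficiency")
```

A real tool would more likely cluster on several standardized features with a library implementation and choose the number of clusters using a validity index. The single-feature version here only shows the assign-and-update loop.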
