Analytical tool prototype for monitoring road infrastructure projects in Colombia incorporating machine learning
Keywords
Machine learning, Project monitoring, Duration forecasting, Resource optimization, Cost forecasting, Recommendations
Abstract
This article presents a prototype analytical tool for controlling road infrastructure projects in Colombia. The tool incorporates machine learning techniques and serves both as a pedagogical resource and as an innovation in project management. Its purpose is to strengthen decision-making and training in academic and professional settings through predictive models that estimate project duration and cost, optimize resources, and generate automatic technical recommendations. Following the CRISP-DM methodology, historical data from the Infrastructure Project Manager (GPI) were collected and cleaned, and four models were developed: LightGBM for duration prediction, K-Means for resource optimization, linear regression for cost estimation, and Random Forest for recommendations. These models were integrated into an interactive interface that supports real-time use, applied learning, and reflective analysis. The results show high predictive accuracy and effective classification into three efficiency levels. The study concludes that incorporating data-driven analytical tools in educational and professional contexts improves project control and planning and also builds critical competencies in data analysis, risk mitigation, and strategic decision-making in complex scenarios.
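To illustrate how a K-Means step can yield three efficiency levels, the following is a minimal sketch and not the authors' implementation. It assumes a single hypothetical efficiency score per project (the abstract does not state which features were clustered), uses made-up values, and relies only on the Python standard library; the names `kmeans_1d`, `scores`, and `levels` are illustrative.

```python
from statistics import mean

def kmeans_1d(values, k=3, max_iter=100):
    """Cluster scalar values into k groups with Lloyd's algorithm."""
    ordered = sorted(values)
    # Deterministic initialisation: spread starting centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical efficiency scores, one per project (assumed data, not from the article).
scores = [0.62, 0.58, 0.70, 0.95, 1.02, 0.98, 1.30, 1.25, 1.41]
labels, centroids = kmeans_1d(scores, k=3)

# Rank clusters by centroid so the level names follow the score ordering.
rank = {c: r for r, c in enumerate(sorted(range(3), key=lambda c: centroids[c]))}
names = ["low", "medium", "high"]
levels = [names[rank[lab]] for lab in labels]
print(list(zip(scores, levels)))
```

In practice a library implementation with multi-feature input would likely be preferable; the point here is only that cluster labels are arbitrary and must be ordered (for example by centroid) before they can be read as efficiency levels.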

