research-article

A semi-supervised learning detection method for vision-based monitoring of construction sites by integrating teacher-student networks and data augmentation

Published: 01 October 2021

Abstract

    Recently, deep-learning detection methods have achieved considerable success in the vision-based monitoring of construction sites for safety control and productivity analysis. However, these methods require large-scale training datasets, which are difficult to develop because construction images have limited accessibility and annotation is labor-intensive. To address this problem, this research proposes a semi-supervised learning detection method for construction site monitoring based on teacher–student networks and data augmentation. The proposed method requires only a limited amount of labeled data to achieve high detection performance in construction scenarios. First, the proposed method trains the teacher object detector with labeled data following weak data augmentation. Next, the trained teacher object detector generates pseudo-detection results from unlabeled images that have been weakly augmented. Finally, the student object detector is trained with the pseudo-detection results and unlabeled images that have been both weakly and strongly augmented. In our experiments, 10,000 annotated construction images from the Alberta Construction Image Dataset (ACID) were divided into a training set (70%) and a validation set (30%). The proposed method achieved a 91% mean average precision (mAP) on the validation set while requiring only 30% of the training set. In comparison, the existing supervised learning method ResNet50 Faster R-CNN achieved an mAP of 90.8% when trained on the full training set. These experimental results show the potential of the proposed method to reduce the time, effort, and cost of developing construction datasets. As such, this research has explored the potential of semi-supervised learning methods and increased the practicality of vision-based monitoring systems in the construction industry.
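The three training steps and the dataset split described in the abstract can be sketched in a few lines of Python. This is only a data-flow illustration under stated assumptions, not the authors' implementation: `PlaceholderDetector` stands in for a real detector such as Faster R-CNN, the two augmentation functions are hypothetical placeholders (the abstract does not list the actual operations), and the `min_score` confidence filter is an assumption borrowed from common pseudo-labeling practice. The split helper also assumes that "30% of the training set" refers to the labeled share, with the remainder used as unlabeled images.

```python
import random
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    label: str
    score: float


def split_counts(total=10_000, train_pct=70, labeled_pct=30):
    """Image counts implied by the percentages in the abstract."""
    train = total * train_pct // 100
    labeled = train * labeled_pct // 100
    return {"train": train, "validation": total - train,
            "labeled": labeled, "unlabeled": train - labeled}


def weak_augment(image, rng):
    # Hypothetical weak op: a small brightness shift, clipped to [0, 255].
    shift = rng.randint(-5, 5)
    return [[min(255, max(0, p + shift)) for p in row] for row in image]


def strong_augment(image, rng):
    # Hypothetical strong op: Cutout-style erasing applied to the weak view.
    # It moves no pixels, so boxes predicted on the weak view stay valid.
    out = [row[:] for row in image]
    h, w = len(out), len(out[0])
    y0, x0 = rng.randrange(h), rng.randrange(w)
    for y in range(y0, min(h, y0 + h // 2)):
        for x in range(x0, min(w, x0 + w // 2)):
            out[y][x] = 0
    return out


class PlaceholderDetector:
    """Stand-in for a real detector; it only records its training samples."""

    def __init__(self):
        self.samples = []

    def fit(self, samples):
        self.samples.extend(samples)
        return self

    def predict(self, image):
        # Dummy output: one full-image box scored by mean pixel intensity.
        h, w = len(image), len(image[0])
        mean = sum(map(sum, image)) / (h * w)
        return [Detection((0, 0, w, h), "object", mean / 255)]


def train_semi_supervised(labeled, unlabeled, min_score=0.5, seed=0):
    rng = random.Random(seed)
    # Step 1: train the teacher on weakly augmented labeled data.
    teacher = PlaceholderDetector().fit(
        [(weak_augment(img, rng), boxes) for img, boxes in labeled])
    # Step 2: the teacher produces pseudo-detections on weak views.
    # Step 3: the student trains on strongly augmented versions of those views.
    student_samples = []
    for img in unlabeled:
        weak = weak_augment(img, rng)
        pseudo = [d for d in teacher.predict(weak) if d.score >= min_score]
        if pseudo:  # assumed filter: skip images with no confident detection
            student_samples.append((strong_augment(weak, rng), pseudo))
    student = PlaceholderDetector().fit(student_samples)
    return teacher, student
```

With its defaults, `split_counts()` gives 7,000 training images and 3,000 validation images, of which 2,100 training images would be labeled under the assumption above. Applying the strong augmentation on top of the weak view, rather than to the raw image, is a design choice of this sketch that keeps the pseudo-boxes aligned with the student's input.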


          Published In

          Advanced Engineering Informatics, Volume 50, Issue C (Oct 2021), 1047 pages

          Publisher: Elsevier Science Publishers B. V., Netherlands

          Author Tags

          1. Deep learning
          2. Object detection
          3. Teacher-student networks
          4. Data augmentation
          5. Vision-based monitoring
          6. Construction sites
