DOI: 10.1007/978-3-030-11012-3_31
Article

A Semi-supervised Data Augmentation Approach Using 3D Graphical Engines

Published: 29 January 2019 Publication History
    Abstract

    Deep learning approaches have been rapidly adopted across a wide range of fields because of their accuracy and flexibility, but they require large labeled training datasets. This presents a fundamental problem for applications with limited, expensive, or private data (i.e., small data), such as human pose and behavior estimation and tracking, which can be highly personalized. In this paper, we present a semi-supervised data augmentation approach that synthesizes large-scale labeled training datasets using 3D graphical engines, based on a physically valid, low-dimensional pose descriptor. To evaluate how well the synthesized datasets train deep learning models, we used the proposed approach to generate a large synthetic human pose dataset, called ScanAva, from 3D scans of only 7 individuals. A state-of-the-art deep learning model for human pose estimation was then trained from scratch on ScanAva and, after an efficient domain adaptation was applied to the synthetic images, achieved a pose estimation accuracy of 91.2% under the PCK0.5 criterion. This accuracy is comparable to that of the same model trained on large-scale pose data from real humans, such as the MPII dataset, and much higher than that of the model trained on other synthetic human datasets, such as SURREAL.
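
    For readers unfamiliar with the metric, PCK (percentage of correct keypoints) counts a predicted joint as correct when it lies within a fraction of a reference length from the ground-truth joint. The sketch below is a minimal illustration in Python with NumPy, not the authors' evaluation code. The function name `pck`, its argument names, and the use of a per-image reference length (for example a torso or head size) are assumptions; the abstract does not state which normalization was used.

```python
import numpy as np


def pck(pred, gt, ref_lengths, alpha=0.5):
    """Fraction of joints predicted within alpha * reference length.

    pred, gt:     arrays of shape (n_images, n_joints, 2), pixel coordinates.
    ref_lengths:  array of shape (n_images,), one normalization length per
                  image (assumption: e.g. torso or head size).
    alpha:        threshold fraction; 0.5 corresponds to "PCK at 0.5".
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    ref = np.asarray(ref_lengths, dtype=float)

    # Euclidean distance between each predicted and ground-truth joint.
    dists = np.linalg.norm(pred - gt, axis=-1)      # (n_images, n_joints)

    # A joint is correct if its error is at most alpha times the
    # reference length of its own image.
    correct = dists <= alpha * ref[:, None]
    return float(correct.mean())


if __name__ == "__main__":
    # Hypothetical toy data: one image, two joints, reference length 10 px.
    gt = [[[0.0, 0.0], [10.0, 0.0]]]
    pred = [[[3.0, 4.0], [10.0, 20.0]]]   # errors of 5 px and 20 px
    print(pck(pred, gt, ref_lengths=[10.0]))
```

    With a threshold of 0.5 and a reference length of 10 px, the first joint (5 px error) counts as correct and the second (20 px error) does not, so half of the joints in this toy example pass.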


            Published In

            Computer Vision – ECCV 2018 Workshops: Munich, Germany, September 8-14, 2018, Proceedings, Part II
            Sep 2018
            695 pages
            ISBN:978-3-030-11011-6
            DOI:10.1007/978-3-030-11012-3
            Editors: Laura Leal-Taixé, Stefan Roth

            Publisher

            Springer-Verlag

            Berlin, Heidelberg

            Author Tags

            1. Data augmentation
            2. Deep learning
            3. Domain adaptation
            4. Human pose estimation
            5. Low dimensional subspace learning
