DOI: 10.1145/3586183.3606760
research-article
Open access

TouchType-GAN: Modeling Touch Typing with Generative Adversarial Network

Published: 29 October 2023
  • Abstract

    Models that can generate touch typing tasks are important to the development of touch typing keyboards. We propose TouchType-GAN, a Conditional Generative Adversarial Network that simulates the locations and timestamps of touch points in touch typing. TouchType-GAN takes arbitrary text as input and generates realistic touch typing both spatially (i.e., the (x, y) coordinates of touch points) and temporally (i.e., the timestamps of touch points). To prevent mode collapse, TouchType-GAN introduces a variational generator that estimates a Gaussian distribution for every target letter. Our experiments on a dataset of 3k typed sentences show that TouchType-GAN outperforms existing touch typing models, including the Rotational Dual Gaussian model [36] for simulating the distribution of touch points and the Finger-Fitts Euclidean Model [30] for simulating typing time. Overall, our research demonstrates that the proposed GAN structure can learn the distribution of user-typed touch points, and that the resulting TouchType-GAN can also estimate typing movements. TouchType-GAN can serve as a valuable tool for designing and evaluating touch typing input systems.
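As a rough illustration of the input and output described in the abstract (text in, one (x, y) touch point and one timestamp per letter out), the sketch below draws each touch point from a per-letter Gaussian, written in the mean-plus-scaled-noise form used by variational models [29]. It is not the TouchType-GAN architecture: nothing here is learned or adversarially trained, and the key centers, standard deviations, and inter-key interval values are hypothetical placeholders chosen only for the example.

```python
import random

# Hypothetical key centers in key-width units (row 0 = top letter row).
# These values are illustrative assumptions, not taken from the paper.
KEY_CENTERS = {"t": (4.0, 0.0), "i": (7.0, 0.0), "h": (5.5, 1.0)}


def sample_touch(letter, sigma=(0.3, 0.3), rng=random):
    """Draw one touch point for `letter` as mu + sigma * eps, eps ~ N(0, 1)."""
    mu_x, mu_y = KEY_CENTERS[letter]
    eps_x = rng.gauss(0.0, 1.0)
    eps_y = rng.gauss(0.0, 1.0)
    return (mu_x + sigma[0] * eps_x, mu_y + sigma[1] * eps_y)


def simulate_typing(text, seed=0, mean_interval_ms=200.0, sd_interval_ms=40.0):
    """Return a list of (letter, x, y, timestamp_ms) tuples for `text`.

    The inter-key interval is a noisy constant here (an assumption);
    a learned model would condition it on the text being typed.
    """
    rng = random.Random(seed)
    t_ms = 0.0
    touches = []
    for letter in text:
        x, y = sample_touch(letter, rng=rng)
        touches.append((letter, x, y, t_ms))
        # Clamp to at least 1 ms so timestamps are strictly increasing.
        t_ms += max(1.0, rng.gauss(mean_interval_ms, sd_interval_ms))
    return touches


if __name__ == "__main__":
    for letter, x, y, t_ms in simulate_typing("hit"):
        print(f"{letter}: x={x:.2f} y={y:.2f} t={t_ms:.0f} ms")
```

Fixing the seed makes a run reproducible, and setting `sigma=(0.0, 0.0)` collapses every sample onto its key center, which is the degenerate single-mode output that per-letter distributions are meant to avoid.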

    Supplementary Material

    ZIP File (3606760.zip)

    References

    [1]
    Johnny Accot and Shumin Zhai. 2003. Refining Fitts’ Law Models for Bivariate Pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Ft. Lauderdale, Florida, USA) (CHI ’03). Association for Computing Machinery, New York, NY, USA, 193–200. https://doi.org/10.1145/642611.642646
    [2]
    Dongsheng An, Yang Guo, Min Zhang, Xin Qi, Na Lei, and Xianfeng Gu. 2020. AE-OT-GAN: Training GANs from Data Specific Latent Distribution. In Computer Vision – ECCV 2020, Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). Springer International Publishing, Cham, 548–564.
    [3]
    Shiri Azenkot and Shumin Zhai. 2012. Touch Behavior with Different Postures on Soft Smartphone Keyboards. In Proceedings of the 14th International Conference on Human-Computer Interaction with Mobile Devices and Services (San Francisco, California, USA) (MobileHCI ’12). Association for Computing Machinery, New York, NY, USA, 251–260. https://doi.org/10.1145/2371574.2371612
    [4]
    Nikola Banovic, Varun Rao, Abinaya Saravanan, Anind K. Dey, and Jennifer Mankoff. 2017. Quantifying Aversion to Costly Typing Errors in Expert Mobile Text Entry. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Denver, Colorado, USA) (CHI ’17). Association for Computing Machinery, New York, NY, USA, 4229–4241. https://doi.org/10.1145/3025453.3025695
    [5]
    Xiaojun Bi, Yang Li, and Shumin Zhai. 2013. FFitts Law: Modeling Finger Touch with Fitts’ Law. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France) (CHI ’13). Association for Computing Machinery, New York, NY, USA, 1363–1372. https://doi.org/10.1145/2470654.2466180
    [6]
    Xiaojun Bi and Shumin Zhai. 2013. Bayesian Touch: A Statistical Criterion of Target Selection with Finger Touch. In Proceedings of the 26th Annual ACM Symposium on User Interface Software and Technology (St. Andrews, Scotland, United Kingdom) (UIST ’13). Association for Computing Machinery, New York, NY, USA, 51–60. https://doi.org/10.1145/2501988.2502058
    [7]
    Xiaojun Bi and Shumin Zhai. 2016. Predicting Finger-Touch Accuracy Based on the Dual Gaussian Distribution Model. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (Tokyo, Japan) (UIST ’16). Association for Computing Machinery, New York, NY, USA, 313–319. https://doi.org/10.1145/2984511.2984546
    [8]
    Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. 2016. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. In Proceedings of the 30th International Conference on Neural Information Processing Systems (Barcelona, Spain) (NIPS’16). Curran Associates Inc., Red Hook, NY, USA, 2180–2188.
    [9]
    Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, and Pedro J. Moreno. 2020. Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. In Proc. Interspeech 2020. International Speech Communication Association, Shanghai, China, 556–560. https://doi.org/10.21437/Interspeech.2020-1475
    [10]
    Jeremy Chu, Dongsheng An, Yan Ma, Wenzhe Cui, Shumin Zhai, Xianfeng Gu, and Xiaojun Bi. 2023. WordGesture-GAN: Modeling Word-Gesture Movement with Generative Adversarial Network. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3544548.3581279
    [11]
    Mark Davies. 2008. The corpus of contemporary American English (COCA): 560 million words, 1990-present.
    [12]
    Jesse H. Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, and Adam Roberts. 2019. GANSynth: Adversarial Neural Audio Synthesis. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. https://openreview.net/forum?id=H1xQVn09FX
    [13]
    Paul M. Fitts. 1954. The Information Capacity of the Human Motor System in Controlling the Amplitude of Movement. Journal of Experimental Psychology, Vol. 47. 381–391.
    [14]
    Andrew Fowler, Kurt Partridge, Ciprian Chelba, Xiaojun Bi, Tom Ouyang, and Shumin Zhai. 2015. Effects of language modeling and its personalization on touchscreen typing performance. In Proceedings of the 33rd annual ACM conference on human factors in computing systems. 649–658.
    [15]
    Mayank Goel, Jacob Wobbrock, and Shwetak Patel. 2012. GripSense: Using Built-in Sensors to Detect Hand Posture and Pressure on Commodity Mobile Phones. In Proceedings of the 25th Annual ACM Symposium on User Interface Software and Technology (Cambridge, Massachusetts, USA) (UIST ’12). Association for Computing Machinery, New York, NY, USA, 545–554. https://doi.org/10.1145/2380116.2380184
    [16]
    Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems, Z. Ghahramani, M. Welling, C. Cortes, N. Lawrence, and K. Q. Weinberger (Eds.). Vol. 27. Curran Associates, Inc., Montréal, Canada. https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf
    [17]
    Joshua Goodman, Gina Venolia, Keith Steury, and Chauncey Parker. 2002. Language modeling for soft keyboards. In Proceedings of the 7th international conference on Intelligent user interfaces. 194–195.
    [18]
    R.L. Graham. 1972. An efficient algorithm for determining the convex hull of a finite planar set. Inform. Process. Lett. 1, 4 (1972), 132–133. https://doi.org/10.1016/0020-0190(72)90045-2
    [19]
    Tovi Grossman and Ravin Balakrishnan. 2005. A Probabilistic Approach to Modeling Two-Dimensional Pointing. ACM Trans. Comput.-Hum. Interact. 12, 3 (sep 2005), 435–459. https://doi.org/10.1145/1096737.1096741
    [20]
    Niels Henze, Enrico Rukzio, and Susanne Boll. 2012. Observational and Experimental Investigation of Typing Behaviour Using Virtual Keyboards for Mobile Devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA) (CHI ’12). Association for Computing Machinery, New York, NY, USA, 2659–2668. https://doi.org/10.1145/2207676.2208658
    [21]
    Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. arXiv preprint arXiv:2006.11239 (2020).
    [22]
    Christian Holz and Patrick Baudisch. 2010. The Generalized Perceived Input Point Model and How to Double Touch Accuracy by Extracting Fingerprints. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Atlanta, Georgia, USA) (CHI ’10). Association for Computing Machinery, New York, NY, USA, 581–590. https://doi.org/10.1145/1753326.1753413
    [23]
    Christian Holz and Patrick Baudisch. 2011. Understanding Touch. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Vancouver, BC, Canada) (CHI ’11). Association for Computing Machinery, New York, NY, USA, 2501–2510. https://doi.org/10.1145/1978942.1979308
    [24]
    Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-Image Translation with Conditional Adversarial Networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, USA, 5967–5976. https://doi.org/10.1109/CVPR.2017.632
    [25]
    Xinhui Jiang, Yang Li, Jussi P.P. Jokinen, Viet Ba Hirvola, Antti Oulasvirta, and Xiangshi Ren. 2020. How We Type: Eye and Finger Movement Strategies in Mobile Typing. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for Computing Machinery, New York, NY, USA, 1–14. https://doi.org/10.1145/3313831.3376711
    [26]
    Jussi Jokinen, Aditya Acharya, Mohammad Uzair, Xinhui Jiang, and Antti Oulasvirta. 2021. Touchscreen Typing As Optimal Supervisory Control. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama, Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 720, 14 pages. https://doi.org/10.1145/3411764.3445483
    [27]
    Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, and Taesung Park. 2023. Scaling up GANs for Text-to-Image Synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
    [28]
    Tero Karras, Samuli Laine, and Timo Aila. 2021. A Style-Based Generator Architecture for Generative Adversarial Networks. IEEE Trans. Pattern Anal. Mach. Intell. 43, 12 (dec 2021), 4217–4228. https://doi.org/10.1109/TPAMI.2020.2970919
    [29]
    Diederik P. Kingma and Max Welling. 2014. Auto-Encoding Variational Bayes. arXiv:1312.6114 [stat.ML]
    [30]
    Yu-Jung Ko, Hang Zhao, Yoonsang Kim, IV Ramakrishnan, Shumin Zhai, and Xiaojun Bi. 2020. Modeling Two Dimensional Touch Pointing. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (Virtual Event, USA) (UIST ’20). Association for Computing Machinery, New York, NY, USA, 858–868. https://doi.org/10.1145/3379337.3415871
    [31]
    Jungil Kong, Jaehyeon Kim, and Jaekyoung Bae. 2020. HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. Advances in Neural Information Processing Systems 33 (2020), 17022–17033.
    [32]
    Solomon Kullback and Richard A Leibler. 1951. On information and sufficiency. The annals of mathematical statistics 22, 1 (1951), 79–86.
    [33]
    Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, and Timo Aila. 2019. Improved Precision and Recall Metric for Assessing Generative Models. In Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, USA, Article 353, 10 pages.
    [34]
    Seungyon Lee and Shumin Zhai. 2009. The Performance of Touch Screen Soft Buttons. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 309–318. https://doi.org/10.1145/1518701.1518750
    [35]
    Yuexing Luo and Daniel Vogel. 2014. Crossing-Based Selection with Direct Touch Input. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Toronto, Ontario, Canada) (CHI ’14). Association for Computing Machinery, New York, NY, USA, 2627–2636. https://doi.org/10.1145/2556288.2557397
    [36]
    Yan Ma, Shumin Zhai, IV Ramakrishnan, and Xiaojun Bi. 2021. Modeling Touch Point Distribution with Rotational Dual Gaussian Model. In The 34th Annual ACM Symposium on User Interface Software and Technology (Virtual Event, USA) (UIST ’21). Association for Computing Machinery, New York, NY, USA, 1197–1209. https://doi.org/10.1145/3472749.3474816
    [37]
    I. Scott MacKenzie and William Buxton. 1992. Extending Fitts’ Law to Two-Dimensional Tasks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Monterey, California, USA) (CHI ’92). Association for Computing Machinery, New York, NY, USA, 219–226. https://doi.org/10.1145/142750.142794
    [38]
    Akash Mehra, Jerome R. Bellegarda, Ojas Bapat, Hema Koppula, Rick Chang, Ashish Shrivastava, and Oncel Tuzel. 2021. Implicit vs. Explicit Style Transfer? A Comparison of GAN Architectures for Continuous Path Keyboard Input Modeling. In 2021 29th European Signal Processing Conference (EUSIPCO). IEEE Computer Society, USA, 1396–1400. https://doi.org/10.23919/EUSIPCO54536.2021.9615962
    [39]
    Akash Mehra, Jerome R. Bellegarda, Ojas Bapat, Partha Lal, and Xin Wang. 2020. Leveraging GANs to Improve Continuous Path Keyboard Input Models. In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE Computer Society, USA, 8174–8178. https://doi.org/10.1109/ICASSP40776.2020.9052978
    [40]
    Mehdi Mirza and Simon Osindero. 2014. Conditional Generative Adversarial Nets. arXiv:1411.1784 [cs.LG]
    [41]
    Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. 2019. GauGAN: Semantic Image Synthesis with Spatially Adaptive Normalization. In ACM SIGGRAPH 2019 Real-Time Live! (Los Angeles, California) (SIGGRAPH ’19). Association for Computing Machinery, New York, NY, USA, Article 2, 1 page. https://doi.org/10.1145/3306305.3332370
    [42]
    Philip Quinn and Shumin Zhai. 2018. Modeling gesture-typing movements. Human–Computer Interaction 33, 3 (2018), 234–280.
    [43]
    Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. https://doi.org/10.48550/ARXIV.1511.06434
    [44]
    Daniel Vogel and Ravin Balakrishnan. 2010. Occlusion-Aware Interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Atlanta, Georgia, USA) (CHI ’10). Association for Computing Machinery, New York, NY, USA, 263–272. https://doi.org/10.1145/1753326.1753365
    [45]
    Daniel Vogel and Patrick Baudisch. 2007. Shift: A Technique for Operating Pen-Based Interfaces Using Touch. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’07). Association for Computing Machinery, New York, NY, USA, 657–666. https://doi.org/10.1145/1240624.1240727
    [46]
    Daniel Vogel and Géry Casiez. 2012. Hand Occlusion on a Multi-Touch Tabletop. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA) (CHI ’12). Association for Computing Machinery, New York, NY, USA, 2307–2316. https://doi.org/10.1145/2207676.2208390
    [47]
    Feng Wang and Xiangshi Ren. 2009. Empirical Evaluation for Finger Input Properties in Multi-Touch Interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Boston, MA, USA) (CHI ’09). Association for Computing Machinery, New York, NY, USA, 1063–1072. https://doi.org/10.1145/1518701.1518864
    [48]
    Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. 2021. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. In 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). IEEE Computer Society, USA, 1905–1914. https://doi.org/10.1109/ICCVW54120.2021.00217
    [49]
    Daryl Weir, Henning Pohl, Simon Rogers, Keith Vertanen, and Per Ola Kristensson. 2014. Uncertain text entry on mobile devices. In Proceedings of the SIGCHI conference on human factors in computing systems. 2307–2316.
    [50]
    Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A Efros, Oliver Wang, and Eli Shechtman. 2017. Toward Multimodal Image-to-Image Translation. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.). Vol. 30. Curran Associates, Inc., Red Hook, NY, USA. https://proceedings.neurips.cc/paper/2017/file/819f46e52c25763a55cc642422644317-Paper.pdf

        Published In

        UIST '23: Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology
        October 2023
        1825 pages
        ISBN:9798400701320
        DOI:10.1145/3586183
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Machine Learning
        2. Mobile Devices: Phones/Tablets
        3. Tap Typing
        4. Touch/Haptic/Pointing/Gesture

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        UIST '23

        Acceptance Rates

        Overall Acceptance Rate 842 of 3,967 submissions, 21%
