Eye Gaze in Salience Modeling for Spoken Language Processing

Motivations and Objectives

Previous psycholinguistic work has shown that eye gaze is tightly linked to human language processing. Almost immediately after hearing a word, the eyes move to the corresponding real-world referent. Likewise, right before speaking a word, the eyes move to the object being mentioned. Not only is eye gaze highly reliable, it is also an implicit, subconscious reflex of speech. The user does not need to make a conscious decision; the eyes automatically move towards the relevant object, without the user even being aware of it. Motivated by these psycholinguistic findings, our hypothesis is that during human-machine conversation, user eye gaze information coupled with conversation context can signal the part of the physical world (related to the domain and the graphic interface) that is most salient at each point of communication. This salience in the physical world will in turn prime what users communicate to the system, and thus can be used to tailor the interpretation of speech input. Based on this hypothesis, this project examines the role of eye gaze in human language production during human-machine conversation and develops algorithms and systems that incorporate gaze-based salience modeling for robust spoken language understanding. Supported by NSF (Co-PI: Fernanda Ferreira, University of Edinburgh).
