Expanding speech interaction for domestic activities

Dissertation of Nima Zargham (2024)

Due to technological advancements, communicating with computer systems using natural language has become commonplace. Speech-based systems have become widely popular in everyday activities due to the intuitive nature of their interaction. Speech interaction inherently encompasses a social component, as it reflects the fundamental human capacity for communication and enables interpersonal engagement through verbal exchange. This makes speech interaction with computers an essential topic of research in the field of human-computer interaction. Along with the development of speech-based systems, users' demands on and expectations of such systems grow. Despite the popularity of speech interaction, designing a gratifying experience for users interacting with such systems remains challenging. This can be attributed partly to technical constraints, such as challenges with speech recognition, and partly to experiential limitations, where these systems fail to meet users' needs and expectations as communication partners.

This dissertation explores human-agent speech interaction in domestic activities through a series of user studies and interviews. The primary focus lies in the practical application of speech technology for everyday activities, particularly within two application domains, homes and video games, to identify factors contributing to successful speech interaction. Drawing inspiration from communication and human-computer interaction models, a novel interaction model is introduced to provide a more nuanced understanding of the dynamics in human-agent speech interaction.
Different dimensions of speech systems are analyzed, including the utility and efficacy of speech systems, diverse representations of speech agents, and the style in which these systems interact with users. By examining users' needs and addressing current issues, designing new features to enhance existing systems, and proactively anticipating potential future challenges, this work takes a comprehensive approach, encompassing a retrospective, current, and future outlook. The aim is to provide a broad perspective on designing speech systems, offering relevant design factors and recommendations to achieve a better user experience with such systems.
This thesis contributes to the fields of human-computer interaction, voice user interfaces, game user research, and user experience design within both academia and industry.
