
Social interaction in augmented reality

  • Mark Roman Miller ,

    Contributed equally to this work with: Mark Roman Miller, Hanseul Jun, Fernanda Herrera

    Roles Conceptualization, Data curation, Investigation, Software, Writing – original draft, Writing – review & editing

    mrmillr@stanford.edu

    Affiliation Department of Computer Science, Stanford University, Stanford, CA, United States of America

  • Hanseul Jun ,

    Contributed equally to this work with: Mark Roman Miller, Hanseul Jun, Fernanda Herrera

    Roles Conceptualization, Formal analysis, Investigation, Software, Writing – original draft, Writing – review & editing

    Affiliation Department of Communication, Stanford University, Stanford, CA, United States of America

  • Fernanda Herrera ,

    Contributed equally to this work with: Mark Roman Miller, Hanseul Jun, Fernanda Herrera

    Roles Formal analysis, Writing – original draft, Writing – review & editing

    Affiliation Department of Communication, Stanford University, Stanford, CA, United States of America

  • Jacob Yu Villa,

    Roles Conceptualization, Investigation, Software, Writing – original draft

    Affiliation Department of Communication, Stanford University, Stanford, CA, United States of America

  • Greg Welch,

    Roles Conceptualization, Funding acquisition, Writing – review & editing

    Affiliation Department of Computer Science, University of Central Florida, Orlando, FL, United States of America

  • Jeremy N. Bailenson

    Roles Conceptualization, Funding acquisition, Resources, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Communication, Stanford University, Stanford, CA, United States of America

Abstract

There have been decades of research on the usability and educational value of augmented reality. However, less is known about how augmented reality affects social interactions. The current paper presents three studies that test the social psychological effects of augmented reality. Study 1 examined participants’ task performance in the presence of embodied agents and replicated the typical pattern of social facilitation and inhibition. Participants performed a simple task better, but a hard task worse, in the presence of an agent than when they completed the tasks alone. Study 2 examined nonverbal behavior. Participants met an agent sitting in one of two chairs and were asked to choose one of the chairs to sit on. Participants wearing the headset never sat directly on the agent when given the choice of two seats, and while approaching, most participants chose a rotation direction that avoided turning their heads away from the agent. A separate group of participants chose a seat after removing the augmented reality headset, and the majority still avoided the seat previously occupied by the agent. Study 3 examined the social costs of using an augmented reality headset with others who are not using a headset. Participants talked in dyads, and augmented reality users reported less social connection to their partner than those not using augmented reality. Overall, these studies provide evidence suggesting that task performance, nonverbal behavior, and social connectedness are significantly affected by the presence or absence of virtual content.

Introduction

Augmented reality (AR) has captured the attention of both the public and corporations with its ability to seamlessly integrate digital content with the real-world environment. AR devices like the Microsoft HoloLens and Magic Leap One allow users to see the real physical world but superimpose a layer of digital content such that users see virtual models mixed in with the actual world around them. Typically, the digital objects–which can be anything ranging from a simple shape to a realistic model of a person–are rendered in stereo (i.e., with separate images projected into each eye) to give the illusion of depth when situated next to real objects. Moreover, the digital objects are registered, using cameras and sensors that track a user’s position in an absolute location, such that when a person moves, the object stays in the programmed position.

While the general interest in AR may be recent and growing, academic researchers have been building and testing the technology for decades [1,2]. Many experiments focus on the technical and design aspects of AR, but less is known about how social interactions are affected by the technology. Gordon Allport [3] defined the field of social psychology as “an attempt to understand and explain how the thoughts, feelings, and behavior of individuals are influenced by the actual, imagined, or implied presence of others”. This well-accepted definition is broad enough to include virtual people rendered in AR. Allport’s extension of social psychology to imagined and implied others is prescient, given the new types of social interactions that become possible in AR.

Imagine a cocktail party, with dozens of people socializing, about half of whom are wearing AR headsets. Perhaps some people will choose to invite other virtual avatars into the party, and these can only be seen by those wearing headsets. The presence of avatars might change the way people talk, gesture, and socialize. For example, what happens when an AR user at the party is forced to violate the personal space of either an avatar who is registered in a specific location or a real human who is not aware of the avatar? While similar issues occur with phone calls and videoconferencing, the unique aspect of AR is that these avatars are grounded spatially in the room in a set position among the physical people.

In the example above, the people rendered into the cocktail party are avatars, real-time renderings of other people. Collaborative AR systems, like the one in the cocktail party example, allow remote or co-located users to interact with each other. Co-located systems usually consist of two or more users who wear AR headsets in the same physical location and are able to see and interact with the same virtual content. Remote systems usually consist of two or more users in different physical locations. A common example consists of one user wearing a headset while remote others assist in a specific task by looking at the same content through either a monitor or a second AR headset [4].

Collaborative AR systems have been explored as tools to improve completion time and reduce mental effort in design tasks [5,6], facilitate communication between collaborators [7] and increase mutual understanding [8]. For a review, see [9]. Overall, these pioneering studies focused on testing the efficacy of various systems, usability, and design, but have yet to examine the social psychological aspects of AR.

Other AR systems feature social interaction by employing embodied agents, which are characters whose verbal and nonverbal behavior is generated algorithmically in response to users’ behavior. For example, in Fragments [10], the application which, as of May 2019, has received more user ratings than any other application for the Microsoft HoloLens [11], the AR user interacts with many embodied agents while solving a crime in a game that integrates digital and physical objects into the game’s narrative.

In immersive virtual reality (VR), there have been hundreds of papers examining how users respond to virtual humans, whether they are avatars or agents. For a recent review of this literature, see [12]. However, the extent to which these findings extend to virtual humans in AR is not yet known outside of a handful of studies. A recent review by Kim and colleagues [13] surveyed every paper published in the International Symposium on Mixed and Augmented Reality (ISMAR), the leading academic conference on AR, over the past decade (2008–2017). In their analysis, Kim and colleagues found an increasing trend in evaluative studies (i.e., studies in which AR systems are built and tested for their effectiveness or the psychological effects they have on users). For example, Steptoe, Julier, and Steed [14] demonstrated that the difference in visual quality between rendered and real objects reduces feelings of presence. However, this decrease in presence disappeared when both the real and virtual objects were passed through an image filter, eliminating the difference in visual quality. According to Kim and colleagues’ survey of ISMAR [13], less than two percent of all papers (9 of 526) examined any type of social collaboration.

Many of these studies in the subset of AR social interaction have focused on qualitative findings, or anecdotal accounts of how the system was received by users. For example, one AR interface presented participants with an agent that floated over their heads. Researchers noted the participants felt uncomfortable when it looked down at them [15]. Similarly, Wagner [16] created a conversational agent describing art pieces in a gallery. Some participants noted they felt insulted when the agent violated social norms and faced away from them while speaking to them. These qualitative observations suggest agents tend to elicit social responses and inform our understanding of human interactions with agents. However, more empirical evidence providing support for these claims is needed.

A number of studies by Welch and colleagues demonstrate that embodied agents can achieve higher social presence (as measured by self-report or behavioral data) by successfully integrating virtual content with the real world. In a study by Kim and colleagues [17], participants interacted with a virtual agent while a real fan was blowing in the room. Participants encountered one of three conditions: the fan did not affect the virtual world, the fan blew a virtual paper, or the virtual agent reacted to this airflow and held down the corner of the paper. The perceived social presence of the virtual agent was greater in the two conditions where virtual content reacted to real physical events. This work suggests that the more virtual content reacts to the physical environment, the more virtual humans seem real.

In another study by Kim and colleagues [18], a virtual agent in a virtual motorized wheelchair interviewed participants. In the two conditions, the virtual agent's behavior was either behaviorally unrealistic (e.g., it passed through doors, did not avoid physical objects, and was not occluded by physical objects in front of it) or behaviorally realistic (e.g., it requested to open the door, asked the participant to move a chair, and was occluded by physical objects in front of it). The participants who saw the behaviorally realistic agent had higher social presence scores and thought of the agent as more intelligent than participants who interacted with the agent that behaved unrealistically.

Moreover, embodiment can also increase social presence and trust in an agent performing tasks. A recent study had participants interact with an intelligent virtual assistant inspired by Amazon's Alexa. The embodiment of the virtual agent was either not visible, visible with gestures, or visible with gestures and locomotion (i.e., it was able to move across the room). Participants rated the agent with the highest level of fidelity (i.e., visibility, gestures, and locomotion) as more socially present and trustworthy compared to the other conditions [19]. This study suggests AR users prefer high fidelity agents that interact with objects in a room using socially typical animations rather than disembodied agents.

The present investigation uses the work by Welch and colleagues as a baseline, and focuses on new questions, namely how AR social interaction changes task performance, nonverbal behavior, and social connection with other physically co-located people. We review the relevant literature for each of these outcomes.

Task performance: Social facilitation and inhibition

Much of the past work building and testing AR applications has centered on task performance. In a review of AR applications [13], Kim et al. identify six common types of applications: military, industry, healthcare, games, tours, and media. In the three that are most common (military, industry, and healthcare applications), the goal is often to improve task speed or quality [20–22].

The addition of virtual humans will likely influence performance. A well-studied theory in social psychology is social facilitation and inhibition. Social facilitation refers to the tendency of people to perform simple tasks faster in the presence of others. Conversely, social inhibition is the tendency to perform complex tasks poorly in the presence of others. These concepts were first studied 120 years ago when Triplett [23] noted faster winding speeds in a reel-winding competition for children in pairs when compared to children who completed the task alone. This effect was further investigated by Allport [24], who found people wrote free chain associations faster in the presence of others than alone, even when competition was explicitly barred. However, later research found an opposing effect. When participants tried to learn nonsense syllables in the presence of others rather than alone, they took more trials to memorize the syllables and made more errors during the process [25].

These contradictory effects were addressed by Zajonc’s drive theory [26], which was able to explain both enhanced and impaired performance in the presence of others by making a distinction between simple and complex tasks. Zajonc’s drive theory posits that the presence of others increases one’s arousal level, or drive, and that this increase in arousal leads to enhanced or impaired performance depending on task difficulty. Since then, multiple studies have replicated these social facilitation and inhibition findings (see [27] for a review). Other explanations have been given for the effect, including self-presentation theory (see [28] for a meta-analysis). Self-presentation theory posits that the audience strongly motivates the participant to perform well, which aids performance regardless of difficulty. The difference between easy and hard tasks arises when the performer becomes embarrassed or distracted by their poor performance on hard tasks.

It is important to note that the study of social facilitation and inhibition has not been limited to the physical presence of real people. Dashiell [29] evaluated performance of a mathematical task in two conditions. In each case, participants were physically alone in their individual experiment rooms; however, in one condition, subjects were aware that others were being tested at the same time. Even though participants had no contact with other participants, the implied presence of others was sufficient to elicit a social facilitation effect. More recently, researchers have expanded the study of social facilitation to include the presence of virtual others (i.e., avatars and agents) in immersive and non-immersive virtual environments.

Early research examining the effect of virtual humans consistently demonstrated a social inhibition effect, with participants struggling to perform complex tasks in front of virtual others, but was unable to reproduce a social facilitation effect [30–32]. However, Park and Catrambone [33] were able to demonstrate both social facilitation and inhibition effects by using a different experimental method. Instead of having participants train on one task so the tasks varied in familiarity, as the earlier studies had done [30–32], participants were given tasks that were pretested to vary in difficulty. In this study, participants performed three tasks at two levels of difficulty (easy or hard) in the presence of a real person, in the presence of an agent, or alone. Results showed both social facilitation and inhibition effects in both the virtual human and real human conditions.

In sum, there is a robust literature showing that an audience, whether physically present or rendered in immersive VR, influences performance. The extent to which this finding extends to AR is critical to understand, given that people may use this technology in their daily lives, performing tasks ranging from navigation to repairs to business meetings.

Nonverbal behavior: Interpersonal distance and eye-contact

Nonverbal behavior plays a major role in communication [34]. Two well-studied components of nonverbal communication are interpersonal distance and eye-contact. Interpersonal distance refers to the physical distance that individuals maintain during social interactions. Though there are many theoretical accounts of how people regulate interpersonal distance, in general, most people tend to choose a distance that is comfortable, given the context and social relationships among people (see [35] for a review). When it comes to eye-contact, past research has demonstrated that the use of eye-contact helps individuals regulate interactions, express intimacy, provide information, and facilitate collaboration [36].

Similarly, in immersive VR, two of the most studied aspects of nonverbal behavior are interpersonal distance and eye-contact (see [37] for a review). Early work demonstrated that users are reluctant to walk through other virtual humans [38], and that they tend to maintain interpersonal distance with virtual humans [39]. In the past 15 years, many subsequent studies have examined interpersonal distance between users and virtual humans in VR, suggesting that users tend to follow this social norm with both avatars and agents (see [40] for a recent review). Additionally, past research has demonstrated that users tend to maintain eye-contact with avatars and agents in virtual environments [41].

While previous work on interpersonal distance and eye-contact provides clear predictions for how people should interact with AR agents–by respecting their space and maintaining a proper distance and eye-contact during an interaction–predictions are less clear for what happens moments after they see the rendering of an agent. AR differs from most media in that it superimposes digital content onto the physical environment. This raises a novel possibility in the tradition of media effects: users might form associations between virtual content and the physical objects it was attached to, and these associations may remain even after the AR headset is no longer in use and users can no longer see the virtual content. In other words, after the medium has been turned off, affective responses and behaviors toward specific physical objects might change due to the newly formed associations with previously rendered virtual content.

In sum, AR, in which virtual humans are intermingled with physical ones, will present unique challenges to the norms of nonverbal behavior. Given that AR is designed to be used in public places around other physically present people, understanding its effects on nonverbal behavior is critical.

Social connectedness: Users and non-users

In his 1992 novel, Snow Crash, Neal Stephenson discusses “Gargoyles,” people who use AR/VR in public. He explains that “Gargoyles are no fun to talk to. They never finish a sentence. They are adrift in a laser-drawn world” [42]. Of course, the same can be said for cell phone usage. Past research has demonstrated that the mere presence of smartphones during face-to-face (FtF) conversations has an effect on communication outcomes. More specifically, Przybylski and Weinstein [43] showed that when a smartphone was present during a conversation, partner closeness was lower despite the phone not playing an active part in the conversation. In a different study, Misra and colleagues [44] found that conversations between interaction partners in the absence of a smartphone were considered to be significantly higher in quality than conversations where a mobile device was present. Additionally, people who had conversations in the absence of a smartphone had significantly higher empathic concern scores than people who had conversations in the presence of a smartphone. Vanden Abeele and colleagues [45] demonstrated that when one person actively used their phone during a conversation, their conversation partner formed more negative impressions of them, found them less polite, and rated the quality of the conversation lower, compared to partners of those who did not actively use their phone.

When it comes to see-through AR headsets specifically, one of the main goals is to superimpose virtual content onto the user’s real-world environment with the purpose of providing the user with additional information about their surroundings [2]. This affordance makes it possible for AR users to interact with virtual content that is visible only to them, which may make bystanders curious or uncomfortable. Furthermore, virtual content may intentionally or unintentionally be rendered on top of people the AR user is interacting with, potentially disrupting the interaction by causing the violation of social norms. In the case of eye-contact, AR may create situations where establishing and maintaining eye-contact is difficult. For example, the AR headset itself may prevent users and non-users from making eye-contact. Additionally, virtual content that occludes people’s faces would prevent the AR user from establishing eye-contact. However, to our knowledge, there have been no studies assessing the effect of socially interacting with someone who is occluded by private, virtual content.

In AR, it is likely that other people who are not wearing an AR headset (i.e., non-users) in a room will not be aware of all the digital content being rendered to the users. The presence of AR objects or virtual humans may distract users, preventing them from focusing on the non-users they were interacting with. This may lead to a loss of common ground, or mutually shared information [46], between AR users and non-users, and to the violation of multiple social norms (e.g., eye-contact, turn-taking during a conversation, and interpersonal distance). A case study of a conversation between an AR user wearing Google Glass (a type of AR headset) and a non-user suggests that Glass disrupted turn-taking between individuals, leading to poor rapport [47]. In the final pages of their survey on AR, Billinghurst, Clark, and Lee [1] focus on the challenges around social acceptance of AR. While Google Glass received some criticism, the authors point out there is still very little empirical research on how AR use is perceived by non-users and on the effects that wearing a headset has on social interactions.

Overview of studies

In this investigation, three studies were conducted to assess the social effects surrounding augmented reality use.

Study 1 examines how embodied agents affect the way people perform tasks in the physical world. A robust literature on social facilitation demonstrates that people tend to benefit from an audience when they perform easy tasks but tend to be impaired by the same audience when they perform difficult tasks. Given that AR users will likely render virtual humans while performing daily tasks in the real world, the extent to which these processes replicate in AR is a pressing question.

Study 2 examines how social interactions in AR will change users’ subsequent nonverbal behavior in the physical world. Given there is a spatial component to AR social interactions [48], it is likely that virtual humans in AR will be associated with objects or locations in the physical room where they were rendered. These associations may affect social behavior even after the virtual content is no longer displayed in the physical environment.

Study 3 examines the effect that the presence of see-through AR headsets during a FtF dyadic interaction has on interpersonal outcomes. Given virtual content is often rendered in specific locations in the physical environment, it is likely that some of it may partially or completely occlude people the AR user is interacting with. Thus, we also examine the effect that interacting with someone whose face is either completely occluded by virtual content or not occluded at all has on interpersonal outcomes.

Study 1: Task performance

Given that the few experiments examining social interaction with AR agents have demonstrated that virtual humans in AR elicit social responses and affect communication outcomes [15–18], we hypothesize the following:

Hypothesis 1. Compared to being alone, participants who perform an easy cognitive task in the presence of an AR agent will have enhanced performance while participants who perform a difficult cognitive task in the presence of the same AR agent will have impaired performance.

Methods

Participants.

A total of 60 participants (32 female, 28 male) were recruited from Stanford University. The recruitment and experiment processes were approved by the Stanford IRB under protocol IRB-45211. The participants shown in figures have given written informed consent to publish their likeness under the CC-BY license. Participants were given course credit for completing the experiment. Fifty-eight participants’ ages were between 18 and 24, and two participants were between 25 and 35 years old. When choosing the sample size, we considered the closest work, which was [33]. Using an estimate of their effects, we ran a power analysis for a within-subjects design and found that 60 subjects provided statistical power of 0.8. This also gave us 15 subjects per condition for a between-subjects design, which is a typical sample size in previous studies (see [49] for a list of sample sizes of similar studies).

Materials.

Cognitive Task. The cognitive task completed by the participants was an anagram task. Similar to Park and Catrambone [33], anagrams were split into two groups, easy and hard. Easy anagrams were chosen from Tresselt and Mayzner’s [50] anagram set with median solution times between 3 and 13 seconds. Hard anagrams were chosen from the same set but had a median solution time between 17 and 143 seconds. From each of these sets, two posters of ten anagrams were selected, for a total of four posters. Within the same difficulty level, the anagrams were divided such that the posters would be as similar in difficulty as possible. For the complete list of anagrams see S1 Appendix.

Apparatus. Participants wore the Microsoft HoloLens AR headset with a 60 Hz refresh rate and a resolution of 1268 x 720 per eye. The virtual field of view of the HoloLens is about as large as a letter size or A4 paper at arm's length (i.e., horizontal and vertical fields of view of 30° and 17.5° respectively [51]). To see this size in the experiment space, refer to Fig 1. However, the field of view for physical objects was not impaired given the HoloLens is a see-through AR headset and participants can still see the physical environment around them. The device weighed 579 grams, recorded audio, and tracked headset position (x, y, z) and orientation (yaw, pitch, roll) [52].
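
As a rough sanity check of that letter-paper comparison (our own arithmetic; the 0.6 m arm's length is an assumption, not a value from the paper), the visible extent of the display follows directly from the angular field of view:

```python
import math

def extent_at_distance(fov_deg: float, distance_m: float) -> float:
    """Linear extent subtended by an angular field of view at a given distance."""
    return 2 * distance_m * math.tan(math.radians(fov_deg / 2))

arm = 0.6  # assumed arm's length in meters
width = extent_at_distance(30.0, arm)   # ~0.32 m wide
height = extent_at_distance(17.5, arm)  # ~0.18 m tall
print(f"{width:.2f} m x {height:.2f} m")  # close to a letter/A4 page held sideways
```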

Fig 1. Over-the-shoulder view of an actor in place of a participant showing the anagram poster and the virtual human.

The dashed area represents the field of view of the HoloLens.

https://doi.org/10.1371/journal.pone.0216290.g001

The virtual environment was created as an application with the Unity game engine, version 2017.3.1f1. The application included a networking component connecting the experimenter's laptop to the HoloLens. During the experiment, the researcher could change the virtual content displayed to the participant (e.g., make the agent present or absent). The participants’ solutions to the anagrams were recorded by the headset as an audio (.wav) file.

Content. The virtual agent, introduced as Chris, was matched to the participants’ biological sex in order to avoid sex effects. There was one model of each biological sex, and both male and female character models exhibited idling and walking animations. To simulate natural speech, the agent’s jaw bone was programmed to move up and down based on the volume of the recorded audio.
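
The animation code is not published; a minimal sketch of this kind of volume-driven jaw movement (the function name and constants below are illustrative assumptions) computes a per-frame RMS loudness and maps it to a jaw angle:

```python
import numpy as np

def jaw_angles(samples: np.ndarray, rate: int, fps: int = 60,
               max_angle_deg: float = 15.0) -> np.ndarray:
    """Map per-frame RMS loudness of recorded speech to jaw rotation angles."""
    window = rate // fps  # audio samples per animation frame
    n_frames = len(samples) // window
    frames = samples[: n_frames * window].reshape(n_frames, window)
    rms = np.sqrt((frames.astype(float) ** 2).mean(axis=1))
    peak = rms.max()
    loudness = rms / peak if peak > 0 else rms  # normalize to [0, 1]
    return loudness * max_angle_deg  # jaw opens wider on louder frames
```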

Physical Space. Participants sat on a chair on one side of a 5.6 m by 6.4 m room. At the other end, a chair was placed next to a door. The agent, if present, appeared to sit in this chair. See Fig 1 for the room setup. On the back side of the door was one of the four posters showing ten anagrams. The poster was placed such that it was not visible to participants when the door was open, but visible when the door was closed. The placement of the chairs and the posters was chosen specifically to prevent the agent from disappearing completely from the participants’ limited virtual field of view while they completed the task.

Design and procedure.

The experiment adopted a 2x2 design, crossing social context and task difficulty. While participants completed an anagram task, social context was manipulated (i.e., the agent was either present or absent), and the task difficulty was either easy or hard. There were four conditions: social-easy, social-hard, alone-easy, and alone-hard. Participants were randomly assigned into one of the 24 possible orderings of the four conditions. Each condition appeared at each of the four serial positions equally across participants, allowing for both a within-subjects analysis over all four trials and a between-subjects analysis only using the participants’ first anagram trials.
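
A counterbalanced assignment of this kind can be generated by enumerating all orderings of the four conditions; the sketch below (our reconstruction of the design logic, not the authors' code) cycles through them so that, across every block of 24 participants, each condition appears at each serial position six times:

```python
from itertools import permutations

conditions = ["social-easy", "social-hard", "alone-easy", "alone-hard"]
orderings = list(permutations(conditions))  # 4! = 24 possible orderings

def ordering_for(participant_id: int) -> tuple:
    """Assign participants to orderings round-robin across the full set."""
    return orderings[participant_id % len(orderings)]

assert len(orderings) == 24
```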

Between two and five weeks before the date of the experiment, participants filled out a prescreening survey to determine which studies they were eligible for. The prescreening data included participant’s biological sex, which was used to match the agent’s sex to each participant. On the day of the experiment, participants entered the lab and filled out the consent form. Participants were then led to the experiment room, where they completed two short training tasks. First was an anagram-solving task with an example anagram and two training anagrams. This ensured participants could see the anagrams and understood the anagram task. Then, the experimenter fitted the AR headset on the participant. The second task was a simple navigation task involving AR objects. The experimenter asked participants to confirm they saw a virtual ball and then to walk towards that ball until it changed color. This process was repeated with a virtual cube. This second task ensured participants were able to see virtual objects and were comfortable wearing the device. Fig 2 shows this process.

Fig 2. The participant acquainting themselves with the HoloLens by interacting with the sphere and cube.

https://doi.org/10.1371/journal.pone.0216290.g002

Participants then saw the agent rendered in their virtual field of view, as well as the experimenter, who was physically present in the room. The experimenter stood about a meter to the left of the agent and asked the agent to introduce itself; the agent then provided a short introduction. The text of the introduction is given in S1 Protocol. Depending on condition, the experimenter then informed the participant that the agent would either stay in the room while they completed the task or would leave the room. The experimenter instructed participants to solve as many anagrams as possible in the allotted time, to solve in any order they would like, and to speak the solution word aloud each time they solved an anagram. Then, the experimenter taped the correct poster of anagrams on the back side of the open door, which was not visible to the participant, and instructed the participant to begin solving when the door was closed. The participant’s view of the experiment room is given in Fig 1. The experimenter then waited for exactly 3 minutes outside of the room and returned to the room. This process was repeated three more times, once per combination of social context and difficulty. Participants were then led to a different room where they completed a questionnaire. Once the questionnaire was completed, participants were debriefed.

Measures.

Score. The number of anagrams solved was determined by analyzing the recordings of each of the participants. To expedite analysis, a program was created using Python (version 2.7) to go through the audio and cut large sections of silence for each of the recordings. A human coder then listened to the trimmed recordings and determined which words were spoken by participants. Each participant received a point for each anagram solved correctly. However, a few anagrams had more than one solution. For example, the solution for "RBSCU" was listed as "SCRUB" in Tresselt and Mayzner [50], but multiple participants solved it with the word "CURBS". We included these unintended solutions in the participant’s score, but if a participant said both words (e.g., "CURBS" and "SCRUB") only one solution was counted. Scores ranged from 0 to 10 (M = 5.92, SD = 3.08).
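
The trimming script itself is not published; a sketch of the same idea using the third-party pydub library (our tooling choice, with illustrative thresholds; the original script used Python 2.7) might look like:

```python
from pydub import AudioSegment
from pydub.silence import split_on_silence

def trim_silence(in_path: str, out_path: str) -> None:
    """Cut long stretches of silence so a coder can review spoken answers quickly."""
    audio = AudioSegment.from_wav(in_path)
    chunks = split_on_silence(
        audio,
        min_silence_len=2000,            # drop silences longer than 2 s
        silence_thresh=audio.dBFS - 16,  # treat 16 dB below average as silence
        keep_silence=300,                # keep 0.3 s of padding around speech
    )
    trimmed = sum(chunks, AudioSegment.empty())
    trimmed.export(out_path, format="wav")
```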

Results and discussion

Due to technical issues, six participants were excluded from the analysis. Four participants were excluded because the HoloLens automatically turned off in the middle of the experiment, failing to produce an audio recording. During the session of the other two participants, the network connection between the experimenter's laptop and the HoloLens failed, preventing data collection. After these exclusions, the data from 54 participants (28 female, 26 male) was analyzed.

Because the ordering of conditions was counterbalanced, we were able to test both a between-subjects and a within-subjects design. While the advantage of a within-subjects design is that it accounts for variance due to individual participants, a between-subjects design can test initial, novel reactions to a given condition without contamination from previous ones. Thus, we report both analyses separately.

We operationalize social facilitation and inhibition as an interaction effect between the difficulty of the anagram and the presence of the agent on score. For the between-subjects analysis, an ANOVA was performed with the R programming language, version 3.5.1. For the within-subjects analysis, samples were not independent given multiple data points came from the same participant. Consequently, a mixed-effect model, using the ‘nlme’ package version 3.1.137, was used to analyze the within-subjects data. The fixed effects of this model included difficulty, social context, the interaction between difficulty and social context, and condition order. The random effects were the random intercepts per participant.
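
The authors fit this model with R's nlme package; for illustration, an equivalent specification in Python's statsmodels (assuming a hypothetical long-format file with one row per trial) would be:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Assumed columns: participant, score, difficulty (easy/hard),
# social (present/absent), and order (serial position 1-4).
df = pd.read_csv("study1_trials.csv")  # hypothetical file name

# Fixed effects: difficulty, social context, their interaction, and order;
# random effect: a random intercept per participant.
model = smf.mixedlm("score ~ difficulty * social + order",
                    data=df, groups=df["participant"])
print(model.fit().summary())
```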

Between-subjects.

The statistical assumptions that are necessary for an ANOVA are that residuals are normally distributed and have equal variance among each tested group. The residuals were not different from a normal distribution as determined by a Shapiro-Wilk test (W = 0.99, p = 0.82). The variances were not significantly different among conditions (F(3, 50) = 0.35, p = 0.79).
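
The paper reports a Shapiro-Wilk test and an F-test of variances; a sketch of both checks with scipy (Levene's test is one common homogeneity test consistent with the reported F(3, 50) degrees of freedom, though the paper does not name it; the data below are placeholders):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
residuals = rng.normal(size=54)                          # placeholder ANOVA residuals
groups = [rng.normal(6, 3, size=13) for _ in range(4)]   # placeholder scores per cell

w, p_normality = stats.shapiro(residuals)  # normality of residuals
f, p_variance = stats.levene(*groups)      # homogeneity of variance across cells
```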

Manipulation check. The main effect of difficulty (easy vs. hard) on score was significant (easy: M = 7.68, SD = 2.51; hard: M = 4.16, SD = 2.55; F(1, 46) = 63.23, p < .001, d = 1.95). Participants solved more anagrams in the easy conditions than in the hard conditions, indicating that the manipulation of difficulty between conditions was successful.

Effect of social context on score. As expected, the main effect of social context (social vs. alone) on score was not significant (F(1, 46) = 0.16, p = .69, d = 0.07). The interaction effect between social context and anagram difficulty was significant (F(1, 46) = 11.84, p < .01, d = 0.88). To test the simple effects, we conducted two post-hoc t-tests. Social facilitation predicts that participants will solve more easy anagrams in a social context than when they are alone. This was confirmed (social: M = 8.79, SD = 1.97; alone: M = 6.79, SD = 2.19; t(25.71) = 2.54, p = 0.03, d = 0.96). Social inhibition predicts that participants will solve more hard anagrams alone than socially. This was also confirmed (social: M = 2.54, SD = 2.07; alone: M = 4.23, SD = 2.13; t(23.98) = -2.06, p = 0.05, d = 0.81). Overall, these results confirm Hypothesis 1. Table 1 shows the means and standard deviations, and Fig 3 shows the means and 95% bootstrapped CI’s by condition.

Fig 3. Anagrams solved per condition.

This chart displays means and 95% CI’s for the number of anagrams solved in each condition.

https://doi.org/10.1371/journal.pone.0216290.g003

Table 1. Means and standard deviations of anagrams solved per condition.

https://doi.org/10.1371/journal.pone.0216290.t001

Within-subjects.

The statistical assumptions that are necessary for a mixed-effect linear model are that residuals are normally distributed and have equal variance among each tested group. The residuals were not different from a normal distribution as determined by a Shapiro-Wilk test (W = 1.00, p = 0.79). The variances were not significantly different among groups, whether the groups are defined by our conditions (i.e., social context and difficulty) (F(3, 212) = 1.06, p = 0.37), participant (F(53, 162) = 0.86, p = 0.73), or condition serial order (F(3, 212) = 0.01, p = 1.00).

Manipulation check. The main effect of difficulty (easy vs. hard) on score was significant (easy: M = 7.68, SD = 2.51; hard: M = 4.16, SD = 2.55; b = -3.42, t(158) = -11.66, p < 0.001, d = 1.39). Participants solved more anagrams in the easy conditions than in the hard conditions, indicating that the manipulation of difficulty between conditions was successful.

Effect of social context on score. The main effect of social context (social vs. alone) on score was not significant (b = -0.05, t(158) = -0.17, p = 0.87, d = 0.05). The interaction effect between social context and difficulty was also not significant (b = -0.19, t(158) = -0.46, p = 0.64, d = 0.04).

Condition serial order effects. The effect of condition order on score was significant (b = 0.19, t(158) = 2.03, p = 0.04). Participants solved more anagrams over time. This pattern suggests a learning effect, which may explain why the repeated-measures analysis did not confirm Hypothesis 1: the change over time dwarfed the effect of condition. Table 2 gives the means and standard deviations of anagram score, as well as participant count, for each pairing of order and condition.

Table 2. Means and standard deviations of anagrams solved per condition and order.

https://doi.org/10.1371/journal.pone.0216290.t002

Overall, the results from the between-subjects analysis confirm Hypothesis 1 and replicate Park and Catrambone’s [33] social facilitation and inhibition findings within an AR context. These results provide some of the first empirical evidence suggesting that virtual humans in AR can socially influence performance, adding to the literature on the social effects of AR agents [15–18]. However, these effects did not extend to multiple trials. This may be due to the low behavioral realism of the agent, as it does not speak or otherwise act socially except for the short introduction before the first social condition. An alternative explanation is that the participant becomes accustomed to the presence of the agent and so no longer feels social pressure.

Study 2: Nonverbal behavior

While Study 1 assessed the social influence that virtual humans have on performance, Study 2 examines whether or not users act in accordance with social norms when interacting with an agent in AR. Considering the norms of interpersonal distance, this would mean participants would not sit directly upon the AR agent; with regard to eye-contact, participants would rotate their bodies to avoid turning their backs on the agent while sitting down. Furthermore, Study 2 examines how the associations formed between an agent’s previously rendered location and the physical space affect social behavior after the agent is no longer visible. Given past research has demonstrated that users tend to follow social norms when interacting with virtual humans [38,39,41], and there is a spatial component associated with AR social interactions, our hypotheses are as follows:

Hypothesis 2. Participants wearing the headset will sit on the chair without the agent more often than on the chair with the agent.

Hypothesis 3. Participants not wearing the headset will sit on the chair that was empty more often than on the chair where the agent was sitting.

Research Question 1. Will participants avoid a rotation direction that requires turning their heads away from the agent as they choose a seat in order to maintain eye-contact?

Methods

Participants.

A total of 56 participants (40 female, 16 male) were recruited from Stanford University and the surrounding area. The recruitment and experiment processes were approved by the Stanford IRB under protocol IRB-42052. The participants shown have given written informed consent to publish their likeness under the CC-BY license. Forty-seven of the participants received course credit for participating and nine were compensated with a $10 Amazon gift card. The mean age of the participants was 21.59 (SD = 6.24). To compare with Study 1, we also binned participants in the same ranges: 52 participants were 18–24, 2 were 25–35, and 2 were 48–65. Given the limited research in this area, we determined our sample size using a heuristic of 30 participants per condition [53]. We planned for a total of 60 participants, but faced difficulties with the technology and fell a few short of that number.

Materials.

Participants wore the HoloLens. For hardware specifications, see the materials section of Study 1. The virtual content was programmed as an application using Unity version 2017.3.1f1. Vuforia [54] was used to track specific markers in the physical world in order to match the virtual content to specific locations in the experimental room. The position (x, y, z) and orientation (yaw, pitch, roll) data of the headset were collected at 60 Hz. The application was connected to a laptop via a networking component, enabling the experimenter to set up and change the virtual content before and during the experiment.

Design and procedure.

Participants were randomly assigned into one of two conditions: “Headset” and “Without Headset”. In both conditions, participants wore the headset to begin. Through the headset, they saw a virtual agent walk across the room and sit in one of two chairs in front of the participant. The chair the agent sat on was randomized for each participant. The agent always matched the biological sex of the participant. After the agent sat on the chair, participants in the Headset condition were asked to sit on one of the two chairs in front of them. In the Without Headset condition, participants were asked to remove the headset once the agent sat down in one of the chairs. Once the headset was removed, participants were asked to sit on one of the two chairs in front of them. To disguise the purpose of the experiment (observing which chair participants chose), the experimenter simply asked participants to sit on a chair where they would answer a questionnaire that was provided with a clipboard. During the study, the experimenter did not know which chair the agent was sitting on, as he could not see what was rendered in the HoloLens. However, inferred awareness of the agent’s position based on participant behavior cannot be ruled out.

The experiment began with a set of screening questions confirming that the participant did not have any conditions preventing them from using AR (e.g., epilepsy or another seizure disorder); the participant then signed the consent form. Then, participants were brought to a 5.6 m by 6.4 m room. The experimenter introduced the HoloLens and placed it on the participant. After making sure the headset fit properly, the experimenter asked the participant to find a virtual sphere (as described in Fig 2) and to walk toward it. A beeping sound alerted the experimenter that the participant had reached the sphere, at which point the participant was asked to find a virtual cube and walk toward it. When the participant reached the cube, another beeping sound occurred, and a virtual agent appeared. The object-finding task was included to help participants become accustomed to the small virtual field of view afforded by the HoloLens.

The agent was then introduced by the experimenter. When the participant looked at the virtual agent, the experimenter triggered the agent to walk and sit on a chair. The agent sat on one of two identical chairs, chosen at random, placed 1 m apart from each other (Fig 4). After sitting on a chair, the agent directly faced the participant and made an introductory biographical speech for 41 seconds without any additional animations. As with Study 1, the participant was not informed of the nature of the virtual person (whether it was an avatar or agent). The content of the speech is listed in S2 Protocol. Once the agent stopped talking, the participant was instructed to either take the headset off and sit on one of the chairs if they were in the Without Headset condition or just to sit on one of the chairs if they were in the Headset condition. Participants in the Headset condition were still able to see the agent sitting on one of the chairs as they sat down.

Fig 4. The participant’s interaction with the virtual human.

The virtual objects are colored gray.

https://doi.org/10.1371/journal.pone.0216290.g004

Measures.

Seat choice. Participants were told to sit on a chair after the virtual agent (sitting on a randomly chosen chair) finished its verbal introduction. The experimenter recorded which chair was chosen by the participant (Fig 5).

Fig 5. Examples of sitting next to and on the virtual agent.

https://doi.org/10.1371/journal.pone.0216290.g005

Head orientation. When approaching a chair, there are two directions one can rotate as one turns around and sits down, clockwise or counterclockwise. Using tracking data collected from the headset, we categorized participants based on whether they rotated to face the agent or to turn their back on the agent, which depended on which chair the agent was sitting on. Fig 6 provides examples of how tracking data was visualized for the categorization.
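
A sketch of how this categorization could be computed from the logged yaw trace (our reconstruction; the sign convention depends on the tracker's coordinate frame, and the paper's categorization was aided by visualizations like Fig 6):

```python
import numpy as np

def rotation_direction(yaw_deg: np.ndarray) -> str:
    """Classify net turning direction from a 60 Hz head-yaw trace in degrees."""
    yaw = np.unwrap(np.radians(yaw_deg))  # remove 360-degree wrap-around jumps
    net_turn = yaw[-1] - yaw[0]
    return "counterclockwise" if net_turn > 0 else "clockwise"
```

Whether a given direction counts as facing the agent then depends on which chair the agent occupied.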

Fig 6. Examples of tracking data visualization of participants.

The left panel shows a participant rotating to face the agent, and the center panel shows a participant rotating to turn their back to the agent. The head orientation (yaw) and head position (x, z) data were collected at 60 Hz. The line depicts the head position along the x-z plane as participants walked, and each arrow depicts the yaw direction at that position sample.

https://doi.org/10.1371/journal.pone.0216290.g006

Results and discussion

Seat choice. Table 3 reports the number of observations in each condition. Hypotheses 2 and 3 were confirmed by a binomial test, which assesses whether the observed split between two categories deviates significantly from a theoretically expected distribution. Hypothesis 2 was confirmed, since all of the participants with a headset avoided sitting on the agent (p < .001). Hypothesis 3 was confirmed, as participants who sat without the headset were significantly more likely to sit on the chair without the agent (p = .02).

Table 3. Number of participants who either sat on or next to the agent in each condition.

https://doi.org/10.1371/journal.pone.0216290.t003

Head Orientation. Since the head tracking data was collected with the headset, participants in the Without Headset condition were not included in this analysis. Twenty-five of the twenty-seven participants in the Headset condition rotated to face the agent rather than turning their back to it. Assuming that people have an equal chance of turning toward or away from the agent while sitting on a chair, results from the binomial test showed that participants were significantly more likely to sit down while turning their heads toward the agent (p < .001). This addresses Research Question 1, suggesting that participants avoid a rotation direction that requires turning their heads away from the agent as they choose a seat, thereby maintaining eye-contact.
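
Both reported results can be reproduced from the stated counts with an exact binomial test; a sketch using scipy (our tooling choice, not the authors' code) follows:

```python
from scipy.stats import binomtest

# Seat choice: all 27 Headset-condition participants avoided the agent's chair.
print(binomtest(27, n=27, p=0.5).pvalue)  # ~1.5e-08, i.e., p < .001

# Head orientation: 25 of 27 participants rotated to keep facing the agent.
print(binomtest(25, n=27, p=0.5).pvalue)  # ~5.6e-06, i.e., p < .001
```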

Overall, Study 2 investigated nonverbal behaviors between AR users and agents. Confirming Hypothesis 2, results showed that participants avoided sitting on the chair occupied by the agent and faced the agent while sitting down on the chair next to it. Hypothesis 3 was also confirmed since the majority of participants avoided sitting on the occupied chair even after they took off their headsets.

It is unclear whether the effects observed resulted from the agent’s speech, its appearance, or the mere existence of a three-dimensional digital object in participants’ field of view. A limitation of this study is that a participant turning towards the agent when sitting down is equivalent to a participant turning towards the center of the room, and the experiment does not distinguish between the two cases. Furthermore, this experiment demonstrates an effect consistent with social norms, but as designed it does not provide causal evidence that the agent influenced the participant’s nonverbal behavior. Future studies should try to replicate these findings while including both a control condition with no agent present and a control condition where a non-social object is present on the chair. For example, participants may be less likely to sit on a chair with a stack of virtual books, even though virtual books do not have the social influence an agent does. These control conditions allow further examination of which feature of the agent–its existence or its social ability–accounts for the behavior observed.

Study 3: Social connectedness

One of the main goals of AR systems is to superimpose virtual content onto the user’s real-world environment with the purpose of providing the user with additional information about their surroundings [2]. This affordance makes it possible for AR users to interact with virtual content that is visible only to them, which may make bystanders curious or uncomfortable. Furthermore, virtual content may intentionally or unintentionally be rendered on top of people the AR user is interacting with, potentially disrupting the interaction by causing the violation of social norms or affecting the relationship between users and non-users.

Since the virtual content displayed to an AR user has the potential to occlude their conversation partner, and the presence of see-through AR headsets is likely to disrupt social norms, such as eye-contact, our research questions are as follows:

Research Question 2. Will dyads in which one of the participants is occluded by virtual content have lower social presence, interpersonal attraction, and perceived closeness than dyads where no one is occluded by virtual content?

Research Question 3. In a dyadic face-to-face interaction where one person wears the AR headset and the other does not, is there a difference in how participants feel about their respective partner in terms of social presence, interpersonal attraction, and perceived partner closeness?

Method

Participants.

A total of 102 participants (54 female, 48 male) were recruited from Stanford University. The recruitment and experiment processes were approved by the Stanford IRB under protocol IRB-45030. The participants shown have given written informed consent to publish their likeness under the CC-BY license. Participants were assigned to same-sex dyads in order to account for potential sex effects, creating a total of 51 dyads. Participants were recruited from an undergraduate-level course; however, their ages were not collected during this study. Given the limited research in this area, we determined our sample size using a heuristic of about 30 dyads per condition [53]. We planned to collect 120 participants, but difficulties in recruitment led to a small reduction from our planned sample.

Materials.

Participants who had never interacted with each other met for the first time inside a 4 m by 4 m room. Within each dyad, one of the participants was randomly selected to wear the HoloLens (for hardware specifications, see the materials section of Study 1). Participants were seated on two chairs placed facing each other 2.3 m apart. A video camera equipped with a microphone was stationed in the room to record the dyadic interaction (Fig 7). The participant wearing the HoloLens saw a superimposed university logo placed either next to or directly in front of their interaction partner, depending on the condition they were randomly assigned to. The entirety of the simulation was programmed using Unity version 2017.3.1.

Fig 7. Video shot of a pair of actors simulating the experiment.

https://doi.org/10.1371/journal.pone.0216290.g007

Design and procedure.

The present study adopted a standard distinguishable-dyads design in which each dyad consisted of two participants of the same biological sex, with partners made distinguishable by the experimental manipulation. Participants were randomly assigned to one of two conditions, namely ‘Virtually Occluded’ and ‘Not Virtually Occluded’, and were paired with a participant of the same biological sex. Within each dyad, one participant was randomly selected to wear the HoloLens during the interaction while the other did not wear an AR headset. In the Virtually Occluded condition, the participant wearing the headset saw a university logo occluding their conversation partner’s face (Fig 8, left side). In the Not Virtually Occluded condition, the person wearing the headset saw the same logo next to their conversation partner’s body (Fig 8, right side); the logo did not occlude any part of the conversation partner’s body. In both conditions, the conversation partner who did not wear the HoloLens was unaware of the content their partner was seeing. A pictographic representation of the conditions is provided in Fig 9.

Fig 8.

View of the other participant from the AR user’s point of view in the Virtually Occluded condition (left) and Not Virtually Occluded condition (right).

https://doi.org/10.1371/journal.pone.0216290.g008

Participants were asked to come to two different locations to prevent them from interacting with each other prior to the study. None of the participants included in the study had met each other before the experiment. After completing the consent form, the participant assigned to wear the headset, regardless of condition, was brought into the lab room, equipped with the HoloLens, and asked to report whether or not they were able to see the university logo. After the participant wearing the HoloLens reported seeing the logo, the second participant was brought into the lab room and asked to sit on the remaining chair.

Once both participants were seated, they were asked to discuss one interesting thing that had happened to them over the past month for 5 minutes. The instructions were adapted from Przybylski and Weinstein [43]. The video recording of the interaction began as soon as the participants started their conversation. The logo displayed on the AR headset disappeared after exactly 5 minutes, signaling the end of the interaction. At that moment, experimenters entered the lab room and escorted the participants into two separate rooms, where they completed a questionnaire about their experience. Upon completing the questionnaire, participants were debriefed. S2 Appendix includes all of the measures in the questionnaire, which are described in detail below.

Measures.

IOS. The Inclusion of the Other in the Self (IOS) scale is a single-item, pictorial measure of closeness and connectedness [55]. Pictures are Venn-like diagrams of two circles overlapping, with each circle representing the self and the participant’s conversation partner. The pictures are coded from 1 to 7 with the larger numbers indicating an increasingly closer relationship.

Social Presence. Social presence was measured by adapting items from the Networked Minds Social Presence Inventory [56]. Participants were asked to rate their agreement with specific statements on a 5-point Likert-type scale (1 = Strongly disagree, 5 = Strongly agree). Sample items include “I was often aware of my partner in the environment” and “I sometimes pretended to pay attention to my partner”.

Interpersonal Attraction. The Interpersonal Attraction scale measures the extent to which participants liked their interaction partner and would enjoy interacting with them in the future [57]. Participants rated their agreement with six statements on a 5-point Likert-type scale (1 = Strongly disagree, 5 = Strongly agree). Sample items include “My partner is the type of person I could become close friends with” and “I like my partner”.
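
Although the paper does not detail its scoring procedure, multi-item Likert-type scales such as these are conventionally scored by reverse-coding negatively worded items and averaging across items. The R sketch below illustrates that convention; the item names, the number of items, and the assumption that the “pretended to pay attention” item is reverse-coded are all hypothetical (the actual items are listed in S2 Appendix).

```r
# Hypothetical scoring of a 5-point scale with one negatively worded item.
reverse_5pt <- function(x) 6 - x                 # maps 1<->5 and 2<->4
responses <- data.frame(sp1 = c(4, 5), sp2 = c(2, 1), sp3 = c(5, 4))
responses$sp2 <- reverse_5pt(responses$sp2)      # assumed reverse-coded item
responses$social_presence <- rowMeans(responses[, c("sp1", "sp2", "sp3")])
responses
```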

Results and discussion

Dyadic data violate the assumption of independence required for analysis of variance (ANOVA) [58]. Because of this, a linear mixed model analysis was used, with fixed effects of condition and headset use (i.e., whether or not the participant wore the HoloLens) and random effects of individual interaction partners nested within each dyad, thus adding random slopes and random intercepts to our model. All analyses were carried out in R version 3.5.1 using the “lme4” and “nlme” packages. Table 4 shows the means and standard deviations for each dependent variable by condition.
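
For readers who wish to fit a comparable model, the sketch below continues the hypothetical `design` data frame from the Methods sketch above and shows a minimal lme4 specification. The outcome values are placeholders rather than real data, and the paper’s full random-effects structure (which the authors describe as including random slopes) may differ from the random intercept shown here.

```r
library(lme4)

# Placeholder 1-7 IOS scores for illustration only.
design$ios <- sample(1:7, nrow(design), replace = TRUE)

# Fixed effects of condition, headset use, and their interaction;
# a random intercept for each dyad absorbs dyad-level dependence.
fit_ios <- lmer(ios ~ condition * headset + (1 | dyad), data = design)
summary(fit_ios)
```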

Table 4. Means and standard deviations for all outcome variables by condition and by headset use (users vs. non-users).

https://doi.org/10.1371/journal.pone.0216290.t004

IOS. There was no significant difference between the Virtually Occluded and Not Virtually Occluded conditions on IOS scores (b = .32, t (49) = 1.05, p = .30, d = .30). However, regardless of condition, participants within each dyad who wore the AR headset during the social interaction reported feeling significantly less connected to their partners than participants who were not wearing the headset (b = -.42, t (49) = 2.09, p = .04, d = .59). There was also a marginally significant interaction effect (b = .53, t (49) = 1.90, p = .06, d = .54): among participants who wore the HoloLens, those whose partner was virtually occluded by the Stanford logo felt less connected to their partner than those whose partner was not occluded (b = .85, t (49) = 2.17, p = .03). For participants who did not wear the HoloLens, however, there was no significant difference in IOS scores across conditions (b = .22, t (49) = .51, p = .61).
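
For reference, the reported effect sizes are consistent with the conventional conversion of a t statistic to Cohen’s d (an assumption on our part; the paper does not state the formula it used). For the occlusion contrast above:

$$ d = \frac{2t}{\sqrt{df}} = \frac{2 \times 1.05}{\sqrt{49}} = \frac{2.10}{7} = .30 $$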

Social presence. There was no significant difference between the Virtually Occluded and Not Virtually Occluded conditions on social presence (b = .09, t (49) = 0.84, p = .40, d = .25). Similar to the IOS results, participants within each dyad who wore the AR headset during the social interaction reported significantly less social presence than participants who were not wearing the headset, regardless of condition (b = .30, t (49) = 2.38, p = .02, d = .68). There was also a marginally significant interaction effect (b = .17, t (49) = 1.89, p = .06, d = .54): among participants who wore the HoloLens, those whose partner was virtually occluded by the Stanford logo felt marginally less social presence than those whose partner was not occluded (b = .26, t (49) = 1.81, p = .08). For participants who did not wear the HoloLens, however, there was no significant difference in social presence scores across conditions (b = .07, t (49) = .50, p = .62).

Interpersonal Attraction. There was no significant difference between the Virtually Occluded and Not Virtually Occluded conditions on interpersonal attraction (b = -.07, t (49) = 0.51, p = .61, d = .15). There was also no significant difference between participants who wore the AR headset during the social interaction and those who did not in how much they liked their partner, regardless of condition (b = .10, t (49) = 1.44, p = .16, d = .41), and no significant Condition × Headset interaction on interpersonal attraction (b = .12, t (49) = 1.22, p = .23, d = .35).

Study 3 sought to examine the effect that the presence of see-through AR headsets during a dyadic social interaction has on social presence, interpersonal attraction, and perceived partner closeness, and to assess the interpersonal consequences of interacting with someone whose face is either completely occluded by virtual content or not occluded at all. Analyses showed that individuals within each dyad who wore the AR headset during the social interaction felt less social presence and less connection than their interaction partners who were not wearing an AR headset. Past research has demonstrated that when two people interact but one actively uses a smartphone during the interaction while the other does not, non-users form significantly more negative impressions of their partners compared to the smartphone users [45]. Our results showed the opposite pattern, with AR users reporting lower social presence and IOS scores compared to non-users. It is possible that non-users in our study were not aware that they were being occluded and, therefore, were unaware that they did not have their partner’s full attention. Future studies should experimentally manipulate whether non-users know that they are being occluded by virtual content to examine whether this knowledge affects interpersonal outcomes.

Moreover, results showed no overall significant difference between the two occlusion conditions on social presence, IOS, or interpersonal attraction, suggesting that in this instance occlusion by virtual content did not significantly affect interpersonal outcomes. However, it is possible that this result reflects our small sample size, the nature of the social interaction task (i.e., talking for 5 minutes about one interesting thing that had happened over the past month), or the fact that the virtual content displayed in both conditions was not interactive. Future studies should try to replicate these findings with a larger sample, different levels of interactivity, and different types of collaborative tasks.

Overall, these results suggest that a social interaction in which one of the interactants is occluded by virtual content does not lead to reduced interpersonal attraction, perceived closeness, or social presence compared to a social interaction in which neither interactant is occluded. Additionally, AR users, regardless of condition, tended to report lower social presence and IOS scores than their interaction partners who were not wearing an AR headset. However, future studies should try to replicate our findings with larger samples in order to increase statistical power.

A limitation of this study is that the virtual logo displayed was not 100% opaque (i.e., the outline of the occluded participant was still visible to the partner wearing the headset) and only covered the participant’s face (i.e., the participant’s mannerisms and body were still visible). Future studies should examine the interpersonal effects of completely occluding interaction partners with virtual content that does not allow users to see outlines or silhouettes of their partners. Additionally, not being able to see one’s interaction partner resembles other types of mediated communication (e.g., a telephone call, or a one-way Skype call in which one user does not share their video feed); future studies should therefore compare how participants’ behavior and interpersonal outcomes during an AR social interaction resemble or differ from those in other types of mediated communication. Another limitation is the limited virtual field of view afforded by the AR headset. It is possible that participants adjusted their heads so that they could see their partner in their periphery, or moved their heads so that the virtual content disappeared. Future studies should collect tracking data from both AR users and non-users to identify whether nonverbal behavior or social norms (e.g., eye contact) are affected by interacting with others while using an AR headset.

General discussion

Summary of results

The present investigation examined social interactions in AR. In Study 1, a well-known psychological theory (social facilitation/inhibition) was applied to an AR user with a virtual agent. Study 2 investigated whether or not users follow social norms when interacting with virtual humans, and whether or not spatial associations between physical locations and virtual content affect subsequent behavior. Study 3 examined the effect of wearing an AR headset during an interaction with someone who is not wearing a headset and who may or may not be occluded by virtual content.

Studies 1 and 2 address a common question: do AR agents elicit responses similar to those elicited by real humans? In Study 1, the interaction of task difficulty and social context significantly affected participant performance. Participants solved more easy anagrams and fewer hard anagrams in the presence of an agent than alone, replicating both social facilitation and social inhibition effects. In Study 2, participants acted in accordance with social norms and avoided sitting on the chair occupied by a virtual agent, and a majority of participants did not turn their heads away from the agent while sitting down. The results of these two studies suggest, as Reeves and Nass [59] did with the media equation, that social interactions with agents resemble face-to-face social interactions with humans.

In Studies 2 and 3, we examined AR-specific situations. Half of the participants in Study 2 were told to sit on a chair after the headset was removed, and they still chose to avoid the seat previously occupied by the agent. In Study 3, we investigated a possible side effect of using an AR headset and having a virtual object rendered on top of an interaction partner within the context of interpersonal communication. As the experimental design implies, we expected virtual objects and the AR device itself to hinder communication because they prevented eye contact and disrupted common ground between interactants (i.e., one participant in each dyad was able to see virtual content while the other was not). Results showed that, compared to their partners who were not wearing a headset, participants wearing headsets felt significantly less connected to their partners and reported significantly lower social presence, even though both interactants were in the same physical room.

Limitations and future work

A limitation of this work is that the samples for all three studies were composed mostly of undergraduate students with little demographic variance. Another limitation is that exposure to AR was brief and measured at a single point in time; future studies should consider collecting data at different time intervals and manipulating exposure time in AR as an independent variable. A further limitation of these studies is that realism (behavioral or photographic) was not systematically controlled. Past research has shown that agents exert social influence, and that agency and realism moderate that influence [60]. Future studies should control and examine the effects that different levels of agency and realism in AR have on communication outcomes.

Past research has also investigated differences in participants’ attention to video-recorded versus physically present humans [61,62] and hypothesized that one key difference between the two is the potential for real-time social interaction. Future work could corroborate this construct in the medium of AR. Some previous work in AR has explored joint action [49], which is “any form of social interaction whereby two or more individuals coordinate their actions in space and time to bring about a change in the environment” [63]. Future work should continue this line while being grounded in theories of joint action, task sharing, and action coordination [63–65].

Implications for theory and AR design

With the media equation, Reeves and Nass [59] suggest that individuals tend to respond to media experiences in much the same way as real experiences. Given that virtual agents are media specifically designed to resemble humans, a large portion of our results can be interpreted through their theory. In Studies 1 and 2, participants were influenced by agents as they would have been expected to be influenced by humans. More specifically, in Study 1 participants exhibited both social facilitation and inhibition effects while performing tasks in front of an agent, in the same way that participants in previous research exhibited these effects in front of real humans. It is important to note, however, that these effects diminished over time. In Study 2, we examined the effect that virtual agents have on participants’ nonverbal behavior and found that participants tended to follow interpersonal distance and eye-contact social norms with agents the way they follow these norms with other humans. However, it is not clear whether this is caused by the agent’s social influence or merely by the presence of an interactive three-dimensional object, which was clearly a novel experience for participants.

For designers of AR headsets and applications, Studies 1 and 2 provide evidence suggesting that agents exert influence similar to real humans and can affect the way AR users perform tasks in the physical world or move through space (i.e., the positioning of virtual humans within a physical space can affect both where users move and where they look). A line of AR research has focused on the benefit of providing instructions for physical tasks in AR [20–22]. Given the results obtained in Study 1, we suggest that applications remove agents or avatars while users perform difficult tasks in order to prevent inhibition effects. However, when providing instructions for easy tasks, making the agent or avatar more salient may help AR users improve their performance.

Since AR content tends to be registered to specific physical locations, AR application designers should take into consideration that each user’s physical environment is unique; each AR user may therefore experience the same virtual content differently depending on the context of their physical environment. Additionally, rendering virtual content in specific locations may affect subsequent user behavior or the relationship a user has with a physical space. In Study 2, results showed that even after participants had taken the headset off, they still tended to sit on the chair that had not previously been occupied by the agent. While the duration of these effects, and whether the causal mechanism was social influence, remains unknown, these results suggest that AR content may affect how users interact with their physical environment even after they have stopped using the technology, which raises ethical concerns for AR content producers.

Study 3 sheds light on the interpersonal costs associated with using an AR headset during a social interaction with a non-user. The use of AR headsets with non-users seems to hinder social presence and how close users feel to non-users, suggesting that AR headsets may change the quality of social interaction. Whether this hindrance is driven by the AR headset’s mere presence or by the virtual content being displayed is still an open question that should be explored empirically. However, given our limited data, we suggest that applications likely to be used while the AR user is interacting with non-users (e.g., navigation) render their virtual content with some transparency in order to prevent total occlusion of the non-users.

One thing to consider is how these results generalize to a world in which AR is being used at scale. It is one thing to run controlled laboratory studies with small numbers of participants, and another issue entirely to predict how this technology will change social interaction when it migrates out into the world. This paper scratches the surface of the social psychological costs and benefits of AR use, but much research is needed to understand the effects of this technology as it scales.

Supporting information

S1 Appendix. Study 1 Anagrams.

Anagrams and solution words participants solved.

https://doi.org/10.1371/journal.pone.0216290.s001

(DOCX)

S2 Appendix. Study 3 Post-Survey.

All post-survey measures in Study 3.

https://doi.org/10.1371/journal.pone.0216290.s002

(DOCX)

S1 Protocol. Study 1 Virtual Human Script.

The text spoken by the virtual person to introduce himself or herself in study 1.

https://doi.org/10.1371/journal.pone.0216290.s003

(DOCX)

S2 Protocol. Study 2 Virtual Human Script.

The text spoken by the virtual person to introduce himself or herself in study 2.

https://doi.org/10.1371/journal.pone.0216290.s004

(DOCX)

S2 Data. Study 2 tracking and survey data, part 1.

https://doi.org/10.1371/journal.pone.0216290.s006

(Z01)

S3 Data. Study 2 tracking and survey data, part 2.

https://doi.org/10.1371/journal.pone.0216290.s007

(Z02)

S4 Data. Study 2 tracking and survey data, part 3.

https://doi.org/10.1371/journal.pone.0216290.s008

(Z03)

S5 Data. Study 2 tracking and survey data, part 4.

https://doi.org/10.1371/journal.pone.0216290.s009

(Z04)

S6 Data. Study 2 tracking and survey data, part 5.

https://doi.org/10.1371/journal.pone.0216290.s010

(Z05)

Acknowledgments

This research was supported by two National Science Foundation grants (IIS-1800922 and CMMI-1840131). The authors would also like to thank the lab staff Tobin Asher, Elise Ogle, and Talia Weiss for voice acting, feedback on the paper, and logistical support; Anna Queiroz and Geraldine Fauville for feedback on the paper; Kyle Qian for assistance in the literature review; Mini Racker and Sarina Wu for assistance in technical development; and Andrew Ying and Matt Shimura for assistance in running the studies.

References

1. Billinghurst M, Clark A, Lee G. A Survey of Augmented Reality. Found Trends Human–Computer Interact. 2015;8: 73–272.
2. Azuma RT. A Survey of Augmented Reality. Presence Teleoperators Virtual Environ. 1997;6: 355–385.
3. Allport GW. The nature of prejudice. Reading, MA: Addison-Wesley; 1954.
4. Tait M, Billinghurst M. The Effect of View Independence in a Collaborative AR System. Comput Support Coop Work. Norwell, MA, USA: Kluwer Academic Publishers; 2015;24: 563–589. https://doi.org/10.1007/s10606-015-9231-8
5. Wang X, Dunston PS. Comparative Effectiveness of Mixed Reality-Based Virtual Environments in Collaborative Design. IEEE Trans Syst Man Cybern. 2011;41: 284–296.
6. Kiyokawa K, Billinghurst M, Hayes SE, Gupta A, Sannohe Y, Kato H. Communication Behaviors of Co-Located Users in Collaborative AR Interfaces. Proceedings of the 1st International Symposium on Mixed and Augmented Reality. Washington, DC, USA: IEEE Computer Society; 2002. p. 139. Available: http://dl.acm.org/citation.cfm?id=850976.854962
7. Dong S, Behzadan AH, Chen F, Kamat VR. Collaborative visualization of engineering processes using tabletop augmented reality. Adv Eng Softw. 2013;55: 45–55.
8. Poelman R, Akman O, Lukosch S, Jonker P. As if Being There: Mediated Reality for Crime Scene Investigation. Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. New York, NY, USA: ACM; 2012. pp. 1267–1276. https://doi.org/10.1145/2145204.2145394
9. Lukosch S, Billinghurst M, Alem L, Kiyokawa K. Collaboration in Augmented Reality. Comput Support Coop Work CSCW An Int J. 2015;24: 515–525.
10. Asobo Studio. Fragments. Bordeaux, France: Asobo Studio SARL; 2016.
11. Microsoft. Browse all HoloLens apps [Internet]. 2019 [cited 1 May 2019]. Available: https://www.microsoft.com/en-us/store/collections/hlgettingstarted/hololens
12. Oh CS, Bailenson JN, Welch GF. A Systematic Review of Social Presence: Definition, Antecedents, and Implications. Front Robot AI. 2018;5: 114.
13. Kim K, Billinghurst M, Bruder G, Duh HB-L, Welch GF. Revisiting Trends in Augmented Reality Research: A Review of the 2nd Decade of ISMAR (2008–2017). IEEE Trans Vis Comput Graph. IEEE; 2018; pmid:30188833
14. Steptoe W, Julier S, Steed A. Presence and discernability in conventional and non-photorealistic immersive augmented reality. ISMAR 2014—IEEE International Symposium on Mixed and Augmented Reality—Science and Technology 2014, Proceedings. 2014. pp. 213–218. https://doi.org/10.1109/ISMAR.2014.6948430
15. Anabuki M, Kakuta H, Yamamoto H, Tamura H. Welbo: An Embodied Conversational Agent Living in Mixed Reality Spaces. Proc CHI 2000, Ext Abstr. 2000; 10–11.
16. Wagner D, Billinghurst M, Schmalstieg D. How Real Should Virtual Characters Be? Proc 2006 ACM SIGCHI Int Conf Adv Comput Entertain Technol. 2006;
17. Kim K, Bruder G, Welch GF. Blowing in the Wind: Increasing Copresence with a Virtual Human via Airflow Influence in Augmented Reality. In: Bruder G, Cobb S, Yoshimoto S, editors. International Conference on Artificial Reality and Telexistence Eurographics Symposium on Virtual Environments. 2018.
18. Kim K, Maloney D, Bruder G, Bailenson JN, Welch GF. The effects of virtual human’s spatial and behavioral coherence with physical objects on social presence in AR. Computer Animation and Virtual Worlds. 2017. pp. 1–9.
19. Kim K, Boelling L, Haesler S, Bailenson JN, Bruder G, Welch GF. Does Alexa Need a Body? The Influence of Visual Embodiment and Social Behavior on the Perception of Intelligent Virtual Agents. IEEE International Symposium on Mixed and Augmented Reality. 2018. pp. 1–10.
20. Henderson SJ, Feiner SK. Augmented reality in the psychomotor phase of a procedural task. 2011 10th IEEE Int Symp Mix Augment Reality, ISMAR 2011. 2011; 191–200. https://doi.org/10.1109/ISMAR.2011.6092386
21. Marner MR, Irlitti A, Thomas BH. Improving procedural task performance with Augmented Reality annotations. 2013 IEEE Int Symp Mix Augment Reality, ISMAR 2013. IEEE; 2013; 39–48. https://doi.org/10.1109/ISMAR.2013.6671762
22. Boud AC, Haniff DJ, Baber C, Steiner SJ. Virtual reality and augmented reality as a training tool for assembly tasks. Proc Int Conf Inf Vis. IEEE; 1999; 32–36. https://doi.org/10.1109/IV.1999.781532
23. Triplett N. The Dynamogenic Factors in Pacemaking and Competition. Am J Psychol. 1898;9: 507–533.
24. Allport FH. The influence of the group upon association and thought. J Exp Psychol. 1920;3: 159–182.
25. Pessin J. The Comparative Effects of Social and Mechanical Stimulation on Memorizing. Am J Psychol. 1933;45: 263–270.
26. Zajonc RB. Social Facilitation. Science. 1965;
27. Aiello JR, Douthitt EA. Social facilitation from Triplett to electronic performance monitoring. Gr Dyn. 2001;5: 163–180.
28. Bond CF, Titus LJ. Social Facilitation: A Meta-Analysis of 241 Studies. Psychol Bull. 1983;94: 265–292. pmid:6356198
29. Dashiell JF. An experimental analysis of some group effects. J Abnorm Soc Psychol. 1930;25: 190–199.
30. Hoyt CL, Blascovich J, Swinth KR. Social Inhibition in Immersive Virtual Environments. Presence. 2003;12: 183–195.
31. Zanbaka CA, Ulinski AC, Goolkasian P, Hodges LF. Effects of Virtual Human Presence on Task Performance. Proc Artif Real Telexistence. 2004; 174–181.
32. Zanbaka C, Ulinski A, Goolkasian P, Hodges LF. Social Responses to Virtual Humans: Implications for Future Interface Design. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2007. pp. 1561–1570. https://doi.org/10.1145/1240624.1240861
33. Park S, Catrambone R. Social Facilitation Effects of Virtual Humans. Hum Factors. 2007;49: 1054–1060. pmid:18074704
34. Ekman P, Friesen WV. The Repertoire of Nonverbal Behavior: Categories, Origins, Usage, and Coding. J Int Assoc Semiot Stud. 1969;1: 49–98.
35. Hayduk LA. Personal Space: Where We Now Stand. Psychol Bull. American Psychological Association; 1983;94: 293.
36. Kleinke CL. Gaze and Eye Contact: A Research Review. Psychol Bull. 1986;100: 78–100. pmid:3526377
37. Blascovich J, Bailenson J. Infinite Reality. 2011.
38. Slater M, Sadagic A, Usoh M, Schroeder R. Small-Group Behavior in a Virtual and Real Environment: A Comparative Study. Presence Teleoperators Virtual Environ. MIT Press; 2000;9: 37–51.
39. Bailenson JN, Blascovich J, Beall AC, Loomis JM. Equilibrium Theory Revisited: Mutual Gaze and Personal Space in Virtual Environments. Presence Teleoperators Virtual Environ. MIT Press; 2001;10: 583–598.
40. Bönsch A, Radke S, Overath H, Asché LM, Wendt J, Vierjahn T, et al. Social VR: How Personal Space is Affected by Virtual Agents’ Emotions. 2018 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). 2018. pp. 199–206.
41. Yee N, Bailenson JN, Urbanek M, Chang F, Merget D. The Unbearable Likeness of Being Digital: The Persistence of Nonverbal Social Norms in Online Virtual Environments. CyberPsychology Behav. 2007;10: 115–121. pmid:17305457
42. Stephenson N. Snow Crash. New York, NY, USA: Bantam Books; 1992.
43. Przybylski AK, Weinstein N. Can you connect with me now? How the presence of mobile communication technology influences face-to-face conversation quality. J Soc Pers Relat. 2013;30: 237–246.
44. Misra S, Cheng L, Genevie J, Yuan M. The iPhone Effect: The Quality of In-Person Social Interactions in the Presence of Mobile Devices. Environ Behav. 2016;48: 275–298.
45. Vanden Abeele MMP, Antheunis ML, Schouten AP. The Effect of Mobile Messaging During a Conversation on Impression Formation and Interaction Quality. Comput Hum Behav. Amsterdam, The Netherlands: Elsevier Science Publishers B. V.; 2016;62: 562–569. https://doi.org/10.1016/j.chb.2016.04.005
46. Clark HH, Brennan SE. Grounding in communication. Perspect Soc Shar Cogn. 1991; 127–149.
47. Due BL. The social construction of a Glasshole: Google Glass and multiactivity in social interaction. PsychNology J. 2015;13: 149–178.
48. Burgess N, Becker S, King JA, O’Keefe J. Memory for events and their spatial context: models and experiments. Philos Trans R Soc London Ser B. 2001;356: 1493–1503. pmid:11571039
49. Cummings JJ, Bailenson JN. How Immersive Is Enough? A Meta-Analysis of the Effect of Immersive Technology on User Presence. Media Psychol. 2016;19: 272–309.
50. Tresselt ME, Mayzner MS. Normative solution times for a sample of 134 solution words and 378 associated anagrams. Psychon Monogr Suppl. 1966;1: 293–298.
51. Kreylos O. On the road for VR: Microsoft HoloLens at Build 2015, San Francisco. Doc-Ok.org. 2015.
52. Zeller M, Bray B. HoloLens hardware details [Internet]. 2018 [cited 20 Oct 2018]. Available: https://docs.microsoft.com/en-us/windows/mixed-reality/hololens-hardware-details
53. Hogg RV, Tanis EA. Probability and Statistical Inference. 7th ed. Pearson; 2005.
54. PTC Inc. Vuforia. PTC Inc.; 2017.
55. Aron A, Aron EN, Smollan D. Inclusion of Other in the Self Scale and the structure of interpersonal closeness. J Pers Soc Psychol. US: American Psychological Association; 1992;63: 596–612.
56. Harms C, Biocca F. Internal Consistency and Reliability of the Networked Minds Measure of Social Presence. 2004;
57. Davis D, Perkowitz WT. Consequences of responsiveness in dyadic interaction: Effects of probability of response and proportion of content-related responses on interpersonal attraction. J Pers Soc Psychol. US: American Psychological Association; 1979;37: 534–550.
58. McMahon JM, Pouget ER, Tortu S. A guide for multilevel modeling of dyadic data with binary outcomes using SAS PROC NLMIXED. Comput Stat Data Anal. 2006;50: 3663–3680. pmid:16926924
59. Reeves B, Nass CI. The Media Equation: How people treat computers, television, and new media like real people and places. Chicago, IL, US: Center for the Study of Language and Information; 1996.
60. Blascovich J. Social influence within immersive virtual environments. Soc Life Avatars. 2002; 127–145.
61. Laidlaw KEW, Foulsham T, Kuhn G, Kingstone A. Potential social interactions are important to social attention. Proc Natl Acad Sci. 2011;108: 5548–5553. pmid:21436052
62. Risko EF, Laidlaw K, Freeth M, Foulsham T, Kingstone A. Social attention with real versus reel stimuli: toward an empirical approach to concerns about ecological validity. Front Hum Neurosci. 2012;6: 1–11. pmid:22279433
63. Sebanz N, Bekkering H, Knoblich G. Joint action: Bodies and minds moving together. Trends in Cognitive Sciences. 2006. pp. 70–76. pmid:16406326
64. Vesper C, Abramova E, Bütepage J, Ciardo F, Crossey B, Effenberg A, et al. Joint action: Mental representations, shared information and general mechanisms for coordinating with others. Frontiers in Psychology. 2017. pp. 1–7. pmid:28197108
65. Wahn B, Kingstone A, König P. Group benefits in joint perceptual tasks—A review. Annals of the New York Academy of Sciences. 2018. pp. 166–178. pmid:29754443