The contribution of linear perspective cues and texture gradients in the perceptual rescaling of stimuli inside a Ponzo illusion corridor

Gizem Y. Yildiz; Irene Sperandio; Christine Kettle; Philippe A. Chouinard

doi:10.1371/journal.pone.0223583

Abstract

We examined the influence of linear perspective cues and texture gradients in the perceptual rescaling of stimuli over a highly-salient Ponzo illusion of a corridor. We performed two experiments using the Method of Constant Stimuli where participants judged the size of one of two rings. In experiment 1, one ring was presented in the upper visual-field at the end of the corridor and the other in the lower visual-field at the front of the corridor. The perceived size of the top and bottom rings changed as a function of the availability of linear perspective and textures. In experiment 2, only one ring was presented either at the top or the bottom of the image. The perceived size of the top but not the bottom ring changed as a function of the availability of linear perspective and textures. In both experiments, the effects of the cues were additive. Perceptual rescaling was also stronger for the top compared to the bottom ring. Additional eye-tracking revealed that participants tended to gaze more in the upper than the lower visual-field. These findings indicate that top-down mechanisms provide an important contribution to the Ponzo illusion. Nonetheless, additional maximum likelihood estimation analyses revealed that linear perspective fulfilled a greater contribution in experiment 2, which is suggestive of a bottom-up mechanism. We conclude that both top-down and bottom-up mechanisms play important roles. However, the former seems to fulfil a more prominent role when both stimuli are presented in the image.

Citation: Yildiz GY, Sperandio I, Kettle C, Chouinard PA (2019) The contribution of linear perspective cues and texture gradients in the perceptual rescaling of stimuli inside a Ponzo illusion corridor. PLoS ONE 14(10): e0223583. https://doi.org/10.1371/journal.pone.0223583

Editor: Magdalena Ewa Król, SWPS University of Social Sciences and Humanities, POLAND

Received: March 28, 2019; Accepted: September 24, 2019; Published: October 10, 2019

Copyright: © 2019 Yildiz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data can be accessed here: https://latrobe.figshare.com/s/d65ad4f15fc0e2eacec0.

Funding: This work was supported by La Trobe University’s School of Psychology and Public Health and a La Trobe University Scholarship to GYY. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

If two objects at different distances subtend the same visual angle on the retina then the object located at the furthest distance is physically larger as a proportion of distance from the other in a manner obeying Euclidian geometry [1]. Size constancy mechanisms operate so that our visual system considers these physical realities and enables us to perceive objects as having the same size regardless of changes in viewing conditions. To maintain size constancy, the visual system estimates the distance between the object and the eyes from many sources of depth cues and perceptually rescales the retinal input about the size of the object [2, 3]. Linear perspective is one of many pictorial depth cues that the visual system uses to estimate depth [4]. The visual system estimates greater depth when two lines on the retina converge closer together. Another important pictorial depth cue is texture gradient. The retinal size of uniform texture elements, such as stones, shrink with distance. Consequently, the visual system estimates greater depth where texture gradients are smaller. Artists apply this knowledge to create illusory depth on a 2D image to trick us in perceiving depth and size differences [5].

In the Ponzo illusion, two physically identical stimuli appear to be different from each other when placed over the top and bottom sections of converging contextual lines that emulate a vanishing point, like the converging lines of a railway track or the converging walls of a corridor [6]. Specifically, the top stimulus where the contextual lines converge appears to be larger than the bottom one (Fig 1). Misapplied constancy scaling theory is one of several theories that explains the Ponzo illusion. According to the theory, the pictorial depth information in the background will rescale the size of objects in such a manner that those that appear further are perceived larger [2, 7, 8]. With regards to the Ponzo illusion, the visual system interprets the converging lines as parallel lines receding into the distance, as linear perspective cues do in the real world, and perceptually rescales stimuli as a function of how far they appear to be [9, 10]. Many have argued that this perceptual rescaling is driven by top-down modulation arising from knowledge acquired from everyday life experiences about how linear perspective cues inform the brain about how far objects are (e.g. watching trains on railway tracks, cars on highways, etc.) [2, 8].

Download:

Fig 1. Ponzo illusion.

The figure provides an illustration of the Ponzo illusion in its classical configuration with two simple lines converging upwards and two horizontal lines in the centre that are identical in length but appear different.

https://doi.org/10.1371/journal.pone.0223583.g001

However, to this day, the relative contribution of the linear perspective cues and texture gradients to the magnitude of illusory size perception in the context of the Ponzo illusion remains debated despite preliminary research [11, 12]. Leibowitz et al. [11] reported that applying only texture gradients was twice as strong as when only applying linear perspective cues. Moreover, they showed an additive effect when the texture gradients were combined with the linear perspective cues. Specifically, texture gradients alone and linear perceptive cues alone perceptually rescaled the size of stimuli by 20% and 10%, respectively. The presence of both cues perceptually rescaled the size of stimuli by 30%. These additive effects suggested to Leibowitz et al. [11] that the magnitude of the Ponzo illusion is closely dependent upon the availability of the pictorial depth cues. We reason that these effects were driven largely by top-down mechanisms. If the two types of depth cues influenced size perception by separate channels in a bottom-up manner without integration then one might expect to find the same degree of perceptual rescaling with both cues present as when only the stronger of the two cues is present rather than the additive effects seen in the Leibowitz et al. [11] study. This same line of reasoning is also explained elsewhere [13].

However, the study by Leibowitz et al. [11] is not without controversy. Fineman and Carlson [12] questioned whether their choice of a background image for the texture gradient condition included only texture gradients as a depth cue. It could have been the case that this condition also offered linear perspective in how the textures were arranged. To help resolve this issue, the authors tested the contribution of texture gradients using Gibson’s [14] dot patterns. Contrary to the results obtained by Leibowitz et al [11], they demonstrated that texture gradients had little to no effect on perceptual rescaling. Based on these results, the authors favoured a bottom-up explanation of the Ponzo illusion.

In line with Fineman and Carlson’s findings, others have reported that the manipulation of texture gradients does not affect perceptual judgments of size [15, 16]. Similarly, studies examining the effects of texture gradients on perceived depth have shown that the visual system is less sensitive to the manipulation of texture gradients compared to the manipulation of linear perspective cues [17–20]. For instance, Zhang [20] found that perceptual judgments of depth did not change as a function of the availability of texture gradients in an immersive driving simulator, suggesting once again that this kind of pictorial depth cue is not relevant for the perceptual rescaling of size.

So far, the literature reviewed has yielded mixed results. The mixed results could have arisen from a fundamental limitation present in all of the aforementioned studies. Namely, the background images that incorporated the two cues were not graphically additive to those consisting of the presentation of only one. In other words, there was no real graphical subtraction or addition of cues in the background images across conditions. Instead, the images that were previously used consisted of completely different images with little similarities amongst each other. The only study that we are aware of that has used a more systematic approach is one performed by Rennig, Karnath, and Huberle [13]. The authors examined the effects of linear perspective cues and texture gradients on the perceived size of Kanizsa triangles, an illusory contour, over a Ponzo-like background of a corridor. The two pictorial depth cues affected the perception of the Kanizsa triangles differently. Namely, the Kanizsa triangle that appeared further away (i.e. the top one) was perceived larger in the linear perspective but not in the texture gradient condition.

In the present study, we graphically added and removed linear perspective cues and texture gradients in a Ponzo-like illusory display of a hallway (Fig 2) to determine how these manipulations might affect the perceived size of stimuli. The manipulation of these pictorial depth cues allowed us to examine the relative contribution of top-down and bottom-up mechanisms indirectly. We propose that if the two cues perceptually rescale the stimuli separately and produce an additive effect when they are combined together, then we can infer that top-down mechanisms fulfil an important contribution to the illusion. This is because both cues afford predictive values about depth and their presence should influence size perception in an integrative manner on the basis of these affordances. On the other hand, we reason that if only a subset of cues exerts an effect, despite their predictive value, or that both cues do not exert an additive effect, then we can infer that integration was minimal and that bottom-up mechanisms by separate channels play an important role in driving the illusion. It is likely that both mechanisms fulfil a role given the evidence so far. The study tries to shed additional light on their respective contributions.

Download:

Fig 2. Background images in the present study.

A. Ponzo-like illusion display of a hallway with stones (textures) and walls (linear perspective cues). B. Ponzo-like illusion display of walls (linear perspective cues). C. Ponzo-like illusion display of a hallway with stones (texture gradients). D. Control background without depth cues.

https://doi.org/10.1371/journal.pone.0223583.g002

To examine the relative importance of each, we conducted two experiments where participants judged the size of a standard stimulus over one of four different backgrounds ((A) linear perspective cues + textures, (B) linear perspective cues, (C) textures or (D) no cues) (Fig 2). In experiment 1, we presented both the standard and comparison stimuli over the same background image. We hypothesised an illusion that varied in strength as a function of the availability of linear perspective cues and texture gradients. The presentation of the standard and comparison stimuli over the same background is common. However, under this configuration, size-contrast effects can further increase the perceived differences in size that are driven by the pictorial cues. To help minimise size-contrast effects so that we could better isolate the effect of pictorial cues, we presented the comparison stimulus outside of the background so that only one ring was presented over the background in experiment 2. In this case, we hypothesised that the size illusion would be weaker but still present.

In addition, we recorded eye positioning during the task. To the best of our knowledge, eye movements under free gaze conditions have never been measured before in previous investigations of the Ponzo illusion. We hypothesised that participants would spend different amounts of time looking at the upper and lower sections of the Ponzo illusion background. Specifically, we hypothesised that durations in fixation would be larger for the upper visual field since depth cues often draw our attention towards this field in the real world [21]. Moreover, it is well reported that attending to particular parts of an illusion display increases its strength [21–24]. For these reasons, we hypothesised that illusion strength would increase in conditions where participants direct their attention more to the upper visual field.

Method

Participants

Sixteen participants (M_Age = 20.43 years, SD = 2.31, 8 males) participated in experiment 1 and sixteen participants (M_Age = 23.44 years, SD = 9.52, 6 males) participated in experiment 2. All had either normal or corrected-to-normal vision. Prior to the experiments, each participant’s visual acuity, stereo-acuity, and colour vision were measured using the Snellen Chart [25], Randot Contour Circles Test [26], and Ishihara’s Test for Colour Deficiency [27]. Visual acuity was 20/25 or better in each eye and stereo acuity was 70 arcsec (0.02 arcdeg) or less for all participants. None of the participants were colour blind. Written informed consent was obtained from each participant before the experiment. At the end of the experiment, participants received gift cards to compensate for their time and any inconveniencies. The study was carried out in accordance with the Declaration of Helsinki and approved by the La Trobe Human Ethics Committee.

Procedures

Stimuli were presented on an ASUS VG248QE (Taipei, Taiwan) 24" monitor driven by MATLAB (MathWorks, Natick, MA, USA) and Psychtoolbox, Version 3 [28, 29]. The monitor was placed 76 centimetres away from the chin and forehead rest. It was set to a 120 Hz refresh rate with 1920 x 1080 display resolution on a Dell T1700 running Windows 10. Button responses were recorded with a model RB-840 Cedrus Response Pad (Cedrus Corporation, San Pedro, California, USA). For a subset of participants (8 per experiment), we recorded eye positioning using a portable Tobii TX 300 eye-tracker at a sampling rate of 300 Hz (Tobii AB, Stockholm, Sweden). The eye-tracker was placed 60 centimetres away from the chin and forehead rest. Eye-tracking was not performed in everybody due to limited access to this system.

The size perception of two 2-dimensional (2D) red (R = 200, G = 0, B = 0) rings with a thickness of 0.16 degrees was evaluated using the Method of Constant Stimuli [30]. One of the rings was designated as the standard and the other as the comparison stimulus. The standard ring, and the comparison ring in experiment 1, was presented over one of the following backgrounds: (1) linear perspective cues + textures, (2) linear perspective cues, (3) textures, or (4) no cues (Fig 2).

In experiment 1, both the standard and comparison stimuli were presented over the same background image (12.6 × 12.6 degrees) (Fig 3A and 3B). The standard and comparison rings were separated by a vertical distance of 5.24 degrees and a horizontal distance of 3.26 degrees. In experiment 2, we used similar stimuli and followed similar procedures as we used in experiment 1 except for three differences. First, in experiment 2, the background image subtended the same visual angle but was presented over a larger grey area (14 × 18 degrees). Second, the comparison ring was presented outside of the background image within the grey area (Fig 3C and 3D). Third, in experiment 2, the standard and comparison rings were separated by a horizontal distance of 10.74 degrees for the bottom standard ring and 7.52 degrees for the top standard ring. The vertical distance between the standard and comparison rings was 2.62 degrees. In both experiment 1 and experiment 2, the rings were always presented in the same configuration.

Download:

Fig 3. Stimuli and procedures.

Illustration of stimuli and procedures that were used in the linear perspective + texture gradient background in experiments 1 and 2 (A-B and C-D, respectively). For the top standard ring block, the top standard ring was shown for 1 sec followed by an alerting sound cue that signalled the presentation of the bottom comparison ring (A and C for experiments 1 and 2, respectively). For the bottom standard ring block, the bottom standard ring was shown for 1 sec followed by an auditory alerting cue that signalled the presentation of the top comparison ring (B and D for experiments 1 and 2, respectively). In experiment 1, both the standard and comparison stimuli were presented over the same background image (A and B). In experiment 2, the comparison ring was presented outside of the background image within the grey area (C and D). The speaker symbols represent the presentation of the auditory alerting cue.

https://doi.org/10.1371/journal.pone.0223583.g003

For each experiment, there were eight blocks. Each block corresponded to a different condition. Specifically, there was a block for each background with the standard stimulus on the top portion of the image and there was a block for each background with the standard stimulus on the bottom portion of the image (Fig 3). Half of the participants performed the blocks with the standard stimulus on top before performing the blocks with the standard stimulus at the bottom, while the other half of the participants did the reverse. The order of background presentations was randomised for each participant. The standard ring was always kept constant at 2.1 degrees in diameter on all trials and the comparison ring ranged in diameter from 1.64 degrees to 2.54 degrees in 10 increments with 0.1 degrees difference. Each comparison size was shown 10 times in a single background per block. Thus, there were 100 trials per block. The order of trials within each block was randomised.

The graphical reduction of pictorial depth cues was accomplished by removing linear perspective and / or texture cues from a 3D scene of a hallway and walls created in Autodesk 3ds Max (Autodesk, Inc., San Rafael, CA, USA), a program that is currently frequently used to create virtual environments. The following specifications were used to create the background images. The left and right side walls had a length of 1,800 cm and were used as linear perspective cues onto a hallway with a back wall and floor. The walls on the side had different heights to reduce the possibility of the image popping out instead of going into the distance. The left side wall had a height of 195 cm while the right side wall had a height of 130 cm. The back wall with a width of 2,540 cm and a height of 290 cm was placed at the end of a floor that have a length of 1,800 cm and a width of 2,540 cm. The bottom standard stimulus was presented 130 cm away from the virtual camera while the top standard stimulus was presented 1,300 cm away from the virtual camera.

A high-resolution seamless rock wall image was used as the texture cue. Depth information was increased with bump and specular maps of the rock wall. The bump and specular maps of the seamless rock wall image were created in Adobe Photoshop (Adobe Systems Incorporated, San Jose, CA, USA) and assigned to each textured wall. The textured 3D scene of a hallway and walls was used as the background image with linear perspective cues and texture gradients (Fig 2A). Three more background images were rendered by removing linear perspective cues and / or textures. The background image that covered only texture cues was obtained by removing the side walls (Fig 2C) and the image that covered only linear perspective cues was obtained by removing all texture gradients in Autodesk 3ds Max (Fig 2B). Finally, a control background image without depth cues was created by removing both the linear perspective cues and texture gradients, which served as a baseline background (Fig 2D). To assign a colour to the non-textured backgrounds, we measured the average colour of each textured wall in Adobe Photoshop and assigned their average colour in Autodesk 3ds Max. The background images were derived from taking pictures of the virtual environment using a virtual camera placed 1,800 cm from the back wall. The settings of the virtual camera consisted of a full frame of 35 mm, a focal length of 29 mm, and an aperture of f/8. Global lighting of the virtual environment simulated daylight (6,500 K). The centre of the virtual camera from the floor of the hallway was approximately the same height (35 cm) as the participant’s eyes from the testing table (32 cm). All the rendered background images were cropped in Adobe Photoshop.

The participants were provided with 4 practice trials at the start of each block. The eye-tracker was calibrated with a 9-point calibration display at the beginning of each block for the participants who had eye tracking. For both the practice and experimental trials, the participants were asked to judge whether the comparison ring was smaller or larger than the standard ring. Fig 3 illustrates the order of events in a given trial. The standard ring was always presented on the background. Each trial began with a 60 ms auditory alerting cue whenever the comparison ring was presented. The comparison ring was displayed until participants judged whether it was larger or smaller than the standard ring by pressing a button. After button pressing, the comparison ring disappeared before the next trial began one second later. For every trial, the eye tracker collected data from trial onset until a response was made. A break was provided at the end of each block.

Statistical analyses

We created psychometric curves for each condition in each participant based on their responses. This was done by counting the number of times the participant reported the comparison stimulus as appearing “larger” than the standard one at each increment. Using the following logistic function, we calculated the probability (P) of the participant reporting the comparison stimulus at each increment (0.1 degrees) as appearing larger than the standard stimulus: Where b0 and b1 are coefficient estimates based on an initial general linear model (binary logit) fit. From this function, the PSE was calculated as P = 0.5, representing how large the comparison stimulus needed to be for the participant to judge this stimulus as having the same apparent size as the standard stimulus. The resulting curves fit well for the different conditions in each individual in experiments 1 (r(6) ranged between .734 − .989) and 2 (r(6) ranged between .762 − .979). These resulting PSE values were used for the following statistical analyses in each experiment (see S1 Fig in Supplementary Materials to see the mean PSE curves).

To verify whether or not the standard stimulus in a given condition was perceived differently than its physical size, a one-sample t-test against the physical size of the standard ring (100 pixels) was performed. The Bonferroni method was used to correct for multiple comparisons. There were eight one-sample t-tests per experiment. Thus, to report the Bonferroni-corrected p values (p_corr), we multiplied the observed p value (p_uncorr) by the number of comparisons made (i.e., p_corr = p_uncorr × 8).

To test the effects of linear perspective cues and texture gradients on the perceived size of the top and bottom rings, a 2 × 4 repeated-measures analysis of variance (ANOVA) with Visual Field ((1) Top Ring, (2) Bottom Ring) and Background ((1) linear perspective cues + textures, (2) linear perspective cues, (3) textures or (4) no cues) as factors was conducted. Greenhouse-Geisser corrections were applied when the assumption of sphericity was not met according to a Mauchly’s sphericity test.

We further analysed the contributions of linear perspective cues and texture gradients for the top and bottom rings based on a maximum likelihood estimation (MLE) model. According to this model, the visual system optimally combines visual cues by taking the reliability of each cue into account [30–32]. The model is based on two assumptions: (1) lower variance in the data is seen when a visual cue is highly reliable and (2) the visual system gives more importance to highly reliable cues when combining information from different cues. To estimate the relative contributions of linear perspective cues and texture gradients in the perceptual rescaling of size, we computed the weighted linear summation of PSE measurements for texture (S_texture) and linear perspective (S_{linear perspective}) backgrounds from their standard deviations using the following formulas:

Solving for w in each experiment provided the respective contributions of linear perspective cues and texture gradients.

To analyse the eye-tracking data, areas of interest (AOIs) were defined as a 5 × 5 cm (3.78 × 3.78 degrees) square region centred on the standard and comparison rings. AOIs were defined using Tobii Studio eye tracking software prior to data collection (Tobii Technology, Inc). During data collection, the eye-tracking system tabulated whether or not eye gaze was directed in each AOI at every frame lasting 3.3 milliseconds. The number of frames with fixation was then computed for each AOI off-line and converted to seconds to calculate fixation durations. These fixation durations represent the sum (not the average) of all trials. Each condition had the same number of trials and the results indicated that there were no differences in trial durations (which ended when participants made a response) between conditions (see below)–enabling us to compare this measure across conditions. A 2 × 2 × 4 repeated measures ANOVA with Area of Interest ((1) Standard, (2) Comparison), Visual Field ((1) Top Standard Ring Block, (2) Bottom Standard Ring Block) and Background ((1) linear perspective cues + textures, (2) linear perspective cues, (3) textures or (4) no cues) as factors was conducted.

Post-hoc pairwise comparisons using Tukey’s honest significance difference (HSD) method, which corrected for multiple comparisons, were conducted to further examine interactions and effects found significant by all ANOVAs performed on the PSEs and fixation durations. Unless specified otherwise, all reported p values were corrected for multiple comparisons and were based on an alpha level of .05.

Results

Experiment 1

Points of subjective equality (PSEs).

Fig 4 shows the mean PSEs for each background for the top and bottom standard ring blocks. Table 1 provides descriptive statistics. To determine if the different background conditions exerted a change in perception relative to retinal information, we compared the PSEs of each background against the physical size of the standard ring (100 pixels) with one sample t-tests. One sample t-tests revealed that the top standard ring was perceived larger than its physical size across all backgrounds with pictorial depth cues (all p ≤ .008) while the bottom standard ring was perceived smaller than its physical size when it was presented with linear perspective cues only (p = .016). All significant shifts in PSEs were in the expected direction. Namely, PSEs were greater and lower than 100 pixels when the top and bottom rings was the standard ring, respectively.

Download:

Fig 4. PSEs in experiment 1.

Asterisks (*) represent significant differences at p < .05 after Tukey’s HSD corrections were made for multiple comparisons. Daggers (†) represent significant differences from the physical size (100 pixels) of the standard ring at p < .05 after Bonferroni corrections were made for multiple comparisons. The horizontal dashed line denotes the physical size of the standard ring. PSEs were computed from psychometric functions that best fit the data. Error bars represent the standard errors around the mean for within subjects contrasts. These error bars were calculated using procedures described by O’Brien and Cousineau [33].

https://doi.org/10.1371/journal.pone.0223583.g004

Download:

Table 1. Descriptive statistics for PSEs in experiment 1.

A series of independent samples t-tests, which were corrected for multiple comparisons using the Bonferroni method (p_corr), on the PSE values for each condition was performed between participants with (With Eye-Tacker) and without (No Eye-Tracker) eye-tracking.

https://doi.org/10.1371/journal.pone.0223583.t001

The PSEs for the top and bottom rings in each of the four backgrounds were compared with each other. An interaction was observed between Visual Field and Background (F (3, 45) = 25.68, p < .001). Main effects of Visual Field (F (1, 15) = 52.08, p < .001) and Background (F (3, 45) = 3.30, p = .029) were also significant. To further examine the interaction, we conducted Tukey’s HSD pairwise comparison tests. These tests showed that the size of the top ring was consistently perceived larger on backgrounds with depth cues than the one without any cues (all p ≤ .028) while the size of the bottom ring was perceived smaller on the backgrounds with linear perspective cues compared to the one without any cues (both p ≤ .006). There were no differences in the perceived size of the rings when placed on the background with only linear perspective cues versus the one with only texture gradients (both p ≥ .394). Taken together, the presence of depth cues affected the perceived size of the top and bottom rings. S1 Table in the supplementary materials provides the results for all pairwise comparisons examined.

We repeated the above ANOVA on the absolute shifts in PSEs (|100 –PSE|) to confirm if the above interaction in PSEs was driven more strongly by the top ring relative to the bottom ring. The ANOVA revealed main effects of Visual Field (F (1, 15) = 7.91, p = .013) and Background (F (3, 45) = 16.89, p < .001). Likewise, the interaction remained significant (F (3, 45) = 6.11, p = .001). The presence of this interaction confirms that the top ring had a stronger influence than the bottom one. Complementing the ANOVA, the MLE analysis revealed that textures provided more reliable information for both the top (weight _Texture = .52, weight _Linear = .48) and bottom (weight _Texture = .57, weight _Linear = .43) rings. Correlations between the observed and predicted estimates were significant for the top (r (14) = .61, p = .012) and bottom (r (14) = .84, p < .001) rings (See S2 Fig in Supplementary Materials).

Eye-tracking.

Fig 5 shows the fixation durations for each AOI for the top and bottom standard ring blocks. An interaction was observed between AOI and Visual Field (F (1, 7) = 8.50, p = .022). All other interactions did not reach significance (all p ≥ .152). There was a main effect of AOI (F (1, 7) = 7.69, p = .028) but not for Visual Field (F (1, 7) = 2.23, p = .179) or Background (F (3, 21) = 1.18, p = .341). Post-hoc Tukey’s HSD pairwise comparison tests showed that participants attended to the top comparison ring more than the bottom standard ring (p = .016). There were no differences between fixation durations for the top standard and bottom comparison ring (p = .994). Thus, participants fixated on the top comparison ring more than the bottom standard ring when asked to perceptually judge the size of the latter.

Download:

Fig 5. Fixation durations in experiment 1.

Fixation durations are the total amount of time participants gazed at an AOI across all trials for a particular condition. The asterisk (*) represents a significant difference at p < .05 after a Tukey’s HSD correction was made for multiple comparisons. Error bars represent the standard errors around the mean for within subjects contrasts. These error bars were calculated using procedures described by O’Brien and Cousineau [33].

https://doi.org/10.1371/journal.pone.0223583.g005

Additional analyses.

Not all participants had eye-tracking. The question then arises whether or not the participants with eye-tracking are representative of those who did not. For the purposes of verification, we performed a series of independent samples t-tests on the PSE values for each condition (Table 1). Bonferroni corrections were applied to these tests to account for eight comparisons. These additional tests revealed that there were no differences between the participants who had eye-tracking versus those who did not (all p ≥ .272).

We performed an additional ANOVA to determine if the duration of trials differed between conditions. As a reminder, each trial ended when participants made a response. The validity of some of the analyses above depends on an evenly matched duration of trials across the different conditions. ANOVA did not reveal main effects of Visual Field (F (1, 7) = .903, p = .374) or Background (F (3, 21) = .333, p = .802). Likewise, the interaction did not reach significance between the two factors (F (3, 21) = 2.501, p = .087). Thus, trial durations did not differ between conditions. The average trial duration was 2.465 secs (SD = 0.154).