Performance Characterization of Watson Ahumada Motion Detector Using Random Dot Rotary Motion Stimuli

Siddharth Jain

doi:10.1371/journal.pone.0004536

Abstract

The performance of Watson & Ahumada's model of human visual motion sensing is compared against human psychophysical performance. The stimulus consists of random dots undergoing rotary motion, displayed in a circular annulus. The model matches psychophysical observer performance with respect to most parameters. It is able to replicate some key psychophysical findings such as invariance of observer performance to dot density in the display, and decrease of observer performance with frame duration of the display.

Associated with the concept of rotary motion is the notion of a center about which rotation occurs. One might think that for accurate estimation of rotary motion in the display, this center must be accurately known. A simple vector analysis reveals that this need not be the case. Numerical simulations confirm this result, and may explain the position invariance of MST(d) cells. Position invariance is the experimental finding that rotary motion sensitive cells are insensitive to where in their receptive field rotation occurs.

When all the dots in the display are randomly drawn from a uniform distribution, illusory rotary motion is perceived. This case was investigated by Rose & Blake previously, who termed the illusory rotary motion the omega effect. Two important experimental findings are reported concerning this effect. First, although the display of random dots evokes perception of rotary motion, the direction of motion perceived does not depend on what dot pattern is shown. Second, the time interval between spontaneous flips in perceived direction is lognormally distributed (mode≈2 s). These findings suggest the omega effect fits in the category of a typical bistable illusion, and therefore the processes that give rise to this illusion may be the same processes that underlie much of other bistable phenomenon.

Citation: Jain S (2009) Performance Characterization of Watson Ahumada Motion Detector Using Random Dot Rotary Motion Stimuli. PLoS ONE 4(2): e4536. https://doi.org/10.1371/journal.pone.0004536

Editor: Ernest Greene, University of Southern California, United States of America

Received: September 28, 2008; Accepted: December 30, 2008; Published: February 19, 2009

Copyright: © 2009 Jain. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The equipment used in the research belonged to Donald A Glaser, who also provided funding to pay human subjects used in this research. The author was supported in part by a GSR (Graduate Student Researcher) funded by Donald A Glaser and in part by a GSI (Graduate Student Instructor) provided by EECS Department UC Berkeley.

Competing interests: The author has declared that no competing interests exist.

Introduction

Many models of visual motion perception have been proposed [1], [2], [3], [4], [5]. Although much research has since been done on studies of human visual motion perception, little work has been done to psychophysically characterize the performance of these models. This is important for obvious reasons. A correct model of motion sensing should match human psychophysical performance on motion detection, and also agree with what is known currently about the neurophysiology of motion sensitive cells in the brain.

This paper presents a psychophysical performance characterisation of Watson & Ahumada's model of visual motion sensing [3], the first one to do so in my knowledge. The ability of Watson Ahumada motion detector to detect motion in random dot kinematograms is compared against human psychophysical performance. The stimulus, termed the racetrack, consists of random dots displayed in a circular annulus. The dot pattern is refreshed periodically, and a certain fraction of dots are correlated to move either clockwise (CW) or counter-clockwise (CCW) in the next frame. By varying the fraction of dots to be correlated, the amount of motion signal in the display can be controlled (see Movies S1, S2, S3 for illustration). There are many other parameters that can be varied, and performance of both the model and human observers is measured.

The model is able to match human performance with respect to most, but not all, stimulus parameters. For example, it is found that human observers are insensitive to the dot density in the display. The model shows similar behavior. The invariance of observer performance to dot density provides strong evidence against motion models based on matching dots to their nearest neighbors in the next frame [6], [7]. Such models predict that observer performance should decrease with increase in dot density, according to the probability of mismatch formula [8]. This is because as the dot density increases, the chances that the nearest neighbor is not in fact the correlated partner from the previous frame increase. Another experimental finding is that a frame duration of about 30 ms is found to be optimal for motion perception. I explain this result in terms of the spatiotemporal receptive field (STRF) structure of motion sensitive cells. At any time instant t, the response of such cells is roughly based on the value of the spatiotemporal stimulus from time t-T to t, with T of the order of 200 ms. When the frame duration is of the order of T or higher, the input is mostly constant within a window of T ms and therefore the cells fail to detect any motion. On the other hand if frame duration is very low the input may be changing at a rate that the cells cannot handle. This will again result in failure of cells to respond optimally.

The motion in racetrack is rotary as opposed to the more commonly encountered translational case. Associated with the concept of rotary motion is the notion of a center about which rotation occurs. One might think that for accurate estimation of rotary motion in the display, this center must be accurately known. A simple vector analysis presented in this paper reveals that this need not be the case. Numerical simulations confirm this result, and may explain the position invariance of MST(d) cells. Position invariance is the experimental finding that cells that are sensitive to rotary motion are insensitive to where in their receptive field rotation occurs [9].

A special case of the racetrack is when all dots are randomly drawn from a uniform distribution in each frame, i.e., there are no correlated dots. One would expect that in this case the perception would be that of random twinkling noise, since there is no motion embedded in the stimulus. However, about two-thirds of observers report perception of rotary motion. This illusory motion was investigated by Rose & Blake previously, who termed the phenomenon the omega effect [10]. The omega effect is a classic example of paternicity, the tendency of the brain to find meaningful patterns in meaningless noise [11]. Two important results concerning this effect are reported in this paper. First, although the display of random dots evokes perception of rotary motion, the direction of motion perceived does not depend on what dot pattern is shown. Second, the time interval between spontaneous flips in perceived direction is lognormally distributed (mode≈2 s).

It may be worthwhile to mention some aspects of the “Materials & Methods” in this paper that are distinct from the traditional psychophysics paradigm. In the experiments described here, each trial has a 60 s duration. During this time, the direction of rotation changes randomly and the observer is faced with the task of continuously tracking the direction of rotation. Observer performance is calculated by cross correlating observer response with actual direction of rotation. The maximum value of the normalized cross correlation function denoted by χ is taken to be a measure of observer performance. This method is distinct from traditional psychophysics paradigm, in which the display is shown to observer for fraction of a second, and the observer has to judge if motion was perceived CW or CCW. After many trials the confusion matrix and d′ is calculated [12]. The reason for the new method is none, except that it naturally occurred to me. Also it is my opinion that sub-second trial duration may not provide enough time for visual system of observer to reach steady state. One would expect that trial duration should be such that the percent correct and d′ should be independent of trial duration. This can only happen if the system is in steady state. A side-benefit of the new method is that it enables the calculation of reaction time of the observer. This is the time delay at which the normalized cross correlation function reaches its maximum value. It is found that for most observers, reaction time ranges from 0.5–2 s depending on how easy it is to detect motion in the display.

In summary, the paper can be said to have three main contributions:

It presents a psychophysical performance characterisation of Watson & Ahumada's model of visual motion sensing. The model is found to provide a good fit to the experimental data for most, but not all, stimulus parameters.
It shows that for accurate estimation of rotary motion in a display, it is not necessary that the center of rotation be accurately known. This may explain the fact that rotary motion sensitive cells found in MST/MSTd areas of the brain are insensitive to where in their receptive field rotation occurs.
It presents two experimental findings concerning the omega effect. First, observer response is irreproducible. Second, the time interval between spontaneous flips in perceived direction is lognormally distributed (mode≈2 s). These findings suggest the omega effect fits in the category of a typical bistable illusion, and therefore the processes that give rise to this illusion may be the same processes that underlie much of other bistable phenomenon.

Previous Work

Visual motion perception has been a heavily researched topic and hence this paper will necessarily limit itself to a discussion of the most relevant work. Reviews reflecting the state-of-the-art in this area can be found in [13], [14], [15], [16]. Three seminal models of visual motion perception were proposed by Adelson and Bergen (1985), van Santen and Sperling (1985), and Watson and Ahumada (1985) [1], [2], [3]. Central to the Adelson Bergen & Watson Ahumada models is the concept that the entire power spectrum of an image undergoing coherent translation lies on a plane in the spatiotemporal frequency domain [17]. Determining this plane is therefore equivalent to determining the motion of the image. In its original form the Adelson Bergen motion detector is limited to detecting motion in 1D. Its extension to 2D was provided by Heeger (1987), Simoncelli and Heeger (1998) [18], [19]. The model has been refined further in Rust, Mante, Simoncelli, and Movshon (2006) where it is shown that it can capture the full range of pattern motion selectivity found in MT [20]. Emerson, Bergen, and Adelson (1992) did a study in which it was shown that the responses of V1 complex cells from cat's striate cortex were well fitted by the Adelson Bergen model [21]. Moreover, cell responses were found to be inconsistent with the van Santen and Sperling model. Cells sensitive to rotary motion have been discovered in areas MST/MSTd of the brain [22], [23], [9]. These cells have large receptive fields compared to cells in V1/MT. Also, they are not sensitive as to where in their receptive field rotation occurs, a phenomenon termed position invariance [9].

Random dot kinematograms (RDKs) have been widely used in studies of visual motion perception [6], [7], [8], [10], [24], [25], [26], [27], [28]. Newsome & Pare (1998) have remarked that random dot displays are useful because they stimulate primary motion sensing mechanisms while minimizing familiar positional cues [24]. Newsome, Britten, & Movshon (1989) found that a dot correlation of at least six percent is required for monkeys to be able to detect motion in RDKs undergoing translational motion [25]. The present study gives a similar result for human observers. The effect of time-sampled displays on motion perception has been previously researched by Morgan (1980), Watson, Ahumada, & Farrell (1986) [29], [30]. Williams & Sekuler (1984) had studied the effect of dot density on observer performance [8]. They formulated the probability of mismatch formula according to which observer performance should decrease with increase in dot density, a view challenged by the present paper.

A special case of the racetrack is the omega effect, in which a display of dynamic uniformly distributed random dots in a circular annulus evokes perception of illusory rotary motion. This phenomenon was discovered by Rose & Blake (1998) [10] although they trace its origin to as far back as Mackay (1965) [31]. Recently several papers studying illusory motion from Glass patterns have appeared in the literature [32], [33], [34], [35]. Motion perception in such cases, where the spatial form of the stimulus is believed to guide motion perception, has generally been termed as implied motion in order to distinguish it from real motion, in which the display itself contains non-zero motion energy. Geisler (1999) had suggested motion streaks as providing a spatial cue that guides motion perception [36]. Barlow & Olshausen (2004) have explained the phenomenon of motion streaks and flow seen in Glass patterns by pointing out that the power spectrum of a motion blurred image or a Glass pattern exhibits strong anisotropy, which is a characteristic property of a moving image, and therefore excites the mechanisms that normally detect the distortions of local power spectrum caused by motion [37]. It is to be noted that the omega display does not display the anisotropy in power spectrum associated with Glass patterns, yet rotary motion is seen in it.

Materials and Methods

The experimental stimulus used in this study is termed racetrack. Three movies of the stimulus are included with this paper. A Java applet is also available online at http://purl.oclc.org/NET/racetrack. The racetrack stimulus consists of a random dot pattern displayed in a circular annulus. The dot pattern is refreshed periodically. A certain fraction c of the dots, referred to as correlated dots, are rotated by an angle θ in the next frame. The remaining dots have their positions generated randomly and uniformly in Cartesian (x,y), and are representative of noise. The algorithm for generating dots is such that if a dot is correlated in the present frame, it is guaranteed not to be correlated in the next frame. This eliminates the appearance of multiple dot trajectories, and thus the only motion cues in the stimulus are two dot apparent motion cues. Observers see a swarm of dots that appears to rotate clockwise (CW) or counter-clockwise (CCW). The direction of rotation changes randomly according to the polarity of a coin that flips every 3 s. They are instructed to click the left mouse button for CCW motion, and the right mouse button for CW motion. By cross correlating the observer response with the actual direction of rotation of the correlated dots, an estimate of observer performance and reaction time denoted by χ and τ respectively is obtained. This process is illustrated in Figure 1. χ is defined as the maximum value of the normalized cross correlation function. τ is the time delay at which χ occurs. A χ value of 1 indicates perfect detection of the embedded motion. At c = 0 the observer response can still be cross correlated with the input signal, which would have dictated the rotation of correlated dots if there were any in the stimulus. The χ value obtained in this case reflects chance, or zero, detectability of embedded motion. Response reproducibility is quantified by cross correlating observer response curves in response to the same stimulus in multiple trials.

Download:

Figure 1. The dotted curve is the motion generated by the computer, and the solid curve is the motion reported by the observer.

χ is the maximum value of the normalized cross correlation function and τ is the time delay at which χ occurs.

https://doi.org/10.1371/journal.pone.0004536.g001

Definitions and default values of some parameters are as follows: dot correlation c = number of correlated dots/total number of dots; frame duration fd = length of time for which a frame stays on screen, default = 30 ms; dot density dd = dots per unit area, default = 5 dots/degrees²; angle of rotation θ = angle by which correlated dots are rotated, default = 5°, the spatial hop size of a correlated dot = rθ where r is distance of dot from center, θ in radians; inner circle diameter ic = angle subtended by inner circle diameter at the eye, default = 7°; outer circle diameter oc = angle subtended by inner circle diameter at the eye, fixed at 10° in all experiments; dot diameter = 5′, fixed in all experiments; duration of a trial = 60 s. Stimuli were displayed on a NEC MultiSync FP1370 22″ (20″ viewable image size) CRT monitor with display resolution = 640×480@100 Hz; black dots (luminance≈0) against a background luminance of 10.8 cd/m² were displayed; viewing distance = 1.65 m. The range and default values of some parameters is summarised in Table 1.

Download:

Table 1. Default values and range of various parameters used in experiments.

https://doi.org/10.1371/journal.pone.0004536.t001

The study was conducted over a period of several years, and new observers were recruited as old ones dropped out. In all experiments the number of observers is at least four, and number of trials ≥20 for each data point shown in the figures. Error bars in the figures equal one standard deviation (s.d.), unless otherwise stated. Custom software was written by the author in C# to generate the stimuli. The study was approved by Committee for Protection of Human Subjects (CPHS), UC Berkeley. Written informed consent was obtained from subjects.

Model Description

The following steps and Figure 2 describe the complete pipeline used to model observer responses to the racetrack:

Step 1: Stimulus is input to the Watson Ahumada motion detector, which at time t gives the instantaneous optical flow.
Step 2: The optical flow is easily converted into a measure of rotary motion signal by taking cross products with radial vector, followed by weighted averaging. The weights are obtained in Step 1; for each velocity estimate the Watson Ahumada detector is able to provide a confidence/error measure which is used as the weight. Area MST(d) in the brain is believed to carry out this type of processing, where the local motion signals from stage MT are pooled to estimate global patterns of rotation and expansion that guide in heading estimation [9], [38], [39]. The output of this step is denoted by e(t).
Step 3: The human visual system must integrate information over a certain interval of time to compute a reliable estimate of motion. This is achieved by passing e(t) through a moving averages filter, with window size of half a second. The output of this step is denoted by I(t).
Step 4: While doing psychophysical experiments with human observers, the only information available is the direction in which the observer is perceiving motion. Therefore in order to compare model response with experimental psychophysics, I(t) is passed through a level crossing detector (LCD) with thresholds ±B. B = 2σ(I) at c = 0 under default parameters. This choice of B makes the events when I(t) may cross detection threshold, given there is no rotary motion in the stimulus, unlikely. The behavior of level crossing detector is as follows: when input crosses +/−B the detector signals CCW/CW motion respectively, and continues to do so until the input crosses threshold in the opposite direction. When that happens, the LCD flips to the opposite state.

Download:

Figure 2. (a) Block schematic of the model (b) Optical flow output by Watson Ahumada motion detector (c) Model response at various other stages in the pipeline.

https://doi.org/10.1371/journal.pone.0004536.g002

χ, τ can now be computed for the model, and values compared to experimental psychophysics. In its present form the model is strictly deterministic. However, the human visual system necessarily exhibits some variability, characteristic of any real world physical system. In fact as shown in the results section, it is found that at c = 0 observer responses are not reproducible. This variability in response is incorporated into the model by adding Gaussian White Noise (GWN) n(t) to e(t), until the model response reproducibility also drops to zero at c = 0. This occurs for k = σ(n(t))/σ₀≥approximately six, where σ₀ stands for σ(e) at c = 0 under default parameters. Accordingly k was fixed at six. Model simulations were run at a resolution of 128×128 pixels, unless otherwise stated. Circular dots in psychophysical experiments were approximated as squares of equal area in model simulations. In all results, Watson Ahumada sensors are tuned to a center frequency of 0.64 cycles/degrees, unless otherwise stated. The reason for this setting is that it gave acceptable results. It may also be noted that most of the energy in power spectra of natural images is concentrated at relatively low spatial frequencies [40]. Default values of model parameters are summarised in Table 2. My source code for the Watson Ahumada component of the model is publicly available [41].

Download:

Table 2. Default values of model parameters used in simulations.

https://doi.org/10.1371/journal.pone.0004536.t002

Results

We begin with a discussion of the omega effect (c = 0 case of racetrack), and present two important results. First, although the display triggers perception of rotary motion, the direction of motion perceived is not dependent on what dot pattern is shown. Second, the time interval between spontaneous flips in direction exhibits a lognormal distribution.

Omega effect: Response Reproducibility and distribution of spontaneous flips in perceived direction

As mentioned earlier, the omega effect is the c = 0 case of the racetrack. About two-thirds of observers report perception of rotary motion at c = 0, even though there is no motion embedded in the stimulus [42]. The perceived direction of motion changes randomly from time to time. After prolonged viewing most observers can usually choose the perceived direction of motion at will. For some observers the direction of motion switches when a sudden attention grabbing stimulus is given (such as a sudden tap on the back of the head). Some observers have even remarked that mere pressing of a mouse button causes the perceived direction of motion to reverse.

An important characteristic of the omega effect is that an observer gives different responses to the same stimulus in multiple trials. This is quantified in the following way. The observer response curves in response to the same stimulus in two separate trials are cross correlated. Let ζ denote the maximum value of the normalized cross correlation function. ζ is taken to be the measure of response reproducibility. It is found that the value of ζ when the same stimulus is shown in multiple trials is no different than the value of ζ when different stimuli are shown in multiple trials. Thus, the response reproducibility of the omega effect is zero. This may happen because the display is inherently ambiguous like most, if not all, bistable illusions. Both interpretations are equally likely and the brain randomly chooses a configuration at any time instant. It is found that ζ = μ₁±σ₁ = 0.145±0.1048 (mean±s.d.) based on 47 trials in which same stimulus is shown from trial to trial. Further, ζ = μ₂±σ₂ = 0.118±0.1359 based on 67 trials in which different stimuli are shown in multiple trials. One sided t-test to test the null hypothesis μ₁ = μ₂ against the alternate hypothesis μ₁>μ₂ gives t = 1.196. At α = 0.05 level of significance the null hypothesis cannot be rejected (P value = 0.1158).

The foregoing discussion has shown that the reproducibility of response is zero for the omega effect (c = 0). However, intuitively we expect if c is not zero, i.e., some dots are deliberately correlated to undergo rotary motion, then observers should start responding in direction of motion of correlated dots. Figure 3 shows the response reproducibility increases with c as expected (ζ = 1 reflects perfect reproducibility).

Download:

Figure 3. Response reproducibility ζ vs. c.

Both model and humans show zero reproducibility at c = 0, and the reproducibility steadily increases with c, because the motion signal gets stronger.

https://doi.org/10.1371/journal.pone.0004536.g003

Figure 4(a) shows the histogram of the inter flip interval (IFI), which is the time interval between spontaneous reversals in perceived direction of motion, at c = 0. The mode of the histogram occurs at IFI≈2 s. The histogram is well approximated by a lognormal distribution which is evident in Figure 4(b), where the pdf (probability density function) of ln(IFI) is plotted together with a Gaussian fit. The IFI of many bistable illusions is lognormally distributed. Such distributions are common in biology, and one way to interpret them is in terms of the noise driven motion of a state point [43].

Download:

Figure 4. (a) histogram of Inter Flip Interval (IFI) at c = 0 (mode≈2 s) (b) normalised histogram of ln(IFI) together with a Gaussian fit.

The pdf of ln(IFI) given by the model is also shown.

https://doi.org/10.1371/journal.pone.0004536.g004

The mechanisms underlying omega effect are not clear. When dots are displayed in a circular annulus their freedom of movement is restricted. The dots at the boundary cannot move in all 360° directions. In the limit when the annulus width tends to zero, the dots can only move tangentially. This suggests an increase in the omega effect with decrease in annulus width which is experimentally true [10]. When the annulus has appreciable width the dots at boundary are more likely to bounce normal to the boundary. Some observers do report perception of a radial pulsating motion in the omega display [42]. Rose&Blake (1998) postulated that the omega effect arises because of interaction between cells that are sensitive to curvature in the display, and cells that are sensitive to motion [10]. For its part, the Watson Ahumada model outputs a zero mean white noise like signal in response to the omega display, since there are dots bouncing off in all directions randomly. This signal combined with the intrinsic noise n(t) (which at c = 0 is six times stronger than the Watson Ahumada signal) results in rapid zero mean fluctuations. Because of their stochastic nature, these fluctuations become large enough at times to cross the LCD thresholds. The IFI distribution resulting from such stochastic fluctuations is also shown in Figure 4(b) for comparison.

Effect of dot correlation c

Figure 5(a) shows variation of signal detectability χ vs. the dot correlation c. A χ value of 1 means perfect detection, and χ at c = 0 reflects the baseline zero level of χ corresponding to chance detectability. The increase in χ with c is easy to understand, as the value of c directly controls the amount of motion embedded in the stimulus. As can be seen from the figure, the model fits the experimental data very closely. If the threshold for motion perception is defined as the value of c for which χ is one standard deviation higher than the χ value at c = 0, then this gives a threshold of c in the range of 0.03 to 0.06. This is comparable to thresholds reported elsewhere [24], [25]. The experimental method described in this paper allows the measurement of the reaction time τ of an observer. Figure 5(b) shows a graph of the reaction time τ vs. c. For c≈0, τ is about 1.5 s, and decreases steadily with increase in c. It takes less time to recognize the motion signal as the signal gets stronger. At high values of c, τ is about half a second. The model is seen to fit the experimental data well. In general χ and τ are inversely correlated as shown in Figure 5(c). Parameters that tend to increase χ tend to decrease τ and vice-versa.

Download:

Figure 5. (a) χ vs. c (b) τ vs. c (fd = 30 ms, ic = 7°, dd = 5 dots/degrees²) (c) τ vs. χ scatter plot and piecewise linear fit for experimental data.

https://doi.org/10.1371/journal.pone.0004536.g005

Effect of frame duration fd

Figure 6 shows that fd≈30 ms is optimal for motion perception. The same sequence of frames that evoke perception of vivid motion at fd≈30 ms, fail to evoke any perception of motion at fd ≫30 ms. The explanation proposed for the fd effect seen here is as follows. The motion computed by local motion detectors at time t is based on the spatiotemporal signal from time t-T to time t, where T≈200 ms is the temporal size of receptive fields of simple/complex cells found in the primary visual cortex [44]. When fd is too large the input is mostly constant within a window of 200 ms and so motion sensitive cells will fail to detect any motion. On the other hand, if fd is too small the input may be changing at a rate that the cells cannot handle. The bandwidth of the stimulus, viewed as a continuous-time signal, is directly proportional to the rate at which the individual racetrack frames are played. When fd is too low, the correlated dot will stay in the receptive field (RF) of a motion sensitive cell for a very brief interval of time, and will not excite the spatiotemporal RF profile of the cell.

Download:

Figure 6. χ vs. frame duration fd.

c = 0.1, ic = 7°, dd = 5 dots/degrees².

https://doi.org/10.1371/journal.pone.0004536.g006

The model results are close to that of experiment, except for the χ values at fd = 10 ms. This may be because of the high bandwidth of neurons used in the model. It is interesting to note that without noise, χ at fd = 100 ms is at the baseline zero level. If noise is added, χ rises above zero level, and matches value given by human observers. This is reminiscent of the beneficial effect noise may sometimes play in a system, by stochastically boosting a subthreshold signal in the manner of stochastic resonance [45], [46], [47].

Effect of dot density dd

Figure 7 shows the effect of varying the dot density dd in the display. Humans display a remarkable indifference to the dot density in the display. This shows that it is the relative proportion of the correlated dots that matters, not their absolute number. The experimentally observed independence of observer performance on dot density cannot be explained by models of motion perception based on matching dots or features to their nearest neighbors in the next frame [6], [7]. Such models display a marked dependence on dot density in the display according to the probability of mismatch formula [8]. As the dot density increases there are more dots per unit area, and the chances that the nearest neighbor is not the correlated partner increase.

Download:

Figure 7. χ vs. dot density dd.

c = 0.2, ic = 7°, fd = 30 ms. Model simulations done at 256×256 pixel resolution for dd>10.

https://doi.org/10.1371/journal.pone.0004536.g007

A derivation of the probability of mismatch formula follows. A correlated dot is displaced by a distance h in the next frame. A nearest neighbor model operates by matching dots to their nearest neighbors in the next frame. The matching directly gives the local motion vectors, which is the output of the model. Therefore, for the correlated dot to be matched correctly to its partner, no dot should fall within a circle of radius h in the next frame. The probability of this happening, which is equal to the probability of no mismatch equals:where A is the area of display, N is total number of dots in the display, N = A·dd, and we assume that dots are uniformly distributed.

Approximating 1−x as exp(−x) for x sufficiently small, and substituting A·dd for N,therefore,(1)which the formula given in Williams & Sekuler (1984) [8]. The probability of mismatch values for the dot density range used in Figure 7, are tabulated in Table 3. This formula makes it explicit that as the dot density increases, there would be more and more mismatches, and therefore observer performance should decrease with increase in dot density. In reality, however, observer performance is independent of dot density in the display. The Watson Ahumada motion detector is able to capture this independence as shown in Figure 7.

Download:

Table 3. Probability of mismatch values for dot densities in Figure 7.

https://doi.org/10.1371/journal.pone.0004536.t003

It may be noted that if some of the assumptions leading to the formula in equation (1) do not hold, the analytic form of P(mismatch) may no longer be given accurately by 1−exp(−πh²dd). However, the central thesis of the formula that observer performance should decrease with dot density will still remain true. This is because as the dot density increases, there are more dots per unit area, and therefore the expected separation between dots would decrease. When the expected dot separation becomes less than the hop size h, the matching would be dominated by mismatches, and performance would decline. It may be appropriate here to remark on the study of Grzywacz, Watamaniuk and McKee (1995) (figure 1 in their paper) [48]. It appears to me that the authors correctly simulated the Adelson Bergen model and found that it is insensitive to dot density. However, they concluded incorrectly, misguided by the probability of mismatch formula, that psychophysical results should depend on dot density.

Effect of spatial hop size h

The hop size is the amount of displacement given to the correlated dots. By default the correlated dots are rotated by an angle of 5°. With ic = 7°, and angle subtended by outer circle fixed at 10°, this translates to average displacement of visual angle on the eye. Figure 8 shows the effect of varying the hop size for the model and humans, at different dot densities. The correlated dots were rotated by angles of {1,5,10,15,20} degrees, corresponding to average displacements of {0.074, 0.37, 0.74, 1.11, 1.48} degrees visual angle on the eye. The curves for the model and humans are approximately similar. Note in particular that changing dot density does not produce any change in χ. The figures show that as the hop size is increased, motion disappears in the display even though the dot correlation is very high (c = 0.4). This is because if the hop size becomes greater than the RF size, motion sensitive neurons will fail to register motion. Also important is the decrease in χ if the hop size becomes too small. In this case, the spatiotemporal profile of the stimulus will not cross-correlate well with the spatiotemporal RF of motion sensitive cells.

Download:

Figure 8. (a) χ vs. hop size h for human observers (b) χ vs. hop size for model.

c = 0.4, fd = 30 ms, ic = 7°.

https://doi.org/10.1371/journal.pone.0004536.g008

Effect of inner circle diameter ic

The angle subtended by the outer circle diameter is fixed at 10° in all the experiments. Figure 9 shows the effect of varying the angle subtended by the inner circle diameter ic (c = 0.1). It is seen that observer performance falls off as the angle subtended by the inner circle diameter ic is changed from 7° to 9.5°. At ic = 9.5° the annulus is very thin, and appears like a 1D ring rather than a 2D annulus. When ic is small, the noise in the display is uniformly distributed in the sense that if θ is the angle made by the noise vector, then θ is uniformly distributed from −π to +π. Denote the cross product of noise vector with the radial vector by x = sin(θ). Then , and the amount of noise is given by

Download:

Figure 9. χ vs. angle subtended by inner circle diameter ic.

Angle subtended by outer circle diameter is fixed at 10°. c = 0.1, dd = 5 dots/degrees², fd = 30 ms. Model simulations at 256×256 pixel resolution. f₀ denotes center frequency of Watson Ahumada sensors.

https://doi.org/10.1371/journal.pone.0004536.g009

On the other hand, when ic→oc, θ is either –π/2 or +π/2 with equal probability. is still zero butso the amount of noise has apparently doubled in this case. Model performance is seen to partially match psychophysical performance. The curve with center frequency equal to 1.28 cycles/° shows a better fit than the curve with center frequency equal to 0.64 cycles/°. Unfortunately I cannot say why the former curve shows a better fit.

Effect of reverse contrast

If the stimulus is modified such that the correlated dots flip their polarity as they rotate, meaning black dots change to white and vice-versa, then the reverse-phi motion [49], [50], [1] takes place. It is found that the motion perceived by an observer is opposite to the physical displacement of the correlated dots. If the correlated dots move CCW(CW), observer perceives motion in CW(CCW) direction respectively. The Watson Ahumada model is able to capture this phenomenon as shown in Figure 10. If the observer perceives motion in a direction opposite to rotation of the correlated dots, the observer response is negatively correlated with the embedded motion. For this reason χ in Figure 10 is defined as the minimum value of the normalized cross correlation function between the response and input function.

Download:

Figure 10. χ vs. c for contrast reversing dots.

fd = 30 ms, ic = 7°, dd = 2.5 dots/degrees².

https://doi.org/10.1371/journal.pone.0004536.g010

To understand why motion may be perceived in the opposite direction when dots reverse their contrast, consider the signal . It is well known [17] that the Fourier Transform of an image undergoing coherent translational motion lies on a plane, i.e., if then where is the 2D Fourier Transform of , and is velocity. The equation of plane is . This observation yields following algorithm to determine motion in a signal : find the best fitting plane to that passes through the origin. The velocity can be read off the equation of the plane. Now consider what happens when reverses its contrast every . The modified signal is given by , where is a square wave alternating between +1 and −1 every . The Fourier Series of is given by , with , being a constant, and . This gives . Note that . Thus, the Fourier Transform of does not lie on a plane passing through the origin. Instead, the Fourier Transform of consists of infinitely many planes given by as illustrated in Figure 11. Assuming is mostly constant, the best fitting plane to (that also passes through the origin) is ⊥ to . If is normal of the best fitting plane, then . is velocity of under reverse contrast. Letting , and we have

Download:

Figure 11. The Fourier Transform of an image undergoing coherent translational motion + periodic reverse contrast lies on infinitely many planes of the form

, with

being an odd number.

The dashed lines denote the window of visibility [30].

https://doi.org/10.1371/journal.pone.0004536.g011

This equation can be satisfied by many . In particular is a solution, which is motion in opposite direction to . Note that suggests that a faster moving particle should actually appear to move slower! This surprising prediction appears to be true within appropriate range. A display of alternating black and white stripes was made. The width of a stripe was 0.25°. The stripe pattern was translated to the right, and the stripes reversed their contrast after a time interval T. On viewing the display, motion was perceived in the leftward direction instead of right. With fd = T = 30 ms and a hop size of 0.125°, the pattern appeared to be moving slower than with hop size of 0.0625°. Further experiments need to be done to gather numerical data to quantify the effect.

Model sensitivity to center position

By definition of rotation, any measure of rotary motion has to be specified with respect to some center of rotation (more accurately the axis of rotation has to be specified). In all the results presented up till now, the center position used in the simulations was the true center of rotation of the dots. What happens if the true center of rotation is not accurately known, as must be the case in reality? Figure 12 shows a schematic in which point O is the true center relative to which the correlated dots are rotating, and point C is the center relative to which rotary motion is computed by the model. is a motion cue. The rotary motion relative to the true center O is given by , whereas the rotary motion relative to C is given by . We have

Download:

Figure 12. Point O represents the true center of rotation, whereas point C is the center relative to which rotary motion is computed by the model.

The offset is given by where is radius of inner circle.

https://doi.org/10.1371/journal.pone.0004536.g012

The condition is true in case of the racetrack. The uncorrelated dots are uniformly distributed and generate motion cues in all directions with equal probability. The correlated dots generate motion cues in tangential direction, which when summed over the entire 360° annulus add up to zero. The expected value of is thus zero. Therefore it seems accurate knowledge of position of the true center relative to which rotation occurs is not needed. Figure 13(a) shows model sensitivity to knowledge of true center position. The rotary motion is computed by the model relative to a point C that is offset from the true center O. The offset is given by where R_i is radius of inner circle. Two curves are shown: in one there is no noise added to the model, i.e., n(t) = 0, and in the other GWN equal to the default value of σ(GWN)/σ₀ = 6 is added to the model. It can be seen that the χ values are not affected much by uncertainty in knowledge of true center position, and start to deteriorate only when the offset becomes very large. This may explain the experimentally observed position invariance of MST(d) cells, the fact that the cells are insensitive to where in their RF rotation occurs [9].

Download:

Figure 13. χ vs. center relative to which rotary motion is computed (a) full 360° annulus is visible (b) only 90° of annulus is visible.

Type1 – a single 90° sector is visible. Type 2 - two sectors located diametrically opposite to each other, and each 45° in size, are visible. Both curves are for the model.

https://doi.org/10.1371/journal.pone.0004536.g013

It seems that when only a sector of the racetrack is made visible, the condition may not hold true because of the correlated dots. However, if two sectors located diametrically opposite to each other are displayed then . Figure 13(b) shows χ vs. offset for the two cases: type1 when only a single 90° sector is made visible, and type2 when two sectors located diametrically opposite to each other, and each 45° in size, are displayed. Interestingly the model is still robust enough to the offset even when only a sector of the racetrack is displayed, irrespective of whether it is type1 or type2.

Effect of displaying only a sector

Figure 14(a) shows the effect of displaying only a sector of the complete annulus on human observers. Two cases are considered. In type1, a single sector is shown that is randomly positioned. In type2, two sectors located diametrically opposite to each other, and each half the size of the sector in type1, are displayed. It is seen that χ increases monotonically as the sector size increases. It is interesting to note that there is a significant difference in χ for the two cases, even though the total area displayed is the same in the two cases. The corresponding data for the model is shown in Figure 14(b). The model shows an increase in χ with sector size. However, there is no difference between type1 and type2 for the model.

Download:

Figure 14. χ vs. sector.

In case of type 1 only one sector is displayed, whereas in case of type 2 two sectors located diametrically opposite to each other, and each half the size of sector in type 1, are displayed. (a) human performance (b) model performance.

https://doi.org/10.1371/journal.pone.0004536.g014

Effect of inserting random frames

Figure 15 shows effect of inserting K random frames between every pair of correlated frames in the stimulus. It is seen that observer performance does not fall to zero abruptly, but decreases in a graceful manner showing that the human visual system takes multiples frames into consideration when estimating motion. The model performance also does not fall to zero abruptly, but degrades much more rapidly than human performance.

Download:

Figure 15. Effect of inserting K random frames between correlated frames.

c = 0.5, fd = 30 ms, dd = 5 dots/degrees², ic = 7°.

https://doi.org/10.1371/journal.pone.0004536.g015

Dipoles

Instead of displaying dots in an annulus, each dot can be split into two dots – one black and one white forming a dipole. This results in what has been termed as the anti-Glass pattern [51]. The c = 0 case creates a powerful motion illusion that has been previously investigated [35]. The addition of dipoles introduces several new parameters:

the dipole spacing s,
the black to white intensity ratio bwir defined as where I₀, I_b, I_w are luminance of background (fixed at 10.8 cd/m²), black and white dots respectively,
the dipole orientation: tangential or radial as in Figure 16

Download:

Figure 16. (a) tangential dipoles (b) radial dipoles.

Center-to-center spacing = 17′ in both cases.

https://doi.org/10.1371/journal.pone.0004536.g016

Complex patterns of motion are perceived with dipoles in the display, e.g., if dipoles are oriented radially there is tendency to observe radial pulsating motion, even if dipoles are actually rotating with significant rotary motion. Depending upon the parameter settings, motion in opposite directions is also seen. It can become difficult to assign a single motion direction to the whole display, although there is no doubt that there is motion in it. Let RC (reverse contrast) ON denote the setting that if a dipole is correlated, then black changes to white and vice-versa in the next frame. With RC ON, the perception of motion can switch from normal phi to reverse phi depending upon dipole spacing. This section reports results of an experiment investigating χ vs. bwir with center-to-center spacing equal to six minutes, c = 0, and RC ON. The results are summarised in Figure 17. As can be seen the model is able to capture some aspects of psychophysical behavior, but not all of it. In Figure 17, the definition of χ is modified as follows. Let χ₊ denote maximum value of normalized cross correlation function, and χ₋ denote minimum value of normalized cross correlation function. If |χ₊|>|χ₋|, χ = χ₊, otherwise χ = χ₋. When |χ₊|>|χ₋|, observer perceives motion in the direction of displacement of correlated dots and therefore χ is defined to be equal to χ₊ in this case. By similar reasoning, χ is defined to be χ₋ for the other case.

Download:

Figure 17. χ vs. bwir (black to white intensity ratio).

Dipole spacing = 6′, c = 0.5, dd = 2.5 dots/degrees², RC ON, (a) human observers (b) Watson Ahumada model with simulations at 256×256 pixel resolution.

https://doi.org/10.1371/journal.pone.0004536.g017

Discussion

Although this paper shows that the Watson Ahumada motion detector does a good job at detecting motion in random dot kinematograms (RDKs) consonant with human psychophysical performance, it remains to be seen how well it would perform on real world imagery. The challenge here is that although it is straightforward to run the model on real world test cases, how do we accurately measure the optical flow perceived by humans on these test cases? Computer vision papers characterise optical flow performance of a model by either using synthetic image sequences designed to mimic the real world, or using real world image sequences in which the motion of the camera is carefully calibrated [52]. However, as we have already seen in this paper: 1. the same sequence of image frames can produce different perception depending on frame rate, 2. the human visual system takes multiple frames into consideration when determining motion. In the light of these remarks, it is not immediately obvious what the ground truth optical flow (ground truth being defined as the flow perceived by a human) would be for the test cases mentioned above. These caveats should be borne in mind while attempting a performance characterisation of the Watson Ahumada model using the computer vision paradigm. I have placed some preliminary work running the Watson Ahumada model on real world imagery online as a proof of concept [41].

The neurophysiological plausibility of a model is likely to attract heavy debate. Krekelberg (2008) has provided a comprehensive discussion on the biological plausibility of the Reichardt, Adelson Bergen, and gradient based motion detectors [13]. With respect to the Watson Ahumada model, DeAngelis et. al. (1995) and others have found that the Watson Ahumada filters provide an accurate model of simple cell receptive fields (RFs) [44]. Quoting DeAngelis et. al. [53]:

“Rather, simple cell RFs in the joint space-time domain appear to be fit well by a model first proposed by Watson and Ahumada … Based on the Watson-Ahumada formulation, we have modelled space-time RFs of simple cells, as the weighted sum of two space-time separable subunits in a quadrature relationship. This model formulation provides a remarkably good fit to the data from most cells, regardless of their degree of space-time inseparability … In conclusion, to account for space-time RFs of simple cells that differ widely in the degree of space-time inseparability, at least two separable subunits appear necessary as modelled by Watson and Ahumada.”

Although there are similarities between the Watson Ahumada motion detector and the Adelson Bergen motion detector, which is usually the de facto motion detection mechanism used in studies of visual motion perception, there are also some differences. The Adelson Bergen motion detector measures how much power the stimulus has within a spatiotemporal frequency band. Thus a detector tuned to (ω_x0,ω_y0,ω_t0) effectively samples the power spectrum of the stimulus within the vicinity of (ω_x0,ω_y0,ω_t0). Such detectors have been proposed as models of V1 complex cells [1], [21]. The responses of multiple such detectors tuned to different spatiotemporal frequencies are pooled to determine the best fitting plane in the frequency domain [18], [19]. The best fitting plane defines the motion of the stimulus [17]. This processing, although still debatable, is believed to occur in MT. In contrast, with respect to the Watson Ahumada motion detector, information about the motion of the stimulus is encoded in the (most dominant) temporal frequency of oscillation of detector response as per the equation:

The temporal frequencies of oscillation of different detectors tuned to different (ω_x,ω_y) are measured, and then above relationship is used to determine the motion of the stimulus. The neural locus of the stages that perform this computation is unclear. Also unclear is the relationship of the model to what we do know about motion processing in the brain beyond the first stage of spatiotemporal filtering. For example, the model does not state how simple V1 neuron outputs could be combined to generate speed tuned V1 complex and MT cells [54], [55]. Perrone (2005) has put forward a model that explains how the magnitude of the Fourier transform of simple V1 neuron responses can be combined to generate the magnitude of the Fourier transform of a speed tuned neuron [56]. The input V1 neurons that Perrone's model uses are based on the Watson Ahumada filters.

It may be worthwhile to mention that the Watson Ahumada model has been proposed as a model of primary motion sensing mechanisms, what Cavanagh (1991) called passive motion detectors in his paper [57]. The human visual system is a complex parallel distributed system in which modules interact with each other and do not function in isolation, e.g., it is widely accepted now that motion perception interacts with form perception, a view that was not always held in this field. The interactions between modules can give rise to phenomenon that cannot be explained by either module alone. Benton, O'Brien, & Curran (2007) have recently provided example of a fractal rotation stimulus in which rotation is perceived within any arbitrary window applied to the stimulus [58]. The authors assert that the fact that observers can readily perceive fractal rotation is a clear example of a stimulus in which motion extraction is dependent upon the prior analysis of some spatial property (which happens to be the orientation in case of fractal rotation). The omega effect itself is believed to occur because of interactions between form and motion processing circuits in the brain. Although there are growing examples of such stimuli that point to interactions between form and motion, little is known about how these interactions occur. To my knowledge no quantitative model has been put forward to explain these interactions.

In conclusion, the contribution of this paper is to present a performance characterisation of the Watson Ahumada model of human visual motion sensing. The model performance is seen to match human performance with respect to most parameters. It is able to explain some key and important parts of the psychophysical data such as independence of observer performance to dot density in the display, and decrease of observer performance with frame duration of the display. The model insensitivity to the center position relative to which rotary motion is computed, together with the vector analysis presented in the paper, may explain the experimentally observed position invariance of MST(d) cells. In addition, this paper shows that the omega effect of Rose & Blake (1998) is a truly bistable illusion. Although the display of random dots triggers perception of rotary motion, the direction of motion perceived is independent of what dot pattern is shown. The time interval between spontaneous reversals in perceived direction is lognormally distributed as is the case for most bistable illusions. Therefore the processes that give rise to this illusion may be the same processes that underlie much of other bistable phenomenon.

Supporting Information

Movie S1.

A movie of the racetrack for c = 0.

https://doi.org/10.1371/journal.pone.0004536.s001

(8.63 MB AVI)

Movie S2.

A movie of the racetrack for c = 0.3.

https://doi.org/10.1371/journal.pone.0004536.s002

(8.02 MB AVI)

Movie S3.

A movie of the racetrack for c = 0.5.

https://doi.org/10.1371/journal.pone.0004536.s003

(8.94 MB AVI)

Acknowledgments

This work was done when I was a graduate student in Donald A. Glaser's lab at UC Berkeley. I thank him and T. Kumar for the financial support and help which made this work possible. In particular Donald A. Glaser is credited with naming the stimulus described in this paper as the racetrack, and for introducing me to the c = 0 case, which he invented to model the illusory motion perceived in Leviant's Enigma [59]. T. Kumar provided guidance in various aspects of this research, especially the planning of the psychophysical experiments and recruitment of subjects. I thank my friend David Gelbart for proof-reading this paper.

This paper is dedicated to Pink Floyd → The Division Bell → High Hopes.

Author Contributions

Conceived and designed the experiments: SJ. Performed the experiments: SJ. Analyzed the data: SJ. Wrote the paper: SJ.

References

1. Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. J Opt Soc Am A 2(2): 284–299.
- View Article
- Google Scholar
2. van Santen JPH, Sperling G (1985) Elaborated Reichardt detectors. J Opt Soc Am A 2(2): 300–321.
- View Article
- Google Scholar
3. Watson AB, Ahumada A (1985) Model of human visual-motion sensing. J Opt Soc Am A 2(2): 322–342.
- View Article
- Google Scholar
4. Johnston A, McOwan PW, Buxton H (1992) A computational model of the analysis of some first-order and second-order motion patterns by simple and complex cells. Proc Roy Soc B 250: 297–306.
- View Article
- Google Scholar
5. Perrone JA, Thiele A (2002) A model of speed tuning in MT neurons. Vis Res 42(8): 1035–51.
- View Article
- Google Scholar
6. Ullman S (1979) The Interpretation of Visual Motion. MIT Press.
7. Dawson MRW (1991) The How and Why of What Went Where in Apparent Motion: Modeling Solutions to the Motion Correspondence Problem. Psychological Review 98(4): 569–603.
- View Article
- Google Scholar
8. Williams DW, Sekuler R (1984) Coherent Global Motion Percepts from Stochastic Local Motions. Vis Res 24(1): 55–62.
- View Article
- Google Scholar
9. Graziano MSA, Andersen RA, Snowden RJ (1994) Tuning of MST Neurons to Spiral Motions. J Neurosci 14(1): 54–67.
- View Article
- Google Scholar
10. Rose D, Blake R (1998) Motion Perception: From Phi to Omega. Proc Roy Soc B 353(1371): 967–980.
- View Article
- Google Scholar
11. Shermer M (2008) Patternicity. Sci Am 299(6): 48.
- View Article
- Google Scholar
12. Green DM, Swets JA (1966) Signal detection theory and psychophysics. New York: Wiley.
13. Krekelberg B (2008) Motion detection mechanisms. In: Allan Basbaum, editor. The Senses: A Comprehensive Reference. Oxford: Elsevier Inc.
14. Born RT, Bradley DC (2005) Structure and function of visual area MT. Ann Rev Neurosci 28: 157–189.
- View Article
- Google Scholar
15. Derrington AM, Allen HA, Delicato DS (2004) Visual mechanisms of motion analysis and motion perception. Ann Rev Psych 55: 181–205.
- View Article
- Google Scholar
16. Grzywacz NM, Merwine DK (2003) Neural basis of motion perception. Encyclopedia of Cognitive Science 3: 86–98.
- View Article
- Google Scholar
17. Watson AB, Ahumada A (1983) A look at motion in the frequency domain. Technical Report 84352, NASA Technical Memorandum.
18. Heeger DJ (1987) Model for the extraction of image flow. J Opt Soc Am A 4(8): 1455–1471.
- View Article
- Google Scholar
19. Simoncelli EP, Heeger DJ (1998) A model of neuronal responses in visual area MT. Vis Res 38(5): 743–761.
- View Article
- Google Scholar
20. Rust NC, Mante V, Simoncelli EP, Movshon JA (2006) How MT cells analyze the motion of visual patterns. Nat Neurosci 9(11): 1421–1431.
- View Article
- Google Scholar
21. Emerson RC, Bergen JR, Adelson EH (1992) Directionally selective complex cells and the computation of motion energy in cat visual cortex. Vis Res 32(2): 203–218.
- View Article
- Google Scholar
22. Tanaka K, Saito H (1989) Analysis of Motion of the Visual Field by Direction, Expansion/Contraction, and Rotation cells clustered in the Dorsal Part of the Medial Superior Temporal Area of the Macaque Monkey. J Neurophysiol 62(3): 626–641.
- View Article
- Google Scholar
23. Sakata H, Shibutani H, Ito Y, Tsurugai K, Mine S, Kusunoki M (1994) Functional properties of rotation-sensitive neurons in the posterior parietal association cortex of the monkey. Exp Brain Res 101(2): 183–202.
- View Article
- Google Scholar
24. Newsome WT, Pare EB (1988) A Selective Impairment of Motion Perception following lesions of the Middle Temporal Visual Area MT. J Neurosci 8(6): 2201–2211.
- View Article
- Google Scholar
25. Newsome WT, Britten KH, Movshon JA (1989) Neuronal Correlates of a perceptual decision. Nature 341(6237): 52–54.
- View Article
- Google Scholar
26. Barlow H, Tripathy SP (1997) Correspondence Noise and Signal Pooling in the Detection of Coherent Visual Motion. J Neurosci 17(20): 7954–7966.
- View Article
- Google Scholar
27. Scase MO, Braddick OJ, Raymond JE (1996) What is noise for the motion system? Vis Res 36(16): 2579–2586.
- View Article
- Google Scholar
28. Watamaniuk SNJ, McKee SP, Grzywacz NM (1995) Detecting a Trajectory Embedded in Random-direction Motion Noise. Vis Res 35(1): 65–77.
- View Article
- Google Scholar
29. Morgan MJ (1980) Analogue models of motion perception. Proc Roy Soc B 290(1038): 117–135.
- View Article
- Google Scholar
30. Watson AB, Ahumada A, Farrell JE (1986) Window of Visibility: a psychophysical theory of fidelity in time-sampled visual motion displays. J Opt Soc Am A 3(3): 300–307.
- View Article
- Google Scholar
31. MacKay DM (1965) Visual Noise as a tool of Research. J Gen Psychol 72: 181–197.
- View Article
- Google Scholar
32. Ross J, Badcock DR, Hayes A (2000) Coherent global motion in the absence of coherent velocity signals. Curr Biol 10: 679–682.
- View Article
- Google Scholar
33. Krekelberg B, Dannenberg S, Hoffmann KP, Bremmer F, Ross J (2003) Neural correlates of implied motion. Nature 424: 674–677.
- View Article
- Google Scholar
34. Krekelberg B, Vatakis A, Kourtzi Z (2005) Implied motion from form in the Human Visual Cortex. J Neurophysiol 94: 4373–4386.
- View Article
- Google Scholar
35. Viva MMD, Gori M, Burr DC (2006) Powerful Motion Illusion Caused by Temporal Asymmetries in ON and OFF Visual Pathways. J Neurophysiol 95: 3928–3932.
- View Article
- Google Scholar
36. Geisler WS (1999) Motion streaks provide a spatial code for motion direction. Nature 400: 65–69.
- View Article
- Google Scholar
37. Barlow H, Olshausen B (2004) Convergent evidence for the visual analysis of optic flow through anisotropic attenuation of high spatial frequencies. J Vis 4: 415–426.
- View Article
- Google Scholar
38. Perrone JA, Stone LS (1994) A model of self-motion estimation within primate extrastriate visual cortex. Vis Res 34(21): 2917–2938.
- View Article
- Google Scholar
39. Duffy CJ, Wurtz RH (1995) Response of Monkey MST Neurons to Optic Flow Stimuli with Shifted Centers of Motion. J Neurosci 15(7): 5192–5208.
- View Article
- Google Scholar
40. van der Schaaf A, van Hateren JH (1996) Modeling the Power Spectra of Natural Images: Statistics and Information. Vis Res 36(17): 2759–2770.
- View Article
- Google Scholar
41. Jain S (2007) Optical Flow or Motion Estimation using the Watson Ahumada (WA) algorithm. http://www.codeproject.com/KB/audio-video/Optical_Flow_Estimation.aspx.
42. Anonymous survey available at http://purl.oclc.org/NET/survey.
43. Riani M, Simonotto E (1994) Stochastic Resonance in the Perceptual Interpretation of Ambiguous figures: A Neural Network Model. Phys Rev Lett 72(19): 3120–3123.
- View Article
- Google Scholar
44. DeAngelis DC, Ohzawa I, Freeman RD (1995) Receptive-field dynamics in the central visual pathways. Trends Neurosci 18: 451–458.
- View Article
- Google Scholar
45. Benzi R, Sutera A, Vulpani A (1981) The mechanism of stochastic resonance. J Phys A 14: 453–457.
- View Article
- Google Scholar
46. Gammaitoni L, Hanggi P, Jung P, Marchesoni F (1998) Stochastic Resonance. Rev Mod Phys 70(1): 223–287.
- View Article
- Google Scholar
47. Moss F, Ward LM, Sannita WG (2004) Stochastic Resonance and sensory information processing: a tutorial and review of application. Clinical Neurophysiology 115: 267–281.
- View Article
- Google Scholar
48. Grzywacz NM, Watamaniuk SNJ, McKee SP (1995) Temporal Coherence Theory for the detection and measurement of Visual Motion. Vis Res 35(22): 3183–3203.
- View Article
- Google Scholar
49. Anstis SM (1970) Phi movement as a subtractive process. Vis Res 10: 1411–1430.
- View Article
- Google Scholar
50. Anstis SM, Rogers BJ (1975) Illusory reversal of visual depth and movement during changes of contrast. Vis Res 15: 957–961.
- View Article
- Google Scholar
51. Burr D, Ross J (2006) The effects of opposite-polarity dipoles on the detection of Glass patterns. Vis Res 46: 1139–1144.
- View Article
- Google Scholar
52. McCane B, Novins K, Crannitch D, Galvin B (2001) On Benchmarking Optical Flow. Computer Vision and Image Understanding 84: 126–143.
- View Article
- Google Scholar
53. DeAngelis GC, Ohzawa I, Freeman RD (1996) Reply to TINS letter to the Editor by Wang et. al. Trends in Neuroscience 19: 386.
- View Article
- Google Scholar
54. Perrone JA, Thiele A (2001) Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nat Neurosci 4(5): 526–532.
- View Article
- Google Scholar
55. Priebe NJ, Lisberger SG, Movshon JA (2006) Tuning for Spatiotemporal Frequency and Speed in Directionally Selective Neurons of Macaque Striate Cortex. J Neurosci 26(11): 2941–50.
- View Article
- Google Scholar
56. Perrone JA (2005) Economy of scale: A motion sensor with variable speed tuning. J Vis 5(1): 28–33.
- View Article
- Google Scholar
57. Cavanagh P (1991) Short-range vs. long-range motion: Not a valid distinction. Spatial Vision 5: 303–309.
- View Article
- Google Scholar
58. Benton CP, O'Brien JMD, Curran W (2007) Fractal rotation isolates mechanisms for form-dependent motion in human vision. Biol Lett 3: 306–308.
- View Article
- Google Scholar
59. Kumar T, Glaser DA (2006) Illusory motion in Enigma: A psychophysical investigation. PNAS 103(6): 1947–1952.
- View Article
- Google Scholar

[ref1] 1. Adelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. J Opt Soc Am A 2(2): 284–299.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. van Santen JPH, Sperling G (1985) Elaborated Reichardt detectors. J Opt Soc Am A 2(2): 300–321.
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Watson AB, Ahumada A (1985) Model of human visual-motion sensing. J Opt Soc Am A 2(2): 322–342.
View Article
Google Scholar

[8] View Article

[9] Google Scholar

[ref4] 4. Johnston A, McOwan PW, Buxton H (1992) A computational model of the analysis of some first-order and second-order motion patterns by simple and complex cells. Proc Roy Soc B 250: 297–306.
View Article
Google Scholar

[11] View Article

[12] Google Scholar

[ref5] 5. Perrone JA, Thiele A (2002) A model of speed tuning in MT neurons. Vis Res 42(8): 1035–51.
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref6] 6. Ullman S (1979) The Interpretation of Visual Motion. MIT Press.

[ref7] 7. Dawson MRW (1991) The How and Why of What Went Where in Apparent Motion: Modeling Solutions to the Motion Correspondence Problem. Psychological Review 98(4): 569–603.
View Article
Google Scholar

[18] View Article

[19] Google Scholar

[ref8] 8. Williams DW, Sekuler R (1984) Coherent Global Motion Percepts from Stochastic Local Motions. Vis Res 24(1): 55–62.
View Article
Google Scholar

[21] View Article

[22] Google Scholar

[ref9] 9. Graziano MSA, Andersen RA, Snowden RJ (1994) Tuning of MST Neurons to Spiral Motions. J Neurosci 14(1): 54–67.
View Article
Google Scholar

[24] View Article

[25] Google Scholar

[ref10] 10. Rose D, Blake R (1998) Motion Perception: From Phi to Omega. Proc Roy Soc B 353(1371): 967–980.
View Article
Google Scholar

[27] View Article

[28] Google Scholar

[ref11] 11. Shermer M (2008) Patternicity. Sci Am 299(6): 48.
View Article
Google Scholar

[30] View Article

[31] Google Scholar

[ref12] 12. Green DM, Swets JA (1966) Signal detection theory and psychophysics. New York: Wiley.

[ref13] 13. Krekelberg B (2008) Motion detection mechanisms. In: Allan Basbaum, editor. The Senses: A Comprehensive Reference. Oxford: Elsevier Inc.

[ref14] 14. Born RT, Bradley DC (2005) Structure and function of visual area MT. Ann Rev Neurosci 28: 157–189.
View Article
Google Scholar

[35] View Article

[36] Google Scholar

[ref15] 15. Derrington AM, Allen HA, Delicato DS (2004) Visual mechanisms of motion analysis and motion perception. Ann Rev Psych 55: 181–205.
View Article
Google Scholar

[38] View Article

[39] Google Scholar

[ref16] 16. Grzywacz NM, Merwine DK (2003) Neural basis of motion perception. Encyclopedia of Cognitive Science 3: 86–98.
View Article
Google Scholar

[41] View Article

[42] Google Scholar

[ref17] 17. Watson AB, Ahumada A (1983) A look at motion in the frequency domain. Technical Report 84352, NASA Technical Memorandum.

[ref18] 18. Heeger DJ (1987) Model for the extraction of image flow. J Opt Soc Am A 4(8): 1455–1471.
View Article
Google Scholar

[45] View Article

[46] Google Scholar

[ref19] 19. Simoncelli EP, Heeger DJ (1998) A model of neuronal responses in visual area MT. Vis Res 38(5): 743–761.
View Article
Google Scholar

[48] View Article

[49] Google Scholar

[ref20] 20. Rust NC, Mante V, Simoncelli EP, Movshon JA (2006) How MT cells analyze the motion of visual patterns. Nat Neurosci 9(11): 1421–1431.
View Article
Google Scholar

[51] View Article

[52] Google Scholar

[ref21] 21. Emerson RC, Bergen JR, Adelson EH (1992) Directionally selective complex cells and the computation of motion energy in cat visual cortex. Vis Res 32(2): 203–218.
View Article
Google Scholar

[54] View Article

[55] Google Scholar

[ref22] 22. Tanaka K, Saito H (1989) Analysis of Motion of the Visual Field by Direction, Expansion/Contraction, and Rotation cells clustered in the Dorsal Part of the Medial Superior Temporal Area of the Macaque Monkey. J Neurophysiol 62(3): 626–641.
View Article
Google Scholar

[57] View Article

[58] Google Scholar

[ref23] 23. Sakata H, Shibutani H, Ito Y, Tsurugai K, Mine S, Kusunoki M (1994) Functional properties of rotation-sensitive neurons in the posterior parietal association cortex of the monkey. Exp Brain Res 101(2): 183–202.
View Article
Google Scholar

[60] View Article

[61] Google Scholar

[ref24] 24. Newsome WT, Pare EB (1988) A Selective Impairment of Motion Perception following lesions of the Middle Temporal Visual Area MT. J Neurosci 8(6): 2201–2211.
View Article
Google Scholar

[63] View Article

[64] Google Scholar

[ref25] 25. Newsome WT, Britten KH, Movshon JA (1989) Neuronal Correlates of a perceptual decision. Nature 341(6237): 52–54.
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref26] 26. Barlow H, Tripathy SP (1997) Correspondence Noise and Signal Pooling in the Detection of Coherent Visual Motion. J Neurosci 17(20): 7954–7966.
View Article
Google Scholar

[69] View Article

[70] Google Scholar

[ref27] 27. Scase MO, Braddick OJ, Raymond JE (1996) What is noise for the motion system? Vis Res 36(16): 2579–2586.
View Article
Google Scholar

[72] View Article

[73] Google Scholar

[ref28] 28. Watamaniuk SNJ, McKee SP, Grzywacz NM (1995) Detecting a Trajectory Embedded in Random-direction Motion Noise. Vis Res 35(1): 65–77.
View Article
Google Scholar

[75] View Article

[76] Google Scholar

[ref29] 29. Morgan MJ (1980) Analogue models of motion perception. Proc Roy Soc B 290(1038): 117–135.
View Article
Google Scholar

[78] View Article

[79] Google Scholar

[ref30] 30. Watson AB, Ahumada A, Farrell JE (1986) Window of Visibility: a psychophysical theory of fidelity in time-sampled visual motion displays. J Opt Soc Am A 3(3): 300–307.
View Article
Google Scholar

[81] View Article

[82] Google Scholar

[ref31] 31. MacKay DM (1965) Visual Noise as a tool of Research. J Gen Psychol 72: 181–197.
View Article
Google Scholar

[84] View Article

[85] Google Scholar

[ref32] 32. Ross J, Badcock DR, Hayes A (2000) Coherent global motion in the absence of coherent velocity signals. Curr Biol 10: 679–682.
View Article
Google Scholar

[87] View Article

[88] Google Scholar

[ref33] 33. Krekelberg B, Dannenberg S, Hoffmann KP, Bremmer F, Ross J (2003) Neural correlates of implied motion. Nature 424: 674–677.
View Article
Google Scholar

[90] View Article

[91] Google Scholar

[ref34] 34. Krekelberg B, Vatakis A, Kourtzi Z (2005) Implied motion from form in the Human Visual Cortex. J Neurophysiol 94: 4373–4386.
View Article
Google Scholar

[93] View Article

[94] Google Scholar

[ref35] 35. Viva MMD, Gori M, Burr DC (2006) Powerful Motion Illusion Caused by Temporal Asymmetries in ON and OFF Visual Pathways. J Neurophysiol 95: 3928–3932.
View Article
Google Scholar

[96] View Article

[97] Google Scholar

[ref36] 36. Geisler WS (1999) Motion streaks provide a spatial code for motion direction. Nature 400: 65–69.
View Article
Google Scholar

[99] View Article

[100] Google Scholar

[ref37] 37. Barlow H, Olshausen B (2004) Convergent evidence for the visual analysis of optic flow through anisotropic attenuation of high spatial frequencies. J Vis 4: 415–426.
View Article
Google Scholar

[102] View Article

[103] Google Scholar

[ref38] 38. Perrone JA, Stone LS (1994) A model of self-motion estimation within primate extrastriate visual cortex. Vis Res 34(21): 2917–2938.
View Article
Google Scholar

[105] View Article

[106] Google Scholar

[ref39] 39. Duffy CJ, Wurtz RH (1995) Response of Monkey MST Neurons to Optic Flow Stimuli with Shifted Centers of Motion. J Neurosci 15(7): 5192–5208.
View Article
Google Scholar

[108] View Article

[109] Google Scholar

[ref40] 40. van der Schaaf A, van Hateren JH (1996) Modeling the Power Spectra of Natural Images: Statistics and Information. Vis Res 36(17): 2759–2770.
View Article
Google Scholar

[111] View Article

[112] Google Scholar

[ref41] 41. Jain S (2007) Optical Flow or Motion Estimation using the Watson Ahumada (WA) algorithm. http://www.codeproject.com/KB/audio-video/Optical_Flow_Estimation.aspx.

[ref42] 42. Anonymous survey available at http://purl.oclc.org/NET/survey.

[ref43] 43. Riani M, Simonotto E (1994) Stochastic Resonance in the Perceptual Interpretation of Ambiguous figures: A Neural Network Model. Phys Rev Lett 72(19): 3120–3123.
View Article
Google Scholar

[116] View Article

[117] Google Scholar

[ref44] 44. DeAngelis DC, Ohzawa I, Freeman RD (1995) Receptive-field dynamics in the central visual pathways. Trends Neurosci 18: 451–458.
View Article
Google Scholar

[119] View Article

[120] Google Scholar

[ref45] 45. Benzi R, Sutera A, Vulpani A (1981) The mechanism of stochastic resonance. J Phys A 14: 453–457.
View Article
Google Scholar

[122] View Article

[123] Google Scholar

[ref46] 46. Gammaitoni L, Hanggi P, Jung P, Marchesoni F (1998) Stochastic Resonance. Rev Mod Phys 70(1): 223–287.
View Article
Google Scholar

[125] View Article

[126] Google Scholar

[ref47] 47. Moss F, Ward LM, Sannita WG (2004) Stochastic Resonance and sensory information processing: a tutorial and review of application. Clinical Neurophysiology 115: 267–281.
View Article
Google Scholar

[128] View Article

[129] Google Scholar

[ref48] 48. Grzywacz NM, Watamaniuk SNJ, McKee SP (1995) Temporal Coherence Theory for the detection and measurement of Visual Motion. Vis Res 35(22): 3183–3203.
View Article
Google Scholar

[131] View Article

[132] Google Scholar

[ref49] 49. Anstis SM (1970) Phi movement as a subtractive process. Vis Res 10: 1411–1430.
View Article
Google Scholar

[134] View Article

[135] Google Scholar

[ref50] 50. Anstis SM, Rogers BJ (1975) Illusory reversal of visual depth and movement during changes of contrast. Vis Res 15: 957–961.
View Article
Google Scholar

[137] View Article

[138] Google Scholar

[ref51] 51. Burr D, Ross J (2006) The effects of opposite-polarity dipoles on the detection of Glass patterns. Vis Res 46: 1139–1144.
View Article
Google Scholar

[140] View Article

[141] Google Scholar

[ref52] 52. McCane B, Novins K, Crannitch D, Galvin B (2001) On Benchmarking Optical Flow. Computer Vision and Image Understanding 84: 126–143.
View Article
Google Scholar

[143] View Article

[144] Google Scholar

[ref53] 53. DeAngelis GC, Ohzawa I, Freeman RD (1996) Reply to TINS letter to the Editor by Wang et. al. Trends in Neuroscience 19: 386.
View Article
Google Scholar

[146] View Article

[147] Google Scholar

[ref54] 54. Perrone JA, Thiele A (2001) Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nat Neurosci 4(5): 526–532.
View Article
Google Scholar

[149] View Article

[150] Google Scholar

[ref55] 55. Priebe NJ, Lisberger SG, Movshon JA (2006) Tuning for Spatiotemporal Frequency and Speed in Directionally Selective Neurons of Macaque Striate Cortex. J Neurosci 26(11): 2941–50.
View Article
Google Scholar

[152] View Article

[153] Google Scholar

[ref56] 56. Perrone JA (2005) Economy of scale: A motion sensor with variable speed tuning. J Vis 5(1): 28–33.
View Article
Google Scholar

[155] View Article

[156] Google Scholar

[ref57] 57. Cavanagh P (1991) Short-range vs. long-range motion: Not a valid distinction. Spatial Vision 5: 303–309.
View Article
Google Scholar

[158] View Article

[159] Google Scholar

[ref58] 58. Benton CP, O'Brien JMD, Curran W (2007) Fractal rotation isolates mechanisms for form-dependent motion in human vision. Biol Lett 3: 306–308.
View Article
Google Scholar

[161] View Article

[162] Google Scholar

[ref59] 59. Kumar T, Glaser DA (2006) Illusory motion in Enigma: A psychophysical investigation. PNAS 103(6): 1947–1952.
View Article
Google Scholar

[164] View Article

[165] Google Scholar

Figures

Abstract

Introduction

Previous Work

Materials and Methods

Model Description

Results

Omega effect: Response Reproducibility and distribution of spontaneous flips in perceived direction

Effect of dot correlation c

Effect of frame duration fd

Effect of dot density dd

Effect of spatial hop size h

Effect of inner circle diameter ic

Effect of reverse contrast

Model sensitivity to center position

Effect of displaying only a sector

Effect of inserting random frames

Dipoles

Discussion

Supporting Information

Movie S1.

Movie S2.

Movie S3.

Acknowledgments

Author Contributions

References