Interactive Multiple Object Tracking (iMOT)

Ian M. Thornton; Heinrich H. Bülthoff; Todd S. Horowitz; Aksel Rynning; Seong-Whan Lee

doi:10.1371/journal.pone.0086974

Abstract

We introduce a new task for exploring the relationship between action and attention. In this interactive multiple object tracking (iMOT) task, implemented as an iPad app, participants were presented with a display of multiple, visually identical disks which moved independently. The task was to prevent any collisions during a fixed duration. Participants could perturb object trajectories via the touchscreen. In Experiment 1, we used a staircase procedure to measure the ability to control moving objects. Object speed was set to 1°/s. On average participants could control 8.4 items without collision. Individual control strategies were quite variable, but did not predict overall performance. In Experiment 2, we compared iMOT with standard MOT performance using identical displays. Object speed was set to 2°/s. Participants could reliably control more objects (M = 6.6) than they could track (M = 4.0), but performance in the two tasks was positively correlated. In Experiment 3, we used a dual-task design. Compared to single-task baseline, iMOT performance decreased and MOT performance increased when the two tasks had to be completed together. Overall, these findings suggest: 1) There is a clear limit to the number of items that can be simultaneously controlled, for a given speed and display density; 2) participants can control more items than they can track; 3) task-relevant action appears not to disrupt MOT performance in the current experimental context.

Citation: Thornton IM, Bülthoff HH, Horowitz TS, Rynning A, Lee S-W (2014) Interactive Multiple Object Tracking (iMOT). PLoS ONE 9(2): e86974. https://doi.org/10.1371/journal.pone.0086974

Editor: Joy J. Geng, University of California, Davis, United States of America

Received: August 21, 2013; Accepted: December 18, 2013; Published: February 3, 2014

Copyright: © 2014 Thornton et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by a National Research Foundation grant funded by the Ministry of Science, ICT and Future Planning of Korea (No. 2012-005741), the WCU (World Class University) program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology (R31–10008), the Brain Korea 21 PLUS Program through the National Research Foundation of Korea funded by the Ministry of Education, and NIMH Grant R01 65576. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Psychologists have long been interested in the extent to which we can divide attention [1]–[3]. Across a wide range of experimental paradigms, the general finding has been that while it is clearly possible to allocate attention to more than one object or event, such division almost always results in performance costs, particularly when overall processing demands are high [4]–[6]. Outside of the laboratory, the requirement to divide attention during daily life appears to be ever increasing. The proliferation of mobile technology, for example, often leads to situations where a private information stream, such as a text or e-mail message, is being processed in parallel with a more public activity, such as walking in a crowded street, watching TV with friends, or even holding a face-to-face conversation. One critical situation where the limits of dividing attention become highly relevant is driving. David Strayer and colleagues, for example, have demonstrated that almost any interaction with a mobile device while driving a car can impair vehicle control and situational awareness to a level where lives are put at risk [7]–[8].

The multiple object tracking paradigm (MOT, [9]) has proven to be a very useful laboratory tool for exploring the limits of dividing attention in complex, dynamic contexts (for a review see [10]). In a typical display, observers are shown a fixed number of identical objects. Half of the objects are identified as targets, by briefly highlighting or blinking them. With the highlighting removed, the display is set in motion, with all of the (now identical) objects following random, independent trajectories. At the end of a variable tracking period, the motion stops and the observer is probed for the identity of the target set. The dependent measure is the inferred proportion of targets correctly tracked [11].

The MOT task has proven popular, at least in part, because the displays appear to capture some of the complexity that we might encounter in our day-to-day environment. With this in mind, there are two main findings of particular interest that have emerged. The first is simply that observers are actually able to do the task. That is, MOT is a very powerful demonstration that attention can be divided and controlled across multiple objects for sustained periods of time, despite the motion of the objects across the display. The second major finding is that such attentive tracking is limited to 3–5 items. While observers have no trouble perceiving the motion of dozens or even hundreds of objects, they can only track a handful (cf. [12]). Several explanations have been proposed for this limit, including a fixed set of virtual pointers [13]–[14], flexible attentional resources [15], and limitations in oscillation phase space [16].

The purpose of the current work was to explore how action might influence such limits. The relationship between action and attention has been well established. Indeed some theorists have even suggested that it is the limited capacity to act that determines attentional resources [17]–[19]. Planning an action clearly has important consequences for the deployment of attention. For example, it can facilitate processing at intended target locations [20]–[21] and can modulate the salience of object features [22]–[23] and object groupings [24]–[25].

The majority of these action-attention findings relate to the selection of single targets (i.e. focused attention). How does the need to act influence the deployment and control of divided attention? Here, we introduce a new task aimed at answering this question by exploring performance limits when individuals must interact with multiple objects in addition to simply tracking them. Our aim was both to explore the influence of action during divided attention tasks and to extend MOT to more fully capture the active dimension of day-to-day life that has thus far been ignored in laboratory studies.

interactive Multiple Object Tracking (iMOT)

The task we introduce in the current paper is illustrated in Figure 1. Similar to standard MOT tasks, it consists of a visual display in which multiple identical objects move at random. However, instead of passively tracking the targets, the goal of iMOT is to actively prevent objects from colliding. In designing this task, we wanted to exploit what we see as an interesting difference between laboratory MOT and the real world tasks that seem most directly analogous to it.

Figure 1. A typical iMOT display for a trial at Level 6 in Experiment 1.

The spheres move at random unless a path is drawn by touching and dragging away with a linear or curved movement. The white path remains visible until the path has been traversed; the object then resumes random motion. The task is to avoid collisions. The timer at the top of the screen counts down from 30 seconds to zero, and the fields along the bottom of the screen provide information about the current trial status.

https://doi.org/10.1371/journal.pone.0086974.g001

When we track objects outside the laboratory, there may be some situations in which we simply want to passively follow objects of interest while ignoring distractors, such as when watching sporting events. However, in many other situations, in addition to tracking, we must also interact with and/or control elements of our environment. To return to an earlier example, when driving in heavy traffic or approaching a busy junction, we need to track and predict the behavior of other vehicles as well as control our own position in space. In CCTV monitoring stations and air traffic control rooms, operators need to both attend to and control multiple channels. For CCTV operators, tracking a group of individuals through an environment requires selection and control of multiple cameras [26]. For air traffic control (ATC) tasks, all designated planes are relevant objects and, particularly during the approach/departure phase, specific actions are required to achieve collision-free allocation of appropriate airspace/runways [27].

The design of our experiments was directly inspired by mobile games such as Flight Control (Firemint Pty Ltd) and Harbor Master (Imangi Studios, LLC) that mimic aspects of the air traffic control task. In these games, which are typically implemented on touchscreen devices such as the iPad (Apple, Inc.), the players try to keep planes and ships from colliding, while directing them to the appropriate runways or harbors.

In designing the iMOT task, then, we moved away from asking participants to localize the target set or discriminate between targets and distractors. Instead, all objects become targets and the goal is to prevent any object from coming into contact or colliding with any other object. Participants are given control over the trajectories of the objects using the standard touch interface implemented on most mobile devices. Touching and dragging away from an object with a finger either creates a visible path for the object to follow (Experiment 1) or nudges the object in the appropriate direction (Experiments 2 & 3). We note that although the initial iMOT experiments have been carried out on an iPad, the paradigm can easily be implemented on any device with similar displays and touch-response capabilities.

iMOT Task Demands

To perform well in this task participants need to both monitor for impending collisions and to plan and execute motor interventions aimed at keeping specific objects apart. Collision detection might involve actively tracking objects, in the manner of MOT. Alternatively, there might be a more passive collision-detection system. Collision detection might be based on simple proximity, so that a signal is generated if two items get too close to one another, or a more sophisticated algorithm might take into account the speed and trajectory of items to predict potential collisions.
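The contrast between these two hypothetical detection schemes can be made concrete with a short Python sketch for a pair of discs moving at constant velocity. It is purely illustrative and is not a model proposed or tested in this paper; the function names, the proximity threshold and the look-ahead horizon are all assumptions introduced here.

```python
import math


def proximity_alarm(p1, p2, threshold):
    """Simple scheme: signal when two objects are currently closer than `threshold`."""
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1]) < threshold


def predicted_collision(p1, v1, p2, v2, radius, horizon):
    """Predictive scheme: signal when two discs of the given radius, moving on
    straight constant-velocity paths, would overlap within `horizon` time units."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]      # relative position
    wx, wy = v2[0] - v1[0], v2[1] - v1[1]      # relative velocity
    speed_sq = wx * wx + wy * wy
    if speed_sq == 0.0:
        t_closest = 0.0                        # no relative motion: distance is constant
    else:
        # Time of closest approach, clamped to the look-ahead window [0, horizon].
        t_closest = -(dx * wx + dy * wy) / speed_sq
        t_closest = min(max(t_closest, 0.0), horizon)
    min_dist = math.hypot(dx + wx * t_closest, dy + wy * t_closest)
    return min_dist < 2 * radius
```

For two objects approaching head-on from a distance, the proximity scheme stays silent until they are close, whereas the predictive scheme signals as soon as the projected paths intersect within the horizon.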

Collision detection is also involved in MOT, of course. Work by Zelinsky and Todor [28] has shown that the visual system responds proactively to potential collisions, shifting gaze to the relevant location in advance in order to help disambiguate potential collisions (Zelinsky and Todor call this behavior “rescue saccades”). However, little or no work has been done on determining “how” collisions are detected in multiple object displays. While there is some evidence that MOT involves prediction [29]–[31], at least in predictable displays [32], we do not know if this extends to collision detection.

The iMOT task differs from MOT in that the participant can actively respond to potential collisions. Consider two possible strategies that might be adopted. A participant with a reactive approach might wait until a collision seemed imminent before taking steps to avoid it. This strategy is similar to the “rescue saccades” described in MOT by Zelinsky and Todor [28]. A participant with a proactive approach, on the other hand, might try to continually modify the position of objects on the screen to maximize the distance between them, thus reducing the likelihood of collisions. As the two approaches would likely give rise to quite different touch behavior, examining intervention style should be able to shed light on which approach is more prevalent.

Although the overall goals of iMOT and MOT are quite different – collision avoidance versus target identification – in terms of task demands, collision management may be a shared requirement. That is, while the goal of iMOT is to avoid collisions, in MOT it may be useful to have a strategy to minimize the impact of collisions on tracking. Thus, the two tasks may rely on a common collision detection mechanism (active or passive), or they may not. In the current paper, we attempt to directly compare iMOT and MOT performance in the same participants as a way to initially assess whether tracking and collision avoidance involve similar cognitive processes. We should note that in making such comparisons we intend to focus purely on performance measures. That is, we do not assume, a priori, that behavior is limited by a fixed set of task-specific mechanisms [13]–[14] rather than being determined via the flexible allocation of central attentional resources [15].

As already mentioned above, one of the main aims of this paper is to explore the influence of action when attention needs to be divided. To successfully avoid collisions during iMOT, actions must be planned and effectively executed with respect to a single object at a time. Understanding whether such focused action has consequences for the ability to monitor other parts of the display should become clear by examining iMOT performance. Furthermore, by directly comparing iMOT to MOT performance in the same participants, we hope to shed light on whether these two components – action and attention – operate independently or rely on overlapping cognitive resources.

Experimental Overview

Experiment 1 was intended as “proof of concept”, to demonstrate that observers could in fact successfully perform the iMOT task. We chose a speed of object movement (1°/s) that was relatively sedate, taking into account the need for physical control, and used a staircase procedure to obtain individual thresholds and control distributions. We were interested in both the absolute number of items that could be controlled and the variability of this figure across participants. One prominent feature of MOT is the finding that, in the majority of displays, estimates of tracking capacity show little individual variability around the oft-cited limit of 4–5 items [10]. Factors that are known to modulate group mean estimates in MOT tasks, such as speed of motion [15], [33], set size and display density [34]–[37], are also discussed in Experiment 1.

In Experiment 2, we directly compared MOT and iMOT performance. Using identical displays and motion parameters, we obtained estimates of both MOT tracking performance and iMOT control performance in the same individuals. The goal was to establish the relative demands of the two tasks and to assess whether performance on MOT and iMOT appeared to be drawing on similar resources.

In Experiment 3, we used a dual-task approach to examine whether MOT and iMOT could be performed simultaneously. In single-task displays, participants either tracked or controlled four target objects. Difficulty was manipulated by changing the distractor set-size. In the critical dual-task condition, the same four objects had to be both tracked for later identification and controlled to avoid collisions. Under these conditions, we were interested in establishing how resources would be balanced between the two tasks. If MOT and iMOT relied on completely separate resources, then dual-task performance should be comparable to the single-task baselines. The presence of a dual-task deficit would indicate some overlap in processing resources.

Experiment 1

The purpose of Experiment 1 was to demonstrate that participants could successfully perform the iMOT task. We compared two groups of participants. The first consisted of young adults from Korea University in Seoul. The second consisted of young adults from Swansea University in the UK. The motivation for including this cross-cultural variable, aside from the availability of separate pools of participants, was to probe for possible differences in cognitive style. Previous research has suggested that there may be fundamental differences between East Asian and Western participants, with the former attending more to background context, and the latter to figural elements [38]–[39]. Such differences could potentially impact performance in the current task.

Previous research has also suggested that there may be sex differences in spatial selective attention [40]–[42], specifically in the context of multiple object tracking [43]. Therefore, we ensured that each group consisted of an equal number of male and female participants, and we included sex as a factor for exploratory purposes in all three experiments.

Method

Participants.

A total of 24 participants took part in this study on a voluntary basis. A group of 12 younger adults (six female and six male) aged between 18–26 years (M = 24.1, SD = 2.5) were recruited directly from members of the Brain Engineering Department at Korea University. A further group of 12 younger adults (six female and six male), aged between 19–33 years (M = 23.2, SD = 4.3), were recruited from the Psychology Department at Swansea University. All participants were asked to assess their familiarity with game-like tasks on mobile devices on a scale from 1 (no experience) to 5 (expert player). There were no differences between the Korean group (M = 3.2, SD = 1.2) and the Swansea group (M = 3.3, SD = 1.0), t<1, n.s. All participants gave written informed consent, and the methods and procedures conformed to the ethical guidelines set out by the Declaration of Helsinki for testing human participants. All aspects of the procedure were reviewed and approved by the Ethics Committee at Swansea University.

Equipment.

All experiments reported here used a first generation iPad with a screen dimension of 20×15 cm and a resolution of 1024×768 pixels. In this and all subsequent experiments, participants were instructed to hold the iPad in a standard posture: the participant cradled the iPad (in landscape orientation) in their left arm, with the fingers of their left hand grasping the furthest edge of the device. They were told to interact with the objects using the index finger of their right hand. While the viewing distance was not fixed, we estimate that it averaged approximately 50 cm from screen surface to eyes. For this reason, we report stimulus characteristics both in terms of approximate degrees of visual angle (°) and pixels. Text was set to run from left-to-right.

Experiments were run in a quiet environment under low lighting conditions with no overhead lights, in order to minimize screen glare.

Stimuli and Task.

The iMOT task was introduced to participants as a simple game in which the goal was to prevent moving objects from colliding with each other. All participants began with a display containing six objects. If they successfully controlled these objects without collision for 30 s, an additional object would be added on the subsequent trial. Any collision between two objects ended a trial. After a collision, the number of objects would be reduced but would never go below the initial level of six items. Performance was assessed over a total of 30 trials per participant. From a player's perspective, success in the game involved achieving and maintaining the highest level (i.e., greatest number of objects) possible.
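The staircase rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the app's actual code: the function name is hypothetical, and the step size of one object in each direction is an assumption based on the one-up, one-down description given in the Results.

```python
def run_staircase(outcomes, start=6, floor=6):
    """Return the set size presented on each trial.

    `outcomes` is a sequence of booleans, one per trial:
    True for a collision-free trial, False for a trial ended by a collision.
    """
    levels = []
    level = start
    for collision_free in outcomes:
        levels.append(level)
        if collision_free:
            level += 1                      # success: add one object
        else:
            level = max(floor, level - 1)   # collision: remove one, never below the floor
    return levels
```

For example, `run_staircase([True, True, False, False, False, True])` yields the set sizes 6, 7, 8, 7, 6, 6: two successes raise the level to 8, and the subsequent collisions bring it back down to the floor of six items.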

Objects were identical orange spheres with a diameter of 52 pixels (1.2°). The objects were shaded to appear lit from above. This was done to enhance the impression of 3D and help segment them from the uniform black background. At the start of each trial, the objects were distributed at equal distances around the circumference of an invisible circle centered on the iPad display. The radius of this circle was 160 pixels (3.1°). The position of spheres around the circle was determined by choosing a random starting angle for the first object and then distributing each subsequent object by adding an equidistant angular step of (360/Set Size)°. The objects were stationary for the first two seconds of the trial, and then began to expand outwards in a straight line, following an angular trajectory equivalent to their position around the circle. In the absence of participant input, each object followed this path for 200 pixels, after which a new straight line path would be selected at random. Directions were randomly sampled from the full 360° in 1° increments and the path length varied between 200 pixels (3.9°) and 300 pixels (5.9°). At all times, objects moved at a constant speed of approximately 1°/s.
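The starting layout and the random path selection can be sketched as follows. This is an illustrative Python reconstruction, not the original iPad code: the function names are hypothetical, the display center of (512, 384) is inferred from the 1024×768 resolution, path lengths are assumed to be sampled uniformly, and any handling of screen edges is omitted because it is not described here.

```python
import math
import random

CENTER = (512.0, 384.0)   # assumed center of a 1024x768 pixel display
RADIUS = 160.0            # radius of the invisible starting circle, in pixels


def initial_positions(set_size, rng, center=CENTER, radius=RADIUS):
    """Place objects at equal angular steps around a circle, from a random start angle."""
    start_angle = rng.uniform(0.0, 360.0)
    step = 360.0 / set_size
    positions = []
    for i in range(set_size):
        angle = math.radians(start_angle + i * step)
        positions.append((center[0] + radius * math.cos(angle),
                          center[1] + radius * math.sin(angle)))
    return positions


def random_path_end(position, rng):
    """Pick the end point of a new straight path from `position`.

    Direction is sampled in 1-degree increments over the full 360 degrees;
    length is sampled (here: uniformly, an assumption) between 200 and 300 pixels.
    """
    direction = math.radians(rng.randrange(360))
    length = rng.uniform(200.0, 300.0)
    return (position[0] + length * math.cos(direction),
            position[1] + length * math.sin(direction))
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps a simulated trial reproducible.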

The participant's task was to keep the objects separated by perturbing their trajectories via the touchscreen interface. Touching and dragging away from an object gave rise to a visible white path that the object would follow. In this experiment, the length and complexity of the path was not restricted. When the object reached the end of a user-defined path, it reverted to following random linear paths, as described above. Note that user input was allowed immediately at the start of the trial, that is, within the first two seconds. In these circumstances, the user-defined path would override the default linear expansion for the touched object.

In line with the idea of “game-play”, four information fields were visible to the participants during the entire trial. At the top of the screen was a time counter that reduced from 30 s to 0 s. In the bottom left corner was a collision counter and in the bottom right an indication of the current number of objects in the display. At the bottom of the screen in the center was an indication of the number of touches or interventions made during the current trial.

Procedure.

Participants were run in individual sessions. Each session began with a brief questionnaire aimed at establishing educational and work experience, gaming habits and familiarity with mobile devices. Questions were a mixture of open-ended items and rating scales designed to quantify relevant experience. This lasted approximately 5 minutes. Participants were then familiarized with the iPad and the basic display and control components of the task. They were allowed to practice with the application until they felt comfortable. This familiarization phase typically lasted less than 5 minutes, with participants completing two or three practice trials. The main experimental session then began in which participants completed a block of 30 trials, each trial lasting 30 s. At the end of each trial, a self-paced pause was allowed. Participants could wait as long as they liked before pressing a “Continue” button. In practice, few of these pauses lasted more than 10 seconds, with the entire block being completed within 20 minutes.

Analysis.

The main dependent measure in Experiment 1 was the number of objects that could be successfully controlled for 30 seconds without collision. Our analyses thus focused on the distribution of collision-free trials as a function of set size. As well as reporting the mean of these distributions, in this and all subsequent experiments, we also extracted a full range of parameters (i.e., variability, skewness, kurtosis, maximum) that might help characterize performance. These will be reported in the accompanying tables, but analysis will focus on the central tendency and the maximum level achieved. In Experiment 1, these dependent variables were analyzed using a 2 (Group: KU vs. SU) ×2 (Sex) ANOVA.
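As an illustration of the descriptive parameters listed above, the following Python sketch summarizes the set sizes of one participant's collision-free trials. The function name is hypothetical, and the use of population moments with excess kurtosis is an assumption; the specific estimators used for the tables are not stated in the text.

```python
import math


def describe(set_sizes):
    """Summarize the set sizes of collision-free trials for one participant."""
    n = len(set_sizes)
    mean = sum(set_sizes) / n
    # Central moments (population form, i.e. dividing by n).
    m2 = sum((x - mean) ** 2 for x in set_sizes) / n
    m3 = sum((x - mean) ** 3 for x in set_sizes) / n
    m4 = sum((x - mean) ** 4 for x in set_sizes) / n
    if m2 == 0.0:
        skewness = kurtosis = float("nan")   # undefined when there is no spread
    else:
        skewness = m3 / m2 ** 1.5
        kurtosis = m4 / m2 ** 2 - 3.0        # excess kurtosis (normal distribution = 0)
    return {
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": skewness,
        "kurtosis": kurtosis,
        "max": max(set_sizes),
    }
```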

We also looked at how often participants touched the objects to change their trajectories, which we termed “interventions”. We calculated the average number of interventions per collision-free trial, and fitted a line to the intervention × set size function of each participant. Both the slope of this function, and the baseline interventions with a set size of 6 items were examined. Average interventions were analyzed using a 2 (Group: KU vs. SU) ×2 (Sex) ×5 (Set Size) ANOVA, while slope and baseline measures used a 2 (Group: KU vs. SU) ×2 (Sex) ANOVA.
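The per-participant line fit can be sketched with ordinary least squares in plain Python. The function names are hypothetical, and taking the baseline as the fitted value at a set size of 6 (rather than the observed mean at that set size) is an assumption made for this illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def slope_and_baseline(set_sizes, mean_interventions, baseline_size=6):
    """Return the slope and the fitted number of interventions at `baseline_size`."""
    slope, intercept = fit_line(set_sizes, mean_interventions)
    return slope, intercept + slope * baseline_size
```

With made-up values of 25, 29, 33, 37 and 41 mean interventions at set sizes 6 to 10, this returns a slope of 4 interventions per added item and a baseline of 25.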

Finally, we looked to see whether intervention strategy had any impact on overall performance. To do this we used multiple regression to explore whether the number of items controlled could be predicted from the slope and baseline interaction measures.

Results

Figure 2 shows examples of individual staircase sessions for six participants, three from KU in the left hand column, three from SU in the right hand column. In each panel, the solid line indicates the mean and the dashed line the maximum number of items controlled for that individual. The panels are labeled so that data from the corresponding participant can be found in Tables 1 and 2.

Figure 2. Example staircase data in Experiment 1.

The left hand column shows data from three Korea University (KU) participants, the right column three Swansea University (SU) participants. The Y-axis indicates the number of items in the current trial, and each data point represents one of the 30 trials in a session. A collision-free trial always results in an increase in set size while any collision results in a decrease, except that set size was not allowed to drop below six items. The solid line shows the mean level achieved by the participant and the dotted line the maximum level. See text for more details.

https://doi.org/10.1371/journal.pone.0086974.g002

Table 1. Korea University participants from Experiment 1: Individual Parameter Estimates for Distributions of Collision-free Trials.

https://doi.org/10.1371/journal.pone.0086974.t001

Table 2. Swansea University participants from Experiment 1: Individual Parameter Estimates for Distributions of Collision-free Trials.

https://doi.org/10.1371/journal.pone.0086974.t002

In the upper row are two participants whose performance fluctuated around the lower end of the range. In the first example (KU Female 3), performance initially stays close to the starting level of 6 items, but gradually rises to fluctuate between 7 and 9 items, never exceeding this maximum level. The second example (SU Male 1) also has a maximum of 9 items, but here there is an initial rise and fall, which is repeated before performance stabilizes around 7 items in the latter half of the session. The second row of examples shows participants who were able to successfully control at least 10 items without collision. For KU Male 5, this only occurs once (at trial 16), and performance seems to stabilize for this participant at 9 items. SU Female 2 is able to control 10 items on 4 occasions, but their overall performance shows a more periodic increase and decrease. The final two examples show those participants with the highest sustained performance from the two sites.

As expected, given our one-up, one-down staircase procedure, participants were collision-free on just over half of the 30 experimental trials (Mean = 16.8; SE = 0.1). Figure 3 shows the distribution of these collision-free trials as a function of set size, collapsed across all participants. It is immediately clear that the central tendency of this distribution falls at slightly above 8 items, while the maximum number of items controlled was 13. The full range of parameters extracted from the distributions of each individual participant is summarized in Tables 1 (KU participants) and 2 (SU participants).

Figure 3. Distribution of collision-free trials in Experiment 1.

Percentage of total collision-free trials, collapsed across all participants from both sites. The maximum number of items controlled was 13 and the mean number of items controlled was 8.4.

https://doi.org/10.1371/journal.pone.0086974.g003

The mean number of items that could be controlled without collision, averaged across participants, was 8.4 (SE = 0.1) and the averaged maximum value was 10.4 (SE = 0.2). These values did not vary as a function of Sex or Group and there were no significant main effects or interactions.

Figure 4 illustrates our analyses of the number of interventions. On average, participants made just over 30 control interventions per trial (M = 34.4, SE = 1.6). However, it is clear from the distribution of symbols in Figures 4A and 4B that there were consistent individual differences in intervention strategy. For example, at the starting level of 6 items, the number of interventions across participants ranged from 13 to 41 (M = 25.1, SE = 2.0). These initial differences in intervention strategy also appear to be maintained as the number of objects in the display increases.

Figure 4. Interventions as a function of set size in Experiment 1.

Data are shown separately for Korea University (KU; Panel A) and Swansea University (SU; Panel B) participants. Data are plotted for each participant using unique symbols and mean performance is represented by the dotted line. Legend codes refer to individual Female (F) and Male (M) participants in Tables 1 and 2, respectively.

https://doi.org/10.1371/journal.pone.0086974.g004

In general, the number of interventions increased with the number of objects participants had to control. All 24 participants had positive intervention × set size slopes, with approximately 4 additional interventions occurring each time a new item was added (M = 4.4, SE = 0.4). The mean goodness of fit for these functions was relatively high (M R² = 0.8, SE = 0.1).

A 2 (Group) ×2 (Sex) ×5 (Set Size) ANOVA on the average intervention data revealed only a significant main effect of Set Size, F(4,64) = 59.6, MSE = 13.6, p<0.001, η² = 0.8. Analysis of the Slope and Baseline values revealed no main effects or interactions.

To explore whether interaction style related to collision performance, we performed a multiple regression analysis with slope and baseline as predictors and mean number of items controlled as the criterion variable. Interaction behavior appeared to contribute very little to the overall success of object control, R² = 0.1, F(2, 21) = 1.0, MSE = 3.2, n.s.

Discussion

There are several findings of interest from this experiment. First, as with standard MOT, it is clear that participants were able to divide their attention between multiple dynamic objects. Here, rather than tracking the objects to identify them, participants were able to monitor the display for impending collisions and execute appropriate actions. Thus, we have shown that attention can be divided across multiple objects in both active and passive contexts.

Second, this ability to control objects and avoid collisions was clearly limited; we found that participants could only control approximately 8 items without collisions. We do not assume that this is a hard limit on human performance on this task. As with MOT [12], [15], [34], we assume that stimulus parameters such as object speed and display density will modulate levels of performance; we will address this issue in Experiment 2. Clearly, however, given any fixed parameter set, we would expect a clear upper limit on how many objects can be controlled. In the current experiment, although there was some individual variation, the estimate of 8 items was surprisingly stable. In particular, we found no variation across experimental site, suggesting that cultural differences play little role in this task. There was no reliable difference between the sexes, although as can be seen in Tables 1 and 2, there was a trend for Male participants to outperform Female participants, a theme we return to in the next experiment.

These results bring up two questions. First, what is responsible for the eight item limit? In MOT, several explanations have been proposed for the capacity limit, including a fixed set of virtual pointers [13]–[14], flexible attentional resources [15], and limitations in oscillation phase space [16]. What might underlie the limitations on iMOT performance? One hypothesis is that iMOT is relying on the same processes that subserve MOT, and therefore whatever explains MOT limitations will explain iMOT limitations. Another possibility is that the iMOT limit is purely a product of the limitations of the motor system. A third option is that the limit is a product of an interaction between the attentional and motor systems. We will return to this question in Experiment 3.

Second, what is the role of intervention strategy? The increase in interventions with set size is easy to understand, since with increasing density, the number of potential collisions is presumably increasing. We also observed consistent individual differences in the number of interventions that were maintained across variations in task difficulty. This is consistent with the suggestion raised in the introduction that some participants may adopt a more reactive intervention strategy and others a more proactive strategy. Perhaps more surprisingly, however, we found no clear relationship between intervention style and collision performance. We return to this issue in the General Discussion.

Experiment 2

The goal of Experiment 2 was to directly compare the ability to actively control objects in iMOT with passive tracking ability as measured by MOT. A new group of Swansea students were asked to complete both tasks in separate blocks of trials. We modified the iMOT task in order to ensure that the visual characteristics of the two types of display were as similar as possible (details are given below). As in Experiment 1, a staircase procedure was used to provide individual estimates of the number of objects that could be tracked/controlled. Our main interest was in how estimates for MOT and iMOT performance would compare given identical displays. In addition to examining overall level differences, we also correlated the performance of individual participants as a first step in determining whether the two tasks appeared to draw on similar resources. We also assessed the impact of the iMOT display modifications by directly comparing performance estimates with those obtained in Experiment 1.