Firestone, C., & Scholl, B. J. (accepted target article for peer commentary). Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral and Brain Sciences.
What determines what we see? In contrast to the traditional "modular" understanding of perception, according to which visual processing is encapsulated from higher-level cognition, a tidal wave of recent research alleges that states such as beliefs, desires, emotions, motivations, intentions, and linguistic representations exert direct top-down influences on what we see. There is a growing consensus that such effects are ubiquitous, and that the distinction between perception and cognition may itself be unsustainable. We argue otherwise: none of these hundreds of studies -- either individually or collectively -- provide compelling evidence for true top-down effects on perception, or "cognitive penetrability". In particular, and despite their variety, we suggest that these studies all fall prey to only a handful of pitfalls. And whereas abstract theoretical challenges have failed to resolve this debate in the past, our presentation of these pitfalls is empirically anchored: in each case, we show not only how certain studies could be susceptible to the pitfall (in principle), but how several alleged top-down effects actually are explained by the pitfall (in practice). Moreover, these pitfalls are perfectly general, with each applying to dozens of other top-down effects. We conclude by extracting the lessons provided by these pitfalls into a checklist that future work could use to convincingly demonstrate top-down effects on visual perception. The discovery of substantive top-down effects of cognition on perception would revolutionize our understanding of how the mind is organized; but without addressing these pitfalls, no such empirical report will license such exciting conclusions.
Raila, H., Scholl, B. J., & Gruber, J. (2015). Seeing the world through rose-colored glasses: People who are happy and satisfied with life preferentially attend to positive stimuli. Emotion, 15(4), 449-462.
Given the many benefits conferred by trait happiness and life satisfaction, a primary goal is to determine how these traits relate to underlying cognitive processes. For example, visual attention acts as a gateway to awareness, raising the question of whether happy and satisfied people attend to (and therefore see) the world differently. Previous work suggests that biases in selective attention are associated with both trait negativity and with positive affect states, but to our knowledge no previous work has explored whether trait happy individuals attend to the world differently. Here, we employed eye tracking as a continuous measure of sustained overt attention during passive viewing of displays containing positive and neutral photographs, to determine whether selective attention to positive scenes is associated with measures of trait happiness and life satisfaction. Both trait measures were significantly correlated with selective attention for positive (vs. neutral) scenes, and this general pattern was robust across several types of positive stimuli (achievement, social, and primary reward), and not due to positive or negative state affect. Such effects were especially prominent during the later phases of sustained viewing. This suggests that people who are happy and satisfied with life may literally see the world in a more positive light, as if through rose-colored glasses. Future work should investigate the causal relationship between such attention biases and one's happiness and life satisfaction.
Ward, E. J., & Scholl, B. J. (2015). Stochastic or systematic?: Seemingly random perceptual switching in bistable events triggered by transient unconscious cues. Journal of Experimental Psychology: Human Perception & Performance, 41(4), 929-939.
What we see is a function not only of incoming stimulation, but of unconscious inferences in visual processing. Among the most powerful demonstrations of this are bistable events, but what causes the percepts of such events to switch? Beyond voluntary effort and stochastic processing, we explore the ways in which ongoing dynamic percepts may switch as a function of the content of brief, unconscious, independent cues. We introduced transient disambiguating occlusion cues into the Spinning Dancer silhouette animation. The dancer is bistable in terms of depth and rotation direction, but many observers see extended rotation in the same direction, interrupted only rarely by involuntary switches. Observers failed to notice these occasional disambiguating cues, but their impact was strong and systematic: cues typically led to seemingly stochastic perceptual switches shortly thereafter, especially when conflicting with the current percept. These results show how the content of incoming information determines and constrains online conscious perception -- even when neither the content nor the brute existence of that information ever reaches awareness. Thus, just as phenomenological ease does not imply a corresponding lack of underlying effortful computation, phenomenological randomness should not be taken to imply a corresponding lack of underlying systematicity.
Firestone, C., & Scholl, B. J. (2015). Can you experience 'top-down' effects on perception?: The case of race categories and perceived lightness. Psychonomic Bulletin & Review, 22(3), 694-700.
A recent surge of research has revived the notion that higher-level cognitive states such as beliefs, desires, and categorical knowledge can directly change what we see. The force of such claims, however, has been undercut by an absence of visually apparent demonstrations of the form so often appealed to in vision science: such effects may be revealed by statistical analyses of observers' responses, but you cannot literally experience the alleged top-down effects yourself. A singular exception is an influential report that racial categorization alters the perceived lightness of faces, a claim that was bolstered by a striking visual demonstration that Black faces appear darker than White faces, even when matched for mean luminance. Here, we show that this visually compelling difference is explicable in terms of purely low-level factors. Observers who viewed heavily blurred versions of the original Black and White faces still judged the Black face to be darker and the White face to be lighter even when these observers could not perceive the races of the faces, and even when they explicitly judged the faces to be of the same race. We conclude that the best subjectively appreciable evidence for top-down influences on perception does not reflect a genuinely top-down effect after all: instead, such effects arise from more familiar (if subtle) bottom-up factors within visual processing.
Ward, E. J., & Scholl, B. J. (2015). Inattentional blindness reflects limitations on perception, not memory: Evidence from repeated failures of awareness. Psychonomic Bulletin & Review, 22(3), 722-727.
Perhaps the most striking phenomenon of visual awareness is inattentional blindness (IB), in which a surprisingly salient event right in front of you may go completely unseen when unattended. Does IB reflect a failure of perception, or only of subsequent memory? Previous work has been unable to answer this question, due to a seemingly intractable dilemma: ruling out memory requires immediate perceptual reports, but soliciting such reports fuels an expectation that eliminates IB. Here we introduce a way of evoking repeated IB: we show that observers fail to report seeing salient events not only when they have no expectation, but also when they have the wrong expectations about the events' nature. This occurs when observers must immediately report seeing anything unexpected, even mid-event. Repeated IB thus demonstrates that IB is aptly named: it reflects a genuine deficit in moment-by-moment conscious perception, rather than a form of inattentional amnesia.
Firestone, C., & Scholl, B. J. (2015). Enhanced visual awareness for morality and pajamas?: Perception vs. memory in 'top-down' effects. Cognition, 136, 409-416.
A raft of prominent findings has revived the notion that higher-level cognitive factors such as desire, meaning, and moral relevance can directly affect what we see. For example, under conditions of brief presentation, morally relevant words reportedly "pop out" and are easier to identify than morally irrelevant words. Though such results purport to show that perception itself is sensitive to such factors, much of this research instead demonstrates effects on visual recognition -- which necessarily involves not only visual processing per se, but also memory retrieval. Here we report three experiments which suggest that many alleged top-down effects of this sort are actually effects on 'back-end' memory rather than 'front-end' perception. In particular, the same methods used to demonstrate popout effects for supposedly privileged stimuli (such as morality-related words, e.g. "punishment" and "victim") also yield popout effects for unmotivated, superficial categories (such as fashion-related words, e.g. "pajamas" and "stiletto"). We conclude that such effects reduce to well-known memory processes (in this case, semantic priming) that do not involve morality, and have no implications for debates about whether higher-level factors influence perception. These case studies illustrate how it is critical to distinguish perception from memory in alleged 'top-down' effects.
Strickland, B., & Scholl, B. J. (2015). Visual perception involves 'event type' representations: The case of containment vs. occlusion. Journal of Experimental Psychology: General, 144(3), 570-580.
Recent infant cognition research suggests that our core knowledge involves "event type" representations: during perception, the mind automatically categorizes physical events into broad types (e.g. occlusion and containment), which then guide attention to different properties (e.g. with width processed at a younger age than height in containment events, but not occlusion events). We tested whether this aspect of infant cognition also structures adults' visual processing. In 6 experiments, adults had to detect occasional changes in ongoing dynamic displays that depicted repeating occlusion or containment events. Mirroring the developmental progression, change detection was better for width vs. height changes in containment events, but no such difference was found for otherwise equivalent occlusion events -- even though most observers were not even aware of the subtle occlusion/containment difference. These results suggest for the first time that event-type representations operate automatically and unconsciously as part of the underlying currency of adult visual cognition.
Liverence, B. M., & Scholl, B. J. (2015). Object persistence enhances spatial navigation: A case study in smartphone vision science. Psychological Science, 26(7), 955-963.
Violations of spatiotemporal continuity disrupt performance in many tasks involving attention and working memory, but such experiments have been limited to the study of moment-by-moment online perception, typically assessed by passive monitoring tasks. Here we ask whether persisting object representations also serve as underlying units of longer-term memory and active spatial navigation, using a novel paradigm inspired by the visual interfaces common to many smartphones. Participants used keypresses to navigate through simple visual environments constituted by grids of "icons" (depicting real-world objects), only one of which was visible at a time through a static virtual "window". Participants found target icons faster when navigation involved persistence cues (via "sliding" animations) compared to when persistence was disrupted (e.g. via temporally matched "fading" animations), with all transitions modeled on smartphone interfaces. Moreover, this difference occurred even after explicit memorization, demonstrating that object persistence enhances spatial navigation in an automatic and irresistible fashion.
Chen, Y.-C., & Scholl, B. J. (2014). Seeing and liking: Biased perception of ambiguous figures consistent with the 'inward bias' in aesthetic preferences. Psychonomic Bulletin & Review, 21(6), 1444-1451.
Aesthetic preferences are ubiquitous in visual experience. Indeed, it seems nearly impossible in many circumstances to perceive a scene without also liking or disliking it to some degree. Aesthetic factors are only occasionally studied in mainstream vision science, though, and even then they are often treated as functionally independent from other aspects of perception. In contrast, the present study explores the possibility that aesthetic preferences may interact with other types of visual processing. We were inspired, in particular, by the inward bias in aesthetic preferences: when an object with a salient "front" is placed near the border of a frame (say, in a photograph), observers tend to find the image more aesthetically pleasing if the object faces inward (toward the center) than if it faces outward (away from the center). We employed similar stimuli, except that observers viewed framed figures that were ambiguous in terms of the direction they appeared to be facing. The resulting percepts were influenced by the frames in a way that corresponded to the inward bias: when a figure was placed near a frame's border, observers tended to see whichever interpretation was facing inward. This effect occurred for both abstract geometric figures (e.g. ambiguously-oriented triangles) and meaningful line drawings (e.g. left-facing ducks or right-facing rabbits). The match between this new influence on ambiguous figure perception and the previously studied aesthetic bias suggests new ways in which aesthetic factors may relate not only to what we like, but also to what we see in the first place.
De Freitas, J., Liverence, B. M., & Scholl, B. J. (2014). Attentional rhythm: A temporal analogue of object-based attention. Journal of Experimental Psychology: General, 143(1), 71-76.
The underlying units of attention are often discrete visual objects. Perhaps the clearest form of evidence for this is the same-object advantage: following a spatial cue, responses are faster to probes occurring on the same object than they are to probes occurring on other objects, while equating brute distance. Is this a fundamentally spatial effect, or can same-object advantages also occur in time? We explored this question using independently-normed rhythmic temporal sequences, structured into phrases, and presented either visually or auditorily. Detection was speeded when cues and probes both lay within the same rhythmic phrase, compared to when they spanned a phrase boundary, while equating brute duration. This same-phrase advantage suggests that object-based attention is a more general phenomenon than has been previously suspected: perceptual structure constrains attention, in both space and time, and in both vision and audition.
Firestone, C., & Scholl, B. J. (2014b). 'Please tap the shape, anywhere you like': Shape skeletons in human vision revealed by an exceedingly simple measure. Psychological Science, 25(2), 377-386.
A major challenge for visual recognition is to describe shapes flexibly enough to allow generalization over different views. Computer vision models have championed a potential solution in medial-axis 'shape skeletons' -- hierarchically arranged geometric structures that are robust to deformations like bending and stretching. Here, we exploit an old, unheralded, and exceptionally simple paradigm to reveal the presence and nature of shape skeletons in human vision. When thousands of participants independently viewed a shape on a touch-sensitive tablet computer and simply tapped the shape anywhere they wished, the aggregated touches formed the shape's medial-axis skeleton. This held across several shape variations, demonstrating profound and predictable influences of even subtle border perturbations and amodally 'filled-in' regions. This phenomenon reveals novel properties of shape representation, and demonstrates (in an unusually direct way) how deep and otherwise-hidden visual processes can directly control simple behaviors, even while observers are completely unaware of their existence.
Firestone, C., & Scholl, B. J. (2014a). "Top-down" effects where none should be found: The El Greco fallacy in perception research. Psychological Science, 25(1), 38-46.
A tidal wave of recent research purports to have discovered that higher-level states such as moods, action-capabilities, and categorical knowledge can literally and directly affect what we see. Are these truly effects on perception, or might some instead reflect influences on judgment, memory, or response bias? Here, we exploit an infamous art-historical reasoning error (the so-called "El Greco fallacy") to demonstrate in five experiments that multiple alleged top-down effects (ranging from effects of morality on lightness perception to effects of action capabilities on spatial perception) cannot truly be effects on perception. We suggest that this error may also contaminate several other varieties of top-down effects, and that this discovery has implications for debates over the (dis)continuity of perception and cognition.
Scholl, B. J., & Gao, T. (2013). Perceiving animacy and intentionality: Visual processing or higher-level judgment? In M. D. Rutherford & V. A. Kuhlmeier (Eds.), Social perception: Detection and interpretation of animacy, agency, and intention (pp. 197-230). Cambridge, MA: MIT Press.
We can identify other social agents in our environment not only on the basis of how they look, but also on the basis of how they move -- and even simple geometric shapes can give rise to rich percepts of animacy and intentionality based on their motion patterns. But why should we think that such phenomena truly reflect visual processing, as opposed to higher-level judgment and categorization based on visual input? This chapter explores five lines of evidence: (1) the phenomenology of visual experience; (2) dramatic dependence on subtle visual display details; (3) implicit influences on visual performance; (4) activation of visual brain areas; and (5) interactions with other visual processes. Collectively, this evidence provides compelling support for the idea that visual processing itself traffics in animacy and intentionality.
Franconeri, S. L., Pylyshyn, Z. W., & Scholl, B. J. (2012). A simple proximity heuristic allows tracking of multiple objects through occlusion. Attention, Perception, & Psychophysics, 74(4), 691-702.
Moving objects in the world present a challenge to the visual system, in that they often move in and out of view as they are occluded by other surfaces. Nevertheless, the ability to track multiple objects through periods of occlusion is surprisingly robust. Here, we identify a simple heuristic that underlies this ability: Pre- and postocclusion views of objects are linked together solely by their spatial proximity. Tracking through occlusion was always improved when the postocclusion instances reappeared closer to the preocclusion views. Strikingly, this was true even when objects' previous trajectories predicted different reappearance locations and when objects reappeared "too close", from invisible "slits" in empty space, rather than from more distant occluder contours. Tracking through occlusion appears to rely only on spatial proximity, and not on encoding heading information, likely reappearance locations, or the visible structure of occluders.
Albrecht, A. R., Scholl, B. J., & Chun, M. M. (2012). Perceptual averaging by eye and ear: Computing summary statistics from multimodal stimuli. Attention, Perception, & Psychophysics, 74(5), 810-815.
Beyond perceiving the features of individual objects, we also have the intriguing ability to efficiently perceive average values of collections of objects across various dimensions. Over what features can perceptual averaging occur? Work to date has been limited to visual properties, but perceptual experience is intrinsically multimodal. In an initial exploration of how this process operates in multimodal environments, we explored statistical summarizing in audition (averaging pitch from a sequence of tones) and vision (averaging size from a sequence of discs), and their interaction. We observed two primary results. First, not only was auditory averaging robust, but if anything it was more accurate than visual averaging in the present study. Second, when uncorrelated visual and auditory information were simultaneously present, observers showed little cost for averaging in either modality when they did not know until the end of each trial which average they had to report. These results illustrate that perceptual averaging can span different sensory modalities, and they illustrate how vision and audition can both cooperate and compete for resources.
Gao, T., Scholl, B. J., & McCarthy, G. (2012). Dissociating the detection of intentionality from animacy in the right posterior superior temporal sulcus. Journal of Neuroscience, 32(41), 14276-14280.
Certain motion patterns can cause even simple geometric shapes to be perceived as animate. Viewing such displays evokes strong activation in temporoparietal cortex, including areas in and near the (predominantly right) posterior superior temporal sulcus (pSTS). These brain regions are sensitive to socially relevant information, but the nature of the social information represented in pSTS is unclear. For example, previous studies have been unable to explore the perception of shifting intentions, beyond animacy. This is due in part to the ubiquitous use of complex displays that combine several types of social information, with little ability to control lower-level visual cues. Here we address this challenge by manipulating intentionality with parametric precision while holding cues to animacy constant. Human subjects were exposed to a "wavering wolf" display, in which one item (the 'wolf') chased continuously, but its goal (i.e. the sheep) frequently switched among other shapes. By contrasting this with three other control displays, we find that the wolf's changing intentions gave rise to strong selective activation in the right pSTS, compared with (1) a wolf that chases with a single unchanging intention; (2) very similar patterns of motion (and motion change) that are not perceived as goal-directed; and (3) abrupt onsets and offsets of moving objects. These results demonstrate in an especially well controlled manner that right pSTS is involved in social perception, beyond physical properties such as motion energy and salience. More importantly, these results demonstrate for the first time that this region represents perceived intentions, beyond animacy.
Liverence, B. M., & Scholl, B. J. (2012). Discrete events as units of perceived time. Journal of Experimental Psychology: Human Perception & Performance, 38(3), 549-554.
In visual images, we perceive both space (as a continuous visual medium) and objects (that inhabit space). Similarly, in dynamic visual experience, we perceive both continuous time and discrete events. What is the relationship between these units of experience? The most intuitive answer may be similar to the spatial case: time is perceived as an underlying medium, which is later segmented into discrete event representations. Here we explore the opposite possibility -- that our subjective experience of time itself can be influenced by how durations are temporally segmented, beyond more general effects of change and complexity. We show that the way in which a continuous dynamic display is segmented into discrete units (via a path shuffling manipulation) greatly influences duration judgments, independent of psychophysical factors previously implicated in time perception, such as overall stimulus energy, attention and predictability. It seems that we may use the passage of discrete events -- and the boundaries between them -- in our subjective experience as part of the raw material for inferring the strength of the underlying 'current' of time.
Newman, G. E., & Scholl, B. J. (2012). Bar graphs depicting averages are perceptually misinterpreted: The within-the-bar bias. Psychonomic Bulletin & Review, 19(4), 601-607.
Perhaps the most common method of depicting data, in both scientific communication and popular media, is the bar graph. Bar graphs often depict measures of central tendency, but they do so asymmetrically: a mean, for example, is depicted not by a point, but by the edge of a bar that originates from a single axis. Here we show that this graphical asymmetry gives rise to a corresponding cognitive asymmetry. When viewers are shown a bar depicting a mean value and are then asked to judge the likelihood of a particular data point being part of its underlying distribution, viewers judge points that fall within the bar to be more likely than points equidistant from the mean, but outside the bar -- as if the bar somehow 'contained' the relevant data. This 'within-the-bar bias' occurred (a) for graphs with and without error bars, (b) for bars that originated from both lower and upper axes, (c) for test points with equally extreme numeric labels, (d) both from memory (when the bar was no longer visible) and in online perception (while the bar was visible during the judgment), (e) both within- and between-subjects, and (f) in populations including college students, adults from the broader community, and in online samples. We posit that this bias may arise due to principles of object perception, and we show how it has downstream implications for decision-making.
Gao, T., & Scholl, B. J. (2011). Chasing vs. stalking: Interrupting the perception of animacy. Journal of Experimental Psychology: Human Perception & Performance, 37(3), 669-684.
Visual experience involves not only physical features such as color and shape, but also higher-level properties such as animacy and goal-directedness. Perceiving animacy is an inherently dynamic experience, in part because agents' goal-directed behavior may be frequently in flux -- unlike many of their physical properties. How does the visual system maintain and update representations of agents' animate and goal-directed behavior over time and motion? The present study explored this question in the context of a particularly salient form of perceived animacy: chasing, in which one shape (the 'wolf') pursues another shape (the 'sheep'). Here the participants themselves controlled the movement of the sheep, and the perception of chasing was assessed in terms of their ability to avoid being caught by the wolf -- which looked identical to many moving distractors, and so could be identified only by its motion. The wolf's pursuit was frequently interrupted by periods in which it was static, jiggling in place, or moving randomly (amidst distractors that behaved similarly). Only the latter condition greatly impaired the detection of chasing -- and only when the random motion was grouped into temporally extended chunks. These results reveal (1) how the detection of chasing is determined by the character and temporal grouping (rather than just the brute amount) of 'pursuit' over time; and (2) how these temporal dynamics can lead the visual system to either construct or actively reject interpretations of chasing.
Liverence, B. M., & Scholl, B. J. (2011). Selective attention warps spatial representation: Parallel but opposing effects on attended versus inhibited objects. Psychological Science, 22(12), 1600-1608.
Selective attention not only influences which objects in a display are perceived, but also directly changes the character of how they are perceived -- for example, making attended objects appear larger or sharper. In studies of multiple-object tracking and probe detection, we explored the influence of sustained selective attention on where objects are seen to be in relation to each other in dynamic multi-object displays. Surprisingly, we found that sustained attention can warp the representation of space in a way that is object-specific: In immediate recall of the positions of objects that have just disappeared, space between targets is compressed, whereas space between distractors is expanded. These effects suggest that sustained attention can warp spatial representation in unexpected ways.
Gao, T., & Scholl, B. J. (2010). Are objects required for object files?: Roles of segmentation and spatiotemporal continuity in computing object persistence. Visual Cognition, 18(1), 82-109.
Two central tasks of visual processing are (1) to segment undifferentiated retinal images into discrete objects, and (2) to represent those objects as the same persisting individuals over time and motion. Here we explore the interaction of these two types of processing in the context of object files -- mid-level visual representations that 'stick' to moving objects on the basis of spatiotemporal properties. Object files can be revealed by object-specific preview benefits (OSPBs), wherein a 'preview' of information on a moving object speeds the recognition of that information at a later point when it appears again on the same object (compared to when it reappears on a different moving object), beyond display-wide priming. Here we explore the degree of segmentation required to establish object files in the first place. Surprisingly, we find that no explicit segmentation is required until after the previews disappear, when using purely motion-defined objects (consisting of random elements on a random background). Moreover, OSPBs are observed in such displays even after moderate (but not long) delays between the offset of the preview information and the onset of the motion. These effects indicate that object files can be established without initial static segmentation cues, so long as there is spatiotemporal continuity between the previews and the eventual appearance of the objects. We also find that top-down strategies can sometimes mimic OSPBs, but that these strategies can be eliminated by novel manipulations. We discuss how these results alter our understanding of the nature of object files, and also why researchers must take care to distinguish 'true OSPBs' from 'illusory OSPBs'.
New, J. J., Schultz, R. T., Wolf, J., Niehaus, J. L., Klin, A., German, T. C., & Scholl, B. J. (2010). The scope of social attention deficits in autism: Prioritized orienting to people and animals in static natural scenes. Neuropsychologia, 48(1), 51-59.
A central feature of autism spectrum disorder (ASD) is an impairment in 'social attention' -- the prioritized processing of socially-relevant information, e.g. the eyes and face. Socially relevant stimuli are also preferentially attended in a broader categorical sense, however: observers orient preferentially to people and animals (compared to inanimate objects) in complex natural scenes. To measure the scope of social attention deficits in autism, observers viewed alternating versions of a natural scene on each trial, and had to 'spot the difference' between them -- where the difference involved either an animate or inanimate object. Change detection performance was measured as an index of attentional prioritization. Individuals with ASD showed the same prioritized social attention for animate categories as did control participants. This could not be explained by lower-level visual factors, since the effects disappeared when using blurred or inverted images. These results suggest that social attention -- and its impairment in autism -- may not be a unitary phenomenon: impairments in visual processing of specific social cues may occur despite intact categorical prioritization of social agents.
Turk-Browne, N. B., Scholl, B. J., Johnson, M. K., & Chun, M. M. (2010). Implicit perceptual anticipation triggered by statistical learning. Journal of Neuroscience, 30(33), 11177-11187.
Our environments are highly regular in terms of when and where objects appear relative to each other. Statistical learning allows us to extract and represent these regularities, but how this knowledge is used by the brain during ongoing perception is unclear. We employed rapid event-related fMRI to measure hemodynamic responses to individual visual images in a continuous stream that contained sequential contingencies. Sixteen human observers encountered these statistical regularities while performing an unrelated cognitive task, and were unaware of their existence. Nevertheless, the right anterior hippocampus showed greater hemodynamic responses to predictive stimuli, providing evidence for implicit anticipation as a consequence of unsupervised statistical learning. Hippocampal anticipation based on predictive stimuli correlated with subsequent processing of the predicted stimuli in occipital and parietal cortex, and anticipation in additional brain regions correlated with facilitated object recognition as reflected in behavioral priming. Additional analyses suggested that implicit perceptual anticipation does not contribute to explicit familiarity, but can result in predictive potentiation of category-selective ventral visual cortex. Overall, these findings show that future-oriented processing can arise incidentally during the perception of statistical regularities.
Albrecht, A. R., & Scholl, B. J. (2010). Perceptually averaging in a continuous visual world: Extracting statistical summary representations over time. Psychological Science, 21(4), 560-567.
Beyond processing individual features and objects, the visual system is also able to efficiently summarize visual scenes in various ways -- e.g. allowing us to perceive the average size of a group of objects. The extraction of such statistical summary representations (SSRs) is fast and accurate, but we do not yet have a clear picture of the circumstances in which they operate. Previous studies have always used discrete input -- either spatial arrays of shapes, or temporal sequences of shapes presented one at a time. Real-world visual environments, in contrast, are intrinsically continuous and dynamic. To better understand how SSRs may operate in natural environments, we investigated the ability to compute average size in visual displays wherein the objects (or sometimes a single object) changed continuously, expanding and contracting over time. The results indicated that perceptual averaging can operate continuously in dynamic displays -- sampling multiple times during a single continuous transformation with no discrete boundaries. Moreover, certain types of dynamic changes (e.g. expansion) influence the resulting perceptual averages more so than others (e.g. contraction), perhaps due to attentional capture. These results collectively illustrate how SSRs may be well adapted to dynamically changing real-world environments.
Gao, T., McCarthy, G., & Scholl, B. J. (2010). The wolfpack effect: Perception of animacy irresistibly influences interactive behavior. Psychological Science, 21(12), 1845-1853.
Imagine a pack of predators stalking their prey. The predators may not always move directly towards their target (as when circling around it), but they may be consistently facing toward it. The human visual system appears to be extremely sensitive to such situations, even in displays involving simple shapes. We demonstrate this by introducing the 'Wolfpack' effect, wherein several moving oriented shapes (darts, or discs with 'eyes') consistently point toward a moving disc. The shapes move randomly, but seem to interact with the disc -- as if they are collectively pursuing it. This impairs performance in interactive displays (including detection of actual pursuit), and leads observers to selectively avoid such shapes when moving a disc through the display themselves. These and other results reveal a novel 'social' cue to perceived animacy. And, whereas previous work has focused on the causes of perceived animacy, these results demonstrate its effects, showing how it irresistibly and implicitly shapes visual performance and interactive behavior.
Scholl, B. J., & Turk-Browne, N. B. (2010). Statistical learning. In B. Goldstein (Ed.), Encyclopedia of Perception, Volume 2 (pp. 935-938). Thousand Oaks, CA: Sage Publications.
It is natural to think of perception in terms of the processing of individual features (such as color and shape) and how they are combined into discrete objects (such as animals and bicycles). This simple characterization underestimates the information that is available in perceptual input, though, since there are also massive amounts of information about how these features and objects are distributed in space and time. In time, for example, eating food at a restaurant is more likely to be followed by paying a bill than by climbing a tree -- just as (in English) the syllable /sci/ is more likely to be followed by /ence/ than by /on/. And in space, for example, a car is more likely to be next to a bicycle than to a stapler. Discovering such regularities is difficult, since they are embedded within complex and continuous environments where not all information is relevant. But the mind is nevertheless sensitive to such regularities, uncovering them in part by means of statistical learning: an automatic and unconscious perceptual process that encodes statistical regularities across space and time. We are often unaware of the operation of statistical learning, yet it may play a crucial role in 'segmenting' the continuous perceptual world into discrete manageable units such as words, events, objects, and scenes. This entry describes statistical learning, how and when it operates, and how it may support online perception.
Scholl, B. J., & Flombaum, J. I. (2010). Object persistence. In B. Goldstein (Ed.), Encyclopedia of Perception, Volume 2 (pp. 653-657). Thousand Oaks, CA: Sage Publications.
Suppose that while playing tennis, an unfortunate swing propels your tennis ball out of the court and into some bushes. When you go to retrieve it, you find two tennis balls there. Which is yours? This is a type of correspondence problem: you must determine which of the two tennis balls corresponds to the one that you just hit out of the court. Though we are seldom explicitly aware of it, the visual system faces this type of problem thousands of times per day, whenever we encounter an object. On every such encounter, the visual system must determine whether a current bit of visual stimulation reflects a new object in the field of view, or an object that was already encountered a moment ago. This is the challenge of object persistence: the perception of the world not only in terms of discrete objects, but in terms of objects that retain their identities as the same individuals over time and motion. This entry notes how the problem of object persistence features in several aspects of perception, and it describes some of the ways that persistence is determined in visual processing.
Doran, M. M., Hoffman, J. E., & Scholl, B. J. (2009). The role of eye fixations in concentration and amplification effects during multiple object tracking. Visual Cognition, 17(4), 574-597.
When tracking spatially extended objects in a multiple object tracking task, attention is preferentially directed to the centers of those objects (attentional concentration), and this effect becomes more pronounced as object length increases (attentional amplification). However, it is unclear whether these effects depend on differences in attentional allocation or differences in eye fixations. We addressed this question by measuring eye fixations in a dual-task paradigm that required participants to track spatially extended objects, while simultaneously detecting probes that appeared at the centers or near the endpoints of objects. Consistent with previous research, we observed concentration and amplification effects: probes at the centers of objects were detected more readily than those near their endpoints, and this difference increased with object length. Critically, attentional concentration was observed when probes were equated for distance from fixation during free viewing, and concentration and amplification were observed without eye movements during strict fixation. We conclude that these effects reflect the prioritization of covert attention to particular spatial regions within extended objects, and we discuss the role of eye fixations during MOT.
Flombaum, J. I., Scholl, B. J., & Santos, L. R. (2009). Spatiotemporal priority as a fundamental principle of object persistence. In B. Hood & L. Santos (Eds.), The origins of object knowledge (pp. 135-164). Oxford University Press.
The impoverished and rapidly changing stimulation that falls on the retina looks very different from the stable world of discrete persisting objects that populates our visual experience. To get from the features on the retina to the objects that we experience, the visual system must solve several correspondence problems. One of these problems has to do with sameness: the visual system must decide whether each bit of stimulation reflects an object that was already encountered (which might occasion the updating of an existing object representation), or a new one (which might occasion the creation of a new object representation). This problem of object persistence has been studied with a wide array of visual phenomena and paradigms, and in several disciplines in cognitive science -- including vision science, developmental psychology, and comparative cognition. The study of object persistence in these different fields has progressed largely independently. Yet strikingly, they have converged on a core principle that guides the creation and maintenance of persisting object representations: the principle of spatiotemporal priority. When identifying objects as the same individuals over time, the visual system appears to rely on their spatiotemporal histories -- i.e. where, when, and how they were encountered -- to a greater degree than their visual surface features. In this chapter we review the many contexts in which spatiotemporal priority drives computations of object persistence, and we propose explanations at several levels for why spatiotemporal priority plays this dominant role.
Cheries, E. W., Mitroff, S. R., Wynn, K., & Scholl, B. J. (2009). Do the same principles constrain persisting object representations in infant cognition and adult perception?: The cases of continuity and cohesion. In B. Hood & L. Santos (Eds.), The origins of object knowledge (pp. 107-134). Oxford University Press.
In recent years, the study of object persistence -- how the mind identifies objects as the same individuals over time -- has been undergoing a renaissance in (at least) two different fields of cognitive science. First, vision scientists have come to understand some of the principles that control the construction, maintenance, and destruction of object representations in mid-level vision. Second, developmental researchers have identified several principles of 'core knowledge' that constrain object permanence in infants. These two fields have traditionally operated largely independently, but some researchers have suggested that they may in fact be studying the same underlying mental processes. This interesting idea has been used in the past to interpret various empirical results in each field, but the real promise of this approach lies in its ability to drive further progress by generating novel predictions that can then be tested in both fields. The hope is that this approach could spark a useful feedback loop of sorts: For example, infancy research may give rise to specific predictions for adult perception experiments, whose subsequent results may in turn give rise to additional specific predictions for infant cognition experiments. To the extent that this strategy succeeds -- confirming ever more specific predictions as hypotheses are carried back and forth across these fields -- we may obtain support for the idea that these two fields are studying the same thing. In this chapter, we describe two examples of attempts to implement this strategy in practice, while studying two of the most salient principles of core knowledge: continuity and cohesion. In each case, earlier infant research was used to motivate adult perception experiments, which were in turn used to generate and test more specific predictions in further infant studies. 
These case studies illustrate the utility of bridging the gaps between these two fields, as our knowledge of each is deepened as a result of exploring the connections. In particular, this process has revealed in new ways how violations of these principles of core knowledge in turn have deleterious effects on the underlying object representations themselves. In addition, the results in each case are consistent with the possibility that representations of persisting objects in each domain are controlled by the same principles, and perhaps even the same underlying processes.
Turk-Browne, N. B., & Scholl, B. J. (2009). Flexible visual statistical learning: Transfer across space and time. Journal of Experimental Psychology: Human Perception & Performance, 35(1), 195-202.
The environment contains considerable information that is distributed across space and time, and the visual system is remarkably sensitive to such information via the operation of visual statistical learning (VSL). Previous VSL studies have focused on establishing what kinds of statistical relationships can be learned, but have not fully explored how this knowledge is then represented in the mind: the resulting representations could faithfully reflect the details of the learning context, but they could also be generalized in various ways. We studied this by testing how VSL transfers across changes between learning and test, and discovered a substantial degree of generalization. Learning of statistically-defined temporal sequences was expressed in static spatial configurations, and learning of statistically-defined spatial configurations facilitated detection performance in temporal streams. Learning of temporal sequences even transferred to reversed temporal orders during test, when accurate performance did not depend on order, per se. These types of transfer imply that VSL can result in flexible representations, which may in turn allow VSL to function in ever-changing natural environments.
New, J. J., & Scholl, B. J. (2009). Subjective time dilation: Spatially local, object-based, or a global visual experience? Journal of Vision, 9(2):4, 1-11, http://journalofvision.org/9/2/4/.
Time can appear to slow down in certain brief real-life events -- e.g. during car accidents or critical moments of athletes' performances. Such time dilation can also be produced to a smaller degree in the laboratory by 'oddballs' presented in series of otherwise identical stimuli. We explored the spatial distribution of subjective time dilation: Does time expand only for the oddball objects themselves, only for the local spatial region including the oddball, or for the entire visual field? Because real-life traumatic events provoke an apparently global visual experience of time expansion, we predicted -- and observed -- that a locally discrete oddball would also dilate the apparent duration of other concurrent events in other parts of the visual field. This 'dilation at a distance' was not diminished by increasing spatial separation between the oddball and target events, and was not influenced by manipulations of objecthood that drive object-based attention. In addition, behaviorally 'urgent' oddballs (looming objects) yielded time dilation, but visually similar receding objects did not. We interpret these results in terms of the influence of attention on time perception -- where attention reflects general arousal and faster internal pacing rather than spatial or object-based selection, per se. As a result, attention influences subjective time dilation as a global visual experience.
Turk-Browne, N. B., Scholl, B. J., Chun, M. M., & Johnson, M. K. (2009). Neural evidence of statistical learning: Efficient detection of visual regularities without awareness. Journal of Cognitive Neuroscience, 21(10), 1934-1945.
Our environment contains regularities distributed in space and time that can be detected by way of statistical learning. This unsupervised learning occurs without intent or awareness, but little is known about how it relates to other types of learning, how it affects perceptual processing, and how quickly it can occur. Here we use fMRI during statistical learning to explore these questions. Participants viewed statistically structured vs. unstructured sequences of shapes while performing a task unrelated to the structure. Robust neural responses to statistical structure were observed, and these responses were notable in four ways: First, responses to structure were observed in the striatum and medial temporal lobe, suggesting that statistical learning may be related to other forms of associative learning and relational memory. Second, statistical regularities received enhanced processing in category-specific visual regions (object-selective lateral occipital cortex, and word-selective ventral occipitotemporal cortex), demonstrating that these regions are sensitive to information distributed in time. Third, evidence of learning emerged early during familiarization, showing that statistical learning can operate very quickly and with little exposure. Finally, neural signatures of learning were dissociable from subsequent explicit familiarity, suggesting that learning can occur in the absence of awareness. Overall, our findings help elucidate the underlying nature of statistical learning.
Gao, T., Newman, G. E., & Scholl, B. J. (2009). The psychophysics of chasing: A case study in the perception of animacy. Cognitive Psychology, 59(2), 154-179.
Psychologists have long been captivated by the perception of animacy -- the fact that even simple moving shapes may appear to engage in animate, intentional, and goal-directed movements. Here we report several new types of studies of a particularly salient form of perceived animacy: chasing, in which one shape (the 'wolf') pursues another shape (the 'sheep'). We first demonstrate two new cues to perceived chasing: chasing subtlety (the degree to which the wolf deviates from perfectly 'heat-seeking' pursuit) and directionality (whether and how the shapes 'face' each other). We then use these cues to show how it is possible to assess the objective accuracy of such percepts, and to distinguish the immediate perception of chasing from those more subtle (but nevertheless real) types of 'stalking' that cannot be readily perceived. We also report several methodological advances. Previous studies of the perception of animacy have faced two major challenges: (a) it is difficult to measure perceived animacy with quantitative precision; and (b) task demands make it difficult to distinguish perception from higher-level inferences about animacy. We show how these challenges can be met, at least in our case study of perceived chasing, via two novel tasks based on dynamic visual search (the Find-the-Chase task) and a new type of interactive display (the Don't-Get-Caught! task).
Scholl, B. J. (2009). What have we learned about attention from multiple object tracking (and vice versa)? In D. Dedrick & L. Trick (Eds.), Computation, cognition, and Pylyshyn (pp. 48-79). Cambridge, MA: MIT Press.
If you weren't paying attention, you could be forgiven for thinking that this chapter was part of a collection assembled in honor of several people named Zenon Pylyshyn: the philosopher of psychology who has helped define the relation between mind and world; the computer scientist who has characterized the power of computation in the study of cognition; the cognitive psychologist whose imagery research is in every introductory textbook; and the vision scientist whose ideas and experimental paradigms form a foundation for work in visual cognition. (When I first learned of "Zenon Pylyshyn" in college, I figured that this couldn't really be someone's name, and given the breadth and importance of his contributions I figured that "he" must be some sort of research collective -- a Nicolas Bourbaki of cognitive science. I was lucky to have been able to study later with this excellent research collective in graduate school, though I discovered it was housed in one head.) This chapter is about the last of the Zenons noted above: the vision scientist. In the study of visual cognition, his lasting influence has stemmed in part from the way that he has bucked one of the most dangerous trends in experimental research: whereas most of us too easily fall into the trap of constructing theoretical questions to fit our experimental paradigms, Zenon has consistently managed the reverse. And there is perhaps no better example of this than his development of the multiple object tracking (henceforth MOT) paradigm. This chapter focuses on the nature of MOT, with three interrelated goals: (1) to explore what makes MOT unique -- and uniquely useful -- as a tool for studying visual cognition; (2) to characterize the relationship between attention and MOT; and (3) to highlight some of the most important things we've learned about attention from the study of MOT -- and vice versa.
Turk-Browne, N. B., Isola, P. J., Scholl, B. J., & Treat, T. A. (2008). Multidimensional visual statistical learning. Journal of Experimental Psychology: Learning, Memory, & Cognition, 34(2), 399-407.
Recent studies of visual statistical learning (VSL) have demonstrated that statistical regularities in sequences of visual stimuli can be automatically extracted, even without intent or awareness. Despite much work on this topic, however, several fundamental questions remain about the nature of VSL. In particular, previous experiments have not explored the underlying units over which VSL operates. In a sequence of colored shapes, for example, does VSL operate over each feature dimension independently, or over multidimensional objects in which color and shape are bound together? Here we demonstrate that VSL can be both object-based and feature-based, in systematic ways based on how different feature dimensions covary. For example, when each shape covaried perfectly with a particular color, VSL was object-based: observers expressed robust VSL for colored shape subsequences at test, but failed when the test items were presented monochromatically. However, when shape and color pairs were partially decoupled during learning, VSL operated over features: now observers expressed robust VSL even when the shapes were presented monochromatically at test. These and other results reported here suggest that VSL is object-based, but that sensitivity to feature correlations in multidimensional sequences (possibly another form of VSL) may in turn help define what counts as an object.
Cheries, E. W., Mitroff, S. R., Wynn, K., & Scholl, B. J. (2008). Cohesion as a principle of object persistence in infancy. Developmental Science, 11(3), 427-432.
A critical challenge for visual perception is to represent objects as the same persisting individuals over time and motion. Across several areas of cognitive science, researchers have identified cohesion as among the most important theoretical principles of object persistence: An object must maintain a single bounded contour over time. Drawing inspiration from recent work in adult visual cognition, the present study tested the power of cohesion as a constraint as it operates early in development. In particular, we tested whether the most minimal cohesion violation -- a single object splitting into two -- would destroy infants' ability to represent a quantity of objects over occlusion. In a forced-choice crawling paradigm, 10- and 12-month-old infants witnessed crackers being sequentially placed into containers, and typically crawled toward the container with the greater cracker quantity. When one of the crackers was visibly split in half, however, infants failed to represent the relative quantities, despite controls for the overall quantities and the motions involved. This result helps to characterize the fidelity and specificity of cohesion as a fundamental principle of object persistence, suggesting that even the simplest possible cohesion violation can dramatically impair infants' object representations and influence their overt behavior.
Flombaum, J. I., Scholl, B. J., & Pylyshyn, Z. W. (2008). Attentional resources in tracking through occlusion: The high-beams effect. Cognition, 107(3), 904-931.
A considerable amount of research has uncovered heuristics that the visual system employs to keep track of objects through periods of occlusion. Relatively little work, by comparison, has investigated the on-line resources that support this processing. We explored how attention is distributed when featurally identical objects become occluded during multiple object tracking. During tracking, observers had to detect small probes that appeared sporadically on targets, distractors, occluders, or empty space. Probe detection rates for these categories were taken as indexes of the distribution of attention throughout the display and revealed two novel effects. First, probe detection on an occluder's surface was better when either a target or distractor was currently occluded in that location, compared to when no object was behind that occluder. Thus even occluded (and therefore invisible) objects recruit object-based attention. Second, and more surprising, probe detection for both targets and distractors was always better when they were occluded, compared to when they were visible. This new attentional high-beams effect indicates that the ability to track through occlusion, though seemingly effortless, in fact requires the active allocation of special attentional resources.
Turk-Browne, N. B., Scholl, B. J., & Chun, M. M. (2008). Babies and brains: Habituation in infant cognition and functional neuroimaging. Frontiers in Human Neuroscience, 2, Article 16.
Many prominent studies of infant cognition over the past two decades have relied on the fact that infants habituate to repeated stimuli -- i.e. that their looking times tend to decline upon repeated stimulus presentations. This phenomenon has been exploited to reveal a great deal about the minds of preverbal infants. Many prominent studies of the neural bases of adult cognition over the past decade have relied on the fact that brain regions habituate to repeated stimuli -- i.e. that the hemodynamic responses observed in fMRI tend to decline upon repeated stimulus presentations. This phenomenon has been exploited to reveal a great deal about the neural mechanisms of perception and cognition. Similarities in the mechanics of these two forms of habituation suggest that it may be useful to relate them to each other. Here we outline this analogy, explore its nuances, and highlight some ways in which the study of habituation in functional neuroimaging could yield novel insights into the nature of habituation in infant cognition -- and vice versa.
Newman, G. E., Choi, H., Wynn, K., & Scholl, B. J. (2008). The origins of causal perception: Evidence from postdictive processing in infancy. Cognitive Psychology, 57(3), 262-291.
The currency of our visual experience consists not only of visual features such as color and motion, but also seemingly higher-level features such as causality -- as when we see two billiard balls collide, with one causing the other to move. One of the most important and controversial questions about causal perception involves its origin: do we learn to see causality, or does this ability derive in part from innately specified aspects of our cognitive architecture? Such questions are difficult to answer, but can be indirectly addressed via experiments with infants. Here we explore causal perception in 7-month-old infants, using a different approach from previous work. Recent work in adult visual cognition has demonstrated a postdictive aspect to causal perception: in certain situations, we can perceive a collision between two objects in an ambiguous display even after the moment of potential 'impact' has already passed. This illustrates one way in which our conscious perception of the world is not an instantaneous moment-by-moment construction, but rather is formed by integrating information over short temporal windows. Here we demonstrate analogous postdictive processing in infants' causal perception. This result demonstrates that even infants' visual systems process information in temporally extended chunks. Moreover, this work provides a new way of demonstrating causal perception in infants that differs from previous strategies, and is immune to some previous types of critiques.
Yi, D-J., Turk-Browne, N. B., Flombaum, J. I., Kim, M., Scholl, B. J., & Chun, M. M. (2008). Spatiotemporal object continuity in human ventral visual cortex. Proceedings of the National Academy of Sciences, 105(26), 8840-8845.
Coherent visual experience requires that objects be represented as the same persisting individuals over time and motion. Cognitive science research has identified a powerful principle that guides such processing: objects must trace continuous paths through space and time. Little is known, however, about how neural representations of objects, typically defined by visual features, are influenced by spatiotemporal continuity. Here we report the consequences of spatiotemporally continuous vs. discontinuous motion on perceptual representations in human ventral visual cortex. In experiments employing both dynamic occlusion and apparent motion, face-selective cortical regions exhibited significantly less activation when faces were repeated in continuous vs. discontinuous trajectories, suggesting that discontinuity caused featurally-identical objects to be represented as different individuals. These results indicate for the first time that spatiotemporal continuity modulates neural representations of object identity, influencing judgments of object persistence even in the most staunchly 'featural' areas of ventral visual cortex.
New, J. J., & Scholl, B. J. (2008). 'Perceptual scotomas': A functional account of motion-induced blindness. Psychological Science, 19(7), 653-659.
In motion-induced blindness (MIB), salient objects in full view can repeatedly fluctuate into and out of conscious awareness when superimposed onto certain global moving patterns. Here we suggest a new account of this striking phenomenon: rather than being a failure of visual processing, MIB may be a functional product of the visual system's attempt to separate distal stimuli from artifacts of damage to the visual system itself. When a small object is invariant with respect to changes that are occurring to a global region of the surrounding visual field, the visual system may discount that stimulus as akin to a scotoma, and may thus expunge it from awareness. We describe three new phenomena motivated by this account, and discuss how it can also account for several previous results.
Mitroff, S. R., Scholl, B. J., & Noles, N. S. (2007). Object files can be purely episodic. Perception, 36(12), 1730-1735.
Our ability to track an object as the same persisting entity over time and motion may primarily rely on spatiotemporal representations which encode some, but not all, of an object's features. Previous research using the 'object reviewing' paradigm has demonstrated that such representations can store featural information of well-learned stimuli such as letters and words at a highly abstract level. However, it is unknown whether these representations can also store purely episodic information (i.e. information obtained from a single, novel encounter) that does not correspond to pre-existing type-representations in long-term memory. Here, in an object-reviewing experiment with novel face images as stimuli, observers still produced reliable object-specific preview benefits in dynamic displays: A preview of a novel face on a specific object speeded the recognition of that particular face at a later point when it appeared again on the same object compared to when it reappeared on a different object (beyond display-wide priming), even when all objects moved to new positions in the intervening delay. This case study demonstrates that the mid-level visual representations which keep track of persisting identity over time -- e.g. 'object files', in one popular framework -- can store not only abstract types from long-term memory, but also specific tokens from online visual experience.
Fiser, J., Scholl, B. J., & Aslin, R. N. (2007). A perceptual motion bias during occlusion constrains visual statistical learning. Psychonomic Bulletin & Review, 14(1), 173-178.
Visual statistical learning of shape sequences was examined in the context of ambiguous occluded object trajectories. In a learning phase, participants viewed a sequence of moving shapes with speed profiles that elicited either a 'bouncing' or a 'streaming' percept: one shape moved toward and passed behind an occluder, after which two different shapes emerged from behind the occluder. At issue was whether statistical learning linked both object transitions equally, or whether the percept of bouncing vs. streaming constrained the association between pre- and post-occlusion objects. In familiarity judgments following the learning, participants reliably selected the shape-pair that conformed to the bouncing vs. streaming bias present during the learning phase. A follow-up experiment demonstrated that differential eye-movements could not account for this finding. These results suggest that sequential statistical learning is constrained by inferences based on spatiotemporal perceptual biases that bind two moving shapes through occlusion, thereby reducing the computational complexity of visual statistical learning.
Ben-Shahar, O., Scholl, B. J., & Zucker, S. W. (2007). Attention, segregation, and textons: Bridging the gap between object-based attention and texton-based segregation. Vision Research, 47(6), 845-860.
Studies of object-based attention (OBA) have suggested that attentional selection is intimately associated with discrete objects. However, the relationship of this association to the basic visual features ('textons') which guide the segregation of visual scenes into 'objects' remains largely unexplored. Here we study this hypothesized relationship for one of the most conspicuous features of early vision: orientation. To do so we examine how attention spreads through uniform (one 'object') orientation-defined textures (ODTs), and across texture-defined boundaries in discontinuous (two 'objects') ODTs. Using the divided-attention paradigm we find that visual events that are known to trigger orientation-based texture segregation, namely perceptual boundaries defined by high orientation and/or curvature gradients, also induce a significant cost on attentional selection. At the same time we show that no effect is incurred by the absolute value of the textons, i.e., by the general direction (or, the 'grain') of the texture -- in conflict with previous findings in the OBA literature. Collectively these experiments begin to reveal the link between object-based attention and texton-based segregation, a link which also offers important cross-disciplinary methodological advantages.
Junge, J. A., Scholl, B. J., & Chun, M. M. (2007). How is spatial context learning integrated over time?: A primacy effect in contextual cueing. Visual Cognition, 15(1), 1-11.
Over repeated exposure to particular visual search displays, subjects are able to implicitly extract regularities that then make search more efficient -- a phenomenon known as contextual cueing. Here we explore how the learning involved in contextual cueing is formed, maintained, and updated over experience. During an initial training phase, a group of 'signal-first' subjects searched through a series of predictive displays (where distractor locations were perfectly correlated with the target location), followed with no overt break by a series of unpredictive displays (where repeated contexts were uncorrelated with target locations). A second 'noise-first' group of subjects encountered the unpredictive displays followed by the predictive displays. Despite the fact that both groups had the same overall exposure to signal and noise, only the signal-first group demonstrated contextual cueing. This primacy effect indicates that initial experience can result in hypotheses about regularities in displays -- or the lack thereof -- which then become resistant to updating. The absence of regularities in early stages of training even blocked observers from learning predictive regularities later on.
Scholl, B. J. (2007). Object persistence in philosophy and psychology. Mind & Language, 22(5), 563-591.
What makes an object the same persisting individual over time? Philosophers and psychologists have both grappled with this question, but from different perspectives -- philosophers conceptually analyzing the criteria for object persistence, and psychologists exploring the mental mechanisms that lead us to experience the world in terms of persisting objects. It is striking that the same themes populate explorations of persistence in these two very different fields -- e.g. the roles of spatiotemporal continuity, persistence through property change, and cohesion violations. Such similarities may reflect an underlying connection, in that psychological mechanisms of object persistence (especially relevant parts of mid-level visual object processing) may serve to underlie the intuitions about persistence that fuel metaphysical theories. This would be a way for cognitive science to join these two disparate fields, helping to explain the possible origins and reliability of some metaphysical intuitions, and perhaps leading to philosophical progress.
Choi, H., & Scholl, B. J. (2006). Perceiving causality after the fact: Postdiction in the temporal dynamics of causal perception. Perception, 35(3), 385-399.
In simple dynamic events we can easily perceive not only motion, but also higher-level properties such as causality, as when we see one object collide with another. Several researchers have suggested that such causal perception is an automatic and stimulus-driven process, sensitive only to particular sorts of visual information, and a major research project has been to uncover the nature of these visual cues. Here, rather than investigating what information affects causal perception, we instead explore the temporal dynamics of when certain types of information are used. Surprisingly, we find that certain visual events can determine whether we perceive a collision in an ambiguous situation even when those events occur after the moment of potential 'impact' in the putative collision has already passed. This illustrates a type of postdictive perception: our conscious perception of the world is not an instantaneous moment-by-moment construction, but rather is formed by integrating information presented within short temporal windows -- so that new information which is obtained can influence the immediate past in our conscious awareness. Such effects have been previously demonstrated for low-level motion phenomena, but the present results demonstrate that postdictive processes can also influence higher-level event perception. These findings help to characterize not only the 'rules' of causal perception, but also the temporal dynamics of how and when those rules operate.
Cheries, E. W., Newman, G. E., Santos, L. R., & Scholl, B. J. (2006). Units of visual individuation in Rhesus Macaques: Objects or unbound features? Perception, 35(8), 1057-1071.
Vision begins with the processing of unbound visual features, which must eventually be bound together into object representations. Such feature binding is required for coherent visual perception, and accordingly has received a considerable amount of study in several domains. Neurophysiological work, often in monkeys, has investigated the details of how and where feature binding occurs in the brain, but methodological limitations have not allowed this research to elucidate just how feature-binding operates spontaneously in real-world situations. In contrast, behavioral work with human infants has demonstrated how we use simpler unbound features to individuate and identify objects over time and occlusion in many types of events, but this work has not typically been able to isolate the role of feature binding in such processing. Here we provide a method for assessing the spontaneity and fidelity of feature binding in non-human primates, as this process is utilized in real-world situations, including simple foraging behaviors. Using both looking-time and manual-search measures in a natural environment, we show that free-ranging rhesus macaques (Macaca mulatta) spontaneously bind features in order to individuate objects across time and occlusion in dynamic events. This pattern of results demonstrates that feature binding is used in subtle ways to guide ecologically relevant behavior in a nonhuman animal, spontaneously and reliably, in its natural environment.
Choi, H., & Scholl, B. J. (2006). Measuring causal perception: Links to representational momentum? Acta Psychologica, 123(1-2), 91-111.
In a collision between two objects, we can perceive not only low-level properties, such as color and motion, but also the seemingly high-level property of causality. It has proven difficult, however, to measure causal perception in a quantitatively rigorous way which goes beyond perceptual reports. Here we focus on the possibility of measuring perceived causality using the phenomenon of representational momentum (RM). Recent studies suggest a relationship between causal perception and RM, based on the fact that RM appears to be attenuated for causally 'launched' objects. This is explained by appeal to the visual expectation that a 'launched' object is inert and thus should eventually cease its movement after a collision without a source of self-propulsion. We first replicated these demonstrations, and then evaluated this alleged connection by exploring RM for different types of displays, including the contrast between causal launching and non-causal 'passing'. These experiments suggest that the RM-attenuation effect is not a pure measure of causal perception, but rather may reflect lower-level spatiotemporal correlates of only some causal displays. We conclude by discussing the strengths and pitfalls of various methods of measuring causal perception.
Wagemans, J., Van Lier, R., & Scholl, B. J. (2006). Introduction to Michotte's heritage in perception and cognition research. Acta Psychologica, 123(1-2), 1-19.
Several decades after Michotte's work was published, it continues to inspire current research in perception, cognition, and beyond. In this special issue we pay tribute to this heritage with a collection of empirical and theoretical papers on amodal completion and the perception of causality, two areas of research within which Michotte's work and ideas have had a lasting influence. As background to better understand the remaining papers, we briefly sketch Michotte's life and work and the scope (in breadth and in depth) of his impact. We then review Michotte's seminal contributions to the areas covered in this special issue, some of the major research discoveries and themes in the intervening decades, and the major open questions and challenges we are still facing. We also include a sneak preview of the papers in this special issue, noting how they relate to Michotte's work and to each other. This review shows both how much influence Michotte has had on contemporary perception and cognition research, and how much important work remains to be done. We hope that the papers in this special issue will serve both to celebrate Michotte's heritage in this respect, and to inspire other investigators to continue the projects he began.
Flombaum, J. I., & Scholl, B. J. (2006). A temporal same-object advantage in the tunnel effect: Facilitated change detection for persisting objects. Journal of Experimental Psychology: Human Perception & Performance, 32(4), 840-853.
Meaningful visual experience requires computations that identify objects as the same persisting individuals over time, motion, occlusion, and featural change. Here we explore these computations in the tunnel effect: When an object moves behind an occluder, and then an object later emerges following a consistent trajectory, we irresistibly perceive a persisting object, even when the pre- and post-occlusion views contrast featurally. We introduce a new change-detection method for quantifying percepts of the tunnel effect. Observers had to detect color changes in displays where several objects oscillated behind occluders and occasionally changed color. Across comparisons with several types of spatiotemporal gaps, as well as manipulations of occlusion vs. implosion, performance was better when objects' kinematics gave the impression of a persisting individual. The results reveal a temporal same-object advantage: better change detection across temporal scene fragments bound into the same persisting object representations. This suggests that persisting objects are the underlying units of visual memory.
Cheries, E. W., Wynn, K., & Scholl, B. J. (2006). Interrupting infants' persisting object representations: An object-based limit? Developmental Science, 9(5), F50-F58.
Making sense of the visual world requires keeping track of objects as the same persisting individuals over time and occlusion. Here we implement a new paradigm using 10-month-old infants to explore the processes and representations that support this ability in two ways. First, we demonstrate that persisting object representations can be maintained over brief interruptions from additional independent events -- just as a memory of a traffic scene may be maintained through a brief glance in the rearview mirror. Second, we demonstrate that this ability is nevertheless subject to an object-based limit: if an interrupting event involves enough objects (carefully controlling for overall salience), then it will impair the maintenance of other persisting object representations even though it is an independent event. These experiments demonstrate how object representations can be studied via their 'interruptibility', and the results are consistent with the idea that infants' persisting object representations are constructed and maintained by capacity-limited mid-level 'object-files'.
Noles, N., Scholl, B. J., & Mitroff, S. R. (2005). The persistence of object file representations. Perception & Psychophysics, 67(2), 324-334.
Coherent visual experience of dynamic scenes requires not only that the visual system segment scenes into component objects, but that these object representations persist, so that an object can be identified as the same object from an earlier time. Object files (OFs) are visual representations thought to mediate such abilities: OFs lie between lower-level sensory processing and higher-level recognition, and track salient objects over time and motion. OFs have traditionally been studied via 'object-specific preview benefits' (OSPBs): discriminations of an object's features are speeded when an earlier preview of those features occurs on the same object, as opposed to on a different object, beyond general display-wide priming. Despite the popularity of the OF framework, many of its fundamental aspects remain unexplored. For example, though OFs are thought to be involved primarily in online visual processing, we do not know how long such representations persist: previous studies found OSPBs for up to 1500 ms, but did not test longer durations. We explored this issue using a modified 'object-reviewing' paradigm, and found that robust OSPBs persist for more than 5 times as long as previously tested values -- at least 8 s -- and possibly much longer. Object files may be the 'glue' which makes visual experience coherent not just in online moment-by-moment processing, but on the scale of seconds which characterizes our everyday perceptual experiences. These findings also bear on research in infant cognition, where OFs are thought to explain infants' abilities to track and enumerate small sets of objects over longer durations.
Marino, A. C., & Scholl, B. J. (2005). The role of closure in defining the 'objects' of object-based attention. Perception & Psychophysics, 67(7), 1140-1149.
Many recent studies have concluded that the underlying units of visual attention are often discrete objects whose boundaries constrain the allocation of attention. However, relatively few studies have explored the particular stimulus cues that determine what counts as an 'object' of attention. Here we explore this issue in the context of the 'two-rectangles' stimuli previously used by many investigators. We first show, using both spatial cueing and divided-attention paradigms, that same-object advantages occur even when the ends of the two rectangles are not drawn. This is consistent with previous reports which have emphasized the importance of individual contours in guiding attention, and our study shows that such effects can occur in displays which also contain grouping cues. In our divided-attention experiment, however, this contour-driven same-object advantage was significantly weaker than that obtained with the standard stimulus, with the added cue of closure -- demonstrating that contour-based processes are not the whole story. These results confirm and extend the observation that same-object advantages can be observed even without full-fledged 'objects'. At the same time, however, these studies show that boundary closure -- one of the most important cues to objecthood per se -- can directly influence attention. We conclude that object-based attention is not an all-or-nothing phenomenon: object-based effects can be independently strengthened or weakened by multiple cues to objecthood.
Mitroff, S. R., Scholl, B. J., & Wynn, K. (2005). The relationship between object files and conscious perception. Cognition, 96, 67-92.
Many aspects of visual perception appear to operate on the basis of representations which precede identification and recognition, but in which discrete objects are segmented from the background and tracked over time (unlike early sensory representations). It has become increasingly common to discuss such phenomena in terms of 'object files' (OFs) -- hypothesized mid-level representations which mediate our conscious perception of persisting objects -- e.g. telling us 'which went where'. Despite the appeal of the OF framework, no previous research has directly explored whether OFs do indeed correspond to conscious percepts. Here we present at least one case wherein conscious percepts of 'which went where' in dynamic ambiguous displays diverge from the analogous correspondence computed by the OF system. Observers viewed a 'bouncing/streaming' display in which two identical objects moved such that they could have either bounced off or streamed past each other. We measured two dependent variables: (1) an explicit report of perceived bouncing or streaming; and (2) an implicit 'object-specific preview benefit' (OSPB), wherein a 'preview' of information on a specific object speeds the recognition of that information at a later point when it appears again on the same object (compared to when it reappears on a different object), beyond display-wide priming. When the displays were manipulated such that observers had a strong bias to perceive streaming (on over 95% of the trials), there was nevertheless a strong OSPB in the opposite direction -- such that the object files appeared to have 'bounced' even though the percept 'streamed'. Given that OSPBs have been taken as a hallmark of the operation of object files, the five experiments reported here suggest that in at least some specialized (and perhaps ecologically invalid) cases, conscious percepts of 'which went where' in dynamic ambiguous displays can diverge from the mapping computed by the object-file system.
Scholl, B. J. (2005). Innateness and (Bayesian) visual perception. In P. Carruthers, S. Laurence, & S. Stich (Eds.), The innate mind: Structure and contents (pp. 34-52). Oxford University Press.
Because innateness is such a complex and controversial issue when applied to higher-level cognition, it can be useful to explore how nature and nurture interact in simpler, less controversial contexts. One such context, perhaps, is the study of certain aspects of visual perception -- where especially rigorous models are possible, and where it is less controversial to claim that certain aspects of the visual system are in part innately specified. The hope is that scrutiny of these simpler contexts might yield lessons which can then be applied to debates about the possible innateness of other aspects of the mind. This chapter will explore a particular way in which visual processing appears to involve innate constraints, and will attempt to show how such processing overcomes one enduring challenge to nativism. In particular, many challenges to nativist theories in other areas of cognitive psychology (e.g. 'theory of mind', infant cognition) have focused on the later development of such abilities, and have argued that such development is in conflict with innate origins (since those origins would have to be somehow changed or overwritten). Innateness, in these contexts, is seen as anti-developmental, associated instead with static processes and principles. In contrast, certain perceptual models demonstrate how the very same mental processes can both be innately specified and yet develop richly in response to experience with the environment. In fact, this process is entirely unmysterious, as is made clear in certain formal theories of visual perception, including those which appeal to spontaneous endogenous stimulation, and those based on Bayesian inference.
Mitroff, S. R., & Scholl, B. J. (2005). Forming and updating object representations without awareness: Evidence from motion-induced blindness. Vision Research, 45(8), 961-967.
The input to visual processing consists of an undifferentiated array of features which must be parsed into discrete units. Here we explore the degree to which conscious awareness is important for forming such object representations, and for updating them in the face of changing visual scenes. We do so by exploiting the phenomenon of motion-induced blindness (MIB), wherein salient (and even attended) objects fluctuate into and out of conscious awareness when superimposed onto certain global motion patterns. By introducing changes to unseen visual stimuli during MIB, we demonstrate that object representations can be formed and updated even without conscious access to those objects. Such changes can then influence not only how stimuli reenter awareness, but also what reenters awareness. We demonstrate that this processing encompasses simple object representations and also several independent Gestalt grouping cues. We conclude that flexible visual parsing over time and visual change can occur even without conscious perception. Methodologically, we conclude that MIB may be an especially useful tool for studying the role of awareness in visual processing and vice versa.
Endress, A. D., Scholl, B. J., & Mehler, J. (2005). The role of salience in the extraction of algebraic rules. Journal of Experimental Psychology: General, 134(3), 406-419.
Recent research has alleged that humans and other animals have sophisticated abilities to extract both statistical dependencies and rule-based regularities from sequences. Most of this work has demonstrated the surprising flexibility and generality of such processes, but an equally important project is to explore their limits. Here we take up this project in the context of rule-based generalizations. As a case study, we demonstrate that only repetition-based structures with repetitions at the edges of sequences (e.g., ABCDEFF but not ABCDDEF) can be reliably generalized in certain situations, even though token-repetitions can easily be discriminated both at the edges and in the middles of sequences. This suggests important new limits on rule-based sequence learning, and suggests new interpretations of earlier work alleging rule-based generalization in infants. Rather than implementing a formal process of extracting regularities which operates over all patterns equally well (as in a digital computer), rule-based learning may be a highly constrained and piecemeal process driven by 'perceptual primitives' -- type-operations that are highly sensitive to perceptual factors.
Turk-Browne, N. B., Junge, J. A., & Scholl, B. J. (2005). The automaticity of visual statistical learning. Journal of Experimental Psychology: General, 134(4), 552-564.
Our visual environment contains massive amounts of information involving the relations between objects in space and time, and recent studies of visual statistical learning (VSL) have suggested that this information can be automatically extracted by the visual system. Here we explore the automaticity of VSL in several ways, using both explicit familiarity and implicit response-time measures. We demonstrate that (a) the input to VSL is gated by selective attention; (b) VSL is nevertheless an implicit process, since it operates during a cover task and without awareness of the underlying statistical patterns; and (c) VSL constructs abstracted representations which are then invariant to changes in extraneous surface features. We conclude that VSL both is and is not automatic: it requires attention to select the relevant population of stimuli, but the resulting learning then occurs without intent or awareness.
Alvarez, G. A., & Scholl, B. J. (2005). How does attention select and track spatially extended objects?: New effects of attentional concentration and amplification. Journal of Experimental Psychology: General, 134(4), 461-476.
Real-world situations involve attending to spatially extended objects, often under conditions of motion and high processing load. We explored such processing by requiring observers to attentionally track a number of long, moving lines. Concurrently, observers responded to sporadic probes, as a measure of the distribution of attention across lines. We found that attention is concentrated at the centers of lines during tracking, despite their uniformity. Surprisingly, this center-advantage grew as the lines became longer: not only did observers get worse near the endpoints, but they became better at the lines' centers -- as if attention became more concentrated as the objects became more extended. These results begin to show how attention is flexibly allocated in online visual processing within extended dynamic objects.
Most, S. B., Scholl, B. J., Clifford, E., & Simons, D. J. (2005). What you see is what you set: Sustained inattentional blindness and the capture of awareness. Psychological Review, 112(1), 217-242.
This paper reports a theoretical and experimental attempt to relate and contrast two traditionally separate research programs: inattentional blindness and attention capture. Inattentional blindness refers to a common phenomenon in which people fail to notice unexpected objects and events when their attention is otherwise engaged. Insights about the roles of visual properties and top-down attentional set can be drawn from research on attention capture, which traditionally employs implicit indices (e.g., response times) to investigate automatic shifts of attention. Yet, because attention capture research usually measures performance, whereas inattentional blindness research measures awareness, the two fields have existed side by side with no shared theoretical framework. Constructing that framework requires that insights from one literature be related to and tested in the other, and here we set this process in motion. We propose a theoretical unification influenced by a "perceptual cycle model" (Neisser, 1976), and we then systematically adapt several of the most important effects from the attention capture literature to the context of sustained inattentional blindness experiments. Although some stimulus properties can influence noticing of unexpected objects, the most influential factor affecting noticing is a person's own attentional goals. We conclude that many -- but not all -- aspects of attention capture also apply to the capture of awareness, but that these two classes of phenomena remain importantly distinct.
Scholl, B. J., Simons, D. J., & Levin, D. T. (2004). 'Change blindness' blindness: An implicit measure of a metacognitive error. In D. T. Levin (Ed.), Thinking and seeing: Visual metacognition in adults and children (pp. 145-164). Cambridge, MA: MIT Press.
Most people have strong but mistaken intuitions about how perception and cognition work. Such intuitions can give rise to especially pernicious 'metacognitive errors', which are directly fueled by visual experience. Here we explore one such metacognitive error, which infects our intuitions about visual awareness and the perception of change. In 'change blindness' studies, observers fail to notice large changes made to displays when they are viewing, but typically not attending to, the changed regions. This phenomenon has been the focus of much recent research, largely because it is so surprising: people vastly overestimate their change detection ability. Here we demonstrate and quantify an implicit effect of this metacognitive error, and explore some of the factors which mediate it. Observers viewed an original and a changed photograph which repeatedly alternated, separated by a brief blank interval. They were told that the change could be added to the 'flickering' at any time. In reality the change was added either immediately (Experiment 1) or after 4 s (Experiment 2). Upon detecting the change, observers were informed of their response time and were then asked to estimate when the change had been added. Observers underestimated the degree to which they were 'change blind', typically inferring that the change had been added much later than it actually was. Average estimates ranged up to 31 s after the 'flickering' began -- over 85 times the correct value. Such effects were further magnified in an additional study (Experiment 3) which employed natural scenes and changes specifically designed to induce a high degree of this 'change blindness blindness' (CBB). These studies collectively demonstrate that CBB can persist across many trials in an actual change-detection task and provide a new way to quantify and explore the factors which mediate CBB. This research highlights the extent to which we can overestimate the fidelity of some aspects of visual processing.
Choi, H., & Scholl, B. J. (2004). Effects of grouping and attention on the perception of causality. Perception & Psychophysics, 66(6), 926-942.
Beyond perceiving patterns of motion in simple dynamic displays, we can also perceive higher-level properties such as causality, as when we see one object collide with another object. Though causality is a seemingly high-level property, its perception -- like the perception of faces or speech -- often appears to be automatic, irresistible, and driven by highly constrained and stimulus-driven rules. Here, in an exploration of such rules, we demonstrate that perceptual grouping and attention can both influence the perception of causality in ambiguous displays. We first report several types of grouping effects, based on connectedness, proximity, and common motion. We further suggest that such grouping effects are mediated by the allocation of attention, and we directly demonstrate that causal perception can be strengthened or attenuated based on where observers are attending, independent of fixation. Like Michotte, we find that the perception of causality is mediated by strict visual 'rules'. Beyond Michotte, we find that these rules operate not only over discrete objects, but also over perceptual groups, constrained by the allocation of attention.
Mitroff, S. R., & Scholl, B. J. (2004). Seeing the disappearance of unseen objects. Perception, 33(10), 1267-1273.
Because of the massive amount of incoming visual information, perception is fundamentally selective. We are aware of only a small subset of our visual input at any given moment, and a great deal of activity can occur right in front of our eyes without reaching awareness. While previous work has shown that even salient visual objects can go unseen, here we demonstrate the opposite pattern, wherein observers perceive stimuli which are not physically present. In particular, we show in two motion-induced blindness experiments that unseen objects can momentarily re-enter awareness when they physically disappear: in some situations, you can see the disappearance of something you can't see. Moreover, when a stimulus changes outside of awareness in this situation and then physically disappears, observers momentarily see the altered version -- thus perceiving properties of an object that they had never seen before, after that object is already gone. This phenomenon of 'perceptual reentry' yields new insights into the relationship between visual memory and conscious awareness.
Scholl, B. J., & Nakayama, K. (2004). Illusory causal crescents: Misperceived spatial relations due to perceived causality. Perception, 33(4), 455-469.
When an object A moves toward an object B until they are adjacent, at which point A stops and B starts moving, we often see a collision -- i.e. we see A as the cause of B's motion. Many studies have explored the spatiotemporal parameters which mediate the perception of causality, but this work is seldom related to other aspects of perception. Here we report a novel illusion, wherein the perception of causality affects the perceived spatial relations among two objects involved in a collision event: observers systematically underestimate the amount of overlap between two items in an event which is seen as a causal collision. This occurs even when the causal nature of the event is induced by a surrounding context, such that estimates of the amount of overlap in the very same event are much improved when the event is displayed in isolation, without a 'causal' interpretation. This illusion implies that the perception of causality does not proceed completely independently of other visual processes, but can affect the perception of other spatial properties.
Flombaum, J. I., Kundey, S. M., Santos, L. R., & Scholl, B. J. (2004). Dynamic object individuation in rhesus macaques: A study of the tunnel effect. Psychological Science, 15(12), 795-800.
A manual search experiment with rhesus monkeys (Macaca mulatta) explored dynamic object individuation in the tunnel effect: Subjects watched as a lemon rolled down a ramp and came to rest behind a tunnel (Occluder 1), and then as a kiwi emerged and became occluded at the end of its path behind a screen (Occluder 2). When the kiwi emerged at about the time that the lemon should have (had it continued its motion), subjects searched for food only behind Occluder 2 -- apparently perceiving the lemon transform into a kiwi on the basis of spatiotemporally continuous motion. In contrast, when a brief pause interrupted the occlusion of the lemon and the emergence of the kiwi, monkeys searched for food behind both occluders. With further control conditions, this demonstrates a spatiotemporal bias -- similar to work in adult visual perception -- in the computation of object persistence when faced with a dynamic correspondence problem.
Mitroff, S. R., Scholl, B. J., & Wynn, K. (2004). Divide and conquer: How object files adapt when a persisting object splits into two. Psychological Science, 15(6), 420-425.
Coherent visual experience requires not only segmenting incoming visual input into a structured scene of objects, but also binding discrete views of objects into dynamic representations which persist across time and motion. However, surprisingly little work has explored the principles which guide the construction and maintenance of such persisting object representations. What causes a part of the visual field to be treated as the same object over time? In the cognitive development literature, a key principle of object persistence is cohesion: objects must always maintain a single bounded contour. Here we demonstrate for the first time that mechanisms of adult mid-level vision are also affected by cohesion violations. Using the 'object file' framework, we test whether object-specific preview benefits (OSPBs) -- a hallmark of persisting object representations -- are obtained for dynamic objects which split into two during their motion. We find that OSPBs do not fully persist through such cohesion violations without incurring significant performance costs. These results illustrate for the first time how cohesion is employed as a constraint which guides the maintenance of object representations in adult mid-level vision.
vanMarle, K., & Scholl, B. J. (2003). Attentive tracking of objects vs. substances. Psychological Science, 14(5), 498-504.
Recent research in vision science, infant cognition, and word-learning suggests a special role for the processing of individual discrete objects. But what counts as an object? Answers to this question often depend on contrasting object-based processing with the processing of spatial areas, or unbound visual features. In infant cognition and word-learning, though, another salient contrast has been between rigid cohesive objects and nonsolid substances. Whereas objects may move from one location to another, a nonsolid substance must pour from one location to another. Here we explore whether attentive tracking processes are sensitive to dynamic information of this type. Using a multiple-object tracking task, we show that subjects can easily track 4 in 8 identical unpredictably-moving entities which move as discrete objects from one location to another, but cannot track similar entities which noncohesively 'pour' from one location to another -- even when the items in both conditions follow the same trajectories at the same speeds. Other conditions reveal that the inability to track multiple 'substances' stems not from the violations of rigidity or cohesiveness per se, since subjects are able to track multiple non-cohesive collections and multiple non-rigid deforming objects. Rather, the impairment is due to the dynamic extension and contraction during the 'substance-like' motion, which render 'the' location of the entity ambiguous. These results demonstrate a convergence between processes of mid-level adult vision and infant cognition, and in general help to clarify what can count as a persisting dynamic 'object' of attention.
Scholl, B. J., & Nakayama, K. (2002). Causal capture: Contextual effects on the perception of collision events. Psychological Science, 13(6), 493-498.
In addition to perceiving the colors, shapes, and motions of objects, we can also perceive higher-level properties of visual events. One such property is causation, as when you see one object collide with another and cause it to move. We report a striking new type of contextual effect on the perception of such collision events. Consider an object (A) which moves toward a stationary object (B) until they are adjacent, at which point A stops and B starts moving along the same path. Such 'launches' are perceived in terms beyond these kinematics: as noted in Michotte's classic studies, we see A cause B's motion. When A and B fully overlap before B's motion, however, observers typically see a completely non-causal 'pass', despite salient featural differences: one shape remains stationary while another passes over it. In the presence of a distinct nearby launch event, however, this stimulus is 'captured': it too is now irresistibly seen as causal. This contextual capture requires that the context event be present for only 50 ms surrounding the 'impact', but is destroyed by only 200 ms of temporal asynchrony. We report such cases, and others, that help define the rules which the visual system uses to construct percepts of seemingly high-level properties like causation.
Scholl, B. J., & Simons, D. J. (2001). Change blindness, Gibson, and the sensorimotor theory of vision. [Commentary] Behavioral and Brain Sciences, 24(5), 1004-1005.
We suggest that the sensorimotor 'theory' of vision is really an unstructured collection of separate ideas, and that much of the evidence cited in its favor at best supports only a subset of these ideas. As an example, we note that work on 'change blindness' does not "vindicate" (or even speak to) much of the sensorimotor framework. Moreover, the ideas themselves are not always internally consistent. Finally, the proposed framework draws on ideas initially espoused by James Gibson, but does little to differentiate itself from those earlier views. For even part of this framework to become testable, it must specify which sources of evidence can support or contradict each of the component hypotheses.
Most, S. B., Simons, D. J., Scholl, B. J., Jimenez, R., Clifford, E., & Chabris, C. F. (2001). How not to be seen: The contribution of similarity and selective ignoring to sustained inattentional blindness. Psychological Science, 12(1), 9-17.
When people attend to objects or events in a visual display, they often fail to notice an additional, unexpected, but fully visible object or event in the same display. This phenomenon is now known as "inattentional blindness." We present a new approach to the study of sustained inattentional blindness for dynamic events, in order to explore the roles of similarity, distinctiveness, and attentional set on the detection of unexpected objects. In Experiment 1, we find that the similarity of an unexpected object to other objects in the display influences attentional capture: the more similar an unexpected object is to the attended items, and the greater its difference from the ignored items, the more likely it is that people will notice it. Experiment 2 explores whether this effect of similarity is driven by selective ignoring of irrelevant items or by selective focusing on attended items. Experiment 3 suggests that the distinctiveness of the unexpected object alone cannot entirely account for the similarity effects described in the first two experiments: when attending to black items or white items in a dynamic display, nearly 30 percent of observers fail to notice a bright red cross move across the display, even though it has a unique color, luminance, shape, and motion trajectory and is visible for 5 seconds. Together, the results suggest that inattentional blindness for ongoing dynamic events depends both on the similarity of the unexpected object to the other objects in the display and on the observer's attentional set.
Scholl, B. J., Pylyshyn, Z. W., & Feldman, J. (2001). What is a visual object? Evidence from target merging in multiple object tracking. Cognition, 80(1/2), 159-177.
The notion that visual attention can operate over visual objects in addition to spatial locations has recently received much empirical support, but there has been relatively little empirical consideration of what can count as an 'object' in the first place. We have investigated this question in the context of the multiple object tracking paradigm, in which subjects must track a number of independently and unpredictably moving identical items in a field of identical distractors. What types of feature clusters can be tracked in this manner? In other words, what counts as an 'object' in this task? We investigated this question with a technique we call target merging: we alter tracking displays so that distinct target and distractor locations appear perceptually to be parts of the same object, by merging pairs of items (one target with one distractor) in various ways -- for example by connecting item locations with a simple line segment, by drawing the convex hull of the two items, and so forth. The data show that target merging makes the tracking task far more difficult, to varying degrees depending on exactly how the items are merged. The effect is perceptually salient, involving in some conditions a total destruction of subjects' capacity to track multiple items. These studies provide strong evidence for the object-based nature of tracking, confirming that in some contexts attention must be allocated to objects rather than arbitrary collections of features. In addition, the results begin to reveal the types of spatially organized scene components that can be independently attended, as a function of properties such as connectedness, part structure, and other types of perceptual grouping.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80(1/2), 1-46.
What are the units of attention? In addition to standard models holding that attention can select spatial regions and visual features, recent work suggests that in some cases attention can directly select discrete objects. This paper reviews the state of the art with regard to such 'object-based' attention, and explores how objects of attention relate to locations, reference frames, perceptual groups, surfaces, parts, and features. Also discussed are the dynamic aspects of objecthood, including the question of how attended objects are individuated in time, and the possibility of attending to simple dynamic motions and events. The final sections of this review generalize these issues beyond vision science, to other modalities and fields such as auditory objects of attention and the infant's 'object concept'.
Most, S. B., Simons, D. J., Scholl, B. J., & Chabris, C. F. (2000). Sustained inattentional blindness: The role of location in the detection of unexpected dynamic events. Psyche, 6(14).
Attempts to understand visual attention have produced models based on location, in which attention selects particular regions of space, and models based on other visual attributes (e.g., in which attention selects discrete objects or specific features). Previous studies of inattentional blindness have contributed to our understanding of attention by suggesting that the detection of an unexpected object depends on the distance of that object from the spatial focus of attention. When the distance of a briefly flashed object from both fixation and the focus of attention is systematically varied, detection appears to have a location-based component. However, the likelihood that people will detect an unexpected event in sustained and dynamic displays may depend on more than just spatial location. We investigated the influence of spatial location on inattentional blindness under precisely controlled, sustained and dynamic conditions. We found that although location-based models cannot fully account for the detection of unexpected objects, spatial location does play a role even when displays are visible for an extended period.
Scholl, B. J. (2000). Attenuated change blindness for exogenously attended items in a flicker paradigm. Visual Cognition, 7(1/2/3), 377-396.
When two scenes are alternately displayed, separated by a mask, even large, repeated changes between the scenes often go unnoticed for surprisingly long durations. Change blindness of this sort is attenuated at "centers of interest" in the scenes, however, supporting a theory of change blindness in which attention is necessary to perceive such changes (Rensink, O'Regan, & Clark, 1997). Problems with this measure of attentional selection -- via verbally described 'centers of interest' -- are discussed, including worries about describability and explanatory impotence. Other forms of attentional selection, not subject to these problems, are employed in a 'flicker' experiment to test the attention-based theory of change detection. Attenuated change blindness is observed at attended items when attentional selection is realized via involuntary exogenous capture of visual attention -- to late-onset items and color singletons -- even when these manipulations are uncorrelated with the loci of the changes, and are thus irrelevant to the change detection task. These demonstrations ground the attention-based theory of change blindness in a type of attentional selection which is understood more rigorously than are 'centers of interest'. At the same time, these results have important implications concerning the nature of exogenous attentional capture.
Scholl, B. J., & Tremoulet, P. (2000). Perceptual causality and animacy. Trends in Cognitive Sciences, 4(8), 299-309.
Certain simple visual displays consisting of moving two-dimensional geometric shapes can give rise to percepts with high-level properties such as causality and animacy. This article reviews recent research on such phenomena, which began with the classic work of Michotte and of Heider and Simmel. The importance of such phenomena stems in part from the fact that these interpretations seem to be largely perceptual in nature -- to be fairly fast, automatic, irresistible, and highly stimulus-driven -- despite the fact that they involve impressions typically associated with higher-level cognitive processing. This research suggests that just as the visual system works to recover the physical structure of the world by inferring properties such as three-dimensional shape, so too does it work to recover the causal and social structure of the world by inferring properties such as causality and animacy.
Scholl, B. J., & Leslie, A. M. (1999a). Modularity, development, and 'Theory of Mind'. Mind & Language, 14(1), 131-153.
Psychologists and philosophers have recently been exploring whether the mechanisms which underlie the acquisition of 'theory of mind' (ToM) are best characterized as cognitive modules or as developing theories. In this paper, we attempt to clarify what a modular account of ToM entails, and why it is an attractive type of explanation. Intuitions and arguments in this debate often turn on the role of development: traditional research on ToM focuses on various developmental sequences, whereas cognitive modules are thought to be static and 'anti-developmental'. We suggest that this mistaken view relies on an overly-limited notion of modularity, and we explore how ToM might be grounded in a cognitive module and yet still afford development. Modules must 'come online', and even fully-developed modules may still develop internally, based on their constrained input. We make these points concrete by focusing on a recent proposal to capture the development of ToM in a module via parameterization.
Scholl, B. J., & Pylyshyn, Z. W. (1999). Tracking multiple items through occlusion: Clues to visual objecthood. Cognitive Psychology, 38, 259-290.
In three experiments, subjects attempted to track multiple items as they moved independently and unpredictably about a display. Performance was not impaired when the items were briefly (but completely) occluded at various times during their motion, suggesting that occlusion is taken into account when computing enduring perceptual objecthood. Unimpaired performance required the presence of accretion and deletion cues along fixed contours at the occluding boundaries. Performance was significantly impaired when items were present on the visual field at the same times and to the same degrees as in the occlusion conditions, but disappeared and reappeared in ways which did not implicate the presence of occluding surfaces (e.g. by imploding and exploding into and out of existence, instead of accreting and deleting along a fixed contour). Unimpaired performance did not require visible occluders (i.e. Michotte's tunnel effect) or globally consistent occluder positions. We discuss implications of these results for theories of objecthood in visual attention.
Leslie, A. M., Xu, F., Tremoulet, P. D., & Scholl, B. J. (1998). Indexing and the object concept: Developing 'what' and 'where' systems. Trends in Cognitive Sciences, 2(1), 10-18.
The study of object cognition over the last twenty-five years has proceeded in two largely non-interacting camps. One camp has studied object-based visual attention in adults, while the other has studied the object concept in infants. We briefly review both literatures and distill from the adult research a theoretical model that we apply to findings from the study of infancy. The key notion in our model of object representation is the 'sticky' index, a mechanism of selective attention that points at a physical object in a location. An object index does not represent any of the properties of the entity at which it points. However, once an index is pointing to an object, the properties of that object can be examined and featural information can be associated with or 'bound' to its index. The distinction between indexing and feature binding underwrites the distinction between object individuation and object identification, a distinction that turns out to be crucial in both the adult attention and the infant object concept literatures. By developing the indexing model, we draw together two disparate literatures and suggest new ways to study object-based attention in infancy.
Drop me a note (firstname.lastname@example.org) if you'd like a hardcopy of any of these papers.