Set Size, Clutter & Complexity
1 Set Size, Clutter & Complexity. A review of simple quantities. Aude Oliva. "I think the next century will be the century of complexity." (Stephen Hawking)
2 To be complex or not to be complex. Connotations of "complex": unfamiliar, character, mathematics, transparency, camouflage, stupidity, number of objects, number of elements.
3 Definitions. Set size: increases with the number of objects in the display. Clutter: increases with the number of features or objects and their variance. Complexity: increases with the number of distinguishable parts and the number of connections between these parts. Two domains of interest: image memory and visual search.
4 Role of background complexity on visual search performance: the interaction of set size x complexity. Wolfe, J.M., Oliva, A., Horowitz, T.S., Butcher, S., & Bompas, A. (2002). Segmentation of Objects from Backgrounds in Visual Search Tasks. Vision Research, 42.
5 Visual Search in Complex Background. Classical visual search vs. visual search against a background. Complexity manipulations: clutter, junctions, camouflage, scaling of the object. [Figure: RT vs. set size (# of items); the slope (e.g. 40 msec/item) indexes search efficiency, the intercept the fixed cost.] Wolfe, Oliva, Horowitz et al. (2002)
6 Visual Search in Complex Background. Hypothesis 1: each object must be separately extracted from the background, so increasing background complexity should add a cost for each item examined (different slope). Hypothesis 2: an initial separation mechanism. A single operation separates possible target objects from the background, and then search is performed on a display with the same set size (complexity may add additional candidate objects). Separating more complex backgrounds may take longer (same slope, higher intercept). [Figure: RT vs. set size (# of items), complex vs. simple backgrounds under each hypothesis.]
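The two hypotheses make different predictions for the standard linear model of search, RT = intercept + slope x set size. A minimal sketch (illustrative numbers, not the experiment's data) shows how fitting that line tells them apart:

```python
import numpy as np

# Toy sketch: under Hypothesis 1 a complex background adds a per-item
# cost (steeper slope); under Hypothesis 2 it adds a one-time separation
# cost (higher intercept). All values here are made up for illustration.
set_sizes = np.array([1, 4, 7, 10])

base_rt, slope = 500.0, 40.0                  # msec, msec/item
rt_simple = base_rt + slope * set_sizes

rt_h1 = base_rt + (slope + 20.0) * set_sizes  # per-item cost: slope changes
rt_h2 = (base_rt + 80.0) + slope * set_sizes  # one-time cost: intercept changes

# Fitting RT against set size recovers which parameter the background affected.
s1, i1 = np.polyfit(set_sizes, rt_h1, 1)
s2, i2 = np.polyfit(set_sizes, rt_h2, 1)
print(s1, i1)   # slope up, intercept unchanged
print(s2, i2)   # slope unchanged, intercept up
```

This is the logic behind reading the data: a flat slope with a raised intercept across backgrounds is the signature of the one-time clean-up step.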
7 Effect of background on search. Background and objects are separated initially in a pre-attentive step, and the search then operates on a subset of candidate items. Example: search for a T among Ls. Initial percept, separation stage, search stage. Wolfe, Oliva, Horowitz et al. (2002)
8 Experiment 1: Search among cluttered desks. [Figure: RT (msec) vs. set size.] Empty desk (31 msec/item), simple desk (37 msec/item), messy desk (35 msec/item). Clutter produces an additive RT cost.
9 Experiment 2: The Walls. Conditions: T junctions; broken T junctions; T junctions with line-terminator control; no junctions, no vertical; X junctions; broken X junctions; X junctions with line-terminator control; blank square control.
10 Search among junction walls. Slope = 17 msec, mean = 562 msec; slope = 16 msec, mean = 580 msec; slope = 17 msec, mean = 587 msec; slope = 21 msec, mean = 618 msec; slope = 18 msec, mean = 628 msec; slope = 24 msec, mean = 637 msec; slope = 22 msec, mean = 596 msec; slope = 21 msec, mean = 664 msec. The efficiency of search (slope) is not affected by background complexity; the complexity produces an additive RT cost. Results favor the clean-up hypothesis.
11 Experiment 3: Camouflaged target. Purpose: to systematically vary the similarity of target and background spatial frequency (SF) content. Method: backgrounds were textures composed of the same spatial frequencies as the target, of a lower SF component (0.5x and 0.125x), or of a higher SF (2x and 8x). We plot relative log SF from low to high SF (-0.9, -0.3, 0, 0.3, 0.9 relative log units). Participants performed two search tasks: searching for one target or searching for two targets. Set sizes were 1, 4, 7, 10 items. Logic: if clean-up is done only once, then the cost of the background will be similar for the 1T and 2T tasks. All the backgrounds had the same histogram (a Gaussian distribution of gray levels). The targets and distractors were of different contrasts relative to the background mean: 3 stdev (easily discriminable), 2 stdev, and 1 stdev (almost camouflaged).
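As a quick check on the axis values, the relative log SF units quoted above follow directly from the frequency multiples of the target:

```python
import math

# The background SF conditions are multiples of the target frequency F:
# 0.125x, 0.5x, 1x, 2x, 8x. Expressed as relative log SF (base 10), they
# give the symmetric axis values on the slide: -0.9, -0.3, 0, 0.3, 0.9.
multiples = [0.125, 0.5, 1, 2, 8]
relative_log_sf = [round(math.log10(m), 1) for m in multiples]
print(relative_log_sf)  # [-0.9, -0.3, 0.0, 0.3, 0.9]
```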
12 Camouflaged search, from coarser to finer background scale. Hypothesis 1: clean up twice (once for T1, again for T2), so the background cost is incurred twice in the 2-target task. Hypothesis 2: clean up once, so the background cost is the same for 1-target and 2-target search. [Figure: relative RT vs. background frequency (low to high) for 1T and 2T search under each hypothesis.]
13 Experiment 3: Results. [Figure: relative mean RT (ms) vs. background frequency (log), for 1T and 2T search, target present and target absent.] Target present: slopes for 1 target = 44, 31, 51, 44, 37 msec/item; slopes for 2 targets = 80, 80, 80, 84, 80 msec/item (differences not significant). Target absent: slopes for 1 target = 99, 95, 84, 92, 102 msec/item; slopes for 2 targets = 140, 123, 122, 132, 145 msec/item (not significant). The efficiency of search (slope) is not affected by background complexity. Complexity produces an additive RT cost that depends on the spatial-frequency similarity between the target and the background. Results favor the clean-up hypothesis (#2).
14 Level of camouflage. [Figure: low-contrast and high-contrast targets embedded in backgrounds of frequency F.]
15 Exp 4: Complexity as a scaling of the target object. Backgrounds of frequency F/8 (complexity = log 0.125), F/4 (complexity = log 0.25), F (maximal complexity), and 2F (complexity = log 2), with color-cue stimuli. [Figure: RT (msec) and slope (msec/item) vs. background frequency (log), with and without a color cue.]
16 Background Complexity Representation. For the purposes of a search task, a precise representation of the structure of the background may not be relevant. However, background complexity affects the initial perceptual stage. Observers are able to quickly and effortlessly determine the mean size of a set of heterogeneous circles (Treisman and colleagues). Is the background scene, for the purposes of a search task, represented by a statistical summary of features?
17 Background Complexity Representation
18 Experiment 5: Examples of backgrounds (single patterns, with color-cue stimuli). Frequency F/8: 4 regions, complexity = log(4). Frequency F/4: 16 regions, complexity = log(16). Frequency F: 256 regions, complexity = log(256). Frequency 2F: 1024 regions, complexity = log(1024). Composed pattern = 1/2 Pattern(F/8) + 1/4 Pattern(F/4) + 1/4 Pattern(F); complexity = 1/2 log(4) + 1/4 log(16) + 1/4 log(256) = log(16), i.e. equivalent to a single 16-region pattern. These two patterns have the same level of complexity with regard to the target.
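The arithmetic behind the equivalence can be checked directly; it holds in any log base (base 2 used here):

```python
import math

# Check of the slide's arithmetic: the complexity of a composed pattern
# is the region-weighted average of the single patterns' log complexities.
# 1/2 * log(4) + 1/4 * log(16) + 1/4 * log(256) should equal log(16).
weights = [0.5, 0.25, 0.25]
regions = [4, 16, 256]
composed = sum(w * math.log2(n) for w, n in zip(weights, regions))
print(composed, math.log2(16))  # 4.0 4.0 -> equivalent to 16 regions
```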
19 Statistical summary of background complexity. [Figure: RT (msec) vs. background frequency (log) for composed patterns, observed vs. predicted RT, with and without a color cue.] Predicted RT = 0.5 x RT(pattern of complexity 256) + 0.25 x RT(pattern of complexity 1024) + 0.25 x RT(pattern of complexity 16384), with weights following the regions' area proportions. Visual search performance on the composed backgrounds can be predicted from the weighted average of the single-region backgrounds.
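The prediction rule can be sketched as a weighted average, assuming the area weights 1/2, 1/4, 1/4 from the composition and purely hypothetical single-pattern RTs (none of these numbers come from the experiment):

```python
# Sketch of the prediction rule: predicted RT on a composed background =
# area-weighted average of RTs measured on the single-pattern backgrounds.
weights = {256: 0.5, 1024: 0.25, 16384: 0.25}        # region count -> area weight
single_rt = {256: 700.0, 1024: 760.0, 16384: 820.0}  # hypothetical msec values

predicted = sum(w * single_rt[k] for k, w in weights.items())
print(predicted)  # 745.0
```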
20 Search in background. A single operation separates possible target objects from the background; search then proceeds through the set of target objects, ignoring the background. [Table summarizing the manipulations (clutter, junctions, camouflage, target scaling) and their item/item cost.]
21 Conclusions. Observers can separate candidate targets from a complex background in a single preattentive step. Background information adds an additive RT cost at the beginning of the search. This initial separation mechanism makes sense in regard to a gist mechanism: a very fast computation of features over the whole image that corresponds to background information.
22 Definitions. Set size: increases with the number of objects in the display. Clutter: increases with the quantity of features or objects and their variance. Complexity: increases with the quantity of distinguishable parts and the quantity of connections between these parts. Two domains of interest: image memory and visual search.
23 Why study visual complexity? It is a paradox: when the parts of a complex are separated, or conceptualized as a whole, the valence of the complexity changes and the pattern becomes simpler (no crowding). There are almost no studies related to visual complexity in real-world scenes. It is "impossible" to characterize visual complexity because it is too complex. Scene gist does not care: the model of scene categorization (spatial envelope) is independent of the visual complexity of the image.
24 But really? Why? Scenes are composed of numerous objects, textures and colors, arranged in a variety of spatial layouts. However, scene categorization (e.g. street, kitchen, park), unlike other visual processes (e.g. search), seems to be unaffected by the level of visual complexity of a scene. [Figure: a conceptual space with dimensions such as ruggedness and openness, where a scene's neighbours include coast, landscape, forest, mountain; example description: large space (200 m), urban scenes, in perspective, busy.]
25 Spatial Envelope. A picture can be represented by a vector of length N corresponding to N perceptual properties of space (here N = 3: openness O, expansion Ex or ruggedness Rg, and roughness Rn, giving a vector such as (O_i, Ex_i, Rn_i) for image i). Each perceptual dimension corresponds to one axis of a multidimensional space, into which scenes with similar space properties are projected together.
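A minimal sketch of the idea, with made-up property values (the dimension names follow the slide; the numbers and scene set are illustrative): each scene is a point in the envelope space, and scenes with similar spatial structure end up as neighbours:

```python
import math

# Each scene as a vector of perceptual properties:
# (openness, expansion, roughness). Values are illustrative.
scenes = {
    "coast":   (0.9, 0.4, 0.2),
    "highway": (0.8, 0.9, 0.3),
    "forest":  (0.1, 0.3, 0.9),
    "street":  (0.4, 0.8, 0.5),
}

def nearest(name):
    """Closest other scene by Euclidean distance in envelope space."""
    q = scenes[name]
    return min((s for s in scenes if s != name),
               key=lambda s: math.dist(q, scenes[s]))

print(nearest("coast"))  # an open, smooth scene lands near other open scenes
```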
26 To be complex. From the Latin complexus = entwined, twisted together. In order to have a complex, you need 2 or more elements joined in such a way that it is difficult to separate them. Intuitively, an object is more complex if more parts can be distinguished and if more connections between them exist. Logically, more parts to be represented means more time to search among them or to compute them. The components of a complex cannot be separated without destroying it (by separation, you break the connections). The method of analysis, or decomposition into independent modules, cannot be used to simplify the modeling of a complex object. From Heylighen (1997)
27 Complexity is Variety. Intuitively, complex scenes should contain a larger variety of parts and surface styles, as well as more relationships between these regions, than do simpler scenes. The representation of visual complexity is likely to combine both levels of variety (parts and surface styles). Oliva et al. (2004)
28 complexity is a complex property Visual complexity may be a function of: - Variety of elements (contours, objects) - Variety of surface styles (textures, colors, materials) - Variety of surface modulators (shadows, light sources) - Variety of Symmetries - Variety of Spatial layout - Subjective experience (familiarity)
29 Image Regularities Simple scene Perceptually good Complex scene Perceptually bad
30 Image Regularities Simple scene Perceptually good Complex scene Perceptually bad Mirror symmetry
33 Image Regularities. Simple scene: perceptually good. Complex scene: perceptually bad. Let's consider toy examples of the simplest and the most complex scene (spatially-variant pattern).
34 Image Regularity. The good: an empty pattern. The bad: a random pattern. Mathematics defines simplicity as the degree to which an object can be faithfully compressed, meaning without loss of information (Feldman, 1997, 2004).
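The compression-based definition of simplicity can be sketched with an off-the-shelf lossless compressor: the "good" empty pattern compresses almost completely, while the "bad" random pattern barely compresses at all:

```python
import random
import zlib

# Simplicity as compressibility: the shorter the lossless encoding,
# the simpler the pattern.
random.seed(0)
empty = bytes(4096)                                    # all-zero "empty" pattern
noise = bytes(random.randrange(256) for _ in range(4096))  # "random" pattern

c_empty = len(zlib.compress(empty))
c_noise = len(zlib.compress(noise))
print(c_empty, c_noise)  # tiny vs. roughly the input size
```

Note the paradox the deck builds toward: by this mathematical definition the random pattern is maximally complex, yet perceptually both extremes can behave alike.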
35 Perceptual Regularity: Symmetry. The good: an empty pattern. The bad: a random pattern. No features, or many features displayed randomly, share similar perceptual regularities.
36 Perceptual Regularity: Symmetry. The good: an empty space. The bad: a random space. Empty and random spaces are maximally symmetric, not in the absolute position of the parts, but in the probability that a part will be found at a particular location. The essence of symmetry: one part is sufficient to reconstruct the whole.
37 Perceptual Regularity: Stationarity The good An empty space The bad A random space Features in empty and random space are stationary Stationarity : the probability that a component or set of features will be found at any location in the pattern is the same.
39 Perceptual Regularity. The good: an empty space. The ugly. The bad: a random space.
40 Perceptual Regularities. The good: an empty space. The ugly. The bad: a random space. The ugly has lower symmetry and lower stationarity, but is it less complex than the bad?
41 What is visual complexity? Optimum regularity? [Figure: performance as a function of the variety of features, i.e. the degree of perceived visual complexity.]
44 What is visual complexity? Low Medium High Any rigorous study of the perception of visual complexity requires a precise definition of what visual complexity is. Two levels of visual complexity: (1) the complexity inside the image (perceptual complexity) (2) the task related visual complexity (cognitive complexity) First Question: How can we represent the perceptual complexity of a scene?
45 Visual Complexity Research Program. (1) How do human observers represent visual complexity? What is the content of that representation? Starting point: searching for the perceptual dimension(s) underlying perceived visual complexity. (2) How does our visual system handle visual complexity in scenes? What are the perceptual and mnemonic mechanisms used to faithfully (or not) compress visual complexity? Starting point: looking at individual scenes of various degrees of visual complexity in memory tasks.
46 Representation of Visual Complexity in natural scenes Oliva, A., Mack, M.L., Shrestha, M., & Peeper, A. (2004). Identifying the Perceptual Dimensions of Visual Complexity of Scenes. The 26th Annual Meeting of the Cognitive Science Society Meeting, Chicago, August 2004
47 Representing complexity: Textures. Rao and Lohse (1993), Heaps & Handel (1999): the visual complexity of a texture is defined as the degree of difficulty in providing a verbal description of the texture. The degree of perceivable structure of a texture (goodness or simplicity) depends on two major perceptual dimensions: (1) repetitiveness (vs. disorganization) and (2) uniform orientation (vs. randomness).
48 Representing visual scene complexity. Question: how can a cognitive system represent the degree of visual complexity of a scene (e.g. the variety of objects)? Hypothesis 1: the visual complexity of a scene can be represented along a single dimension (e.g. there exists a "eureka" filter computing visual complexity). Alternative hypothesis: visual complexity is represented by a multi-dimensional space of perceptual dimensions. How do task constraints modulate the perceived visual complexity of a scene? (i.e., the flexibility of the features used to represent visual complexity)
49 The shape of a complexity representation. 1. Unique perceptual dimension C: the features or properties related to visual complexity can be combined into one perceptual dimension (like mean depth estimation). 2. Multi-dimensional space {c1, c2, ..., cn}: most of the variability in visual complexity is explained by an identifiable number of perceptual dimensions. The weight of each dimension may vary with task constraints, but the principal dimensional vocabulary remains the same (like determining the basic-level category of a scene). 3. Flexible space (Space 1, ..., Space N): the properties that each human observer uses to represent visual complexity vary. There is no specific dimensional vocabulary for representing visual complexity (maybe like the emotional valence of a scene).
50 Representing Visual complexity These three hypotheses of representation of visual complexity are not mutually exclusive: for a particular task, the visual complexity space could be skewed towards a line (e.g. one perceptual property like quantity of objects is preferentially used), but for a different task, the space of visual complexity might take into account multiple dimensions. In a first study, we aim to tease apart the three levels of representation. We evaluated the degree of agreement that participants had when asked to judge the perceived visual complexity of indoor scenes.
51 Norming visual complexity. Rating the visual complexity of ~1000 scenes. [Figure: frequency distribution of rated complexity, from low to high.] Norming: 100 scenes (selected at random among the 1000) were presented on a 23" monitor, and 40 participants were asked to organize the images into groups of visual complexity (minimum 3 groups, maximum 24), taking into account objects, colors, textures, space and lighting information. Complexity was defined as follows: if you glanced only once at the picture, how difficult would it be to describe the scene to somebody else so that she could find it among similar images? Participants did an average of 4 trials of 100 different images each.
52 Hierarchical Classification Task 100 scenes selected along the full complexity scale. After each subdivision, participants described the criteria they used to split the images.
53 Hierarchical grouping task
54 Constraints on the complexity space. Two groups of participants (N = 17 per group) were told different definitions of visual complexity. Control group: visual simplicity is related to how easy it will be to give a verbal description of the image and remember it after seeing it for a short time; visual complexity, to how difficult that will be. Structure group: visual complexity is related to the structure of the scene, and not merely to color or brightness. Simplicity is how well you see that objects and regions go together; complexity is related to how difficult it is to make sense of the structure of the scene.
55 Representing Visual Complexity To differentiate between the three shapes of complexity space (1 Dimensional, N Dimensional or N spaces): (1) Qualitative analysis: which criteria did participants use? (2) How consistent are participants in ranking images along visual complexity? (3) What is the underlying representation of visual complexity? (Multi-dimensional scaling)
56 Criteria of Visual Complexity. [Table: criteria of visual complexity and their percentages for the primary and secondary divisions, by group (structure vs. control): quantity of objects, texture, color; quantity total; clutter; symmetry; openness; layout organization; contrast.] Taxonomy of visual criteria: (1) quantity refers to objects, textures, colors; (2) clutter is a relational criterion, the relationship between quantity of objects and space; (3) openness refers to the amount of space; (4) symmetry refers to mirror symmetry; (5) layout organization is a description of the type of layout (centralized, grid).
57 Rankings Correlation. With the hierarchical grouping task, scenes were classified into 8 bins of complexity; images within each group were given the same complexity value. Within each group of participants, we computed Spearman's rank-order correlation for each possible pairing of subjects. If participants were consistent, ranking correlations should be high. Control group: r = 0.62 (0.15). Structure group: r = 0.61 (0.14).
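Spearman's rank-order correlation is a Pearson correlation computed on ranks. A self-contained sketch with two hypothetical subjects' ratings (toy data, not the study's, and assuming no tied ranks):

```python
def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks (no ties)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Two hypothetical subjects' complexity rankings for six scenes:
s1 = [1, 3, 2, 5, 4, 6]
s2 = [2, 1, 3, 4, 6, 5]
print(round(spearman(s1, s2), 2))  # 0.66: fairly consistent subjects
```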
58 Multi-dimensional scaling. MDS provides a visual representation of the pattern of proximities (i.e., similarities or distances) among the images and informs us about the underlying representation. Criteria given by participants may be redundant with each other (e.g., a scene cannot have a high degree of clutter and a lot of open space). The dimensions of an MDS space are decorrelated: they correspond to the number of independent ways in which images can be sensed to resemble or differ.
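Classical MDS can be sketched in a few lines: double-center the squared dissimilarities and keep the top eigenvectors (the dissimilarity matrix here is a toy, not the study's data):

```python
import numpy as np

# Classical MDS: recover a low-dimensional configuration from a matrix of
# pairwise dissimilarities. D holds toy "complexity dissimilarities"
# between four images.
D = np.array([[0., 1., 4., 5.],
              [1., 0., 3., 4.],
              [4., 3., 0., 1.],
              [5., 4., 1., 0.]])

n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
B = -0.5 * J @ (D ** 2) @ J           # double-centered squared distances
vals, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
idx = np.argsort(vals)[::-1][:2]      # keep the 2 largest
coords = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))
print(coords.shape)                   # each image as a point in a 2-D space
```

Each independent axis of the recovered configuration is a candidate perceptual dimension of complexity.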
59 Multi-dimensional scaling (structure group) Clutter/Quantity
60 Second Axis of MDS Structure group No mirror symmetry Control group Mirror symmetry Correlation between images projected onto the second principal axis in each group drops to 0.33
61 Conclusion. The correlation between the image ranks projected onto the first axis of the MDS for the control group and for the structure group is 0.98, suggesting the existence of a principal dimension of complexity (clutter). Additional secondary dimensions participate in the estimation of complexity (e.g. symmetry, openness). The dimensions of complexity are modulated by task constraints. What is the shape of the complexity space? It looks like a multi-dimensional space (clutter, quantity of color, texture, openness, symmetry, layout organization), possibly skewed towards a principal dimension (clutter).
62 Memorize these pictures
[Slides 63-71: study pictures, shown one at a time]
72 Which of the following pictures have you already seen?
[Slides 73-84: test pictures, each followed by the answer: NO, NO, NO, NO, YES, NO]
85 Memory Confusion You have seen these pictures You were tested with these pictures
87 Human image memory. Memory for complex images is outstanding, but we remember the meaning or gist of an image and its spatial layout, not the individual objects.
88 Question Memory of real-world images is known to be very good, but little is known about the mechanisms that observers may use to encode and represent complex visual information in the domain of natural images and scenes. A characteristic of natural scenes is their variability in quantity of objects, spatial arrangements and scale. This variability presents the question of whether perceptual and mnemonic mechanisms depend on the degree of visual complexity of a scene image.
89 Role of Visual complexity on memory Low clutter High clutter
90 Role of Visual complexity on memory. Low clutter: easy to remember (less confusion)? High clutter: difficult to remember (more confusion), with an increase in false alarms?
91 Memory Confusion: very simple and very complex images are equally difficult to remember. [Figure: % of errors (false alarms) and d' for high vs. low complexity.]
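The d' measure behind the figure comes from signal detection theory: d' = z(hit rate) - z(false-alarm rate), so more false alarms (more memory confusion) means lower sensitivity. A sketch with hypothetical rates (not the study's data):

```python
from statistics import NormalDist

# d' (d-prime): sensitivity of old/new discrimination.
z = NormalDist().inv_cdf  # inverse standard normal CDF

def d_prime(hit_rate, fa_rate):
    """Higher d' = better discrimination of seen vs. unseen pictures."""
    return z(hit_rate) - z(fa_rate)

# A higher false-alarm rate lowers d' for the same hit rate.
print(round(d_prime(0.85, 0.10), 2))  # e.g. medium-complexity scenes (toy)
print(round(d_prime(0.85, 0.30), 2))  # e.g. very simple/complex scenes (toy)
```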
92 Possible Interpretation. Performance for images of high and low complexity was equivalent, suggesting that image encoding is not a mechanism depending merely on the quantity of objects. Overall, the results suggest that scenes of high and low visual complexity have a lower degree of discriminability than scenes of medium complexity. What do images of high and low visual complexity have in common? [Figure: example scenes of high, medium and low complexity.] (1) Low distinctiveness value. (2) Lack of variation in the quantity and type of information.
93 Summary. Temporal browsing of information: each picture can be shown only briefly in a sequence (Mary Potter, 1975) and still be recognized. You will memorize the storytelling of the picture, its meaning and spatial layout, but miss a lot of visual information, including some important objects. Visual complexity may have a complex interaction with memory processes.
94 Complexity in spatial scales. Does the spatial layout resolution needed for scene recognition vary with image category and task? Or is there a universal spatial layout resolution, independent of categorization?
95 Spatial Scale Layout and Scene Categories. [Table: categorization performance in 8 basic-level groups (highway, street, close-up, building, coast, country, forest, mountain; chance level is 12.5%) for scene representations at 0, 2 and 4 cycles/image.] The diagnostic spatial layout resolution varies with scene category. Increasing spatial layout resolution does not mean increasing visual complexity.
Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2
More informationJoint design of data analysis algorithms and user interface for video applications
Joint design of data analysis algorithms and user interface for video applications Nebojsa Jojic Microsoft Research Sumit Basu Microsoft Research Nemanja Petrovic University of Illinois Brendan Frey University
More informationContent Based Image Retrieval
Content Based Image Retrieval R. Venkatesh Babu Outline What is CBIR Approaches Features for content based image retrieval Global Local Hybrid Similarity measure Trtaditional Image Retrieval Traditional
More informationCluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1
Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods
More information5. Feature Extraction from Images
5. Feature Extraction from Images Aim of this Chapter: Learn the Basic Feature Extraction Methods for Images Main features: Color Texture Edges Wie funktioniert ein Mustererkennungssystem Test Data x i
More informationEdge and local feature detection - 2. Importance of edge detection in computer vision
Edge and local feature detection Gradient based edge detection Edge detection by function fitting Second derivative edge detectors Edge linking and the construction of the chain graph Edge and local feature
More informationImage Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58
Image Features: Local Descriptors Sanja Fidler CSC420: Intro to Image Understanding 1/ 58 [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 2/ 58 Local Features Detection: Identify
More informationDoes everyone have an override code?
Does everyone have an override code? Project 1 due Friday 9pm Review of Filtering Filtering in frequency domain Can be faster than filtering in spatial domain (for large filters) Can help understand effect
More informationGrade 5: PA Academic Eligible Content and PA Common Core Crosswalk
Grade 5: PA Academic Eligible and PA Common Core Crosswalk Alignment of Eligible : More than Just The crosswalk below is designed to show the alignment between the PA Academic Standard Eligible and the
More informationMSA220 - Statistical Learning for Big Data
MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups
More informationPractice Exam Sample Solutions
CS 675 Computer Vision Instructor: Marc Pomplun Practice Exam Sample Solutions Note that in the actual exam, no calculators, no books, and no notes allowed. Question 1: out of points Question 2: out of
More informationEvery Picture Tells a Story: Generating Sentences from Images
Every Picture Tells a Story: Generating Sentences from Images Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, David Forsyth University of Illinois
More informationEnhanced Hemisphere Concept for Color Pixel Classification
2016 International Conference on Multimedia Systems and Signal Processing Enhanced Hemisphere Concept for Color Pixel Classification Van Ng Graduate School of Information Sciences Tohoku University Sendai,
More informationA NOVEL FEATURE EXTRACTION METHOD BASED ON SEGMENTATION OVER EDGE FIELD FOR MULTIMEDIA INDEXING AND RETRIEVAL
A NOVEL FEATURE EXTRACTION METHOD BASED ON SEGMENTATION OVER EDGE FIELD FOR MULTIMEDIA INDEXING AND RETRIEVAL Serkan Kiranyaz, Miguel Ferreira and Moncef Gabbouj Institute of Signal Processing, Tampere
More informationCS 534: Computer Vision Segmentation and Perceptual Grouping
CS 534: Computer Vision Segmentation and Perceptual Grouping Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Outlines Mid-level vision What is segmentation Perceptual Grouping Segmentation
More informationTRANSPARENCY. Dan Stefanescu
MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY Working Paper 107 July 1975 Dan Stefanescu This report describes research done at the Artificial Intelligence Laboratory of the
More informationDigital Image Processing
Digital Image Processing Part 9: Representation and Description AASS Learning Systems Lab, Dep. Teknik Room T1209 (Fr, 11-12 o'clock) achim.lilienthal@oru.se Course Book Chapter 11 2011-05-17 Contents
More informationEye Detection by Haar wavelets and cascaded Support Vector Machine
Eye Detection by Haar wavelets and cascaded Support Vector Machine Vishal Agrawal B.Tech 4th Year Guide: Simant Dubey / Amitabha Mukherjee Dept of Computer Science and Engineering IIT Kanpur - 208 016
More informationInformation Fusion Dr. B. K. Panigrahi
Information Fusion By Dr. B. K. Panigrahi Asst. Professor Department of Electrical Engineering IIT Delhi, New Delhi-110016 01/12/2007 1 Introduction Classification OUTLINE K-fold cross Validation Feature
More informationSketchable Histograms of Oriented Gradients for Object Detection
Sketchable Histograms of Oriented Gradients for Object Detection No Author Given No Institute Given Abstract. In this paper we investigate a new representation approach for visual object recognition. The
More informationPITSCO Math Individualized Prescriptive Lessons (IPLs)
Orientation Integers 10-10 Orientation I 20-10 Speaking Math Define common math vocabulary. Explore the four basic operations and their solutions. Form equations and expressions. 20-20 Place Value Define
More informationUNIVERSITY OF OSLO. Faculty of Mathematics and Natural Sciences
UNIVERSITY OF OSLO Faculty of Mathematics and Natural Sciences Exam: INF 4300 / INF 9305 Digital image analysis Date: Thursday December 21, 2017 Exam hours: 09.00-13.00 (4 hours) Number of pages: 8 pages
More informationPreviously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011
Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition
More informationPartitioning Data. IRDS: Evaluation, Debugging, and Diagnostics. Cross-Validation. Cross-Validation for parameter tuning
Partitioning Data IRDS: Evaluation, Debugging, and Diagnostics Charles Sutton University of Edinburgh Training Validation Test Training : Running learning algorithms Validation : Tuning parameters of learning
More informationVisual words. Map high-dimensional descriptors to tokens/words by quantizing the feature space.
Visual words Map high-dimensional descriptors to tokens/words by quantizing the feature space. Quantize via clustering; cluster centers are the visual words Word #2 Descriptor feature space Assign word
More informationCOMPUTER AND ROBOT VISION
VOLUME COMPUTER AND ROBOT VISION Robert M. Haralick University of Washington Linda G. Shapiro University of Washington T V ADDISON-WESLEY PUBLISHING COMPANY Reading, Massachusetts Menlo Park, California
More informationUNIT 1: NUMBER LINES, INTERVALS, AND SETS
ALGEBRA II CURRICULUM OUTLINE 2011-2012 OVERVIEW: 1. Numbers, Lines, Intervals and Sets 2. Algebraic Manipulation: Rational Expressions and Exponents 3. Radicals and Radical Equations 4. Function Basics
More informationPattern recognition. Classification/Clustering GW Chapter 12 (some concepts) Textures
Pattern recognition Classification/Clustering GW Chapter 12 (some concepts) Textures Patterns and pattern classes Pattern: arrangement of descriptors Descriptors: features Patten class: family of patterns
More informationAutomatic Colorization of Grayscale Images
Automatic Colorization of Grayscale Images Austin Sousa Rasoul Kabirzadeh Patrick Blaes Department of Electrical Engineering, Stanford University 1 Introduction ere exists a wealth of photographic images,
More informationGRADE 4 MATH COMPETENCY STATEMENTS / PERFORMANCE INDICATORS
Common Core State Standards Alignment Codes Everyday Math Strands & Goals Alignment Codes OA; Operations and Algebraic Thinking NBT; Number and Operations in Base Ten NOF; Number and Operations - Fractions
More information(Refer Slide Time: 0:51)
Introduction to Remote Sensing Dr. Arun K Saraf Department of Earth Sciences Indian Institute of Technology Roorkee Lecture 16 Image Classification Techniques Hello everyone welcome to 16th lecture in
More informationMontana City School GRADE 5
Montana City School GRADE 5 Montana Standard 1: Students engage in the mathematical processes of problem solving and reasoning, estimation, communication, connections and applications, and using appropriate
More informationObject Classification Problem
HIERARCHICAL OBJECT CATEGORIZATION" Gregory Griffin and Pietro Perona. Learning and Using Taxonomies For Fast Visual Categorization. CVPR 2008 Marcin Marszalek and Cordelia Schmid. Constructing Category
More informationAn Experiment in Visual Clustering Using Star Glyph Displays
An Experiment in Visual Clustering Using Star Glyph Displays by Hanna Kazhamiaka A Research Paper presented to the University of Waterloo in partial fulfillment of the requirements for the degree of Master
More informationVisual Computing. Lecture 2 Visualization, Data, and Process
Visual Computing Lecture 2 Visualization, Data, and Process Pipeline 1 High Level Visualization Process 1. 2. 3. 4. 5. Data Modeling Data Selection Data to Visual Mappings Scene Parameter Settings (View
More informationTRANSFORM FEATURES FOR TEXTURE CLASSIFICATION AND DISCRIMINATION IN LARGE IMAGE DATABASES
TRANSFORM FEATURES FOR TEXTURE CLASSIFICATION AND DISCRIMINATION IN LARGE IMAGE DATABASES John R. Smith and Shih-Fu Chang Center for Telecommunications Research and Electrical Engineering Department Columbia
More informationDimension Reduction CS534
Dimension Reduction CS534 Why dimension reduction? High dimensionality large number of features E.g., documents represented by thousands of words, millions of bigrams Images represented by thousands of
More informationA SYNOPTIC ACCOUNT FOR TEXTURE SEGMENTATION: FROM EDGE- TO REGION-BASED MECHANISMS
A SYNOPTIC ACCOUNT FOR TEXTURE SEGMENTATION: FROM EDGE- TO REGION-BASED MECHANISMS Enrico Giora and Clara Casco Department of General Psychology, University of Padua, Italy Abstract Edge-based energy models
More informationScott Foresman Investigations in Number, Data, and Space Content Scope & Sequence
Scott Foresman Investigations in Number, Data, and Space Content Scope & Sequence Correlated to Academic Language Notebooks The Language of Math Grade 4 Content Scope & Sequence Unit 1: Factors, Multiples,
More informationWavelet Applications. Texture analysis&synthesis. Gloria Menegaz 1
Wavelet Applications Texture analysis&synthesis Gloria Menegaz 1 Wavelet based IP Compression and Coding The good approximation properties of wavelets allow to represent reasonably smooth signals with
More informationEfficient Visual Coding: From Retina To V2
Efficient Visual Coding: From Retina To V Honghao Shan Garrison Cottrell Computer Science and Engineering UCSD La Jolla, CA 9093-0404 shanhonghao@gmail.com, gary@ucsd.edu Abstract The human visual system
More informationMachine learning Pattern recognition. Classification/Clustering GW Chapter 12 (some concepts) Textures
Machine learning Pattern recognition Classification/Clustering GW Chapter 12 (some concepts) Textures Patterns and pattern classes Pattern: arrangement of descriptors Descriptors: features Patten class:
More information8 th Grade Pre Algebra Pacing Guide 1 st Nine Weeks
8 th Grade Pre Algebra Pacing Guide 1 st Nine Weeks MS Objective CCSS Standard I Can Statements Included in MS Framework + Included in Phase 1 infusion Included in Phase 2 infusion 1a. Define, classify,
More informationBallston Spa Central School District The Common Core State Standards in Our Schools Fourth Grade Math
1 Ballston Spa Central School District The Common Core State s in Our Schools Fourth Grade Math Operations and Algebraic Thinking Use the four operations with whole numbers to solve problems 4.OA.1. Interpret
More informationProduction of Video Images by Computer Controlled Cameras and Its Application to TV Conference System
Proc. of IEEE Conference on Computer Vision and Pattern Recognition, vol.2, II-131 II-137, Dec. 2001. Production of Video Images by Computer Controlled Cameras and Its Application to TV Conference System
More informationUsing the Forest to See the Trees: Context-based Object Recognition
Using the Forest to See the Trees: Context-based Object Recognition Bill Freeman Joint work with Antonio Torralba and Kevin Murphy Computer Science and Artificial Intelligence Laboratory MIT A computer
More informationFilters (cont.) CS 554 Computer Vision Pinar Duygulu Bilkent University
Filters (cont.) CS 554 Computer Vision Pinar Duygulu Bilkent University Today s topics Image Formation Image filters in spatial domain Filter is a mathematical operation of a grid of numbers Smoothing,
More informationLecture 3: Linear Classification
Lecture 3: Linear Classification Roger Grosse 1 Introduction Last week, we saw an example of a learning task called regression. There, the goal was to predict a scalar-valued target from a set of features.
More informationHoughton Mifflin MATHEMATICS Level 1 correlated to NCTM Standard
Number and Operations Standard Understand numbers, ways of representing numbers, relationships among numbers, and number systems count with understanding and recognize TE: 191A 195B, 191 195, 201B, 201
More information4th grade Math (3rd grade CAP)
Davison Community Schools ADVISORY CURRICULUM COUNCIL Phase II, April 20, 2015 Julie Crockett, Matt Lobban 4th grade Math (3rd grade CAP) Course Essential Questions (from Phase I report): How do we multiply/divide
More informationUniversity of Florida CISE department Gator Engineering. Visualization
Visualization Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida What is visualization? Visualization is the process of converting data (information) in to
More informationString distance for automatic image classification
String distance for automatic image classification Nguyen Hong Thinh*, Le Vu Ha*, Barat Cecile** and Ducottet Christophe** *University of Engineering and Technology, Vietnam National University of HaNoi,
More informationPerceived 3D metric (or Euclidean) shape is merely ambiguous, not systematically distorted
Exp Brain Res (2013) 224:551 555 DOI 10.1007/s00221-012-3334-y RESEARCH ARTICLE Perceived 3D metric (or Euclidean) shape is merely ambiguous, not systematically distorted Young Lim Lee Mats Lind Geoffrey
More informationarxiv: v3 [cs.cv] 3 Oct 2012
Combined Descriptors in Spatial Pyramid Domain for Image Classification Junlin Hu and Ping Guo arxiv:1210.0386v3 [cs.cv] 3 Oct 2012 Image Processing and Pattern Recognition Laboratory Beijing Normal University,
More informationCPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2016
CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2016 A2/Midterm: Admin Grades/solutions will be posted after class. Assignment 4: Posted, due November 14. Extra office hours:
More informationTexture. COS 429 Princeton University
Texture COS 429 Princeton University Texture What is a texture? Antonio Torralba Texture What is a texture? Antonio Torralba Texture What is a texture? Antonio Torralba Texture Texture is stochastic and
More informationFeatures Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)
Features Points Andrea Torsello DAIS Università Ca Foscari via Torino 155, 30172 Mestre (VE) Finding Corners Edge detectors perform poorly at corners. Corners provide repeatable points for matching, so
More informationPreprocessing Short Lecture Notes cse352. Professor Anita Wasilewska
Preprocessing Short Lecture Notes cse352 Professor Anita Wasilewska Data Preprocessing Why preprocess the data? Data cleaning Data integration and transformation Data reduction Discretization and concept
More informationGRAPHING BAYOUSIDE CLASSROOM DATA
LUMCON S BAYOUSIDE CLASSROOM GRAPHING BAYOUSIDE CLASSROOM DATA Focus/Overview This activity allows students to answer questions about their environment using data collected during water sampling. Learning
More informationLast week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints
Last week Multi-Frame Structure from Motion: Multi-View Stereo Unknown camera viewpoints Last week PCA Today Recognition Today Recognition Recognition problems What is it? Object detection Who is it? Recognizing
More informationAn Introduction to Content Based Image Retrieval
CHAPTER -1 An Introduction to Content Based Image Retrieval 1.1 Introduction With the advancement in internet and multimedia technologies, a huge amount of multimedia data in the form of audio, video and
More informationParametric Texture Model based on Joint Statistics
Parametric Texture Model based on Joint Statistics Gowtham Bellala, Kumar Sricharan, Jayanth Srinivasa Department of Electrical Engineering, University of Michigan, Ann Arbor 1. INTRODUCTION Texture images
More informationTexture. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors
Texture The most fundamental question is: How can we measure texture, i.e., how can we quantitatively distinguish between different textures? Of course it is not enough to look at the intensity of individual
More informationCITS 4402 Computer Vision
CITS 4402 Computer Vision A/Prof Ajmal Mian Adj/A/Prof Mehdi Ravanbakhsh, CEO at Mapizy (www.mapizy.com) and InFarm (www.infarm.io) Lecture 02 Binary Image Analysis Objectives Revision of image formation
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Features and Patterns The Curse of Size and
More information