Mech 296: Vision for Robotic Applications
Lecture 2: Color Imaging

Terminology from Last Week

Data Files
ASCII (Text): Data file is human readable. Ex: 7 9 5. Note: characters may be removed when transferring files from one machine to another.
Binary: Data file is machine readable. Ex: FZn

Image Pixel Values
Binary (Black or White): Image pixels have one of two values, 0 or 1.
Grayscale (Monochrome): Image pixels have one of many values, commonly 0 - 255.
Color: Image pixels are described by a vector (e.g. RGB), each element of which takes many values.
Color Video
Previously we have considered monochrome video, g(x, y). Today we will add a new modality to our image-processing toolbox: color.

g(x, y) = [g_red(x, y)  g_green(x, y)  g_blue(x, y)]

Background Reading: Forsyth & Ponce, Chapters 4-6

Color: Physics vs. Perception
EM Waves from NASA: http://imagers.gsfc.nasa.gov/ems/visible.html
Primary Colors from Wikipedia: http://en.wikipedia.org/wiki/rgb
Imaging Systems
Cameras and other imaging systems are designed to reproduce human perception. They do not reproduce the full scene; they capture only what humans perceive. Engineering a digital camera requires a careful balance of physics and perception.

Human Vision
Pipeline: Natural Scene -> Radiometry -> Eye -> Perception (Human)
Imaging System
Pipeline: Natural Scene -> Radiometry -> Image Capture (Camera) -> Image Encoding (Computer) -> Voltage Conversion -> Screen (Computer Monitor) -> Eye -> Perception (Human)

Color Reproduction
[Figure: natural light viewed directly by the eye vs. the same color reproduced on a screen]

Trichromacy
In color matching experiments, a minimum of three weighted primary colors (of different frequencies: λ1, λ2, λ3) are needed to match a test light (at λ4).
Grassman's Laws, from psychology:
1. Test lights sum: A combination of two test lights can be matched by summing the weighted primaries associated with each individual test light.
2. Transitivity: If two test lights are represented by the same set of weighted primaries, then they are perceptually identical, even if their spectra differ.
3. Scaling: If a test light is made brighter by a factor k, then the primary weights must also be scaled by k to provide a match.
In short, the human response to light is linear.
Physiological Basis - Cone Transfer Functions
In the human eye, there are three types of cones, with sensitivity peaks for short, medium, and long wavelength light. Each type of cone has a transfer function, p, which relates the measured energy, E, to the incident energy, E_i:

E(λ) = p(λ) E_i(λ),   p(λ) = [p_short(λ); p_med(λ); p_long(λ)]

[Figure: cone transfer functions p_short(λ), p_med(λ), p_long(λ) plotted over wavelength λ = 350-850 nm]
Data Source: http://www-cvrl.ucsd.edu/index.htm

Physiological Basis - Cone Transfer Functions
Evaluating the short-, medium-, and long-wavelength cone transfer functions at the three primary wavelengths gives three response vectors, p(λ_B), p(λ_G), p(λ_R).
Primaries: λ_B = 444.44 nm, λ_G = 526.32 nm, λ_R = 645.16 nm
Trichromacy: Forming a Basis
The transfer-function vectors for (almost) any three colors form a basis:
- There is nothing special about R, G, B as primaries.
- We can obtain a complete basis for all colors, as long as the p vectors are linearly independent.
- We can express any other transfer-function vector as a linear combination of the basis vectors (i.e. the primaries R, G, and B).
- The f vector is the weight vector for the linear combination:

P = [p(λ_R)  p(λ_G)  p(λ_B)]   (columns: cone responses at the three primaries)
p(λ) = P f(λ)

Weight Vectors vs. Transfer Functions
The human color response may be expressed equivalently as:
1. A set of cone transfer functions, p
2. Given a set of primaries, a set of weighting functions, f
This is useful, because computer displays can only produce three primary colors (RGB).

f(λ) = P^-1 p(λ)
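The change of basis above is a 3x3 linear solve. The lecture's tools are Matlab, but the following NumPy sketch shows the same computation; the cone-response matrix P here is made up for illustration and is not the slide's measured data.

```python
import numpy as np

# Hypothetical cone-response matrix P: columns are the cone responses
# p(lambda_R), p(lambda_G), p(lambda_B) at the three primaries
# (illustrative numbers only, not the slide's values).
P = np.array([
    [0.02, 0.10, 0.97],   # short-wavelength cone
    [0.30, 0.80, 0.20],   # medium-wavelength cone
    [0.95, 0.60, 0.05],   # long-wavelength cone
])

def weights(p_lambda):
    """Primary weights f such that p(lambda) = P @ f."""
    return np.linalg.solve(P, p_lambda)

# A test light's cone response is reproduced exactly by the weighted primaries
p_test = np.array([0.40, 0.65, 0.30])
f = weights(p_test)
assert np.allclose(P @ f, p_test)

# Grassman linearity: weights of a summed light are the sums of the weights
p2 = np.array([0.10, 0.25, 0.50])
assert np.allclose(weights(p_test + p2), weights(p_test) + weights(p2))
```

Because the solve is linear, additivity and scaling of the weights (Grassman's Laws) fall out automatically.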
RGB Weights
Problem: some of the weights are negative.
- Negative weights are not realizable by an RGB monitor (energy >= 0).
- Not all colors visible to humans can be represented by an RGB monitor!
- Current TV development: use 5-color projection to extend the realizable color set.
[Figure: red, green, and blue weighting functions over wavelength λ = 350-850 nm; the red curve dips negative]
Data Source: http://www-cvrl.ucsd.edu/index.htm

Mixtures of Colors
A typical object scatters a spectral range of colors.
- The spectrum of a color mixture, S(λ) = dE_i/dλ, is the incoming energy, E_i, per unit wavelength of the incident radiation.
- The measured energy, E, equals the incident energy, E_i, weighted by the transfer function, p:

E = ∫_Λ S(λ) p(λ) dλ,  where p(λ) = P f(λ) and Λ is the visible spectrum

- A modified set of RGB weights, f', is defined for the mixture:

f' = ∫_Λ S(λ) f(λ) dλ = P^-1 E

- If a computer monitor uses these weights, a human viewer will perceive the color mixture, rather than a pure color.
These f' values are what we manipulate in the computer, almost.
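The mixture weights f' are an integral over wavelength, which in practice is evaluated numerically on a wavelength grid. A NumPy sketch follows; the spectrum S and weighting functions f below are made-up smooth curves, not measured data.

```python
import numpy as np

# Sketch: RGB weights of a spectral mixture, f' = integral of S(l) f(l) dl,
# approximated on a uniform wavelength grid. All curves are hypothetical.
lam = np.linspace(400.0, 700.0, 301)          # wavelength grid, nm

def gauss(center, width):
    return np.exp(-0.5 * ((lam - center) / width) ** 2)

S = 0.8 * gauss(600.0, 40.0)                  # hypothetical mixture spectrum
f = np.stack([gauss(610.0, 35.0),             # hypothetical weighting functions
              gauss(540.0, 40.0),             # (rows: red, green, blue)
              gauss(450.0, 30.0)])

dlam = lam[1] - lam[0]
f_mix = (S * f).sum(axis=1) * dlam            # f' for the mixture

# Linearity: doubling the spectrum doubles the weights (brightness scaling)
f_mix2 = (2.0 * S * f).sum(axis=1) * dlam
assert f_mix.shape == (3,)
assert np.allclose(f_mix2, 2.0 * f_mix)
```

The final check previews the next slide's point: scaling the spectrum scales the RGB weights.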
Imaging System
Pipeline: Natural Scene -> Radiometry -> Image Capture (Camera) -> Image Encoding (Computer) -> Voltage Conversion -> Screen (Computer Monitor) -> Eye -> Perception (Human)

Image Capture
Like the human eye, most cameras use three receptor types. Camera receptors have different transfer functions than the cones in the human eye:

p_camera(λ) ≠ p_human(λ) = P f(λ)

Fuji: http://www.shortcourses.com/choosing/how/3.htm
Designing Camera Color Response
Ideally, the color transfer functions for the camera would be related to the human color response by a linear transform. The basis for the set of functions defined over the visible spectrum is infinite in size (cf. Fourier series). Thus the linear transformation, A, between the camera and human color representations is only an approximation; however, it is a good approximation.

p_camera(λ) ≈ A_LSQ p_human(λ) = A_LSQ P f(λ)

RGB Weights Scale with Energy
Key point: values expressed in RGB colorspace are linearly related to the energy received at each camera pixel (i.e. the ratio R : G : B is unchanged if the spectrum, S, is scaled by a constant).

If S'(λ) = k S(λ), then the energy received, which is the integral over the spectrum of the incident light,

E_camera = ∫_Λ S(λ) p_camera(λ) dλ

scales as E'_camera = k E_camera. The RGB weights, f', are linearly related to the received energy at the camera:

f' = P^-1 A_LSQ^-1 E_camera,  so  f'' = k f'

The weights, f', scale with the brightness of the incoming energy spectrum, S; in this sense the RGB unit vector describes color, and the RGB magnitude describes total energy content.
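The transform A_LSQ is found by least squares over many wavelength samples. A NumPy sketch of that fit follows; both response sets are synthetic stand-ins for real sensor and cone data, and the data here are noise-free, so the fit recovers A exactly.

```python
import numpy as np

# Sketch: fit a 3x3 linear map A between camera and human color responses,
# p_camera(l) ~ A p_human(l), sampled at many wavelengths (synthetic data).
rng = np.random.default_rng(0)
n = 100                                   # number of wavelength samples
P_human = rng.random((3, n))              # columns: human response per wavelength
A_true = np.array([[0.90, 0.10, 0.00],    # hypothetical "true" map
                   [0.05, 1.00, 0.10],
                   [0.00, 0.20, 0.80]])
P_cam = A_true @ P_human                  # synthetic camera responses

# Least squares: np.linalg.lstsq solves (P_human.T) X = (P_cam.T), X = A.T
A_ls, *_ = np.linalg.lstsq(P_human.T, P_cam.T, rcond=None)
A_ls = A_ls.T
assert np.allclose(A_ls, A_true)
```

With real, noisy response curves the same call returns the best least-squares approximation rather than an exact match.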
Imaging System
Pipeline: Natural Scene -> Radiometry -> Image Capture (Camera) -> Image Encoding (Computer) -> Voltage Conversion -> Screen (Computer Monitor) -> Eye -> Perception (Human)

Gamma Encoding - Energy Levels
The RGB values stored in a computer do not represent energy. Rather, the RGB levels in a computer image correspond to levels of human perception:
- RGB levels are encoded to equal voltage commands for a Cathode Ray Tube (CRT) monitor.
- This CRT conversion curve is close to the curve for human brightness perception.
- The gamma (γ) correction is the key parameter in this map.

f (energy weights measured by the camera) -> encoding -> g (perception-encoded weights used in the computer)

Note: here and in the rest of this document, the prime (') will be dropped from the spectral-mixture weights, f, which equal the RGB energies measured by the camera.
Image Encoding
The CRT conversion is performed on a normalized scale, where f_max is the saturation point for human vision and g_max is the upper limit for g (i.e. 255 for a one-byte value):

g/g_max = 1.099 (f/f_max)^(1/γ) - 0.099,  for f/f_max > 0.018
g/g_max = 4.5 (f/f_max),                  for f/f_max <= 0.018
with γ = 2.22

[Figure: encoding curve, g/g_max vs. f/f_max]

CRT Conversion
In principle, the CRT conversion simply inverts the mapping used in image encoding.
- The standard gamma is γ = 2.22; this is the γ for good visibility in dim background light.
- For televisions and computer monitors, gamma is increased to provide better visibility in daylight (γ = 2.25 to 2.5).
- Some computers make other adjustments to the gamma mapping (e.g. Macintosh).
- Some manufacturers quote γ without including an offset term; this may cause substantial variation in quoted gamma, as low as 1.5 or as high as 3.5.
[Figure: decoding curve, f/f_max vs. g/g_max]
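The piecewise encoding above (constants 1.099, 0.099, 4.5, 0.018, γ = 2.22, as on the slide) and its inverse can be written directly; this sketch checks that decoding undoes encoding on both segments.

```python
# Sketch of the slide's piecewise gamma encoding and its CRT-style inverse.

GAMMA = 2.22

def encode(x):
    """Map normalized energy f/f_max to normalized code value g/g_max."""
    if x <= 0.018:
        return 4.5 * x                          # linear segment near black
    return 1.099 * x ** (1.0 / GAMMA) - 0.099   # gamma segment

def decode(y):
    """Invert the encoding: code value g/g_max back to energy f/f_max."""
    if y <= 4.5 * 0.018:
        return y / 4.5
    return ((y + 0.099) / 1.099) ** GAMMA

# Round trip recovers the input on both segments; full scale maps to 1.0
for x in (0.005, 0.1, 0.5, 1.0):
    assert abs(decode(encode(x)) - x) < 1e-12
assert abs(encode(1.0) - 1.0) < 1e-12
```

The linear segment near black avoids the infinite slope of the pure power law at zero, which would otherwise amplify sensor noise.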
Brightness Level Perception
Human perception of brightness is also nonlinear in the energy of the incoming signal. In fact, the conversion curve for human perception is very close to that for CRT voltage conversion; this is the real reason we continue to use the CRT-based standard, even as display technologies evolve.
- Humans can perceive about 64 distinct brightness levels.
- By discretizing with 256 brightness levels (in g/g_max), we can achieve smooth gradations of brightness, with no noticeable jumps.
- If we discretized linearly in f/f_max, we would need about 2^14 discrete levels.
[Figure: human perception and CRT conversion curves, g/g_max vs. f/f_max]

In the computer, the RGB levels, g, correspond to human perception, which is nonlinearly related to the energy measured from the incoming spectrum.
Pipeline: Natural Scene -> Camera -> Computer -> Computer Monitor -> Human
Radiometry
Radiometry analyzes the geometry of light rays. We will examine radiometry only briefly, noting its impact on light intensity (energy in the visible spectrum). For Lambertian objects (those with a dull surface finish), radiometry causes:
1. Strong brightness variations over the object surface
2. Minimal color differences over the object surface
The same observations hold for light sources (e.g. an LED on a robot). For these reasons, color may be more useful than grayscale information in segmenting a target.

Radiometry - Sources
Source effects:
Propagation Effect (r^2 Falloff): Energy is conserved over a spherical shell propagating outward from the light source. The energy per unit area drops off as the inverse square of the distance from the light source to the camera (direct rays), from the light source to a surface (indirect rays), and from the surface to the camera (indirect rays).
Source Directionality: Spot lighting (which uses a reflector or lens to orient the light rays) has directionality. An object's area is foreshortened, so that it has a smaller effective area, if it is tilted away from the light's direction of travel.
Occlusion and Shadow: The direct light path between the source and an object in the environment may be blocked (occluded) by some obstacle. In this case, the light that arrives at the object surface comes from a different source, from a secondary bounce (e.g. reflected from a wall), or from wave effects (which are not modeled by radiometry).
Color: Different types of sources (incandescent and fluorescent lights, for instance) emit light with different color spectra.
Radiometry - Surfaces
Surface effects:
Absorption: A fraction of the incoming light is absorbed. This fraction, which depends on wavelength, gives the object its color.
Scattering: The remainder of the energy is scattered or specularly reflected back toward an observer. The ratio of outgoing to incoming light is expressed by the bidirectional reflectance distribution function (BRDF), ρ. In general, ρ is a function of surface location, x, light wavelength, λ, and the angles of the incoming and outgoing rays relative to the surface.
- Lambertian surfaces scatter radiation diffusely, uniformly in all directions. Matte surfaces are approximately Lambertian, with ρ a function of x and λ only (i.e. the color at a point on the object).
- Specularly reflective surfaces act like mirrors. For these surfaces, ρ is independent of wavelength, λ, but highly dependent on the incoming and outgoing angles.
- Most objects lie between these extremes and can be modeled with a ρ consisting of the sum of two terms: a diffuse Lambertian term and a specular reflection term.

Radiometry - Camera
Camera effects:
Lens and Aperture Size: The bigger the lens and aperture, the greater the light energy acquired.
Automatic Gain Control: Camera electronics may adjust image brightness to reduce saturation at the high or low end.
Vignetting: Multiple lenses are generally used for focus and zoom. These systems capture less energy for rays at large angles from the camera centerline.
Thin Lens Radiometry: The direction of the incoming light ray and the orientation of the source and camera relative to that ray cause foreshortening (i.e. a reduction of the source and receiver areas in the direction of the traveling ray). Images are less bright away from the camera centerline.
Radiometry - Surface Model
Lumped radiometry model for a surface: the spectral energy at a pixel, S(λ), is the sum of a diffuse term, a specular term, and an interreflection term:

S(λ) = k_d S_d(λ) + k_s S_s(λ) + i(λ)

S(λ): spectral energy measured at a pixel
S_d(λ): energy spectrum for diffuse scattering; depends on S_s(λ) and ρ(λ)
k_d: diffuse-spectrum scaling due to geometry and hardware
S_s(λ): energy spectrum of the light source
k_s: specular-reflection scaling due to geometry and hardware
i(λ): interreflections and secondary effects (neglected)
ρ(λ): wavelength sensitivity for the Lambertian BRDF

Representation of an Object in RGB Space
Further simplify the model:
- Assume the surface is Lambertian and illuminated by a single type of lighting.
- Neglect the interreflection term, which is small.
- Consider the RGB color cube.
Under these conditions, the color of an object's surface is uniform (the same spectrum everywhere), but scaled by a brightness coefficient, k. Thus the RGB energy vector is linear:

S(λ) = k_d S_d(λ);  if S'(λ) = k S(λ), then f' = k f
Simplified Object Model in RGB Space
Thus the pixels of our simplified diffuse surface should lie on a line:
- For RGB energy values, f, the line passes through zero: f = k f_0.
- For perceptual RGB values, g, the line need not pass through zero: g = k g_0 + g_1.
- Pixels associated with a point source of light (e.g. an LED) should display similar, linear behavior.
[Figure: RGB color cubes with pixel data lying along a line]

Including Specularities
Specularities (highlights) are local phenomena, so k_d S_d is approximately constant over the area of the reflection:

S(λ) = k_d S_d(λ) + k_s S_s(λ)

Each specularity adds an offshoot from the RGB line which defines the observed object.
[Figure: RGB cube with a specular offshoot branching from the object's line]
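The line model f = k f_0 is easy to verify numerically: pixels of an ideal diffuse surface differ only in brightness k, so they all point along the same direction in RGB energy space. In this NumPy sketch the base color f_0 and brightness range are made up for illustration.

```python
import numpy as np

# Sketch: synthetic pixels of an ideal Lambertian surface, f = k * f0,
# lie on a line through the origin in RGB energy space.
f0 = np.array([0.8, 0.5, 0.2])            # hypothetical surface color (R, G, B)
k = np.linspace(0.1, 1.0, 50)             # per-pixel brightness coefficients
pixels = k[:, None] * f0                  # 50 synthetic pixels, shape (50, 3)

# Every pixel is parallel to f0: zero component perpendicular to f0
f0_hat = f0 / np.linalg.norm(f0)
perp = pixels - (pixels @ f0_hat)[:, None] * f0_hat
assert np.allclose(perp, 0.0)

# The chromaticity (RGB unit vector) is constant as brightness varies
units = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)
assert np.allclose(units, f0_hat)
```

Real pixels scatter about this line because of noise, specularities, and multiple light sources, which motivates the perpendicular-distance segmentation developed below.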
Including Saturation
At the edges of the color cube (in shadows and in areas of strong lighting), one or more of the color channels may saturate.
[Figure: RGB cube showing the object line clipped at the cube boundary]

Using Real Data
[Figure: example image and its RGB scatter plot]
With a real image there are multiple light sources, some surface specularity at all points, and sample noise. The linear trend is still evident.
Using Real Data
[Figure: example image and its RGB scatter plot]
Example 2: The top surface of the binder exhibits specular reflection (daylight).

Using Real Data
[Figure: example image and its RGB scatter plot]
Example 3: The manila envelope shows strong brightness variations; part of the folder is in shadow.
Threshold Color Segmentation
Only two color components need to be considered if brightness is excluded.
- To remove brightness variations, fit a line to the object pixel data, then measure the perpendicular distance of points from the line fit.
- Establish a threshold cutoff on the perpendicular distance: if the RGB distance is below the threshold, include the pixel in the segment.

g = [g_red  g_green  g_blue] is the perception-based RGB vector.
R_obj, G_obj, B_obj are vectors of RGB data for all pixels (1 : N) in a patch associated with some object:

R_obj = [g_red(1); ...; g_red(N)],  G_obj = [g_green(1); ...; g_green(N)],  B_obj = [g_blue(1); ...; g_blue(N)]

The line fit for the RGB data comprises four unknown parameters:

G_obj = c1 R_obj + c2,   B_obj = d1 R_obj + d2

Matlab Implementation - Line Fit
When there are more equations than unknowns, there is no exact solution for the system of equations. The least-squares approach minimizes the error to give a good approximate solution.
Left Divide (\): Matlab implements least squares as a left divide. For the system Ax = b, type x = A\b to obtain the least-squares solution, x. For the current problem, solve these two sets of linear equations:

G_obj = [R_obj  1] [c1; c2],   B_obj = [R_obj  1] [d1; d2]
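The two left divides translate directly to np.linalg.lstsq. In this sketch the pixel data are synthetic points scattered about a known line (coefficients chosen arbitrarily), so we can check that the fit recovers them.

```python
import numpy as np

# Sketch of the slide's line fit: G = c1*R + c2 and B = d1*R + d2 by least
# squares (the NumPy analogue of Matlab's left divide, A\b).
rng = np.random.default_rng(1)
N = 200
R = rng.uniform(0.2, 0.9, N)
G = 0.6 * R + 0.1 + rng.normal(0.0, 0.005, N)   # true c = (0.6, 0.1)
B = 0.3 * R + 0.2 + rng.normal(0.0, 0.005, N)   # true d = (0.3, 0.2)

A = np.column_stack([R, np.ones(N)])            # [R_obj 1]
c, *_ = np.linalg.lstsq(A, G, rcond=None)       # Matlab: c = [R ones(N,1)] \ G
d, *_ = np.linalg.lstsq(A, B, rcond=None)       # Matlab: d = [R ones(N,1)] \ B

assert np.allclose(c, [0.6, 0.1], atol=0.02)
assert np.allclose(d, [0.3, 0.2], atol=0.02)
```

With real image patches, R, G, and B would instead be the pixel values gathered from a hand-labeled region of the target object.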
Perpendicular Distance
Before finding the perpendicular distance of the data from the line fit, it is first necessary to remove the offset term. For the nth pixel, the offset-removed vector, v, is formed from the nth RGB vector, g, and the offset terms (c2, d2):

v(n) = g(n) - [0; c2; d2]

To find the perpendicular distance, we next remove the component of v parallel to the line fit. The parallel component is obtained by a dot product:

v_par(n) = v(n) · l̂

The l̂ vector is the unit vector parallel to the line fit:

l̂ = l / (l · l)^(1/2),  where l = [1; c1; d1]

Subtracting the parallel component from v gives the error perpendicular to the line fit, e:

e(n) = v(n) - v_par(n) l̂

Perpendicular Distance Norm
The perpendicular distance may be computed using any suitable norm:
- 1-norm: diamond
- 2-norm: circle
- infinity-norm: square
To perform threshold-based color segmentation, choose a norm and a threshold. All points with an error norm below the threshold are included in the segment:

||e(n)|| < Threshold

[Figure: segmentation regions in orthogonal coordinates for the 1-, 2-, and infinity-norms]
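The projection above can be sketched for a single pixel as follows; the fit coefficients (c1, c2, d1, d2) and pixel values are illustrative numbers, not real data.

```python
import numpy as np

# Sketch of the perpendicular-distance computation, following the slide's
# notation. Coefficients are hypothetical line-fit results.
c1, c2 = 0.6, 0.1
d1, d2 = 0.3, 0.2

l = np.array([1.0, c1, d1])               # direction of the fitted line
l_hat = l / np.sqrt(l @ l)                # unit vector along the line

def perp_error(g):
    """Perpendicular error e(n) of RGB pixel g from the fitted line."""
    v = g - np.array([0.0, c2, d2])       # remove the offset term
    return v - (v @ l_hat) * l_hat        # subtract the parallel component

# A pixel exactly on the line (any brightness k) has zero perpendicular error
k = 0.7
on_line = np.array([k, c1 * k + c2, d1 * k + d2])
assert np.allclose(perp_error(on_line), 0.0)

# Norm choice for thresholding: 1-, 2-, or infinity-norm of e(n)
e = perp_error(np.array([0.5, 0.45, 0.40]))
n1, n2, ninf = np.abs(e).sum(), np.sqrt(e @ e), np.abs(e).max()
assert ninf <= n2 <= n1
```

The final assertion shows why the infinity norm is the cheapest acceptance test: it is never larger than the 2-norm or 1-norm of the same error vector.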
Real-Time Coding
For real-time image processing using Matlab, computational time is important.
Tip 1: Avoid for loops; they make Matlab run slowly. (The projection algorithm, for example, can be implemented in Matlab without using any loops.)
Tip 2: An image may contain thousands or millions of data points. To keep the operation count low, use simple operations when possible (e.g. the infinity norm runs faster than the one norm, which runs faster than the two norm).
You can test your timing in Matlab using the tic and toc commands.

Color Segmentation - Example
[Figure: RGB scatter plot, perpendicular-error plot in orthogonal coordinates, and human vs. automated segmentation of the example image]
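The loop-free advice carries over directly to NumPy: the whole-image projection and threshold can be written as array operations, timed in the spirit of tic/toc. The coefficients, threshold, and synthetic "image" here are illustrative.

```python
import time
import numpy as np

# Sketch: vectorized (loop-free) threshold segmentation of a whole image.
c1, c2, d1, d2 = 0.6, 0.1, 0.3, 0.2       # hypothetical line-fit coefficients
l_hat = np.array([1.0, c1, d1]) / np.sqrt(1.0 + c1**2 + d1**2)
offset = np.array([0.0, c2, d2])

rng = np.random.default_rng(2)
img = rng.random((240, 320, 3))           # synthetic RGB image, values in [0, 1]

t0 = time.perf_counter()                  # analogue of Matlab's tic
v = img - offset                          # remove offset, all pixels at once
par = (v @ l_hat)[..., None] * l_hat      # parallel components, no loops
e = v - par                               # perpendicular errors per pixel
mask = np.abs(e).max(axis=-1) < 0.05      # infinity-norm threshold
elapsed = time.perf_counter() - t0        # analogue of toc

assert mask.shape == (240, 320)
assert mask.dtype == bool
```

Every pixel is processed by the same handful of array operations, which is exactly the structure that makes the Matlab version fast without loops.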
Special Case: Limited Variations in Luminance
If the luminance (perceived brightness) is limited to a small range of k values, then it may be possible to segment directly in RGB space, without first subtracting off the brightness component. In this case, simply apply the norm threshold directly in RGB space:
- 1-norm: octahedron
- 2-norm: sphere
- infinity-norm: cube

Example - Special Case
[Figure: RGB scatter plot and segmentation of the example image]
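When brightness varies little, the threshold is applied to the distance from a single reference color rather than from a line. In this sketch the reference color, threshold, and image are illustrative.

```python
import numpy as np

# Sketch of the special case: threshold directly on distance from a
# reference RGB color (no brightness removal).
ref = np.array([0.55, 0.43, 0.26])        # hypothetical target color
threshold = 0.08

rng = np.random.default_rng(3)
img = rng.random((120, 160, 3))           # synthetic RGB image

diff = img - ref
mask_1 = np.abs(diff).sum(axis=-1) < threshold          # octahedron
mask_2 = np.sqrt((diff**2).sum(axis=-1)) < threshold    # sphere
mask_inf = np.abs(diff).max(axis=-1) < threshold        # cube

# For a common threshold the regions nest: octahedron within sphere within cube
assert np.all(mask_1 <= mask_2) and np.all(mask_2 <= mask_inf)
```

The nesting follows from the norm inequality used earlier: the infinity norm is the smallest of the three, so its acceptance region is the largest.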
Other Color Spaces
Not all systems work with the RGB color format. Other commonly used systems:

Linear transformations of RGB:
- CIE (Commission Internationale de l'Eclairage) XYZ: system used for color comparison; has negative primaries so that all weights, f, are positive. (applycform: Matlab conversion for device-independent color spaces, such as CIE XYZ)
- CMY (Cyan-Magenta-Yellow): used for absorptive pigments (e.g. an inkjet printer).
- YIQ (Luminance, Chrominance): used for television signals (e.g. the NTSC format). (rgb2ntsc: Matlab conversion of an image into the YIQ color space)

Nonlinear transformations:
- HSV (Hue, Saturation, Value): formalizes the notion of a color wheel; uses stretched polar coordinates about a brightness axis, (R+G+B)/3, using energy-based RGB coordinates. (rgb2hsv: Matlab conversion of an image into the HSV color space)

Summary
In computer vision, color values reflect:
1. Technology tuned to human perception (trichromacy)
2. Physics (radiometry)
Diffuse scatterers (i.e. matte objects) appear as lines in RGB color space. Color-based segmentation performs best, in general, if brightness is removed prior to color comparison.
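Python's standard library offers per-pixel analogues of the Matlab conversions named above: colorsys provides RGB-to-HSV (nonlinear) and RGB-to-YIQ (linear, NTSC) for single (r, g, b) triples.

```python
import colorsys

# Sketch: standard-library analogues of Matlab's rgb2hsv / rgb2ntsc for a
# single pixel with channel values in [0, 1].
r, g, b = 0.8, 0.4, 0.2

h, s, v = colorsys.rgb_to_hsv(r, g, b)    # nonlinear: hue/saturation/value
y, i, q = colorsys.rgb_to_yiq(r, g, b)    # linear: NTSC luminance/chrominance

# Value is the max channel; Y is a fixed linear combination of R, G, B
assert v == max(r, g, b)
assert abs(y - (0.30 * r + 0.59 * g + 0.11 * b)) < 1e-9

# The HSV conversion is invertible
for want, got in zip((r, g, b), colorsys.hsv_to_rgb(h, s, v)):
    assert abs(want - got) < 1e-9
```

For whole images, the same conversions are typically applied with vectorized routines (e.g. in an image-processing library) rather than pixel by pixel.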