A Video-Based Face Detection Method Using a Graph Cut Algorithm in Classrooms


Jiun-Lin Guo, Chiung-Yao Fang*, Yi-Chun Li, and Sei-Wang Chen
Department of Computer Science and Information Engineering
National Taiwan Normal University, Taipei, Taiwan
*Corresponding Author: violet@csie.ntnu.edu.tw

ABSTRACT This study presents a face detection method for student identification applied in classrooms with varying illumination and complex environments. The face detection system should be able to process many students simultaneously, sometimes more than thirty. Moreover, the faces may not be directly in front of the camera. These changes of head pose increase the difficulty of face detection. In this paper, an improved dynamic graph cut algorithm is applied to extract foregrounds and to detect a subject's skin color regions. The main advantage of using the graph cut algorithm to extract regions is that the extracted results are usually smooth and complete, because the algorithm takes the relationship with neighboring pixels into account. Moreover, such methods tolerate small camera shifts better than pixel-based models. This paper proposes an improved dynamic graph cut algorithm that can deal with a sequence of frames, reduce the running time of the graph cut algorithm, and automatically provide the hard constraints for the input frames. Finally, the subject's facial region is selected from the detected skin color regions. The experimental results show that the proposed method can robustly detect the subject's face under various illuminations and in complex classroom environments.

Keywords: dynamic graph cut algorithm, foreground extraction, skin color detection, video-based face detection

1. Introduction

Classroom observation is the process of recording the instructor's teaching practices and student actions, including facial expressions. The results can be used to refine the materials or improve the teaching methods. However, obtaining these types of data manually is time consuming and may disturb the students' learning. Many researchers have therefore focused on developing automatic systems to collect this data quickly (Fang, Kuo, Lee, & Chen, 2011). In this work, a face detection method is proposed to robustly detect students' faces under various illuminations and in complex classroom environments. It is hoped that the high quality of the detection results will be helpful for recognizing a student's facial expressions, including concentration, distraction, and drowsiness. Observing and recording the facial expressions of students during classes allows an evaluation of the teaching methods, which is a very important factor in classroom observation in educational research.

This study develops a student face detection system that addresses the following issues. Firstly, the system should be able to process successive frames, as it is applied to classroom observation of multiple subjects. Secondly, the system should be able to process many students simultaneously, sometimes in excess of thirty. Finally, students' faces may not always be directly in front of the camera. They may turn their heads to talk to each other, to look at the teacher, or to read from the blackboard during class. These changes of head pose increase the difficulty of face detection. Features that are invariant with respect to different head poses should therefore be considered when developing the face detection system. Since color features are invariant with respect to different head poses, this paper proposes a color-based method to extract skin color and detect faces. The main problem impacting a color-based method is the effect of illumination.
If this problem can be solved, the color features will be robust under various head poses. The range of skin color in various color spaces has been studied extensively; researchers have collected large numbers of images with skin color regions and investigated the suitable range of skin color within given color spaces. HSV (Sigal, Sclaroff, & Athitsos, 2004; Zhang, Jiang, Liang, & Liu, 2010) and YCrCb (Chai & Ngan, 1999; Mahmoud, 2008; Phung, 2002) are two color spaces commonly used in skin color detection, since the distributions of skin colors are more concentrated in these two color spaces than in others. Color compensation techniques can be applied to the input images (Zhang et al., 2010) to deal with the illumination effect. Moreover, Sigal et al. (2004) used a Markov chain and Bayes' rule to model the color change parameters in an attempt to dynamically learn the suitable range of skin color. In this study, a fixed skin color interval of the HSV color space, similar to the one proposed by Zhang et al.

(2010), is used as the initial skin color range of the system. This skin color range is dynamically updated through an improved graph cut algorithm proposed in this paper.

In addition, a reliable foreground extraction technique is required. This paper also proposes a foreground extraction technique, an improved graph cut algorithm, that is suited to the classroom environment. Foreground extraction is usually achieved through pixel-based modeling. For instance, a mixture of Gaussians is a commonly used model (Elgammal, Duraiswami, Harwood, & Davis, 2002; Kae & Bow, 2010). It provides a model for estimating background colors; however, the background colors must initially occupy a higher proportion of time. Secondly, a pixel-based model does not take the color information of neighbors into account; thus, noise may affect the foreground extraction results. Finally, a pixel-based model is sensitive to camera shifts of only a few pixels. Therefore, many researchers (Heikkilä & Pietikäinen, 2006; Cheng, Gong, Schuurmans, & Caelli, 2011) have applied complex pixel-based methods to improve the results, attempting to use neighboring information to obtain smoother foreground extraction results.

The traditional graph cut algorithm (Wu & Leahy, 1993; Juan & Boykov, 2006) provides another foreground extraction concept. This method was originally developed for single-image segmentation and requires some initial information inputted manually by the user (Wu & Leahy, 1993). Users provide hard constraints (i.e., several labeled pixels) using brush tools to construct the color distributions of background and foreground pixels. Application of the graph cut algorithm, based on the hard constraints, allows the system to extract the foreground pixels.
The main advantage of using the graph cut algorithm to extract the foreground is that the extracted results are usually smooth and complete, since the method naturally takes the relation of neighboring pixels into account. Moreover, this method is more resistant to small camera shifts than pixel-based models. In this paper, an improved dynamic graph cut algorithm is proposed to deal with sequences of frames, reduce the running time of the traditional graph cut algorithm, and automatically provide the hard constraints of the input frames. The proposed method is used to extract the foreground and detect a student's face under various illuminations and in complex classroom environments.

2. System Overview

Figure 1 shows a flowchart of the face detection system. The proposed system first uses an improved dynamic graph cut algorithm to extract the foregrounds of the input frame. A dynamic skin color range, based on the extracted foregrounds, is applied to detect skin color regions. The preliminary result of skin color detection is then refined by applying another graph cut algorithm. Finally, the face regions are

automatically selected from the refined skin regions and the process is complete. It should be noted that the kernel technique of the proposed system is the graph cut algorithm. To apply the graph cut algorithm to video foreground extraction, two issues should be considered: (1) the information about hard constraints should be provided automatically, and (2) the time complexity of the graph cut algorithm should be reduced for a real-time system. This study solves these issues to improve the graph cut algorithm. The graph cut algorithm is introduced in the following section.

[Flowchart: Input Frames -> Foreground Extraction -> Skin Color Region Detection -> Skin Color Region Refinement -> Face Region Selection, with a Color Range Updating feedback step.] Fig. 1. Flowchart of the face detection system.

3. Graph Cut Algorithm

3.1 Energy function

Image labeling methods can be used to extract the foreground pixels of the input frame. The fitness of an image labeling can be measured using an energy function. Given a frame I with N pixels, the labeling can be recorded by a binary vector A = (a_1, a_2, ..., a_N), where a_i ∈ {0, 1}, i = 1, 2, ..., N. For each pixel i, a_i = 1 indicates a foreground pixel; otherwise (a_i = 0), it is a background pixel. Boykov and Jolly (2001) defined an energy function of a labeled vector A:

E(A) = R(A) + λB(A), (1)

where λ is a constant. In Eq. (1), R(A) is the region part and B(A) is the smooth part. They are defined as:

R(A) = Σ_{i=1}^{N} r(a_i), (2)

and:

r(a_i) = { -log P(I_i | O)  if a_i = 1
         { -log P(I_i | B)  otherwise,                                    (3)

where P(I_i | O) and P(I_i | B) are the conditional probabilities of the color I_i of pixel i occurring under the foreground (O) and background (B) color models, respectively. These two color models can be obtained from histograms computed from the hard constraints. The smooth part B(A) is defined over each pair of neighboring pixels (4-connected or 8-connected). Let p and q indicate two neighboring pixels, p ≠ q and 1 ≤ p, q ≤ N. Thus:

B(A) = Σ_{p,q} b(p, q) δ(a_p, a_q), (4)

where

δ(a_p, a_q) = { 1 if a_p ≠ a_q
             { 0 otherwise,

and b(p, q) = exp{-β(I_p - I_q)^2}, where I_p and I_q are the color values of the pixels p and q, respectively. In Eq. (2), R(A) is the penalty for the unfitness between the labeled results and the background or foreground color model. In comparison, in Eq. (4), B(A) is the penalty for the color similarity of each pair of neighboring pixels that have different labels. With the energy function defined, the next step is to determine the optimal labeling that minimizes it.

3.2 Weighted graph construction

The traditional graph cut algorithm proposed by Boykov and Jolly (2001) is a technique that can be used to minimize the energy function to find the optimal labeling. To apply the well-known minimum-cut/maximum-flow approach, a weighted graph should first be constructed. Boykov and Jolly (2001) introduced two additional virtual vertices, the foreground terminal S and the background terminal T, in the weighted graph construction, shown in Figure 2. Given a frame I, a graph G = (V, E) can be constructed, where the set of vertices V contains all pixels in the frame plus the foreground and background terminals, S and T. In the graph, each pixel has two types of links, neighboring links and terminal links. These links form the set

of edges E. A neighboring link (n-link) connects two neighboring pixels in the frame, while a terminal link (t-link) connects each pixel to S or T. In the 4-connected case, one pixel has four n-links, as shown in Figure 2. Let p and q be two neighboring pixels in the frame, and let S and T be the foreground and background terminals, respectively. The weight of the n-link between p and q, W_{n-link}(p, q), is defined as:

W_{n-link}(p, q) = exp{-β(I_p - I_q)^2}, (5)

where p ≠ q and 1 ≤ p, q ≤ N.

[Fig. 2 shows the weighted graph: image pixels connected by n-links, with t-links from each pixel to the foreground terminal S and the background terminal T.] Fig. 2. Construction of the weighted graph.

It should be noted that the weights of the n-links are defined by the smooth part of the energy function, equal to the function b in Eq. (4). The weight of the t-link between p and the foreground terminal S is defined as:

W_{t-link}(p, S) = { 4λ + ε          if p ∈ O
                  { 0               if p ∈ B                              (6)
                  { -log P(I_p | B) otherwise,

where λ and ε are constants. If pixel p is labeled as a background pixel, then the weight of the t-link between p and the foreground terminal S is set to zero. If pixel p is labeled as a foreground pixel, then the weight of the t-link between p and S is set to 4λ + ε to ensure it is larger than the sum of all its n-link weights. Moreover, according to the function r shown in Eq. (3), the weights of the t-links of the unlabeled pixels are set to the region part of the energy function. The weight of the t-link between p and the background terminal T is similarly defined as:

W_{t-link}(p, T) = { 4λ + ε          if p ∈ B
                  { 0               if p ∈ O                              (7)
                  { -log P(I_p | O) otherwise.

This completes the construction of the weighted graph. Boykov and Jolly (2001) proved that the minimum cut of the weighted graph is equivalent to the optimal labeling of its corresponding frame, and that the sum of the weights of the links crossing the minimum cut equals the minimum of the energy function. The most significant property is that after performing the minimum cut algorithm, each pixel of the frame is connected to one and only one terminal vertex. The labeling result can be obtained by checking which terminal each pixel is connected to.

4. Foreground Extraction Using a Graph Cut

4.1 Hard constraints of foreground areas

As mentioned previously, hard constraints indicate the initial labeling of the foreground and background pixels. In this section, we propose an automatic technique to label the foreground and background pixels as the hard constraints. Two probability models for the intensity distributions of the background and foreground pixels can then be constructed based on the hard constraints.

Given two color frames, a difference image can be obtained by pixel-to-pixel intensity subtraction in each color channel. If one of the frames represents the background image, then the difference image reveals the foreground objects. An example is shown in Figure 3: Figure 3(a) shows a background image captured in a classroom, Figure 3(b) shows a frame with some students in the same classroom, and their difference image is shown in Figure 3(c). In Figure 3(c), the darkest pixels correspond to the background pixels, since their difference values are close to zero. Moreover, even though the colors are changed by the subtraction, the students can still be observed roughly in the difference image. If these two frames are both represented in an RGB color model, then their pixel-to-pixel difference histogram in each channel can be easily obtained.
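The difference-image step above can be sketched as follows (a minimal sketch assuming NumPy arrays for the frames; the function names are illustrative, not from the paper):

```python
import numpy as np

def difference_image(frame, background):
    # Signed pixel-to-pixel subtraction in each color channel; background
    # pixels yield values near zero, foreground objects stand out.
    return frame.astype(np.int16) - background.astype(np.int16)

def channel_histograms(diff):
    # One difference histogram per color channel over the signed
    # range [-255, 255] (index 255 corresponds to a difference of zero).
    return [np.bincount((diff[..., c] + 255).ravel(), minlength=511)
            for c in range(diff.shape[-1])]
```

The signed (rather than absolute) difference is kept because, as discussed next, the histogram is not symmetric around zero and the two sides are thresholded separately.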

(a) (b) (c) Fig. 3. An example showing how the difference image is obtained.

Fig. 4. The histograms of the R, G, and B channels (from left to right) of the difference image shown in Fig. 3(c).

Figure 4 shows the histograms of the R, G, and B channels, from left to right, of the difference image shown in Fig. 3(c). Since the number of background pixels is always large, most of the difference values are close to zero. Binarizing the difference image with a threshold can reveal the foreground objects. Chiu et al. (2010) proposed a fast algorithm to determine a suitable threshold. However, since the difference histogram is not symmetric around the zero point, a single threshold is not sufficient to obtain the foreground object. Thus, the method is improved here and the thresholds are defined more precisely: two thresholds, T_D and T_H, are used in this study.

[Fig. 5 shows the difference histogram with zero at the center; the starting points S_D and S_H and the thresholds T_D and T_H are marked on either side of zero.] Fig. 5. An illustration of threshold determination.
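The two-threshold binarization can be sketched as below, following Chiu et al.'s criterion that a suitable threshold sits at a local minimum of the difference histogram; the starting points are taken here as given parameters, and all names are illustrative:

```python
import numpy as np

def first_local_min(hist, start, step):
    # Walk outward from the starting index until the first local minimum,
    # the discrete analogue of H'(x) = 0 and H''(x) > 0.
    i = start
    while 0 < i + step < len(hist) - 1:
        i += step
        if hist[i] < hist[i - 1] and hist[i] < hist[i + 1]:
            break
    return i

def binarize(diff, hist, s_d, s_h, zero=255):
    # T_D is searched leftwards from S_D and T_H rightwards from S_H;
    # pixels whose difference lies outside [T_D, T_H] become foreground.
    t_d = first_local_min(hist, s_d, -1) - zero
    t_h = first_local_min(hist, s_h, +1) - zero
    return (diff < t_d) | (diff > t_h)
```

Starting the search away from zero, as in the text, keeps the small noise bins near the histogram center from being picked as spurious minima.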

Let H_D(x) be the difference histogram. Chiu et al. (2010) argue that if x* is a suitable threshold, then H_D(x*) should be a local minimum of the difference histogram; thus, x* should satisfy H_D'(x*) = 0 and H_D''(x*) > 0. Instead of searching for the local minima from the middle zero point, the system starts from the points S_D and S_H shown in Figure 5, which are determined by the proportions of the foreground area in the previous frame. The first local minima outside these two starting points are taken as the thresholds T_D and T_H, respectively. Searching for the thresholds from the starting points S_D and S_H prevents the small noise near the middle of the difference histogram from having an impact.

Figure 6 shows an example of the image binarization procedure, where Figure 6(a) shows the input frames, Figure 6(b) shows the foreground extraction results using only the single threshold proposed by Chiu et al. (2010), and Figure 6(c) shows the foreground extraction results using the two thresholds T_D and T_H. A comparison of the corresponding frames in Figures 6(b) and (c) shows that the foreground areas extracted using two thresholds are more complete and correct.

(a) (b) (c) Fig. 6. An example of image binarization. (a) The input frames, (b) the foreground extraction results using only one threshold, and (c) the foreground extraction results using two thresholds.

4.2 Foreground extraction

The next step is to use the foreground extraction results as the hard constraint pixels

to construct the color distributions of the background and foreground. The color distribution of the foreground area, P(x | O), is constructed from the foreground extraction result, while the color distribution of the background area, P(x | B), can be computed directly from the background image stored in the system. To prevent gradual changes in the background from impacting the classification, after the graph cut refinement the intensity values of the pixels in the background image are updated wherever the pixels of the input frame are classified as background pixels:

I_B(i, j) = α I_B(i, j) + (1 - α) I(i, j), (8)

where I_B is the background image and I is the input frame. The symbol α is an input parameter that adjusts the speed of updating the background pixels.

In Figure 6(c), an observation of the face areas of the student in front shows that the foreground areas are still defective. Thus, using the foreground areas as the hard constraints, the graph cut algorithm can be applied to obtain more accurate foreground results. Figure 7 shows the final foreground extraction result refined by the graph cut algorithm.

Fig. 7. The final foreground extraction result refined by the graph cut algorithm.

5. Skin Color Region Detection Using a Graph Cut

5.1 Hard constraints of skin color regions

The graph cut algorithm is also applied to detect skin color regions, since: (1) skin color pixels are usually collected into several compact regions in the input frames, and (2) the distribution of skin color is highly concentrated in the hue component. Figure 8 shows an example in which the skin color regions are compact in the hue component. One can observe that in the skin regions, such as faces and hands, the hue values of the pixels are very similar. Thus, a fixed skin color interval in the HSV color space, 0 ≤ H ≤ 50, is used to detect the initial skin color pixels. These pixels are regarded as the hard constraints of the skin color regions.
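The running-average update of Eq. (8) might be sketched as follows (NumPy, with an illustrative boolean mask standing in for the graph-cut background classification):

```python
import numpy as np

def update_background(background, frame, is_background, alpha):
    # Eq. (8): I_B(i, j) <- alpha * I_B(i, j) + (1 - alpha) * I(i, j),
    # applied only where the pixel was classified as background;
    # alpha adjusts the speed of updating the background pixels.
    bg = background.astype(np.float64)
    fr = frame.astype(np.float64)
    blended = alpha * bg + (1.0 - alpha) * fr
    return np.where(is_background[..., None], blended, bg)
```

A large alpha keeps the stored background stable; a small alpha lets it follow gradual lighting changes more quickly.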

(a) (b) Fig. 8. An example showing the skin color regions: (a) the input frames and (b) their corresponding hue values.

Note that a fixed skin color interval usually selects most skin pixels, but the detection result is often fragmented due to a few outliers. Moreover, the lighting conditions in a classroom vary over time, affecting the color distribution of the skin pixels. This means the suitable skin color interval of the input frame should not be fixed. Therefore, once the graph cut algorithm has been applied to detect the complete skin region, a dynamic learning scheme is used to update the interval every frame. Thus, the hard constraints of the skin color regions can follow the changes in lighting conditions. The proposed algorithm to initialize and update the skin color interval is as follows:

1. Construct the hue histogram H of the skin color pixels obtained by the graph cut algorithm. Let the value of bin i in the histogram H be h_i, where 0 ≤ i ≤ 255.
2. If the system is initializing, find the top three values in the interval [0, 60]; otherwise, find the three peak values nearest the center of the previous skin color interval. Take their average s as the center of the new interval.
3. Sum the histogram values located to the left of s to obtain the left portion P_l:

   P_l = (Σ_{j=0}^{s} h_j) / (Σ_{i=0}^{255} h_i), (9)

   and calculate the right bound value r, which should satisfy:

   (Σ_{j=s}^{r} h_j) / (Σ_{i=0}^{255} h_i) ≥ P_l.

4. Compute the standard deviation σ_s of the histogram values located in [0, r].
5. Set the interval [s - 2σ_s, s + 2σ_s] as the new interval.

It should be noted that skin color pixels are distributed on the left side of the hue histogram, since the skin colors are close to red in the hue component. In addition, to

prevent the new interval from diverging too much, the maximum width of the interval is bounded by 60. If the width of the interval would exceed this maximum, then the system only shifts the interval and does not change its width.

5.2 Skin color region extraction

The distribution of the skin color values P(x | skin) is calculated from the detection results of the skin color interval, while the distribution of non-skin color values P(x | ~skin) is obtained from all the foreground pixels with the exception of the skin color pixels. Figure 9 shows an example of skin-color region extraction. Figure 9(a) shows the input frames and Figure 9(b) shows the hard constraints detected by the initial skin color interval. Figure 9(c) shows the skin-color region extraction results of the graph cut algorithm. One can observe that the skin color regions shown in Figure 9(c) are more complete and compact than those shown in Figure 9(b).

(a) (b) (c) Fig. 9. An example of the skin-color segmentation result: (a) the input frames, (b) the hard constraints detected by the initial skin color interval, and (c) the skin-color region extraction results of the graph cut algorithm.
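The five-step interval update described in Section 5.1 could be sketched as below. This is one interpretation under stated assumptions: step 4's "variance" is read as the spread of the hue values weighted by the histogram restricted to [0, r], peak selection is simplified, and all names are illustrative:

```python
import numpy as np

def update_skin_interval(hue_values, prev_center=None, max_width=60):
    # Step 1: hue histogram of the skin pixels found by the graph cut.
    h = np.bincount(hue_values, minlength=256)
    total = max(h.sum(), 1)
    # Step 2: three peaks -- at start-up the top three bins in [0, 60],
    # afterwards the three non-empty bins nearest the previous center --
    # averaged to give the new center s.
    if prev_center is None:
        peaks = np.argsort(h[:61])[-3:]
    else:
        order = np.argsort(np.abs(np.arange(256) - prev_center))
        peaks = order[h[order] > 0][:3]
    s = int(np.mean(peaks))
    # Step 3: left portion P_l (Eq. 9) and the right bound r with
    # sum(h[s..r]) / total >= P_l.
    p_l = h[:s + 1].sum() / total
    cum = np.cumsum(h[s:]) / total
    r = min(s + int(np.searchsorted(cum, p_l)), 255)
    # Step 4: spread sigma_s of the hue distribution restricted to [0, r].
    bins = np.arange(r + 1)
    w = h[:r + 1]
    wsum = max(w.sum(), 1)
    mean = (bins * w).sum() / wsum
    sigma = np.sqrt((((bins - mean) ** 2) * w).sum() / wsum)
    # Step 5: new interval [s - 2*sigma, s + 2*sigma], width capped at 60.
    lo, hi = s - 2 * sigma, s + 2 * sigma
    if hi - lo > max_width:
        lo, hi = s - max_width / 2, s + max_width / 2
    return max(lo, 0.0), min(hi, 255.0)
```

On each frame the returned interval replaces the previous one, so the hard constraints track gradual lighting changes.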

6. Face Region Selection

The final stage of the proposed system is the selection of face regions. The face regions can be selected from the skin color regions obtained in the previous section. Face regions usually contain more complicated textures than non-face regions, for example, arm and leg regions. Moreover, the shape of a face region is more similar to a circle or a rectangle than the shapes of non-face regions. Thus, given a skin color region and its corresponding bounding box, several criteria are proposed to distinguish face regions from non-face regions.

1. Regularity: Regularity evaluates the skin region's similarity to its bounding box, computed as:

   R = A / B, (10)

   where A is the area of a skin color region and B is the area of its bounding box. The R value of a face region should be larger than a given threshold.

2. Aspect ratio: The aspect ratio is defined as the ratio of the height to the width of the bounding box. In this study, the aspect ratio of a face region is set in the range [0.5, 1].

3. Convexity: The shape of a face region should be convex, which means the center of mass must lie inside the region.

4. Number of corners: Since face regions are more complex than non-face regions, the number of corners in a face region should be larger than that in a non-face region. In this study, the smallest eigenvalue of the Hessian matrix is used as the corner feature. Given an image I:

   H = Σ_p w(p) [ (dI/dx)^2        (dI/dx)(dI/dy) ]
                [ (dI/dx)(dI/dy)   (dI/dy)^2      ], (11)

   where w(p) is a mask with a 3 × 3 patch size.

In Figure 10(a), the green boxes represent the bounding boxes of the skin color regions and, in Figure 10(b), the green points are the corners detected in the skin color regions. Note that the number of corners in a face region is larger than that in a non-face region.
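The four criteria above might be combined as in this sketch; the regularity and corner-count thresholds are illustrative placeholders, since the paper's exact values are not given here:

```python
def is_face_region(area, box_w, box_h, center_inside, n_corners,
                   min_regularity=0.5, min_corners=10):
    # Criterion 1: regularity R = A / B (Eq. 10), region area over
    # bounding-box area.
    regularity = area / float(box_w * box_h)
    # Criterion 2: aspect ratio (height / width) restricted to [0.5, 1].
    aspect = box_h / float(box_w)
    # Criterion 3: convexity -- the center of mass lies inside the region.
    # Criterion 4: enough corners, since faces are more textured.
    return (regularity > min_regularity
            and 0.5 <= aspect <= 1.0
            and center_inside
            and n_corners >= min_corners)
```

A region must pass all four tests; failing any one of them marks it as a non-face candidate for the probability update described below.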

(a) (b) Fig. 10. An example of the corner detection results on skin color regions: (a) bounding boxes of skin color regions and (b) the corners detected in the skin color regions.

Using these criteria, most of the face regions can be selected quickly and accurately. However, there are still some challenging cases. Firstly, some skin color regions in the background, which can be regarded as noise, will affect the selection results. Secondly, several skin color regions may be occluded by others and merge into a single region. Finally, the face regions under some head poses may not contain enough corners. Thus, a temporal face-region finding scheme and a probability updating function are applied to avoid inaccurate detection of some face regions.

Let R_{i,j} be a face region, denoting the j-th face region in frame i, let B_{i,j} be its corresponding bounding box, and let N_{i,j} be the number of skin color pixels in B_{i,j}. The temporal face-region finding scheme is as follows. If R_{i+1,j} is not found in frame i+1, then no face region in frame i+1 is detected near R_{i,j}, and the system computes the number of skin color pixels, N_{i+1,j}, inside B_{i,j}. In this case, if N_{i+1,j} ≥ (1/2) N_{i,j}, then a shift vector is computed to obtain the center of B_{i+1,j}. This shift vector begins at the center of B_{i,j} and ends at the center of the skin color region inside B_{i,j} in frame i+1. An example of the shift vector computation is shown in Figure 11. Figure 11(a) shows a bounding box (the green box) detected in frame i and Figure 11(b) shows the noisy skin-color detection result and the shift vector (the blue arc) detected in frame i+1. The shift vector helps the system track the face regions in frame i+1.
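The shift-vector step of the temporal finding scheme could be sketched as follows (NumPy; the box layout and names are illustrative):

```python
import numpy as np

def shift_vector(box, skin_mask, prev_count):
    # box = (x, y, w, h) of B_{i,j} from frame i; skin_mask is the binary
    # skin-color detection result of frame i+1; prev_count = N_{i,j}.
    # A shift is computed only when N_{i+1,j} >= prev_count / 2.
    x, y, w, h = box
    ys, xs = np.nonzero(skin_mask[y:y + h, x:x + w])
    if len(xs) < prev_count / 2.0:
        return None  # too few skin pixels remain inside the old box
    cx, cy = x + w / 2.0, y + h / 2.0        # center of B_{i,j}
    mx, my = x + xs.mean(), y + ys.mean()    # skin centroid inside the box
    return (mx - cx, my - cy)                # vector toward B_{i+1,j}'s center
```

Translating B_{i,j} by the returned vector gives a candidate position for the missing bounding box in frame i+1.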

(a) (b) Fig. 11. An example of computing the shift vector: (a) a bounding box and (b) the noisy skin color detection result and its shift vector.

The probability update function is defined below. For a skin color region R_{i,j}, the probability of R_{i,j} being a face region is updated by:

P_f(R_{i,j}) = { 1                    if R_{i,j} satisfies the criteria
             { P_f(R_{i,j}) - c_1   if R_{i,j} does not satisfy the criteria   (12)
             { P_f(R_{i,j}) - c_2   if R_{i,j} is not found,

where c_1 and c_2 are two predefined constants that adjust how long R_{i,j} survives under the second and third conditions. In this study, if the probability of a face region decreases to zero, the system removes the face region.

7. Dynamic Graph Cut Algorithm

The traditional graph cut algorithm is memory and time consuming. A large graph must be constructed for each frame, containing one vertex per pixel (plus the two terminals) and a proportionally large number of edges. Applying a max-flow/min-cut algorithm to such a large graph requires substantial computation time. Kohli and Torr (2007) proposed a dynamic graph cut algorithm to reduce the computation time. The dynamic graph cut algorithm uses the graph cut result of frame i as the initial graph of frame i+1 and only changes some edge weights. However, changing the weights of edges changes the flow capacities. If a resulting capacity is less than the flow through it, a flow inconsistency occurs and the structure of the graph is destroyed. Kohli and Torr's strategy is to re-parameterize the graph, which updates the flow capacities for the new graph without affecting the image labeling result. Let θ_1 and θ_2 be two different assignments of the weights on the graph; then θ_2 is called a re-parameterization of θ_1 if and only if

arg min_A E_{θ1}(A) = arg min_A E_{θ2}(A). The method to re-parameterize a graph involves modifying the t-links and n-links through a specific mechanism. Figure 12 shows an example of how a graph is re-parameterized by modifying t-links. Figure 12(a) shows a residual graph without an augmenting path, which is equivalent to a flow network with a max-flow passing through it. After frame i+1 is inputted, an inconsistent flow on edge {S, p}, which has been assigned a negative-weight t-link, is obtained. The solution is to add a positive value a to both {S, p} and {p, T} to update the weights of the edges so that no negative-weight t-link occurs, as shown in Figure 12(b). The updated graph is a re-parameterized graph with updated t-link weights and is free from flow inconsistency (Figure 12(c)).

(a) (b) (c) Fig. 12. An example showing the re-parameterization of a graph by modifying t-links: (a) the residual graph with a negative-weight t-link, (b) adding a positive value to the t-links, and (c) the re-parameterized graph.

Figure 13 shows an example of the re-parameterization of a graph by modifying its n-links. In Figure 13(a), a flow inconsistency occurs on edge {q, p} after frame i+1 is inputted. To avoid the occurrence of negative-weight edges, a positive value a is added to the weights of {q, p}, {p, T}, and {S, p}. This update can be regarded as reversing the overflow to avoid the inconsistency, as shown in Figure 13(b). It should be noted that a should be the minimum positive value that avoids the negative-weight n-links without raising other flows. The updated graph is a re-parameterized graph with updated n-link weights and is free from flow inconsistency (Figure 13(c)).
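The t-link re-parameterization in Figure 12 rests on a simple invariant: adding the same constant a to both t-links of a pixel raises the cost of every s-t cut by exactly a, because each cut severs exactly one of the two, so the arg min is unchanged. A minimal sketch (names are illustrative):

```python
def reparameterize_tlinks(w_sp, w_pt):
    # If a weight update left one of pixel p's t-links negative, add the
    # smallest constant a that restores non-negativity to both t-links
    # {S, p} and {p, T}; the minimum cut (hence the labeling) is unchanged.
    a = max(0, -min(w_sp, w_pt))
    return w_sp + a, w_pt + a
```

Choosing the smallest such a keeps the residual capacities tight, which is what makes the subsequent max-flow computation on the updated graph cheap.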

(a) (b) (c) Fig. 13. An example of graph re-parameterization by modifying n-links: (a) the residual graph with a negative-weight n-link, (b) reversing the overflow by adding a positive value, and (c) the re-parameterized graph.

(a) (b) Fig. 14. Two examples of the speed-up achieved by the dynamic graph cut: (a) the input videos and (b) the runtimes required to find the max-flow.

The dynamic graph cut algorithm provides a scheme for modifying the weights of a residual graph in order to avoid the overhead of constructing and deconstructing a large graph for every frame. Furthermore, since the graph modified by the dynamic graph cut is an almost full flow network, it greatly reduces the runtime required to

apply the max-flow algorithm to the graph. Figure 14 shows two examples of the achievable speed-up. Figure 14(a) shows the first frames of two input videos, and Figure 14(b) shows their corresponding runtimes to find the max-flow. The blue lines show the per-frame runtime of the traditional graph cut algorithm, and the red lines show the runtime of the dynamic graph cut algorithm. It can clearly be seen that the runtime of the dynamic graph cut algorithm is less than half that of the traditional algorithm.

Although the dynamic graph cut algorithm is more effective than the traditional graph cut algorithm, it can be improved further. In video input, successive frames are very similar, and the objects in them do not move much. Thus, the system can simply modify the weights of selected edges of the graph in the following frames and leave the others unchanged. A temporal difference with thresholds is used to decide whether the connected edges of a vertex should be modified or not. These thresholds are equal to the global thresholds T_D and T_H introduced in Section 4.1. Figure 15 shows an example of the runtime of the improved dynamic graph cut algorithm. Figure 15(a) shows the first frame of the input video and Figure 15(b) shows the runtimes required to find the max-flow. In Figure 15(b), the blue, red, and green lines show the runtimes of the traditional graph cut algorithm, the dynamic graph cut algorithm, and the proposed graph cut algorithm, respectively. In comparison to the dynamic graph cut algorithm, the average runtime of the proposed graph cut algorithm is reduced by approximately 16%, though its variance is higher.

(a) (b) Fig. 15. An example of the runtime of the improved dynamic graph cut algorithm: (a) the input video and (b) the runtime required to find the max-flow, where the green line is the runtime of the proposed graph cut algorithm.
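The selective-update rule of the improved algorithm might be sketched as below (assuming grayscale frames for brevity; the function name is illustrative):

```python
import numpy as np

def vertices_to_update(prev_frame, frame, t_d, t_h):
    # Only pixels whose temporal difference falls outside [T_D, T_H] have
    # their connected edges re-weighted; the rest of the residual graph
    # is carried over unchanged to the next frame.
    diff = frame.astype(np.int16) - prev_frame.astype(np.int16)
    return (diff < t_d) | (diff > t_h)
```

Since most classroom pixels change little between successive frames, the returned mask is typically sparse, which is where the extra runtime saving over the plain dynamic graph cut comes from.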

8. Experimental Results

The experiments consisted of three parts: foreground extraction, skin color detection, and face detection. The image sizes differ between the foreground extraction and skin-color detection stages and the face detection stage; the processing speed is approximately 12 frames/second. Five classrooms at National Taiwan Normal University, B101, B103, C209, S101, and the B1 lyceum, were used to obtain different classroom environments. Since classrooms B101 and B103 are similar, Figure 16 shows four of the five classrooms. These classrooms have different background color distributions and lighting conditions; B103 has a background similar to skin color, and C209 is the only one that is not a lecture theater.

(a) (b) (c) (d) Fig. 16. Four classrooms used to obtain the experimental videos: (a) B103, (b) C209, (c) S101, and (d) the B1 lyceum.

Seven videos (totaling 24,952 frames) were taken in these classrooms, named in series as 265, 266, 288, 292, 301, 303, and 307. Figure 17(a) shows one frame of each video. Figures 17(b) and (d) show the corresponding experimental results of foreground extraction and skin color detection, respectively. Moreover, to compute the error rates of foreground extraction and skin color detection, ground truth images were produced manually (shown in Figures 17(c) and (e)).

Table 1 shows the experimental results of foreground extraction. The precision rates and recall rates of foreground extraction are approximately 85–92% and 84–97%, respectively, and the f-measure values of the seven sequences are all high. One can see from Table 1 that the precision rates are all lower than the recall rates, except for video 265. This means that when objects are occluded by others, the small background regions near the object boundaries are classified as foreground areas.
However, in most cases this misclassification does not affect the face detection results unless the color of the background region is similar to skin color. If the input videos contain skin-color-like backgrounds, then the results of skin color detection will be

easily affected, and the heights and widths of the bounding boxes of the detected skin regions will be incorrect. These situations can be partially resolved by the probability updating function.

Fig. 17. Some experimental examples of (a) the experimental videos, (b) the experimental results of foreground extraction, (c) the ground truth of foreground extraction, (d) the experimental results of skin color detection, and (e) the skin color ground truth.

Table 1. The experimental results of foreground extraction.

Video No.   No. of frames   Precision   Recall    F-measure
265         …               …           84.43%    …
266         …               …           95.33%    …
288         …               …           97.18%    …
292         …               …           91.13%    …
301         …               …           91.85%    …
303         …               …           97.00%    …
307         …               …           95.36%    …

Table 2 shows the experimental results of skin color detection. The F-measure values of these seven sequences are generally lower than those of foreground extraction. In dark lighting conditions, as in videos 265 and 307, it can be seen that the system obtains the skin color regions

sufficiently well and achieves high precision rates (88.47% and 91.57%, respectively). However, some precision rates of skin color detection are much lower than those of the foreground extraction stage. The reasons for a low precision rate can be: (1) if there are few skin color pixels in some frames, the proportion of misclassified pixels increases; (2) the brown pixels of a student's hair are often misclassified as skin color pixels, lowering the precision rate, e.g., in video 288; and (3) many small noisy pixels around the skin regions in frames with skin-color-like backgrounds, e.g., in video 266, are misclassified as foreground skin color pixels. In comparison, low recall rates occur in video 288 (66.43%) and video 307 (66.99%), both of which were captured in the B1 lyceum. The blue color of the chairs and the lighting conditions cause color variation of the students' faces in the frames, making it difficult for the system to determine a suitable skin color range.

Table 2. The experimental results of skin color detection.

Video No.   No. of frames   Precision   Recall    F-measure
265         …               …           77.91%    …
266         …               …           88.49%    …
288         …               …           66.43%    …
292         …               …           81.73%    …
301         …               …           87.31%    …
303         …               …           71.91%    …
307         …               …           66.99%    …

Table 3 shows the final results of the proposed face detection method. Two critical factors affect the precision rates. The first is the distribution of the skin region positions: the more dispersed the skin regions are, the better the results. The second is the resolution of the skin regions, which depends on the lighting conditions, the position of the camera, and the size of the frame. In videos 266 and 303, the face regions are clear and non-overlapping, and the lighting conditions are good. Thus, their precision and recall rates are rather high (91-99%). Although the background color of video 266 is similar to skin color, application of the probability update function allows the system to find the face regions as long as the corner features are clear.
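The per-video scores reported in the tables are the standard pixel-wise precision, recall, and F-measure of a detection mask against the manually produced ground truth. A minimal sketch of this evaluation (not the authors' code) is:

```python
import numpy as np

def pixel_scores(pred, truth):
    """Pixel-wise precision, recall, and F-measure of a binary
    detection mask `pred` against a binary ground-truth mask `truth`."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # correctly detected pixels
    fp = np.logical_and(pred, ~truth).sum()   # false detections
    fn = np.logical_and(~pred, truth).sum()   # missed pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Toy 2x2 example: one true pixel detected, one false alarm, one miss.
pred = np.array([[1, 1], [0, 0]])
truth = np.array([[1, 0], [1, 0]])
p, r, f = pixel_scores(pred, truth)   # p = 0.5, r = 0.5, f = 0.5
```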
Videos 288 and 307 are both challenging cases, since the skin regions occupy only small areas of the scene. Moreover, the faces of students seated in the back rows are mostly occluded, which lowers the recall rates. Video 265 is the darkest case the system was tested on. The system obtains a good skin-color detection result, but the poor lighting conditions cause misdetection of the corner features. As a result, the system does not

detect most of the faces of students seated in the back rows, and the recall rate is reduced to 43%. In addition, the processing rate for all the videos is approximately 12 frames/second, meaning the technique has the potential to become a real-time system with further optimization or GPU parallelization.

Table 3. The experimental results of the proposed face detection method.

Video No.   No. of frames   Precision   Recall    F-measure
265         …               …           43%       …
266         …               …           99%       …
288         …               …           83%       …
292         …               …           93%       …
301         …               …           96%       …
303         …               …           98%       …
307         …               …           91%       …

9. Conclusions

Face detection using skin color is usually considered unreliable due to its sensitivity to lighting conditions, the race of the subjects, and other sources of skin color variation. In this paper, an improved graph cut algorithm is proposed to perform foreground and skin color extraction. Moreover, a dynamic learning strategy to update the skin color range in each frame is also proposed. This strategy improves the correctness of the initial skin color range and reduces the computational time of skin-color region detection. The experimental results show that, even in a classroom with a background similar to skin color, the system can still detect faces successfully. The proposed method is robust against the various head poses of the subjects, and will help in further analysis of the behavior of students in a classroom.

Acknowledgment

The authors would like to thank the National Science Council of the Republic of China, Taiwan, for financially supporting this research under Contract No. NSC E.

References

Boykov, Y. Y. & Jolly, M. P. 2001. Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images, Proceedings of the International Conference on Computer Vision, Vancouver, Canada, 1.
Chai, D. & Ngan, K. N. 1999. Face Segmentation Using Skin-Color Map in

Videophone Applications, IEEE Transactions on Circuits and Systems for Video Technology, 9 (4).
Cheng, L., Gong, M., Schuurmans, D., & Caelli, T. 2011. Real-Time Discriminative Background Subtraction, IEEE Transactions on Image Processing, 20 (5).
Chiu, C. C., Ku, M. Y., & Liang, L. W. 2010. A Robust Object Segmentation System Using a Probability-Based Background Extraction Algorithm, IEEE Transactions on Circuits and Systems for Video Technology, 20 (4).
Elgammal, A. M., Duraiswami, R., Harwood, D., & Davis, L. 2002. Background and Foreground Modeling Using Nonparametric Kernel Density Estimation for Visual Surveillance, Proceedings of the IEEE, 90 (7).
Fang, C. Y., Kuo, M. H., Lee, G. C., & Chen, S. W. 2011. Student Gesture Recognition System in Classroom 2.0, Proceedings of the IASTED International Conference on Computers and Advanced Technology in Education (CATE 2011), Cambridge, United Kingdom.
Heikkilä, M. & Pietikäinen, M. 2006. A Texture-Based Method for Modeling the Background and Detecting Moving Objects, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28 (4).
Juan, O. & Boykov, Y. 2006. Active Graph Cuts, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York.
KaewTraKulPong, P. & Bowden, R. 2001. An Improved Adaptive Background Mixture Model for Real-Time Tracking with Shadow Detection, Proceedings of the 2nd European Workshop on Advanced Video-Based Surveillance Systems, London.
Kohli, P. & Torr, P. H. S. 2007. Dynamic Graph Cuts for Efficient Inference in Markov Random Fields, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29 (12).
Mahmoud, T. M. 2008. A New Fast Skin Color Detection Technique, World Academy of Science, Engineering and Technology.
Phung, S. L. 2002. A Novel Skin Color Model in YCbCr Color Space and Its Application to Human Face Detection, Proceedings of the IEEE International Conference on Image Processing (ICIP '02), 1, New York.

Sigal, L., Sclaroff, S., & Athitsos, V. 2004. Skin Color-Based Video Segmentation Under Time-Varying Illumination, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26 (7).
Wu, Z. & Leahy, R. 1993. An Optimal Graph Theoretic Approach to Data Clustering: Theory and Its Application to Image Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 15 (11).
Zhang, X. N., Jiang, J. Z., Liang, H., & Liu, C. L. 2010. Skin Color Enhancement Based on Favorite Skin Color in HSV Color Space, IEEE Transactions on Consumer Electronics, 56 (3).


String Extraction From Color Airline Coupon Image Using Statistical Approach String Extraction From Color Airline Coupon Image Using Statistical Approach Yi Li, Zhiyan Wang, Haizan Zeng School of Computer Science South China University of echnology, Guangzhou, 510640, P.R.China

More information

A Texture-Based Method for Modeling the Background and Detecting Moving Objects

A Texture-Based Method for Modeling the Background and Detecting Moving Objects A Texture-Based Method for Modeling the Background and Detecting Moving Objects Marko Heikkilä and Matti Pietikäinen, Senior Member, IEEE 2 Abstract This paper presents a novel and efficient texture-based

More information

Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information

Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Fully Automatic Methodology for Human Action Recognition Incorporating Dynamic Information Ana González, Marcos Ortega Hortas, and Manuel G. Penedo University of A Coruña, VARPA group, A Coruña 15071,

More information

Segmentation of Distinct Homogeneous Color Regions in Images

Segmentation of Distinct Homogeneous Color Regions in Images Segmentation of Distinct Homogeneous Color Regions in Images Daniel Mohr and Gabriel Zachmann Department of Computer Science, Clausthal University, Germany, {mohr, zach}@in.tu-clausthal.de Abstract. In

More information

Detecting and Identifying Moving Objects in Real-Time

Detecting and Identifying Moving Objects in Real-Time Chapter 9 Detecting and Identifying Moving Objects in Real-Time For surveillance applications or for human-computer interaction, the automated real-time tracking of moving objects in images from a stationary

More information

A Hand Gesture Recognition Method Based on Multi-Feature Fusion and Template Matching

A Hand Gesture Recognition Method Based on Multi-Feature Fusion and Template Matching Available online at www.sciencedirect.com Procedia Engineering 9 (01) 1678 1684 01 International Workshop on Information and Electronics Engineering (IWIEE) A Hand Gesture Recognition Method Based on Multi-Feature

More information

Pupil Localization Algorithm based on Hough Transform and Harris Corner Detection

Pupil Localization Algorithm based on Hough Transform and Harris Corner Detection Pupil Localization Algorithm based on Hough Transform and Harris Corner Detection 1 Chongqing University of Technology Electronic Information and Automation College Chongqing, 400054, China E-mail: zh_lian@cqut.edu.cn

More information

Connected Component Analysis and Change Detection for Images

Connected Component Analysis and Change Detection for Images Connected Component Analysis and Change Detection for Images Prasad S.Halgaonkar Department of Computer Engg, MITCOE Pune University, India Abstract Detection of the region of change in images of a particular

More information

Computers and Mathematics with Applications. An embedded system for real-time facial expression recognition based on the extension theory

Computers and Mathematics with Applications. An embedded system for real-time facial expression recognition based on the extension theory Computers and Mathematics with Applications 61 (2011) 2101 2106 Contents lists available at ScienceDirect Computers and Mathematics with Applications journal homepage: www.elsevier.com/locate/camwa An

More information

A Comparison of Color Models for Color Face Segmentation

A Comparison of Color Models for Color Face Segmentation Available online at www.sciencedirect.com Procedia Technology 7 ( 2013 ) 134 141 A Comparison of Color Models for Color Face Segmentation Manuel C. Sanchez-Cuevas, Ruth M. Aguilar-Ponce, J. Luis Tecpanecatl-Xihuitl

More information

Image retrieval based on region shape similarity

Image retrieval based on region shape similarity Image retrieval based on region shape similarity Cheng Chang Liu Wenyin Hongjiang Zhang Microsoft Research China, 49 Zhichun Road, Beijing 8, China {wyliu, hjzhang}@microsoft.com ABSTRACT This paper presents

More information

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm Group 1: Mina A. Makar Stanford University mamakar@stanford.edu Abstract In this report, we investigate the application of the Scale-Invariant

More information

Input sensitive thresholding for ancient Hebrew manuscript

Input sensitive thresholding for ancient Hebrew manuscript Pattern Recognition Letters 26 (2005) 1168 1173 www.elsevier.com/locate/patrec Input sensitive thresholding for ancient Hebrew manuscript Itay Bar-Yosef * Department of Computer Science, Ben Gurion University,

More information

Robotics Programming Laboratory

Robotics Programming Laboratory Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car

More information

The Curse of Dimensionality

The Curse of Dimensionality The Curse of Dimensionality ACAS 2002 p1/66 Curse of Dimensionality The basic idea of the curse of dimensionality is that high dimensional data is difficult to work with for several reasons: Adding more

More information

Image Segmentation Using Iterated Graph Cuts BasedonMulti-scaleSmoothing

Image Segmentation Using Iterated Graph Cuts BasedonMulti-scaleSmoothing Image Segmentation Using Iterated Graph Cuts BasedonMulti-scaleSmoothing Tomoyuki Nagahashi 1, Hironobu Fujiyoshi 1, and Takeo Kanade 2 1 Dept. of Computer Science, Chubu University. Matsumoto 1200, Kasugai,

More information

CHAPTER 4 DETECTION OF DISEASES IN PLANT LEAF USING IMAGE SEGMENTATION

CHAPTER 4 DETECTION OF DISEASES IN PLANT LEAF USING IMAGE SEGMENTATION CHAPTER 4 DETECTION OF DISEASES IN PLANT LEAF USING IMAGE SEGMENTATION 4.1. Introduction Indian economy is highly dependent of agricultural productivity. Therefore, in field of agriculture, detection of

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 2014-2015 Jakob Verbeek, November 28, 2014 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.14.15

More information

Color Content Based Image Classification

Color Content Based Image Classification Color Content Based Image Classification Szabolcs Sergyán Budapest Tech sergyan.szabolcs@nik.bmf.hu Abstract: In content based image retrieval systems the most efficient and simple searches are the color

More information

A Robust Wipe Detection Algorithm

A Robust Wipe Detection Algorithm A Robust Wipe Detection Algorithm C. W. Ngo, T. C. Pong & R. T. Chin Department of Computer Science The Hong Kong University of Science & Technology Clear Water Bay, Kowloon, Hong Kong Email: fcwngo, tcpong,

More information

Tri-modal Human Body Segmentation

Tri-modal Human Body Segmentation Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4

More information

A Background Modeling Approach Based on Visual Background Extractor Taotao Liu1, a, Lin Qi2, b and Guichi Liu2, c

A Background Modeling Approach Based on Visual Background Extractor Taotao Liu1, a, Lin Qi2, b and Guichi Liu2, c 4th International Conference on Mechatronics, Materials, Chemistry and Computer Engineering (ICMMCCE 2015) A Background Modeling Approach Based on Visual Background Extractor Taotao Liu1, a, Lin Qi2, b

More information

Background subtraction in people detection framework for RGB-D cameras

Background subtraction in people detection framework for RGB-D cameras Background subtraction in people detection framework for RGB-D cameras Anh-Tuan Nghiem, Francois Bremond INRIA-Sophia Antipolis 2004 Route des Lucioles, 06902 Valbonne, France nghiemtuan@gmail.com, Francois.Bremond@inria.fr

More information

Research on Evaluation Method of Video Stabilization

Research on Evaluation Method of Video Stabilization International Conference on Advanced Material Science and Environmental Engineering (AMSEE 216) Research on Evaluation Method of Video Stabilization Bin Chen, Jianjun Zhao and i Wang Weapon Science and

More information

High Capacity Reversible Watermarking Scheme for 2D Vector Maps

High Capacity Reversible Watermarking Scheme for 2D Vector Maps Scheme for 2D Vector Maps 1 Information Management Department, China National Petroleum Corporation, Beijing, 100007, China E-mail: jxw@petrochina.com.cn Mei Feng Research Institute of Petroleum Exploration

More information

RSRN: Rich Side-output Residual Network for Medial Axis Detection

RSRN: Rich Side-output Residual Network for Medial Axis Detection RSRN: Rich Side-output Residual Network for Medial Axis Detection Chang Liu, Wei Ke, Jianbin Jiao, and Qixiang Ye University of Chinese Academy of Sciences, Beijing, China {liuchang615, kewei11}@mails.ucas.ac.cn,

More information

MATRIX BASED INDEXING TECHNIQUE FOR VIDEO DATA

MATRIX BASED INDEXING TECHNIQUE FOR VIDEO DATA Journal of Computer Science, 9 (5): 534-542, 2013 ISSN 1549-3636 2013 doi:10.3844/jcssp.2013.534.542 Published Online 9 (5) 2013 (http://www.thescipub.com/jcs.toc) MATRIX BASED INDEXING TECHNIQUE FOR VIDEO

More information

Motion Detection Using Adaptive Temporal Averaging Method

Motion Detection Using Adaptive Temporal Averaging Method 652 B. NIKOLOV, N. KOSTOV, MOTION DETECTION USING ADAPTIVE TEMPORAL AVERAGING METHOD Motion Detection Using Adaptive Temporal Averaging Method Boris NIKOLOV, Nikolay KOSTOV Dept. of Communication Technologies,

More information