Tracking Under Low-light Conditions Using Background Subtraction

Matthew Bennink
Clemson University
Clemson, South Carolina

Abstract

A low-light tracking system was developed using background subtraction. Results are given for a full uniform light source, a single light source, and no light source. They show that although the system performs well under full uniform light, it is easily confused by shadows under a single light source and is nearly useless with no light, even though low-light cameras are used. We discuss these results and the methods used to obtain them, and present some possible ways to improve the tracking system for low-light conditions.

1 Introduction

Tracking has a variety of applications. For example, a security team may want to track suspicious persons, or a manufacturing plant may want to follow a product through the assembly process. One common method used in video tracking is background subtraction. Abbott and Williams used background subtraction with connected-components analysis to segment video [1], Davis and Sharma used background subtraction with thermal cameras for tracking [2], and Hoover used it with regular video feeds, also for tracking [3]. In this paper, we follow an algorithm very similar to Hoover's, but we use low-light cameras in place of regular cameras. In doing so, we hope to track objects with little or no light present. These cameras carry a small number of LEDs around the lens to provide ambient illumination without producing any light visible to the naked eye.

2 Methods

Before any code is written, it is necessary to set up the tracking area and the cameras. In our case, we used masking tape to mark out a rectangle approximately 4 m long by 3 m wide. The cameras were positioned above the tracking area, facing its center. Once the initial setup is complete, the cameras are calibrated.
Using the calibration matrices and background subtraction, pixels are highlighted where the tracking system believes an object exists. We provide a discussion of camera calibration and background subtraction, then follow with the algorithm.
2.1 Camera Calibration

Camera calibration is necessary to map image coordinates to real-world coordinates. A brief overview of the calibration matrices is presented, followed by a discussion of the calibration tool we used. Calibration requires two sets of information: the intrinsic values specific to the camera and the extrinsic values dependent on the world geometry. The intrinsic values include the focal length of the camera, the aspect ratio, the principal point, and the skew. Rotation and translation make up the extrinsic values. The reduced equation mapping world coordinates to image coordinates is given below, where f is the focal length and (u_0, v_0) is the principal point. We assume that the skew is zero and the aspect ratio is 1:1.

$$
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
=
\begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\left(
\begin{bmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
+
\begin{bmatrix} T_x \\ T_y \\ T_z \end{bmatrix}
\right)
$$

Camera calibration is not trivial, but several tools are available for it. We chose a third-party Matlab toolbox [4]. The toolbox, developed by Jean-Yves Bouguet, computes both the intrinsic and extrinsic values of the camera. Calibration requires a calibration image, usually a black-and-white chessboard of sorts. We constructed a 3 x 3 chessboard using black foam board and white 11" x 8.5" printer paper. Images were captured with the board tilted at various angles, and the calibration software uses these images to produce the intrinsic values. The board was then placed at the origin of our tracking area. The origin may be placed anywhere, but for ease of computation we chose a corner, allowing only positive world coordinates. A single image was then captured and used to determine the extrinsic values, rotation and translation. Since rotation and translation are extrinsic, these matrices must be updated periodically, as they will change as the room is used; for example, the camera may be shifted slightly by accident.
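The projection above can be sketched in a few lines of Python. This is an illustrative example only: the focal length, principal point, rotation, and translation values below are hypothetical, not the calibration produced by the toolbox.

```python
# Sketch of the Section 2.1 projection: map a world point (X, Y, Z) to
# image coordinates, assuming zero skew and a 1:1 aspect ratio.
# All numeric values below are hypothetical, for illustration only.

def project(world, f, u0, v0, R, T):
    """Apply x = K (R X + T): rotate/translate into camera coordinates,
    perform the perspective divide, then apply the intrinsic matrix."""
    X, Y, Z = world
    # Extrinsics: rotate and translate the world point into camera space.
    cam = [sum(R[i][j] * p for j, p in enumerate((X, Y, Z))) + T[i]
           for i in range(3)]
    xc, yc, zc = cam
    # Intrinsics: perspective divide, scale by f, shift by principal point.
    u = f * xc / zc + u0
    v = f * yc / zc + v0
    return u, v

# Hypothetical calibration: identity rotation, camera 5 m from the origin.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T = [0.0, 0.0, 5.0]
u, v = project((1.0, 2.0, 0.0), f=500.0, u0=320.0, v0=240.0, R=R, T=T)
print(u, v)   # 420.0 440.0
```

In the actual system this mapping is inverted and precomputed into a lookup table, as described in Section 2.3, so no per-frame projection is needed.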
2.2 Background Subtraction

Background subtraction, by contrast, is straightforward. With grey-level values, a difference map is produced by computing the absolute difference between corresponding pixels of the current image and the background image. This difference image is then thresholded to discard any difference values below some fixed threshold. Background subtraction is only effective when the foreground objects differ from the background; for example, black objects tracked on a black surface will not show up in a difference image because their grey-level values are too similar. To produce good results, it is recommended that the images be pre-processed beforehand to remove noise and increase the spread of grey-level intensities.

2.3 Algorithm

Now that we've provided some background, here is the algorithm used. First, we calibrate the cameras. Second, we create a mapping of image coordinates to world coordinates; this lookup table speeds up the computation tremendously. Background images are stored, and mask images are produced so that only the tracking area is considered. We then loop over time. The occupancy-map pixels are set to 1, indicating that none of the floor can be seen. Then, for each camera, a difference image is computed between the current image and the background image. If the difference is less than some threshold, the floor can be seen at that location, and a 0 is placed in the occupancy map. After looping through all the cameras, the occupancy map is displayed. A value of 0 indicates that at least one camera can see the floor; a value of 1 indicates that no camera can see the floor.

Pseudocode

    Calibrate the cameras
    Create a lookup table of image coordinates to world coordinates
    Capture background images of the empty tracking area
    Create mask image (1 is trackable, 0 is untrackable)
    Loop over time
        Set Occupancy Map to 1 for all pixels
        For each camera
            Compute difference of current image with background image
            If the difference is within the desired threshold
                Set Occupancy Map to 0 at that location

3 Experimental Results

The results varied across the test cases. With full uniform light, we achieved fairly good results. With a single light source, the results were poor: the shadow of the object created just as much intensity change as the object itself. With no light, the noise from the cameras combined with the reduced overall intensity made it nearly impossible to track any object; system noise could not be distinguished from an object within the tracking area. Images are given below to show how well an object was tracked.
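The per-frame loop of Sections 2.2 and 2.3 can be sketched as follows. This is a minimal illustration, not our implementation: images are plain nested lists of grey-level values, the image-to-world lookup table is omitted for brevity, and the threshold value is an arbitrary choice for the example.

```python
# Sketch of the occupancy-map update from Sections 2.2-2.3.
# Images are nested lists of grey levels; THRESH is illustrative only.

THRESH = 30

def update_occupancy(backgrounds, currents, mask):
    """Return an occupancy map: 1 = no camera sees the floor, 0 = floor seen."""
    rows, cols = len(mask), len(mask[0])
    # Start by assuming no floor is visible anywhere in the tracking area.
    occ = [[1] * cols for _ in range(rows)]
    for bg, cur in zip(backgrounds, currents):       # for each camera
        for r in range(rows):
            for c in range(cols):
                if not mask[r][c]:                   # outside tracking area
                    continue
                # Background subtraction: a small absolute difference means
                # this camera sees the background (floor) at this pixel.
                if abs(cur[r][c] - bg[r][c]) < THRESH:
                    occ[r][c] = 0
    return occ

# Toy single-camera example: one bright object pixel on a dark background.
bg   = [[10, 10], [10, 10]]
cur  = [[10, 200], [10, 10]]
mask = [[1, 1], [1, 1]]
print(update_occupancy([bg], [cur], mask))   # [[0, 1], [0, 0]]
```

Because a pixel is cleared as soon as any camera sees the floor there, occlusions behind an object from one viewpoint can be resolved by the other cameras.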
3.1 Full Uniform Light Source

[Figures: background images, tracking images, and the resulting occupancy map.]
3.2 Single Light Source

[Figures: background images, tracking images, and the resulting occupancy map.]
3.3 No Light Source

[Figures: background images, tracking images, and the resulting occupancy map.]
4 Conclusion

As our results show, background subtraction is not a viable option for tracking under low-light conditions, even with low-light cameras. However, simply changing sensors could dramatically improve results. For example, with LADAR one could use a very similar algorithm to track a person within the room. Infrared cameras also produce a much richer intensity histogram. It may also help to process the images before they are subtracted, for example by smoothing and filtering. There are still many possibilities for tracking in the dark; however, we can now confidently say that background subtraction with low-light cameras is not the optimal solution.

References

[1] R. G. Abbott and L. R. Williams. Multiple target tracking with lazy background subtraction and connected components analysis. 3rd Computer Science Univ. New Mexico Student Conference, 2007.

[2] J. W. Davis and V. Sharma. Background-subtraction in thermal imagery using contour saliency. International Journal of Computer Vision, pages 161-181, 2007.

[3] A. Hoover and B. D. Olsen. Real-time occupancy map from multiple video streams. Proc. IEEE Intl. Conf. Robotics and Automation, pages 2261-2266, 1999.

[4] J.-Y. Bouguet. Camera Calibration Toolbox for Matlab. http://www.vision.caltech.edu/bouguetj/calib_doc/, accessed 10 December 2007.