Computational ghost imaging using a field-programmable gate array

IKUO HOSHI,* TOMOYOSHI SHIMOBABA, TAKASHI KAKUE, AND TOMOYOSHI ITO
1 Graduate School of Engineering, Chiba University, 1-33, Yayoi-cho, Inage-ku, Chiba, Japan
* aeka2345@chiba-u.jp

Abstract: Computational ghost imaging is a promising technique for single-pixel imaging because it is robust to disturbance and, unlike common cameras, can be operated over broad wavelength bands. However, one disadvantage of this method is the long calculation time required for image reconstruction. In this paper, we design a dedicated calculation circuit that accelerates the reconstruction process of computational ghost imaging. We implemented this circuit in a field-programmable gate array, which reduced the calculation time compared with a CPU. The dedicated circuit reconstructs images at a frame rate of 300 Hz.

1. Introduction

Ghost imaging (GI) is an imaging method that has been intensively studied in recent years [1-4]. Unlike conventional imaging, which uses charge-coupled devices, GI uses a single-pixel device as the light-receiving element. In GI, the object is illuminated with light that has spatially random patterns; the light that passes through the object (or is reflected by it) is then collected by a lens. This light is called the object light. The intensity of the object light is detected by a single-pixel element. Finally, the object image is reconstructed by calculating the correlation between the measured object light intensities and the random illumination patterns used to obtain them. Researchers have proposed GI-based methods in which the light-intensity distribution of the random illumination patterns is calculated on a computer [5, 6]; this is called computational GI. Computational GI is advantageous for measurements over broad wavelength bands; it is robust to disturbance and simplifies the optical system.
These characteristics are expected to be applied in a wide range of fields such as bioimaging [7], remote sensing [8], and encryption [9]; computational GI is also helpful for three-dimensional measurements [10]. However, this method also has disadvantages: the quality of the reconstructed image is poor, the measurement time is long, and the reconstruction calculation is time consuming. Research has been conducted on improving the image quality by using modified correlation calculations [11, 12], compressive sensing [13], and deep learning [14, 15]. Research has also been conducted on shortening the measurement time [7, 16, 17]. To accelerate the reconstruction calculation, we have designed a calculation circuit for computational GI that calculates the pixels of the reconstructed image in parallel. We implemented this circuit in a field-programmable gate array (FPGA). The object light intensities obtained from the optical system were input to the FPGA, and a reconstructed image was obtained by calculating the correlation. The reconstruction time in the FPGA for images of 32 × 32 pixels was 3 ms. This implies that the circuit can reconstruct images at a frame rate of 300 Hz or more. In Section 2, we describe the principle of computational GI and the formulation used for the hardware implementation. In Section 3, we describe the design of the calculation circuit. In Section 4, we show the results obtained by implementing the proposed circuit in the FPGA, compare the calculation speed, and evaluate the quality of the reconstructed images. In Section 5, we summarize this research.
2. Hardware implementation of computational ghost imaging

Figure 1 presents a schematic of the computational GI used in this research. A digital micromirror device (DMD) projector illuminated an object with random illumination patterns, and we obtained the time-series data of the object light intensities by using a photodetector and an analog-to-digital (AD) converter. The time-series data were sent to the memory of the FPGA. Parallel processing of the reconstruction calculation on the FPGA improved the speed. A universal serial bus (USB) interface was used for the communication between a personal computer and the FPGA board.

Fig. 1. Optical system with the FPGA for the computational ghost imaging. (Diagram labels: personal computer, DMD projector, random illumination pattern I(x, y), lenses, object, single-element photodetector, object light intensity S, AD converter, FPGA board, retrieved pixel data O(x, y), reconstructed image.)

After the random illumination pattern passed through the object, the object light was collected by the lens and detected by the photodetector. The detected object light intensity $S$ is given as

$$ S = \iint I(x, y)\, T(x, y)\, dx\, dy, \tag{1} $$

where $I(x, y)$ is the distribution of the random illumination pattern, and $T(x, y)$ is the transmittance of the object. The total intensity of the random illumination pattern $R$ is given as

$$ R = \iint I(x, y)\, dx\, dy. \tag{2} $$

The following formula, called differential GI (DGI) [11, 12], was used for the reconstruction:

$$ O(x, y) = \langle S\, I(x, y) \rangle - \frac{\langle S \rangle}{\langle R \rangle} \langle R\, I(x, y) \rangle, \tag{3} $$

where $O(x, y)$ represents the reconstructed image, and $\langle \cdot \rangle$ represents the ensemble average over the illumination patterns. In Eq. (3), $\langle S\, I(x, y) \rangle$ and $\langle R\, I(x, y) \rangle$ each require $n \times x \times y$ calculations, where $n$ is the number of random illumination patterns, and $x \times y$ is the number of pixels. These operations are the most time consuming in DGI. $\langle R \rangle$ and $\langle R\, I(x, y) \rangle$ do not depend on the object;
therefore, they can be calculated in advance. Instead of using central processing units (CPUs), we designed a dedicated circuit to accelerate the computation of Eq. (3).

We compared the image quality obtained by using the original computational GI [5] and the DGI under the same conditions. The reconstructed images are shown in Fig. 2. Figure 2(a) is the image reconstructed by using the computational GI, and Fig. 2(b) is the image reconstructed by using the DGI. We adopted the DGI for the hardware implementation because its image quality was clearly better than that of the computational GI.

Fig. 2. Comparison of the images obtained by using (a) computational GI and (b) DGI.

In Eq. (3), the division by $\langle R \rangle$ is a bottleneck in the hardware implementation. To simplify the hardware implementation, we multiply both sides of Eq. (3) by $\langle R \rangle$ and reformulate it as follows:

$$ \langle R \rangle\, O(x, y) = \langle R \rangle \langle S\, I(x, y) \rangle - \langle S \rangle \langle R\, I(x, y) \rangle. \tag{4} $$

The dedicated circuit generates random illumination patterns that are the same as the patterns displayed on the DMD projector. Pseudo-random number generators such as linear congruential generators (LCGs), the Mersenne Twister (MT), and the maximum length sequence (hereinafter "M-sequence") can generate the random illumination patterns. The reconstructed images generated by each method are shown in Fig. 3. Figures 3(a), 3(b), and 3(c) are the reconstructed images obtained by using LCGs, MT, and the M-sequence, respectively. There were almost no differences in the image quality. With the hardware implementation in mind, we selected the M-sequence as the pseudo-random number generator.

Fig. 3. Comparison of the reconstructed images using (a) LCGs, (b) MT, and (c) the M-sequence.

3. Designing the calculation circuit

The schematic of the dedicated circuit is shown in Fig. 4. This circuit has three parts: a receiver unit, a calculation unit, and a transmitter unit. The receiver unit and the transmitter unit are the USB transmission circuits between the host computer and the FPGA. The calculation unit reconstructs images with 32 × 32 pixels. We used a Xilinx Artix-7 XC7A100T-2 as the FPGA.
The dedicated circuit was operated at 100 MHz. The input data was the object light intensity obtained by the AD converter. The output data was the reconstructed image.
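As a software reference for the arithmetic that the circuit accelerates, the reformulated DGI of Eq. (4) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `dgi_reconstruct` and `simulate_bucket_signals`, the 4 × 4 test object, the binary patterns drawn from Python's `random` module, and the pattern count of 4000 are all assumptions made for the example.

```python
import random


def dgi_reconstruct(patterns, signals):
    """Return <R><S I(x, y)> - <S><R I(x, y)> for every pixel (Eq. (4)).

    patterns: n illumination patterns, each a flat list of pixel intensities.
    signals:  n object light intensities S, one per pattern.
    """
    n = len(patterns)
    num_pixels = len(patterns[0])
    totals = [sum(p) for p in patterns]        # R for each pattern, Eq. (2)
    mean_s = sum(signals) / n                  # <S>
    mean_r = sum(totals) / n                   # <R>
    sum_si = [0.0] * num_pixels                # running sums for <S I(x, y)>
    sum_ri = [0.0] * num_pixels                # running sums for <R I(x, y)>
    for pattern, s, r in zip(patterns, signals, totals):
        for k, intensity in enumerate(pattern):
            sum_si[k] += s * intensity
            sum_ri[k] += r * intensity
    return [mean_r * (si / n) - mean_s * (ri / n)
            for si, ri in zip(sum_si, sum_ri)]


def simulate_bucket_signals(transmittance, num_patterns, seed=0):
    """Simulate Eq. (1) with binary random patterns (hypothetical test setup)."""
    rng = random.Random(seed)
    patterns = [[rng.getrandbits(1) for _ in transmittance]
                for _ in range(num_patterns)]
    signals = [sum(i * t for i, t in zip(p, transmittance)) for p in patterns]
    return patterns, signals


# Hypothetical 4 x 4 object: a transparent 2 x 2 square on an opaque background.
obj = [0, 0, 0, 0,
       0, 1, 1, 0,
       0, 1, 1, 0,
       0, 0, 0, 0]
patterns, signals = simulate_bucket_signals(obj, num_patterns=4000)
image = dgi_reconstruct(patterns, signals)
bright = [v for v, t in zip(image, obj) if t == 1]
dark = [v for v, t in zip(image, obj) if t == 0]
print(min(bright) > max(dark))  # checks that the square stands out from the background
```

The result equals Eq. (3) multiplied by the constant factor ⟨R⟩, so no division by ⟨R⟩ is needed to obtain an image with the same contrast. In this sketch the pattern-dependent terms are recomputed on every call; in the circuit described here they are precalculated because they do not depend on the object.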
Fig. 4. Top configuration of the dedicated circuit. (Data flow: input data, receiver unit, calculation unit, transmitter unit, output data.)

The schematic of the calculation unit is shown in Fig. 5. All the arithmetic operations in the calculation circuit use fixed-point numbers. Figures 5 and 6 contain several sets of three numbers in parentheses. The first, second, and third numbers represent the sign bit, the number of bits of the integer part, and the number of bits of the fractional part of the fixed-point number, respectively. When the object light intensities were received, the calculation unit first calculated the average $\langle S \rangle$. Then, the average $\langle S\, I(x, y) \rangle$ was calculated by the parallel calculator from the object light intensity $S$ saved in memory and from the random illumination pattern $I(x, y)$ produced by the random number generator. The calculated $\langle S\, I(x, y) \rangle$ was saved in a random access memory (RAM). Subsequently, $O(x, y)$ was calculated from $\langle S\, I(x, y) \rangle$, $\langle S \rangle$, $\langle R\, I(x, y) \rangle$, and $\langle R \rangle$. $\langle R\, I(x, y) \rangle$ and $\langle R \rangle$ were precalculated in the host computer and stored in a table and a register, respectively. Finally, the calculation unit sent $O(x, y)$ to the transmitter unit. Note that the factor $\langle R \rangle$ on the left side of Eq. (4) can be omitted because it is a constant.

The details of the parallel calculator unit are shown in Fig. 6. This unit can calculate 64 pixels of the reconstructed image simultaneously because 64 calculation modules operate in parallel. Figure 7 shows the details of the calculation module. As shown in Fig. 8, the 64 calculation modules process two lines of the reconstructed image (32 × 32 pixels) and then process the next two lines. All the calculated values were saved in the RAM shown in Fig. 5 via the multiplexer of Fig. 6.
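The processing order described above can be sketched in a few lines of Python, together with a rough cycle-count estimate. The names `module_schedule` and `estimated_time_ms` are hypothetical, and the timing model (each module performs one accumulation per clock, so one pass over all patterns is needed per group of pixels) is an assumption that the paper does not state. The values 16,384 patterns and 100 MHz are those reported elsewhere in this paper.

```python
def module_schedule(width=32, height=32, num_modules=64):
    """Return one list per pass with the (row, column) assigned to each module."""
    total = width * height
    passes = []
    for start in range(0, total, num_modules):
        stop = min(start + num_modules, total)
        passes.append([divmod(index, width) for index in range(start, stop)])
    return passes


def estimated_time_ms(num_patterns, num_modules, width=32, height=32,
                      clock_hz=100e6):
    """Rough estimate, assuming one accumulation per module per clock cycle."""
    num_passes = len(module_schedule(width, height, num_modules))
    cycles = num_passes * num_patterns
    return 1e3 * cycles / clock_hz


schedule = module_schedule()
print(len(schedule))                             # number of passes over the image
print(sorted({row for row, _ in schedule[0]}))   # rows covered by the first pass
print(estimated_time_ms(16384, 64), estimated_time_ms(16384, 16))
```

Under these assumptions, 64 modules need 16 passes of two rows each, and the estimate comes to roughly 2.6 ms for 64 modules and 10.5 ms for 16 modules. The estimate ignores the averaging of S and the final combination step, so it should be read as an order-of-magnitude check only.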
Fig. 5. Schematic of the calculation unit. (Block labels: random number generator, parallel calculator, RAM, ⟨R⟩ register, ⟨S⟩ register, ⟨R I(x, y)⟩ table, selector, adder, output O(x, y). Fixed-point formats shown: (0,9,0), (0,64,0), (0,8,0), (0,18,12), (1,19,12).)

Fig. 6. Schematic of parallel calculator unit. (Block labels: input S, pattern bits I, calculation modules, multiplexer, output ⟨S I(x, y)⟩. Fixed-point format shown: (0,8,0).)

Fig. 7. Schematic of the calculation module. (Block labels: inputs I and S, RAM, output ⟨S I(x, y)⟩. Fixed-point format shown: (0,8,0).)
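The random number generator block in Fig. 5 produces the M-sequence patterns selected in Section 2. A software sketch of such a generator, written as a Fibonacci linear feedback shift register, is given below. It rests on several assumptions: the function names are hypothetical, the feedback uses XOR with a nonzero seed, and the tap set (64, 63, 61, 60) is a commonly tabulated maximum-length choice for 64 bits; the paper does not state which taps were used. The 4-bit register with taps (4, 3) is included only because its period of 15 is small enough to check directly.

```python
def lfsr_step(state, width, taps):
    """Shift the register by one bit; taps are 1-based stage numbers."""
    feedback = 0
    for tap in taps:
        feedback ^= (state >> (tap - 1)) & 1
    return ((state << 1) | feedback) & ((1 << width) - 1)


def lfsr_period(width, taps, seed=1):
    """Count the steps until the register returns to its seed state."""
    state = lfsr_step(seed, width, taps)
    count = 1
    while state != seed:
        state = lfsr_step(state, width, taps)
        count += 1
    return count


def next_pattern_word(state, width=64, taps=(64, 63, 61, 60), bits=64):
    """Advance the register by `bits` shifts and collect the output bits.

    Hardware can produce all 64 bits in one clock cycle; this loop
    emulates that by shifting 64 times.
    """
    word = []
    for _ in range(bits):
        word.append((state >> (width - 1)) & 1)
        state = lfsr_step(state, width, taps)
    return state, word


print(lfsr_period(4, (4, 3)))                  # period of the small test register
state, word = next_pattern_word(0x123456789ABCDEF1)
print(len(word))                               # pattern bits produced per call
```

Each call to `next_pattern_word` yields 64 binary values, which could serve as the illumination bits for the 64 pixels handled in parallel, provided the same seed and taps are used to generate the patterns displayed on the projector.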
Fig. 8. Order of reconstruction calculation. (A 32 × 32 pixel image; modules 1 to 32 are assigned to the pixels of one line and modules 33 to 64 to the pixels of the next line.)

The random pattern generator using the M-sequence is shown in Fig. 9. The boxes (called taps) with the notation M(s) were implemented by flip-flops; s indicates the index of the flip-flop. In this research, we generated a binary random number sequence by using a linear feedback shift register (LFSR). The feedback positions of the LFSR were determined from the polynomial that gives the maximum sequence length [18]. The generator had to produce 64-bit random numbers in parallel; therefore, the register needed to shift its contents by 64 bits per clock cycle.

Fig. 9. M-sequence with 64-bit linear feedback shift register.

4. Result

In this study, the calculation time for image reconstruction when using a CPU was compared with the calculation time when using the FPGA. The number of random illumination patterns was 16,384. The calculation times of the FPGA were compared for 16 and 64 calculation modules. The transmission time between the FPGA and the host computer was not included in the calculation times. For the computing environment, we used an Intel Core i5-4690 (clock frequency 3.50 GHz) as the CPU, a memory of 8.0 GB, Microsoft Windows 10 Education as
the operating system, and Microsoft Visual Studio C++ 2015 as the compiler. The calculation times for the various devices are given in Table 1.

Table 1. Calculation times.
Device                           Calculation time [ms]
CPU                              45
FPGA (16 calculation modules)    10
FPGA (64 calculation modules)    3

Table 1 shows that the calculation using the FPGA was faster than that using the CPU. In addition, the calculation speed improved as the number of parallel modules was increased. With 64 calculation modules, the dedicated circuit calculated the reconstructed image in 3 ms, which is 15 times faster than the CPU. In other words, the circuit reconstructed images at a frame rate of over 300 Hz. These results show that the parallelization was effective.

The main advantages of using FPGAs are as follows: (a) the object light intensity from the AD converter can be received directly by the FPGA without going through any CPU or operating system; (b) the reconstruction calculation can be performed without CPUs; and (c) the power consumption is low. In particular, the first advantage becomes very important in applications that require precise timing control, such as cytometry [7]. In such applications, it is necessary to accurately control the timing (latency) from the reception of the input signals to the image reconstruction. It is very difficult for CPUs and GPUs to control the latency because their calculation paths are complex; in addition, the paths are controlled by operating systems. In contrast, FPGAs can control the latency accurately and easily.

We evaluated the quality of the reconstructed images obtained by the CPU and the FPGA in numerical simulations. The calculations in the FPGA used fixed-point numbers, and the calculations in the CPU used floating-point numbers. The reconstructed images are shown in Fig. 10. Figure 10(a) is the original image. Figure 10(b) is the reconstructed image obtained by the FPGA, and Fig. 10(c) is the reconstructed image obtained by the CPU. The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) were used for evaluating the image quality. The evaluated image quality is shown in Table 2.
The qualitative and quantitative evaluations show that there is almost no difference between the image qualities.

Fig. 10. Reconstructed images obtained by the CPU and FPGA: (a) original image, (b) image reconstructed by the FPGA, and (c) image reconstructed by the CPU.

Table 2. Numerical evaluation of image quality.
Device                         PSNR [dB]    SSIM
CPU (floating-point number)    25.03        0.95
FPGA (fixed-point number)      23.62        0.94

Reconstructed images of the three objects using an actual optical system are shown in Fig. 11. We confirmed that the reconstructed images could be obtained by the FPGA.
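For reference, the PSNR metric reported in Table 2 can be computed as in the following sketch. The function name `psnr`, the flat-list image representation, and the peak value of 255 (which assumes 8-bit images) are assumptions for illustration; the paper does not state the peak value or the software used for the evaluation.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are flat sequences of pixel values; `peak` is the maximum
    possible pixel value (255 assumes 8-bit images).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2
              for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf          # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Tiny hypothetical example: two of four pixels differ by 10 grey levels.
original = [100, 100, 100, 100]
reconstructed = [110, 90, 100, 100]
print(round(psnr(original, reconstructed), 2))
```

A higher PSNR means a smaller mean squared error relative to the original image, so the gap of about 1.4 dB between the two rows of Table 2 corresponds to a modest increase in error for the fixed-point result.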
Fig. 11. Original objects and reconstructed images using an actual optical system. (Panels (a) to (f); scale bars: 5 mm.)

5. Conclusion

In this research, we designed a dedicated circuit to reduce the time taken for image reconstruction in computational GI. The dedicated circuit could reconstruct images at a frame rate of over 300 Hz. The quality of the reconstructed images obtained by the FPGA was almost the same as that obtained by the CPU. We also confirmed that the FPGA could obtain reconstructed images in an actual optical system. The circuit scale of the FPGA used in this research was small. Larger reconstructed images could be obtained at higher speeds by using large-scale FPGAs. In this research, we used random pattern illumination. The image quality is expected to improve if the Fourier basis or the Hadamard basis is used for illumination [16, 17]. In the future, we plan to improve our dedicated circuit based on these methods.

References

1. T. B. Pittman, Y. H. Shih, D. V. Strekalov, and A. V. Sergienko, "Optical imaging by means of two-photon quantum entanglement," Phys. Rev. A 52, R3429 (1995).
2. A. Gatti, E. Brambilla, M. Bache, and L. A. Lugiato, "Ghost imaging with thermal light: Comparing entanglement and classical correlation," Phys. Rev. Lett. 93, 093602 (2004).
3. A. Gatti, E. Brambilla, M. Bache, and L. A. Lugiato, "Correlated imaging, quantum and classical," Phys. Rev. A 70, 013802 (2004).
4. F. Ferri, D. Magatti, A. Gatti, M. Bache, E. Brambilla, and L. A. Lugiato, "High-resolution ghost image and ghost diffraction experiments with thermal light," Phys. Rev. Lett. 94, 183602 (2005).
5. J. H. Shapiro, "Computational ghost imaging," Phys. Rev. A 78, 061802 (2008).
6. Y. Bromberg, O. Katz, and Y. Silberberg, "Ghost imaging with a single detector," Phys. Rev. A 79, 053840 (2009).
7. S. Ota, R. Horisaki, Y. Kawamura, M. Ugawa, I. Sato, K. Hashimoto, and K. Waki, "Ghost cytometry," Science 360(6394), 1246-1251 (2018).
8. B. I. Erkmen, "Computational ghost imaging for remote sensing," J. Opt. Soc. Am. A 29, 782-789 (2012).
9. P. Clemente, V. Durán, V. Torres-Company, E.
Tajahuerce, and J. Lancis, "Optical encryption based on computational ghost imaging," Opt. Lett. 35, 2391-2393 (2010).
10. B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. J. Padgett, "3D computational imaging with single-pixel detectors," Science 340, 844-847 (2013).
11. F. Ferri, D. Magatti, L. A. Lugiato, and A. Gatti, "Differential ghost imaging," Phys. Rev. Lett. 104, 253603 (2010).
12. B. Sun, S. S. Welsh, M. P. Edgar, J. H. Shapiro, and M. J. Padgett, "Normalized ghost imaging," Opt. Express 20, 16892 (2012).
13. O. Katz, Y. Bromberg, and Y. Silberberg, "Compressive ghost imaging," Appl. Phys. Lett. 95, 131110 (2009).
14. T. Shimobaba, Y. Endo, T. Nishitsuji, T. Takahashi, Y. Nagahama, S. Hasegawa, and T. Ito, "Computational ghost imaging using deep learning," Opt. Commun. 413, 147-151 (2018).
15. M. Lyu, W. Wang, H. Wang, H. Wang, G. Li, N. Chen, and G. Situ, "Deep-learning-based ghost imaging," Sci. Rep. 7, 17865 (2017).
16. Z. Xu, W. Chen, J. Penuelas, M. Padgett, and M. Sun, "1000 fps computational ghost imaging using LED-based structured illumination," Opt. Express 26, 2427-2434 (2018).
17. Z. Zhang, X. Wang, G. Zheng, and J. Zhong, "Hadamard single-pixel imaging versus Fourier single-pixel imaging," Opt. Express 25, 19619-19639 (2017).
18. P. Alfke, "Efficient shift registers, LFSR counters, and long pseudo-random sequence generators," http://www.xilinx.com/bvdocs/appnotes/xapp052.pdf (1996).