On-Chip CNN Accelerator for Image Super-Resolution

Size: px
Start display at page:

Download "On-Chip CNN Accelerator for Image Super-Resolution"

Transcription

1 On-Chip CNN Acceerator for Image Super-Resoution Jung-Woo Chang and Suk-Ju Kang Dept. of Eectronic Engineering, Sogang University, Seou, South Korea {zwzang91, ABSTRACT To impement convoutiona neura networks (CNN) in hardware, the state-of-the-art CNN acceerators pipeine computation and data transfer stages using an off-chip memory and simutaneousy execute them on the same timeine. However, since a arge amount of feature maps generated during the operation shoud be transmitted to the off-chip memory, the pipeine stage ength is determined by the off-chip data transfer stage. Fusion architectures that can fuse mutipe ayers have been proposed to sove this probem, but appications such as super-resoution (SR) require a arge amount of an on-chip memory because of the high resoution of the feature maps. In this paper, we propose a nove on-chip CNN acceerator for SR to optimize the CNN datafow in the on-chip memory. First, the convoution oop optimization technique is proposed to prevent using a frame buffer. Second, we deveop a combined convoutiona ayer processor to reduce the buffer size used to store the feature maps. Third, we expore how to perform ow-cost mutipy-and-accumuate operations in the deconvoutiona ayer used in SR. Finay, we propose a two-stage quantization agorithm to seect the optimized hardware size for the imited number of DSPs to impement the on-chip CNN acceerator. We evauate our proposed acceerator with FSRCNN, which is most popuar as the CNN-based SR agorithm. Experimenta resuts show that the proposed acceerator requires 9.21ms to achieve an output image with the pixe resoution, which is 36 times faster than the conventiona method. In addition, we reduce the on-chip memory usage and DSP usage by 4 times and 1.44 times, respectivey, compared to the conventiona methods. 1 INTRODUCTION Image super-resoution (SR) and computer vision [1, 2], such as object detection and recognition, have attracted much attention due to the emergence of convoutiona neura networks (CNN). As a resut, CNN acceerators have been studied extensivey to appy various CNN agorithms to the rea-time embedded systems. Especiay, in hardware impementation, FPGA-based CNN acceerators are more energy efficient than those based on GPUs and can perform massive parae processing than those based on CPUs [3, 4, 5, 6, 7]. CNN agorithms are difficut to store a necessary data with an on-chip memory except binary feature maps [7] because of the 3-D feature maps generated at each time the convoutiona ayer is processed. Therefore, most FPGA-based CNN acceerators use an off-chip memory and perform off-chip data transfer and computation simutaneousy through ping-pong operations [3, 5, 6, 8]. As a resut, a mutipy-accumuate operation (MAC) is performed continuousy through a convoutiona ayer processor (CLP) with oop optimization techniques [3, 4, 5, 6, 8, 9]. Even if ping-pong operations are appied through doube buffers, CLP cannot perform the next operation unti a arge amount of output feature maps are stored in the off-chip memory. In this case, the performance of the acceerators is degraded [6, 8]. Previous studies propose CNN fusion architectures to reduce arge amounts of offchip data transfer [6, 8]. Fusion architectures are designed as muti- CLPs for processing mutipe convoutiona ayers within the CNN [6, 8]. Therefore, the data generated after CLP is executed is transferred to the next ayer processor using the on-chip memory. In addition, the off-chip data transfer occurs ony in the first ayer and the ast ayer of the fused ayers. However, in rea-time appications for the image processing such as SR converting the ow resoution (LR) images into high resoution (HR) images, the input and output buffers are difficut to be designed using ine buffers because feature maps require a arge amount of on-chip memory. On the other hand, a method for transforming deconvoutiona ayer into convoutiona ayer (TDC) is proposed to effectivey paraeize deconvoutiona neura networks (DCNN) that generate HR images from LR images [9]. In this method, severa pixes in the output feature maps are generated in parae by a convoution kerne which is smaer than a deconvoution kerne size. However, zero weights of converted convoution kerne are not considered, thereby performing unnecessary MAC operations. In this paper, we propose a nove on-chip memory based CNN acceerator for the efficient data transfer. The main contributions of this paper are as foows. We propose a nove convoution oop optimization technique to impement the CNN acceerator with the on-chip memory. We deveop the combined CLP architecture that combines consecutive CLPs to sove the probem of storing arge amounts of feature maps in the imited buffer size. We expore how to design the CLP designed with the TDC method in the deconvoutiona ayer into the imited hardware resources of FPGA. We propose a two-stage quantization agorithm to reduce the size of the CNN in order to impement the designed on-chip CNN acceerator through the target FPGA. 2 BACKGROUND 2.1 CNN-based SR Agorithms CNN-based SR agorithms are typicay performed in two cases. In the first case for methods incuding SRCNN [2] and SRCNN-Ex [11], an input image is processed by the bicubic interpoation [10]

2 Figure 1: Pseudo code of the acceerator with the conventiona oop optimization techniques to generate the HR image, and then CNN agorithms are performed. However, since the tota execution cyces of the CNN acceerators are proportiona to the resoution of the input image, the performance of the CNN acceerator is degraded by a scae factor. In the second case for the methods incuding FSRCNN and FSRCNN-s [12], the HR image is directy generated from the LR image using a deconvoutiona ayer. However, the deconvoutiona ayer is sower than the convoutiona ayer because it has more oop iterations than the convoutiona ayer [9]. But, it can be converted to the convoutiona ayer through the TDC method to improve the performance. Generay, CNN-based agorithms require a arge amount of the mode size. The argest mode size of the CNN agorithms mentioned above is 230 KB, and the mode size of the CNN-based SR agorithms is sma enough to be stored in the on-chip memory. Therefore, it is not necessary to use a mode compression [13]. 2.2 Conventiona CNN Acceerators Figure 2: Required size of the ine buffers for each CNNbased SR agorithms Most CNN acceerators use oop optimization techniques to acceerate convoutiona ayers consisting of many stages of convoution oops [3, 4, 5, 6, 8, 9]. The pseudo code in Figure 1 shows an exampe of appying the oop optimization techniques. In computation(), the convoution oops are executed in parae with the size of tie parameters through the UNROLL directive. The intermediate data generated by the processor are too arge to be stored in the on-chip memory unti the ayer is competey processed. Therefore, conventiona CNN acceerators sove the memory shortage probem through the off-chip memory [3, 4, 5, 6, 8, 9]. In main(), computation and off-chip data transfer stages are performed simutaneousy on the same timeine through the DATAFLOW directive. However, since the pipeine stage ength is imited by the off-chip data transfer, the use of the off-chip memory is a major botteneck in the performance of the acceerators. Fusion architectures are proposed to reduce the off-chip data transfer [6, 8]. Awani et a. [6] propose a method of storing the ast feature maps into the off-chip memory after performing computation in fused convoutiona ayers. Specificay, the resoution of feature maps is decreased when passing through the convoutiona ayer, and the fina feature maps are stored in the offchip memory. However, since the on-chip data transfer between fused ayers is managed by the tie-based buffer approach, there is a burden that exception handing must be accompanied for various boundary conditions. To sove this probem, Xiao et a. [8] propose a method of managing feature maps as ine buffers. Even if there are boundaries that require exception handing, data can be easiy reused by using severa ine buffers. However, as the CNN-based SR agorithm has the high resoution of feature maps, the ine buffers are difficut to be expoited due to the burden on the on-chip memory. Unike the convoutiona ayer, the deconvoutiona ayer has the process of accumuating kerne-sized output bocks in the previousy obtained output feature maps. To impement this in hardware, it is necessary to carry out a compicated process of oading the output bock stored in the memory into the buffer, adding it, and storing it again in the memory. The TDC method soves this probem by generating an output bock with a fixed size that does not overap with other output bocks. In addition, there is an advantage that a the pixes in the output bock can be processed in parae by the convoution. However, when converting from the deconvoutiona ayer to the convoutiona ayer, the newy created convoution kerne contains a arge amount of weights with zero coefficients. Thus, we need to remove the operations for zero coefficients to enhance the FPGA resource utiization. 3 Proposed ARCHITECTURE DESIGN 3.1 Datafow Optimization In our system, the pixe data is sent to the FPGA through the dispay driver in the horizonta direction of the frame. When a the pixes in the frame have been transferred, the data for the next frame is sent to the FPGA. If a the convoutiona ayers are executed by the singe CLP, the pixe data coming into the FPGA must be stored in the on-chip memory unti the ast ayer is competey processed. Consequenty, severa frame buffers may be required depending on the execution time of the CLP. In order to sove this probem, we process input data coming into the FPGA by designing a the convoutiona ayers to run concurrenty through the mutipe CLPs such as the fusion architectures. In this case, the design of the muti- CLPs that can hande efficient on-chip datafow is required. Thus,

3 Figure 3: Proposed combined CLP architecture fused with the forward ayer and the shrinking ayer we must determine the oop tiing parameters of each CLPs. To find the oop tiing parameters, we compare the computation speed of each CLP with the speed of the pixe data coming into the CLP. The computation to transmission (CT) ratio is defined as the ratio of the cyces required to perform the th CLP to the cyces required to transmit the pixe data from the dispay driver or input buffers into the th CLP. When the tie size combination of the th CLP is given as T y, T x, T m, T n, and T k, which denote the tie size of row, coumn, output feature maps, input feature maps, and kerne, the th CT, which is CT, is cacuated by Equation (1). In this case, we do not consider T y and T x when cacuating the CT because the input of each CLP is sequentiay transmitted in pixe data instead of tie-sized pixe bock. CT ratio = = tota number of execution cyces tota number of transmission cyces H W M T m N T n K T k K T k H W N T n = M T m K T k K T k where H and W are the height and width of feature maps, respectivey, N and M are the size of the input and output feature maps, respectivey, and K is the width of the kerne size in the th convoutiona ayer. If the CT is greater than 1, the transmission speed of the pixe data is faster than that of the CLP, and hence, the data transferred during the computation must be stored in the frame buffer. For exampe, if an output image with the (FHD) pixe resoution is generated using a SR agorithm with the scae factor of 2, a 4.15MB buffer memory is required to store an input image with the pixe resoution in 32-bit foating point data type. In addition, considering the size of input feature maps, it can exceed the aowabe on-chip memory of a typica FPGA. Therefore, we set the CTR of a ayers to 1 not to use the frame buffer. Thus, T m and T k become M and K, respectivey. Furthermore, the tiing factors of the input feature maps of the next (1) Figure 4: Proposed combined CLP architecture fused with the expanding ayer and the backward ayer ayer, T n +1 and T m shoud be same for feature maps to be transmitted without buffering to the next ayer. Since N +1 is equa to M, T n aso becomes equa to N. As a resut, a CLPs can fuy process the three convoution oops in parae. 3.2 Combined CLP Design A memory management technique is needed to efficienty store feature maps generated by muti-clps. Since the pixe data from the dispay driver is transmitted ine by ine, we use the ine buffer that can reuse the data without being restricted by boundary conditions [8]. The ine buffer is designed as a bock RAM (BRAM) that is a simpe dua-port mode [14] in which one read and one write are aowed concurrenty in order to fufi both input and output buffers of each CLPs. In addition, the capacity needed to impement the ine buffer shoud be same as the size of the 3-D data generated from the CLP. The number of ine buffers must be equa to the kerne size for the convoution. Therefore, the size of the input and output buffers for the th CLP, which are M in and M out respectivey, can be cacuated by Equation (2). M out M in = K W N pixe_datawidth = K +1 W +1 N +1 pixe_datawidth Figure 2 shows the size of the ine buffers required to generate output images with the output resoutions of FHD, (QHD), and (UHD) when using CNN-based SR agorithms [2, 10, 11]. FSRCNN-s, which requires the smaest amount of memory, shoud use 2.65MB memory to generate UHD image. These buffers are constructed with 588 numbers of 36K- BRAMs. Since it is arger than the number of the BRAMs avaiabe in genera FPGAs, the required ine buffers shoud be reduced. We can sove the memory shortage probem through the shrinking and expanding ayers, which are both 1 1 convoutiona ayers. The shrinking ayer compresses the argest input feature maps to a reduced size. Conversey, the expanding ayer restores the reduced size of the input feature maps back to the origina size. Figure 3 shows the combined CLP with the fusion of the forward ayer processor and the shrinking ayer processor. An activation function that contributes the noninear character-istics to the resut of the convoution is processed after performing the convoution oops on input feature maps and kernes. Since we competey unro the convoution oops for input feature maps, (2)

4 deconvoutiona ayer. However, since the TDC method maps the KD KD weights to a matrix of KC KC S D 2 size in each input and output feature maps, zero coefficients aways occurs in the matrix. The number of zero coefficients, numzero, in the transformed convoution kernes is derived by Equation (3). num Zero = K C 2 (M D S D 2 ) N D K D 2 M D N D (3) Therefore, when computing the convoution between input bocks and kernes in deconvoutiona ayer processor, we can reduce the use of FPGA resources by skipping the computations when weights are zero. Specificay, we reduce the DSP usage to 81% in FSRCNN and FSRCNN-s, compared to the TDC method when the kerne size of the deconvoutiona ayer is 9 9. output feature maps, and kernes, PReLU [15], the modues for activation function of FSRCNN and FSRCNN-s, are designed as many as the number of the output feature maps. Therefore, the output feature maps generated by the CLP of the forward ayer can be directy sent to the CLP of the shrinking ayer without being stored in the output buffer. Simiary, Figure 4 shows another combined CLP with the merging of the expanding ayer processor and the backward ayer processor. Large amounts of feature maps generated in the expanding ayer must be stored in the same number of ine buffers as the width of the kerne size in the backward ayer. Instead, we sove this probem by setting T y L 2 in the expanding ayer processor to the width of the kerne size in the backward ayer. Athough the number of ine buffers in the expanding ayer increases to the width of the kerne size in the backward ayer, the efficiency of the on-chip memory can be increased because it stores the smaest feature maps in FSRCNN and FSRCNN-s. Through this combined CLP architecture, FSRCNN and FSRCNN-s can reduce the required memory size by 3.62 times and 7.67 times compared with the fusion architecture [9]. 3.3 MAC Optimization of DCNN acceerator The state-of-the-art DCNN acceerator reduces the kerne size of the deconvoutiona ayer from KD KD to KC KC and increases the size of the output feature maps from MD to MD SD SD through the TDC method, where KD and KC denote the width of the kerne size of the deconvoutiona ayer and transformed convoutiona ayer, respectivey, and MD and SD represent the size of output feature maps and the stride of the deconvoutiona ayer, respectivey. As a resut, the tota number of weights in the transformed convoutiona ayer is KC KC SD SD MD ND, where ND denotes the size of the input feature maps in the 4 MODEL QUANTIZATION To optimize the utiization of the FPGA resources more compacty, pixes, weights and partia sums are expressed as 16-bit fixed points from the 32-bit singe-precision foating points using the bit-width quantization method [16]. The number of DSPs required for the mutipier and adder of the Xiinx Kintex-7 FPGA with the precision of a 16-bit fixed point is one each. If the mutipier and adder are impemented as ogic eements in FPGA, the required resources are 280 LUT6s and 16 LUTs+16 FFs, respectivey. When competey unroing severa convoution oops, we cannot use both mutipiers and adders as DSPs due to the imited number of DSPs. Therefore, we design an adder with LUTs and FFs that require the ower number of ogic eements than a mutipier, and a mutipier with a DSP. Therefore, the tota number of DSPs required to design the muti-clps is as foows. L 1 num DSP = T y M N K K num zero (4) =0 where tiing parameter T y used for the expanding ayer is K L-1 when is the order of the expanding ayer, L-2, and 1 for other cases. In addition, the deconvoutiona ayer processor performs the optimized MAC through the proposed method, and hence, numdsp is subtracted by the number of zero coefficients. Therefore, the number of avaiabe DSPs is increased. When the ayer configurations of FSRCNN and FSRCNN-s are substituted into Equation (4), the number of DSPs required is 16,216 and 5,185, respectivey, which are higher than the tota number of DSPs in the high-performance FPGAs. Therefore, the two-stage quantization agorithm is proposed to reduce the mode size of FSRCNN and FSRCNN-s. Agorithm 1 represents the pseu-do code for the twostage quantization agorithm. It is used to find the best CNN mode that can be impemented in the target FPGA. In the first stage, we cacuate the size of a receptive fied to figure how many surrounding pixes are used. The receptive fied R has the size of the input bock required to generate the pixe for the output feature maps of the ast ayer. Typicay, the performance of the SR is improved as R increases [2, 11, 12], thereby seecting the optima number considering the performance. R can be cacuated in Equation (5).

5 Figure 5: Bock diagram of the CNN acceeration system Figure 7: Comparison between the fusion architecture and the combined CLP architecture Figure 6: Convoutiona ayer pipeine L 1 R = K K 2 By Equation (5), the size of the receptive fied in FSRCNN and FSRCNN-s are and 11 11, respectivey. We need to find the best receptive fied among the various cases in the first stage. However, as there are many configurations, we imit the size of the receptive fied using threshod1. After that, we define the fundamenta probem for designing SR to maximize the performance and impement it in the target FPGA can be formuated as foows. =1 maximize PSNR i, s. t. num DSPi tota DSPs (6) where PSNRi is a peak signa-to-noise (PSNR) vaue for the i th combination of possibe configurations and numdspi is the number of DSPs needed at that time. The constraints on the memory are not incuded because the combined CLP architecture soves them. Since the object function PSNRi is cacuated by the CNN mode obtained through training, an exhaustive search must be performed for a configurations to find the optima quantized mode. In CNN-based SR agorithms, the output feature maps between ayers are nonineary connected and the performance increases as the number of connections increases [10, 11]. Therefore, we simpy change the object function to the function considering the tota number of connections in FSRCNN and FSRCNN-s. The probem is reformuated as foows. maximize M L 1 (5) M, s. t. num DSP tota DSPs (7) =0 There is a probem that the variabe M in the object function is as many as the number of ayers in the CNN. We sove the probem by grouping ayers with simiar characteristics. The tota number of connections depends on the size of the output feature maps in each ayers. Unike other ayers, the output feature maps of the deconvoutiona ayer are excuded from the quantization because each feature maps are merged into HR image. We describe the differentia of the connections by the feature maps as dc/dm. Except for the deconvoutiona ayer, there are an even number of ayers with equa dc/dm in the hourgass type FSRCNN and FSRCNN-s. We can represent a set of ayers with the same dc/dm and create vectors containing their configurations. For exampe, in FSRCNN-s, the first ayer and the fourth ayer have the same dc/dm but are smaer than the other ayers. So, the ayers beong to the set G[0] and the ayer configurations are stored in GM[0], GN[0], GK[0], and GTy[0], respectivey, representing vectors of incuding the tiing parameters, where a set with a higher index vaue has a arger vaue of dc/dm than a set with ower index vaue. Therefore, we sove the objective function separatey for each set. Thus the constraint probem for quantizing the output feature maps of the p th group can be expressed as beow: maximize M q (M q ) N p, s. t. ( α M q ) + β tota DSPs (8) where Mq is a variabe in which the output feature maps are quantized, and Np is the number of ayers having the same size of feature maps. α is the sum of the remaining mutipied configureations and β is the number of DSPs for the ayers in the other sets. Since Np is an even number, the object function becomes a convex downward. Therefore, the maximum integer satisfying the constraint becomes the soution. The resut of the quantized feature maps Mq is as foows. tota DSPs β M q = (9) α Consequenty, in the second stage, we first quantize the output feature maps GM[1] of the ayers beonging to the set G[1] without considering the output feature maps GM[0]. Then, the output feature maps of the ayers beonging to G[0] are strongy quantized. Since then we increments GM[0] by graduay decrementing the vaues of GM[1] unti the oop iterator is greater than the threshod2. We repeat this process with the first stage and find the best mode. As a resut, we reduce the combination of the feature maps in FSRCNN from <56, 12, 12, 12, 12, 12, 56, S D 2 > to <23, 3, 3, 3, 3, 3, 23, S D 2 >. And we reduce the kerne size of both the first ayer and

6 Figure 8: Comparison of DSP utiization between TDC method and proposed method in the FSRCNN Tabe 1: Comparison of between the proposed method and the conventiona method BRAM DSP FF LUT Cyces( 10 6 ) [9] , , Ours 120 1,491 53,242 45, the ast ayer in FSRCNN from 5 5 to 3 3. The PSNR is 36.28dB, which is 0.72dB ower than the origina mode in a Set5 test dataset. 5 EXPERIMENTAL RESULTS We evauated the proposed on-chip CNN acceerator using the Xiinx Kintex-7 410T FPGA. We designed it with Veriog RTL through Vivado shown in Figure 5. The CLP of the first ayer and the CLP of the second ayer, the shrinking ayer, were fused to the combined CLP. In addition, the expanding ayer and the fina ayer, which was the deconvoutiona ayer, were fused to another combined CLP. Therefore, intermediate data that required ots of memory in the each combined CLPs were not stored. In addition, weight buffers were stored in ROM because it operated as read ony. Our system had the scae factor of 2 and rendered the QHD images from input image of pixe resoution. Tabe 1 shows the resources utiization of the proposed acceerator. Our acceerator used a sma amount of FPGA resources compared to [9]. We set working frequency of the FPGA as 100MHz. The execution times for generating FHD, QHD, and UHD images were 5.28ms, 9.21ms, and 20.73ms, respectivey. Thus its rendering speed was 36 times higher than of the conventiona method. Figure 6 shows that the ayers in the CNN are performed on the pipeine through the proposed acceerator. Thus, a ayer were performed concurrenty on the same timeine. For ayers with the width of the kerne size greater than 1, a ine deay was generated unti the ine buffers equa to the size of T k was fied. Exceptionay, the expanding ayer generated a ine deay for the combined CLP. In the PReLU, the output of the convoution was added to the bias, and when it was smaer than 0, the resut was mutipied by the PReLU coefficients. So, we impemented adders and mutipiers through ogic eements for every PReLU modue. Figure 7 shows the amount of memory required to impement the on-chip CNN acceerator using the fusion architecture and the combined CLP architecture for the quantized FSRCNN. The fusion architecture required about 1MB of the on-chip memory when generating UHD images, whie the proposed method required 250KB of memory, which was four times ess than that of the fusion architecture. Therefore, we coud sove the conventiona probem that required ots of memory to store feature maps. Figure 8 shows the comparison of DSP usage between the TDC method and the proposed method in the deconvoutiona ayer. In Figure 8 (a), if KD was 5, the difference in DSP usage between the two methods was about 500. We coud aso found that the arger the feature maps, the higher the difference in Figure 8 (b). This enormous difference was an important factor in designing the CNN acceerator. Because the CNN acceerators shoud process the convoutiona ayers and the deconvoutiona ayer as fast as possibe with a imited amount of DSPs in the FPGA. Therefore, the proposed method coud operate much more effectivey than the conventiona method in imited hardware resources. 6 CONCLUSIONS In this paper, we proposed a nove on-chip CNN acceerator to sove the imitation due to the off-chip data transfer. We impemented the FSRCNN, we known as CNN-based SR, using Xiinx Kintex-7 FPGA. We deveoped the 2-stage quantization technique to impement the imited FPGA resources. In experimenta resuts, our on-chip CNN acceerator generated HR images in rea time. In addition, we reduced the on-chip memory usage and DSP usage by 4 times and 1.44 times, respectivey, compared to conventiona methods. REFERENCES [1] A. Krizhevsky, I. Sutskever, and G. E. Hinton Imagenet cassification with deep convoutiona neura networks. In NIPS 12.. [2] C. Dong, C. C. Loy, K. He, and X. Tang Learning a deep convoutiona network for image super-resoution. In ECCV [3] C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong Optimizing FPGA-based Acceerator Design for Deep Convoutiona Neura Networks. In FPGA [4] Y. Ma, Y. Cao, S. Vrudhua, and J.-S. Seo Optimizing Loop Operation and Datafow in FPGA Acceeration of Deep Convoutiona Neura Networks. In FPGA [5] Y. Shen, M. Ferdman, and P. Mider Maximizing CNN Acceerator Efficiency Through Resource Partitioning. In ISCA

7 [6] M. Awani, H. Chen, M. Ferdman, and P. Mider Fused-ayer CNN acceerators In MICRO [7] H. Yonekawa and H. Nakahara On-Chip Memory based Binarized Convoutiona Deep Neura Networks Appying Batch Normaization Free Technique on an FPGA. In IPDPSW [8] Q. Xiao, Y. Liang, L. Lu, S. Yan, and Y.-W. Tai Exporing Heterogeneous Agorithms for Acceerating Deep Convoutiona Neura Networks on FPGAs. In DAC [9] J.-W. Chang and S.-J. Kang Optimizing FPGA-based Convoutiona Neura Networks Acceerator for Image Super-Resoution. In ASP-DAC 17. [10] R. Keys Cubic convoution interpoation for digita image processing. IEEE Transactions on acoustics, speech, and signa processing, [11] C. Dong, C. C. Loy, and X. Tang Image Super-Resoution using Deep Convoutiona Networks. IEEE Trans. pattern anaysis and machine inteigence [12] C. Dong, C. C. Loy, and X. Tang Acceerating the super-resoution convoutiona neura networks. ECCV [13] S. Han, H. Mao, and W. J. Day Deep compression: Compressing deep neura networks with pruning, trained quantization and huffman coding. arxiv preprint arxiv: , [14] Xiinx, 7 series FPGAs memory resources user guide. [15] K. He, X. Zhang, S. Ren, and J. Sun Deving deep into retifiers: surpassing human-eve performance on imagenet cassification. In ICCV [16] D. Lin, S. Taathi, and S. Annapureddy Fixed point quantization of deep convoutiona networks. In ICML

Outline. Parallel Numerical Algorithms. Forward Substitution. Triangular Matrices. Solving Triangular Systems. Back Substitution. Parallel Algorithm

Outline. Parallel Numerical Algorithms. Forward Substitution. Triangular Matrices. Solving Triangular Systems. Back Substitution. Parallel Algorithm Outine Parae Numerica Agorithms Chapter 8 Prof. Michae T. Heath Department of Computer Science University of Iinois at Urbana-Champaign CS 554 / CSE 512 1 2 3 4 Trianguar Matrices Michae T. Heath Parae

More information

JOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM

JOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM JOINT IMAGE REGISTRATION AND AMPLE-BASED SUPER-RESOLUTION ALGORITHM Hyo-Song Kim, Jeyong Shin, and Rae-Hong Park Department of Eectronic Engineering, Schoo of Engineering, Sogang University 35 Baekbeom-ro,

More information

A Fast Block Matching Algorithm Based on the Winner-Update Strategy

A Fast Block Matching Algorithm Based on the Winner-Update Strategy In Proceedings of the Fourth Asian Conference on Computer Vision, Taipei, Taiwan, Jan. 000, Voume, pages 977 98 A Fast Bock Matching Agorithm Based on the Winner-Update Strategy Yong-Sheng Chenyz Yi-Ping

More information

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm A Comparison of a Second-Order versus a Fourth- Order Lapacian Operator in the Mutigrid Agorithm Kaushik Datta (kdatta@cs.berkeey.edu Math Project May 9, 003 Abstract In this paper, the mutigrid agorithm

More information

Research on UAV Fixed Area Inspection based on Image Reconstruction

Research on UAV Fixed Area Inspection based on Image Reconstruction Research on UAV Fixed Area Inspection based on Image Reconstruction Kun Cao a, Fei Wu b Schoo of Eectronic and Eectrica Engineering, Shanghai University of Engineering Science, Abstract Shanghai 20600,

More information

Mobile App Recommendation: Maximize the Total App Downloads

Mobile App Recommendation: Maximize the Total App Downloads Mobie App Recommendation: Maximize the Tota App Downoads Zhuohua Chen Schoo of Economics and Management Tsinghua University chenzhh3.12@sem.tsinghua.edu.cn Yinghui (Catherine) Yang Graduate Schoo of Management

More information

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System GPU Impementation of Parae SVM as Appied to Intrusion Detection System Sudarshan Hiray Research Schoar, Department of Computer Engineering, Vishwakarma Institute of Technoogy, Pune, India sdhiray7@gmai.com

More information

WHILE estimating the depth of a scene from a single image

WHILE estimating the depth of a scene from a single image JOURNAL OF L A T E X CLASS FILES, VOL. 4, NO. 8, AUGUST 05 Monocuar Depth Estimation using Muti-Scae Continuous CRFs as Sequentia Deep Networks Dan Xu, Student Member, IEEE, Eisa Ricci, Member, IEEE, Wani

More information

A Memory Grouping Method for Sharing Memory BIST Logic

A Memory Grouping Method for Sharing Memory BIST Logic A Memory Grouping Method for Sharing Memory BIST Logic Masahide Miyazai, Tomoazu Yoneda, and Hideo Fuiwara Graduate Schoo of Information Science, Nara Institute of Science and Technoogy (NAIST), 8916-5

More information

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code Further Optimization of the Decoding Method for Shortened Binary Cycic Fire Code Ch. Nanda Kishore Heosoft (India) Private Limited 8-2-703, Road No-12 Banjara His, Hyderabad, INDIA Phone: +91-040-3378222

More information

Hiding secrete data in compressed images using histogram analysis

Hiding secrete data in compressed images using histogram analysis University of Woongong Research Onine University of Woongong in Dubai - Papers University of Woongong in Dubai 2 iding secrete data in compressed images using histogram anaysis Farhad Keissarian University

More information

NestedNet: Learning Nested Sparse Structures in Deep Neural Networks

NestedNet: Learning Nested Sparse Structures in Deep Neural Networks NestedNet: Learning Nested Sparse Structures in Deep Neura Networks Eunwoo Kim Chanho Ahn Songhwai Oh Department of ECE and ASRI, Seou Nationa University, South Korea {kewoo15, mychahn, songhwai}@snu.ac.kr

More information

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion.

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion. Lecture outine 433-324 Graphics and Interaction Scan Converting Poygons and Lines Department of Computer Science and Software Engineering The Introduction Scan conversion Scan-ine agorithm Edge coherence

More information

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory 0 th Word Congress on Structura and Mutidiscipinary Optimization May 9 -, 03, Orando, Forida, USA A Design Method for Optima Truss Structures with Certain Redundancy Based on Combinatoria Rigidity Theory

More information

Utility-based Camera Assignment in a Video Network: A Game Theoretic Framework

Utility-based Camera Assignment in a Video Network: A Game Theoretic Framework This artice has been accepted for pubication in a future issue of this journa, but has not been fuy edited. Content may change prior to fina pubication. Y.LI AND B.BHANU CAMERA ASSIGNMENT: A GAME-THEORETIC

More information

Proceedings of the International Conference on Systolic Arrays, San Diego, California, U.S.A., May 25-27, 1988 AN EFFICIENT ASYNCHRONOUS MULTIPLIER!

Proceedings of the International Conference on Systolic Arrays, San Diego, California, U.S.A., May 25-27, 1988 AN EFFICIENT ASYNCHRONOUS MULTIPLIER! [1,2] have, in theory, revoutionized cryptography. Unfortunatey, athough offer many advantages over conventiona and authentication), such cock synchronization in this appication due to the arge operand

More information

Load Balancing by MPLS in Differentiated Services Networks

Load Balancing by MPLS in Differentiated Services Networks Load Baancing by MPLS in Differentiated Services Networks Riikka Susitaiva, Jorma Virtamo, and Samui Aato Networking Laboratory, Hesinki University of Technoogy P.O.Box 3000, FIN-02015 HUT, Finand {riikka.susitaiva,

More information

Handling Outliers in Non-Blind Image Deconvolution

Handling Outliers in Non-Blind Image Deconvolution Handing Outiers in Non-Bind Image Deconvoution Sunghyun Cho 1 Jue Wang 2 Seungyong Lee 1,2 sodomau@postech.ac.kr juewang@adobe.com eesy@postech.ac.kr 1 POSTECH 2 Adobe Systems Abstract Non-bind deconvoution

More information

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS Pave Tchesmedjiev, Peter Vassiev Centre for Biomedica Engineering,

More information

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds

A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS. A. C. Finch, K. J. Mackenzie, G. J. Balsdon, G. Symonds A METHOD FOR GRIDLESS ROUTING OF PRINTED CIRCUIT BOARDS A C Finch K J Mackenzie G J Basdon G Symonds Raca-Redac Ltd Newtown Tewkesbury Gos Engand ABSTRACT The introduction of fine-ine technoogies to printed

More information

Fastest-Path Computation

Fastest-Path Computation Fastest-Path Computation DONGHUI ZHANG Coege of Computer & Information Science Northeastern University Synonyms fastest route; driving direction Definition In the United states, ony 9.% of the househods

More information

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART 13 AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART Eva Vona University of Ostrava, 30th dubna st. 22, Ostrava, Czech Repubic e-mai: Eva.Vona@osu.cz Abstract: This artice presents the use of

More information

Complex Human Activity Searching in a Video Employing Negative Space Analysis

Complex Human Activity Searching in a Video Employing Negative Space Analysis Compex Human Activity Searching in a Video Empoying Negative Space Anaysis Shah Atiqur Rahman, Siu-Yeung Cho, M.K.H. Leung 3, Schoo of Computer Engineering, Nanyang Technoogica University, Singapore 639798

More information

Automatic Hidden Web Database Classification

Automatic Hidden Web Database Classification Automatic idden Web atabase Cassification Zhiguo Gong, Jingbai Zhang, and Qian Liu Facuty of Science and Technoogy niversity of Macau Macao, PRC {fstzgg,ma46597,ma46620}@umac.mo Abstract. In this paper,

More information

Endoscopic Motion Compensation of High Speed Videoendoscopy

Endoscopic Motion Compensation of High Speed Videoendoscopy Endoscopic Motion Compensation of High Speed Videoendoscopy Bharath avuri Department of Computer Science and Engineering, University of South Caroina, Coumbia, SC - 901. ravuri@cse.sc.edu Abstract. High

More information

Massively Parallel Part of Speech Tagging Using. Min-Max Modular Neural Networks.

Massively Parallel Part of Speech Tagging Using. Min-Max Modular Neural Networks. assivey Parae Part of Speech Tagging Using in-ax oduar Neura Networks Bao-Liang Lu y, Qing a z, ichinori Ichikawa y, & Hitoshi Isahara z y Lab. for Brain-Operative Device, Brain Science Institute, RIEN

More information

A VLSI Architecture of JPEG2000 Encoder

A VLSI Architecture of JPEG2000 Encoder A VLSI Architecture of JPEG2000 Encoder Leibo LIU, Ning CHEN, Hongying MENG, Li ZHANG, Zhihua WANG, and Hongyi CHEN Abstract This paper proposes a VLSI architecture of JPEG2000 encoder, which functionay

More information

AUTOMATIC IMAGE RETARGETING USING SALIENCY BASED MESH PARAMETERIZATION

AUTOMATIC IMAGE RETARGETING USING SALIENCY BASED MESH PARAMETERIZATION S.Sai Kumar et a. / (IJCSIT Internationa Journa of Computer Science and Information Technoogies, Vo. 1 (4, 010, 73-79 AUTOMATIC IMAGE RETARGETING USING SALIENCY BASED MESH PARAMETERIZATION 1 S.Sai Kumar,

More information

Distance Weighted Discrimination and Second Order Cone Programming

Distance Weighted Discrimination and Second Order Cone Programming Distance Weighted Discrimination and Second Order Cone Programming Hanwen Huang, Xiaosun Lu, Yufeng Liu, J. S. Marron, Perry Haaand Apri 3, 2012 1 Introduction This vignette demonstrates the utiity and

More information

Delay Budget Partitioning to Maximize Network Resource Usage Efficiency

Delay Budget Partitioning to Maximize Network Resource Usage Efficiency Deay Budget Partitioning to Maximize Network Resource Usage Efficiency Kartik Gopaan Tzi-cker Chiueh Yow-Jian Lin Forida State University Stony Brook University Tecordia Technoogies kartik@cs.fsu.edu chiueh@cs.sunysb.edu

More information

CNN and RNN Based Neural Networks for Action Recognition

CNN and RNN Based Neural Networks for Action Recognition Journa of Physics: Conference Series PAPER OPEN ACCESS CNN and RNN Based Neura Networks for Action Recognition To cite this artice: Chen Zhao et a 2018 J. Phys.: Conf. Ser. 1087 062013 View the artice

More information

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING Binbin Dai and Wei Yu Ya-Feng Liu Department of Eectrica and Computer Engineering University of Toronto, Toronto ON, Canada M5S 3G4 Emais:

More information

A New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions

A New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions 2006 Internationa Joint Conference on Neura Networks Sheraton Vancouver Wa Centre Hote, Vancouver, BC, Canada Juy 16-21, 2006 A New Supervised Custering Agorithm Based on Min-Max Moduar Network with Gaussian-Zero-Crossing

More information

MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION. Heng Zhang, Vishal M. Patel and Rama Chellappa

MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION. Heng Zhang, Vishal M. Patel and Rama Chellappa MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION Heng Zhang, Visha M. Pate and Rama Cheappa Center for Automation Research University of Maryand, Coage

More information

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System Sef-Contro Cycic Access with Time Division - A MAC Proposa for The HFC System S.M. Jiang, Danny H.K. Tsang, Samue T. Chanson Hong Kong University of Science & Technoogy Cear Water Bay, Kowoon, Hong Kong

More information

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University

Arithmetic Coding. Prof. Ja-Ling Wu. Department of Computer Science and Information Engineering National Taiwan University Arithmetic Coding Prof. Ja-Ling Wu Department of Computer Science and Information Engineering Nationa Taiwan University F(X) Shannon-Fano-Eias Coding W..o.g. we can take X={,,,m}. Assume p()>0 for a. The

More information

Joint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y.

Joint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y. FORTH-ICS / TR-157 December 1995 Joint disparity and motion ed estimation in stereoscopic image sequences Ioannis Patras, Nikos Avertos and Georgios Tziritas y Abstract This work aims at determining four

More information

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method 297 Rea-Time Feature escriptor Matching via a Muti-Resoution Ehaustive Search Method Chi-Yi Tsai, An-Hung Tsao, and Chuan-Wei Wang epartment of Eectrica Engineering, Tamang University, New Taipei City,

More information

ECEn 528 Prof. Archibald Lab: Dynamic Scheduling Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018

ECEn 528 Prof. Archibald Lab: Dynamic Scheduling Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018 ECEn 528 Prof. Archibad Lab: Dynamic Scheduing Part A: due Nov. 6, 2018 Part B: due Nov. 13, 2018 Overview This ab's purpose is to expore issues invoved in the design of out-of-order issue processors.

More information

Response Surface Model Updating for Nonlinear Structures

Response Surface Model Updating for Nonlinear Structures Response Surface Mode Updating for Noninear Structures Gonaz Shahidi a, Shamim Pakzad b a PhD Student, Department of Civi and Environmenta Engineering, Lehigh University, ATLSS Engineering Research Center,

More information

MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY

MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY MULTIGRID REDUCTION IN TIME FOR NONLINEAR PARABOLIC PROBLEMS: A CASE STUDY R.D. FALGOUT, T.A. MANTEUFFEL, B. O NEILL, AND J.B. SCHRODER Abstract. The need for paraeism in the time dimension is being driven

More information

MCSE Training Guide: Windows Architecture and Memory

MCSE Training Guide: Windows Architecture and Memory MCSE Training Guide: Windows 95 -- Ch 2 -- Architecture and Memory Page 1 of 13 MCSE Training Guide: Windows 95-2 - Architecture and Memory This chapter wi hep you prepare for the exam by covering the

More information

A Near-Optimal Distributed QoS Constrained Routing Algorithm for Multichannel Wireless Sensor Networks

A Near-Optimal Distributed QoS Constrained Routing Algorithm for Multichannel Wireless Sensor Networks Sensors 2013, 13, 16424-16450; doi:10.3390/s131216424 Artice OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journa/sensors A Near-Optima Distributed QoS Constrained Routing Agorithm for Mutichanne Wireess

More information

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming The First Internationa Symposium on Optimization and Systems Bioogy (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 267 279 Soving Large Doube Digestion Probems for DNA Restriction

More information

As Michi Henning and Steve Vinoski showed 1, calling a remote

As Michi Henning and Steve Vinoski showed 1, calling a remote Reducing CORBA Ca Latency by Caching and Prefetching Bernd Brügge and Christoph Vismeier Technische Universität München Method ca atency is a major probem in approaches based on object-oriented middeware

More information

Multiple Plane Phase Retrieval Based On Inverse Regularized Imaging and Discrete Diffraction Transform

Multiple Plane Phase Retrieval Based On Inverse Regularized Imaging and Discrete Diffraction Transform Mutipe Pane Phase Retrieva Based On Inverse Reguaried Imaging and Discrete Diffraction Transform Artem Migukin, Vadimir Katkovnik, and Jaakko Astoa Department of Signa Processing, Tampere University of

More information

Application of Image Fusion Techniques on Medical Images

Application of Image Fusion Techniques on Medical Images Internationa Journa of Current Engineering and Technoogy E-ISS 77 406, P-ISS 347 56 07 IPRESSCO, A Rights Reserved Avaiabe at http://inpressco.com/category/icet Research Artice Appication of Image usion

More information

The Classification of Stored Grain Pests based on Convolutional Neural Network

The Classification of Stored Grain Pests based on Convolutional Neural Network 2017 2nd Internationa Conference on Mechatronics and Information Technoogy (ICMIT 2017) The Cassification of Stored Grain Pests based on Convoutiona Neura Network Dexian Zhang1, Wenun Zhao*, 1 1 Schoo

More information

Chapter Multidimensional Direct Search Method

Chapter Multidimensional Direct Search Method Chapter 09.03 Mutidimensiona Direct Search Method After reading this chapter, you shoud be abe to:. Understand the fundamentas of the mutidimensiona direct search methods. Understand how the coordinate

More information

CORRELATION filters (CFs) are a useful tool for a variety

CORRELATION filters (CFs) are a useful tool for a variety Zero-Aiasing Correation Fiters for Object Recognition Joseph A. Fernandez, Student Member, IEEE, Vishnu Naresh Boddeti, Member, IEEE, Andres Rodriguez, Member, IEEE, B. V. K. Vijaya Kumar, Feow, IEEE arxiv:4.36v

More information

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line American J. of Engineering and Appied Sciences 3 (): 5-24, 200 ISSN 94-7020 200 Science Pubications Appication of Inteigence Based Genetic Agorithm for Job Sequencing Probem on Parae Mixed-Mode Assemby

More information

Language Identification for Texts Written in Transliteration

Language Identification for Texts Written in Transliteration Language Identification for Texts Written in Transiteration Andrey Chepovskiy, Sergey Gusev, Margarita Kurbatova Higher Schoo of Economics, Data Anaysis and Artificia Inteigence Department, Pokrovskiy

More information

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism Backing-up Fuzzy Contro of a Truck-traier Equipped with a Kingpin Siding Mechanism G. Siamantas and S. Manesis Eectrica & Computer Engineering Dept., University of Patras, Patras, Greece gsiama@upatras.gr;stam.manesis@ece.upatras.gr

More information

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling

CSE120 Principles of Operating Systems. Prof Yuanyuan (YY) Zhou Scheduling CSE120 Principes of Operating Systems Prof Yuanyuan (YY) Zhou Scheduing Announcement Homework 2 due on October 25th Project 1 due on October 26th 2 CSE 120 Scheduing and Deadock Scheduing Overview In discussing

More information

A Novel Congestion Control Scheme for Elastic Flows in Network-on-Chip Based on Sum-Rate Optimization

A Novel Congestion Control Scheme for Elastic Flows in Network-on-Chip Based on Sum-Rate Optimization A Nove Congestion Contro Scheme for Eastic Fows in Network-on-Chip Based on Sum-Rate Optimization Mohammad S. Taebi 1, Fahimeh Jafari 1,3, Ahmad Khonsari 2,1, and Mohammad H. Yaghmae 3 1 IPM, Schoo of

More information

A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irregular LDPC Codes in the IEEE 802.

A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irregular LDPC Codes in the IEEE 802. A Fast-Convergence Decoding Method and Memory-Efficient VLSI Decoder Architecture for Irreguar LDPC Codes in the IEEE 82.16e Standards Yeong-Luh Ueng and Chung-Chao Cheng Dept. of Eectrica Engineering,

More information

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space Sensitivity Anaysis of Hopfied Neura Network in Cassifying Natura RGB Coor Space Department of Computer Science University of Sharjah UAE rsammouda@sharjah.ac.ae Abstract: - This paper presents a study

More information

Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation

Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation Muti-Scae Continuous CRFs as Sequentia Deep Networks for Monocuar Depth Estimation Dan Xu 1, Eisa Ricci 4,5, Wani Ouyang 2,3, Xiaogang Wang 2, Nicu Sebe 1 1 University of Trento, 2 The Chinese University

More information

Nearest Neighbor Learning

Nearest Neighbor Learning Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification

More information

A HIGH PERFORMANCE, LOW LATENCY, LOW POWER AUDIO PROCESSING SYSTEM FOR WIDEBAND SPEECH OVER WIRELESS LINKS

A HIGH PERFORMANCE, LOW LATENCY, LOW POWER AUDIO PROCESSING SYSTEM FOR WIDEBAND SPEECH OVER WIRELESS LINKS A HIGH PERFORMANCE, LOW LATENCY, LOW POWER AUDIO PROCESSING SYSTEM FOR WIDEBAND SPEECH OVER WIRELESS LINKS Etienne Cornu 1, Aain Dufaux 2, and David Hermann 1 1 AMI Semiconductor Canada, 611 Kumpf Drive,

More information

COMPRESSIVE sensing (CS), which aims at recovering

COMPRESSIVE sensing (CS), which aims at recovering D-Net: Deep Learning pproach for Compressive Sensing RI Yan Yang, Jian Sun, Huibin Li, and Zongben u ariv:705.06869v [cs.cv] 9 ay 07 bstract Compressive sensing (CS) is an effective approach for fast agnetic

More information

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

Research of Classification based on Deep Neural Network

Research of  Classification based on Deep Neural Network 2018 Internationa Conference on Sensor Network and Computer Engineering (ICSNCE 2018) Research of Emai Cassification based on Deep Neura Network Wang Yawen Schoo of Computer Science and Engineering Xi

More information

Providing Hop-by-Hop Authentication and Source Privacy in Wireless Sensor Networks

Providing Hop-by-Hop Authentication and Source Privacy in Wireless Sensor Networks The 31st Annua IEEE Internationa Conference on Computer Communications: Mini-Conference Providing Hop-by-Hop Authentication and Source Privacy in Wireess Sensor Networks Yun Li Jian Li Jian Ren Department

More information

Alpha labelings of straight simple polyominal caterpillars

Alpha labelings of straight simple polyominal caterpillars Apha abeings of straight simpe poyomina caterpiars Daibor Froncek, O Nei Kingston, Kye Vezina Department of Mathematics and Statistics University of Minnesota Duuth University Drive Duuth, MN 82-3, U.S.A.

More information

TSR: Topology Reduction from Tree to Star Data Grids

TSR: Topology Reduction from Tree to Star Data Grids 03 Seventh Internationa Conference on Innovative Mobie and Internet Services in biquitous Computing TSR: Topoogy Reduction from Tree to Star Data Grids Ming-Chang Lee #, Fang-Yie Leu *, Ying-ping Chen

More information

Layer-Specific Adaptive Learning Rates for Deep Networks

Layer-Specific Adaptive Learning Rates for Deep Networks Layer-Specific Adaptive Learning Rates for Deep Networks arxiv:1510.04609v1 [cs.cv] 15 Oct 2015 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Godstein, and Gavin Tayor Department of Computer Science Department

More information

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162 oward Efficient Spatia Variation Decomposition via Sparse Regression Wangyang Zhang, Karthik Baakrishnan, Xin Li, Duane Boning and Rob Rutenbar 3 Carnegie Meon University, Pittsburgh, PA 53, wangyan@ece.cmu.edu,

More information

AUTOMATIC gender classification based on facial images

AUTOMATIC gender classification based on facial images SUBMITTED TO IEEE TRANSACTIONS ON NEURAL NETWORKS 1 Gender Cassification Using a Min-Max Moduar Support Vector Machine with Incorporating Prior Knowedge Hui-Cheng Lian and Bao-Liang Lu, Senior Member,

More information

Alternative Decompositions for Distributed Maximization of Network Utility: Framework and Applications

Alternative Decompositions for Distributed Maximization of Network Utility: Framework and Applications Aternative Decompositions for Distributed Maximization of Network Utiity: Framework and Appications Danie P. Paomar and Mung Chiang Eectrica Engineering Department, Princeton University, NJ 08544, USA

More information

Topology-aware Key Management Schemes for Wireless Multicast

Topology-aware Key Management Schemes for Wireless Multicast Topoogy-aware Key Management Schemes for Wireess Muticast Yan Sun, Wade Trappe,andK.J.RayLiu Department of Eectrica and Computer Engineering, University of Maryand, Coege Park Emai: ysun, kjriu@gue.umd.edu

More information

Image Segmentation Using Semi-Supervised k-means

Image Segmentation Using Semi-Supervised k-means I J C T A, 9(34) 2016, pp. 595-601 Internationa Science Press Image Segmentation Using Semi-Supervised k-means Reza Monsefi * and Saeed Zahedi * ABSTRACT Extracting the region of interest is a very chaenging

More information

A NEW APPROACH FOR BLOCK BASED STEGANALYSIS USING A MULTI-CLASSIFIER

A NEW APPROACH FOR BLOCK BASED STEGANALYSIS USING A MULTI-CLASSIFIER Internationa Journa on Technica and Physica Probems of Engineering (IJTPE) Pubished by Internationa Organization of IOTPE ISSN 077-358 IJTPE Journa www.iotpe.com ijtpe@iotpe.com September 014 Issue 0 Voume

More information

Optimized Base-Station Cache Allocation for Cloud Radio Access Network with Multicast Backhaul

Optimized Base-Station Cache Allocation for Cloud Radio Access Network with Multicast Backhaul Optimized Base-Station Cache Aocation for Coud Radio Access Network with Muticast Backhau Binbin Dai, Student Member, IEEE, Ya-Feng Liu, Member, IEEE, and Wei Yu, Feow, IEEE arxiv:804.0730v [cs.it] 28

More information

Fuzzy Perceptual Watermarking For Ownership Verification

Fuzzy Perceptual Watermarking For Ownership Verification Fuzzy Perceptua Watermarking For Ownership Verification Mukesh Motwani 1 and Frederick C. Harris, Jr. 1 1 Computer Science & Engineering Department, University of Nevada, Reno, NV, USA Abstract - An adaptive

More information

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Patrice Y. Simard, 1 Yann A. Le Cun, 2 John S. Denker, 2 Bernard Victorri 3 1 Microsoft Research, 1 Microsoft Way, Redmond,

More information

FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES

FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES Tien Sy Nguyen, Stéphane Begot, Forent Ducuty, Manue Avia To cite this version: Tien Sy Nguyen, Stéphane Begot, Forent

More information

Intro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Why Learn to Program?

Intro to Programming & C Why Program? 1.2 Computer Systems: Hardware and Software. Why Learn to Program? Intro to Programming & C++ Unit 1 Sections 1.1-3 and 2.1-10, 2.12-13, 2.15-17 CS 1428 Spring 2018 Ji Seaman 1.1 Why Program? Computer programmabe machine designed to foow instructions Program a set of

More information

Comparative Analysis of Relevance for SVM-Based Interactive Document Retrieval

Comparative Analysis of Relevance for SVM-Based Interactive Document Retrieval Comparative Anaysis for SVM-Based Interactive Document Retrieva Paper: Comparative Anaysis of Reevance for SVM-Based Interactive Document Retrieva Hiroshi Murata, Takashi Onoda, and Seiji Yamada Centra

More information

Image Processing Technology of FLIR-based Enhanced Vision System

Image Processing Technology of FLIR-based Enhanced Vision System 25 TH INTERNATIONAL CONGRESS OF THE AERONAUTICAL SCIENCES Image Processing Technoogy of FLIR-based Enhanced Vision System Ding Quan Xin, Zhu Rong Gang,Zhao Zhen Yu Luoyang Institute of Eectro-optica Equipment

More information

Application of the pseudoinverse computation in reconstruction of blurred images

Application of the pseudoinverse computation in reconstruction of blurred images Fiomat 26:3 (22), 453 465 DOI 2298/FIL23453M Pubished by Facuty of Sciences and Mathematics, University of Niš, Serbia Avaiabe at: http://wwwpmfniacrs/fiomat Appication of the pseudoinverse computation

More information

WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION

WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION Shen Tao Chinese Academy of Surveying and Mapping, Beijing 100039, China shentao@casm.ac.cn Xu Dehe Institute of resources and environment, North

More information

Resource Optimization to Provision a Virtual Private Network Using the Hose Model

Resource Optimization to Provision a Virtual Private Network Using the Hose Model Resource Optimization to Provision a Virtua Private Network Using the Hose Mode Monia Ghobadi, Sudhakar Ganti, Ghoamai C. Shoja University of Victoria, Victoria C, Canada V8W 3P6 e-mai: {monia, sganti,

More information

Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient

Efficient method to design RF pulses for parallel excitation MRI using gridding and conjugate gradient Origina rtice Efficient method to design RF puses for parae excitation MRI using gridding and conjugate gradient Shuo Feng, Jim Ji Department of Eectrica & Computer Engineering, Texas & M University, Texas,

More information

On Trivial Solution and High Correlation Problems in Deep Supervised Hashing

On Trivial Solution and High Correlation Problems in Deep Supervised Hashing On Trivia Soution and High Correation Probems in Deep Supervised Hashing Yuchen Guo, Xin Zhao, Guiguang Ding, Jungong Han Schoo of Software, Tsinghua University, Beijing 84, China Schoo of Computing and

More information

Automatic Grouping for Social Networks CS229 Project Report

Automatic Grouping for Social Networks CS229 Project Report Automatic Grouping for Socia Networks CS229 Project Report Xiaoying Tian Ya Le Yangru Fang Abstract Socia networking sites aow users to manuay categorize their friends, but it is aborious to construct

More information

(0,l) (0,0) (l,0) (l,l)

(0,l) (0,0) (l,0) (l,l) Parae Domain Decomposition and Load Baancing Using Space-Fiing Curves Srinivas Auru Fatih E. Sevigen Dept. of CS Schoo of EECS New Mexico State University Syracuse University Las Cruces, NM 88003-8001

More information

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES Eya En Gad, Akshay Gadde, A. Saman Avestimehr and Antonio Ortega Department of Eectrica Engineering University of Southern

More information

Testing Whether a Set of Code Words Satisfies a Given Set of Constraints *

Testing Whether a Set of Code Words Satisfies a Given Set of Constraints * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 6, 333-346 (010) Testing Whether a Set of Code Words Satisfies a Given Set of Constraints * HSIN-WEN WEI, WAN-CHEN LU, PEI-CHI HUANG, WEI-KUAN SHIH AND MING-YANG

More information

THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM

THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM 17th European Signa Processing Conference (EUSIPCO 2009) Gasgow, Scotand, August 24-28, 2009 THE PERCENTAGE OCCUPANCY HIT OR MISS TRANSFORM P. Murray 1, S. Marsha 1, and E.Buinger 2 1 Dept. of Eectronic

More information

arxiv: v1 [cs.cv] 7 Feb 2013

arxiv: v1 [cs.cv] 7 Feb 2013 Fast Image Scanning with Deep Max-Pooing Convoutiona Neura Networks arxiv:1302.1700v1 [cs.cv] 7 Feb 2013 Aessandro Giusti Dan C. Cireşan Jonathan Masci Luca M. Gambardea Jürgen Schmidhuber Technica Report

More information

Neural Network Enhancement of the Los Alamos Force Deployment Estimator

Neural Network Enhancement of the Los Alamos Force Deployment Estimator Missouri University of Science and Technoogy Schoars' Mine Eectrica and Computer Engineering Facuty Research & Creative Works Eectrica and Computer Engineering 1-1-1994 Neura Network Enhancement of the

More information

Learning Dynamic Guidance for Depth Image Enhancement

Learning Dynamic Guidance for Depth Image Enhancement Learning Dynamic Guidance for Depth Image Enhancement Shuhang Gu 1, Wangmeng Zuo 2, Shi Guo 2, Yunjin Chen 3, Chongyu Chen 4,1, Lei Zhang 1, 1 The Hong Kong Poytechnic University, 2 Harbin Institute of

More information

Real-Time Image Generation with Simultaneous Video Memory Read/Write Access and Fast Physical Addressing

Real-Time Image Generation with Simultaneous Video Memory Read/Write Access and Fast Physical Addressing Rea-Time Image Generation with Simutaneous Video Memory Read/rite Access and Fast Physica Addressing Mountassar Maamoun 1, Bouaem Laichi 2, Abdehaim Benbekacem 3, Daoud Berkani 4 1 Department of Eectronic,

More information

For Review Only. CFP: Cooperative Fast Protection. Bin Wu, Pin-Han Ho, Kwan L. Yeung, János Tapolcai and Hussein T. Mouftah

For Review Only. CFP: Cooperative Fast Protection. Bin Wu, Pin-Han Ho, Kwan L. Yeung, János Tapolcai and Hussein T. Mouftah Journa of Lightwave Technoogy Page of CFP: Cooperative Fast Protection Bin Wu, Pin-Han Ho, Kwan L. Yeung, János Tapocai and Hussein T. Mouftah Abstract We introduce a nove protection scheme, caed Cooperative

More information

Multi-level Shape Recognition based on Wavelet-Transform. Modulus Maxima

Multi-level Shape Recognition based on Wavelet-Transform. Modulus Maxima uti-eve Shape Recognition based on Waveet-Transform oduus axima Faouzi Aaya Cheikh, Azhar Quddus and oncef Gabbouj Tampere University of Technoogy (TUT), Signa Processing aboratory, P.O. Box 553, FIN-33101

More information

PAPER Delay Constrained Routing and Link Capacity Assignment in Virtual Circuit Networks

PAPER Delay Constrained Routing and Link Capacity Assignment in Virtual Circuit Networks 2004 PAPER Deay Constrained Routing and Link Capacity Assignment in Virtua Circuit Networks Hong-Hsu YEN a), Member and FrankYeong-Sung LIN b), Nonmember SUMMARY An essentia issue in designing, operating

More information

Optimization and Application of Support Vector Machine Based on SVM Algorithm Parameters

Optimization and Application of Support Vector Machine Based on SVM Algorithm Parameters Optimization and Appication of Support Vector Machine Based on SVM Agorithm Parameters YAN Hui-feng 1, WANG Wei-feng 1, LIU Jie 2 1 ChongQing University of Posts and Teecom 400065, China 2 Schoo Of Civi

More information

On Upper Bounds for Assortment Optimization under the Mixture of Multinomial Logit Models

On Upper Bounds for Assortment Optimization under the Mixture of Multinomial Logit Models On Upper Bounds for Assortment Optimization under the Mixture of Mutinomia Logit Modes Sumit Kunnumka September 30, 2014 Abstract The assortment optimization probem under the mixture of mutinomia ogit

More information

High-Quality Unstructured Volume Rendering on the PC Platform

High-Quality Unstructured Volume Rendering on the PC Platform Graphics Hardware (2002), pp. 1 8 Thomas Ert, Wofgang Heidrich, and Michae Doggett (Editors) High-Quaity Unstructured Voume Rendering on the PC Patform Stefan Guthe Stefan Roettger Andreas Schieber Wofgang

More information