Hardware Acceleration of Retinal Blood Vasculature Segmentation


Dimitris Koukounis, Christos Ttofis, Theocharis Theocharides
KIOS Research Center, ECE Department, University of Cyprus
123 Kyreneias Ave, Nicosia, Cyprus
{koukounis.dimitris, ttofis.christos,

ABSTRACT

Retinal vessel tree extraction is a complex and computationally intensive task used in several medical and biometric applications. The emergence of portable biometric authentication applications, as well as on-site biomedical diagnostics, raises the need for hardware-accelerated, power-efficient architectures that can satisfy the performance and accuracy requirements of retinal vessel tree extraction. As such, this paper presents a VLSI implementation of a retinal vessel segmentation system, in an attempt to illustrate the advantages and performance benefits that result from a dedicated VLSI solution. The proposed design implements an unsupervised vessel segmentation algorithm, which uses matched filtering with signed integers to enhance the difference between the blood vessels and the rest of the retina. The design simplifies the process of obtaining a binary map of the vessel tree by using parallel processing and efficient resource sharing, thus offering real-time performance. FPGA-based simulation results indicate significant performance improvements (up to 90x) when compared to existing hardware and software implementations.

Categories and Subject Descriptors
B.7.1 [Integrated Circuits]: Types and Design Styles - Algorithms implemented in hardware.

General Terms
Algorithms, Performance, Design.

Keywords
Retinal Vessel Segmentation, Portable Biometrics, On-site medical diagnostics, Reconfigurable Parallel Architectures, Hardware Acceleration.

1. INTRODUCTION

Separating the blood vessel tree from the retina is an important and daunting process in biometric and biomedical applications.
As such, several algorithms that perform retinal blood vasculature segmentation have been implemented, mostly in software, and most of them target medical applications where speed is extremely important [2]. These algorithms are typically used by physicians to diagnose and investigate different diseases that cause changes to the blood vasculature. Many of those algorithms manage to provide satisfactory results in terms of accuracy.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. GLSVLSI'13, May 2-3, 2013, Paris, France. Copyright /13/05...$

However, the retina has traditionally been rarely used in identification or authentication systems, due to the high cost of the image capturing equipment but, most importantly, due to the complex computational demands [1]. Emerging research proposes hardware acceleration [3-5] to reduce the processing time of vessel segmentation algorithms, thus enabling their use in biometric systems. However, to take advantage of the emerging need for portable biometric authentication devices, as well as on-site diagnostic requirements, further performance improvement is still necessary, with power consumption as an additional significant constraint. Portable biometric authentication systems and on-site medical diagnostic equipment require high performance that achieves real-time response, low resource requirements, and an accuracy sufficient to authenticate or identify an individual or to accurately detect a disease.
Existing hardware attempts have various insufficiencies with regard to processing time and scalability to high-resolution images, and have thus only partially satisfied these requirements so far.

This paper therefore presents a VLSI implementation of an unsupervised vessel segmentation algorithm. The proposed hardware architecture adopts the main algorithmic idea from [12], which was implemented in software, but integrates hardware optimization techniques in an attempt to make the algorithm hardware-friendly while achieving high performance and accuracy. A matched filtering technique is used to enhance the vessels in retinal images, which helps the extraction of the binary map of the vessel tree. The binary image can be used at a later stage in a segmentation-driven registration system in order to make the registration process more accurate. The paper presents several architectural optimizations stemming from the targeted hardware architecture, and is thus able to achieve a significant performance increase (~90x) when compared to other hardware and software implementations, with acceptable accuracy (within ~4% of the highest reported accuracy achieved with hardware acceleration). The proposed architecture was implemented and evaluated on a Spartan 6 FPGA, in order to derive synthesis and performance evaluation results, yielding s per 768x584-pixel image.

The paper is organized as follows. Section II presents background information and describes the algorithm used to segment the blood vessel tree. Section III summarizes and reviews the related work. The proposed hardware architecture is then presented in Section IV, and Section V presents the simulation results. Finally, Section VI concludes the paper and outlines future directions.

2. Background

2.1 Vessel Segmentation

A large number of algorithms for the segmentation of the retinal blood vessels have been published so far, and they can be classified into two broad categories: supervised and unsupervised [2]. Furthermore, [2] classifies blood vessel segmentation algorithms into seven main categories based on the methodology followed: (1) pattern recognition techniques, (2) matched filtering (MF), (3) vessel tracking/tracing, (4) mathematical morphology, (5) multiscale approaches, (6) model-based approaches and (7) parallel/hardware-based approaches. Among the aforementioned categories, matched filtering approaches give the most accurate results [7]; the proposed architecture is therefore inspired by a matched filter approach, specifically the algorithm presented in [12]. This algorithm integrates matched filtering and a threshold probing technique to achieve the desired results. As shown in [8], which presented the design of a hardware image processing system based on matched filtering, this approach has another significant advantage: the filter is implemented using integers rather than floating point values, making the algorithm hardware-friendly.

Fig. 1. (a) Retinal Image. (b) Matched Filter Response.

2.2 Matched filtering-based algorithm

The matched filter describes the expected appearance of the blood vessels in retinal images. The design of the matched filter in [8] is based on three main characteristics of the blood vessels. First, the vessels have a small curvature, so they can be approximated by piecewise linear segments. Second, the vessels have a lower reflectance than the rest of the retina, and as a result they appear darker than the background. Third, the width of the vessels lies between 2 and 10 pixels. Utilizing these characteristics, [8] proposes a Gaussian function as a model for a blood vessel profile.
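The Gaussian vessel profile and its integer representation can be illustrated with a short sketch. This is not the authors' implementation: the function name `vessel_profile`, the spread `sigma = 2`, the truncation at three standard deviations and the integer scale factor of 10 are all assumptions chosen for illustration. The profile is negated because vessels appear darker than the background, and the mean is removed so that a flat background produces a near-zero response.

```python
import math


def vessel_profile(sigma=2.0, scale=10):
    """Return a signed-integer 1-D cross-section of a dark vessel.

    sigma and scale are illustrative assumptions, not values taken
    from the paper.
    """
    half = int(math.ceil(3 * sigma))              # truncate the Gaussian tails at 3*sigma
    xs = range(-half, half + 1)
    # Inverted Gaussian: most negative at the vessel centre (x = 0).
    raw = [-math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in xs]
    mean = sum(raw) / len(raw)                    # remove the mean so flat regions respond ~0
    return [int(round(scale * (v - mean))) for v in raw]


profile = vessel_profile()
print(profile)
```

Because the coefficients are small signed integers, each multiplication in the convolution can map onto a fixed-point multiplier rather than a floating point unit.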
In order to extend the model to two dimensions (X and Y), it is assumed that a vessel has a fixed width and direction over a short length. The matched filter is then rotated from 0° to 180° in steps of 15°. The twelve resulting filter kernels are convolved with the image. The matched filter response (MFR) is the highest-scoring value at each pixel. The MFR of a retinal image can be seen in Figure 1b. Once the convolution with the matched filter finishes, the vessels are well distinguished from the rest of the retina. The MFR is thresholded using a threshold value that is calculated automatically through iterative thresholding and selected based on the number of pixels that have the most common pixel value in the image. The next step involves threshold probing, where the segments created by the thresholding are used to locate a set of starting points to initialize the probe queue. Probing is iterative and is used to determine the appropriate threshold for the area being probed.

3. Related Work

The majority of the algorithms presented in the literature have been implemented in software running on general-purpose processors or clusters. An ad-hoc parallel implementation for the segmentation of high-resolution images is presented in [6]. In this approach, the image is divided into sub-images that have overlapping regions, and each sub-image is distributed across computers for feature extraction and region growing; the segmentation results are then combined. The algorithm in [6] processes 7-megapixel images in less than two minutes, but faces memory shortage issues stemming from the use of such high-definition images. A similar parallel approach to segment high-resolution images is proposed in [10]. This approach reports a processing time of 3.54s for segmenting a 10-megapixel image, at 92.5% accuracy (the accuracy is measured as the ratio of the total number of correctly classified pixels to the total number of pixels in the original image that actually belong to the vasculature).
This approach spends 20% of the total execution time on communication and I/O. In addition, [10] indicates that the use of multiprocessing computing platforms can enable the segmentation of high-definition images in reasonable time. However, while easy to implement and extremely flexible, these approaches require state-of-the-art, expensive hardware platforms and consume an excessive amount of energy. Therefore, the use of such systems is limited to certain medical applications where power and cost are not an issue. An approach similar to the one described in this paper is presented in [21]; it segments veins in medical video. It reports a performance of 215 frames per second (0.005s per image); however, the performance value stated in [21] is theoretical and not verified experimentally. It relies only on the listed FPGA operating frequency, without considering issues that may affect performance, such as I/O handling and memory accesses.

The emergence of portable biometrics and on-site medical diagnosis has raised the need for dedicated hardware approaches, which have emerged recently in an attempt to overcome the constraints found in software-based solutions [2]. In [5], an FPGA-based implementation of vessel extraction is presented. The design implements a single-instruction multiple-data (SIMD) architecture on a Spartan-3 FPGA. It requires a total of 1.4s to extract the vessels from a 768x584 retinal image. Alonso-Montes et al. [4] present another hardware approach, where the vessels are segmented by using local dynamic convolutions and morphological operations together with arithmetic and logical operations. These are implemented and tested in a fine-grained SIMD processor array. [4] reports an average accuracy of 91.8% and a processing time of 0.19s per image.
While the hardware implementations in [3-5] achieve significant improvements over the software approaches, they only partially satisfy the requirements of portable systems, which demand satisfactory accuracy and, most importantly, vessel extraction in real time. Furthermore, most of the approaches follow a straight algorithm-to-hardware mapping, without any algorithmic optimizations or considerations for hardware-friendly design optimization techniques, especially ones focused on the reconfigurable fabric. As the segmentation process is the most computationally intensive part of the entire authentication or diagnosis procedure that involves retina examination, further performance improvement is necessary. As such, the work presented in this paper targets a hardware-based vessel segmentation approach that achieves extremely high performance by reducing certain computationally intensive steps, sacrificing minimal (but acceptable) accuracy for performance speedup and power reduction. The hardware-targeted optimizations enable easy integration of the architecture in existing portable biomedical and biometric systems.

4. Proposed Hardware Architecture

The proposed architecture consists of three major hardware units: the I/O unit, the Match Filter unit and the Threshold unit. These units are pipelined, and are thus able to operate concurrently; the interconnection between them is accomplished using on-chip buffers. In particular, the system uses one scanline buffer to temporarily store the image data, and six smaller on-chip buffers to store the rotated matched filters. The complete system architecture and the communication flow between the units are shown in Fig. 2. It is assumed that input images are captured by a specialized retina camera and are loaded through an external memory controller onto the targeted hardware in raster scan format.

Fig. 2. Proposed Architecture Overview

4.1 I/O Unit

The memory controller communicates with the on-chip Block RAM and fetches the green channel values (8 bits) into the scanline buffer. The red and blue channels are not used, as the green channel offers better contrast between the vessels and the retinal background [15]. The scanline buffer provides manageable parallelism, so that the processing units are continuously fed with the data they need without any delays. It allows the image to be read sequentially from the BRAM and used in parallel, in order to exploit the parallelism available on the hardware platform. The only delay introduced by this mechanism is an initial delay until the first N rows of the image are loaded (where N is the width of the kernel window). Subsequently, this mechanism can provide the data needed by the match filter unit every cycle.

Fig. 6. Accuracy vs Number of Bits per pixel (axes: No of Bits, Accuracy %)

4.2 Match Filter Unit (MFU)

In contrast to the original software-based algorithm in [12], the number of matched filters was reduced to six by increasing the rotation step from 15° to 30°, in an attempt to reduce the computational complexity and memory requirements. Simulations indicate a reduction in accuracy of less than 3% when reducing the number of rotations. Figure 4 illustrates how the number of rotations used affects the accuracy of this vessel segmentation approach. When the rotation step is increased to 45° (4 matched filters), a significant reduction in accuracy is observed (to less than 80%). At 30° the reduction in accuracy is not significant (~3%), and the resulting accuracy of 90% is sufficient for the targeted applications. In addition, if more matched filters were used, the hardware area overhead would increase significantly, and with it the power consumption.

Fig. 4. Accuracy vs Number of Matched Filters (axes: No of MF, Accuracy %)

The MFU performs the convolution between the image and the six resulting kernels (Fig. 5a). Fig. 3 shows the pipeline stages followed during the convolution process. The first N lines of the image are loaded into the scanline buffer, while the matched filter is rotated and stored in six different buffers. As soon as the scanline buffer fills up, the multiplexer selects the first kernel for pixel-wise multiplication. The number of parallel multiplications equals the number of elements of the scanline buffer output window. Multiplication results are stored in a 16-bit buffer in order to have simultaneous access to all the elements, which are then all forwarded to the accumulation stage in the following clock cycle. At this point, the multiplexer selects the next kernel, and the operation iterates.

Fig. 3. Pipeline Stages followed in Match Filter Unit

Fig. 5. (a) Architecture of the Convolution Unit. (b) Architecture of the scaling done between the Convolution and Threshold Unit

Fig. 5a ("Tree Adder") shows the accumulation as the last step of the Convolution Unit, implemented as a tree adder, which enables a cost-effective implementation of the accumulator. The hardware layout resulting from this type of addition yields the highest achievable performance and clock frequency. The resulting accumulated 16-bit value is stored in a register, and every time a new value is computed, it is compared with the previous value. Six values are compared, one for each kernel, and the maximum value remains as the final pixel value. The final pixel values are stored in another scanline buffer called the "Matched Filter Response". The MFR scanline buffer can be seen in the blood vessel segmentation unit, below the multiply-accumulate unit in Fig. 2. This process is repeated six times for every image window that the input scanline buffer provides, in order to convolve the image window with the six kernels (Fig. 5a). The pipelined architecture of the convolution unit allows it to sustain one valid pixel value every six clock cycles (given that convolution is done for all six kernels). This process is done for all windows in the image.
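The dataflow described above (scanline-buffer window, kernel multiplexing, parallel multiplication, tree-adder accumulation and running-maximum comparison) can be modelled in software. The sketch below is a behavioural illustration only, not the authors' RTL: the function names `tree_add` and `matched_filter_response`, the choice to skip border windows, and the two toy 3x3 kernels in the demo are assumptions made for brevity (the paper uses six rotated kernels).

```python
def tree_add(values):
    """Sum values pairwise, level by level, like a hardware adder tree."""
    level = list(values)
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)                      # pad odd levels with a zero input
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0] if level else 0


def matched_filter_response(image, kernels):
    """Return (mfr, cycles) for a list-of-rows image and N x N integer kernels.

    Windows that do not fit entirely inside the image are skipped
    (an assumption of this sketch).
    """
    n = len(kernels[0])
    height, width = len(image), len(image[0])
    mfr, cycles = [], 0
    for top in range(height - n + 1):
        rows = image[top:top + n]                # the N rows held in the scanline buffer
        out_row = []
        for left in range(width - n + 1):
            window = [rows[r][left + c] for r in range(n) for c in range(n)]
            best = None
            for kernel in kernels:               # multiplexer: one kernel per clock cycle
                coeffs = [kernel[r][c] for r in range(n) for c in range(n)]
                products = [p * k for p, k in zip(window, coeffs)]  # parallel multipliers
                acc = tree_add(products)         # accumulation stage
                if best is None or acc > best:   # keep the maximum response
                    best = acc
                cycles += 1
            out_row.append(best)
        mfr.append(out_row)
    return mfr, cycles


# Toy demo: a dark horizontal line, probed with two hypothetical 3x3 kernels.
horizontal = [[1, 1, 1], [-2, -2, -2], [1, 1, 1]]
vertical = [[1, -2, 1], [1, -2, 1], [1, -2, 1]]
demo_image = [[9, 9, 9, 9],
              [1, 1, 1, 1],
              [9, 9, 9, 9]]
demo_mfr, demo_cycles = matched_filter_response(demo_image, [horizontal, vertical])
print(demo_mfr, demo_cycles)
```

In this model the cycle counter advances once per kernel per window, which mirrors the one-valid-pixel-every-six-cycles behaviour when six kernels are supplied.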
The convolution unit can be easily modified to process more than one kernel in a single cycle, thus computing a valid pixel in fewer than six clock cycles, depending on the amount of parallelism desired and the available hardware resources.

4.3 Threshold Unit

Prior to thresholding, it is necessary to normalize the accumulation result of the pixels computed at the previous stage, in order to simplify and reduce storage requirements. The authors in [12] normalize the resulting value by dividing each pixel by a constant value. To implement a hardware-friendly normalization, we use only the 11 most significant bits of each pixel value (Fig. 2, Normalization Stage). Figure 6 displays the accuracy against the number of bits per pixel used. The accuracy reaches 90.07%, the highest achievable accuracy of the presented approach, at 11 bits per pixel. At a lower number of bits, the accuracy is reduced considerably, since important features of the vessels are ignored. As the number of bits is increased beyond 11, the accuracy is also reduced, as a result of the unwanted details included at such high numbers of bits per pixel. Moreover, the MFR consists of both positive and negative integer values; hence, all values are subsequently positively biased by the absolute value of the minimum pixel value, to create only positive integers and reduce the complexity of the threshold operation. This is described in Fig. 5b. The largest number among all the resulting pixels that have a similar value (i.e. that characterize a similar intensity in the image, which is usually the background and not the vasculature) is then used to compute the threshold value. As noted, this value should represent the background of the retinal image. As shown in Fig. 1b, the background of the MFR is well distinguished from the vessels. The thresholding operation is done in parallel. All pixels above the threshold are replaced with '1' (white) and the rest of the pixels are replaced with '0' (black).

5. Experimental Evaluation and Results

The proposed retinal blood vessel architecture was implemented on a Spartan 6 XC6SLX150T FPGA, and was evaluated using images from publicly available databases such as the STARE database [12], the DRIVE database [11] and the MESSIDOR database [13]. The input retinal images were loaded into the BlockRAM of the FPGA and used as input to the system shown in Figure 2. The implemented system was subsequently evaluated in terms of performance, power consumption, accuracy and hardware requirements. The obtained segmented images are shown in Figure 7, along with the corresponding software implementation results.
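The normalization, biasing and thresholding steps of Section 4.3 can be sketched in software as follows. This is a loose illustration under stated assumptions, not the authors' circuit: the function name `normalize_and_threshold` is hypothetical, the threshold is assumed to be the most common biased value plus a hypothetical `offset` (the paper does not give the exact rule), and keeping the 11 most significant bits of a 16-bit value is modelled as an arithmetic right shift by 5.

```python
from collections import Counter


def normalize_and_threshold(mfr, total_bits=16, kept_bits=11, offset=1):
    """Return (binary_map, threshold) for a 2-D list of signed MFR values.

    offset is a hypothetical margin above the background level; the
    paper does not state how the threshold is derived from that level.
    """
    shift = total_bits - kept_bits
    # Keep the most significant bits: arithmetic right shift of each value.
    scaled = [[v >> shift for v in row] for row in mfr]
    # Bias by |minimum| so that every value becomes non-negative.
    bias = abs(min(min(row) for row in scaled))
    biased = [[v + bias for v in row] for row in scaled]
    # Assume the most common value is the background; ties pick the largest value.
    histogram = Counter(v for row in biased for v in row)
    background = max(histogram.items(), key=lambda kv: (kv[1], kv[0]))[0]
    threshold = background + offset
    # Every pixel is compared independently, so hardware can do this in parallel.
    binary = [[1 if v > threshold else 0 for v in row] for row in biased]
    return binary, threshold


demo_response = [[-64, -32, 0],
                 [0, 0, 960],
                 [0, 32, 1024]]
demo_binary, demo_threshold = normalize_and_threshold(demo_response)
print(demo_binary, demo_threshold)
```

A right shift replaces the division by a constant used in the software algorithm, which is the reason the bit-selection form of normalization is cheap in hardware.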

TABLE III. Processing speed and Accuracy Comparison for Various Systems
Work | Database | Accuracy | Frequency (MHz) | Execution time (s) | Platform | Algorithm
Chaudhuri et al [9] | N/A | | N/A | 2.0000 | Special Image processing board | MFR
Hoover et al [7] | STARE | | N/A | 3.0000 | Sun SPARCstation 20 | MFR+Threshold Probing
Alonso Montes et al [4] | DRIVE | | N/A | 0.1925 | Cellular Neural Networks | Pixel Level Snakes
Cinsdikici and Aydın [8] | DRIVE | | N/A | 3.0000 | Cluster Computer | MFR+ANT algorithm
A. Nieto et al [5] | N/A | 0. | | .4000 | FPGA | Active Contour
Proposed Architecture | DRIVE | 0. | | .0313 | FPGA | MFR+Thresholding
Proposed Architecture | STARE | 0. | | .0318 | FPGA | MFR+Thresholding

TABLE I. Image size and Execution time (columns: Image Size, Number of Windows, Total Execution Time (s))

5.1 Performance

The processing speed of the proposed hardware implementation is measured as the time (in seconds) needed to extract the blood vessel tree from a retinal image. We used 20 images from the DRIVE [11] and STARE [12] databases to compute an average processing time for the proposed architecture and the original software-based algorithm. Experimental results indicate that the proposed system offers a total speedup of up to 90 times relative to the corresponding software implementation of the algorithm. Table I illustrates how the processing time of the proposed architecture is affected by the image size. As expected, the processing time remains small for small images and increases linearly with the image size. This is attributed to I/O handling: once the initial set of pixels is loaded, processing depends only on the amount of available FPGA resources. Since the input image size plays a linear role in filling the initial scanline buffer, the delay is directly proportional to the number of input pixels per line, which explains the linear increase.
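A back-of-the-envelope estimate shows why the processing time scales linearly with image size. The sketch assumes one window per pixel, six clock cycles per window (one per kernel, as described for the convolution unit) and the 100 MHz clock listed in the synthesis results; the function name `frame_time_seconds` and the optional `fill_rows` term for the initial scanline-buffer fill are hypothetical, and I/O overheads are ignored.

```python
def frame_time_seconds(width, height, kernels=6, clock_hz=100e6, fill_rows=0):
    """Estimate a lower bound on the time to process one frame.

    Assumes one window per pixel and one clock cycle per kernel per
    window; fill_rows models the initial scanline-buffer fill.
    """
    windows = width * height
    cycles = windows * kernels + fill_rows * width
    return cycles / clock_hz


estimate = frame_time_seconds(768, 584)
print(f"{estimate:.5f} s for 768x584")

# Doubling the pixel count doubles the estimate (linear scaling).
doubled = frame_time_seconds(768, 2 * 584)
print(f"{doubled:.5f} s for 768x1168")
```

Under these assumptions a 768x584 image needs about 2.7 million cycles, which is of the same order as the execution times listed for the proposed architecture in Table III; any additional time would come from I/O and buffer-fill overheads that this estimate leaves out.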
The linear increase also follows from the fact that the number of windows processed equals the number of pixels in the image (see Table I). Table III presents a comparison between existing implementations and the proposed architecture, showing the algorithm adopted in each work as well as the implementation platform. The proposed system is faster than all of the listed implementations, as it needs only s to extract the vessels from a 768x584 image, while the processing time of the majority of the works listed in the table is considerably higher (comparisons are based on the same image databases that were used in the existing implementations).

5.2 Accuracy

The accuracy of the system is defined as the ratio of the total number of correctly classified pixels (the sum of true positives and true negatives) to the number of pixels in the image field of view. We compare the accuracy of the proposed architecture with the software-based implementation of the algorithm. The original algorithm has an accuracy of 92.67%, while the accuracy of the proposed architecture is 90.07%. This accounts for a quality drop of about 2.6 percentage points, and is attributed to the hardware optimizations (kernel reduction and thresholding).

TABLE II. Complete System Hardware Overheads
Platform | Slice LUTs (92152) | Slice Registers (183304) | DSP48A1s (180) | BRAMs (268) | Freq. (MHz)
Spartan 6 XC6SLX150T | 66% | 14% | 100% | 55% | 100

While this might be a considerable drop for a high-end medical system, the accuracy drop-off is not as important in portable biometrics and biomedical diagnostics; furthermore, other algorithms can be used instead of thresholding to segment the blood vessels, which can increase the accuracy. This, however, is left as future work. It is worth noting that, among the compared works, only [5] has been implemented on an FPGA platform; it has an accuracy of 91%, which is roughly the same as our reported accuracy, while the proposed architecture achieves a total speedup of up to 50x.
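The accuracy measure defined in Section 5.2 can be computed as in the following sketch. The function name `segmentation_accuracy` and the representation of images and the field-of-view mask as nested lists of 0/1 values are assumptions made for illustration.

```python
def segmentation_accuracy(predicted, ground_truth, fov_mask):
    """Return (TP + TN) / number of pixels inside the field of view.

    All three arguments are 2-D lists of 0/1 values of equal size;
    pixels where fov_mask is 0 are ignored.
    """
    correct = 0
    total = 0
    for p_row, g_row, m_row in zip(predicted, ground_truth, fov_mask):
        for p, g, m in zip(p_row, g_row, m_row):
            if not m:
                continue                 # outside the field of view
            total += 1
            if p == g:                   # true positive (1, 1) or true negative (0, 0)
                correct += 1
    return correct / total


demo_pred = [[1, 0, 0],
             [0, 1, 1]]
demo_truth = [[1, 0, 1],
              [0, 1, 0]]
demo_mask = [[1, 1, 1],
             [1, 1, 0]]
demo_accuracy = segmentation_accuracy(demo_pred, demo_truth, demo_mask)
print(demo_accuracy)
```

Restricting the count to the field of view matters because the dark border around a fundus image would otherwise inflate the number of true negatives.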
5.3 Area and Power Overheads

The proposed FPGA-based vessel segmentation architecture yields promising processing speed. To determine its requirements in terms of the targeted FPGA, we implemented the experimental architecture on a Spartan 6 XC6SLX150T FPGA. Table II gives synthesis results for the entire system implemented on the FPGA. The table lists area results for slice LUTs, slice registers and DSP components, in order to give a complete picture of the hardware overheads associated with the system. It can be observed that the entire system utilizes 66% of the FPGA LUTs and 14% of the FPGA slice registers. The slice LUTs are dominated by the convolution unit, which consumes 60%, while the slice registers are dominated by the scanline buffer. Additionally, the system takes full advantage of the DSPs available on the FPGA platform to perform most of the multiplication operations of the convolution unit. We used the Xilinx X-Power Analyzer tool, with the image data as input, to obtain the average dynamic consumption figures. The total power needed for one frame is just 2.146W, making the design suitable for embedded portable applications [19,20]. We anticipate that an ASIC implementation will further increase the performance while also reducing the power.

In conclusion, synthesis and performance results indicate that vessel segmentation from retinal images can be effectively implemented on reconfigurable hardware platforms, with high potential for biometric applications that require real-time processing speed. The proposed system consumes a reasonable number of resources when implemented on a typical low-end FPGA such as the Spartan-6, leaving room for the host application, whether biometric or biomedical, to expand the capabilities of the system by integrating registration and pattern recognition modules.

Fig. 7. (a) Original retinal image used as input to the system. (b) Matched Filter Response. (c) B/W image with vessels segmented from the software version of the algorithm. (d) Image outlining the vessels based on the segmentation result. (e) B/W image with vessels segmented from the hardware version of the algorithm. (f) Image outlining the vessels based on the segmentation result.

6. Conclusion and Future Work

This paper presented a high-performance retinal blood vessel segmentation accelerator architecture. The architecture features a highly parallel matched filtering unit, enabling real-time blood vessel tree extraction. FPGA-based evaluation suggests that the proposed architecture outperforms existing works, demonstrating the capabilities of reconfigurable hardware for the acceleration of such algorithms. Our ongoing work focuses on incorporating pre-processing and post-processing methods into the existing system, in order to further improve accuracy. The objective is to integrate registration and recognition algorithms, efficiently implemented on reconfigurable hardware, for portable biometric and biomedical applications.

7. ACKNOWLEDGMENTS

This work was co-funded by the European Regional Development Fund and the Republic of Cyprus through the Research Promotion Foundation (Project ΝΕΑ ΥΠΟΔΟΜΗ/ΣΤΡΑΤΗ/0308/26).

8. REFERENCES

[1] Usher, D., Y. Tosa, and M. Friedman, Ocular biometrics: simultaneous capture and analysis of the retina and iris, Advances in Biometrics: Sensors, Algorithms and Systems, Springer Publishers, London, UK.
[2] Fraz, M.M., P. Remagnino, A. Hoppe, B. Uyyanonvara, A.R. Rudnicka, C.G. Owen, S.A. Barman, 2012, Blood vessel segmentation methodologies in retinal images - A survey, Computer Methods and Programs in Biomedicine, Volume 108, Issue 1.
[3] Perfetti, R., E. Ricci, D. Casali, G. Costantini, 2007, Cellular neural networks with virtual template expansion for retinal vessel segmentation, IEEE Trans. Circuits Syst., 54 (2).
[4] Alonso-Montes, C., M. Ortega, M.G. Penedo, D.L. Vilarino, 2008, Pixel parallel vessel tree extraction for a personal authentication system, Circuits and Systems, ISCAS, IEEE International Symposium on.
[5] Nieto, A., V.M. Brea, D.L. Vilarino, 2009, FPGA-accelerated retinal vessel-tree extraction, Field Programmable Logic and Applications, FPL, International Conference on.
[6] Palomera-Perez, M.A., M.E. Martinez-Perez, H. Benitez-Perez, J.L. Ortega-Arjona, 2010, Parallel Multiscale Feature Extraction and Region Growing: Application in Retinal Blood Vessel Detection, IEEE Transactions on Information Technology in Biomedicine, vol.14, no.2.
[7] Muhammed Gökhan Cinsdikici, Doğan Aydın, 2009, Detection of blood vessels in ophthalmoscope images using MF/ant (matched filter/ant colony) algorithm, Computer Methods and Programs in Biomedicine, Volume 96, Issue 2, November 2009.
[8] Chaudhuri, S., S. Chatterjee, N. Katz, M. Nelson, M. Goldbaum, 1989, Detection of blood vessels in retinal images using two-dimensional matched filters, IEEE Transactions on Medical Imaging, 8 (3).
[9] Raja, J.B., C.G. Ravichandran, November 2001, Blood Vessel Segmentation For High Resolution Retinal Images, Inter. J. of Comp. Science Issues, vol.8, Issue 6, no 2.
[10] Hijazi, M.H.A., F. Coenen, and Y. Zheng, 2010, Retinal image classification using a histogram based approach, In Proc. International Joint Conference on Neural Networks, IEEE.
[11] Niemeijer, M., J.J. Staal, B.v. Ginneken, M. Loog, M.D. Abramoff, DRIVE: digital retinal images for vessel extraction.
[12] Hoover, A.D., Kouznetsova, V., Goldbaum, M., March 2000, Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response, IEEE Transactions on Medical Imaging, vol.19, no.3.
[13] MESSIDOR: Methods for Evaluating Segmentation and Indexing Techniques Dedicated to Retinal Ophthalmology.
[14] Jain Anil, Lin Hong, and Sharath Pankanti, Biometric identification, Commun. ACM 43, 2 (February 2000).
[15] Feng, P., Y. Pan, B. Wei, W. Jin, and D. Mi, 2007, Enhancing retinal image by the contourlet transform, Pattern Recognition Letters, 4(28).
[16] Kanski, J.J., Clinical Ophthalmology, 6th ed., Elsevier Health Sciences, London, UK.
[17] Teng, T., M. Lefley, D. Claremont, 2002, Progress towards automated diabetic ocular screening: a review of image analysis and intelligent systems for diabetic retinopathy, Medical and Biological Engineering and Computing 40 (2002).
[18] Heneghan, C., J. Flynn, M. O'Keefe, M. Cahill, 2002, Characterization of changes in blood vessel width and tortuosity in retinopathy of prematurity using image analysis, Medical Image Analysis 6.
[19] Fowers, J., Brown, G., Cooke, P., and Stitt, G., A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications, Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA'12), New York, NY, USA.
[20] MacLean, W.J., An Evaluation of the Suitability of FPGAs for Embedded Vision Systems, Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, 131.
[21] Thisius, R.S., A, Kjaer-Nielsen, A, S, Sorensen, 2011, Real-time medical video processing, enabled by hardware accelerated correlations, Journal of Real-Time Image Processing, September 2011, Volume 6, Issue 3.


More information

Supporting Multithreading in Configurable Soft Processor Cores

Supporting Multithreading in Configurable Soft Processor Cores Supporting Multithreading in Configurable Soft Processor Cores Roger Moussali, Nabil Ghanem, and Mazen A. R. Saghir Department of Electrical and Computer Engineering American University of Beirut P.O.

More information

FPGA Implementation of a Single Pass Real-Time Blob Analysis Using Run Length Encoding

FPGA Implementation of a Single Pass Real-Time Blob Analysis Using Run Length Encoding FPGA Implementation of a Single Pass Real-Time J. Trein *, A. Th. Schwarzbacher + and B. Hoppe * Department of Electronic and Computer Science, Hochschule Darmstadt, Germany *+ School of Electronic and

More information

Mobile Robot Path Planning Software and Hardware Implementations

Mobile Robot Path Planning Software and Hardware Implementations Mobile Robot Path Planning Software and Hardware Implementations Lucia Vacariu, Flaviu Roman, Mihai Timar, Tudor Stanciu, Radu Banabic, Octavian Cret Computer Science Department, Technical University of

More information

direct hardware mapping of cnns on fpga-based smart cameras

direct hardware mapping of cnns on fpga-based smart cameras direct hardware mapping of cnns on fpga-based smart cameras Workshop on Architecture of Smart Cameras Kamel ABDELOUAHAB, Francois BERRY, Maxime PELCAT, Jocelyn SEROT, Jean-Charles QUINTON Cordoba, June

More information

e-issn: p-issn:

e-issn: p-issn: Available online at www.ijiere.com International Journal of Innovative and Emerging Research in Engineering e-issn: 2394-3343 p-issn: 2394-5494 Edge Detection Using Canny Algorithm on FPGA Ms. AASIYA ANJUM1

More information

Journal of Engineering Technology Volume 6, Special Issue on Technology Innovations and Applications Oct. 2017, PP

Journal of Engineering Technology Volume 6, Special Issue on Technology Innovations and Applications Oct. 2017, PP Oct. 07, PP. 00-05 Implementation of a digital neuron using system verilog Azhar Syed and Vilas H Gaidhane Department of Electrical and Electronics Engineering, BITS Pilani Dubai Campus, DIAC Dubai-345055,

More information

A Miniature-Based Image Retrieval System

A Miniature-Based Image Retrieval System A Miniature-Based Image Retrieval System Md. Saiful Islam 1 and Md. Haider Ali 2 Institute of Information Technology 1, Dept. of Computer Science and Engineering 2, University of Dhaka 1, 2, Dhaka-1000,

More information

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane

More information

Parallel Architecture & Programing Models for Face Recognition

Parallel Architecture & Programing Models for Face Recognition Parallel Architecture & Programing Models for Face Recognition Submitted by Sagar Kukreja Computer Engineering Department Rochester Institute of Technology Agenda Introduction to face recognition Feature

More information

Accelerating DSP Applications in Embedded Systems with a Coprocessor Data-Path

Accelerating DSP Applications in Embedded Systems with a Coprocessor Data-Path Accelerating DSP Applications in Embedded Systems with a Coprocessor Data-Path Michalis D. Galanis, Gregory Dimitroulakos, and Costas E. Goutis VLSI Design Laboratory, Electrical and Computer Engineering

More information

CS231A Course Project Final Report Sign Language Recognition with Unsupervised Feature Learning

CS231A Course Project Final Report Sign Language Recognition with Unsupervised Feature Learning CS231A Course Project Final Report Sign Language Recognition with Unsupervised Feature Learning Justin Chen Stanford University justinkchen@stanford.edu Abstract This paper focuses on experimenting with

More information

FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression

FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression FPGA Implementation of Multiplierless 2D DWT Architecture for Image Compression Divakara.S.S, Research Scholar, J.S.S. Research Foundation, Mysore Cyril Prasanna Raj P Dean(R&D), MSEC, Bangalore Thejas

More information

A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications

A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications Jeremy Fowers, Greg Brown, Patrick Cooke, Greg Stitt University of Florida Department of Electrical and

More information

University of Cambridge Engineering Part IIB Module 4F12 - Computer Vision and Robotics Mobile Computer Vision

University of Cambridge Engineering Part IIB Module 4F12 - Computer Vision and Robotics Mobile Computer Vision report University of Cambridge Engineering Part IIB Module 4F12 - Computer Vision and Robotics Mobile Computer Vision Web Server master database User Interface Images + labels image feature algorithm Extract

More information

Blood vessel tracking in retinal images

Blood vessel tracking in retinal images Y. Jiang, A. Bainbridge-Smith, A. B. Morris, Blood Vessel Tracking in Retinal Images, Proceedings of Image and Vision Computing New Zealand 2007, pp. 126 131, Hamilton, New Zealand, December 2007. Blood

More information

Hardware Acceleration of Feature Detection and Description Algorithms on Low Power Embedded Platforms

Hardware Acceleration of Feature Detection and Description Algorithms on Low Power Embedded Platforms Hardware Acceleration of Feature Detection and Description Algorithms on LowPower Embedded Platforms Onur Ulusel, Christopher Picardo, Christopher Harris, Sherief Reda, R. Iris Bahar, School of Engineering,

More information

Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders

Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders Vol. 3, Issue. 4, July-august. 2013 pp-2266-2270 ISSN: 2249-6645 Designing and Characterization of koggestone, Sparse Kogge stone, Spanning tree and Brentkung Adders V.Krishna Kumari (1), Y.Sri Chakrapani

More information

Using Intel Streaming SIMD Extensions for 3D Geometry Processing

Using Intel Streaming SIMD Extensions for 3D Geometry Processing Using Intel Streaming SIMD Extensions for 3D Geometry Processing Wan-Chun Ma, Chia-Lin Yang Dept. of Computer Science and Information Engineering National Taiwan University firebird@cmlab.csie.ntu.edu.tw,

More information

Improving Reconfiguration Speed for Dynamic Circuit Specialization using Placement Constraints

Improving Reconfiguration Speed for Dynamic Circuit Specialization using Placement Constraints Improving Reconfiguration Speed for Dynamic Circuit Specialization using Placement Constraints Amit Kulkarni, Tom Davidson, Karel Heyse, and Dirk Stroobandt ELIS department, Computer Systems Lab, Ghent

More information

MOVING OBJECT DETECTION USING BACKGROUND SUBTRACTION ALGORITHM USING SIMULINK

MOVING OBJECT DETECTION USING BACKGROUND SUBTRACTION ALGORITHM USING SIMULINK MOVING OBJECT DETECTION USING BACKGROUND SUBTRACTION ALGORITHM USING SIMULINK Mahamuni P. D 1, R. P. Patil 2, H.S. Thakar 3 1 PG Student, E & TC Department, SKNCOE, Vadgaon Bk, Pune, India 2 Asst. Professor,

More information

An FPGA based Minutiae Extraction System for Fingerprint Recognition

An FPGA based Minutiae Extraction System for Fingerprint Recognition An FPGA based Minutiae Extraction System for Fingerprint Recognition Yousra Wakil Sehar Gul Tariq Aniza Humayun Naeem Abbas National University of Sciences and Technology Karsaz Road, ABSTRACT Fingerprint

More information

IEEE-754 compliant Algorithms for Fast Multiplication of Double Precision Floating Point Numbers

IEEE-754 compliant Algorithms for Fast Multiplication of Double Precision Floating Point Numbers International Journal of Research in Computer Science ISSN 2249-8257 Volume 1 Issue 1 (2011) pp. 1-7 White Globe Publications www.ijorcs.org IEEE-754 compliant Algorithms for Fast Multiplication of Double

More information

AUTOMATIC EXTRACTION OF RETINAL BLOOD VESSELS: A SOFTWARE IMPLEMENTATION

AUTOMATIC EXTRACTION OF RETINAL BLOOD VESSELS: A SOFTWARE IMPLEMENTATION AUTOMATIC EXTRACTION OF RETINAL BLOOD VESSELS: A SOFTWARE IMPLEMENTATION Bahadir Karasulu, PhD Assist. Prof. Dr. at Department of Computer Engineering, Faculty of Engineering, Canakkale Onsekiz Mart University,

More information

Time Stamp Detection and Recognition in Video Frames

Time Stamp Detection and Recognition in Video Frames Time Stamp Detection and Recognition in Video Frames Nongluk Covavisaruch and Chetsada Saengpanit Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand E-mail: nongluk.c@chula.ac.th

More information

FPGA architecture and design technology

FPGA architecture and design technology CE 435 Embedded Systems Spring 2017 FPGA architecture and design technology Nikos Bellas Computer and Communications Engineering Department University of Thessaly 1 FPGA fabric A generic island-style FPGA

More information

16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE.

16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE. 16 BIT IMPLEMENTATION OF ASYNCHRONOUS TWOS COMPLEMENT ARRAY MULTIPLIER USING MODIFIED BAUGH-WOOLEY ALGORITHM AND ARCHITECTURE. AditiPandey* Electronics & Communication,University Institute of Technology,

More information

Pupil Boundary Detection for Iris Recognition Using Graph Cuts

Pupil Boundary Detection for Iris Recognition Using Graph Cuts H. Mehrabian, P. Hashemi-Tari, Pupil Boundary Detection for Iris Recognition Using Graph Cuts, Proceedings of Image and Vision Computing New Zealand 2007, pp. 77 82, Hamilton, New Zealand, December 2007.

More information

An Approach for Reduction of Rain Streaks from a Single Image

An Approach for Reduction of Rain Streaks from a Single Image An Approach for Reduction of Rain Streaks from a Single Image Vijayakumar Majjagi 1, Netravati U M 2 1 4 th Semester, M. Tech, Digital Electronics, Department of Electronics and Communication G M Institute

More information

Fingerprint Image Enhancement Algorithm and Performance Evaluation

Fingerprint Image Enhancement Algorithm and Performance Evaluation Fingerprint Image Enhancement Algorithm and Performance Evaluation Naja M I, Rajesh R M Tech Student, College of Engineering, Perumon, Perinad, Kerala, India Project Manager, NEST GROUP, Techno Park, TVM,

More information

Stereo Video Processing for Depth Map

Stereo Video Processing for Depth Map Stereo Video Processing for Depth Map Harlan Hile and Colin Zheng University of Washington Abstract This paper describes the implementation of a stereo depth measurement algorithm in hardware on Field-Programmable

More information

Implimentation of A 16-bit RISC Processor for Convolution Application

Implimentation of A 16-bit RISC Processor for Convolution Application Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 4, Number 5 (2014), pp. 441-446 Research India Publications http://www.ripublication.com/aeee.htm Implimentation of A 16-bit RISC

More information

Real-Time Lane Departure and Front Collision Warning System on an FPGA

Real-Time Lane Departure and Front Collision Warning System on an FPGA Real-Time Lane Departure and Front Collision Warning System on an FPGA Jin Zhao, Bingqian ie and inming Huang Department of Electrical and Computer Engineering Worcester Polytechnic Institute, Worcester,

More information

SDA: Software-Defined Accelerator for Large- Scale DNN Systems

SDA: Software-Defined Accelerator for Large- Scale DNN Systems SDA: Software-Defined Accelerator for Large- Scale DNN Systems Jian Ouyang, 1 Shiding Lin, 1 Wei Qi, 1 Yong Wang, 1 Bo Yu, 1 Song Jiang, 2 1 Baidu, Inc. 2 Wayne State University Introduction of Baidu A

More information

Low-Power Adaptive Viterbi Decoder for TCM Using T-Algorithm

Low-Power Adaptive Viterbi Decoder for TCM Using T-Algorithm International Journal of Scientific and Research Publications, Volume 3, Issue 8, August 2013 1 Low-Power Adaptive Viterbi Decoder for TCM Using T-Algorithm MUCHHUMARRI SANTHI LATHA*, Smt. D.LALITHA KUMARI**

More information

IMAGE PROCESSING USING DISCRETE WAVELET TRANSFORM

IMAGE PROCESSING USING DISCRETE WAVELET TRANSFORM IMAGE PROCESSING USING DISCRETE WAVELET TRANSFORM Prabhjot kour Pursuing M.Tech in vlsi design from Audisankara College of Engineering ABSTRACT The quality and the size of image data is constantly increasing.

More information

A GPU-based implementation of the MRF algorithm in ITK package

A GPU-based implementation of the MRF algorithm in ITK package J Supercomput DOI 10.1007/s11227-011-0597-1 A GPU-based implementation of the MRF algorithm in ITK package Pedro Valero José L. Sánchez Diego Cazorla Enrique Arias Springer Science+Business Media, LLC

More information

FPGA Provides Speedy Data Compression for Hyperspectral Imagery

FPGA Provides Speedy Data Compression for Hyperspectral Imagery FPGA Provides Speedy Data Compression for Hyperspectral Imagery Engineers implement the Fast Lossless compression algorithm on a Virtex-5 FPGA; this implementation provides the ability to keep up with

More information

Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks

Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks Naveen Suda, Vikas Chandra *, Ganesh Dasika *, Abinash Mohanty, Yufei Ma, Sarma Vrudhula, Jae-sun Seo, Yu

More information

Evaluation of FPGA Resources for Built-In Self-Test of Programmable Logic Blocks

Evaluation of FPGA Resources for Built-In Self-Test of Programmable Logic Blocks Evaluation of FPGA Resources for Built-In Self-Test of Programmable Logic Blocks Charles Stroud, Ping Chen, Srinivasa Konala, Dept. of Electrical Engineering University of Kentucky and Miron Abramovici

More information

Automated Canvas Analysis for Painting Conservation. By Brendan Tobin

Automated Canvas Analysis for Painting Conservation. By Brendan Tobin Automated Canvas Analysis for Painting Conservation By Brendan Tobin 1. Motivation Distinctive variations in the spacings between threads in a painting's canvas can be used to show that two sections of

More information

VHDL IMPLEMENTATION FOR EDGE DETECTION USING LOG GABOR FILTER FOR DISEASE DETECTION

VHDL IMPLEMENTATION FOR EDGE DETECTION USING LOG GABOR FILTER FOR DISEASE DETECTION VHDL IMPLEMENTATION FOR EDGE DETECTION USING LOG GABOR FILTER FOR DISEASE DETECTION V.V.Kumbhalwar 1, S.R.Dixit 2 1 M.Tech Student, Communication Engineering, Department of Electronics & Telecommunication

More information

sizes. Section 5 briey introduces some of the possible applications of the algorithm. Finally, we draw some conclusions in Section 6. 2 MasPar Archite

sizes. Section 5 briey introduces some of the possible applications of the algorithm. Finally, we draw some conclusions in Section 6. 2 MasPar Archite Parallelization of 3-D Range Image Segmentation on a SIMD Multiprocessor Vipin Chaudhary and Sumit Roy Bikash Sabata Parallel and Distributed Computing Laboratory SRI International Wayne State University

More information

On a fast discrete straight line segment detection

On a fast discrete straight line segment detection On a fast discrete straight line segment detection Ali Abdallah, Roberto Cardarelli, Giulio Aielli University of Rome Tor Vergata Abstract Detecting lines is one of the fundamental problems in image processing.

More information

Implementation of Fingerprint Matching Algorithm

Implementation of Fingerprint Matching Algorithm RESEARCH ARTICLE International Journal of Engineering and Techniques - Volume 2 Issue 2, Mar Apr 2016 Implementation of Fingerprint Matching Algorithm Atul Ganbawle 1, Prof J.A. Shaikh 2 Padmabhooshan

More information

Eliminating False Loops Caused by Sharing in Control Path

Eliminating False Loops Caused by Sharing in Control Path Eliminating False Loops Caused by Sharing in Control Path ALAN SU and YU-CHIN HSU University of California Riverside and TA-YUNG LIU and MIKE TIEN-CHIEN LEE Avant! Corporation In high-level synthesis,

More information

Motion Detection Algorithm

Motion Detection Algorithm Volume 1, No. 12, February 2013 ISSN 2278-1080 The International Journal of Computer Science & Applications (TIJCSA) RESEARCH PAPER Available Online at http://www.journalofcomputerscience.com/ Motion Detection

More information

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors G. Chen 1, M. Kandemir 1, I. Kolcu 2, and A. Choudhary 3 1 Pennsylvania State University, PA 16802, USA 2 UMIST,

More information

Blood vessel segmentation methodologies in retinal images A survey

Blood vessel segmentation methodologies in retinal images A survey c o m p u t e r m e t h o d s a n d p r o g r a m s i n b i o m e d i c i n e 1 0 8 ( 2 0 1 2 ) 407 433 jo ur n al hom ep age : www.intl.elsevierhealth.com/journals/cmpb Blood vessel segmentation methodologies

More information

Gurmeet Kaur 1, Parikshit 2, Dr. Chander Kant 3 1 M.tech Scholar, Assistant Professor 2, 3

Gurmeet Kaur 1, Parikshit 2, Dr. Chander Kant 3 1 M.tech Scholar, Assistant Professor 2, 3 Volume 8 Issue 2 March 2017 - Sept 2017 pp. 72-80 available online at www.csjournals.com A Novel Approach to Improve the Biometric Security using Liveness Detection Gurmeet Kaur 1, Parikshit 2, Dr. Chander

More information

Canny Edge Detection Algorithm on FPGA

Canny Edge Detection Algorithm on FPGA IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 1, Ver. 1 (Jan - Feb. 2015), PP 15-19 www.iosrjournals.org Canny Edge Detection

More information

Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient

Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient ISSN (Online) : 2278-1021 Implementation of Ripple Carry and Carry Skip Adders with Speed and Area Efficient PUSHPALATHA CHOPPA 1, B.N. SRINIVASA RAO 2 PG Scholar (VLSI Design), Department of ECE, Avanthi

More information

ABSTRACT I. INTRODUCTION. 905 P a g e

ABSTRACT I. INTRODUCTION. 905 P a g e Design and Implements of Booth and Robertson s multipliers algorithm on FPGA Dr. Ravi Shankar Mishra Prof. Puran Gour Braj Bihari Soni Head of the Department Assistant professor M.Tech. scholar NRI IIST,

More information

6. NEURAL NETWORK BASED PATH PLANNING ALGORITHM 6.1 INTRODUCTION

6. NEURAL NETWORK BASED PATH PLANNING ALGORITHM 6.1 INTRODUCTION 6 NEURAL NETWORK BASED PATH PLANNING ALGORITHM 61 INTRODUCTION In previous chapters path planning algorithms such as trigonometry based path planning algorithm and direction based path planning algorithm

More information

Embedded Systems. 7. System Components

Embedded Systems. 7. System Components Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic

More information

FPGA-Based Feature Detection

FPGA-Based Feature Detection FPGA-Based Feature Detection Wennie Tabib School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 wtabib@andrew.cmu.edu Abstract Fast, accurate, autonomous robot navigation is essential

More information

Component-based Face Recognition with 3D Morphable Models

Component-based Face Recognition with 3D Morphable Models Component-based Face Recognition with 3D Morphable Models B. Weyrauch J. Huang benjamin.weyrauch@vitronic.com jenniferhuang@alum.mit.edu Center for Biological and Center for Biological and Computational

More information

Software Pipelining for Coarse-Grained Reconfigurable Instruction Set Processors

Software Pipelining for Coarse-Grained Reconfigurable Instruction Set Processors Software Pipelining for Coarse-Grained Reconfigurable Instruction Set Processors Francisco Barat, Murali Jayapala, Pieter Op de Beeck and Geert Deconinck K.U.Leuven, Belgium. {f-barat, j4murali}@ieee.org,

More information

FPGA-based Smart Camera System for Real-time Automated Video Surveillance

FPGA-based Smart Camera System for Real-time Automated Video Surveillance FPGA-based Smart Camera System for Real-time Automated Video Surveillance Sanjay Singh*, Sumeet Saurav, Ravi Saini, Atanendu S. Mandal, Santanu Chaudhury 1 CSIR-Central Electronics Engineering Research

More information

Image Enhancement Techniques for Fingerprint Identification

Image Enhancement Techniques for Fingerprint Identification March 2013 1 Image Enhancement Techniques for Fingerprint Identification Pankaj Deshmukh, Siraj Pathan, Riyaz Pathan Abstract The aim of this paper is to propose a new method in fingerprint enhancement

More information

Semi-Supervised PCA-based Face Recognition Using Self-Training

Semi-Supervised PCA-based Face Recognition Using Self-Training Semi-Supervised PCA-based Face Recognition Using Self-Training Fabio Roli and Gian Luca Marcialis Dept. of Electrical and Electronic Engineering, University of Cagliari Piazza d Armi, 09123 Cagliari, Italy

More information

HIGH SPEED TDI EMBEDDED CCD IN CMOS SENSOR

HIGH SPEED TDI EMBEDDED CCD IN CMOS SENSOR HIGH SPEED TDI EMBEDDED CCD IN CMOS SENSOR P. Boulenc 1, J. Robbelein 1, L. Wu 1, L. Haspeslagh 1, P. De Moor 1, J. Borremans 1, M. Rosmeulen 1 1 IMEC, Kapeldreef 75, B-3001 Leuven, Belgium Email: pierre.boulenc@imec.be,

More information

An Implementation of Double precision Floating point Adder & Subtractor Using Verilog

An Implementation of Double precision Floating point Adder & Subtractor Using Verilog IOSR Journal of Electrical and Electronics Engineering (IOSR-JEEE) e-issn: 2278-1676,p-ISSN: 2320-3331, Volume 9, Issue 4 Ver. III (Jul Aug. 2014), PP 01-05 An Implementation of Double precision Floating

More information

Abstract. Literature Survey. Introduction. A.Radix-2/8 FFT algorithm for length qx2 m DFTs

Abstract. Literature Survey. Introduction. A.Radix-2/8 FFT algorithm for length qx2 m DFTs Implementation of Split Radix algorithm for length 6 m DFT using VLSI J.Nancy, PG Scholar,PSNA College of Engineering and Technology; S.Bharath,Assistant Professor,PSNA College of Engineering and Technology;J.Wilson,Assistant

More information

Ensemble registration: Combining groupwise registration and segmentation

Ensemble registration: Combining groupwise registration and segmentation PURWANI, COOTES, TWINING: ENSEMBLE REGISTRATION 1 Ensemble registration: Combining groupwise registration and segmentation Sri Purwani 1,2 sri.purwani@postgrad.manchester.ac.uk Tim Cootes 1 t.cootes@manchester.ac.uk

More information

Vessel Junction Detection From Retinal Images

Vessel Junction Detection From Retinal Images Vessel Junction Detection From Retinal Images Yuexiong Tao Faculty of Computer Science Dalhousie University Halifax, Nova Scotia, CA B3H 1W5 E-mail: yuexiong@cs.dal.ca Qigang Gao Faculty of Computer Science

More information

Vendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs

Vendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs Vendor Agnostic, High Performance, Double Precision Floating Point Division for FPGAs Xin Fang and Miriam Leeser Dept of Electrical and Computer Eng Northeastern University Boston, Massachusetts 02115

More information

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition Linear Discriminant Analysis in Ottoman Alphabet Character Recognition ZEYNEB KURT, H. IREM TURKMEN, M. ELIF KARSLIGIL Department of Computer Engineering, Yildiz Technical University, 34349 Besiktas /

More information

International Journal of Advance Engineering and Research Development. Iris Recognition and Automated Eye Tracking

International Journal of Advance Engineering and Research Development. Iris Recognition and Automated Eye Tracking International Journal of Advance Engineering and Research Development Scientific Journal of Impact Factor (SJIF): 4.72 Special Issue SIEICON-2017,April -2017 e-issn : 2348-4470 p-issn : 2348-6406 Iris

More information

Profiling-Based L1 Data Cache Bypassing to Improve GPU Performance and Energy Efficiency

Profiling-Based L1 Data Cache Bypassing to Improve GPU Performance and Energy Efficiency Profiling-Based L1 Data Cache Bypassing to Improve GPU Performance and Energy Efficiency Yijie Huangfu and Wei Zhang Department of Electrical and Computer Engineering Virginia Commonwealth University {huangfuy2,wzhang4}@vcu.edu

More information

A Simple Method to Improve the throughput of A Multiplier

A Simple Method to Improve the throughput of A Multiplier International Journal of Electronics and Communication Engineering. ISSN 0974-2166 Volume 6, Number 1 (2013), pp. 9-16 International Research Publication House http://www.irphouse.com A Simple Method to

More information

A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs

A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Politecnico di Milano & EPFL A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Vincenzo Rana, Ivan Beretta, Donatella Sciuto Donatella Sciuto sciuto@elet.polimi.it Introduction

More information

Introduction to FPGA Design with Vivado High-Level Synthesis. UG998 (v1.0) July 2, 2013

Introduction to FPGA Design with Vivado High-Level Synthesis. UG998 (v1.0) July 2, 2013 Introduction to FPGA Design with Vivado High-Level Synthesis Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products.

More information

Resource-efficient Acceleration of 2-Dimensional Fast Fourier Transform Computations on FPGAs

Resource-efficient Acceleration of 2-Dimensional Fast Fourier Transform Computations on FPGAs In Proceedings of the International Conference on Distributed Smart Cameras, Como, Italy, August 2009. Resource-efficient Acceleration of 2-Dimensional Fast Fourier Transform Computations on FPGAs Hojin

More information

RECONFIGURABLE ARCHITECTURE OF 2D- ADAPTIVE MEDIAN FILTER BASED IMAGE DENOISING

RECONFIGURABLE ARCHITECTURE OF 2D- ADAPTIVE MEDIAN FILTER BASED IMAGE DENOISING RECONFIGURABLE ARCHITECTURE OF 2D- ADAPTIVE MEDIAN FILTER BASED IMAGE DENOISING P.Karthikeyan Department of Electronics and Communication Engineering Velammal college of Engg & tech Madurai, India S.Vasuki

More information

Cellular Learning Automata-Based Color Image Segmentation using Adaptive Chains

Cellular Learning Automata-Based Color Image Segmentation using Adaptive Chains Cellular Learning Automata-Based Color Image Segmentation using Adaptive Chains Ahmad Ali Abin, Mehran Fotouhi, Shohreh Kasaei, Senior Member, IEEE Sharif University of Technology, Tehran, Iran abin@ce.sharif.edu,
