GIST: GPU Implementation. Prakhar Jain, Ejaz Ahmed. 3rd May, 2009

GIST GPU Implementation
Prakhar Jain (200601066), Ejaz Ahmed (200601028)
3rd May, 2009
International Institute of Information Technology, Hyderabad

Table of Contents

1. Abstract
2. Introduction
3. Basic Algorithm
4. Parallelization
5. Getting the Descriptor
6. Graphs
7. Speed Up
8. Precision / Accuracy
9. Related Work and References

ABSTRACT

GIST is a computational model of real-world scene recognition that bypasses the segmentation and processing of individual objects or regions. The procedure is based on a very low-dimensional representation of the scene called the Spatial Envelope. Oliva and Torralba proposed a set of perceptual dimensions (naturalness, openness, roughness, expansion, ruggedness) that represent the dominant spatial structure of a scene, and showed that these dimensions can be reliably estimated using spectral and coarsely localized information. The model generates a multidimensional space in which scenes sharing membership in semantic categories (e.g., streets, highways, coasts) are projected close together. The performance of the spatial envelope model shows that specific information about object shape or identity is not required for scene categorization, and that a holistic representation of the scene is informative about its probable semantic category. The authors' implementation is written in Matlab and runs on the CPU. The objective of this project is to parallelize it on the GPU using GPGPU (CUDA).

Introduction

Many computer vision algorithms are computationally expensive. Algorithms such as AdaBoost training, SIFT feature training and SFM can take weeks to execute. However, these algorithms are inherently parallelizable. In this project we have parallelized the GIST feature computation, exploiting the computational power of the GPU with the help of Nvidia's CUDA. The rest of the report describes the basic algorithm, its flowchart and our approach to parallelizing it. We conclude with results, speed-ups and graphs that support the performance claims.

BASIC ALGORITHM

- Create the Gabor filter bank
  o Create a parameter matrix
  o Create transfer functions
- Preprocess the image
  o Low contrast normalization
  o Local luminance variance normalization
- Get the descriptor
  o For each filter:
    convolve the image with the filter,
    divide the image into blocks,
    and take the mean of each block as the corresponding feature.

PARALLELIZATION

- Pixel-level parallelization
  o Use one thread to calculate the value of one element of a Gabor filter.
  o The number of threads equals number of filters * size_of_image * size_of_image.
- Preprocessing the image

  o Fourier transform * Gaussian (element by element)
  o Pixel-level parallelization, possible because each pixel is independent of the others.

Getting the Descriptor

Calculating the actual descriptor requires three things:

- Preprocessing of the image to obtain a normalized image suitable for calculating features.
- Calculation of the Gabor filters for every orientation and scale.
- The number of blocks into which the image should be divided.

Steps for Getting the Descriptor

1. Getting the Fourier transform of the image

The features are computed in the frequency domain, so the image must first be converted from the spatial domain to the frequency domain.

[Figure: the image is split into its R, G and B components in the spatial domain; each component is then transformed to the frequency domain.]

The Fourier transform of each component of the image is calculated using the CUFFT library, which is 2 times as fast as the CPU implementation of the FFT.

2. Applying the filters to the image

Number of scales: Ns
Number of orientations per scale: No
Total number of filters: Nf = Ns * No
Number of channels: 3
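The counts above determine how a flat thread index is split into a filter and a pixel position. The following is a minimal CPU-side sketch in Python of that index arithmetic, not the report's CUDA code. The function names, the scale-major ordering of filters, and the example sizes (4 scales, 8 orientations, a 256 x 256 image) are assumptions made for illustration; the report does not state them.

```python
# Index arithmetic for "one thread per filter element".
# All names and the example sizes are hypothetical; the report only gives
# the formulas Nf = Ns * No and threads = Nf * size * size.

def num_filters(num_scales, orientations_per_scale):
    """Total number of filters, Nf = Ns * No."""
    return num_scales * orientations_per_scale


def total_threads(num_scales, orientations_per_scale, image_size):
    """One thread per element of every filter: Nf * size * size."""
    return num_filters(num_scales, orientations_per_scale) * image_size * image_size


def filter_to_scale_orientation(filter_index, orientations_per_scale):
    """Split a filter index into (scale, orientation).

    Assumes filters are stored scale-major, which is an illustrative choice.
    """
    return divmod(filter_index, orientations_per_scale)


def thread_to_element(thread_id, image_size):
    """Map a flat thread id to (filter_index, row, col)."""
    filter_index, offset = divmod(thread_id, image_size * image_size)
    row, col = divmod(offset, image_size)
    return filter_index, row, col


if __name__ == "__main__":
    print(total_threads(4, 8, 256))
    print(thread_to_element(256 * 256 + 257, 256))
```

In a CUDA kernel the same decomposition would typically be applied to the global thread index computed from the block and thread coordinates.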

Each filter is applied to each channel of the image, giving Nf * 3 images. This is done on the GPU with the help of the following kernels.

Element multiplication: number of threads created = ImageSize * ImageSize * Nf.

[Figure: the R, G and B image components in the frequency domain are multiplied element by element with each of the Nf filters, giving the result of the element multiplication in the frequency domain.]

Taking the inverse FFT of the Nf * 3 images: the inverse Fourier transform of the Nf * 3 images is calculated using the CUFFT library, which is 2 times as fast as the CPU implementation of the IFFT.

Taking the absolute value of every image: an absolute-value kernel is used for this. Total number of threads created = ImageSize * ImageSize * Nf * 3.
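The two custom kernels of this step can be sketched on the CPU as follows, with each loop iteration standing in for one GPU thread. This is an illustrative Python sketch, not the report's CUDA code: the function names and the flat row-major storage are assumptions, and letting each multiplication thread handle all three channels is only inferred from the stated thread count of ImageSize * ImageSize * Nf. The inverse FFT between the two kernels is left to the FFT library and is not shown.

```python
# CPU reference sketch of the element-multiplication and absolute-value kernels.
# Images are flat, row-major lists; names and layout are hypothetical.

def multiply_kernel(spectra, filters, image_size):
    """Multiply each channel spectrum by each filter, element by element.

    spectra: list of channel spectra (complex values), one per colour channel.
    filters: list of Nf filter transfer functions.
    Returns Nf * channels products, ordered filter-major.
    """
    num_channels = len(spectra)
    pixels = image_size * image_size
    out = [[0j] * pixels for _ in range(len(filters) * num_channels)]
    # One "thread" per (filter, pixel) pair: Nf * size * size in total.
    for thread_id in range(len(filters) * pixels):
        f, i = divmod(thread_id, pixels)
        for c in range(num_channels):  # assumed: each thread loops over channels
            out[f * num_channels + c][i] = spectra[c][i] * filters[f][i]
    return out


def absolute_kernel(images):
    """Replace every value by its magnitude, one "thread" per element."""
    return [[abs(value) for value in image] for image in images]


if __name__ == "__main__":
    spectra = [[1 + 1j] * 4, [2 + 0j] * 4, [0j] * 4]   # toy 2 x 2 "image"
    filters = [[1.0, 0.0, 1.0, 0.0], [0.5] * 4]        # Nf = 2
    products = multiply_kernel(spectra, filters, 2)
    print(len(products))
    print(absolute_kernel([[3 + 4j, -2 + 0j]]))
```

Because every output element depends only on the input elements at the same position, no synchronization between threads is needed in either kernel.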

3. Getting the features

For each of the Nf * 3 images obtained in the previous step:

- Divide the image into blocks.
- The mean value of each block is one feature.

The means are calculated using CUDPP's segmented scan.
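The idea of obtaining block means from a segmented scan can be sketched as follows. This is a sequential Python illustration, not the CUDPP API: the function names are hypothetical, and it assumes that pixels are first reordered so that each block forms one contiguous segment marked by a head flag, and that the image side is divisible by the number of blocks per side. The last element of each scanned segment is the block sum, which is divided by the block area.

```python
# Block means via a segmented inclusive sum scan (sequential reference).
# Function names and the block-contiguous reordering are assumptions.

def segmented_inclusive_scan(values, flags):
    """Running sum that restarts wherever flags[i] is 1."""
    out = []
    running = 0.0
    for value, flag in zip(values, flags):
        running = value if flag else running + value
        out.append(running)
    return out


def block_means(image, image_size, blocks_per_side):
    """Mean of each block of a flat, row-major square image."""
    side = image_size // blocks_per_side  # assumes exact divisibility
    values, flags = [], []
    # Gather pixels block by block so that each block is one segment.
    for by in range(blocks_per_side):
        for bx in range(blocks_per_side):
            for y in range(by * side, (by + 1) * side):
                for x in range(bx * side, (bx + 1) * side):
                    is_head = (y == by * side and x == bx * side)
                    flags.append(1 if is_head else 0)
                    values.append(image[y * image_size + x])
    scanned = segmented_inclusive_scan(values, flags)
    area = side * side
    # The last element of every segment holds that block's sum.
    return [scanned[(k + 1) * area - 1] / area
            for k in range(blocks_per_side * blocks_per_side)]


if __name__ == "__main__":
    print(block_means(list(range(16)), 4, 2))
```

Applied to all Nf * 3 filtered images, the concatenated block means form the descriptor; on the GPU the scan itself runs in parallel rather than as the sequential loop shown here.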

Graphs

SPEED UP

Precision / Accuracy

The values are found to be accurate up to 4 decimal places when compared with the results of Torralba's Matlab code.

Related Work and References

Aude Oliva, Antonio Torralba. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope. International Journal of Computer Vision, Vol. 42(3): 145-175, 2001. http://cvcl.mit.edu/papers/ijcv01-oliva-torralba.pdf

CPU implementation: http://people.csail.mit.edu/torralba/code/spatialenvelope/gist.zip

Segmented scan: http://www.gpgpu.org/static/developer/cudpp/rel/cudpp_1.0a/html/group__public_interface.html
