Data Compression. The Encoder and PCA

Data Compression: The Encoder and PCA. Neural network techniques have been shown to be useful in the area of data compression. In general, data compression can be lossless or lossy; in the latter, some portion of the information represented is actually lost. The JPEG and MPEG (video & audio) compression standards are examples of lossy compression, whereas LZW and 'packbits' are lossless. Neural net techniques can be applied to achieve both lossless and lossy compression. The following is a closer look at examples of different neural net based compression techniques. 1

The Encoder. Self-supervised backpropagation: the input is reproduced on the output; the hidden layer compresses the data; only the hidden layer outputs are transmitted; the output layer is used for decoding. The encoder is a multi-layer perceptron, trained to act as an autoassociator using backpropagation. 2

The Encoder. The net is trained to produce the same output pattern that appears on the input; this is also known as self-supervised backpropagation. The aim is to reproduce the input pattern on the output using as few hidden layer neurons as possible. The output of the hidden layer then becomes the data to be transmitted, and the "compressed" data is decoded at the receiver using the weights of the output layer. The illustration shows how an n-dimensional input pattern can be transmitted using fewer than n values (since there are fewer than n hidden units). 3
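As a rough illustration (not taken from the slides), the encoder can be sketched as a small numpy autoassociator trained with backpropagation to reproduce its input. The 64-16-64 sizes, learning rate and random training data below are assumptions chosen for the sketch, not values from the lecture.

```python
import numpy as np

# Minimal self-supervised "encoder" sketch: a 64-16-64 autoassociator trained
# with backpropagation to reproduce its own input. All hyperparameters and the
# random training data are illustrative assumptions.
rng = np.random.default_rng(0)
n_in, n_hidden = 64, 16
W1 = rng.normal(0, 0.1, (n_hidden, n_in))   # encoder weights (sender side)
W2 = rng.normal(0, 0.1, (n_in, n_hidden))   # decoder weights (receiver side)
lr = 0.01

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = rng.random((1000, n_in))                # stand-in training patterns

for epoch in range(50):
    for x in X:
        h = sigmoid(W1 @ x)                 # hidden code: this is what would be transmitted
        y = W2 @ h                          # linear reconstruction at the receiver
        err = y - x                         # gradient of 0.5*||y - x||^2 w.r.t. y
        grad_W2 = np.outer(err, h)
        grad_h = W2.T @ err
        grad_W1 = np.outer(grad_h * h * (1 - h), x)
        W2 -= lr * grad_W2
        W1 -= lr * grad_W1
```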

The Encoder. Lossless compression: N orthogonal input patterns can be mapped onto log2(N) hidden units. Lossy compression: fewer than log2(N) hidden units. It is known (Rumelhart & McClelland, 1986) that a set of N orthogonal input patterns can be mapped onto log2(N) hidden units. Thus, log2(N) can be taken as the theoretical minimum number of hidden units needed for lossless compression; using fewer than that makes the compression lossy. 4
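A small worked illustration of the log2(N) bound, under the assumption that the N orthogonal patterns are one-hot vectors: N = 8 such patterns can in principle be represented by log2(8) = 3 hidden units, for example by assigning each pattern a distinct 3-bit code. The code assignment below is just one possible choice.

```python
import numpy as np

# log2(N) bound: 8 orthogonal (one-hot) patterns need at least log2(8) = 3
# hidden units for lossless coding, e.g. one distinct 3-bit code per pattern.
N = 8
patterns = np.eye(N)                                   # 8 orthogonal input patterns
min_hidden = int(np.ceil(np.log2(N)))                  # theoretical minimum = 3
codes = np.array([list(np.binary_repr(i, width=min_hidden)) for i in range(N)],
                 dtype=int)
print(min_hidden)   # 3
print(codes[5])     # e.g. [1 0 1] -- a possible hidden code for pattern 5
```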

The Encoder. Cottrell et al. (1987): image compression. Greyscale 8-bit image, any dimensions. Network size: 64 inputs, 64 outputs and 16 hidden units. Image processed in 8x8 patches. An example of this approach to image compression was investigated by Cottrell et al. (1987). The aim was to compress an image (of any size). Their approach used a network with 64 inputs (representing an 8x8 image patch), 16 hidden units and 64 outputs. Each input represented one of 256 grey levels. 5

The Encoder. Near state of the art results obtained! 64 greyscale pixels compressed by 16 hidden units. 150,000 training patterns. Compression is image dependent, however. Encode & transmit first 8x8 patch. The net was trained using 150,000 presentations of input taken randomly from 8x8 patches of the image. Applying the net to each 8x8 non-overlapping patch of the image, Cottrell et al. obtained near state of the art compression results! Note, however, that the compression was very much tuned to the actual image compressed, and that results on other kinds of images were less impressive. 6
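The patch handling can be sketched as follows: the image is cut into non-overlapping 8x8 patches, each flattened to a 64-vector, and a training set is drawn at random from those patches. The random image and the sampling-with-replacement scheme are assumptions for the sketch; only the 8x8 patch size and the 150,000 presentations come from the slides.

```python
import numpy as np

# Cut a greyscale image into non-overlapping 8x8 patches, each flattened to a
# 64-vector, as compression/training units for a 64-16-64 encoder. The image
# here is random data standing in for a real picture.
image = np.random.randint(0, 256, size=(256, 256)).astype(float) / 255.0
ph = pw = 8
H, W = image.shape
patches = (image[:H - H % ph, :W - W % pw]
           .reshape(H // ph, ph, W // pw, pw)
           .swapaxes(1, 2)
           .reshape(-1, ph * pw))                      # shape: (1024, 64)

# A Cottrell-style training set: 150,000 patches sampled (here, with replacement).
idx = np.random.randint(0, patches.shape[0], size=150_000)
training_patches = patches[idx]
```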

Principal Component Analysis. PCA is dimensionality reduction: m-bit data is converted to n-bit data, where n < m. Another way to view data compression is to regard it as a reduction in dimensionality. That is, can a representation of a set of patterns expressed using m bits of information be adequately described using n bits, where n is less than m? The goal is to represent essentially the same data using a reduced set of features. Given a set of data, principal component analysis, as we have already seen, attempts to identify the axes (or principal components) along which the data varies the most. 7

PCA. By definition, PCA is lossy compression: a reduction in the number of features used to represent the data. Which features to keep and which to remove? 8

PCA. Principal components are the axes along which the data varies the most: the 1st principal component exhibits the greatest variance, the 2nd principal component the next greatest variance, etc. The first principal component is the axis along which the data exhibits the greatest variance. The second component is orthogonal to the first and shows the second greatest variance of the data, the third is orthogonal to the first two, and so on. 9
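A minimal numpy sketch of this idea, assuming the standard formulation of PCA as an eigendecomposition of the data covariance matrix (the data here is random stand-in data, and the choice of n = 3 retained components is arbitrary):

```python
import numpy as np

# Principal components as eigenvectors of the covariance matrix, ordered by
# eigenvalue (the variance along each axis).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                 # stand-in data: 500 samples x 10 features
Xc = X - X.mean(axis=0)                        # centre the data
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)         # eigh returns ascending order
order = np.argsort(eigvals)[::-1]              # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep the first n components (n < m) and project: a reduced, lossy representation.
n = 3
Z = Xc @ eigvecs[:, :n]                        # compressed codes, shape (500, 3)
X_approx = Z @ eigvecs[:, :n].T + X.mean(axis=0)
```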

PCA. The 2nd component is orthogonal to the 1st, the 3rd to the first two, etc. [Figure: the original data plotted on its original axes, where the clusters are difficult to discriminate, versus the same data projected onto the 1st and 2nd principal components, where the clusters are easier to discriminate.] 10

PCA: The Hebb Rule. Oja (1992): a single neuron can be trained to find the 1st PC. Sanger (1989): in general, m neurons can be trained to find the first m PCs, using the generalized Hebbian algorithm (GHA). In terms of neural nets, a Hebb-like learning rule can be used to train a single neuron so that its weights converge to the first principal component of a distribution (Oja, 1992). In general, a layer of m neurons can be trained using the "generalized Hebbian algorithm" (GHA) to find the first m principal components of a set of input data (Sanger, 1989). 11
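A sketch of these update rules, assuming the usual textbook forms of Oja's rule (single neuron) and Sanger's GHA (a layer of m neurons); the learning rate, dimensions and synthetic data distribution are arbitrary assumptions for the sketch.

```python
import numpy as np

# Hebbian PCA updates (sketch). Oja's rule drives a single weight vector toward
# the 1st principal component; Sanger's generalized Hebbian algorithm (GHA)
# drives the m rows of W toward the first m components.

def oja_step(w, x, lr=0.01):
    y = w @ x
    return w + lr * y * (x - y * w)

def gha_step(W, x, lr=0.01):
    # W has shape (m, d); row i converges toward the i-th principal component.
    y = W @ x
    return W + lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)

rng = np.random.default_rng(0)
d, m = 8, 3
w = 0.1 * rng.normal(size=d)
W = 0.1 * rng.normal(size=(m, d))
# Synthetic zero-mean data whose variance decreases across dimensions.
X = rng.multivariate_normal(np.zeros(d), np.diag(np.linspace(3.0, 0.5, d)), size=20000)
for x in X:
    w = oja_step(w, x)
    W = gha_step(W, x)
```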

PCA & Image Compression. Haykin (1994) describes the GHA for image compression. Example: input image 256 x 256, each pixel with 256 grey levels; PCA network of 8 neurons, each with an 8 x 8 receptive field. Haykin describes an application of the GHA to image compression. A 256 by 256 image, where each pixel had 256 grey levels, was chosen for encoding. The image was coded using a linear feedforward network of 8 neurons, each with 64 inputs. Training was performed by presenting the net with data taken from 8x8 non-overlapping patches of the image. To allow convergence, the image was scanned from left to right and top to bottom, twice. After convergence, the 8 neurons represent the first 8 eigenvectors (Sanger's rule). 12

PCA & Image Compression. Haykin (1994), processing: the image is scanned top-to-bottom and left-to-right, and the neurons are allowed to converge. The 8 neurons then represent the first 8 eigenvectors. 13

PCA & Image Compression. Example input image. [Figure: the input image, from Haykin, S., "Neural Networks - A Comprehensive Foundation", 1994.] Once the weights of the network had converged, they were used to encode the image (shown above) for transmission. 14

PCA & Image Compression. Encoding details: each 8 x 8 block is multiplied by each neuron's weights, giving 8 coefficients, which are transmitted. In Haykin's example, 23 bits were needed, i.e. 23 bits encoded an 8x8 patch of 8-bit pixels. Transmission: each 8x8 block of the image was multiplied by the weights of each of the eight neurons (i.e. applied to each neuron). This generated 8 outputs, or coefficients, and the coefficient from each neuron was transmitted. The number of bits chosen to represent each coefficient was determined by the variance of that coefficient over the whole image (i.e. a larger number of bits is needed to represent something that varies a lot rather than something that varies a little). In the example described in Haykin, this required 23 bits to code the outputs of the 8 neurons; that is, 23 bits were required to encode each 8x8 block of pixels, where each pixel was originally represented using 8 bits. 15
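The encoding side might be sketched as below: each flattened 8x8 block is projected onto the 8 learned weight vectors to give 8 coefficients, and each coefficient is quantized with a number of bits related to its variance over the image. The weight matrix, the blocks and the proportional bit-allocation rule are assumptions of the sketch; the slides only state the 23-bit total.

```python
import numpy as np

# Encoding sketch: project each flattened 8x8 block onto the 8 weight vectors
# (rows of W) to get 8 coefficients, then quantize each coefficient with a
# number of bits roughly proportional to its variance over the image.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 64))                     # stand-in for the converged GHA weights
blocks = rng.random((1024, 64))                  # stand-in for the flattened 8x8 blocks

coeffs = blocks @ W.T                            # shape (1024, 8): 8 coefficients per block
variances = coeffs.var(axis=0)
bits = np.maximum(1, np.round(23 * variances / variances.sum())).astype(int)  # hypothetical split of the 23-bit budget

def quantize(c, nbits, lo, hi):
    # Uniform scalar quantizer on [lo, hi] with 2**nbits levels.
    levels = 2 ** nbits
    q = np.clip(np.round((c - lo) / (hi - lo) * (levels - 1)), 0, levels - 1)
    return q.astype(int)

lo, hi = coeffs.min(axis=0), coeffs.max(axis=0)
codes = np.stack([quantize(coeffs[:, i], bits[i], lo[i], hi[i]) for i in range(8)], axis=1)
```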

PCA & Image Compression. The weights of the 8 neurons. [Figure: the weights of the eight neurons, from Haykin, S., "Neural Networks - A Comprehensive Foundation", 1994.] The illustration above shows the weights obtained by each of the eight neurons. In the diagram, light areas depict positive weights and dark areas negative (or inhibitory) weights. 16

PCA & Image Compression. Decoding details: the neurons are used to decode the transmitted coefficients; the weights multiplied by the coefficients, and summed, reconstruct each 8 x 8 patch. Receiving (decoding): the image was reconstructed from the transmitted coefficients using the neurons again. This time, however, the weights of each neuron were multiplied by the corresponding coefficient and then added together to reconstruct each 8x8 patch of the image. 17
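The receiver side, continuing the (assumed) names from the encoding sketch above: each patch is rebuilt as a coefficient-weighted sum of the neurons' weight vectors, after undoing the quantization.

```python
import numpy as np

# Decoding sketch: rebuild each 8x8 patch as a weighted sum of the neurons'
# weight vectors, using the transmitted (dequantized) coefficients.
# W, bits, lo, hi follow the encoding sketch above and are illustrative only.
def dequantize(q, nbits, lo, hi):
    levels = 2 ** nbits
    return lo + q / (levels - 1) * (hi - lo)

def reconstruct_block(coeffs_8, W):
    # coeffs_8: the 8 coefficients for one block; W: the (8, 64) weight matrix.
    return coeffs_8 @ W          # shape (64,); reshape to (8, 8) for display

# e.g. patch = reconstruct_block(dequantized_coeffs, W).reshape(8, 8)
```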

PCA & Image Compression. [Figure: the transmission scheme, where an 8x8 patch of the image is fed through the eight neurons (1-8) to produce the eight transmitted coefficients.] The weights of each neuron represent one of the first eight principal components of the image data, obtained using Sanger's rule. 18

PCA & Image Compression. Example output image. [Figure: the input image and the reconstructed output image, from Haykin, S., "Neural Networks - A Comprehensive Foundation", 1994.] 19