Analysis and Implementation of Automatic Reassembly of File Fragmented Images Using Greedy Algorithms. By Lucas Shinkovich and Nate Jones

Outline
Introduction & Problem
Method
Algorithms
Problems & Limitations
Experimental Implementation
Future Work

Introduction & Problem
Files routinely become fragmented, that is, stored in non-contiguous blocks. This usually happens when a file is too large for one block, or when a block is deleted and then reused by a larger or smaller file.
Problem: Reassemble the fragmented blocks (image data only) without any prior knowledge of their original ordering.
Assume: No missing or corrupted blocks.

Method
Put fragments together based on a candidate weight value. The task generalizes to the graph problem of finding k vertex-disjoint paths in a complete graph, which is NP-complete. Our problem is much simpler because we know the first fragment (the header) of each image, and the header contains valuable information such as the image resolution, from which we can work out how many fragments the image contains.
Assume: Each fragment is at least one image width in size, so we only need to compare fragments along their top and bottom edges.
Find the concatenation of fragments that minimizes or maximizes the candidate weight values (depending on the weight used) to obtain the original image.
Three types of candidate weight values: Pixel Matching (PM), Sum of Differences (SoD), and Median Edge Detection (MED).
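The slides name the three candidate weights without defining them. The sketch below assumes the usual boundary-based reading: each weight compares the last row of one fragment with the first row of the next. PM counts identical samples (higher is better), SoD sums absolute differences (lower is better), and MED sums the errors of the median edge detector predictor known from JPEG-LS (lower is better). The `Row` type and all function names are hypothetical.

```python
from typing import Sequence

Row = Sequence[int]  # hypothetical: one row of pixel sample values

def pixel_matching(bottom: Row, top: Row) -> int:
    """PM: number of positions where the boundary samples are identical (maximize)."""
    return sum(1 for a, b in zip(bottom, top) if a == b)

def sum_of_differences(bottom: Row, top: Row) -> int:
    """SoD: sum of absolute sample differences across the boundary (minimize)."""
    return sum(abs(a - b) for a, b in zip(bottom, top))

def med_predict(left: int, above: int, above_left: int) -> int:
    """Median edge detector predictor, as used in JPEG-LS."""
    if above_left >= max(left, above):
        return min(left, above)
    if above_left <= min(left, above):
        return max(left, above)
    return left + above - above_left

def med_weight(bottom: Row, top: Row) -> int:
    """MED: sum of absolute prediction errors for the next fragment's first row (minimize)."""
    total = 0
    for x in range(1, len(top)):
        # Assumption: predict top[x] from its left neighbour and the two samples above it.
        predicted = med_predict(top[x - 1], bottom[x], bottom[x - 1])
        total += abs(top[x] - predicted)
    return total
```

For example, with `bottom = [10, 20, 30, 40]` and `top = [10, 22, 30, 35]`, PM is 2 and SoD is 7.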

Algorithms
Eight different reconstruction algorithms, distinguished by whether they build vertex-disjoint or non-vertex-disjoint paths, whether they run in parallel or serially, and which heuristic they use (greedy or enhanced greedy).
Greedy Heuristic: Start with header fragment h_i and reconstruct the image by choosing the best available fragment t (based on the candidate weight value). Add t to the reconstructed image path and repeat the process until the image is reconstructed. Time: O(n log n)
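A minimal sketch of the greedy heuristic, assuming fragments are lists of pixel rows, a lower weight is better (SoD), and ties are broken by fragment index. These representation choices and the names `sod` and `greedy_path` are illustrative assumptions, not taken from the slides. The sketch rescans the pool at every step rather than keeping candidates sorted, so it does not reproduce the stated time bound.

```python
from typing import Iterable, List, Sequence

Fragment = List[Sequence[int]]  # hypothetical: a fragment is a list of pixel rows

def sod(a: Fragment, b: Fragment) -> int:
    """Sum of absolute differences between a's last row and b's first row."""
    return sum(abs(x - y) for x, y in zip(a[-1], b[0]))

def greedy_path(fragments: List[Fragment], header: int,
                available: Iterable[int], length: int) -> List[int]:
    """Grow one image path from its header, always taking the cheapest next fragment."""
    path = [header]
    free = set(available)
    while len(path) < length and free:
        tail = fragments[path[-1]]
        # Assumption: ties are broken by the lower fragment index.
        best = min(free, key=lambda j: (sod(tail, fragments[j]), j))
        path.append(best)
        free.remove(best)
    return path
```

Here `length` stands for the fragment count derived from the header's resolution field.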

Algorithms (cont.)
1) Greedy sequential unique path (SUP): Uses the greedy heuristic to assemble fragments and creates vertex-disjoint paths (i.e., a fragment can only be used once). Problem: The result depends on the order in which the images are processed.
2) Greedy non-unique path (NUP): Uses the greedy heuristic to assemble fragments and creates non-vertex-disjoint paths (i.e., a fragment may be used any number of times).
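The difference between SUP and NUP can be sketched as follows. This is an illustration under assumptions: fragments are lists of pixel rows, SoD is the weight, and within a single image's path a fragment is used at most once even under NUP. All names are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence

Fragment = List[Sequence[int]]  # hypothetical: a fragment is a list of pixel rows

def sod(a: Fragment, b: Fragment) -> int:
    """Sum of absolute differences between a's last row and b's first row."""
    return sum(abs(x - y) for x, y in zip(a[-1], b[0]))

def greedy_path(fragments: List[Fragment], header: int,
                available: Iterable[int], length: int) -> List[int]:
    """Plain greedy heuristic: repeatedly append the cheapest remaining fragment."""
    path, free = [header], set(available)
    while len(path) < length and free:
        tail = fragments[path[-1]]
        best = min(free, key=lambda j: (sod(tail, fragments[j]), j))
        path.append(best)
        free.remove(best)
    return path

def sup(fragments: List[Fragment], headers: List[int],
        pool: Iterable[int], lengths: Dict[int, int]) -> Dict[int, List[int]]:
    """Sequential unique path: images are processed in order and consume the shared pool."""
    free = set(pool)
    paths = {}
    for h in headers:  # the result depends on this order
        paths[h] = greedy_path(fragments, h, free, lengths[h])
        free -= set(paths[h][1:])
    return paths

def nup(fragments: List[Fragment], headers: List[int],
        pool: Iterable[int], lengths: Dict[int, int]) -> Dict[int, List[int]]:
    """Non-unique path: every image draws from the full pool independently."""
    return {h: greedy_path(fragments, h, pool, lengths[h]) for h in headers}
```

With two headers that both prefer the same fragment, `sup` gives that fragment to whichever header is processed first, while `nup` assigns it to both images.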

Algorithms (cont.)
3) Greedy parallel unique path (PUP): Start with all of the image headers and find the best match for each header. We then pick the best header-fragment pair out of all the header-fragment best matches, add that fragment to its image, and repeat.
4) Greedy shortest path first (SPF UP): Attempts to gain the benefits of the NUP algorithms while remaining a UP algorithm. Assumes that the best reconstructed image is the one with the lowest average path cost. Runs greedy NUP across all of the images and computes each image's total cost (the sum of its candidate weight values). The total cost divided by the number of fragments is the average path cost. We remove the image with the lowest average path cost, together with its fragments, and then run the algorithm again until we have obtained all of the images.
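A sketch of the PUP selection loop, assuming SoD as the weight (lower is better), fragments represented as lists of pixel rows, and ties broken by header and fragment index. The names `pup`, `sod`, and `lengths` are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence

Fragment = List[Sequence[int]]  # hypothetical: a fragment is a list of pixel rows

def sod(a: Fragment, b: Fragment) -> int:
    """Sum of absolute differences between a's last row and b's first row."""
    return sum(abs(x - y) for x, y in zip(a[-1], b[0]))

def pup(fragments: List[Fragment], headers: List[int],
        pool: Iterable[int], lengths: Dict[int, int]) -> Dict[int, List[int]]:
    """Parallel unique path: all images grow together, one fragment per round."""
    paths = {h: [h] for h in headers}
    free = set(pool)
    while free:
        candidates = []
        for h, path in paths.items():
            if len(path) >= lengths[h]:
                continue  # this image is already complete
            tail = fragments[path[-1]]
            best = min(free, key=lambda j: (sod(tail, fragments[j]), j))
            candidates.append((sod(tail, fragments[best]), h, best))
        if not candidates:
            break  # every image is complete
        # Commit only the globally best (image, fragment) pair this round.
        _, h, best = min(candidates)
        paths[h].append(best)
        free.remove(best)
    return paths
```

Because only the single best pair is committed in each round, the outcome no longer depends on the order in which the headers are listed.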

Algorithms (cont.)
All of the algorithms mentioned (1-4) have a total running time of O(n^2 log n).
Enhanced Greedy Heuristic: Same as the greedy heuristic, except that before we commit to the best match t for the current fragment h_i, we first check whether t is a better match for another fragment.
Algorithms 5-8 are the same as algorithms 1-4, except that they use the enhanced greedy heuristic.
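One possible reading of the enhanced greedy check is sketched below. The slides do not say which "other fragments" are consulted or what happens when every candidate is rejected, so this sketch assumes the rivals are the other unplaced fragments and that the plain greedy choice is the fallback. The weight is SoD and all names are hypothetical.

```python
from typing import List, Sequence, Set

Fragment = List[Sequence[int]]  # hypothetical: a fragment is a list of pixel rows

def sod(a: Fragment, b: Fragment) -> int:
    """Sum of absolute differences between a's last row and b's first row."""
    return sum(abs(x - y) for x, y in zip(a[-1], b[0]))

def enhanced_pick(fragments: List[Fragment], tail: int, free: Set[int]) -> int:
    """Pick the next fragment for `tail`, skipping candidates that fit a rival better."""
    ranked = sorted(free, key=lambda j: (sod(fragments[tail], fragments[j]), j))
    for t in ranked:
        w = sod(fragments[tail], fragments[t])
        # Assumption: rivals are the other unplaced fragments that could precede t.
        rivals = (u for u in free if u != t)
        if all(sod(fragments[u], fragments[t]) >= w for u in rivals):
            return t
    # Assumption: if every candidate is claimed by a rival, use the plain greedy choice.
    return ranked[0]
```

In the test data used here, plain greedy would take fragment 1 (weight 1), but fragment 2 ends in a row that matches fragment 1 exactly, so the enhanced check passes over fragment 1 and returns fragment 2.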

Problems & Limitations
The assumptions are not realistic.
Assumption: Each fragment is at least one image width in size. This is not always the case.
Example: A 512x512 image takes 512 x 512 x 3 + 54 bytes = 786,486 bytes. Dividing that by 4096 bytes (the cluster size) gives 192.0132 clusters, so one fragment is left over with only 54 bytes, while one row of the image is 512 x 3 = 1536 bytes wide. How do we calculate candidate weights for fragments with different width sizes?
A similar problem arises when the image width is larger than the cluster size. In this case, we can no longer compare fragments by just the top and bottom.
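The arithmetic in this example can be checked with a few lines. The function name and parameter defaults are illustrative: 3 bytes per pixel, a 54-byte header, and a 4096-byte cluster as on the slide, with any row padding ignored.

```python
from typing import Tuple

def fragment_layout(width: int, height: int, bytes_per_pixel: int = 3,
                    header_bytes: int = 54, cluster: int = 4096) -> Tuple[int, int, int, int]:
    """Return (file size, full clusters, leftover bytes, bytes per image row)."""
    total = width * height * bytes_per_pixel + header_bytes
    full, remainder = divmod(total, cluster)
    row_bytes = width * bytes_per_pixel
    return total, full, remainder, row_bytes

total, full, remainder, row_bytes = fragment_layout(512, 512)
print(total, full, remainder, row_bytes)  # 786486 192 54 1536
```

The leftover fragment (54 bytes) is far shorter than one image row (1536 bytes), which is exactly the case the width assumption rules out.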

Problems & Limitations (cont.)
Assumption: No fragments are corrupt or missing. This is not realistic: if you have lost the file allocation table, you have most likely lost some data along with it. Missing regular fragments will cause only minor problems in some algorithms, but a missing header fragment makes that image's reconstruction impossible, because we lose vital header information such as the width of the image (which we need to calculate candidate weight values).
Partial Solution: The major problem is finding out which fragments are missing from which images. If we can solve this problem, we might be able to do some type of interpolation for the missing fragment, especially if it is at the end of the image.

Problems & Limitations (cont.)
Assumption: Access to raw pixel data (not clearly stated in the paper). We need access to raw pixel data to get correct candidate weight values. Compressed image formats such as JPEG need to be decoded before we have access to the pixel data. Because we might not have access to all of the 8x8 blocks within one fragment, we will not be able to perform the inverse DCT successfully to recover the pixel data. Newer research ideas, such as image encryption, will have the same problem.

Problems & Limitations (cont.)
Problems with the algorithm itself:
The greedy algorithm is presented without proof of the greedy-choice property or optimal substructure.
The greedy choice does not always find the shortest path (compare Dijkstra's algorithm).
Not very efficient: a lot of sorting and tree allocation (linear sort?).
What do we do with two fragments that have the same candidate weight value?

Experimental Implementation
Written in C#. Delegates allowed us to easily plug in different weight functions, comparators, etc. We were both familiar with C#.
The GUI was important because it allows us to quickly and easily put together new experiments with new image sets and new parameters.
We set up our experimental implementation of their experiment to be a place to easily test future work on the subject.

Fragment.cs

ReassembleExperiment.cs

Reporting Results
Graph of average color intensity, since color intensity is a major factor in fragment weights.
Shows how similar an image is.
Doesn't work as well for small images.
Red: Original Image. Blue: Reconstruction. Green: Difference.

Results 512x266

Results Reconstruction SOD

Results Reconstruction PMA

Results Reconstruction MED

Results Big Image

Results Big Image SOD

Jerusalem Tower Color Shifting

Jerusalem Tower PMA

Jerusalem Tower - MED

Future Work
Graph the candidate weight values. See if spikes in the graph correspond to problems in the resultant image. Can this help us find where the algorithm falls apart?
Attempt a mixed approach combining the Enhanced Greedy and Greedy heuristics. This might improve the average-case time bound of the Enhanced Greedy heuristic. It is doubtful that it will have a positive effect on the reconstructed image.
Attempt reconstruction of images with missing data. How much missing data makes the image irretrievable?

Questions?

References
N. Memon and A. Pal, "Automated reassembly of file fragmented images using greedy algorithms," IEEE Trans. Image Processing, vol. 15, no. 2, pp. 385-393, 2006.