Customizing Progressive JPEG for Efficient Image Storage Eddie Yan Kaiyuan Zhang Xi Wang Karin Strauss Luis Ceze HotStorage 17 July 11, 2017
2
2
2
2
2
2
2
2
2
Summary Today s image hosts need to store many images, but also at many different sizes (resolutions), increasing pressure on storage systems Leverage progressive image storage technology to reduce the effect of serving multiple resolutions of each image 3
Contents Intro & Related Work JPEG & Progressive JPEG Progressive JPEG for Dynamic Resizing Future Work & Conclusion 4
Facebook Stores Four Versions of Each Photo (2010) Facebook s photo store: handle many small files efficiently [Beaver OSDI 10] For each uploaded photo, Facebook generates and stores four images of different sizes, which translates to over 260 billion images and more than 20 petabytes of data. 5
Facebook Serves Different Image Sizes Depending on Context Facebook s photo cache: resizing on the fly + SSD caching [Huang SOSP 13] Facebook serves photos in many different forms to many different users. For instance, a desktop user with a big window will see larger photos than a desktop users with a smaller window who in turn sees larger photos than a mobile user. 6
FLICKR: No Capacity Upgrades for a Year with Resizing A Year Without A Byte [Flickr 2017 1] & Real-time Resizing of Flickr Images Using GPUs [Flickr 2017 2] 7
Goals for Image Storage image storage for large (social media services) should: 8
Goals for Image Storage image storage for large (social media services) should: offer low latency, high throughput [Beaver OSDI 10] [Huang SOSP 13] 8
Goals for Image Storage image storage for large (social media services) should: offer low latency, high throughput [Beaver OSDI 10] [Huang SOSP 13] serve multiple resolutions [Beaver OSDI 10] [Flickr 17 1 & 2 17] [Huang SOSP 13] 8
Goals for Image Storage image storage for large (social media services) should: offer low latency, high throughput [Beaver OSDI 10] [Huang SOSP 13] serve multiple resolutions [Beaver OSDI 10] [Flickr 17 1 & 2 17] [Huang SOSP 13] use minimal storage capacity [Google 17] [Horn NSDI 17] [Mozilla circa 14] 8
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2... 3... 9
JPEG: Quantization 1 2 less quantization 3...... 9
JPEG: Quantization 1 2 less quantization 3... more quantization... 9
JPEG: Lossless Compression (0, 2)( 3); (1, 2)( 3); (0, 2)( 2); (0, 3)( 6); (0, 2) (2); (0, 3)( 4); (0, 1)(1); (0, 2)( 3); (0, 1)(1); (0, 1)(1); (0, 3)(5); (0, 1) (1); (0, 2)(2); (0, 1)( 1); (0, 1)(1); (0, 1)( 1); (0, 2) (2); (5, 1)( 1); (0, 1)( 1); (0, 0). 10
Baseline JPEG: Scanline Order 11
Baseline JPEG: Scanline Order 12
Progressive JPEG 1 2... 3... 13
Progressive JPEG 1 2... 3... 13
Progressive JPEG 1 2... 3... each scan is a refinement pass 13
Progressive JPEG represent image data in the frequency domain, roughly in frequency order load data in frequency order, grouped by scans header scan 1 scan 2 scan 3 scan 4 scan 5 14
Progressive JPEG Example header scan 1 PSNR: 27.9 db 15
Progressive JPEG Example header scan 1 scan 2 PSNR: 29.8 db 16
Progressive JPEG Example header scan 1 scan 2 scan 3 PSNR: 31.9 db 17
Progressive JPEG Example header scan 1 scan 2 scan 3 scan 4 PSNR: 37.9 db 18
Progressive JPEG Example header scan 1 scan 2 scan 3 scan 4 scan 5 PSNR: inf db 19
Progressive JPEG Example header scan 1 scan 2 scan 3 scan 4 scan 5 PSNR: 31.9 db PSNR: inf db 20
Progressive JPEG Example header scan 1 scan 2 scan 3 scan 4 scan 5 21
Targeting Dynamic Resizing 22
Targeting Dynamic Resizing request( image.jpg, 500px) 22
Targeting Dynamic Resizing request( image.jpg, 500px) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) resize( image.jpg, 500px) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) resize( image.jpg, 500px) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) resize( image.jpg, 500px) send( image.jpg, 500px) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) resize( image.jpg, 500px) send( image.jpg, 500px) 22
Targeting Dynamic Resizing request( image.jpg, 500px) read( image.jpg, src) resize( image.jpg, 500px) you can cache this send( image.jpg, 500px) 22
Three Methods for Dynamic Resizing read( image.jpg, src) 23
Three Methods for Dynamic Resizing read source JPEG read( image.jpg, src) decode 23
Three Methods for Dynamic Resizing read source JPEG read progressive JPEG partially read( image.jpg, src) decode decode 23
Three Methods for Dynamic Resizing read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) decode decode decode 23
Three Methods for Dynamic Resizing Baseline Progressive JPEG Custom Progressive JPEG read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) decode decode decode 23
Our Approach Store only one size of each image, resize dynamically, and customize JPEGs to match target sizes 24
Using Custom Progressive JPEG find custom scan configuration predefined resolutions (a) write resolution to read offset mapping (b) saved data (c) read read necessary scans resize and encode (transcode) 25
Using Custom Progressive JPEG find custom scan configuration predefined resolutions (a) write resolution to read offset mapping (b) saved data (c) read read necessary scans resize and encode (transcode) 25
Using Custom Progressive JPEG find custom scan configuration predefined resolutions (a) write resolution to read offset mapping (b) saved data (c) read read necessary scans resize and encode (transcode) 25
Customizing Progressive JPEG 1 2... 3... 26
Customizing Progressive JPEG 1 2... 3..... 26
Tune Coefficients Included in Each Scan to Match Quality Targets Directly 1 2... 3... 27
Tune Coefficients Included in Each Scan to Match Quality Targets Directly 1 2... 3..... 27
Tune Coefficients Included in Each Scan to Match Quality Targets Directly 1 2... 3..... 10% 27
Tune Coefficients Included in Each Scan to Match Quality Targets Directly 1 2... 3..... 10% 25% 27
Tune Coefficients Included in Each Scan to Match Quality Targets Directly 1 2... 3..... 10% 25% 50% 27
Choosing Coefficients 28
Choosing Coefficients 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Choosing Coefficients best 28
Evaluation 29
Evaluation minimize required storage capacity 29
Evaluation minimize required storage capacity minimize amount of data read (bandwidth) 29
Evaluation minimize required storage capacity minimize amount of data read (bandwidth) limit decode-time overheads 29
Dynamic Resizing Saves Capacity over Static Schemes 30
Customizing Progressive Reduces Read Bandwidth Requirements 31
Where to Decode for Resizing? Baseline Progressive JPEG Custom Progressive JPEG read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) decode decode decode 32
Where to Decode for Resizing? Baseline Progressive JPEG Custom Progressive JPEG read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) decode decode decode 32
Where to Decode for Resizing? Progressive JPEG is more expensive to decode than baseline JPEG Baseline Progressive JPEG Custom Progressive JPEG read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) decode decode decode 32
Where to Decode for Resizing? Progressive JPEG is more expensive to decode than baseline JPEG Baseline Progressive JPEG Custom Progressive JPEG read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) decode decode decode 32
Where to Decode for Resizing? Progressive JPEG is more expensive to decode than baseline JPEG Baseline Progressive JPEG Custom Progressive JPEG read source JPEG read progressive JPEG partially read custom progressive JPEG partially read( image.jpg, src) Two Choices: decode decode decode 1. decode on client 2. decode on server 32
Where to Decode for Resizing? 33
Where to Decode for Resizing? Client Server 33
Where to Decode for Resizing? Client Server Input Truncated PJPEG Truncated PJPEG 33
Where to Decode for Resizing? Client Server Input Truncated PJPEG Truncated PJPEG Output Resized JPEG Resized JPEG 33
Where to Decode for Resizing? Client Server Input Truncated PJPEG Truncated PJPEG Output Resized JPEG Resized JPEG Baseline Input Resized JPEG Full Size JPEG 33
Where to Decode for Resizing? Client Server Input Truncated PJPEG Truncated PJPEG Output Resized JPEG Resized JPEG Baseline Input Resized JPEG Full Size JPEG Change in Input Resized JPEG to Truncated Progressive Full Size JPEG to Truncated Progressive 33
Decoding on the Server has Lower Overhead 34
Decoding on the Server has Lower Overhead 34
35
Minimize required storage capacity 35
Minimize required storage capacity Minimize amount of data read 35
Minimize required storage capacity Minimize amount of data read Limit decode-time overheads 35
dynamic resizing custom PJPEG +dynamic resizing Minimize required storage capacity Minimize amount of data read Limit decode-time overheads 35
dynamic resizing custom PJPEG +dynamic resizing Minimize required storage capacity Minimize amount of data read Limit decode-time overheads 35
dynamic resizing custom PJPEG +dynamic resizing Minimize required storage capacity Minimize amount of data read Limit decode-time overheads 35
dynamic resizing custom PJPEG +dynamic resizing Minimize required storage capacity Minimize amount of data read Limit decode-time overheads 35
Future Work we need accurate, perceptual, image quality metrics we should make this faster 36
Conclusion 37
Conclusion modern image storage services store a lot of images, at many resolutions 37
Conclusion modern image storage services store a lot of images, at many resolutions serving multiple resolutions is necessary, but wastes capacity 37
Conclusion modern image storage services store a lot of images, at many resolutions serving multiple resolutions is necessary, but wastes capacity we can do better by only reading the necessary data for each request and customizing image data layout 37
Thank You Questions? 38