Forensic analysis of JPEG image compression Visual Information Privacy and Protection (VIPP Group) Course on Multimedia Security 2015/2016
Introduction
Summary: Introduction; The JPEG (Joint Photographic Experts Group) standard; Forensic analysis of JPEG images
What is JPEG? JPEG (Joint Photographic Experts Group) is an international standard for lossy image compression, released in 1992. JPEG is still today one of the most popular image formats on the Web: it is used by 73.5% of all websites. Source: https://w3techs.com/technologies/overview/image_format/all (updated April 2016)
What is JPEG? JPEG is used in many applications. It is particularly suitable for the compression of photographs and paintings of realistic scenes with smooth variations of tone and color. With respect to the also widely diffused GIF format, JPEG ensures better visual quality of the compressed images for the same file size.
How does it work? JPEG achieves a good trade-off between visual quality and compression efficiency by exploiting a number of algorithms:
o Color space transform and subsampling
o Discrete Cosine Transform (DCT)
o Quantization
o Zig-zag ordering
o Differential Pulse Code Modulation (DC component)
o Run Length Encoding (AC components)
o Entropy coding (Huffman or arithmetic)
JPEG baseline encoding. Main steps:
1. Discrete Cosine Transform of each 8x8 pixel block
2. Scalar quantization
3. Zig-zag scan to exploit redundancy
4. Differential Pulse Code Modulation (DPCM) on the DC component and Run Length Encoding of the AC components
5. Entropy coding (Huffman)
The resulting stream stores a header, the coding and quantization tables, and the entropy-coded data. Decoding applies the same steps in reverse order.
Color space transform: RGB to YCbCr. RGB is not the only color space in which an image can be represented; there are several others, each with its own properties. A popular color space in image compression is YCbCr, which:
o separates luminance (Y) from color information (Cb, Cr)
o allows Y and (Cb, Cr) to be processed separately (not possible in RGB)
The RGB to YCbCr conversion is linear (JFIF convention):
Y  =  0.299 R + 0.587 G + 0.114 B
Cb = -0.1687 R - 0.3313 G + 0.5 B + 128
Cr =  0.5 R - 0.4187 G - 0.0813 B + 128
The YCbCr to RGB conversion is its (linear) inverse.
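A minimal sketch of the conversion, assuming the JFIF/ITU-R BT.601 full-range coefficients (the convention used by baseline JPEG files):

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr), JFIF/BT.601 full-range."""
    y  =  0.299    * r + 0.587    * g + 0.114    * b
    cb = -0.168736 * r - 0.331264 * g + 0.5      * b + 128
    cr =  0.5      * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion; results may need clamping to [0, 255] in practice."""
    r = y + 1.402    * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772    * (cb - 128)
    return r, g, b
```

For example, pure red (255, 0, 0) maps to a fairly dark luminance (Y around 76) and a maximal Cr, which is why the Y channel alone looks like a grayscale photograph.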
Color space transform example
Color space transform: subsampling. Y is taken for every pixel, while Cb and Cr are taken once for each block of 2x2 pixels. Data size is reduced by half without significant losses in visual quality.
Example: a 64x64 block. Without subsampling, one must take 64^2 pixel values for each color channel: 3 x 64^2 = 12288 values (1 byte per value). JPEG takes 64^2 values for Y and 2 x 32^2 values for chroma: 64^2 + 2 x 32^2 = 6144 values (1 byte per value).
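The 2x2 chroma averaging can be sketched as follows (JPEG does not mandate how the subsampled value is obtained; plain averaging is assumed here):

```python
def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane (4:2:0 subsampling).
    `chroma` is a list of rows; height and width are assumed even."""
    h, w = len(chroma), len(chroma[0])
    return [[(chroma[y][x] + chroma[y][x + 1] +
              chroma[y + 1][x] + chroma[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# For a 64x64 block: 64*64 Y samples + 2 * 32*32 chroma samples
total = 64 * 64 + 2 * 32 * 32   # 6144 values, half of 3 * 64 * 64 = 12288
```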
Discrete Cosine Transform (DCT). Transformed data are more suitable for compression (e.g. skewed probability distribution, reduced correlation). The transform itself is lossless.
Forward DCT:
F(u,v) = (1/4) C(u) C(v) Σ_{x=0}^{7} Σ_{y=0}^{7} f(x,y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16], for u,v = 0,...,7
Inverse DCT:
f(x,y) = (1/4) Σ_{u=0}^{7} Σ_{v=0}^{7} C(u) C(v) F(u,v) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16], for x,y = 0,...,7
where C(k) = 1/√2 for k = 0, and C(k) = 1 otherwise.
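The forward/inverse pair above can be checked with a direct (unoptimized) implementation that mirrors the formulas term by term; a minimal sketch:

```python
import math

def dct2(block):
    """Forward 8x8 DCT, exactly as in the JPEG formula."""
    C = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0
    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for x in range(8) for y in range(8))
            F[u][v] = 0.25 * C(u) * C(v) * s
    return F

def idct2(F):
    """Inverse 8x8 DCT; reconstructs the block up to float rounding."""
    C = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0
    f = [[0.0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            s = sum(C(u) * C(v) * F[u][v]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for u in range(8) for v in range(8))
            f[x][y] = 0.25 * s
    return f
```

A round trip `idct2(dct2(block))` returns the original block up to floating-point error, confirming that the loss in JPEG comes from quantization, not from the transform.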
Quantization. Goal: reduce the number of bits per sample. Each coefficient F(u,v) of an 8x8 DCT block is divided by Q(u,v), the quantization step at frequency (u,v) of an 8x8 quantization matrix Q.
Example: F = 45 = (101101)_2 (6 bits)
o Truncate to 4 bits (Q = 4): (1011)_2 = 11 (de-quantize: 11 x 4 = 44 against 45)
o Truncate to 3 bits (Q = 8): (101)_2 = 5 (de-quantize: 5 x 8 = 40 against 45)
Quantization error is the main reason why JPEG compression is LOSSY.
Quantization. Each F(u,v) in an 8x8 block is divided by the constant value Q(u,v). Higher values in the quantization matrix Q achieve better compression at the cost of visual quality.
How to choose Q? The eye is more sensitive to low frequencies (upper left corner of the 8x8 block) and less sensitive to high frequencies (lower right corner). Idea: quantize the high frequencies more (large quantization step) and the low frequencies less.
The values of Q are controlled by a parameter called Quality Factor (QF), which ranges from 100 (best quality) down to 1 (extremely low quality).
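The slides do not specify how QF maps to the table entries; a widely used convention is the IJG (libjpeg) scaling, sketched here together with the standard luminance table for QF = 50 (Annex K of the JPEG standard):

```python
# Standard JPEG luminance quantization table for QF = 50.
BASE_Q = [
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
]

def scale_table(qf, base=BASE_Q):
    """Scale the base table to a quality factor, following the IJG (libjpeg)
    convention: QF < 50 enlarges the steps, QF > 50 shrinks them."""
    scale = 5000 // qf if qf < 50 else 200 - 2 * qf
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base]
```

With this convention QF = 50 reproduces the base table exactly, and QF = 100 drives every step down to 1 (nearly lossless quantization).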
Quantization: luminance Quantization table Q for QF = 50
Quantization: chrominance. Quantization table Q for QF = 50. Color can be quantized more coarsely due to the reduced sensitivity of the Human Visual System (HVS) to chrominance.
Quantization: luminance and chrominance. An example of quantization table Q for QF = 70. The quantization is weaker (smaller steps) at larger QF.
Uncompressed (20 MB) vs JPEG at QF = 100 (9 MB), QF = 60 (1.3 MB), QF = 20 (0.6 MB), and QF = 5 (0.4 MB)
Zig-zag ordering of quantized coefficients For further processing, each 8x8 block is converted to a 1x64 vector To do so, JPEG adopts a method called zig-zag scan, which packs together low, medium and high frequency coefficients
Zig-zag ordering of quantized coefficients. Packing coefficients in a clever way: the scan starts from the low frequencies (upper left), passes through the medium frequencies, and ends at the high frequencies (lower right). The high-frequency coefficients are typically very small or zero, so for RLE it is good to have them packed together.
Zig-zag scan order of the 8x8 block:
 0  1  5  6 14 15 27 28
 2  4  7 13 16 26 29 42
 3  8 12 17 25 30 41 43
 9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
With a normal (e.g. column-wise) ordering the frequencies would be mixed; the zig-zag scan sorts them roughly from low to high.
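The zig-zag scan can be generated programmatically by walking the anti-diagonals of the block, alternating direction; a minimal sketch:

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in JPEG zig-zag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):                        # s = row + col
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # even anti-diagonals are traversed bottom-left -> top-right
        order.extend(diag if s % 2 else reversed(diag))
    return order
```

Mapping each (row, col) back to its position in the scan reproduces the index matrix shown above, e.g. the first row of the block lands at positions 0, 1, 5, 6, 14, 15, 27, 28.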
DPCM on DC component. The DC component is large and can assume many different values. However, the difference between the DCs of two adjacent blocks is often small. To save bits, encode the difference from the DC of the previous 8x8 block.
o This procedure is called Differential Pulse Code Modulation (DPCM)
DPCM on DC component. Pixel values are roughly uniformly distributed in [0, 255], whereas the DPCM differences follow a Laplacian distribution sharply peaked at 0. The entropy of the difference image is therefore much smaller than that of the original image.
RLE on AC components. The 1x64 vectors contain many zeros, more so towards the end of the vector.
o Entries towards the end of the vector correspond to higher-frequency DCT components, which tend to capture less of the content.
Encode each run of 0s as a (skip, value) pair, where skip is the number of zeros and value is the next non-zero coefficient.
o Send (0,0) as the end-of-block sentinel value.
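A minimal sketch of the (skip, value) encoding described above (real JPEG actually pairs the run length with the SIZE category of the value and limits runs to 16, but the principle is the same):

```python
def rle_ac(coeffs):
    """Run-length encode the AC coefficients of a zig-zag scanned block
    as (skip, value) pairs, with (0, 0) as the end-of-block marker."""
    pairs, skip = [], 0
    for c in coeffs:
        if c == 0:
            skip += 1
        else:
            pairs.append((skip, c))
            skip = 0
    pairs.append((0, 0))   # EOB: nothing but zeros remains
    return pairs
```

For instance, a vector starting 5, 0, 0, -3, 0, 0, 0, 2 and then all zeros compresses to just four pairs.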
Entropy coding. DC components are differentially coded as (SIZE, Value).
o The code for a Value is derived from the following table:
SIZE  Values                              Codes
0     0                                   ---
1     -1, 1                               0, 1
2     -3, -2, 2, 3                        00, 01, 10, 11
3     -7, ..., -4, 4, ..., 7              000, ..., 011, 100, ..., 111
4     -15, ..., -8, 8, ..., 15            0000, ..., 0111, 1000, ..., 1111
...   ...                                 ...
11    -2047, ..., -1024, 1024, ..., 2047  ...
Entropy coding. DC components are differentially coded as (SIZE, Value). The code for a SIZE is derived from the following table:
SIZE  Length  Code
0     2       00
1     3       010
2     3       011
3     3       100
4     3       101
5     3       110
6     4       1110
7     5       11110
8     6       111110
9     7       1111110
10    8       11111110
11    9       111111110
Example: a DC component is 40 and the previous DC component is 48. The difference is -8, which is coded as 1010111: 101 is the code for SIZE = 4 (the number of bits needed for -8), and 0111 is the value code for -8 from the Size-and-Value table.
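The example above (difference = -8 coded as 1010111) can be reproduced with a short sketch; the SIZE codes below are taken from the table above, and negative values are coded as the one's complement of their magnitude:

```python
# Huffman codes for SIZE, from the table above (a typical luminance DC table).
SIZE_CODE = {0: '00', 1: '010', 2: '011', 3: '100', 4: '101', 5: '110',
             6: '1110', 7: '11110', 8: '111110', 9: '1111110',
             10: '11111110', 11: '111111110'}

def encode_dc_diff(diff):
    """Encode a DC difference as SIZE code followed by value bits.
    Negative values use the one's complement of the bits of |diff|."""
    if diff == 0:
        return SIZE_CODE[0]                 # SIZE 0 carries no value bits
    size = abs(diff).bit_length()
    bits = format(abs(diff), f'0{size}b')
    if diff < 0:                            # flip the bits for negative values
        bits = ''.join('1' if b == '0' else '0' for b in bits)
    return SIZE_CODE[size] + bits
```

`encode_dc_diff(-8)` yields the 7-bit string 1010111 from the slide: 101 for SIZE = 4 followed by 0111 for the value.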
Forensic Analysis of JPEG images
JPEG compression footprints. Like any other image processing operation, JPEG leaves traces in the image, especially at low quality factors.
o Such traces can be exploited to gather useful information about the image.
Some JPEG artifacts are immediately identified:
o Blocking, due to block discontinuities
o Ringing on edges, due to the DCT
o Graininess, due to coarse quantization
o Blurring, due to high-frequency removal
Other (statistical) alterations are much more subtle to identify!
Blocking artifacts. Processing each 8x8 block independently introduces discontinuities along the block boundaries, thus making the image tiling visible.
Ringing artifacts. Spurious signals near sharp transitions.
o Visually, they appear as bands or ghosts
o Particularly evident along edges and in text images
Graininess artifacts Particularly evident as dots along the edges
Blurring artifacts. Removing high-frequency DCT coefficients increases the smoothness of the image, retaining shapes but making textures less distinguishable.
o The human eye is particularly good at spotting smoothness
Double JPEG compression: footprints. Double JPEG compression occurs when an image is JPEG compressed first with QF1 and then JPEG compressed again with QF2. It leaves statistical footprints, due to double quantization. In multimedia forensics, several approaches have been proposed to reveal the footprints (periodic artifacts) left by double compression.
Statistical footprints: double quantization. Why is it important to understand whether an image has been JPEG compressed (quantized) twice?
Suppose you took this nice picture with your camera. Imagine that this picture did not undergo any compression (a TIFF image, for example).
Download an image from the Internet. It is very likely a .jpg file, i.e. JPEG compressed with a certain QF. Start your favorite image editing software.
Create a fake, realistic and deceptive image. Save your effort as JPEG How can one reveal your manipulation?
By observing that this region has been quantized twice (once in the image you downloaded and once when you saved the fake), while all the rest has been quantized only once (when you saved the fake).
Single quantization (SQ). Quantization is the point-wise operation Qa(x) = floor(x / a), where:
o a is a strictly positive integer called the quantization step
o floor(.) approximates the value to the largest previous integer
De-quantization brings the quantized values back to their original range: x' = a * Qa(x).
Qa is not invertible because of the truncation operation.
Double quantization (DQ). Double quantization is again a point-wise operation: Qab(x) = floor( floor(x / b) * b / a ), where:
o b and a are the quantization steps of the first and second quantization, respectively
Double quantization can be represented as a sequence of three steps:
1. Quantization with step b
2. De-quantization with step b
3. Quantization with step a
Double quantization footprints (1/2). When a < b, some bins are empty (holes). This happens because the second quantization re-distributes the quantized coefficients into more bins than the first quantization. Consider a signal whose samples are normally distributed in [0, 127]: compare the histogram of the signal quantized with step 2 against the histogram of the same signal quantized with step 3 followed by step 2; the latter exhibits periodic empty bins.
Double quantization footprints (2/2). When a > b, some bins contain more samples than neighbouring bins. This happens because even bins receive samples from four original bins, while odd bins receive samples from only two. Consider the same signal, now quantized with step 3: compare its histogram against that of the signal quantized with step 2 followed by step 3; the latter exhibits periodic peaks.
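Both footprints can be reproduced with a short simulation on synthetic uniform samples (a sketch, not real DCT coefficients):

```python
import random
from collections import Counter

def double_quantize(x, b, a):
    """Quantize with step b, de-quantize, then re-quantize with step a:
    Qab(x) = floor(floor(x / b) * b / a)."""
    return (x // b) * b // a

rng = random.Random(0)
samples = [rng.randint(0, 127) for _ in range(20000)]

# First step b = 3, then a = 2 (a < b): holes, since no input can ever
# land on output bins congruent to 2 mod 3.
holes = Counter(double_quantize(x, 3, 2) for x in samples)

# First step b = 2, then a = 3 (a > b): periodic peaks, since even output
# bins collect samples from twice as many input values as odd ones.
peaks = Counter(double_quantize(x, 2, 3) for x in samples)
```

Printing the two histograms shows exactly the artifacts described in these two slides: `holes` has empty bins at 2, 5, 8, ..., while in `peaks` every even bin towers over its odd neighbours.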
Double JPEG compression forensics. Double quantization occurs when an image is JPEG compressed first with QF1 and then JPEG compressed again with QF2 (recall: the QF rules the choice of the quantization table).
Rule of thumb: typically the first quality factor is lower than the second (QF1 < QF2). This is the most frequent case in practice, and it also allows more reliable detection of double JPEG compression.
Detection of double JPEG compression. Image forensics proposes several detectors of double JPEG compression:
o Huang, Fangjun, Jiwu Huang, and Yun Qing Shi. "Detecting double JPEG compression with the same quantization matrix." IEEE Transactions on Information Forensics and Security 5.4 (2010): 848-856.
o Bianchi, Tiziano, and Alessandro Piva. "Detection of nonaligned double JPEG compression based on integer periodicity maps." IEEE Transactions on Information Forensics and Security 7.2 (2012): 842-848.
o Pevný, Tomáš, and Jessica Fridrich. "Detection of double-compression in JPEG images for applications in steganography." IEEE Transactions on Information Forensics and Security 3.2 (2008): 247-258.
o Bianchi, Tiziano, and Alessandro Piva. "Detection of non-aligned double JPEG compression with estimation of primary compression parameters." IEEE International Conference on Image Processing (ICIP), 2011.
o Lukáš, Jan, and Jessica Fridrich. "Estimation of primary quantization matrix in double compressed JPEG images." Proc. Digital Forensic Research Workshop, 2003.
o Fu, Dongdong, Yun Q. Shi, and Wei Su. "A generalized Benford's law for JPEG coefficients and its applications in image forensics." Electronic Imaging 2007, International Society for Optics and Photonics, 2007.
o He, Junfeng, et al. "Detecting doctored JPEG images via DCT coefficient analysis." Computer Vision - ECCV 2006, Springer, 2006: 423-435.
o Popescu, Alin C., and Hany Farid. "Statistical tools for digital forensics." Information Hiding, Springer, 2004.
One possible approach (1/4). Use machine learning techniques (Support Vector Machines) to build a detector that distinguishes between single-quantized histograms (without holes) and double-quantized histograms (with holes).
What is an SVM? A supervised learning methodology that analyzes data for classification: given a set of labelled training examples (each marked as belonging to one of two categories), an SVM training algorithm builds a classifier that assigns new examples to one category or the other.
One possible approach (2/4). Step 1: preparation of the image data sets. Gather a rather large number of uncompressed (TIFF) images (~500-1000).
o Compress each image once with a relatively low QF (e.g. 70) to create the examples of the first class (C1)
o Compress each image twice, first with QF = 70 (e.g.) and then with a larger QF (e.g. 90), to create the examples of the second class (C2)
o (Look for the peak-and-gap artifact!)
One possible approach (3/4). Step 2: compute histograms of DCT coefficients.
For each image of the single-quantized class (C1):
o Divide the image into 8x8 blocks and compute the DCT of each block
o Compute 64 DCT histograms (1 DC, 63 AC) and concatenate them all
o This vector is fed to the SVM as an example of the first class
For each image of the double-quantized class (C2):
o Divide the image into 8x8 blocks and compute the DCT of each block
o Compute 64 DCT histograms (1 DC, 63 AC) and concatenate them all
o This vector is fed to the SVM as an example of the second class
One possible approach (4/4). Step 3: train a Support Vector Machine. Choose 90% of the images of each class to train the SVM (with N-fold cross-validation).
o Use the LIBSVM MATLAB toolbox (https://www.csie.ntu.edu.tw/~cjlin/libsvm/)
Step 4: test the above Support Vector Machine. Use the remaining 10% of the images of each class to evaluate the classification accuracy.
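An end-to-end toy version of the pipeline, with synthetic coefficients instead of real images and a nearest-centroid classifier as a stdlib-only stand-in for the SVM (the approach in these slides uses LIBSVM on real image histograms):

```python
import random
from collections import Counter

def hist_feature(samples, bins=64):
    """Normalized histogram of quantized coefficients: the feature vector."""
    c, n = Counter(samples), len(samples)
    return [c[i] / n for i in range(bins)]

def make_image(rng, double):
    """Simulate one 'image' as 500 coefficients: single quantized (step 2)
    or double quantized (step 3 then step 2, which leaves holes)."""
    xs = [rng.randint(0, 127) for _ in range(500)]
    return [(x // 3) * 3 // 2 for x in xs] if double else [x // 2 for x in xs]

def centroid(vecs):
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def dist2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

rng = random.Random(1)
# "Training": 40 images per class; the classifier is the nearest class centroid.
train = [(hist_feature(make_image(rng, d)), d) for d in (0, 1) for _ in range(40)]
c0 = centroid([f for f, d in train if d == 0])
c1 = centroid([f for f, d in train if d == 1])

# "Testing": 10 fresh images per class.
tests = [(hist_feature(make_image(rng, d)), d) for d in (0, 1) for _ in range(10)]
acc = sum((dist2(f, c1) < dist2(f, c0)) == d for f, d in tests) / len(tests)
```

Because the holes make the double-quantized histograms nearly orthogonal to the single-quantized ones, even this trivial classifier separates the two classes; replacing it with an SVM and the synthetic coefficients with real per-frequency DCT histograms yields the procedure of Steps 1-4.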