The Sensitivity Matrix

Size: px

Start display at page:

Download "The Sensitivity Matrix"

Norah Carson
5 years ago
Views:

1 The Sensitivity Matrix Integrating Perception into the Flexcode Project Jan Plasberg Flexcode Seminar Lannion June 6, 2007 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 1

2 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 2 Outline Flexcode in a Nutshell The Sensitivity Matrix Coding with the Sensitivity Matrix Open Issues

3 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 3 Who? Ericsson KTH Nokia RWTH Aachen France Telecom

4 The Problem Heterogeneity of networks increasing Networks inherently variable (mobile users) But: Coders not designed for specific environment Coders inflexible (codebooks and FEC) Feedback channel underutilized SIP - Sound and Image Processing Lab, EE, KTH Stockholm 4

5 Adaptation and Coding adaptive coding flexible set of models, fixed quantizer AMR CELP with FEC fixed model fixed quantizer PCM rigid SIP - Sound and Image Processing Lab, EE, KTH Stockholm 5

6 The Tools Tools include Models of source, channel, receiver High-rate quantization theory Multiple description coding (MDC) Iterative source-channel decoding Distortion measures using the sensitivity matrix SIP - Sound and Image Processing Lab, EE, KTH Stockholm 6

7 Outline Flexcode in a Nutshell The Sensitivity Matrix Why, what and how? What can it tell us? Coding with the Sensitivity Matrix Open Issues SIP - Sound and Image Processing Lab, EE, KTH Stockholm 7

8 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 8 Human Auditory Masking Simultaneous masking (frequency masking)

9 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 9 Human Auditory Masking Non-simultaneous masking (time masking) Not accounted for by simple auditory models Coding standards use (heuristic) work-arounds

10 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 10 Motivation Merge the worlds of auditory modeling and audio coding Advanced auditory models too complex for coding x AM xˆ AM + d( x,ˆ x) perceptual distortion criterion

11 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 11 The Sensitivity Matrix Sensitivity Matrix M from Taylor expansion 1 H d( x, xˆ) ( x xˆ) M x ( x) ( x xˆ), d( x, xˆ) 2 0 Complexity problem solved! [ M x ( x)] i, j = 2 d( x, xˆ) xˆ xˆ i j xˆ = x Analysis of model properties using linear algebra Masking curve Subspace-based analysis

12 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 12 Application to an Auditory Model A typical auditory model (here: the Dau model) channel m x Filterbank Stage 1 Stage 2 L Stage K y Distortion is sum of per-channel distortions

13 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 13 Application to an Auditory Model x Filterbank Stage 1 Stage 2 L Stage K y Find sensitivity matrix for model output y per channel m d ( m) ( y, yˆ) = 1 ( y 2 yˆ) H M ( m) y ( y) ( y yˆ) here ( m) y M ( y) = 2I

14 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 14 Application to an Auditory Model x Filterbank Stage 1 Stage 2 L Stage K y Each model stage linearized by respective Jacobian z J z (z) y

15 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 15 Application to an Auditory Model x Filterbank Stage 1 Stage 2 L Stage K y Get sensitivity matrix for channel m in input signal domain from chain rule; M ( m) x H ( m) ( x) = 2 J k J k k ( m) k

16 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 16 Application to an Auditory Model Sensitivity Matrix is sum of per-channel matrices Filterbank Stage 1 Stage 2 Stage K L y x = = m k m k H k m k m m x x ) ( ) ( ) ( 2 ) ( ) ( J J x M x M

17 Validity of the Approximation Narrow-band music + white noise at different SNR (10-60 db), correlation > 0.9 d (db) SM coding possible d Dau (db) SIP - Sound and Image Processing Lab, EE, KTH Stockholm 17

18 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 18 Analysis: Fourier Transform Sensitivity matrix M X (X) for frequency domain vectors, with F = DFT in matrix notation H ( x xˆ) = F ( X Xˆ ) M ( X) = F M ( x) X x F H

19 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 19 Analysis: Masking Curve Assume masking threshold is at d( X,Xˆ ) = 1 Masking curve at frequency i: gain for unit distortion in DFT bin i g i u to approach masking threshold 1 = g i [ 0 L 1 L ] H i = 0 = 1 H u i M X ( X) u 2 2 [ M ] X ( X) i, i pos. i i g i

20 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 20 Example 1: Simultaneous Masking Masking curve for 50 db sinusoid at 1000 Hz sensitivity matrix for Dau et al. - spectral model (v.d.par, 2002) O human listeners (Zwicker, 1982) db MASK F f (Hz)

21 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 21 Example 2: Non-Simultaneous Masking Subspace analysis of M: low sensitivity space signal noise projection t (ms)

700 200 signal WMSE-coder f (Hz) F 700 Sens.

22 A Spectro-Temporal Experiment Spectrograms for two sinusoids (200 Hz and 700 Hz) coded with an APCM coder f (Hz) F f (Hz) F signal WMSE-coder f (Hz) F 700 Sens.-coder t (ms) T SIP - Sound and Image Processing Lab, EE, KTH Stockholm 22

23 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 23 Outline Flexcode in a Nutshell The Sensitivity Matrix Coding with the Sensitivity Matrix Open Issues

24 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 24 Problem Sensitivity Matrix generally not diagonal in typical coding domains (e.g., time, frequency, etc.) distortion not single-letter: 1 H d( x, xˆ) ( x xˆ) M x ( x) ( x xˆ) 2 no analytical solution known for optimal coding so far only trained VQ available (undoable for large dimensions)!

25 Solution Outline Sensitivity Matrix diagonal in perceptual domain transformsignal vectors into perceptual domain (W)MSE-optimal coding in perceptual domain results in minimum perceptual distortion Optimal Transform: T opt F ( x) F ( x) = cm ( x) opt x SIP - Sound and Image Processing Lab, EE, KTH Stockholm 25

26 Making it Practical Problem: optimal transform depends on x not known at decoder! Solution: Parameterize transform as DCT + diagonal weights F ( x) = W T f DCT W t combines DCT coding gain with perceptual transform! Model-driven approach to temporal noise shaping (TNS) SIP - Sound and Image Processing Lab, EE, KTH Stockholm 26

27 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 27 Examples Original Freq. weights F ( x) = W f T DCT Time-freq. weights F ( x) = W T f DCT W t

28 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 28 Outline Flexcode in a Nutshell The Sensitivity Matrix Coding with the Sensitivity Matrix Open Issues

29 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 29 Complexity Main complexity lies here: ( m) M( x) = 2 J k J m k k ( m) k Can we make multiplications sparse? Can we reduce the size of the matrices? H

30 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 30 Conclusion Using auditory models in coding is possible General method Direct applicability in AbS structures (CELP) Solution outline for transform coding

31 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 31 The End Thank you!

The MPEG-4 General Audio Coder

The MPEG-4 General Audio Coder Bernhard Grill Fraunhofer Institute for Integrated Circuits (IIS) grl 6/98 page 1 Outline MPEG-2 Advanced Audio Coding (AAC) MPEG-4 Extensions: Perceptual Noise Substitution