Interpretable Compressed Domain Video Annotation
TU Berlin IC / Fraunhofer HHI, TU Berlin IDA

Large databases (YouTube, Netflix)
Internet traffic by 2019: 1.6 zettabyte
 - 80% video data (compressed, multimodal)
 - 20% other data
New streaming applications (autonomous driving, Industry 4.0)
Computationally heavy methods (deep learning)
Requirements:
 - Compressed domain analysis
 - Efficient (generic) representation
 - Multimodal integration
 - Interpretable analysis
Solutions:
 - Motion vector based models
 - Fisher Vectors
 - Multi-stream networks
 - LRP & Deep Taylor analysis
2013 Berlin Big Data Center All Rights Reserved

Projects done
Compressed Domain Video Analysis:
 - Motion vector histograms & Fisher Vector representation
 - Tracking in compressed domain based on Markov Random Field (MRF)
 - Robustness of pixel domain vs. compressed domain methods
 - Multi-stream convolutional neural networks + LSTM
Interpretable machine learning:
 - Layer-wise Relevance Propagation
 - Deep Taylor Decomposition
 - Evaluating the visualizations
 - Application to images, videos, text, time series
Scalable Retrieval:
 - Multi-purpose Locality Sensitive Hashing (mplsh)
 - Similarity Search in Compressed Domain
Publications: PLOS15, MMSP16, EUVIP16, JMLR16, ICANN16, ICIP16, ICML16, NIPS16, TNNLS16, CVPR16, ICISA16, ICIP216, GCPR16, ACL16, JNM16

Interpretable Compressed Domain Video Annotation

Compressed domain analysis requires only partial decoding of the video: motion vectors, transform coefficients, block coding modes, etc.
Direct analysis of compressed videos reduces:
 - storage and bandwidth requirements
 - computational overhead


Motion vectors as features
[Figure panels: video sequence, motion vectors, optical flow]
Motion vectors can be extracted from the compressed video, BUT:
 - They do not necessarily represent the true motion.
 - They are much sparser and have lower resolution than, e.g., optical flow.
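A minimal sketch of how block motion vectors could be turned into a feature: a magnitude-weighted histogram of motion directions, in the spirit of HOF-style descriptors. The function name, the bin count, and the sample vectors are illustrative assumptions, not taken from the slides. Extracting the vectors from the bitstream is not shown.

```python
import math

def orientation_histogram(motion_vectors, n_bins=8, min_magnitude=1e-6):
    """Magnitude-weighted histogram of motion vector directions, L1-normalised."""
    hist = [0.0] * n_bins
    bin_width = 2.0 * math.pi / n_bins
    for dx, dy in motion_vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude < min_magnitude:
            continue  # skip zero vectors (e.g. static blocks)
        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # direction in [0, 2*pi)
        index = min(int(angle / bin_width), n_bins - 1)
        hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist

# Hypothetical block motion vectors (dx, dy) of one frame, in pixels.
vectors = [(2.0, 0.0), (1.0, 0.0), (0.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
hist = orientation_histogram(vectors, n_bins=4)
print(hist)
```

Because motion vectors are sparse, such a histogram is usually pooled over a spatio-temporal block rather than computed per pixel.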

Fisher Vector based video annotation
1. Extract the motion vectors.
2. Compute HOF (histograms of optical flow) and MBH (motion boundary histograms) features and stack all cubes over all time slices.
3. Cluster the descriptors using a GMM.
4. Compute Fisher Vectors and stack them.
5. Annotate the video with a linear SVM classifier.
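The Fisher Vector step can be sketched as follows, assuming a diagonal-covariance GMM has already been fitted (step 3). The sketch uses the widely used formulation with gradients with respect to the GMM means and standard deviations. Whether the slides use exactly this variant is an assumption, and all function names and toy parameters are hypothetical. Power/L2 normalisation and the SVM are omitted.

```python
import math

def gmm_posteriors(x, weights, means, variances):
    """Soft assignment of descriptor x to each diagonal-covariance Gaussian."""
    log_probs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
        log_probs.append(log_p)
    top = max(log_probs)  # subtract the maximum for numerical stability
    exps = [math.exp(lp - top) for lp in log_probs]
    norm = sum(exps)
    return [e / norm for e in exps]

def fisher_vector(descriptors, weights, means, variances):
    """Gradients w.r.t. GMM means and standard deviations, concatenated."""
    n, k, d = len(descriptors), len(weights), len(means[0])
    grad_mu = [[0.0] * d for _ in range(k)]
    grad_sigma = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        gamma = gmm_posteriors(x, weights, means, variances)
        for j in range(k):
            for i in range(d):
                z = (x[i] - means[j][i]) / math.sqrt(variances[j][i])
                grad_mu[j][i] += gamma[j] * z
                grad_sigma[j][i] += gamma[j] * (z * z - 1.0)
    fv = []
    for j in range(k):
        fv += [g / (n * math.sqrt(weights[j])) for g in grad_mu[j]]
    for j in range(k):
        fv += [g / (n * math.sqrt(2.0 * weights[j])) for g in grad_sigma[j]]
    return fv

# Toy GMM with 2 components in 2-D (hypothetical values, normally fitted by EM).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
descriptors = [[0.1, -0.2], [2.9, 3.3], [0.5, 0.4]]
fv = fisher_vector(descriptors, weights, means, variances)
print(len(fv))  # 2 * K * D entries
```

The resulting vector has a fixed length of 2 * K * D regardless of how many descriptors a video contains, which is what makes it usable as input to a linear classifier.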

Interpretable classification
 - Black-box classifier: wrong prediction ("cat")
 - Famous example: good weather / no tanks vs. bad weather / tanks

Interpretable classification
 - The first step in improving ML algorithms is to understand their weaknesses.
 - Interpretability also has a legal aspect (the EU's "right to explanation" regulation by 2018): a bank has to explain why you don't get a loan, in order to avoid discrimination by algorithms.
 - Also important in the sciences: new hypotheses through a better understanding of what's going on (e.g. genetic studies).
 - Interpretability helps to retain human responsibility: important in e.g. medical applications, where the ML algorithm is just a helping tool (the medical doctor is responsible).

Interpretable classification
Main idea (example image: ladybug)

Interpretable classification
Classification: cat, ladybug, dog

Interpretable classification
Explanation: cat, ladybug, dog
Initialization: relevance at the output = classifier score f(x) for the class being explained

Interpretable classification
Explanation? (cat, ladybug, dog)
Theoretical interpretation: (Deep) Taylor Decomposition (Montavon et al., arXiv 2015)
Relevance of upper layers is redistributed to lower layers proportionally (depending on activations & weights).
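One common way to write this proportional redistribution is the basic LRP rule for a layer without biases. The symbols are notation chosen here, not taken from the slides: a_i is the activation of lower-layer neuron i, w_{ij} the weight from i to upper-layer neuron j, and R_j^{(l+1)} the relevance already assigned to j.

```latex
R_i^{(l)} \;=\; \sum_j \frac{a_i \, w_{ij}}{\sum_{i'} a_{i'} \, w_{i'j}} \; R_j^{(l+1)}
```

Each upper-layer neuron hands its relevance down in proportion to the contribution a_i w_{ij} that neuron i made to its pre-activation. Summing both sides over i shows that the total relevance is the same in both layers.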

Interpretable classification
Explanation: cat, ladybug, dog
Relevance Conservation Property

Interpretable classification

Interpretable classification
We can explain the compressed domain classifier with LRP:
 - Redistribute relevance from output to input in a meaningful manner
 - Observe the layer-wise relevance conservation principle
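The two points above can be illustrated with a minimal sketch of the basic LRP rule on a tiny two-layer ReLU network without biases. This is not the compressed domain classifier from the slides: the network, its weights, and the function names are hypothetical and chosen only so that the numbers are easy to follow.

```python
def forward(x, w1, w2):
    """Two-layer network without biases: ReLU hidden layer, linear output."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col))) for col in zip(*w1)]
    output = [sum(h * w for h, w in zip(hidden, col)) for col in zip(*w2)]
    return hidden, output

def lrp_layer(activations, weights, relevance_out, eps=1e-9):
    """Redistribute relevance_out to the layer inputs, proportionally to a_i * w_ij."""
    relevance_in = [0.0] * len(activations)
    for j, r_j in enumerate(relevance_out):
        contributions = [a * weights[i][j] for i, a in enumerate(activations)]
        z_j = sum(contributions)
        if abs(z_j) < eps:
            continue  # no signal passed through this unit, nothing to redistribute
        for i, c in enumerate(contributions):
            relevance_in[i] += c / z_j * r_j
    return relevance_in

# Hypothetical toy weights: w1[i][j] connects input i to hidden unit j,
# w2[j][k] connects hidden unit j to output class k.
x = [1.0, 2.0]
w1 = [[1.0, -1.0], [0.5, 1.0]]
w2 = [[1.0, 0.0], [2.0, -1.0]]

hidden, output = forward(x, w1, w2)

# Initialise with the score of the class to explain (class 0), zero elsewhere.
relevance_output = [output[0], 0.0]
relevance_hidden = lrp_layer(hidden, w2, relevance_output)
relevance_input = lrp_layer(x, w1, relevance_hidden)

# Conservation: the relevance sums of all layers equal the explained score.
print(sum(relevance_output), sum(relevance_hidden), sum(relevance_input))
```

Input relevances may be negative: a negative value marks an input that spoke against the explained class, while the sum over all inputs still matches the class score.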

Interpretable classification

Demo: Interpretable Compressed Domain Video Annotation