A Deep Learning Framework for Authorship Classification of Paintings Kai-Lung Hua ( 花凱龍 ) Dept. of Computer Science and Information Engineering National Taiwan University of Science and Technology Taipei, Taiwan
Massive numbers of digitized painting images are on the Internet, since everyone can upload artworks.
Digitized painting images often lack metadata. Authorship, for example, can determine the era, place, media, and style of a work.
Why artist-based image classification? Facilitate artist-based retrieval. Identify the painters of unknown paintings. Provide insights into an artist's artwork.
Challenges: Different artists may draw similar objects. An artist may have several techniques. Different artists may have similar techniques.
Methods
Deep Learning: using brain-inspired mechanisms to achieve brain-like function [Ref] http://www.slideshare.net/roelofp/041114-dl-nlpwordembeddings
Different Levels of Abstraction
Traditional Learning Approach [Ref] http://slazebni.cs.illinois.edu/spring14/lec24_cnn.pdf
What about learning the features? Learn a feature hierarchy all the way from pixels to classifier. Each layer extracts features from the output of the previous layer. Train all layers jointly.
Shallow vs. Deep Architectures
Basic Neural Network [Ref] http://www.amax.com/blog/?p=804
Deep Neural Network
State-of-the-art Image Classification. Caffe: a framework based on the Convolutional Neural Network (CNN). Caffe framework: Y. Jia, et al., "Caffe: convolutional architecture for fast feature embedding," ACM International Conference on Multimedia, 2014. Convolutional Neural Network: Y. LeCun, et al., "Backpropagation applied to handwritten zip code recognition," Neural Computation, 1989.
Caffe has a rigid architecture: it only receives 227 x 227 image inputs, but paintings have arbitrary sizes. Relax this rigidness by utilizing a spatial pyramid representation.
Spatial Pyramid Representation: a collection of orderless features computed over cells defined by a multi-level recursive decomposition of the image.
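As a concrete illustration of the relaxation above, here is a minimal sketch of how an arbitrary-size painting could be decomposed into pyramid cells that are each resized to the fixed 227 x 227 input Caffe expects. The 1 / 2x2 / 4x4 grid layout and the helper names are assumptions for illustration, not taken from the slides.

```python
from PIL import Image

def pyramid_patches(image_path, grids=(1, 2, 4), cnn_size=227):
    """Split an image into a spatial pyramid: 1x1, 2x2, and 4x4 grids
    of cells, each cell resized to the fixed CNN input size."""
    img = Image.open(image_path).convert("RGB")
    w, h = img.size
    pyramid = []
    for grid in grids:                       # grid = cells per side
        cells = []
        for row in range(grid):
            for col in range(grid):
                box = (col * w // grid, row * h // grid,
                       (col + 1) * w // grid, (row + 1) * h // grid)
                cells.append(img.crop(box).resize((cnn_size, cnn_size)))
        pyramid.append(cells)                # layers of 1, 4, and 16 patches
    return pyramid
```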
Training of Multi-scale Networks: train a model for each layer by fine-tuning a pre-trained network. Pre-trained network: A. Krizhevsky, et al., "ImageNet classification with deep convolutional neural networks," NIPS, 2012.
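The fine-tuning step can be sketched as follows. This uses torchvision's AlexNet as a stand-in for Krizhevsky's pre-trained network, so the framework and the optimizer settings are assumptions; the original work used the Caffe tooling.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_finetune_model(num_artists=13):
    """Start from an ImageNet-pretrained AlexNet and replace the
    classifier head so it predicts artist classes instead."""
    net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    net.classifier[6] = nn.Linear(4096, num_artists)   # new output layer
    return net

# One such model would be fine-tuned per pyramid layer on that layer's
# 227x227 patches (optimizer and learning rate are assumptions).
model = build_finetune_model()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```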
Methods using Trained Networks: 1. Single-scale with Hard Decision using Fine-tuned Krizhevsky's CNN (Baseline); 2. Multi-scale Pyramid with Hard Decision; 3. Multi-scale Pyramid with Soft Decision; 4. Multi-scale Pyramid with Soft Decision and Adaptive Pooling; 5. MRF Refining Scheme. Baseline: A. Krizhevsky, et al., "ImageNet classification with deep convolutional neural networks," NIPS, 2012.
1. Single-scale with Hard Decision using Fine-tuned Krizhevsky's CNN (Baseline): only use the 1st layer; predict using the Caffe framework. Baseline: A. Krizhevsky, et al., "ImageNet classification with deep convolutional neural networks," NIPS, 2012.
General Work Flow of Using Multi-scale Networks: work flow for the 2nd, 3rd, and 4th methods.
2. Multi-scale Pyramid with Hard Decision Method: obtain a predicted class label for each image patch from the model; use voting to pool the labels within each layer; combine the results of the 3 layers using weights 16:4:1.
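A hedged sketch of this hard-decision pooling: each patch casts one vote, votes are pooled per pyramid layer, and the three layer histograms are combined with the 16:4:1 weights. The function name, the vote normalization, and the mapping of weights to layers are assumptions.

```python
import numpy as np

def hard_decision_fuse(patch_labels_per_layer, num_classes, weights=(16, 4, 1)):
    """patch_labels_per_layer: list of 3 arrays of predicted patch labels,
    one array per pyramid layer (finest layer listed first here, assumed)."""
    fused = np.zeros(num_classes)
    for labels, w in zip(patch_labels_per_layer, weights):
        votes = np.bincount(labels, minlength=num_classes)  # per-layer voting
        fused += w * votes / votes.sum()                     # normalize, then weight
    return int(np.argmax(fused))                             # final class label
```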
3. Multi-scale Pyramid with Soft Decision Method: obtain a distribution over the classes for each image patch from the model; sum the probability values within each layer; combine the 3 layers using weights 16:4:1.
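A matching sketch for the soft-decision variant, which keeps the full class distribution per patch instead of a single label; again the 16:4:1 weights and the layer ordering are assumed.

```python
import numpy as np

def soft_decision_fuse(patch_probs_per_layer, weights=(16, 4, 1)):
    """patch_probs_per_layer: list of 3 arrays, each of shape
    (num_patches_in_layer, num_classes) with per-patch class probabilities."""
    fused = None
    for probs, w in zip(patch_probs_per_layer, weights):
        layer_score = probs.sum(axis=0)          # sum probabilities within the layer
        layer_score = layer_score / layer_score.sum()   # normalize to a distribution
        fused = w * layer_score if fused is None else fused + w * layer_score
    return int(np.argmax(fused))
```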
4. Multi-scale Pyramid with Soft Decision Method and Adaptive Pooling: combine the 3 layers using adaptive weights based on class entropy.
Adaptive Weights: entropy for the j-th layer (1); weight for the j-th layer (2); aggregation of the 3 layers (3). Choose the class with the maximum probability as the final decision.
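Since equations (1)-(3) are not reproduced here, the sketch below shows one plausible reading of the entropy-based adaptive pooling: Shannon entropy per layer, weights inversely related to entropy (a more peaked, lower-entropy layer is treated as more confident), and a weighted sum over layers. The exact functional form in the original work may differ.

```python
import numpy as np

def adaptive_fuse(layer_distributions, eps=1e-12):
    """layer_distributions: list of 3 class-probability vectors, one per layer.
    Assumption: lower entropy (a more peaked distribution) earns a larger weight."""
    entropies = np.array([-(p * np.log(p + eps)).sum() for p in layer_distributions])
    inv = 1.0 / (entropies + eps)
    weights = inv / inv.sum()                       # normalize weights to sum to 1
    fused = sum(w * p for w, p in zip(weights, layer_distributions))
    return int(np.argmax(fused))                    # class with maximum probability
```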
5. Markov Random Field (MRF) Refining Method: obtain a predicted class label for each image patch from the model; fuse the patch-level results using an MRF.
MRF Refining Scheme Work Flow
MRF Refining Scheme Formula. MRF energy function to predict a node as class i: (4). Data term: compute the compatibility of the labeling with the given data (5). Smoothness term: penalize nodes that have differently labeled neighbors (6).
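A small sketch of a patch-level MRF energy of this general shape, assuming a common choice of terms: a negative-log-probability data term (the compatibility of a label with the patch's class distribution) and a Potts-style smoothness term that penalizes differently labeled neighbors. The exact terms in equations (4)-(6) may differ.

```python
import numpy as np

def mrf_energy(labels, patch_probs, neighbors, lam=1.0, eps=1e-12):
    """Energy of a patch labeling on the patch grid.
    labels:      array with one class index per patch node
    patch_probs: (num_patches, num_classes) class probabilities per patch
    neighbors:   list of (i, j) index pairs of adjacent patches
    Assumed form: data term = -log p(label), smoothness = Potts penalty."""
    data = -np.log(patch_probs[np.arange(len(labels)), labels] + eps).sum()
    smooth = sum(1 for i, j in neighbors if labels[i] != labels[j])
    return data + lam * smooth
```

Minimizing an energy of this kind (for example with iterated conditional modes or graph cuts) yields the refined patch labels.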
MRF Scheme: Combining Patch-level Results. Convert hard decisions into soft decisions (7).
MRF Scheme: Combining Patch-level Results. Layer entropy (8); layer weight (9). Aggregate the 3 layers using the adaptive weight method.
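One plausible reading of (7)-(9), reusing the adaptive_fuse sketch from the adaptive-pooling section: the MRF-refined hard labels of each layer are turned into a soft distribution via a normalized label histogram, and the layer distributions are then aggregated with the entropy-based weights. The histogram conversion is an assumption.

```python
import numpy as np

def labels_to_distribution(labels, num_classes):
    """(7), as read here: turn the MRF-refined hard labels of one layer
    into a soft class distribution via a normalized label histogram."""
    hist = np.bincount(labels, minlength=num_classes).astype(float)
    return hist / hist.sum()

# (8)-(9): per-layer entropy and adaptive weights, then aggregation,
# reusing adaptive_fuse from the earlier sketch, e.g.:
# final_label = adaptive_fuse([labels_to_distribution(l, 13) for l in refined_layers])
```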
EXPERIMENTS
Dataset. PaintingDb dataset: 1300 painting images of 13 artists (100 images / artist). WikiArt dataset: 20405 painting images of 23 artists. PaintingDb: "PaintingDb, the fastest growing art gallery on the web," 2015, www.paintingdb.com. WikiArt: "WikiArt, the online home for visual arts from all around the world," 2016, www.wikiart.org.
Experiment Setup. Experiment 1 (PaintingDb dataset): 80 images for training, 20 images for testing (per class). Experiment 2 (WikiArt dataset): 2/3 of the images for training, 1/3 for testing (per class). Measurements: precision, recall, and F-score.
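A small sketch of the per-class splits described above (80/20 images per artist for PaintingDb, 2/3 vs. 1/3 for WikiArt); the shuffling, seeding, and data layout are assumptions.

```python
import random

def split_per_class(images_by_artist, train_fraction, seed=0):
    """images_by_artist: dict mapping artist name -> list of image paths.
    Returns (train, test) lists of (path, artist) pairs, split per class."""
    rng = random.Random(seed)
    train, test = [], []
    for artist, paths in images_by_artist.items():
        paths = list(paths)
        rng.shuffle(paths)
        cut = int(len(paths) * train_fraction)       # e.g. 0.8 or 2/3
        train += [(p, artist) for p in paths[:cut]]
        test += [(p, artist) for p in paths[cut:]]
    return train, test
```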
Experiment 1: Contribution of Multi-scale Pyramid
Experiment 1: Result
Experiment 2: Result
Conclusion: We address the painter classification problem using deep learning approaches. Experiments show that our proposed methods, which utilize a multi-scale pyramid, outperform the baseline method. In Experiment 1, the MRF refining scheme significantly boosts the performance of the multi-scale pyramid with adaptive fusion.
Future Work: A better fusion scheme that aggregates the results from different layers should be investigated further. A good method for finding the MRF parameters automatically should also be explored.