MNIST

1- Prepare Dataset

cd $CAFFE_ROOT
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh

If you installed the VM and the Linux libraries as in the tutorial, you should not get any errors. Otherwise, you may need to install wget or gunzip.

2- Open Text Editor

cd $CAFFE_ROOT/examples/mnist
gedit

Now you have a text editor that waits for you to configure the deep architecture.

3- Define the Network Name

name: "LeNet"

4- Define the Train Data Layer

layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

Let's look at what this code means. mnist is the layer name and Data is the layer type. Data is read from the LMDB source with a batch size of 64. The scale is set to 1/256 (0.00390625) to map the pixel values into the range [0, 1). This layer produces two blobs, data and label. The naming is self-explanatory, so the layer definitions can easily be understood.
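As a quick sanity check, here is a tiny standalone Python sketch (not part of the tutorial files) of what this scale factor does to the raw pixel bytes:

# 0.00390625 is exactly 1/256, so scaling the raw bytes (0-255)
# maps the pixel values into the range [0, 1).
scale = 1.0 / 256
assert scale == 0.00390625
print(0 * scale, 255 * scale)   # 0.0 and 0.99609375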
5- Define the Test Data Layer

layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

This layer is identical to the train data layer except that it belongs to the TEST phase, reads from the test LMDB and uses a batch size of 100.

6- Define the Convolutional Layer

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

This layer takes the data blob as input and generates the conv1 blob. The output has 20 channels, the kernel size is set to 5 and the stride is 1. The weights and bias values are randomly initialized; xavier is an algorithm that adjusts the scale of the initialization based on the number of input and output neurons. The lr_mult entries are learning rate multipliers: the weight learning rate is set to the value given by the solver, and the bias learning rate is set to twice that value.
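For intuition, here is a rough numpy sketch of xavier-style initialization for conv1 (an illustration assuming fan-in scaling, not a reproduction of Caffe's internal filler):

import numpy as np

# conv1 has num_output=20 filters of size 5x5 over a 1-channel input.
fan_in = 1 * 5 * 5                # input channels * kernel area
bound = np.sqrt(3.0 / fan_in)     # xavier-style uniform bound
weights = np.random.uniform(-bound, bound, size=(20, 1, 5, 5))
print(weights.std())              # small starting weights, matched to the fan-in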
7- Define the Pooling Layer

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

We defined a non-overlapping max pooling operation with a kernel size and stride of 2. Let's add another convolutional and pooling layer to increase the level of abstraction in the network.

8- Define Another Convolutional and Pooling Layer

layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
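To see what these layers do to the spatial dimensions, here is a small standalone Python sketch (not part of the tutorial files) tracing the feature-map sizes from the 28x28 MNIST input:

# Output size for a layer with no padding: (size - kernel) / stride + 1.
def out_size(size, kernel, stride):
    return (size - kernel) // stride + 1

s = 28
s = out_size(s, 5, 1)   # conv1 -> 24x24, 20 channels
s = out_size(s, 2, 2)   # pool1 -> 12x12
s = out_size(s, 5, 1)   # conv2 -> 8x8, 50 channels
s = out_size(s, 2, 2)   # pool2 -> 4x4
print(s, 50 * s * s)    # the fully connected layer below sees 50*4*4 = 800 inputs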
top: "pool2" pooling_ pool: MAX kernel_size: 2 stride: 2 7 Define the Fully Connected Layer name: "ip1" type: "InnerProduct" bottom: "pool2" top: "ip1" lr_mult: 1 lr_mult: 2 inner_product_ num_output: 500 weight_filler { type: "xavier" bias_filler { type: "constant" This layers take the input from the pooling layer and outputs 500 nodes. 8- Define the Activation Layer name: "relu1" type: "ReLU" bottom: "ip1" top: "ip1" Note that the bottom and top layers are defined as the same. This kind of configuration corresponds to the in-place operation which can be used for element-wise operations to save some memory. 8- Define another Fully Connected Layer
name: "ip2" type: "InnerProduct" bottom: "ip1" top: "ip2" lr_mult: 1 lr_mult: 2 inner_product_ num_output: 10 weight_filler { type: "xavier" bias_filler { type: "constant" 9- Define the Accuracy layer name: "accuracy" type: "Accuracy" bottom: "ip2" bottom: "label" top: "accuracy" include { phase: TEST This layer is just to show the accuracy of the output with respect to the target and it does not have a backward step. 10 - Define the Loss Layer name: "loss" type: "SoftmaxWithLoss" bottom: "ip2" bottom: "label" top: "loss" - Save the file as msl_lenet_train_test.prototxt
- Save the file as msl_lenet_train_test.prototxt

14- Define the Solver

- Go to $CAFFE_ROOT/examples/mnist
- Open the text editor and type the following:

# The train/test net protocol buffer definition
net: "examples/mnist/msl_lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU

Note that net points to the network definition we saved in the previous step, so the path must match that file name.

- Save the file as msl_lenet_solver.prototxt
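With the "inv" policy, the effective learning rate at iteration t is base_lr * (1 + gamma * t)^(-power). A quick standalone Python sketch of how it decays with the values above:

base_lr, gamma, power = 0.01, 0.0001, 0.75
for t in (0, 100, 1000, 10000):
    print(t, base_lr * (1 + gamma * t) ** -power)
# Decays smoothly from 0.01 at the start to about 0.006 at iteration 10000.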
15- Write the Training Script

- Go to $CAFFE_ROOT/examples/mnist
- Open the text editor and type the following:

#!/usr/bin/env sh
./build/tools/caffe train --solver=examples/mnist/msl_lenet_solver.prototxt

- Save the file as msl_lenet.sh
- Go to $CAFFE_ROOT/examples/mnist and make the script executable:

chmod +x msl_lenet.sh

16- Run the Training Script

cd $CAFFE_ROOT
./examples/mnist/msl_lenet.sh
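If you built the Python bindings, the same training can be launched from Python instead of the shell script (a minimal sketch, assuming pycaffe is on your PYTHONPATH and you are in $CAFFE_ROOT):

import caffe

caffe.set_mode_cpu()    # matches solver_mode: CPU in the solver file
solver = caffe.SGDSolver('examples/mnist/msl_lenet_solver.prototxt')
solver.solve()          # runs all max_iter iterations, testing every test_interval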
Layer Writing Rules

layer {
  // ...layer definition...
  include: { phase: TRAIN }
}

Layers can have rules about when and how they are included in the network. For example, if the layer definition includes the above statement, that layer is only included in the training phase.

Layer Types in Caffe

PS: The bracketed names are the layer type keywords, and they can change from version to version.

Vision Layers
- Convolution [CONVOLUTION]
- Pooling [POOLING]
- Local Response Normalization [LRN]

Loss Layers
- Softmax [SOFTMAX_LOSS]
- Sum-of-Squares / Euclidean [EUCLIDEAN_LOSS]
- Hinge / Margin [HINGE_LOSS]
- Sigmoid Cross-Entropy [SIGMOID_CROSS_ENTROPY_LOSS]
- Infogain [INFOGAIN_LOSS]
- Accuracy and Top-k [ACCURACY]: accuracy of the output with respect to the target, no backward step

Activation / Neuron Layers
- ReLU / Rectified-Linear and Leaky-ReLU [RELU]
- Sigmoid [SIGMOID]
- TanH / Hyperbolic Tangent [TANH]
- Absolute Value [ABSVAL]
- Power [POWER]
- Binomial Normal Log Likelihood [BNLL]

Data Layers
- Database [DATA]
- Memory [MEMORY_DATA]: reads data directly from memory without copying it
- HDF5 Output [HDF5_OUTPUT]: writes input blobs to disk
- Images [IMAGE_DATA]
- Windows [WINDOW_DATA]
- Dummy [DUMMY_DATA]

Common Layers
- Inner Product [INNER_PRODUCT]
- Splitting [SPLIT]: one input blob -> multiple output blobs
- Flattening [FLATTEN]: blob-to-vector conversion
- Concatenation [CONCAT]
- Slicing [SLICE]: one input layer -> multiple output layers
- Element-wise Operations [ELTWISE]
- Argmax [ARGMAX]
- Softmax [SOFTMAX]
- Mean-Variance Normalization [MVN]