We will finish this chapter with a new family of CNN that not only has good accuracy, but is lighter and works faster on mobile devices.
Created by Google, MobileNet's key feature is that it uses a different "sandwich" form of convolution block. Instead of the usual (CONV
, BATCH_NORM,RELU
), it splits 3x3 convolutions up into a 3x3 depthwise convolution, followed by a 1x1 Pointwise CONV. They call this block a depthwise separable convolution.
This factorization reduces the computation and the model size:
This new convolution block (tf.layers.separable_conv2d
) consists of two main parts: a depthwise convolution layer, followed by a 1x1 pointwise convolution layer. This block differs from the normal convolution in a couple of ways:
- In the normal convolution layer, each filter F will be applied to all channels on the input channel at the same time (F is applied to each channel and then summed)
- This new convolution F is applied on each channel separately...