Why do deep convolutional networks generalize so poorly to small image transformations?


Deep convolutional network architectures are often assumed to guarantee generalization for small image translations and deformations. This paper shows that modern CNNs (VGG16, ResNet50, and InceptionResNetV2) can drastically change their output when an image is translated in the image plane by just a few pixels, and that this failure of generalization also occurs with other realistic small image transformations.
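A key reason such failures are even possible, despite convolution itself being shift-equivariant, is the subsampling (striding/pooling) used in modern architectures. The following minimal NumPy sketch (an illustration assumed here, not code from the paper) shows how stride-2 downsampling can map a one-pixel shift of the input to a completely different output:

```python
import numpy as np

def downsample(x, stride=2):
    """Keep every `stride`-th sample, as a strided conv or pooling
    layer does after its filtering step."""
    return x[::stride]

# A 1-D "image" whose energy alternates between even and odd positions.
x = np.array([1.0, 0.0] * 8)      # [1, 0, 1, 0, ...]
x_shifted = np.roll(x, 1)         # same signal, shifted by one pixel

print(downsample(x))              # [1. 1. 1. 1. 1. 1. 1. 1.]
print(downsample(x_shifted))      # [0. 0. 0. 0. 0. 0. 0. 0.]
```

Because the stride-2 grid keeps only every other sample, shifting the input by a single pixel selects an entirely disjoint set of samples, so the downsampled representation flips from all ones to all zeros. Stacking many such subsampling stages, as VGG16, ResNet50, and InceptionResNetV2 do, is one way a few-pixel translation can propagate into a drastically different network output.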