Original paper
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
To probe what a DNN has actually learned, the paper generates images that are unrecognizable to humans but that the DNN classifies with 99% or higher confidence.
For example, it looks like this:
 
 (Quoted from the paper)
Let's generate this kind of fooling image for a generative model (a caption generator) as well.
It looks like this:
If you input an image of a horse, you get captions like these. The model seems to see two horse-like things.
The probability of a sentence is computed from the probabilities of its words. The three most likely sentences are displayed.
The smaller the number on the left, the better the sentence is judged to fit the image.
(Specifically, the number is the negated sum of the log softmax probabilities of the words, divided by the number of words.)
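The score described above can be sketched in a few lines. This is only an illustration of the formula, not the code used for this post: the function name `sentence_score` and the example probabilities are made up, and a real model would supply the softmax probability of each generated word.

```python
import math


def sentence_score(word_probs):
    """Length-normalized negative log-likelihood of a caption.

    word_probs holds the softmax probability the model assigned to each
    word of the sentence, in order. A smaller score means a better fit.
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    # Flip the sign and divide by the number of words.
    return -log_sum / len(word_probs)


# Made-up probabilities, for illustration only.
confident = sentence_score([0.9, 0.8, 0.9, 0.7])
unsure = sentence_score([0.2, 0.1, 0.3, 0.1])
print(confident < unsure)
```

Dividing by the number of words keeps long sentences from being penalized just for containing more factors below 1.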
Also, if you input an image with random pixel values, the following sentences are generated.
Sentences are still produced, but the numbers are large; in other words, the model cannot judge what is in the image.
For now, I managed to generate fooling images reasonably well.
 

Two images were generated. Neither is recognizable to humans, yet the model generates a sentence about horses with high probability (that is, the number is smaller than in the previous example).
Top: direct encoding, where each pixel of the image is itself a gene.
Bottom: indirect encoding, where the pixels are correlated with one another.
In the paper, indirect encoding produced beautiful patterns that were even exhibited as art, but simply building an NN to introduce correlation between pixels did not work that well here. (Perhaps my implementation was too crude.)
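To make the difference between the two encodings concrete, here is a minimal sketch. It is not the code used for this post. The function names are hypothetical, and the indirect encoding is simplified to a single sine unit over pixel coordinates, which is an assumption standing in for the coordinate-based network used in the paper.

```python
import math
import random


def direct_genome(width, height, rng):
    """Direct encoding: one independent gene per pixel, values in [0, 1)."""
    return [rng.random() for _ in range(width * height)]


def indirect_image(weights, width, height):
    """Indirect encoding: four genes drive a function of (x, y).

    Neighbouring pixels get similar inputs, so they end up correlated.
    Requires width > 1 and height > 1.
    """
    w_x, w_y, w_r, bias = weights
    pixels = []
    for j in range(height):
        for i in range(width):
            # Map pixel indices to coordinates in [-1, 1].
            x = 2.0 * i / (width - 1) - 1.0
            y = 2.0 * j / (height - 1) - 1.0
            r = math.sqrt(x * x + y * y)
            activation = math.sin(w_x * x + w_y * y + w_r * r + bias)
            # Rescale from [-1, 1] to [0, 1].
            pixels.append(0.5 * (activation + 1.0))
    return pixels


rng = random.Random(0)
noisy = direct_genome(8, 8, rng)
smooth = indirect_image((3.0, -2.0, 4.0, 0.5), 8, 8)
```

With direct encoding the genome is as large as the image. With indirect encoding only the few weights are evolved, and the image is rendered from them.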
The image was evolved to raise the probability of generating one particular sentence. The top sentence generated in the first example, "a couple of horses are standing in a field", was selected as the target, and the image was evolved so that the probability of generating this sentence became high. In each generation, eight new individuals were created and the eight best individuals were kept; with direct encoding, this gave the result above in about 300 generations.
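The loop described above might look roughly like the following sketch. Several details are assumptions, because the post does not specify them: parents and children compete together for the eight surviving slots, mutation is Gaussian noise clipped to [0, 1], and `toy_score` is a hypothetical stand-in for the caption model's score of the target sentence (lower is better).

```python
import random


def evolve(fitness, genome_len, n_parents=8, n_children=8,
           generations=300, sigma=0.1, seed=0):
    """Evolve a directly encoded genome; lower fitness is better."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(genome_len)]
                  for _ in range(n_parents)]
    for _ in range(generations):
        children = []
        for _ in range(n_children):
            parent = rng.choice(population)
            # Assumed mutation: Gaussian noise, clipped to the pixel range.
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma)))
                     for g in parent]
            children.append(child)
        # Keep the best n_parents among parents and children together.
        population = sorted(population + children, key=fitness)[:n_parents]
    return min(population, key=fitness)


def toy_score(genome):
    """Hypothetical stand-in score: mean squared distance from mid-gray."""
    return sum((g - 0.5) ** 2 for g in genome) / len(genome)


best = evolve(toy_score, genome_len=16, generations=50)
print(toy_score(best))
```

In the real setting, `fitness` would render the genome as an image, run the caption model, and return the score of the target sentence. That call is expensive, so it would be worth caching each individual's score instead of recomputing it during sorting.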
This time, fooling images were generated for the caption generation model Show, Attend and Tell. The model's BLEU scores on COCO (presumably BLEU-1 through BLEU-4) were 0.689 / 0.503 / 0.359 / 0.255.
Using an evolutionary algorithm, I succeeded in generating fooling images that raise the probability of a generative model producing a given sentence. If you are curious whether these images also fool other models trained on the same CNN, or what happens when you evolve an image for multiple sentences, give it a try.