The only difference is the format of the labels. With `categorical_crossentropy`, labels must be one-hot encoded (exactly one element is 1 and all others are 0). With `sparse_categorical_crossentropy`, labels are plain integers, as the table and the sketch after it illustrate.
| One-hot representation | Integer representation |
|---|---|
| [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.] | [9] | 
| [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.] | [2] | 
| [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.] | [1] | 
| [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.] | [5] | 
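To make this concrete, here is a minimal sketch using tf.keras. The model, input shape, and random data are hypothetical stand-ins; the point is only that switching between the two losses means switching the label format passed to `fit`:

```python
import numpy as np
import tensorflow as tf

# Hypothetical 10-class classifier (stand-in for illustration)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

x = np.random.rand(32, 784).astype("float32")
y_int = np.random.randint(0, 10, size=(32,))  # integer labels
y_onehot = np.eye(10)[y_int]                  # one-hot labels

# Integer labels pair with sparse_categorical_crossentropy
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, y_int, epochs=1, verbose=0)

# One-hot labels pair with categorical_crossentropy
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit(x, y_onehot, epochs=1, verbose=0)
```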
My impression is that many datasets ship with integer labels, while most loss functions only accept one-hot labels, so a conversion is often needed. (Put differently, loss functions that can train directly on integer labels, such as `sparse_categorical_crossentropy`, seem to be the minority.)
The conversion code is shown below.
```python
import numpy as np

# Number of classes, determined once from the training labels
# (recomputing it from the test labels could give a mismatched
# width if some class is missing from the test set)
n_labels = len(np.unique(train_labels))

# One-hot encode by indexing into an identity matrix
train_labels_onehot = np.eye(n_labels)[train_labels]
test_labels_onehot = np.eye(n_labels)[test_labels]
```
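If Keras is already in use, `tf.keras.utils.to_categorical` performs the same conversion; choosing it over the NumPy identity-matrix trick is a matter of taste:

```python
from tensorflow.keras.utils import to_categorical

# Equivalent one-hot conversion via the Keras utility
train_labels_onehot = to_categorical(train_labels, num_classes=n_labels)
test_labels_onehot = to_categorical(test_labels, num_classes=n_labels)
```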