Robust Slide Cartography in Colon Cancer Histology: Evaluation on a Multi-scanner Database
Petr Kuritcyn, Carol I. Geppert, Markus Eckstein, Arndt Hartmann, Thomas Wittenberg, Jakob Dexl, Serop Baghdadlian, David Hartmann, Dominik Perrin, Volker Bruns, Michaela Benz
Fraunhofer IIS, Erlangen
Abstract
Robustness against variations in color and resolution of digitized whole-slide images (WSIs) is an essential requirement for any computer-aided analysis in digital pathology. One common approach to counter a lack of heterogeneity in the training data is data augmentation. We investigate the impact of different augmentation techniques for whole-slide cartography in colon cancer histology using a newly created multi-scanner database of 39 slides, each digitized with six different scanners. A state-of-the-art convolutional neural network (CNN) is trained to differentiate seven tissue classes. Applying a model trained on one scanner to WSIs acquired with a different scanner results in a significant decrease in classification accuracy. Our results show that the impact of resolution variations is smaller than that of color variations: the accuracy of the baseline model trained without any augmentation is 73% for WSIs with similar color but different resolution, compared with 35% for WSIs with similar resolution but color deviations. The grayscale model shows comparatively robust results and evades the problem of color variation. A combination of multiple color augmentation methods leads to a significant overall improvement (between 33 and 54 percentage points). Moreover, fine-tuning a pre-trained network using a small amount of annotated data from new scanners improves the performance for these particular scanners, but this effect does not generalize to other unseen scanners.
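To make the two strategies named in the abstract concrete, the following minimal sketch shows one possible form of color augmentation (a random shift in HSV space, drawn once per image patch) and a grayscale conversion. It is an illustration only, not the method used in this work: the function names, the choice of HSV jitter, the parameter ranges, and the BT.601 luma weights are all assumptions made for the example.

```python
import colorsys
import random


def _clamp(x):
    """Clamp a value to the range [0, 1]."""
    return min(1.0, max(0.0, x))


def augment_patch(patch, rng, max_hue_shift=0.05,
                  max_sat_scale=0.2, max_val_scale=0.2):
    """Apply one random HSV perturbation to a whole patch.

    `patch` is a list of rows, each row a list of (r, g, b) floats in [0, 1].
    The perturbation ranges are hypothetical defaults, not values from the paper.
    """
    # Draw the perturbation once per patch so that all pixels shift together,
    # which mimics a global color difference between scanners.
    hue_shift = rng.uniform(-max_hue_shift, max_hue_shift)
    sat_scale = 1.0 + rng.uniform(-max_sat_scale, max_sat_scale)
    val_scale = 1.0 + rng.uniform(-max_val_scale, max_val_scale)

    augmented = []
    for row in patch:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            h = (h + hue_shift) % 1.0      # hue is cyclic
            s = _clamp(s * sat_scale)
            v = _clamp(v * val_scale)
            new_row.append(colorsys.hsv_to_rgb(h, s, v))
        augmented.append(new_row)
    return augmented


def to_grayscale(patch):
    """Convert an RGB patch to one luminance channel (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in patch]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a reproducible demonstration
    patch = [[(0.8, 0.4, 0.6), (1.0, 1.0, 1.0)],
             [(0.2, 0.1, 0.3), (0.0, 0.0, 0.0)]]
    print(augment_patch(patch, rng))
    print(to_grayscale(patch))
```

In a training pipeline, a function like `augment_patch` would typically be applied on the fly to each training patch, whereas `to_grayscale` would be applied to both training and test patches so that the network never sees color information.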