Counterfactual Testing of Deep Neural Networks
Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing
Modern deep neural networks tend to be evaluated on static test sets, which makes it hard to probe their robustness to fine-grained naturalistic variations such as object pose, scale, viewpoint, lighting and 3D occlusions. Counterfactual simulation testing instead asks questions such as "Would your classification still be correct if the object were partially occluded by another object?" Our method allows for a fair comparison of recently released, state-of-the-art convolutional neural networks and vision transformers with respect to these naturalistic variations. We find evidence that ConvNeXt is more robust to pose and scale variations than Swin, that ConvNeXt generalizes better to our simulated domain, and that Swin handles partial occlusion better than ConvNeXt.
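To make the counterfactual question concrete, the following is a minimal sketch of one way such a test could be scored. It is an illustration under stated assumptions and not the evaluation protocol of this work: the function `counterfactual_robustness`, the dictionary-based scene representation, and the toy classifier are all hypothetical names introduced here. The score is the fraction of objects that are classified correctly in a base scene and remain correctly classified after a single scene variation is applied.

```python
import math
from typing import Callable, Dict, List, Sequence

Scene = Dict[str, object]  # hypothetical scene description, e.g. object name and occlusion level


def counterfactual_robustness(
    classify: Callable[[Scene], str],
    base_scenes: Sequence[Scene],
    varied_scenes: Sequence[Scene],
    labels: Sequence[str],
) -> float:
    """Fraction of objects correct in the base scene that stay correct under the variation."""
    kept = 0   # correct in both the base and the varied scene
    total = 0  # correct in the base scene
    for base, varied, label in zip(base_scenes, varied_scenes, labels):
        if classify(base) != label:
            continue  # the counterfactual question only applies to initially correct predictions
        total += 1
        if classify(varied) == label:
            kept += 1
    return kept / total if total else math.nan


def toy_classify(scene: Scene) -> str:
    """Stand-in for a real network: fails once more than half of the object is occluded."""
    return str(scene["object"]) if float(scene["occlusion"]) < 0.5 else "unknown"


labels: List[str] = ["chair", "sofa", "table"]
base = [{"object": name, "occlusion": 0.0} for name in labels]
varied = [{"object": name, "occlusion": occ} for name, occ in zip(labels, [0.2, 0.6, 0.4])]

score = counterfactual_robustness(toy_classify, base, varied, labels)
print(f"robustness to occlusion: {score:.3f}")
```

In practice `classify` would wrap a trained network and the scenes would be rendered images of the same object under controlled changes, so that two architectures can be scored on identical base and varied inputs.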
A sample of objects in our proposed