GTC 2020: Improving CNN Performance with Spatial Context

GTC 2020 S21388
Presenters: Daniel Russakoff, Voxeleron
Deep learning with convolutional neural networks (CNN) is a powerful technique with wide-ranging applications. It has largely replaced traditional computer vision as the go-to method for solving image-analysis and classification problems. At its essence, however, training a CNN is an enormous global optimization problem which, like all optimizations, can fall victim to local extrema. We’ll discuss ways of mitigating this issue using computer vision to add spatial context information to restrict the domain of optimization. These techniques not only speed up the training, but also improve the overall performance of the networks. We’ll demonstrate results on real-world classification and segmentation problems.

Join in the conversation below.


I did not expect this content when reading the abstract, but I am happy to have attended this lecture. Thank you for sharing these interesting techniques.

  • Spatial injection seems readily applicable in medical imaging, where DICOM RTSTRUCT files or other contour files of anatomical structures already exist. Other domains and tasks may lack such annotated databases, but the approach is clearly worth a try there as well.
  • For homogenisation, I think I see how to implement it with piecewise interpolation for the 2D example you presented, but the implementation may be harder for 3D cases. For complex shapes or volumes, maybe we can take advantage of a polar transform.
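
To make the polar-transform idea in the last point a bit more concrete, here is a minimal 2D sketch. It is only my own illustration, not the method shown in the talk: the function name `to_polar`, its parameters, and the choice of bilinear interpolation are all assumptions. It resamples an image onto a fixed (radius, angle) grid around a chosen centre, so that differently shaped regions end up on the same grid.

```python
import numpy as np

def to_polar(image, center, n_radii=64, n_angles=128, max_radius=None):
    """Resample a 2D image onto a (radius, angle) grid (hypothetical helper).

    Uses bilinear interpolation; `center` is given as (row, column).
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    cy, cx = center
    if max_radius is None:
        # Largest radius that stays inside the image in every direction.
        max_radius = min(cy, cx, h - 1 - cy, w - 1 - cx)

    radii = np.linspace(0.0, max_radius, n_radii)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rr, tt = np.meshgrid(radii, angles, indexing="ij")

    # Cartesian sample positions for every (radius, angle) pair,
    # clipped so that rounding errors cannot leave the image.
    ys = np.clip(cy + rr * np.sin(tt), 0, h - 1)
    xs = np.clip(cx + rr * np.cos(tt), 0, w - 1)

    # Indices of the four neighbouring pixels and the fractional offsets.
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = ys - y0
    wx = xs - x0

    # Bilinear blend: first along columns, then along rows.
    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


# Usage: an image whose value is the distance to the centre becomes
# (approximately) constant along each row of the polar output.
yy, xx = np.mgrid[0:65, 0:65]
distance_image = np.hypot(yy - 32, xx - 32)
polar = to_polar(distance_image, center=(32, 32), n_radii=33, n_angles=90)
```

The output always has shape `(n_radii, n_angles)`, whatever the shape of the structure around the centre. For volumes, an analogous resampling in spherical coordinates might work, but I have not tried it.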