Visualising Regions that Contributed to the Prediction/Identification of an Image

Hi there,
I am currently developing teaching resources for some students in Year 9/10 at a government school in Victoria, Australia.
We have two Jetbots using the Jetson Nano and I would like to create a unit of work that challenges students to recognise the fundamentals of image processing using AI & Machine Learning.

One activity that I’m trying to get off the ground is a way to visualise explanations of image data, similar to what Google has done with this tutorial:

Is there a way to add an overlay on an image, in addition to the classification, that shows pixels/regions that had a large weighted contribution to the prediction?

We are using Jupyter Notebooks and Python, if that helps.

Thanks for your help!


Hi Mark,

Thanks for reaching out!

This sounds like a really interesting addition for educational purposes!

I think it should be possible to accomplish something like this (I’m not sure how “real-time” it would be). I’m happy to look into it in more detail.

Though not exactly what you mentioned, I’ve actually created a small library I use often for real-time model visualization with Jupyter. It’s very simple: it just allows you to view the activations of different layers in the network. It’s really fun to see how the activations change during training, or how they respond to live camera data.

Here’s how it works:

from IPython.display import display
from torchvision.models import resnet18
from pytorch_module_visualizer import ModuleVisualizer2d

# create a model
model = resnet18(...)

# pick a sub-module you want to visualize the inputs / outputs of
layer1_visualizer = ModuleVisualizer2d(model.layer1)

# show visualizer in notebook
display(layer1_visualizer)

# execute the model inside the context manager for the visualizer to update
with layer1_visualizer:
    output = model(data)

# the following will not update the visualizer
output = model(data)

Please note, this is not a maintained project by any means, and is subject to change. There may be similar techniques out there.
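Under the hood, this style of layer visualization boils down to a PyTorch forward hook. Here is a minimal, self-contained sketch of the idea using only plain PyTorch — the toy model and variable names are illustrative stand-ins, not the library’s actual API:

```python
import torch
import torch.nn as nn

# toy model standing in for resnet18, so the sketch runs on its own
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
)

captured = {}

def save_activation(module, inputs, output):
    # detach so we don't keep the autograd graph alive
    captured["act"] = output.detach()

# register the hook on the layer whose output we want to see
handle = model[0].register_forward_hook(save_activation)

with torch.no_grad():
    model(torch.rand(1, 3, 32, 32))  # stand-in for a camera frame

handle.remove()  # stop capturing once done

act = captured["act"]  # shape 1x4x32x32; each channel can be shown with plt.imshow
```

A visualizer library essentially wraps this hook mechanism and re-renders the captured activations as images in the notebook each time the model runs.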

Please let me know if you find this useful, or have any other questions!


Hi John,
Thanks for your reply!
Although operation in real-time would be amazing, the idea would be that students would identify regions in an image based on what they see as a dominant feature and compare these to images processed by the Jetson.
If you’re able to look into the details I’d be very grateful…

I’ll definitely try out your library when I get back into the office!


Hi Mark,

Sorry for the delayed response.

I just wanted to follow up since I ended up doing this manually recently. One method for determining which regions contribute is to compute the gradient of the output with respect to the input data. You can do this in PyTorch like so:

data = ...  # load an image batch, say 1x3x224x224
data.requires_grad = True
if data.grad is not None:
    data.grad.zero_()  # clear any stale gradient from a previous backward pass

output = model(data)

scalar = output[0, 0]  # compute some scalar measure.. here we pick the value of the first output neuron
scalar.backward()      # compute gradient w.r.t. input

grad = data.grad[0]  # gradient (change in output w.r.t. input data) of shape 3x224x224

data_vis = torch.sqrt(torch.sum(grad**2, dim=0))  # per-pixel gradient norm, shape 224x224


Apologies, I haven’t yet tested the above code line-for-line, but hopefully the approach helps.
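To turn that per-pixel gradient norm into the overlay you originally described, here is one self-contained sketch that alpha-blends a red heat layer over the image. The tiny stand-in model, tensor names, and blending choice are all illustrative assumptions — swap in your actual network and camera frame:

```python
import torch
import torch.nn as nn

# tiny stand-in model so the sketch runs on its own; replace with your resnet18
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
model.eval()

image = torch.rand(1, 3, 224, 224)  # stand-in for a camera frame, values in [0, 1]
image.requires_grad_(True)

output = model(image)
output[0, output[0].argmax()].backward()  # gradient of the top class w.r.t. the input

# per-pixel gradient norm, rescaled to [0, 1]
saliency = image.grad[0].pow(2).sum(dim=0).sqrt()
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)

# alpha-blend a red heat layer over the original image:
# the strongest-contributing regions get the most opaque tint
alpha = 0.5 * saliency
red = torch.zeros_like(image[0])
red[0] = 1.0
overlay = (1 - alpha) * image[0].detach() + alpha * red  # 3x224x224, values in [0, 1]
# in a notebook: plt.imshow(overlay.permute(1, 2, 0))
```

Students could then compare this highlighted region against the region they picked out by eye, which sounds like a good fit for the activity you described.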

Please let me know if this helps or you have any questions.