Hey,
I have added some TensorBoard code to train_ssd.py from Dusty’s repo, and it was pretty straightforward at first. My graphs need smoothing, but at least they appear!
The main issue I am having is working out which bits go where in terms of the results that the validation inference returns.
I need to take the groundtruths and predictions and draw them side by side in tensorboard, as in the example below:
This is a screenshot from my training using the TensorFlow Object Detection API. It uses visualization techniques that are different from DetectNet's.
So the code I have entered so far:
import datetime

from torch.utils.tensorboard import SummaryWriter

# Writer will output to ./runs/ directory by default;
# `comment` is appended to the default run directory name as a suffix
current_day = datetime.date.today()
current_moment = datetime.datetime.now()
current_time = current_moment.strftime("%H-%M-%S")
board_name = f'{current_day}_{current_time}'
tb = SummaryWriter(comment=board_name)
def train(loader, net, criterion, optimizer, device, debug_steps=100, epoch=-1):
    net.train(True)
    running_loss = 0.0
    running_regression_loss = 0.0
    running_classification_loss = 0.0
    epoch_loss = 0.0  # accumulated over the whole epoch, never reset
    for i, data in enumerate(loader):
        images, boxes, labels = data
        images = images.to(device)
        boxes = boxes.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        confidence, locations = net(images)
        regression_loss, classification_loss = criterion(confidence, locations, labels, boxes)  # TODO CHANGE BOXES
        loss = regression_loss + classification_loss
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        running_regression_loss += regression_loss.item()
        running_classification_loss += classification_loss.item()
        epoch_loss += loss.item()
        # The earlier (locations == boxes).sum() "accuracy" is dropped: it compares
        # float regression outputs with encoded targets element-wise, so it is
        # almost always zero and does not measure detection quality.
        if i and i % debug_steps == 0:
            avg_loss = running_loss / debug_steps
            avg_reg_loss = running_regression_loss / debug_steps
            avg_clf_loss = running_classification_loss / debug_steps
            logging.info(
                f"Epoch: {epoch}, Step: {i}/{len(loader)}, " +
                f"Avg Loss: {avg_loss:.4f}, " +
                f"Avg Regression Loss {avg_reg_loss:.4f}, " +
                f"Avg Classification Loss: {avg_clf_loss:.4f}"
            )
            # TensorBoard scalar entry: use a global step so that every logging
            # window gets its own x value instead of stacking on `epoch`
            step = epoch * len(loader) + i
            tb.add_scalar('Avg Loss', avg_loss, step)
            tb.add_scalar('Avg Regression Loss', avg_reg_loss, step)
            tb.add_scalar('Avg Classification Loss', avg_clf_loss, step)
            tb.flush()
            running_loss = 0.0
            running_regression_loss = 0.0
            running_classification_loss = 0.0
    # Per-epoch mean: divide by the number of batches, not len(data),
    # which is just the length of the (images, boxes, labels) tuple
    tb.add_scalar('Training Loss', epoch_loss / max(len(loader), 1), epoch)
def test(loader, net, criterion, device):
    net.eval()
    running_loss = 0.0
    running_regression_loss = 0.0
    running_classification_loss = 0.0
    num = 0
    for _, data in enumerate(loader):
        images, boxes, labels = data
        images = images.to(device)
        boxes = boxes.to(device)
        labels = labels.to(device)
        num += 1

        with torch.no_grad():
            confidence, locations = net(images)
            regression_loss, classification_loss = criterion(confidence, locations, labels, boxes)
            loss = regression_loss + classification_loss
            # TODO: write images to tensorboard of detections and ground truths

        running_loss += loss.item()
        running_regression_loss += regression_loss.item()
        running_classification_loss += classification_loss.item()
    return running_loss / num, running_regression_loss / num, running_classification_loss / num
What the current output looks like:
Some of the scalars do not look right to me…
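When scalars look odd, two things worth checking are the step passed to `add_scalar` and the divisor used for the running sums. Logging several points at the same step stacks them on one x value, and dividing by anything other than the number of accumulated batches skews the mean. Below is a minimal sketch of that bookkeeping in plain Python, so it runs without a GPU. `global_step` and `LossMeter` are hypothetical names for illustration and are not part of the repo.

```python
def global_step(epoch, batch_idx, steps_per_epoch):
    """Map (epoch, batch index) to one monotonically increasing x value."""
    return epoch * steps_per_epoch + batch_idx


class LossMeter:
    """Tracks a windowed mean (reset after each log) and an epoch mean."""

    def __init__(self):
        self.window_sum = 0.0
        self.window_count = 0
        self.epoch_sum = 0.0
        self.epoch_count = 0

    def update(self, value):
        self.window_sum += value
        self.window_count += 1
        self.epoch_sum += value
        self.epoch_count += 1

    def pop_window_mean(self):
        # Divide by the number of values actually seen, then reset the window.
        mean = self.window_sum / max(self.window_count, 1)
        self.window_sum, self.window_count = 0.0, 0
        return mean

    def epoch_mean(self):
        return self.epoch_sum / max(self.epoch_count, 1)


# Simulated losses for epoch 2 with 6 batches, logging every 3 batches.
meter = LossMeter()
logged = []
losses = [4.0, 2.0, 3.0, 1.0, 1.0, 1.0]
for i, loss in enumerate(losses):
    meter.update(loss)
    if (i + 1) % 3 == 0:
        logged.append((global_step(2, i, len(losses)), meter.pop_window_mean()))
```

In `train()`, each `(step, mean)` pair would go to something like `tb.add_scalar('Avg Loss', mean, step)`, and `meter.epoch_mean()` would be logged once per epoch with `epoch` as the step.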
I am mostly working from the official docs and whatever I can find online, but I have not found many people doing this particular thing with DetectNet :/
https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html
A point in the right direction as to how I go about extracting what I need from the results in test() to write to the board would be greatly appreciated!
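On what `test()` has available: as far as I can tell, in this mode the network returns raw per-prior `confidence` scores and `locations` offsets, and the `boxes` from the loader are targets encoded against the same priors, so both need decoding before they can be drawn. The sketch below uses NumPy and assumes the standard SSD encoding with priors in `(cx, cy, w, h)` form. The variance defaults of 0.1 and 0.2, the background class at index 0, and the function names are assumptions to check against the repo's config and box utilities. It skips non-maximum suppression.

```python
import numpy as np


def decode_locations(locations, priors, center_variance=0.1, size_variance=0.2):
    """Turn SSD offsets (N, 4) into corner-form boxes (N, 4) in [0, 1] coordinates.

    Assumes priors are (cx, cy, w, h) and the standard SSD encoding.
    """
    centers = locations[:, :2] * center_variance * priors[:, 2:] + priors[:, :2]
    sizes = np.exp(locations[:, 2:] * size_variance) * priors[:, 2:]
    return np.concatenate([centers - sizes / 2, centers + sizes / 2], axis=1)


def select_detections(confidence, boxes, threshold=0.5):
    """Keep priors whose best non-background softmax score exceeds threshold."""
    exp = np.exp(confidence - confidence.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    labels = probs[:, 1:].argmax(axis=1) + 1  # class 0 assumed to be background
    scores = probs[np.arange(len(probs)), labels]
    keep = scores > threshold
    return boxes[keep], labels[keep], scores[keep]


# Two toy priors with zero offsets: the decoded boxes are the priors themselves.
priors = np.array([[0.5, 0.5, 0.2, 0.2], [0.25, 0.25, 0.1, 0.1]])
locations = np.zeros((2, 4))
confidence = np.array([[0.0, 5.0], [5.0, 0.0]])  # first prior: object, second: background

decoded = decode_locations(locations, priors)
kept_boxes, kept_labels, kept_scores = select_detections(confidence, decoded)
```

Inside `test()` this would be applied per image after something like `locations[b].cpu().numpy()`. For ground truths, the same decode could be applied to the target `boxes[b]` rows where `labels[b] > 0`; several priors usually match one object, so those boxes will overlap.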
Once I have completed this venture into integrating it properly I will post the full code for others.
EDIT: I have read through the code in the API: https://github.com/tensorflow/models/blob/master/research/object_detection/utils/visualization_utils.py
as well as detectnet.cpp etc., and I can see that getting the original boxes drawn and displayed could be difficult. However, even just getting the detection images with boxes written to TensorBoard, without the originals next to them, would be great. It makes it so much easier to scroll back through points in training and work out the sweet spots for balance.
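For the drawing itself, one option that avoids extra dependencies is to paint rectangle outlines straight into a NumPy image and join a ground-truth copy and a prediction copy into one panel. This is only a sketch under assumptions: boxes are corner-form `(x1, y1, x2, y2)` in normalized coordinates, the image is HWC `uint8`, and `draw_boxes` and `side_by_side` are hypothetical helper names.

```python
import numpy as np


def draw_boxes(image, boxes, color, thickness=2):
    """Return a copy of an HWC uint8 image with rectangle outlines drawn.

    boxes are (x1, y1, x2, y2) in normalized [0, 1] coordinates.
    """
    out = image.copy()
    h, w = out.shape[:2]
    t = thickness
    for x1, y1, x2, y2 in boxes:
        # Scale to pixels and clamp to the image bounds.
        x1, x2 = (int(np.clip(round(v * w), 0, w - 1)) for v in (x1, x2))
        y1, y2 = (int(np.clip(round(v * h), 0, h - 1)) for v in (y1, y2))
        out[y1:y1 + t, x1:x2 + 1] = color                   # top edge
        out[max(y2 - t + 1, 0):y2 + 1, x1:x2 + 1] = color   # bottom edge
        out[y1:y2 + 1, x1:x1 + t] = color                   # left edge
        out[y1:y2 + 1, max(x2 - t + 1, 0):x2 + 1] = color   # right edge
    return out


def side_by_side(image, gt_boxes, pred_boxes):
    """Ground truth (green) on the left, predictions (red) on the right."""
    left = draw_boxes(image, gt_boxes, color=(0, 255, 0))
    right = draw_boxes(image, pred_boxes, color=(255, 0, 0))
    return np.concatenate([left, right], axis=1)


# Toy example: a black 10x20 image with one box of each kind.
image = np.zeros((10, 20, 3), dtype=np.uint8)
panel = side_by_side(image,
                     gt_boxes=[(0.1, 0.2, 0.5, 0.8)],
                     pred_boxes=[(0.5, 0.2, 0.9, 0.8)])
```

The panel could then be written with `tb.add_image('val/gt_vs_pred', panel, epoch, dataformats='HWC')`. The tensors coming out of the loader are normalized CHW floats, so the mean/std transform would need to be undone and the result converted to HWC `uint8` first. `SummaryWriter.add_image_with_boxes` may be a shorter route if only one set of boxes per image is needed.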