Getting Model Predictions From Multiple Cameras (DLI Course)

Hi everyone, I am currently going through the DLI course. I tried the classification task and it works great, with really fast inference time. I wanted to get the same model predictions but from two cameras (a CSI camera and a USB webcam), so I duplicated the code from the Live Inference section to take both camera inputs and make predictions, and it worked! Unfortunately the latency is quite high with two cameras, and I suspect the problem is that I just duplicated the code (there must be a more efficient way). Anyway, I'm posting my code here and any feedback would be highly appreciated. Thanks!

import threading
import time

import ipywidgets
import torch.nn.functional as F

from utils import preprocess

CATEGORIES = ['background', 'bluecar', 'yellowcar']

state_widget = ipywidgets.ToggleButtons(options=['stop', 'live'], description='state', value='stop')
prediction_widget = ipywidgets.Text(description='prediction')
prediction_widget1 = ipywidgets.Text(description='prediction')
score_widgets = []   # score sliders for the first camera
score_widgets1 = []  # score sliders for the second camera
for category in CATEGORIES:
    score_widget = ipywidgets.FloatSlider(min=0.0, max=1.0, description=category, orientation='vertical')
    score_widget1 = ipywidgets.FloatSlider(min=0.0, max=1.0, description=category, orientation='vertical')
    score_widgets.append(score_widget)
    score_widgets1.append(score_widget1)

def live(state_widget, model_trt, camera, camera1, prediction_widget, prediction_widget1, score_widgets, score_widgets1):
    while state_widget.value == 'live':
        # Grab the latest frame from each camera
        image = camera.value
        image1 = camera1.value
        preprocessed = preprocess(image)
        preprocessed1 = preprocess(image1)
        # Two separate forward passes, one per camera
        output = model_trt(preprocessed)
        output1 = model_trt(preprocessed1)
        # Convert logits to per-category probabilities
        output = F.softmax(output, dim=1).detach().cpu().numpy().flatten()
        output1 = F.softmax(output1, dim=1).detach().cpu().numpy().flatten()
        category_index = output.argmax()
        category_index1 = output1.argmax()
        prediction_widget.value = CATEGORIES[category_index]
        prediction_widget1.value = CATEGORIES[category_index1]
        for i, score in enumerate(list(output)):
            score_widgets[i].value = score
        for i, score1 in enumerate(list(output1)):
            score_widgets1[i].value = score1        
            
def start_live(change):
    if change['new'] == 'live':
        execute_thread = threading.Thread(target=live, args=(state_widget, model_trt, camera, camera1, prediction_widget, prediction_widget1, score_widgets, score_widgets1))
        execute_thread.start()

state_widget.observe(start_live, names='value')

live_execution_widget = ipywidgets.VBox([
    ipywidgets.HBox(score_widgets),
    prediction_widget
])

live_execution_widget1 = ipywidgets.VBox([
    ipywidgets.HBox(score_widgets1),
    prediction_widget1
])

live_execution_widget2 = state_widget

Hi @Abuelgasim, I'm not sure what the best way to handle multiple input streams with the events/callbacks in a Jupyter notebook would be, but perhaps @jaybdub may have some suggestions. You may also want to try adding some performance timing to your live function to see exactly where the additional latency is occurring. That may give you a better idea of how to structure it.
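
For example, something along these lines (the timed helper and the labels are just an illustrative sketch, not part of the course code):

import time
import torch

def timed(label, fn, *args):
    # CUDA kernels launch asynchronously, so wait for the GPU to finish
    # before and after the call; otherwise GPU work looks near-instant.
    torch.cuda.synchronize()
    start = time.perf_counter()
    result = fn(*args)
    torch.cuda.synchronize()
    print(f'{label}: {(time.perf_counter() - start) * 1000:.1f} ms')
    return result

Then inside live() you could wrap each stage, e.g.:

preprocessed = timed('preprocess cam0', preprocess, image)
output = timed('inference cam0', model_trt, preprocessed)

That should show whether capture, preprocessing, or the two forward passes dominate the loop time.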

Hi @dusty_nv, thank you very much for the suggestion; I will definitely try it out. I'll also wait for a response from @jaybdub in case he is familiar with handling multiple input streams in JupyterLab and can assist as well.

Best,
Abu

Hi @Abuelgasim,

One way to increase throughput with minimal modification to your code might be to run the neural network on both images as a single batch.

For example, in your live function:

input_cat = torch.cat([preprocessed, preprocessed1], dim=0)  # stack the two frames along the batch dimension
output_cat = model_trt(input_cat)  # one forward pass covers both cameras
output = output_cat[0:1]   # first camera's result
output1 = output_cat[1:2]  # second camera's result

You would need to optimize the model with TensorRT using max_batch_size=2 for this to work.
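
If you optimized with torch2trt as in the course, the conversion would look roughly like this (a sketch; the 224x224 input shape is an assumption based on the classification notebook, so adjust it to your model):

import torch
from torch2trt import torch2trt

# 'model' is your trained PyTorch classifier on the GPU; the example input
# fixes the per-image shape, while max_batch_size=2 lets the engine
# accept both camera frames in one call.
data = torch.zeros((1, 3, 224, 224)).cuda()
model_trt = torch2trt(model, [data], max_batch_size=2)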

As dusty mentioned, profiling the code will give the best indication of where you can improve.

It might also be worth trying each camera independently, to see if the pipeline is bottlenecked by one particular camera.
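
A quick way to check that, assuming the jetcam camera classes from the course (which expose a read() method), might be something like this hypothetical helper:

import time

def capture_rate(cam, n=50):
    # Grab n fresh frames and return the average seconds per frame,
    # so a slow camera shows up immediately.
    start = time.perf_counter()
    for _ in range(n):
        _ = cam.read()
    return (time.perf_counter() - start) / n

print('CSI camera :', capture_rate(camera), 's/frame')
print('USB webcam :', capture_rate(camera1), 's/frame')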

Please let me know if this helps or you have any questions.

Best,
John

@jaybdub thank you very much, I will definitely try it out!
