Deep Learning for Computer Vision with MATLAB and cuDNN

Originally published at: https://developer.nvidia.com/blog/deep-learning-for-computer-vision-with-matlab-and-cudnn/

Deep learning is becoming ubiquitous. With recent advancements in deep learning algorithms and GPU technology, we are able to solve problems once considered impossible in fields such as computer vision, natural language processing, and robotics. Figure 1: Pet detection and recognition system. Deep learning uses deep neural networks which have been around for a few…

Nice writeup. For those interested in GPU systems with cuDNN preinstalled, readers here might find the following useful: http://exxactcorp.com/index...

Hi. Nice work. Could you share the images and videos you used for this benchmark, so that I can compare the results on my hardware? Thanks in advance.

Hi Wendell, glad you liked the post. The images and videos belong to my colleagues and unfortunately I don’t have permission to share all of them. You can find several dog and cat image datasets and videos on the internet that can be readily used for this task. Please note that your mileage may vary, since the solution is sensitive to the training images. For example, if your training data is small and only includes certain pet poses, your model may not be robust to all poses in the video. You may then need to gather more images to introduce pose invariance.

>> opticFlow = opticalFlowFarneback;

Undefined function or variable 'opticalFlowFarneback'.

Where can I find the opticalFlowFarneback library?

Hi Kadir, opticalFlowFarneback is part of the Computer Vision System Toolbox and was introduced in R2015b (the current release of MATLAB). Make sure you upgrade to this release and you should be able to run the example. Feel free to contact us if you have further questions or need help with the upgrade or usage: http://www.mathworks.com/su...
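
A quick way to check this on your own installation is sketched below. It assumes only that R2015b corresponds to MATLAB version 8.6; the messages are illustrative, and the check does not verify licensing.

```matlab
% Minimal availability check for opticalFlowFarneback (a sketch, not an
% official diagnostic). R2015b corresponds to MATLAB version 8.6.
if verLessThan('matlab', '8.6')
    disp('MATLAB is older than R2015b, so opticalFlowFarneback is not available.');
elseif isempty(which('opticalFlowFarneback'))
    % which returns an empty result when nothing by that name is on the path
    disp('opticalFlowFarneback was not found; check that the Computer Vision System Toolbox is installed.');
else
    disp('opticalFlowFarneback is available.');
end
```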

I set up the Computer Vision System Toolbox, but in the examples, the command mexOpencv example.cpp creates the example.mex and example.m files, and example.m is then built using this mex file. I only have a script file, so it returns the error "Undefined function or variable 'opticalFlowFarneback'." Is there any other solution, or a path or library setting that I have forgotten?

Hi Elif, see my response to Kadir below. If you have the R2015b version of MATLAB and the Computer Vision System Toolbox, you should be able to just run 'opticalFlowFarneback' without installing anything from external packages.
Here's the link to the function's documentation page:
http://www.mathworks.com/he...

I am having problems and I am using R2015a. I am using an RGB image. Should I use a grayscale image instead?

Reference to non-existent field 'normalization'.

Error in cnnPredict/cnnPreprocess (line 84)

im = imresize(im, cnnModel.net.normalization.imageSize(1:2));

Error in cnnPredict (line 26)

resTemp = vl_simplenn(cnnModel.net, cnnPreprocess(predImage(:,:,:,1)), [], []);

Error in PetDetectionRecognitionScript (line 18)

label = cnnPredict(cnnModel,img);

I have a problem. Under MatConvNet, the function vl_nnconv.m is empty. If you have it, could you kindly send me the zip folder "matconvnet-1.0-beta15"? It would be very helpful.

email: rahman3.1416@yahoo.com

Hi sadman,

"Reference to non-existent field 'normalization'" means that the CNN model you provided to the cnnPredict function doesn't have a field called 'normalization'. cnnPredict needs this field to do two things: (1) resize your input image so that it is compatible with the ImageNet network, and (2) subtract the ImageNet average image.
If you downloaded a pretrained ImageNet model from the VLFeat webpage as suggested in the code files, the model should already have the 'normalization' field that cnnPredict expects in order to make a prediction.
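
The two preprocessing steps can be sketched roughly as follows. The normalization struct and its values here are stand-ins made up for illustration; in practice they would come from the loaded model. imresize requires the Image Processing Toolbox.

```matlab
% Stand-in for the model's normalization data (values are assumptions).
normalization.imageSize = [224 224 3];
normalization.averageImage = 120 * ones(224, 224, 3, 'single');

% Stand-in RGB input image of arbitrary size.
im = single(uint8(255 * rand(300, 400, 3)));

% (1) Resize the image to the input size the network expects.
im = imresize(im, normalization.imageSize(1:2));

% (2) Subtract the average image. bsxfun also covers the case where the
% average is stored as a 1-by-1-by-3 per-channel mean.
im = bsxfun(@minus, im, normalization.averageImage);
```

If the field is missing, both steps fail at the first access to normalization.imageSize, which is the error reported above.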

Hi, I have a problem. I ran the code successfully but I didn't get the desired output. There was no bounding box around the dog or cat in the constructed video test.avi. Could someone please explain what's wrong?

Great, but how can I train a network on my own images to recognize pictures?
I don't want to download a pretrained .mat file.
Thank you in advance.

Hi Shashank Prasanna,

I have exactly the same problem :
>> imageSize = cnnModel.net.normalization.imageSize;
Reference to non-existent field 'normalization'.

The cnnModel.net was properly downloaded from VLFeat. I tried "imagenet-vgg-f.mat" and "imagenet-matconvnet-vgg-f.mat".

Where did I go wrong? Thank you.

Hi Nico, The version we used for this post is: matconvnet-1.0-beta15
It's possible that later releases store normalization differently.
See list of changes here:
http://www.vlfeat.org/matco...

The blog post outlines the steps you would take to use a pretrained CNN as a feature extractor. Alternatively, you could train a network from scratch; you should find code examples for doing so in the MatConvNet examples folder.

https://github.com/vlfeat/m...

Hi Nico,

I had the same problem with matconvnet-1.0-beta18, but there are only a few lines to fix in the code to get it working.
You simply need to update the network to make it compatible:

net = load('imagenet-vgg-f.mat');

cnnModel.net = vl_simplenn_tidy(net)

Those networks apparently have a slightly different structure than in earlier versions.

In Shashank Prasanna's function cnnPredict.m, simply add the "meta" struct field (e.g. cnnModel.net.normalization --> cnnModel.net.meta.normalization) in lines 78, 84 and 85:
78: classLabel = cnnModel.net.meta.classes.description(labelId)';
84: im = imresize(im, cnnModel.net.meta.normalization.imageSize(1:2));
85: im = bsxfun(@minus,im,cnnModel.net.meta.normalization.averageImage);

Hope that helps.
And thanks to Shashank Prasanna for the great blog post!
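
If you want code that tolerates both layouts, a lookup along these lines might help. It is only a sketch: oldNet and newNet are stand-in structs that mimic the two layouts described above, not real MatConvNet models.

```matlab
% Stand-in network structs mimicking the older and newer field layouts.
oldNet.normalization.imageSize = [224 224 3];
newNet.meta.normalization.imageSize = [224 224 3];

nets = {oldNet, newNet};
for k = 1:numel(nets)
    net = nets{k};
    if isfield(net, 'meta') && isfield(net.meta, 'normalization')
        normalization = net.meta.normalization;   % newer layout (after vl_simplenn_tidy)
    elseif isfield(net, 'normalization')
        normalization = net.normalization;        % older layout (e.g. beta15)
    else
        error('No normalization field found in the network struct.');
    end
    disp(normalization.imageSize);
end
```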

Hi Daniel and Shashank,

It works, thank you!
I thought about the vl_simplenn_tidy conversion but it was not enough.

Now I have a curious problem when I test cnnPredict:
Number of images: 1
Number of batches: 1
However, "summary(trainingLabels)" indicates the right number of images (more than 50 images in the cat and dog folders).
Any ideas, please?
Thank you again!

Hi Shashank Prasanna,
first of all thanks for your great blog post, really interesting and useful.
I have tried it and I've had a problem in this part

for ii = 1:numel(imset)
    for jj = 1:imset(ii).Count
        trainingImages(:,:,:,jj) = imresize(single(read(imset(ii),jj)),imageSize(1:2));
    end
end

PROBLEM --> Assignment has fewer non-singleton rhs dimensions than non-singleton subscripts

If I continue doing the following steps, in this one I get another problem:

svmmdl = fitcsvm(cnnFeatures,trainingLabels);

PROBLEM --> Error using classreg.learning.FullClassificationRegressionModel.prepareDataCR (line 138)
X and Y do not have the same number of observations.

Error in ClassificationSVM.prepareData (line 607)
[X,Y,W,dataSummary] = ...

Error in classreg.learning.FitTemplate/fit (line 205)
[X,Y,dataPrepOut{1:this.NDataPrepOut}] = ...

Error in ClassificationSVM.fit (line 237)
this = fit(temp,X,Y);

Error in fitcsvm (line 279)
obj = ClassificationSVM.fit(X,Y,varargin{:});

What could I do to solve it?
Thank you very much.
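
One plausible cause, offered as a guess rather than a confirmed diagnosis, is that some images are grayscale (H-by-W) while the array expects three channels. Note also that indexing by jj alone restarts at 1 for every image set, which could leave fewer images than labels. The sketch below uses a hypothetical cell array named images as stand-in data, and an assumed input size of 224-by-224-by-3; with an imageSet you would read each image with read(imset(ii),jj) and use one running counter across all sets. imresize requires the Image Processing Toolbox.

```matlab
% Stand-in data: one RGB and one grayscale image (assumption for illustration).
images = {uint8(255 * rand(40, 60, 3)), uint8(255 * rand(50, 50))};
imageSize = [224 224 3];   % assumed network input size

numImages = numel(images);
% Preallocate one slot per image so that no image overwrites another.
trainingImages = zeros([imageSize(1:2) 3 numImages], 'single');

for k = 1:numImages
    im = single(images{k});
    if size(im, 3) == 1
        im = repmat(im, [1 1 3]);   % replicate grayscale into three channels
    end
    trainingImages(:,:,:,k) = imresize(im, imageSize(1:2));
end
```

With one slot per image, the number of feature rows computed from trainingImages should match the number of labels passed to fitcsvm.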