Originally published at: https://developer.nvidia.com/blog/nvidia-digits-alzheimers-disease-prediction/
Pattern recognition and classification in medical image analysis have been of interest to scientists for many years. Machine learning techniques have enabled researchers to develop and utilize complicated models to classify or predict various abnormalities or diseases. Recently, successful applications of state-of-the-art deep learning architectures have rapidly expanded in medical imaging. Cutting-edge deep learning…
Why, after showing me a beautiful ROC curve, do you state the accuracy result in the results section rather than the ROC AUC value? The whole reason for ROC AUC is to avoid the pitfalls of using accuracy as your evaluation metric, i.e. high accuracies can be achieved concurrently with very low sensitivities or specificities in rare or common conditions, respectively. I'm left wondering what the actual AUC value is. If you need help with medical/scientific writing from someone who also knows machine learning, feel free to contact me. Chip Reuben, MS
Thank you for this article! It seems that axial slices at different locations/times from the *same* subject were used in both the training and testing sets, meaning that these sets are not separated at the subject level. Am I understanding this correctly?
Thanks for your interest in this work, and thank you for your attention to detail. In this work, we performed the classification at the slice level. As you understood, we created samples from all subjects' fMRI time series, then shuffled the data and created training and testing samples. The reported accuracy is for slice-level classification. One more thing I'd like to share: slices from a given subject are treated as independent samples even though they are highly correlated, which means our training and testing datasets are completely independent from each other only at the "slice level". I realize that medical imaging researchers are more interested in "subject-level" classification, which is why I continued the project.
In our more complete paper, DeepAD, we performed "subject-level" classification: we divided the subjects into two groups, training and testing, and then carried out the classification. Again, we achieved a very high accuracy rate. We also designed a decision-making algorithm to stabilize the prediction process when deciding whether a subject has Alzheimer's or not.
The beauty of the CNN architecture is that it produces a well-generalized model once it has been successfully trained and validated on a high volume of data. Please feel free to have a look at DeepAD at http://biorxiv.org/content/..., where we used a huge dataset to classify slice-level and subject-level structural and functional MRI data.
Hope it helps,
Thanks for your interest in this paper.
There are different views on whether to use ROC/AUC as the classification metric. I personally tend to stick with accuracy as the performance metric since it makes more sense in our research field. The reason I generated the ROC curve was to ensure that the classification had been successfully performed. In addition, ROC/AUC is more informative for imbalanced data, which is not the case in our work.
Thanks for your comment and offer. I will keep your contact information.
Thank you for the elaborate response and the link to the DeepAD paper. After reading it, I'm still not clear how the subject-level classification was done and I would appreciate some clarification since the accuracy you report is pretty remarkable.
In section 6 of that paper you mention that "the adopted LeNet model and GoogleNet were adjusted..." - by "adjusted" do you mean "fine-tuned"? I.e. have you used the networks you previously trained for the slice-level classification and fine-tuned the last layers on the subject-level task? If so, I'm assuming that none of the slices from the subjects chosen for the subject-level classification were seen by the networks during the training stage of the slice-level classification. This would leave very little data to fine-tune with as you only have 52AD/92NC for rs-fMRI and 211AD/91NC for MRI.
Can you please provide more details as to what layers you fine-tuned and how much data was used for this purpose?
1 - We trained the networks from scratch; no fine-tuning was performed, as clearly mentioned in the paper. The initial LeNet and GoogleNet architectures were designed for a different number of classes, but I used them for binary classification, so some adjustments were required.
2 - If you take a look at the pipeline and data conversion section of the paper, I explain how I extracted 2D slices from the data to generate a huge dataset for both the fMRI and MRI pipelines.
As I said in my previous comment, for "subject-level" classification the subjects were divided into training and testing groups first, and the 2D slice samples were generated afterwards. This means no slice from the same subject appeared in both the training and testing datasets; in other words, the training and testing data had NO subjects in common.
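For readers trying to replicate this, here is a rough sketch of the idea of splitting by subject before generating slices (the function and variable names are my own illustration, not code from the paper):

```python
import random

def subject_level_split(subject_ids, test_fraction=0.3, seed=0):
    """Split SUBJECTS (not slices) into training and testing groups,
    so that no slice from a test subject can ever appear in training."""
    rng = random.Random(seed)
    ids = sorted(subject_ids)
    rng.shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])

# 2D slices are generated only AFTER this split, keyed by subject id,
# so the two slice sets share no subjects.
train_ids, test_ids = subject_level_split(range(10))
```

The key point is simply that the shuffle happens over subject identifiers, not over individual slices.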
Regarding the reported accuracy: some research groups using different strategies have achieved very high accuracy rates, and I mention them in the literature review; please look at the comparison table. However, I was able to improve the accuracy rate for MRI data through much more accurate preprocessing and some tricks in DL. In addition, fMRI was used for this classification for the first time, and thanks to a very accurate and massive preprocessing pipeline and certain optimizations, I achieved the highest accuracy rate reported so far.
Hope it helps. If it is still unclear or you need more clarification, I will be more than happy to help you or anybody else replicate the DeepAD paper and achieve the same accuracy, as long as you use exactly the same methods I used in the paper.
You can reach me at firstname.lastname@example.org.
Thanks for your quick response, Saman! For the subject level, I understand that you do the subject-level separation prior to generating the 2D slices for classification, but since your test set includes multiple slices from the same subject, how do you calculate subject-level accuracy? Do you average accuracy across all slices of the same test subject? Or is it still slice-level accuracy?
That’s my pleasure to help.
In the subject-level experiments, the reported accuracy is still based on slice classification. How could we report an accuracy rate for a single subject? It would not make sense. What you can do instead is measure the probability of a subject being AD or NC. Does that make sense?
What I developed was a decision-making algorithm that counts the number of slices classified as AD or NC, calculates the probability, and votes with the majority. Let's say (the number is just an example) that for a given subject with 1000 slices, 900 slices were recognized as AD; the probability of AD is then 90%, and the decision maker votes for AD.
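A minimal sketch of that majority-vote rule (my own illustrative names, not the paper's code):

```python
def subject_decision(slice_predictions, positive_label="AD", other_label="NC"):
    """Majority vote over all slice-level predictions for one subject.
    Returns the winning class and the fraction of slices that voted for it."""
    n_positive = sum(1 for p in slice_predictions if p == positive_label)
    prob = n_positive / len(slice_predictions)
    if prob >= 0.5:
        return positive_label, prob
    return other_label, 1 - prob

# The 900-of-1000 example from above:
label, confidence = subject_decision(["AD"] * 900 + ["NC"] * 100)
# label == "AD", confidence == 0.9
```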
In DeepAD, Table 5 and Figure 15 summarize what I explained above.
Feel free to post your comments or reach out to me if you have more questions.
That is a very nice paper and a useful report for every user, including me as a beginner in deep learning. My question: is the 97% accuracy the best accuracy you got from your data?
This is the accuracy averaged over five shufflings of the data in this conference paper.
As I showed in DeepAD, by updating the preprocessing pipeline and adding more training samples, I was able to achieve up to 99.9% for slice-level recognition.
Hi. I really admire your work and I am trying to replicate it. I have a question about Table 1: what exactly is a "volume"? I was treating a volume as one .nii image, but now I am totally confused about how you get such a huge total number of images.
Is there any other good tool to replace FSL-VBM for extracting GM? I have tried it and it takes a lot of time.
Dear Ammarah, Thanks for your interest in this paper and the expanded version DeepAD.
Let me answer both of your questions in this reply. I think there is a misunderstanding here: what we used in this paper was functional MRI data, which is 4D (3D × time). "Volume" in Table 1 means that the 3D volumes of a given subject were collected 300 times (time points).
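To see why the total image count gets so large, here is a back-of-the-envelope sketch (the slice and subject counts below are hypothetical examples, not numbers from the paper; only the 300 time points come from Table 1):

```python
# Hypothetical dimensions for illustration:
slices_per_volume = 45   # axial slices in one 3D volume (assumed)
time_points = 300        # 3D volumes collected per subject (Table 1)
n_subjects = 100         # assumed cohort size

# Each time point is a full 3D volume, and every volume contributes
# its axial slices as separate 2D samples, so the totals grow fast:
samples_per_subject = slices_per_volume * time_points
total_samples = samples_per_subject * n_subjects
print(samples_per_subject, total_samples)  # 13500 1350000
```

So a single .nii file of rs-fMRI data already contains hundreds of 3D volumes, which is where the huge slice counts come from.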
Regarding your second question, FSL-VBM is actually a tool for structural MRI, not functional MRI. You can also use SPM8 to process your structural MRI data.
Please also clarify how to select slices for sMRI. I get about 256 slices per subject/NIfTI image, then I discard the slices at the start and end that are just black and contain no information. That leaves about 70 to 100 useful slices, but they cover different brain portions, from very small top slices to good-looking axial slices. Am I doing it correctly? Also, have you used data augmentation for sMRI?
Good job. Thanks for sharing your experiences.
Did you write the ROC curve code in DIGITS? How can I get the ROC curve and confusion matrix in DIGITS? Can I add any code?
Hi there, thanks for your interest in this work and paper.
Actually, I generated the ROC curves outside of DIGITS. First, you need to use the Classify Many option in DIGITS to get the predicted labels and scores for your testing samples. Next, save the results as HTML files (or any format you are more comfortable with) and write your own code to draw the ROCs. I did it in MATLAB.
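As a sketch of that post-processing step in pure Python (my own illustration; the author used MATLAB, and the names below are not from DIGITS): once you have exported per-sample labels and scores from Classify Many, the ROC points and AUC can be computed like this:

```python
def roc_points(labels, scores):
    """Compute (FPR, TPR) points by sweeping a threshold over the scores.
    labels: 1 for the positive class (e.g. AD), 0 for the negative (NC).
    Ties in score are handled naively here."""
    pairs = sorted(zip(scores, labels), reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _score, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# A perfectly separated toy example gives AUC = 1.0:
pts = roc_points([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

If you prefer not to roll your own, scikit-learn's `roc_curve` and `roc_auc_score` do the same job.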
thanks a million
Dear Prof. Sarraf,
Thanks for your work. This may be a very stupid question. In your original paper, you just split the data into training and testing datasets, and you gave loss and accuracy figures over the 30 epochs for both training and testing. From what I understand, your "testing data" in the original paper is actually used to validate the model, not as a real testing dataset. How did you calculate the accuracy on your testing data?
In the post here, you split the data into three datasets: training, validation, and testing. You mentioned, 'We repeated the entire dataset generation and classification process five times for 5-fold cross validation, achieving an average accuracy rate of 96.85%.' So this accuracy is based on a real testing dataset, unlike in the original paper, no?
Look forward to your response
The accuracy rates reported in this tutorial were extracted from the original paper. Whether you use a testing dataset or separate testing/validation datasets, as long as the model is evaluated against unseen data, the evaluation is completely valid.
Hope it helps.