how to use DataParallel on TX2?


When I use DataParallel on the TX2, my Python 3.5 code gets killed,

but on my GPU desktop server the same code works fine.

I import DataParallel from torch.nn.parallel

The GPU desktop server's PyTorch version is 0.4.1

The TX2's PyTorch version is 0.4.0a0+3749c58

Maybe it's a PyTorch version issue, or something else?

How can I solve this issue?
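For reference, a minimal sketch of the kind of setup described above, using a generic stand-in model (the actual network is not shown in this thread). A common pattern is to wrap in DataParallel only when more than one GPU is present, so the wrapper is skipped entirely on a single-GPU board like the TX2:

```python
import torch
import torch.nn as nn

# Stand-in model; the real network in this thread is YOLO-based.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# Wrap in DataParallel only when more than one GPU is present;
# on a single-GPU board like the TX2 the wrapper only adds overhead.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

out = model(torch.randn(4, 8).to(device))
```

The same script then runs unchanged on the multi-GPU desktop and on the TX2.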


You say “killed”. Did it run out of memory? Run something like “htop” and watch RAM usage to check whether that is the case.


I ran “htop” and my code at the same time;

when RAM reached 100%, my code was killed.

So how can I solve this issue?

Maybe release memory, or some other method?

I found this discussion (

maybe it is good for me, but my disk space is not enough…

/dev/root 28G 26G 578M 98% /

I can’t give a complete answer, but I’m sure someone else will comment on this particular case.

In general, anything using CUDA can’t use swap space…but other competing processes probably can, so there is some use in adding swap.

Closing unneeded programs is of course another way to help if there is anything running and consuming RAM which you don’t need at that moment.

Sometimes in cases where lots of threads are being generated you can cut back to one or two threads and the memory required will go down (it still goes through all of the logic, but not at the same time).

Can someone suggest ways to lower RAM usage from a Python based CUDA program?
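The point above about cutting back threads can be sketched in PyTorch terms with the DataLoader's `num_workers` setting; the dataset here is a hypothetical toy stand-in:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for the real training/inference data.
dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))

# num_workers=0 keeps data loading in the main process. Each extra worker
# is a separate process with its own memory footprint, so dropping to zero
# workers trades loading speed for lower peak RAM usage.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

for x, y in loader:
    pass  # training/inference step would go here
```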


Maybe you can try a smaller batch size to decrease the memory usage.
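A minimal sketch of low-memory inference, assuming a hypothetical stand-in model: processing one sample at a time under `torch.no_grad()` avoids holding both a large batch and autograd bookkeeping in RAM at once:

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)  # hypothetical stand-in for the real model
model.eval()

inputs = torch.randn(16, 8)

# Run inference one sample at a time and without autograd bookkeeping;
# both choices cut peak memory compared to one large batch with gradients.
with torch.no_grad():
    outputs = torch.cat([model(inputs[i:i + 1]) for i in range(inputs.size(0))])
```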


I already cut down to one model on the TX2…

Now I want to compress my model to make it smaller.

Is this a good approach?


Hi AastaLLL,

The batch size is already set to 1, so it can’t get any smaller.

Thanks for your answer.

Also, my classmate suggested using TensorRT; maybe I will try it?


Sure, TensorRT will reduce memory consumption considerably.

But why do you want to use DataParallel?
Do you want to use it for multi-GPU training?
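Part of TensorRT's memory savings comes from reduced-precision (e.g. FP16) execution, which the TX2 supports. The storage side of that effect can be seen even in plain PyTorch; the layer below is just an illustration:

```python
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024)  # illustrative layer, not the real network

# Count the bytes used by the parameters in fp32, then convert to fp16.
# TensorRT gets part of its memory savings from this kind of
# reduced-precision storage and execution.
fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
model_fp16 = model.half()
fp16_bytes = sum(p.numel() * p.element_size() for p in model_fp16.parameters())
```

Here `fp16_bytes` is exactly half of `fp32_bytes`, since each parameter shrinks from 4 bytes to 2.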



Because I ran it on the GPU server first,

and then I wanted to run it on the TX2.

But the TX2 has only one GPU, so DataParallel doesn’t really do anything there, I think.

So I will look into TensorRT as the next step.

Because I changed YOLO to Tiny-YOLO on the TX2, it increased the fps but lost accuracy…