Some LLMs require a large amount of GPU memory. I am considering a K80 card, which has two GPU modules. Could someone please clarify whether the 24 GB of RAM is shared between the GPUs, or is it dedicated RAM divided between them? In other words, will I be able to use the entire 24 GB with one GPU?
See this previous thread:
Note that CUDA 12.x removed support for sm_35 and sm_37 (the K80 is compute capability 3.7), meaning the Tesla K80 is no longer supported; you would be limited to CUDA 11.x or earlier.
Note also that this is a passively cooled card. Unless you run this in a server enclosure, you will need to construct your own cooling solution.
There are aftermarket cooling solutions for this card. But the 24 GB is not a shared pool: each of the two GPUs has its own dedicated 12 GB, so a single GPU can only address 12 GB. That limits its usefulness for large models anyway.
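If you want to verify this yourself, here is a minimal sketch using the CUDA runtime API (assuming a toolkit that still supports sm_37, i.e. CUDA 11.x or earlier). On a K80 you should see two separate devices, each reporting roughly 12 GB rather than one device with 24 GB:

```cpp
// Enumerate the GPUs the CUDA runtime sees and print each one's name,
// compute capability, and dedicated memory. A K80 appears as two devices,
// each with its own ~12 GB of memory.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("Device %d: %s, sm_%d%d, %.1f GB dedicated memory\n",
               i, prop.name, prop.major, prop.minor,
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```

The same information is available from nvidia-smi, which also lists the two GPUs of a K80 as separate devices with separate memory.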