Can NeMo (GitHub: NVIDIA/NeMo, a toolkit for conversational AI) and Megatron be used without a GPU, on CPU only?
SunilJB
September 30, 2022, 7:04am
Hi @kevin129
Pipeline parallelism and tensor parallelism run only on GPUs, so I don’t think NeMo Megatron will run on CPU alone.
However, some of the other NeMo models might work on CPU, depending on ONNX opset support. Please refer to the issue linked below:
opened 10:57AM - 11 Oct 19 UTC
closed 07:44PM - 18 Oct 19 UTC
Hi,
I am currently trying to run 'simplest_example.py' on a CPU within a Docker container.
I have tried modifying the code to run on CPU by passing:
- "placement=DeviceType.CPU" to the Factory, which produces a CUDA-related error:
> Traceback (most recent call last):
  File "simplest_example.py", line 27, in <module>
    optimizer="sgd")
  File "/opt/conda/lib/python3.6/site-packages/nemo_toolkit-0.8-py3.6.egg/nemo/core/neural_factory.py", line 526, in train
    stop_on_nan_loss=stop_on_nan_loss)
  File "/opt/conda/lib/python3.6/site-packages/nemo_toolkit-0.8-py3.6.egg/nemo/backends/pytorch/actions.py", line 1022, in train
    'amp_min_loss_scale', 1.0))
  File "/opt/conda/lib/python3.6/site-packages/nemo_toolkit-0.8-py3.6.egg/nemo/backends/pytorch/actions.py", line 359, in __initialize_amp
    opt_level=AmpOptimizations[optim_level],
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/frontend.py", line 358, in initialize
    return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/_initialize.py", line 170, in _initialize
    check_params_fp32(models)
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/_initialize.py", line 92, in check_params_fp32
    name, param.type()))
  File "/opt/conda/lib/python3.6/site-packages/apex/amp/_amp_state.py", line 32, in warn_or_err
    raise RuntimeError(msg)
RuntimeError: Found param fc1.weight with type torch.FloatTensor, expected torch.cuda.FloatTensor.
When using amp.initialize, you need to provide a model with parameters
located on a CUDA device before passing it no matter what optimization level
you chose. Use model.to('cuda') to use the default device.
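
The error message above says that `amp.initialize` needs parameters on a CUDA device, so on a CPU-only machine the mixed-precision step has to be skipped rather than reconfigured. The decision can be sketched in plain Python as follows. `TrainingPlan` and `plan_training` are hypothetical names used only for illustration, not part of NeMo or Apex, and the `cuda_available` flag is assumed to come from a check such as `torch.cuda.is_available()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingPlan:
    """Hypothetical container for the two choices that matter here."""
    device: str    # "cuda" or "cpu"
    use_amp: bool  # whether to call a CUDA-only mixed-precision initializer


def plan_training(cuda_available: bool, want_amp: bool = True) -> TrainingPlan:
    """Pick a device and decide whether mixed precision can be enabled.

    Assumption: the mixed-precision initializer only accepts CUDA parameters,
    as the error message above states for amp.initialize.
    """
    if cuda_available:
        return TrainingPlan(device="cuda", use_amp=want_amp)
    # No GPU: stay on CPU and skip the CUDA-only initializer entirely.
    return TrainingPlan(device="cpu", use_amp=False)


if __name__ == "__main__":
    # In a real script this flag would come from torch.cuda.is_available().
    print(plan_training(cuda_available=False))
```

With a plan like this, the training script would call the mixed-precision initializer only when `use_amp` is true, which avoids the `torch.cuda.FloatTensor` check on CPU.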
To fix that issue, I additionally passed:
- 'optimization_level=1' to prevent APEX from being called, which returned:
> 2019-10-11 09:32:10,688 - WARNING - Data Layer does not have any weights to return. This get_weights call returns None.
Starting .....
Starting epoch 0
Traceback (most recent call last):
  File "simplest_example.py", line 27, in <module>
    optimizer="sgd")
  File "/opt/conda/lib/python3.6/site-packages/nemo_toolkit-0.8-py3.6.egg/nemo/core/neural_factory.py", line 526, in train
    stop_on_nan_loss=stop_on_nan_loss)
  File "/opt/conda/lib/python3.6/site-packages/nemo_toolkit-0.8-py3.6.egg/nemo/backends/pytorch/actions.py", line 1184, in train
    final_loss.get_device()))
RuntimeError: Device index must not be negative
How do I run the example on CPU? Thanks.
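
The second error, "Device index must not be negative", is consistent with `torch.Tensor.get_device()` returning -1 for a tensor that lives on the CPU and that value then being used as a CUDA device index. Whether line 1184 of `actions.py` does exactly this is an assumption; the traceback only shows `final_loss.get_device()` being passed as an argument. A minimal sketch of a guard, using a hypothetical helper `device_string_from_index` that is not part of NeMo or PyTorch:

```python
def device_string_from_index(index: int) -> str:
    """Map a device index to a device string.

    torch.Tensor.get_device() returns -1 for CPU tensors, so a negative
    index is treated as "cpu" instead of being passed on as a CUDA index.
    This helper is hypothetical and not part of NeMo or PyTorch.
    """
    if index < 0:
        return "cpu"
    return f"cuda:{index}"


if __name__ == "__main__":
    for idx in (-1, 0, 1):
        print(idx, "->", device_string_from_index(idx))
```

In PyTorch code, reading `tensor.device` directly avoids the integer index altogether and works for both CPU and CUDA tensors.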
Thanks