I am trying to train the Sentence Transformer model cross-encoder/ms-marco-MiniLM-L-12-v2, but training only uses one GPU even though my machine has two. I tried DataParallel and DistributedDataParallel, but neither worked.
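The DataParallel attempt looked roughly like this (only a sketch, not my exact code):

import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('cross-encoder/ms-marco-MiniLM-L-12-v2')
# Wrap the model so both GPUs are visible to PyTorch
if torch.cuda.device_count() > 1:
    model = torch.nn.DataParallel(model)
model = model.to('cuda')
# fit() is then only reachable via model.module.fit(...), and training still runs on a single GPU

My full training script (model name, max sequence length, training examples, and output path are placeholders):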
from sentence_transformers import SentenceTransformer, losses
from torch.utils.data import DataLoader
# Replace 'model_name' and 'max_seq_length' with your actual model name and max sequence length
model_name = 'your_model_name'
max_seq_length = your_max_seq_length
# Load SentenceTransformer model
model = SentenceTransformer(model_name)
model.max_seq_length = max_seq_length
# Replace 'train_examples' with your actual training examples
train_examples = your_train_examples
# Create DataLoader for training
train_dataloader = DataLoader(train_examples, batch_size=16, drop_last=True, shuffle=True)
# Define the loss function
train_loss = losses.MarginMSELoss(model)
# Tune the model
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=500, warmup_steps=int(len(train_dataloader) * 0.1))
# Replace 'output_path' with the desired path for saving the trained model
output_path = 'your_output_path'
# Save the model after training
model.save(output_path)
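From what I have read, the newer SentenceTransformerTrainer API (sentence-transformers v3+) is supposed to handle multiple GPUs when the script is launched with torchrun or accelerate. Is switching to something like the sketch below (dataset contents and arguments are just placeholders) the intended way, or can model.fit itself be made to use both GPUs?

# Sketch assuming sentence-transformers v3+; not tested on my setup
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MarginMSELoss

model = SentenceTransformer('your_model_name')

# Placeholder data: MarginMSELoss expects (query, positive, negative) texts
# plus a float 'label' column holding the margin score
train_dataset = Dataset.from_dict({
    'query': ['example query'],
    'positive': ['relevant passage'],
    'negative': ['irrelevant passage'],
    'label': [0.5],
})

args = SentenceTransformerTrainingArguments(
    output_dir='your_output_path',
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MarginMSELoss(model),
)
trainer.train()

# Launch so that both GPUs are used, e.g.:
#   torchrun --nproc_per_node=2 train_script.py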