How does the shuffle parameter in DataLoader work? Is it related to “cross validation”?

Does only the “training data” get shuffled before every epoch while the validation data remains the same for each epoch?

Or does it get shuffled together with the “validation data”?

And the other question is: if shuffle=True is not cross validation, how could I do cross validation (dividing the data into folds and rotating which fold is used for validation) instead of using the regular method?

Thanks in advance!

Hi @100375195, it is best practice to shuffle the training dataset, because randomizing the order of the data can help the model converge. Some datasets may be ordered by class, etc., which can benefit from randomizing the order of the data samples.

The validation dataset doesn’t need to be shuffled because the order of its samples has no impact on the model. It doesn’t get shuffled along with the training dataset (unless you explicitly set shuffle=True on the validation DataLoader).
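A minimal sketch of this setup, using a small hypothetical TensorDataset in place of your real data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 10 samples with 2 features each
dataset = TensorDataset(torch.arange(20, dtype=torch.float32).reshape(10, 2))

# Training loader: the sample order is re-randomized at the start of every epoch
train_loader = DataLoader(dataset, batch_size=5, shuffle=True)

# Validation loader: shuffle defaults to False, so the order is fixed across epochs
val_loader = DataLoader(dataset, batch_size=5, shuffle=False)

# Iterating the validation loader twice yields samples in the same order
first_epoch = torch.cat([batch[0] for batch in val_loader])
second_epoch = torch.cat([batch[0] for batch in val_loader])
assert torch.equal(first_epoch, second_epoch)
```

Each fresh iteration over train_loader draws a new random permutation, while val_loader always walks the dataset in index order.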

I don’t believe PyTorch natively implements k-fold cross validation, so you may want to refer to these resources instead:
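One common pattern is to build the folds yourself and wrap each split with torch.utils.data.Subset. Here is a minimal sketch (the toy dataset and fold count are assumptions for illustration; in practice you would train a fresh model per fold):

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Hypothetical toy dataset of 10 samples
dataset = TensorDataset(torch.arange(10, dtype=torch.float32))

k = 5
indices = torch.randperm(len(dataset))      # shuffle once before splitting
folds = [indices[i::k] for i in range(k)]   # k roughly equal folds

for fold in range(k):
    # Current fold is the validation set; the rest form the training set
    val_idx = folds[fold].tolist()
    train_idx = torch.cat([folds[i] for i in range(k) if i != fold]).tolist()

    train_loader = DataLoader(Subset(dataset, train_idx),
                              batch_size=2, shuffle=True)
    val_loader = DataLoader(Subset(dataset, val_idx),
                            batch_size=2, shuffle=False)
    # ... train a fresh model on train_loader, then evaluate on val_loader ...
```

If you already use scikit-learn, its KFold class produces the same (train_idx, val_idx) splits and can be combined with Subset in the same way.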

Excellent

Thanks a lot for your answer!
