Would a smaller GPU make for a better learning experience?

Suppose one planned to make a CUDA project for a college class. Would using a pair of GPUs with only 96 cores be more challenging than using a couple of high-end cards?

For example, it seems as though one would need a huge data set just to be able to compare algorithm timings and so forth.

Would you learn more with high-end or low-end GPUs?

I think you would learn either way.
Sure, on a higher-end GPU you may need bigger data sets to run into the issues you would normally hit, for instance when the algorithm you are trying needs more threads to chew through the data, but all that takes is, as you said, a larger data set.
If the idea is learning, then I would go with a higher-end GPU: it has more features to explore, and the “drawbacks” of a low-end GPU (smaller thread blocks, less shared memory, etc.) will come up eventually anyway, or you can impose them on yourself.
Also, you can always compile your code for a lower compute capability.
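For instance (a minimal sketch; the file name and target architectures are just examples):

```
# Build for an older architecture even on a newer card, e.g. compute capability 2.0
nvcc -arch=sm_20 -o vector_add vector_add.cu

# Or embed code for several architectures in one binary
nvcc -gencode arch=compute_20,code=sm_20 \
     -gencode arch=compute_35,code=sm_35 \
     -o vector_add vector_add.cu
```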
As for algorithm times: of course, by eye you may not be able to differentiate 6 ms from 30 ms, but hey, that’s what cudaEvents exist for ;). Also, never forget nvvp, the Visual Profiler.
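A minimal timing sketch with events, in case it helps (the kernel and sizes are just placeholders):

```
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel: scales a vector in place.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    const int n = 1 << 20;              // 1M elements, adjust to taste
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);             // recorded on the default stream
    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);         // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);   // resolution is around half a microsecond
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return 0;
}
```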

If you are learning CUDA, I would suggest going with a mid-range GPU (~$200-$250) if you can afford it. High-end GPUs are not really needed until you know your application better, and low-end GPUs are often slower than your CPU and have terrible memory bandwidth, which is no fun when you are testing out algorithms.

CUDA in general is really only effective when you are dealing with large problems (lots of data, lots of calculations, or both), so you might as well face that early on.

If you specifically want to experiment with multi-GPU programming, that might be a good reason to buy lower-end cards so that you can afford 2 (or more) of them on a fixed budget.
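If you do go that route, the basic pattern is just selecting a device before you allocate and launch on it. A minimal sketch, assuming at most 8 GPUs and a placeholder kernel:

```
#include <cuda_runtime.h>

// Placeholder kernel: adds 1 to every element.
__global__ void work(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] += 1.0f;
}

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    const int n = 1 << 20;                       // per-GPU chunk size, adjust to taste
    float *d_data[8] = {0};                      // assume at most 8 GPUs for this sketch

    // Launch an independent chunk of work on each GPU; kernel launches are
    // asynchronous, so the GPUs run concurrently.
    for (int dev = 0; dev < deviceCount && dev < 8; ++dev) {
        cudaSetDevice(dev);                      // subsequent calls target this GPU
        cudaMalloc(&d_data[dev], n * sizeof(float));
        cudaMemset(d_data[dev], 0, n * sizeof(float));
        work<<<(n + 255) / 256, 256>>>(d_data[dev], n);
    }

    // Wait for every GPU to finish, then clean up.
    for (int dev = 0; dev < deviceCount && dev < 8; ++dev) {
        cudaSetDevice(dev);
        cudaDeviceSynchronize();
        cudaFree(d_data[dev]);
    }
    return 0;
}
```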