Low or normal performance?

I am not sure where DES came into the thread, but I will point out (25 years after last dealing with DES) that it is very amenable to bit-slice approaches, which should perform extremely well on GPUs.
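
For anyone unfamiliar with the term, here is a minimal sketch of the bit-slice idea in plain C. It is not DES: the 2-to-1 multiplexer below is just a stand-in for one gate-level step of an S-box, and the function names are made up for illustration. The point it shows is that bit i of every word belongs to independent instance i, so a handful of AND/XOR operations processes 32 instances at once. On a GPU I would expect each thread to do this on its own 32-bit words, but that part is an assumption and is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference (hypothetical helper): a 2-to-1 multiplexer on single
   bits. Returns a when sel is 1, otherwise b. */
static unsigned mux_scalar(unsigned sel, unsigned a, unsigned b) {
    return sel ? a : b;
}

/* Bitsliced version: bit i of each argument belongs to instance i, so one
   call evaluates 32 independent multiplexers with three logic operations
   and no branches. */
static uint32_t mux_sliced(uint32_t sel, uint32_t a, uint32_t b) {
    return b ^ (sel & (a ^ b));
}

/* Compare every lane of the sliced result against the scalar reference
   and return the number of lanes that disagree. */
static int count_mismatches(uint32_t sel, uint32_t a, uint32_t b) {
    uint32_t out = mux_sliced(sel, a, b);
    int bad = 0;
    for (int i = 0; i < 32; i++) {
        unsigned expect = mux_scalar((sel >> i) & 1u,
                                     (a >> i) & 1u,
                                     (b >> i) & 1u);
        if (((out >> i) & 1u) != expect) {
            bad++;
        }
    }
    return bad;
}

/* Small self-check on arbitrary test patterns. */
static void self_check(void) {
    assert(count_mismatches(0xDEADBEEFu, 0x12345678u, 0x9ABCDEF0u) == 0);
}
```

In a full bitsliced DES, the fixed bit permutations cost nothing (they become a choice of which variable to read) and each S-box is written as a network of gates like the one above, which is why the approach suits hardware with cheap, wide logic operations.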