Disappointing Performance on Tesla K20c Compared to AMD HD 7970

Your post has no links and no source code. Please update it :)

Updated in the original post.

Were you able to fix the performance issue with your K20c implementation?