Bank conflicts and reuse flags in Pascal

ladberg · April 13, 2017, 12:15am

I’ve read what Scott Gray found out about register banks in Maxwell for MaxAs, but I haven’t been able to replicate his findings on Pascal. Whenever I add or remove reuse flags, or deliberately place source registers in the same bank, I can’t get a conflict. The instruction behaves the same and doesn’t take longer in the pipeline, so it looks like reuse flags and register alignment have no effect. Could the behavior of register fetching have changed significantly for Pascal? If so, does anyone know the new behavior? I still feel the flags must do something, because NVCC still adds reuse flags wherever it can, but I just can’t find a difference.

Thanks!

njuffa · April 13, 2017, 2:42am

NVIDIA is notoriously secretive about the details of their microarchitectures, and from historical observation it is clear that they do tend to make substantial implementation changes between major architecture generations.

So the answer to your first question is “yes”. The answer to the second question is “I don’t know, and I am not aware of anybody who has successfully reverse-engineered Pascal”.

Unless you have a lot of experience in reverse-engineering, I consider it possible that your experiments are not yet sophisticated enough to reveal the salient differences (if any) between Maxwell and Pascal implementations. This is not to discourage you in your efforts. But from personal experience reverse engineering the details of x86 FPUs in the 1990s I know how much painstaking work and experience successful reverse engineering requires.

ladberg · April 13, 2017, 2:52am

This is where I think I’m stuck, assuming that reuse flags have an effect and bank conflicts exist. I have run cubins that are identical except for the presence of reuse flags (confirmed with cuobjdump) and can’t find any difference when they execute. I also tried shuffling registers around to cause what would have been a bank conflict on Maxwell, but still saw no difference. Is there anything else that they could affect?

At the moment I’m considering that Pascal reuse flags do nothing but were left in as an artifact from Maxwell…
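
For readers who want the Maxwell rule being tested here spelled out, below is a minimal sketch of it as I recall it from the MaxAs notes: four register banks, with the bank given by the register number mod 4, and a conflict whenever two different source registers of one instruction fall in the same bank. The function names, the one-extra-cycle-per-conflict cost, and the way `.reuse` operands are treated as always hitting the reuse cache are simplifying assumptions for illustration. Nothing here is confirmed for Pascal.

```python
from collections import defaultdict

# Maxwell figure as described in the MaxAs notes; NOT confirmed for Pascal.
NUM_BANKS = 4


def bank(reg):
    """Bank of a 32-bit register under the assumed 'register number mod 4' rule."""
    return reg % NUM_BANKS


def extra_fetch_cycles(sources, cached=()):
    """Estimate extra operand-fetch cycles for one instruction.

    sources: register numbers of the source operands (e.g. [0, 4, 8] for R0, R4, R8).
    cached:  registers assumed to be served from the reuse cache (hypothetical
             simplification: a .reuse hit never touches the register bank).
    Assumes one extra cycle per additional distinct register in the same bank.
    """
    by_bank = defaultdict(set)
    for reg in sources:
        if reg not in cached:
            by_bank[bank(reg)].add(reg)
    return sum(len(regs) - 1 for regs in by_bank.values())


if __name__ == "__main__":
    print(extra_fetch_cycles([1, 2, 3]))          # three different banks
    print(extra_fetch_cycles([0, 4, 8]))          # all three in bank 0
    print(extra_fetch_cycles([0, 4, 1], {4}))     # R4 assumed to come from the reuse cache
```

Under this model, an instruction sourcing R0, R4, and R8 would be the worst case, and the same instruction with two of those operands flagged for reuse (and actually resident in the cache) would show no penalty. That is the kind of timing difference the experiments above were looking for.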

njuffa · April 13, 2017, 3:49am

That’s possible, of course. On the other hand, the trend in NVIDIA GPU architectures has been to reduce hardware complexity, and move that into software, essentially arriving at VLIW-like instruction bundles, then increasing the amount of op-steering information per bundle (first one control word per seven instructions, then one control word per three instructions). That makes it a tad harder to believe that Pascal would move some of that complexity back into the hardware.

The size of the register file and Pascal’s operating frequencies would suggest to me that the register file is still banked. But my processor building days are long behind me (AMD Athlon being the last one), so I don’t know what tricks circuit designers have up their sleeves these days.
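
To put numbers on the control-word ratios mentioned above, here is a small sketch of how much of the instruction stream goes to op-steering information in each scheme. It assumes a control word occupies the same space as one instruction, which is my assumption rather than something stated in this thread, and the function name is purely illustrative.

```python
from fractions import Fraction


def control_overhead(instructions_per_control_word):
    """Fraction of the instruction stream occupied by control words.

    Assumes (not stated in the thread) that a control word is the same
    size as a single instruction.
    """
    return Fraction(1, instructions_per_control_word + 1)


if __name__ == "__main__":
    for n in (7, 3):
        print(f"1 control word per {n} instructions: {control_overhead(n)} of the stream")
```

Going from one control word per seven instructions to one per three doubles that share, from 1/8 to 1/4, which fits the trend of pushing more scheduling detail into software.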
