FPGA - GPU, no host PC

So we have FPGAs, CPUs and GPUs.

One way of combining them is to have software (OpenGL-based, for example) running on a PC’s CPU, under, say, Windows, control both the GPU and the FPGA and orchestrate data movement between them.

The thing is, in some cases the PC is doing almost nothing, since all the processing happens on either the FPGA or the GPU. For example, the system could be set up so that the FPGA DMAs directly into the GPU’s memory, and results from the GPU are sent back to the FPGA once computed. So really all the PC is doing is setting up a few DMAs and occasionally moving data around. So I’ve been wondering whether it would be possible to get rid of the PC and replace it with a processor embedded in the FPGA.

The main reasons would be cost, power and size.

I was thinking a possible way of doing this would be to run Linux on an embedded PPC, using the Linux drivers for the NVIDIA card.
Or perhaps booting Windows Embedded on the embedded PPC.

I’d love to know if anyone has any thoughts on the topic, or thinks this is a dumb idea :¬).


Unfortunately, the NVIDIA drivers are only available for x86 and x86-64, not PPC. The drivers are a binary blob, so compiling them yourself isn’t an option.

But I do like the idea very much, and I would like to see a setup with only a GPU and FPGA. So far, the best ‘solution’ is to use the smallest CPU possible, for example an Intel Atom.

Although not documented formally, I suspect the GPU is less autonomous than you think. Comments from NVIDIA employees have hinted that the setup and launch of a kernel requires some additional assistance from the CPU beyond just pushing the parameters to the GPU. An FPGA running a full CPU core (like the Virtex chips with the PPC core, or maybe MicroBlaze) would almost certainly be required, but the lack of drivers would be an obstacle, as mentioned.

That said, NVIDIA is almost certainly experimenting with putting an ARM core onto a CUDA-capable GPU. That would be a fully self-contained system to which you could attach an FPGA. (Whether or not this ARM + CUDA processor makes it to market so we can buy it is a different question.)

You can do this on AMD cards with direct write, I believe.

Unfortunately NVIDIA’s direct write is only GPU to GPU… GPUDirect may do it but I’m not sure how that would work from DMA logic on an FPGA.

Not sure if this is the place to post this, as I am new to this space.
If I get an NVIDIA card with a GPU and plug it into an x86 server, the root port is in the server/motherboard while the GPU itself has a PCIe endpoint, correct?
And does the server communicate with the GPU directly via PCIe, or is there an FPGA in the middle (in a standard card)?
My question is: can the PCIe interface inside the GPU be configured to be a root port instead (if the above is correct)?

To the best of my knowledge (corrections welcome):

[1] The PCIe root complex in a typical x86 server is part of the non-core portion of the CPU
[2] The GPU is a PCIe endpoint and does not contain a PCIe root complex
[3] In some (high-end) systems there is a PCIe switch between the PCIe root complex and the PCIe endpoints
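
If you want to check points [1] to [3] on a real machine, each PCIe function advertises its role in the Device/Port Type field of its PCI Express capability (capability ID 0x10, bits 7:4 of the capabilities register). Below is a minimal Python sketch that walks the capability list in a raw config-space dump and decodes that field. The function names `pcie_port_type` and `make_config` are my own, and `make_config` only builds a synthetic dump for illustration. On Linux the real bytes can usually be read from `/sys/bus/pci/devices/<BDF>/config`, though you may need root to read past the first 64 bytes.

```python
import struct
from typing import Optional

PCI_STATUS = 0x06               # Status register offset in the config header
PCI_STATUS_CAP_LIST = 0x0010    # Status bit 4: capability list is present
PCI_CAPABILITY_POINTER = 0x34   # offset of the first capability pointer
PCI_CAP_ID_EXP = 0x10           # capability ID of the PCI Express capability

# Device/Port Type encodings (bits 7:4 of the PCI Express Capabilities register)
PORT_TYPES = {
    0x0: "PCI Express Endpoint",
    0x1: "Legacy PCI Express Endpoint",
    0x4: "Root Port",
    0x5: "Switch Upstream Port",
    0x6: "Switch Downstream Port",
    0x7: "PCI Express to PCI/PCI-X Bridge",
    0x8: "PCI/PCI-X to PCI Express Bridge",
    0x9: "Root Complex Integrated Endpoint",
    0xA: "Root Complex Event Collector",
}


def pcie_port_type(config: bytes) -> Optional[str]:
    """Return the Device/Port Type name, or None if no PCIe capability is found."""
    if len(config) < 64:
        raise ValueError("need at least the 64-byte standard config header")
    status = struct.unpack_from("<H", config, PCI_STATUS)[0]
    if not status & PCI_STATUS_CAP_LIST:
        return None
    ptr = config[PCI_CAPABILITY_POINTER] & 0xFC  # low two bits are reserved
    seen = set()                                 # guard against looping lists
    while ptr >= 0x40 and ptr + 4 <= len(config) and ptr not in seen:
        seen.add(ptr)
        cap_id = config[ptr]
        next_ptr = config[ptr + 1] & 0xFC
        if cap_id == PCI_CAP_ID_EXP:
            flags = struct.unpack_from("<H", config, ptr + 2)[0]
            code = (flags >> 4) & 0xF
            return PORT_TYPES.get(code, "reserved (0x%X)" % code)
        ptr = next_ptr
    return None


def make_config(port_type_code: int) -> bytes:
    """Build a synthetic 256-byte config space (illustration only, not real hardware)."""
    cfg = bytearray(256)
    struct.pack_into("<H", cfg, PCI_STATUS, PCI_STATUS_CAP_LIST)
    cfg[PCI_CAPABILITY_POINTER] = 0x40
    cfg[0x40] = 0x01             # some other capability (power management)
    cfg[0x41] = 0x50             # next capability lives at 0x50
    cfg[0x50] = PCI_CAP_ID_EXP   # PCI Express capability
    cfg[0x51] = 0x00             # end of list
    struct.pack_into("<H", cfg, 0x52, (port_type_code << 4) | 0x2)
    return bytes(cfg)


if __name__ == "__main__":
    print(pcie_port_type(make_config(0x0)))
    print(pcie_port_type(make_config(0x4)))
```

Run against a real system, I would expect a discrete GPU to decode as an endpoint (or legacy endpoint) and the slot it sits in to decode as a root port or a switch downstream port, which matches [2] and [3]. This only reads what the device reports about itself; it does not change the role.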