TYAN FT77A-B7059 and 8 GPGPUs

Has anyone used one of these with 8 GPUs? We are looking to deploy eight Tesla K20m cards.

I am looking for confirmation that the CUDA driver can address 8 GPUs simultaneously. I seem to recall a restriction of 4 GPUs per CPU, which is fine because there are two CPUs, but can all 8 GPUs be addressed directly without code modification, or do they effectively act as two separate nodes of 4 GPUs each, so that we would have to manage communication between the two halves manually via e.g. MPI?

This is not clear from the brochure for this product, so if anyone has experience with this board I would be happy to hear about it before committing to a rather substantial expenditure.
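To be concrete about what I mean by "addressed directly": the test I would want to pass on this board is simply that the CUDA runtime enumerates all eight cards under a single OS image. A minimal sketch of that check (nothing board-specific assumed, just the standard runtime API):

```
// Minimal device-enumeration check. If the BIOS and driver map all eight
// boards into one OS image, this should report 8 devices.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA devices visible: %d\n", count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // Print the name and PCI location of each device so the mapping
        // to physical slots (and hence CPU sockets) can be checked.
        printf("  device %d: %s, %d SMs, PCI %04x:%02x:%02x\n",
               i, prop.name, prop.multiProcessorCount,
               prop.pciDomainID, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}
```

If that reports 8 devices, existing multi-GPU code that selects devices with cudaSetDevice should work unmodified; the remaining question is only how the PCIe topology affects transfer performance between the two halves.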

~~ Note: the response below was to a comment which now seems to have been deleted *** I do not appear to have that option myself!

I have no idea what that reply is supposed to mean. I have been programming with CUDA for over 5 years, and I have no problem understanding the architecture or the programming model. I have been using Tesla cards in an HPC environment for most of this period, using MPI to do the communication. I suggest you read my question before sending such an idiotic reply.

My question relates to a specific motherboard which you clearly know nothing about.

I’d advise contacting the manufacturer – presumably they are better placed to advise on their own product. Considering their spec sheet clearly lists 8-GPU support, I’d say it should work:
http://www.tyan.com/datasheets/DataSheet_FT77A-B7059.pdf

Also, see this other recent thread:
https://devtalk.nvidia.com/default/topic/649542/18-gpus-in-a-single-rig-and-it-works/

Thanks for the suggestions. I am pretty sure the manufacturer will say it works (certainly that is the impression given by the reseller), but I was hoping for some real-world experience from someone who has actually deployed one.

My recollection is that using more than 4 GPUs requires a BIOS hack. It could be that Tyan have a customised BIOS, in which case hopefully all will be well. I don’t have many options for 8 GPGPUs, and if possible I want to avoid getting two servers with 4 GPGPUs each; with one box I can (hopefully) cut down the number of MPI calls I have to make.
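If the box does expose all eight GPUs under one OS, the next thing I would check is which device pairs can do peer-to-peer transfers over PCIe, since on a dual-socket board the pairs hanging off different CPUs typically cannot, and data between those GPUs still stages through host memory. A minimal check along these lines (assuming nothing about this particular board’s topology):

```
// Sketch: query which device pairs report peer-to-peer (P2P) capability.
// Pairs that cannot do P2P will still work, but transfers between them
// go via host memory rather than directly over PCIe.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int a = 0; a < count; ++a) {
        for (int b = 0; b < count; ++b) {
            if (a == b) continue;
            int canAccess = 0;
            cudaDeviceCanAccessPeer(&canAccess, a, b);
            printf("GPU %d -> GPU %d : %s\n", a, b,
                   canAccess ? "P2P capable" : "via host");
        }
    }
    return 0;
}
```

That would at least tell me how much of the intra-node communication could bypass MPI entirely versus how much still behaves like two 4-GPU halves.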