3K Deep Learning Build / Question on 3D XPoint


Wanted to know if anyone had suggestions for optimizing this build. The GPU was given to me through a research grant.

Also, I have been reading about 3D XPoint technology and wanted to know if it had a role in deep learning at this point. Seems like it utilizes less power compared to normal DDR4? Any insights would be helpful.

Thank you!

In somewhat random order:

The DDR4-3000 speed grade for the system memory looks like overkill. Intel’s specification says the Core i7-7800X supports DDR4-2400. All benchmarks I have seen suggest that faster DDR4 speed grades, even where supported by the motherboard, do not significantly improve system memory performance.

Four DDR4 channels versus two would make a useful improvement in system memory throughput and should be supported by your CPU according to Intel’s specification (https://www.intel.com/content/www/us/en/products/processors/core/x-series/i7-7800x.html). Consider doubling system memory size to 64 GB. My rule of thumb for a high-end system like yours is (and has been for many years) system memory size = 4x GPU memory size, so you might populate four DDR4 channels with 16 GB each for a total of 64 GB. Make sure the motherboard actually supports this; I have no insight into that.
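As a rough sketch of that rule of thumb (the function name and multiplier default are my own, not anything official):

```python
# Hypothetical sizing helper for the "system RAM = 4x total GPU RAM"
# rule of thumb; adjust the multiplier if your workloads differ.

def recommended_system_ram_gb(gpu_ram_gb_per_card, num_gpus, multiplier=4):
    """Return suggested system RAM in GB for a GPU workstation."""
    return gpu_ram_gb_per_card * num_gpus * multiplier

# A single 12 GB Titan-class card suggests 48 GB, so populating four
# channels with 16 GB DIMMs (64 GB total) also leaves headroom for a
# second card later.
print(recommended_system_ram_gb(12, 1))  # 48
```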

Are you planning on adding more GPUs to this machine later? Because in the present configuration, the 1200W power supply seems oversized, a 750W PSU should be sufficient. In general, the sum of nominal wattage for all system components should not exceed 60% of the nominal wattage of the PSU, for robustness as well as efficiency reasons. 80 PLUS Platinum PSUs are typically most efficient from about 25% load to 60% load. 80 PLUS Platinum rating for the PSU is an excellent choice, I have no comments on the brand.
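The 60% sizing rule above is easy to sanity-check with a few lines of arithmetic. The component wattages below are illustrative nominal figures, not measured values:

```python
# Sketch of the "sum of nominal wattage <= 60% of PSU rating" rule.

def min_psu_watts(component_watts, max_load_fraction=0.6):
    """Smallest PSU rating such that combined nominal draw stays under the cap."""
    return sum(component_watts) / max_load_fraction

# Rough nominal figures: one Titan-class GPU (250 W), a Core i7-7800X
# (140 W TDP), and ~60 W for drives, fans, memory, and motherboard.
parts = [250, 140, 60]
print(round(min_psu_watts(parts)))  # 750
```

With a second 250 W GPU added later, the same rule points at roughly a 1200 W unit, which is presumably why the original pick looked attractive.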

The CPU choice looks good, but note that the Core i7 offers a limited number of PCIe lanes, which will come into play once you add additional GPUs. The PCIe interface is rarely a performance bottleneck, though. I often emphasize that if the main point of a new high-end machine is to run well-parallelized workloads on GPUs, people should make sure to pick a CPU with great single-thread performance, so applications do not become bottlenecked on the serial portions. A 3.5 GHz base frequency is about as good as you can get with a hexa-core i7, as far as I know.

Does the Samsung SSD have an NVMe interface? If so, that is a good choice; NVMe drives offer significantly higher throughput than non-NVMe devices.
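If you want to see the difference yourself once the machine is built, a crude sequential-read timing along these lines works (the path is a placeholder; point it at a large file on each device, and note that the OS page cache will inflate results on repeated runs):

```python
# Rough sequential-read throughput probe; not a substitute for a real
# benchmark tool, just enough to compare NVMe vs SATA ballpark numbers.
import time

def sequential_read_mb_s(path, chunk_mb=8):
    """Read a file in large chunks and report throughput in MB/s."""
    chunk = chunk_mb * 1024 * 1024
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / elapsed

# e.g. sequential_read_mb_s("/mnt/nvme/big_testfile")
```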

I think the Titan GPUs are usually quite long; make sure the case you picked has sufficient room to fit one.

Thank you so much for your reply. That information is really helpful!

  1. I will switch to DDR4-2400 in that case, which will be more cost effective. I have read that 2x GPU memory is a good target; any reason 4x is your metric? (Just thinking in terms of batch size and pre-processing requirements.)

  2. Yes, I will be adding more GPUs in the next few months through grants and research funds. I wanted to get a larger power supply so upgrades will be easier.

  3. CPU choice: Thanks. I think in the coming year the better X299 CPUs with more PCIe lanes will come down in price? I was going to use the i7-7800X for now, and then upgrade if I get more than 2 GPUs. Going beyond the i7 line seemed less cost effective.

  4. Yes, the Samsung SSD uses the M.2 NVMe interface. Only getting 500 GB for now, as that should be plenty for the OS and active applications. Going to save my data on 3 x Western Digital Red 8 TB drives; image data is very large, and I figured this made more sense.

Curious if anyone has thoughts on the implications of Optane SSDs for deep learning, like:

That’s just based on experience over a decade. Obviously, system memory requirements will differ based on use case, so this is a recommendation that covers the vast majority of use cases. If you look at the specifications of NVIDIA’s DGX-1, you will find the same ratio there (128 GB of GPU memory vs 512 GB of system memory).

The question of additional GPUs and different CPU is probably best pondered when you get to that point. In general, four CPU cores per GPU should be sufficient, and generally you would want to favor CPUs with high single-thread performance, instead of massive core count.

Great, thanks again!

Just an update: going to integrate Optane, add slightly faster memory, and move to the i7-7820X CPU. I will see how it does and add more memory as needed! Just very curious about Optane given all of the amazing benchmarks.


Note that the Intel Core i7-7820X processor is designed for DDR4-2666. I didn’t know Intel had this beast of a CPU on offer, octa-core at 3.6 GHz base frequency. Good find.

I have zero exposure to, and experience with, Optane. However, the following review made me skeptical:


That’s one review only, there may be others that draw more positive conclusions.

Yes, I read that review! He did say: “Most users - even those with relatively intense storage performance needs - will be better served by high quality flash SSDs like the Samsung 960 PRO”

I think Optane could really help with accessing data from my SATA drives, and it seems to have an incredible random-access profile. (Not 100% sure how this will fit into deep learning overall, but for $200-300 it’s not much of a risk.) When SSD prices drop I’ll definitely upgrade to a larger Optane drive and a larger regular M.2 SSD.
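For anyone wanting to compare that random-access profile across drives, here is a small probe I'd sketch (the path is a placeholder; run it against a large file on each volume, and be aware the OS page cache will flatter repeated runs):

```python
# Rough random-read latency probe: seek to random offsets in a file and
# time 4 KiB reads. Useful for a ballpark Optane-vs-SATA comparison only.
import os
import random
import time

def random_read_latency_us(path, reads=1000, block=4096):
    """Average latency per random 4 KiB read, in microseconds."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        start = time.perf_counter()
        for _ in range(reads):
            f.seek(random.randrange(0, max(1, size - block)))
            f.read(block)
        elapsed = time.perf_counter() - start
    return elapsed / reads * 1e6

# e.g. random_read_latency_us("/mnt/optane/big_testfile")
```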

Well, you could always report back in this thread how well Optane works for your use case compared to an NVMe SSD :-)

From what I understand of the flash market, significant price drops for SSDs aren’t likely in the near future as there is a world-wide shortage of all kinds of memory due to the growing memory capacities of mobile devices. I read that existing fabs are maxed out and it is unclear when new production capacity will come online.

I’ll post back some benchmarks after I complete the build! Thanks.