You should assist in the cudaminer development

If you notice, even crappier AMD cards are selling at higher prices and selling out; this is because of mining. I am sure the speed problem is not the nVidia hardware itself, but how it is being used. If you took two experienced CUDA developers for 2 weeks and had them assist with the code base, I bet your sales would double, and even AMD would have to lower their prices.

It’s a win/win for you!

An nVidia engineer recently submitted a pretty fast kernel to cudaminer (it’s available in the cudaminer code repo on GitHub). We’re now very close to AMD performance; however, the price point of nVidia cards cannot compete. Cryptocoin mining rigs are currently cheaper to equip with AMD cards.

EDIT: an overclocker just broke 900 kHash/s on a watercooled nVidia GTX 780Ti. Can’t get much faster with AMD devices either.

Help is actually needed in the following areas:
-CUDA error handling and recovery
-failover options in case of mining pool outage
-more reliable auto-tuning logic
-temperature monitoring + control and setting a GPU utilization control target (use of NVAPI)
-providing an overclocking tool for Linux that works with Fermi + Kepler devices (NVAPI? undocumented APIs?)
-providing APIs for external monitoring similar to CGMiner (maybe even a cgminer-compatible API)
-better logging functionality

It’s open source, so hack away!

Was that a Windows machine? Wow!

I am going to test this miner on the K20c this weekend. I was averaging about 400 using rpcminer (bitcoin), but I can probably do better.

This is a very strong statement. I just bought an AMD card, the 290X, which is top of the line (I suppose). It costs almost half of the Titan (and less than the 780) and is beating the Titan by a large margin without special tricks: 750 kH/s vs. 480 on the Titan, and I am aiming higher. AMD cards DO have an advantage in this particular algorithm, something related to integer operations. I am a big CUDA fan overall, but please do your homework before writing.

The new code does change the game though. Thanks cbuchner1 and nvidia.

Since we’re talking about crypto and in case anyone is interested, I wrote a SHA-256 routine back in March and promptly forgot about it. The only thing interesting about it is that it uses heavy macro expansion and explicitly inlines PTX in a few spots (which is probably not needed since NVCC is quite good at recognizing these idioms).

I threw it up on GitHub: A CUDA SHA-256 subroutine using macro expansion

I suspect the performance of the subroutine is about the same as every other implementation but macro expansion is either interesting or offensive depending on your aesthetic. :)
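For anyone curious what that style looks like, here is a plain-C sketch of SHA-256’s logical functions and one round step written as macros — illustrative only, not the actual routine from the linked repo. On a GPU, the rotate is the spot where one might drop to inline PTX, though NVCC usually recognizes the idiom on its own:

```c
#include <stdint.h>

/* Rotate-right. In CUDA code this is where you might inline PTX
   (e.g. a funnel shift), but the compiler typically spots the idiom. */
#define ROTR32(x,n) (((x) >> (n)) | ((x) << (32 - (n))))

/* The SHA-256 logical functions (FIPS 180-4) as macros, so every
   round step expands inline with no function-call overhead. */
#define Ch(x,y,z)   (((x) & (y)) ^ (~(x) & (z)))
#define Maj(x,y,z)  (((x) & (y)) ^ ((x) & (z)) ^ ((y) & (z)))
#define Sigma0(x)   (ROTR32(x, 2) ^ ROTR32(x,13) ^ ROTR32(x,22))
#define Sigma1(x)   (ROTR32(x, 6) ^ ROTR32(x,11) ^ ROTR32(x,25))
#define sigma0(x)   (ROTR32(x, 7) ^ ROTR32(x,18) ^ ((x) >> 3))
#define sigma1(x)   (ROTR32(x,17) ^ ROTR32(x,19) ^ ((x) >> 10))

/* One round step as a macro: an unrolled 64-round compression loop
   becomes pure straight-line code after preprocessing. The macro name
   and shape are a hypothetical sketch, not the poster's code. */
#define ROUND(a,b,c,d,e,f,g,h,k,w) do {                        \
        uint32_t t1 = (h) + Sigma1(e) + Ch(e,f,g) + (k) + (w); \
        uint32_t t2 = Sigma0(a) + Maj(a,b,c);                  \
        (d) += t1;                                             \
        (h) = t1 + t2;                                         \
    } while (0)
```

Whether this reads as elegant or as preprocessor abuse is, as noted, a matter of taste — the generated code is the same either way.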

Let’s just say that nVidia seems to be recognizing crypto mining at more than just the engineering level.
And you bet AMD does too. However, so do dozens of other companies, and some already have specialized ASICs on the market.

My take is that nVidia has probably already planned the entire feature set of the next two GPU generations… We will see how well the Maxwell chips perform (I’m going to get myself a new 750 Ti at launch time). I cannot imagine nVidia catering to the mining community with specialized silicon or GPU features anytime soon.


the relevant screen shot (and description) is here:

From my understanding, GPUs are currently only competitive in the bandwidth-bound kinds of coin mining, such as Litecoin’s scrypt; otherwise people just go for the ASICs. And I guess GPUs will continue to be competitive in the bandwidth-bound area, since that’s probably harder to build an ASIC for.
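To illustrate why scrypt-style mining is bandwidth-bound: the core of scrypt’s ROMix fills a large scratchpad sequentially and then reads it back in a data-dependent order, so throughput is limited by memory traffic rather than ALU speed. Here is a toy single-word sketch in C — the mixing function is a made-up stand-in, not scrypt’s real Salsa20/8-based BlockMix:

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for scrypt's BlockMix; just a cheap scrambler so the
   memory access pattern is the visible part. */
static uint32_t toy_mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x;
}

/* ROMix access pattern, simplified to one 32-bit word per entry:
   phase 1 writes N scratchpad entries sequentially; phase 2 performs N
   data-dependent reads. With realistic block sizes, this scratchpad
   traffic dominates, which is why scrypt favors memory bandwidth. */
static uint32_t romix_toy(uint32_t x, uint32_t *V, size_t N)
{
    for (size_t i = 0; i < N; i++) {   /* sequential writes */
        V[i] = x;
        x = toy_mix(x);
    }
    for (size_t i = 0; i < N; i++) {   /* data-dependent reads */
        size_t j = x % N;
        x = toy_mix(x ^ V[j]);
    }
    return x;
}
```

Because the reads in phase 2 depend on the evolving state, an ASIC cannot avoid provisioning the full scratchpad and its bandwidth — that is the memory-hardness argument in a nutshell.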

I’ve heard that AMD is selling out of their cards and is unable to supply regular gamers due to mining demand. I’m pretty sure nVidia has already taken notice of this and will implement some additional hardware support…

But it seems you guys have basically caught up to AMD’s level of raw performance. What about kHash per watt, though?

The 750 Ti should be very power efficient even though it’s still on 28 nm. Will the first Maxwells feature ARM cores? If so, I wonder if you’ll be able to take a lot of strain off your regular x86 CPU as well by utilizing the on-chip ARM core as the “host” in the relationship.

I believe that we’re competitive in the kHash/s-per-watt figures for Compute 3.5 devices. After all, both AMD and nVidia use a 28 nm process, and physics is physics.

If either of the two contenders started dedicating some silicon area to crypto acceleration features (e.g. putting the entire Salsa20/8 round function into hardware and mapping it to a single CUDA assembly instruction), it would be a game changer for power efficiency. But I do not really see that happening soon.
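For reference, the Salsa20/8 core mentioned above is compact enough to show in full. This is essentially the standard formulation used by scrypt implementations, in plain C, with the feed-forward add of the original input at the end:

```c
#include <stdint.h>
#include <string.h>

#define R(a,b) (((a) << (b)) | ((a) >> (32 - (b))))

/* Salsa20/8 core: 4 double-rounds (column round + row round) over a
   16-word state, then add back the original input, as scrypt uses it. */
static void salsa20_8(uint32_t B[16])
{
    uint32_t x[16];
    memcpy(x, B, sizeof x);
    for (int i = 0; i < 8; i += 2) {
        /* column round */
        x[ 4] ^= R(x[ 0]+x[12], 7);  x[ 8] ^= R(x[ 4]+x[ 0], 9);
        x[12] ^= R(x[ 8]+x[ 4],13);  x[ 0] ^= R(x[12]+x[ 8],18);
        x[ 9] ^= R(x[ 5]+x[ 1], 7);  x[13] ^= R(x[ 9]+x[ 5], 9);
        x[ 1] ^= R(x[13]+x[ 9],13);  x[ 5] ^= R(x[ 1]+x[13],18);
        x[14] ^= R(x[10]+x[ 6], 7);  x[ 2] ^= R(x[14]+x[10], 9);
        x[ 6] ^= R(x[ 2]+x[14],13);  x[10] ^= R(x[ 6]+x[ 2],18);
        x[ 3] ^= R(x[15]+x[11], 7);  x[ 7] ^= R(x[ 3]+x[15], 9);
        x[11] ^= R(x[ 7]+x[ 3],13);  x[15] ^= R(x[11]+x[ 7],18);
        /* row round */
        x[ 1] ^= R(x[ 0]+x[ 3], 7);  x[ 2] ^= R(x[ 1]+x[ 0], 9);
        x[ 3] ^= R(x[ 2]+x[ 1],13);  x[ 0] ^= R(x[ 3]+x[ 2],18);
        x[ 6] ^= R(x[ 5]+x[ 4], 7);  x[ 7] ^= R(x[ 6]+x[ 5], 9);
        x[ 4] ^= R(x[ 7]+x[ 6],13);  x[ 5] ^= R(x[ 4]+x[ 7],18);
        x[11] ^= R(x[10]+x[ 9], 7);  x[ 8] ^= R(x[11]+x[10], 9);
        x[ 9] ^= R(x[ 8]+x[11],13);  x[10] ^= R(x[ 9]+x[ 8],18);
        x[12] ^= R(x[15]+x[14], 7);  x[13] ^= R(x[12]+x[15], 9);
        x[14] ^= R(x[13]+x[12],13);  x[15] ^= R(x[14]+x[13],18);
    }
    for (int i = 0; i < 16; i++)
        B[i] += x[i];
}
```

It is all 32-bit adds, XORs, and fixed rotates — exactly the kind of small, fixed dataflow graph that would collapse into very little silicon if anyone ever did put it into hardware.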