CUDA installation: NVCC sucks too much?? Value 'sm_21'???

I am using Windows 10. I installed CUDA alongside Visual Studio 2017 version 15.5 and got an error saying that it is not compatible. I uninstalled Visual Studio entirely and downloaded an older version. Yes, exactly, funny: I am downloading an old version of something in order to build software for a high-tech GPU. Please don’t laugh.

I have a very simple C file with 2 functions inside. It compiled perfectly.
By the way: cl.exe was not found at first, so I also added it to the PATH variable.

Now comes the question.

I am getting an error like this:
nvcc fatal : Value 'sm_21' is not defined for option 'gpu-architecture'

No one has an answer to that except one person on GitHub saying that the GPU driver is too new.

What is that? Why are the NVIDIA guys so slow to support new VS versions? This is my first question. As far as I know, NVIDIA is a company, not an open-source, community-support, or hobby organization. It is a for-profit organization, so what is the problem?

Second, can someone give me a solution for that stupid error?
nvcc fatal : Value 'sm_21' is not defined for option 'gpu-architecture'

This error message looks like one of Microsoft’s stupid error messages, such as
Memory location at 0x34324322352 couldn’t be read.

Yeah, thank you, very descriptive. Have you guys ever taken a software engineering or principles of user interface design course in your life? I totally advise it. Very good course ;)

I lost 2 days just installing stuff and I am terribly frustrated. Please try to understand me.

I also know the torture process on Linux; most of the time even the graphics drivers are problematic.

If you want, I can also share the C file, but I guess we have deeper stuff here.

NVCC version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:08:12_Central_Daylight_Time_2017
Cuda compilation tools, release 9.1, V9.1.85

Visual Studio version:
2017 15.4

I would be grateful if you could help.

Regards

You’re getting that error because your project, makefile, or CMake settings specify that compilation switch (sm_21), which tells the compiler to build for a Fermi device (old). You have CUDA 9 or 9.1 installed, and CUDA 9 and 9.1 do not support Fermi devices (too old).

The solution is to find the compilation switches in your project, makefile, or CMake settings and update them to something supported by CUDA 9 or 9.1, such as sm_30. This obviously won’t work if you are actually trying to run your code on a Fermi device. In that case you cannot use CUDA 9 or 9.1.
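The rule above can be sketched as a small host-side C helper. This is only an illustration, not part of nvcc or the CUDA toolkit: the names cuda9_too_old and gencode_flag are hypothetical, and the check covers only the lower bound (sm_30) mentioned above. It does not verify that a given number is a real architecture or that the toolkit is new enough for it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Lowest architecture named above as supported by CUDA 9 / 9.1 (sm_30). */
enum { CUDA9_MIN_SM = 30 };

/* Hypothetical helper: returns 1 if sm_<major><minor> is below the
   CUDA 9.x minimum (e.g. Fermi, sm_2x), 0 otherwise. */
int cuda9_too_old(int major, int minor)
{
    return (major * 10 + minor) < CUDA9_MIN_SM;
}

/* Hypothetical helper: writes an nvcc -gencode switch for the given
   compute capability into buf and returns snprintf's result. */
int gencode_flag(char *buf, size_t size, int major, int minor)
{
    return snprintf(buf, size, "-gencode arch=compute_%d%d,code=sm_%d%d",
                    major, minor, major, minor);
}

/* Small self-check of the two helpers. */
void check_arch_helpers(void)
{
    char buf[64];
    assert(cuda9_too_old(2, 1));   /* sm_21: rejected by CUDA 9.x */
    assert(!cuda9_too_old(3, 0));  /* sm_30: the minimum named above */
    gencode_flag(buf, sizeof buf, 3, 0);
    assert(strcmp(buf, "-gencode arch=compute_30,code=sm_30") == 0);
}
```

For a GPU with compute capability 5.0, gencode_flag(buf, sizeof buf, 5, 0) produces a switch of the form -gencode arch=compute_50,code=sm_50.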

When I put that error message into a Google search, this is the first item returned:

https://stackoverflow.com/questions/47663033/nvcc-fatal-value-sm-20-is-not-defined-for-option-gpu-architecture

It is saying the same thing.

Thank you for the answer. I did download VC, but in fact I am just using the nvcc command to compile.

nvcc -ptx distance.cu -o distance.ptx
nvcc -cubin -m64 -arch sm_21 distance.cu -o distance.cubin

Where is this CMakeLists file? How can I give that to nvcc?

You’re not using CMake, so don’t worry about the CMakeLists file.
What GPU are you intending to run this code on?

My GPU is an NVIDIA GeForce GTX 960M.

I am now using these commands to compile, and it compiles fine with no errors.

nvcc -ptx distance.cu -o distance.ptx
nvcc -cubin -m64 -gencode arch=compute_50,code=sm_50 distance.cu -o distance.cubin

I have seen these commands in the JCuda tutorial, and the JCuda source code in fact uses the PTX file, so I am not sure why I compile a cubin file. What are PTX and cubin files? I don’t know. Can anyone explain or send me a link?

Now the thing is that the functions are returning 0, interestingly. I converted Java source code to C to write this file, and the Java code is correct. I think the C code is too.
I used this answer in the JCuda forum to write the code.

Here is the file distance.cu

#include <math.h>

// Euclidean distance between (x1, y1) and (x2, y2); thread 0 writes it to result[0].
extern "C"
__global__ void relative(double x1, double y1, double x2, double y2, double* result)
{
int index = blockIdx.x * blockDim.x + threadIdx.x;
double returned=0;
double sum=(x1-x2)*(x1-x2)+(y1-y2)*(y1-y2);
returned=sqrt(sum);
if (index == 0) result[0] = returned;
}

// Haversine distance in meters between two latitude/longitude points given in degrees;
// thread 0 writes it to result[0].
extern "C"
__global__ void real(double lat1, double lon1, double lat2, double lon2, double* result)
{
int index = blockIdx.x * blockDim.x + threadIdx.x;

double returned=0;

double R = 6371; // Radius of the earth in km
double dLat = (lat2-lat1)* (3.14159265359/180);  // degrees to radians
double dLon = (lon2-lon1)* (3.14159265359/180); 
double a = 
		sin(dLat/2) * sin(dLat/2) +
		cos(lat1* (3.14159265359/180)) * cos(lat2* (3.14159265359/180)) * 
		sin(dLon/2) * sin(dLon/2); 
double c = 2 * atan2(sqrt(a), sqrt(1-a)); 
double d = R * c; // Distance in km
returned=d*1000; // Distance in meters

if (index == 0) result[0] = returned;

}