Functions with deduced return type in device code, CUDA 8.5

Return type deduction for functions is a new feature in C++14. I noticed that the documentation has been updated to say it is supported in CUDA 8.5:
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#return-type-deduction
But I haven’t found any URL for downloading CUDA 8.5.
Could anybody tell me when or where I can download the CUDA 8.5 toolkit? I really need this new feature.

CUDA 8.5 was never released. However, CUDA 9 is on the way. It should be available in release candidate form in the next month or two.

That’s weird: the documentation has already been released, but the toolkit is not yet available to download.

The documentation you linked is the CUDA 8 docs. If you look at the top of the file/doc carefully, this will be evident.

The references to CUDA 8.5 in that doc were something that slipped through the review system when CUDA 8 was released, i.e. a typo effectively.

Those are not CUDA 8.5 docs. As I stated already, CUDA 8.5 was never released.

To be fair, I’m not a fan of deduced return types in the first place :P

They’re cool for like two-line lambdas but after enough JavaScript, I never want to have to read a function to figure out what it returns ever again.

Function signatures are the one place where I really don’t use the almost-always-auto idiom. Having concrete types for your parameters and return type is critical for understanding a function at a glance.

At present, I have to use deduced return types, because I’m working on a template and need to implement a function wrapper. The return type of the wrapper is determined by the function passed in as a parameter. To implement it in C++11, I would need things like std::result_of and std::forward, but CUDA does not support the STL, so I don’t know how to deal with this problem at the moment.
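For readers unfamiliar with the feature being asked for, here is a minimal host-side sketch of what such a wrapper could look like with C++14 return type deduction. The names call_wrapper, add_plus_three and demo_call_wrapper are hypothetical and only for illustration. It is plain C++; using it in device code would additionally need a __device__ qualifier and a toolkit whose compiler accepts C++14, which is an assumption here rather than something this thread confirms.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical name, for illustration only.
// C++14: decltype(auto) deduces the return type from the return statement
// (keeping references intact), so std::result_of is not needed.
template <typename F, typename... Args>
decltype(auto) call_wrapper(F&& f, Args&&... args)
{
    return std::forward<F>(f)(std::forward<Args>(args)...);
}

// Hypothetical callable used only to exercise the wrapper.
inline int add_plus_three(int x, int y) { return x + y + 3; }

inline void demo_call_wrapper()
{
    // The deduced return type matches the callable's return type.
    static_assert(
        std::is_same<decltype(call_wrapper(add_plus_three, 1, 2)), int>::value,
        "return type should be deduced as int");
    assert(call_wrapper(add_plus_three, 1, 2) == 6);
}
```

Compared with a C++11 trailing return type, the only difference is that the compiler works out the return type from the body instead of from a std::result_of expression.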

Huh, I think you can forward in device functions. I think it’s literally just a cast.
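To illustrate the "just a cast" point, here is a rough sketch of a hand-written stand-in for std::forward. The names my_forward, category, relay and demo_forward are hypothetical. This is not the standard library's actual source, only the lvalue overload reduced to its essence, and it is shown as host C++.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for std::forward: nothing but a static_cast,
// so there is no runtime library code involved.
template <typename T>
constexpr T&& my_forward(typename std::remove_reference<T>::type& t) noexcept
{
    return static_cast<T&&>(t);
}

// Hypothetical probes: report which overload a forwarded argument selects.
inline int category(int&)  { return 1; }  // called with an lvalue
inline int category(int&&) { return 2; }  // called with an rvalue

template <typename T>
int relay(T&& value)
{
    // T is int& for lvalue arguments and int for rvalue arguments, so the
    // cast restores the value category of the original argument.
    return category(my_forward<T>(value));
}

inline void demo_forward()
{
    int n = 0;
    assert(relay(n) == 1);  // lvalue stays an lvalue
    assert(relay(5) == 2);  // rvalue stays an rvalue
}
```

Because the whole thing is a compile-time cast, it is plausible that it works in device functions, though that is best confirmed by compiling it with your own toolkit.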

Can you post some code?

If possible, I’d like to implement this code on the device:

template <typename F, typename ...Args>
typename std::result_of<F &&(Args &&...)>::type wrapper(F && f, Args &&... args)
{
    return std::forward<F>(f)(std::forward<Args>(args)...);
}

Actually, this code works fine for me:

// I'm on Windows so you might need to add -std=c++11
// nvcc -gencode arch=compute_61,code=sm_61 -o deduced deduced.cu
#include <stdio.h>

#include <type_traits>
#include <utility>

template <typename F, typename ...Args>
__device__
auto test(F&& f, Args&&... args) 
  -> typename std::result_of<F&&(Args&&...)>::type
{
  return std::forward<F>(f)(std::forward<Args>(args)...);
}

__global__
void proof(void)
{
  auto const f = [](int const x, int const y) -> int
  {
    return x + y + 3;
  };

  printf("%d\n", test(f, 1, 2));
}

int main(void)
{
  proof<<<1, 1>>>();
  cudaDeviceSynchronize();
  return 0;
}

Oh, thank you very much, it works really well. I thought that the STL was not supported in device code, and I had never tried it before. Thank you for teaching me that practice is the most important thing.

Btw, it’s been pointed out to me that the above code is technically broken.

Do not attempt to pass in a pointer to a member function or a pointer to member data, as the above code is broken in those cases. You’d need to use std::invoke, which is just a gigantic cluster of fudge.
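The reason is that a pointer to member cannot be called with plain f(args...) syntax; it needs (obj.*f)(args...) or obj.*f instead. As a rough host-side sketch of the std::invoke route, assuming a compiler in C++17 mode (std::invoke lives in <functional> from C++17 on): the names Point, invoke_wrapper and demo_invoke_wrapper are hypothetical, and whether this compiles in device code is not something this thread establishes.

```cpp
#include <cassert>
#include <functional>  // std::invoke (C++17)
#include <utility>

// Hypothetical type used only for illustration.
struct Point {
    int x;
    int scaled(int k) const { return x * k; }
};

// Hypothetical name. std::invoke treats ordinary callables, pointers to
// member functions and pointers to data members uniformly.
template <typename F, typename... Args>
decltype(auto) invoke_wrapper(F&& f, Args&&... args)
{
    return std::invoke(std::forward<F>(f), std::forward<Args>(args)...);
}

inline void demo_invoke_wrapper()
{
    Point p{7};
    // Pointer to member function: the object is the first argument.
    assert(invoke_wrapper(&Point::scaled, p, 3) == 21);
    // Pointer to data member: "calling" it reads the member.
    assert(invoke_wrapper(&Point::x, p) == 7);
    // Ordinary callables still work as before.
    assert(invoke_wrapper([](int a, int b) { return a + b; }, 1, 2) == 3);
}
```

With the std::forward<F>(f)(...) form from the earlier code, the first two calls would fail to compile.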