Setup for R, involving PERL, and some choice of operating system and C compiler

Dear customer service at Nvidia,

Sorry that this “chat” question is so long.

Some time ago, I purchased an NVIDIA GTX 1080 that features 2560 CUDA cores.

It is my intention to use this NVIDIA GPU to perform some parallel processing on large amounts of data that are of great interest to me. I'm attempting to build a custom neural-net model for casino sports gaming. (I live in Las Vegas, baby.)

The future of NVIDIA computing is artificial intelligence modeling and operation, as far as I am concerned.

I want to somehow use the open source computer languages PERL, C, and R for this purpose.

Since “R” is an interpreted language, and I need its work to run on the CUDA cores, special care must be taken so that it is built with the same “C” compiler that would be used at the command line, the same “C” compiler that was used to build PERL and all of the required PERL components, as well as MySQL and whatever else might be used on the system.

I definitely don't want accidental cross-compiler errors. You can understand this, right?

After looking through your offerings for CUDA-related languages, I have not seen many, if any, references to using “R”, the world's best resource for open-source statistical and AI modeling packages.

Without the “R” language, I would be completely stuck. I MUST be able to use “R” on my CUDA cores… Everything depends upon being able to bake my cake with “R”. It's more important than chocolate.

And since I have used PERL since the early 1980s (about 40 years), I would prefer to continue to use it for my future work as well. It is a high-level language, particularly good at processing strings, command lines, and so forth, and I wouldn't want to relearn such a significant tool unless I absolutely had to. I'm kind of stuck with PERL.

So… after using PERL to call a routine that obtains information from a SQL database (such as MySQL), I want to be able to use the same PERL routine to call some constructed routines that utilize my NVIDIA GPU for its parallel-processing capabilities.

This code should be tight, because it is going to be run over and over.

The PERL language apparently has some routines to convert database data structures into CUDA data structures. This is an essential step. Additionally, the PERL language apparently has some routines (Rcpp) to populate and execute blocks of “C” code suitable for the CUDA standard. The idea is to embed the desired R code within the above “C” code using the rules of the (RInside) packages.

In other words, PERL may have the exact tools (Rcpp and RInside) that one needs to use “R” in a CUDA-standard fashion.

I don't yet know how to do all of this complicated, sophisticated, and interwoven coding. I'm flying by the seat of my pants here. I just know that it all has to be done somehow.

It would be of immense help if you could create such an installation for test purposes on your end, and provide a copy of the whole thing - perhaps on a CD-ROM - so that almost anybody could make it work for the purposes I have described.

As I explain below, this should be of immense value to NVIDIA.

Unless I am misunderstanding the process, all of this CUDA-compliant code must be compiled into a DLL (Dynamic Link Library), or the like, able to be called from a language that permits such DLL usage, which I guess would be a combination of PERL and C and R.

Subsequently, the further idea is to write code in a higher-level language, such as PERL, that follows the CUDA calling conventions: it loads the data in question into properly constructed CUDA blocks, runs the code on the CUDA cores via the above DLL (or the like), accumulates the results from all the CUDA cores through the higher-level language's calls and returns, and stores those results in the system's mass storage, such as a set of flat files or, even better, a SQL database, for later examination and further use.
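Just to make the shape of it concrete, here is the kind of skeleton I picture on the R side, with a plain stand-in where the GPU work would eventually go. Every name below (the database, the tables, the gpu_score routine) is something I made up for illustration, not working CUDA code:

    library(DBI)        # generic database interface
    library(RMariaDB)   # MySQL / MariaDB driver

    # Stand-in for the eventual GPU routine, whatever form it finally takes.
    gpu_score <- function(x) x * 2

    con  <- dbConnect(RMariaDB::MariaDB(), dbname = "sports", user = "me")
    bets <- dbGetQuery(con, "SELECT game_id, line FROM bets")   # pull the raw data

    bets$score <- gpu_score(bets$line)                          # the parallel step

    dbWriteTable(con, "scored_bets", bets, overwrite = TRUE)    # store the results
    dbDisconnect(con)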

Whew. Is that a bunch of complexity, or what?

Please note that a seemingly simple but erroneous choice of the operating system, of the open-source or proprietary “C” compiler, of the combination of PERL modules, or whatever, could make such a system useless. It would be doomed from the start, and this is something one wouldn't know until the cake was presumably baked.

In other words, without a robust, reliable combination of system and language software, all compiled by the same “C” compiler, for exactly this purpose, the whole project might fail badly and be a terrific waste of time.

Gee whiz.

The choice of a single C compiler for all of these languages is so vital that I beg you to let me know which one, if any, would be the proper choice. Does it work under UNIX and/or Windows? If only one or the other, does it compile, load, and execute the necessary parts of PERL and “C” and “R”? In other words, have you tested it? Does it work?

What I want to know is, has NVIDIA already done such a thing?

If not, then why not?

If you, or some other entity, have already done it, has it been written up, or has NVIDIA, or some other entity, produced a video that demonstrates and illustrates such a thing?

If I were you, I would illustrate and widely advertise this kind of thing.

Of course, I would like it if I could get this kind of information for little or no cost. After all, I already own a pretty nice computer equipped with a cutting-edge NVIDIA GPU. (Within reason, such a ready-made, off-the-shelf system-software installation would be worth almost as much as, or perhaps even more than, a brand-new workstation computer with the really nice NVIDIA GPU hardware installed.)

Sometimes, it’s software first.

In addition, a number of these interdependent software components are sufficiently difficult that I would like to have a decent tutorial that addresses all of them in detail before I would even start.

Meanwhile, I’m burning daylight. Time’s wasting.

Help.

But even then, I might need even more handholding… unless such a thing were based upon a proven installation that is as simple as possible.

If neither you nor any other entity with years of NVIDIA cooperation has done this work before, then what would it cost, and what would the process be, for me to visit the NVIDIA campus (or use some alternative method) and learn as best I can to construct such a thing myself, so that I could reproduce it on my home system for my own purposes?

Often, standardization invites mediocrity, but in this case some form of standardization creates a low barrier to usage that would strongly encourage the use of NVIDIA over almost any alternative.

However, if such a detailed system tutorial already existed, and it were successfully tested and deliverable on a system with a standard installation of “C”, “R”, PERL (and related modules), MySQL, and so forth, on UNIX or Windows, and if such a system could be built on a well-installed hardware system with an NVIDIA GPU such as the 1080, then almost anyone could become a deep-neural-net data-analysis professional. Or at least a passable mimic of one. The ability to command a six-figure salary doing such a job would be sufficient incentive to keep devotedly at the task until it was completed, and to insist upon using exclusively NVIDIA GPU hardware for the work.

For my sake, and for the sake of NVIDIA being able to sell thousands of such systems, with open-source systems software reliably delivering useful data analysis at a fraction of the cost of “cloud”-based systems (and without the risk of exposing proprietary, confidential company data to the “cloud”), could I please find out what progress NVIDIA has made in this regard?

You have had years to prepare for this question. What have you done about it?

Imagine if you could simply provide such an installation on a CD-ROM to anyone with an eventual neural-net or modeling product in mind. How many such systems would you sell? How many organizations and students and professionals could you encourage to abandon their expensive and risky “cloud” methods forever, and instead rely upon an inexpensive workstation, perhaps entirely off the grid, with an NVIDIA GPU or two to do the job?

Nearly everybody with large data-analysis tasks would want to switch to such an easy-to-use and inexpensive system, with the added bonuses of reduced risk and low operating cost.

Imagine what a world it would be if basic open-source systems software were all that someone needed to do this kind of parallel-processing systems development, for the entry price of an NVIDIA GPU.

Of course, I guess that I would be willing to pay some of the cost of developing such a thing, though if I were asked to pay for most or all of it, that would be a potentially terrible hardship. Please give me a ballpark for the costs, if any, of such a thing. (I'm a disabled person, living on a fixed income.)

I'm afraid that even this kind of system-installation product would test my considerable coding capabilities to the max… unless you employed customer-service personnel with advanced expertise in it and the ability to answer questions about it on the fly.

I don’t want to re-invent the wheel.

Or, put another way, I don't want to attempt something this complicated that has never been done before, only to fail repeatedly without getting anywhere.

You understand, right?

Where are we?

Thanks.

Michael

NVIDIA doesn’t provide any specific CUDA bindings for either R or PERL that I am aware of.

However, the R community has created some connections between R and CUDA. These are easy to find with a Google search.

For most languages that provide the ability to interface to C (Python, Java, etc. come to mind), it's usually possible to use this interface to connect to CUDA as well.

Other than that, I don't have any ready-made NVIDIA samples to direct you to. However, this blog post highlights both approaches (R GPU packages, and/or access via libraries or a compiled interface):

https://devblogs.nvidia.com/accelerate-r-applications-cuda/
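To give a flavor of the second approach (calling a compiled interface from R), the usual pattern is to build the CUDA code into a shared library with nvcc and then load it with dyn.load(). Everything below is a made-up illustration, not an NVIDIA-supplied sample; the file name and the "vec_add" entry point are placeholders:

    # Assumes a CUDA routine has already been compiled into a shared library,
    # e.g.  nvcc -Xcompiler -fPIC -shared -o vec_add.so vec_add.cu
    # exposing an extern "C" function with the signature
    #   void vec_add(double *a, double *b, double *result, int *n)

    dyn.load("vec_add.so")

    n <- 1e6
    a <- runif(n)
    b <- runif(n)

    out <- .C("vec_add",
              as.double(a),
              as.double(b),
              result = double(n),     # filled in by the CUDA kernel
              as.integer(n))$result

    all.equal(out, a + b)             # sanity check against the CPU answer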

I’m doing some preliminary testing on the Jetson Nano Developer Kit / L4T with R. My plan is to build an R package with some applications. I’ve got a preliminary roadmap and have a few things working. You can follow along at https://znmeb.github.io/edgyR.

It should be noted that R 3.4 is in the Jetson Nano repositories and works, but I’ve been unable to get the current release, 3.6.2, to pass “make check” after compiling, and I’m abandoning that effort. I have RStudio Server working but can’t get RStudio Desktop to build because Qt 5 is too old on the L4T distro.

Arm64 Docker images work, but I haven’t found any supported ones for CUDA, just a few individual efforts with little documentation. I have two next steps planned:

  1. Test update-alternatives with the R in L4T to see if it can access the CUDA BLAS / LAPACK (a quick way to check which BLAS / LAPACK R is linked against is sketched just after this list).
  2. See if I can access the Jetpack Python AI tools from R via reticulate.
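For anyone following along, here is how to see which BLAS / LAPACK a given R build is actually linked against (plain base R, nothing CUDA-specific assumed):

    # In recent R (3.4 and later) sessionInfo() records the BLAS and LAPACK
    # shared-library paths, so you can tell whether update-alternatives took effect.
    si <- sessionInfo()
    cat("BLAS:   ", si$BLAS,   "\n")
    cat("LAPACK: ", si$LAPACK, "\n")

    # The LAPACK version R reports for itself
    La_version()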

Both of the R GPU packages noted in that blog post have been withdrawn from CRAN. However, there is a package, gpuR (https://cran.r-project.org/web/packages/gpuR/index.html), that can access an NVIDIA GPU if the OpenCL drivers are installed, which I believe is the default. I've tested this on an HP Omen with a GTX 1050 Ti in Arch Linux, and it works.
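In case it helps, a minimal sketch of what the gpuR route looks like, assuming the package installs cleanly and an OpenCL driver is visible (the matrix size and float type are arbitrary choices on my part):

    library(gpuR)

    detectGPUs()      # how many OpenCL-visible GPUs gpuR can see
    listContexts()    # the devices / contexts gpuR can use

    n <- 1024
    A <- gpuMatrix(rnorm(n * n), nrow = n, ncol = n, type = "float")
    B <- gpuMatrix(rnorm(n * n), nrow = n, ncol = n, type = "float")

    C <- A %*% B      # the multiplication is dispatched to the GPU
    C[1:3, 1:3]       # pull a corner back to the host to inspect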

As I noted in my previous post, another way to get at the NVIDIA tools from R is to access the Python bindings via the R package reticulate (https://rstudio.github.io/reticulate/). I’m pretty sure this will work for the Keras / TensorFlow stack, which is on my roadmap for edgyR.
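A rough sketch of that route, assuming a Python environment with a CUDA-enabled TensorFlow 2.x already installed (the Python path is a placeholder; nothing here is edgyR-specific):

    library(reticulate)

    # Point reticulate at the Python that has the TensorFlow stack installed.
    use_python("/usr/bin/python3", required = TRUE)

    tf <- import("tensorflow")
    tf$config$list_physical_devices("GPU")   # should list the GPU if CUDA is visible

    # A tiny computation that TensorFlow will place on the GPU when one is available
    a <- tf$constant(matrix(as.numeric(1:4), 2, 2))
    b <- tf$constant(matrix(as.numeric(5:8), 2, 2))
    tf$matmul(a, b)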