Automatically creating device nodes

I’d like to reliably create device nodes for my CUDA cards.
There is a script that counts the number of cards with
lspci | grep -e nvidia -e 3D
But this doesn’t work very well for me, since my cards are not listed as “3D controllers”.

What I currently do is
cards=$(lspci | grep VGA | grep -c nVidia)
but that’s not reliable either. On a system with a 7300 and a 295 this matches the 7300 and “half” the 295.

What I’d like to do is run deviceQuery and awk the x out of “NumDevs = x”, but that doesn’t work, because deviceQuery won’t exit until the user presses a key.
Not very useful for automation.

So what’s the best way you know of, to solve this? :)

I would be incredibly surprised if you could run a useful device query before creating the devices CUDA needs to work…

Avid is correct: deviceQuery will fail if the drivers are not properly installed.

We use the following:

#!/bin/bash
# Re-create the NVIDIA device nodes if /dev/nvidiactl is missing.

timestamp=$(date)
machine=$(hostname)   # host name for the log message

#  modprobe nvidia

if [ ! -c /dev/nvidiactl ]
then
  echo "Server $machine device files re-created by crontab at $timestamp" >> ~/devicecrash.log

  # Count the number of NVIDIA controllers found.
  N3D=$(/sbin/lspci | grep -i NVIDIA | grep -c "3D controller")
  NVGA=$(/sbin/lspci | grep -i NVIDIA | grep -c "VGA compatible controller")
  N=$((N3D + NVGA - 1))

  # One node per controller (minor numbers 0..N), plus the control node.
  for i in $(seq 0 $N); do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done
  mknod -m 666 /dev/nvidiactl c 195 255

  # Site-specific script.
  . /home/run/stopGSlaves.sh
else
  echo "Files exist"
  exit 1
fi
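
A possible variation on the counting step: matching on the lspci description text depends on how each card happens to be labelled, so a sketch that filters on numeric IDs instead might look like the following. It assumes vendor ID 10de for NVIDIA and PCI class codes 0300 (VGA compatible controller) and 0302 (3D controller). The function names are made up for illustration, and the second function only prints the mknod commands rather than running them.

```shell
#!/bin/sh
# Sketch only: count NVIDIA GPUs from `lspci -n` output using numeric IDs.
# Assumptions: vendor 10de = NVIDIA, class 0300 = VGA, class 0302 = 3D.
# count_nvidia_gpus and print_mknod_commands are hypothetical names.

count_nvidia_gpus() {
    # Expects `lspci -n` lines on stdin, for example:
    #   07:00.0 0300: 10de:05e7 (rev a1)
    # Field 2 is the class code, field 3 is vendor:device.
    awk '$3 ~ /^10de:/ && ($2 == "0300:" || $2 == "0302:") { n++ }
         END { print n + 0 }'
}

print_mknod_commands() (
    # $1: number of cards. Prints, but does not run, one mknod per card
    # plus the control node (major 195, minor 255).
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "mknod -m 666 /dev/nvidia$i c 195 $i"
        i=$((i + 1))
    done
    echo "mknod -m 666 /dev/nvidiactl c 195 255"
)
```

To see what would be created, something like this should do; run the printed commands as root only after checking them:

print_mknod_commands "$(lspci -n | count_nvidia_gpus)"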

Change to the C/src/deviceQuery directory of your SDK, open deviceQuery.cpp, comment out the last line (// CUT_EXIT(argc, argv);) and type “make”.

The command string would be

cards=$(deviceQuery | grep supporting | awk '{print $3}')

Of course, it is trivial to code:

## devcount.cpp
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <cuda.h>
#include <cuda_runtime_api.h>
#include <cutil.h>

int main(int argc, char** argv)
{
	int deviceCount = 0;

	// Report failure through the exit status instead of printing an uninitialized count.
	if (cudaGetDeviceCount(&deviceCount) != cudaSuccess)
		return 1;

	printf("%d\n", deviceCount);
	return 0;
}

nvcc -I=/usr/local/cuda_sdk/C/common/inc/ devcount.cpp -o devcount
./devcount
2

As the others have noted, the title of this thread makes no sense.

The proper way to do it is by opening the /dev/nvidiactl device node and running a NV_CARD_INFO ioctl(2) on it.

Decoding the results makes clear the number of devices. This information can then be used to mknod(2) the appropriate /dev/nvidiaX nodes.

The nvidia module will be loaded by opening the nvidiactl device node, so long as char-major-195-* is properly bound to nvidia in modprobe.d/aliases.conf.
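
For anyone wanting to check that binding before relying on it, here is a rough sketch. It assumes the alias line has the usual "alias char-major-195* nvidia" shape and that the configuration lives in a directory such as /etc/modprobe.d; has_nvidia_alias is a made-up helper name, not something shipped with the driver.

```shell
#!/bin/sh
# Sketch only: succeed if some file under the given directory binds
# char-major-195 to the nvidia module.
# Assumption: the line looks like "alias char-major-195* nvidia".
# has_nvidia_alias is a hypothetical helper name.

has_nvidia_alias() {
    # $1: directory to search (defaults to /etc/modprobe.d).
    # -r recurse, -q quiet, -s hide errors for missing paths,
    # -E extended regex. Commented-out lines do not match.
    grep -rqsE '^[[:space:]]*alias[[:space:]]+char-major-195[^[:space:]]*[[:space:]]+nvidia' \
        "${1:-/etc/modprobe.d}"
}
```

A call such as the following prints a hint when the alias is missing:

has_nvidia_alias /etc/modprobe.d || echo "no char-major-195 alias found"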

The ‘cudaminimal’ binary built with my libcudest library performs these tasks when run as root (otherwise, mknod(2) will fail on EPERM). You can see how libcudest handles things here:

http://github.com/dankamongmen/libcudest/b…er/src/cudest.c

Look at init_ctlfd() and get_card_count(). There’s more info on libcudest on its home page:

http://dank.qemfd.net/dankwiki/index.php/Libcudest

Observe:

mmas $ ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 2010-07-08 06:50 /dev/nvidia0
crw-rw-rw- 1 root root 195, 1 2010-07-08 06:50 /dev/nvidia1
crw-rw-rw- 1 root root 195, 2 2010-07-08 06:50 /dev/nvidia2
crw-rw-rw- 1 root root 195, 3 2010-07-08 06:50 /dev/nvidia3
crw-rw-rw- 1 root root 195, 255 2010-07-08 05:32 /dev/nvidiactl
mmas $ sudo rm /dev/nvidia?
mmas $ ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 255 2010-07-08 05:32 /dev/nvidiactl
mmas $ sudo out/cudaminimal
cuInit:362] CTL handle (/dev/nvidiactl) at fd 3
convert_version:281] Expecting version ‘256.35’
Verified version 256.35
init_ctlfd:325] PAT support: yes
get_card_count:236] Probing for up to 32 cards
get_card_count:261] Found device 1 ID #7 (IRQ 18)
get_card_count:263] Domain: 0 Bus: 7 Slot: 0
get_card_count:265] Vendor ID: 0x10de Device ID: 0x05e7
get_card_count:266] Flags: 0x0001
get_card_count:268] Framebuffer: 0x4000000 @ 0xb4000000
get_card_count:270] Register base: 0x01000000b @ 0xd4000000
get_card_count:261] Found device 2 ID #9 (IRQ 16)
get_card_count:263] Domain: 0 Bus: 9 Slot: 0
get_card_count:265] Vendor ID: 0x10de Device ID: 0x05e7
get_card_count:266] Flags: 0x0001
get_card_count:268] Framebuffer: 0x4000000 @ 0xb8000000
get_card_count:270] Register base: 0x01000000b @ 0xd8000000
get_card_count:261] Found device 3 ID #17 (IRQ 18)
get_card_count:263] Domain: 0 Bus: 17 Slot: 0
get_card_count:265] Vendor ID: 0x10de Device ID: 0x05e7
get_card_count:266] Flags: 0x0001
get_card_count:268] Framebuffer: 0x4000000 @ 0xa4000000
get_card_count:270] Register base: 0x01000000b @ 0xc8000000
get_card_count:261] Found device 4 ID #19 (IRQ 16)
get_card_count:263] Domain: 0 Bus: 19 Slot: 0
get_card_count:265] Vendor ID: 0x10de Device ID: 0x05e7
get_card_count:266] Flags: 0x0001
get_card_count:268] Framebuffer: 0x4000000 @ 0xa8000000
get_card_count:270] Register base: 0x01000000b @ 0xcc000000
Found 4 cards
init_dev:174] Device #0 PMC: 0x2000b @ 0xd4000000
init_dev:192] Device #0 handle (/dev/nvidia0) at fd 4
init_dev:200] Architecture: GA0
GPU returned error 12 on fd 3
init_dev:174] Device #1 PMC: 0x2000b @ 0xd8000000
init_dev:192] Device #1 handle (/dev/nvidia1) at fd 4
init_dev:200] Architecture: GA0
GPU returned error 12 on fd 3
init_dev:174] Device #2 PMC: 0x2000b @ 0xc8000000
init_dev:192] Device #2 handle (/dev/nvidia2) at fd 4
init_dev:200] Architecture: GA0
GPU returned error 12 on fd 3
init_dev:174] Device #3 PMC: 0x2000b @ 0xcc000000
init_dev:192] Device #3 handle (/dev/nvidia3) at fd 4
init_dev:200] Architecture: GA0
GPU returned error 12 on fd 3
Couldn’t initialize CUDA (101)
mmas $ !ls
ls -l /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 2010-07-08 07:08 /dev/nvidia0
crw-rw-rw- 1 root root 195, 1 2010-07-08 07:08 /dev/nvidia1
crw-rw-rw- 1 root root 195, 2 2010-07-08 07:08 /dev/nvidia2
crw-rw-rw- 1 root root 195, 3 2010-07-08 07:08 /dev/nvidia3
crw-rw-rw- 1 root root 195, 255 2010-07-08 05:32 /dev/nvidiactl
mmas $

I hope this helps!

That code isn’t going to run if you’re lacking /dev/nvidia* devices, chief. You’ll draw a 100 or 101 error. Try it and see.

While this topic is pretty old, I figured I would post what I do to create nodes.

I created the following Perl script and call it from my /etc/rc.local script:

#!/usr/bin/perl -w
use strict;

# Create the control node if it is missing.
my $file = "/dev/nvidiactl";
unless ( -e $file )
{
    system("mknod", "-m", "666", $file, "c", "195", "255");
}

# One entry per GPU; the entry name is used as the minor number.
my @gpus = `ls -1 /proc/driver/nvidia/gpus`;
foreach my $dev (@gpus)
{
    chomp($dev);
    $file = "/dev/nvidia$dev";
    unless ( -e $file )
    {
        system("mknod", "-m", "666", $file, "c", "195", $dev);
    }
}
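
For comparison, here is a rough shell sketch of the same idea. The Perl script above uses each entry name under /proc/driver/nvidia/gpus as the minor number. As far as I know, some newer drivers name the entries by PCI address and put a "Device Minor:" line in an "information" file inside each entry, so the sketch handles both cases; treat that layout as an assumption to check on your own system. list_gpu_mknod is a made-up name, and it only prints the mknod commands for nodes that do not exist yet.

```shell
#!/bin/sh
# Sketch only: print mknod commands for GPUs listed in a
# /proc/driver/nvidia/gpus-style directory.
# Assumptions: an entry is either named by its minor number, or it
# contains an "information" file with a "Device Minor:" line.
# list_gpu_mknod is a hypothetical helper name.

list_gpu_mknod() (
    gpus_dir=${1:-/proc/driver/nvidia/gpus}   # where to look for GPUs
    dev_dir=${2:-/dev}                        # where the nodes would go
    for entry in "$gpus_dir"/*; do
        [ -e "$entry" ] || continue           # directory empty or missing
        name=${entry##*/}
        case $name in
            *[!0-9]*)                         # not a plain number
                minor=$(awk '/^Device Minor:/ { print $NF; exit }' \
                        "$entry/information" 2>/dev/null) ;;
            *)  minor=$name ;;
        esac
        [ -n "$minor" ] || continue           # minor number unknown
        [ -e "$dev_dir/nvidia$minor" ] && continue   # node already there
        echo "mknod -m 666 $dev_dir/nvidia$minor c 195 $minor"
    done
)
```

Running list_gpu_mknod with no arguments shows what would be created under /dev; piping the output to sh as root would actually create the nodes.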