My board is completely unusable. It locks up, or it shows graphical glitches, then the start-up logo, and then drops back to the Ubuntu login screen. This happens at random intervals, but I can’t do more than 3 minutes of anything without it logging me out. I’m guessing something is wrong with my board. Who do I contact to get this resolved?
To be clearer, there are two issues. One is a hard lockup, which has happened pretty often, but less often than the other issue, where within 3 minutes or less it glitches and logs me out.
Have you checked whether the X.Org log (maybe /var/log/Xorg.0.log) or the kernel log (through the serial port) is showing errors?
If you want to get the board replaced I guess you need to contact whoever you bought it from. If it was NVIDIA, there seems to be more info on that at: https://store.nvidia.com/store/nvidia/help#returns
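If the X.Org log can only be opened for a few seconds before the session crashes, it may help to pull out just the flagged lines. Below is a minimal Python sketch, assuming the usual X.Org markers `(EE)` for errors and `(WW)` for warnings. The function name `xorg_problems` is only illustrative, and the log path may differ on your image.

```python
def xorg_problems(lines):
    """Return the lines flagged as errors (EE) or warnings (WW)."""
    return [ln.rstrip("\n") for ln in lines if "(EE)" in ln or "(WW)" in ln]


# Possible usage (the path is an assumption, adjust as needed):
#
#     with open("/var/log/Xorg.0.log", errors="replace") as fh:
#         for line in xorg_problems(fh):
#             print(line)
```

The legend line near the top of the log also contains both markers, so expect it in the output. Running this from the serial console shell rather than from the desktop should avoid losing the result when X restarts.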
@kulve, it is hard to look at the logs because it usually blows up before I can, either with a hard reset or otherwise. I did see something in the Xorg log along the lines of “failed to enable Nvidia Damage manager”.
It seems the logouts are accelerating and I can barely get the log files open before it crashes. I can’t even get a copy in.
I got the board from Nvidia, and I would think they would want to debug it either way in case it is a common issue, because then it will likely happen to more people. Also, I would rather not just do a refund and get stuck at the back of the backorder line.
So I tried contacting Live assistance, which the Jetson TK1 board site says I should do if there are problems, and they just directed me back here. Is there anyone on these forums from Nvidia who can confirm what I need to do to get support?
You should connect the serial port to your PC and start a terminal application (e.g. Minicom on Linux) so that you can see the kernel messages from the Jetson as they are printed. Often the kernel prints something before hanging, and with the serial port you will see it even if the board freezes immediately afterwards.
It would be helpful to see the issue. Can you record what’s happening on your screen with a cell phone camera, then upload it and post a link? If it’s a damage issue, that sounds like it may be software-related, but it’s hard to know for sure without more information.
The serial output mentioned by kulve would be helpful as well. The nvidia-bug-report-tegra.log file generated by the nvidia-bug-report-tegra.sh command may also be useful.
Ok so I got a serial cable setup and monitored the output.
When the machine glitches, logs out, and sends me back to the login screen, this is what shows up on the terminal. NOTE: this only happens the first time. All the other times I don’t get anything on the serial port. Also, when it hard locks up, nothing is sent.
[ 16.246395] vgaarb: this pci device is not a vga device
[ 16.451713] vgaarb: this pci device is not a vga device
ubuntu@tegra-ubuntu:~$ [ 152.360179] vgaarb: this pci device is not a vga device
[ 153.374352] vgaarb: this pci device is not a vga device
[ 153.482586] vgaarb: this pci device is not a vga device
As you probably noticed, you also have a shell over the serial port (you can see the “ubuntu@tegra-ubuntu:” prompt). You can type e.g. “ls” in there. Does that shell also hang (i.e. you can’t type anything) when the system locks up?
Unfortunately I’m running out of ideas. I haven’t seen a situation where e.g. the numlock LED on the USB keyboard doesn’t toggle when you press the numlock key but you still have a working shell (i.e. the kernel is running fine).
One of my Jetson boards just started exhibiting problems that sound somewhat related to the ones described here. In my case, X starts, but unity crashes immediately, leaving no window manager. If I launch an xterm from a shell on the serial console by setting the DISPLAY environment variable, I can get an xterm up on the X11 display and use the console. If I look within .xsession-errors, I see these:
ubuntu@tegra-ubuntu:~$ cat .xsession-errors
NvRmMemInit failed
Error: Can’t open /dev/nvhost-ctrl
*** Error in `/usr/lib/nux/unity_support_test’: double free or corruption (!pre*
Aborted
Script for ibus started at run_im.
Script for auto started at run_im.
Script for default started at run_im.
I see these kinds of messages on both the board that’s misbehaving, and on my other board at home that has (thus far) been working fine:
[ 1242.302178] vgaarb: this pci device is not a vga device
I see a fair number of those on both machines, so I don’t think those messages necessarily mean anything in terms of the misbehavior I’m seeing with unity crashing…
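To separate harmless noise like the vgaarb lines from messages that only show up on the misbehaving board, the two kernel logs can be compared with the timestamps stripped. This is a rough Python sketch, assuming each log line starts with the bracketed uptime stamp seen in the output above. The helper names `message_counts` and `only_in_first` are made up for illustration.

```python
import re
from collections import Counter

# Kernel log lines start with a bracketed uptime stamp, e.g. "[ 152.360179] ".
_STAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def message_counts(lines):
    """Count kernel messages, ignoring the leading timestamp."""
    counts = Counter()
    for ln in lines:
        msg = _STAMP.sub("", ln.strip())
        if msg:
            counts[msg] += 1
    return counts


def only_in_first(bad_lines, good_lines):
    """Messages seen on the misbehaving board but never on the healthy one."""
    good = set(message_counts(good_lines))
    return sorted(m for m in message_counts(bad_lines) if m not in good)
```

Feeding it the saved `dmesg` output from both boards should leave the vgaarb message out of the result, since it appears on both. Lines that begin with a shell prompt instead of a timestamp are kept as-is and would need trimming by hand.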
I was able to ‘apt-get install twm’ to put twm onto the machine so I can at least have a working window manager until I get past the unity crashes. This made the X11 console usable for the time being.
I just picked up my Jetson TK1 from Micro Center last night. I’m having the same problem as OP, Nvidia driver crashes and I see the Nvidia Beta logo flicker real quick and it drops me back at the unity login screen.
Anyone have any success in contacting anyone or getting the board swapped or fixing the driver bug?
EDIT: Turned out to be the cheap generic USB hub I picked up at Micro Center. I had my Razer Blackwidow Ultimate and Black Mamba plugged in. I unplugged it and switched to my logitech MK270 cheap wireless kb/mouse combo with the single dongle that I use on my OpenELEC Raspberry Pi. All the lag/stuttering and crashes to login screen stopped.
Could you provide as much information as possible about the “cheap generic USB hub”? Does it cause problems only with both the keyboard and the mouse connected, or also with just one of them?
Also do you see any possibly related messages in the kernel log (either over serial or by running the command “dmesg”)?
EDIT: The USB vendor and product IDs (VID:PID in lsusb) of the hub would be useful information too.
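For anyone collecting the IDs from several setups, a small script can extract the VID:PID pairs from `lsusb`. This is only a sketch, assuming the usual `Bus NNN Device NNN: ID vvvv:pppp description` line format and that `lsusb` (from usbutils) is installed. The names `parse_lsusb` and `list_usb_devices` are illustrative.

```python
import re
import subprocess

# Matches lines of the form "Bus 001 Device 002: ID vvvv:pppp Description".
_LSUSB = re.compile(
    r"^Bus (\d+) Device (\d+): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\s*(.*)$"
)


def parse_lsusb(text):
    """Return a list of (bus, device, vid, pid, description) tuples."""
    rows = []
    for line in text.splitlines():
        m = _LSUSB.match(line.strip())
        if m:
            rows.append(m.groups())
    return rows


def list_usb_devices():
    """Run lsusb and parse its output (assumes lsusb is on the PATH)."""
    out = subprocess.check_output(["lsusb"], universal_newlines=True)
    return parse_lsusb(out)
```

Calling `list_usb_devices()` once with the hub attached and once without should make it easy to see which entry is the hub itself.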
I installed a large Ubuntu patch set on Tuesday 5/27 and the X11 window manager crashes have now gone away and the system is quite a bit more stable than it had been. I would encourage others to run the latest system updates and see if they fix other problems.
I am using a mechanical keyboard (Cooler Master QuickFire) and mouse (Logitech G500) with a generic USB hub. I get the same flashing of the NVIDIA logo before logout.