segmentation fault (core dumped)

I am getting a "Segmentation fault (core dumped)" error when running openai/gym examples such as CartPole and LunarLander. If I remove the env.render() call, the script runs fine.

vi cart.py
import gym
import faulthandler
faulthandler.enable()
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break

env.close()

Output
$python3 cart.py
Fatal Python error: Segmentation fault
Current thread 0x0000007fa6799010 (most recent call first):
  File "/home/nvidia/.local/lib/python3.6/site-packages/pyglet/gl/lib_glx.py", line 74 in link_GL
  File "/home/nvidia/.local/lib/python3.6/site-packages/pyglet/gl/glx.py", line 440 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1023 in _handle_fromlist
  File "/home/nvidia/.local/lib/python3.6/site-packages/pyglet/gl/xlib.py", line 16 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "/home/nvidia/.local/lib/python3.6/site-packages/pyglet/gl/__init__.py", line 221 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "/home/nvidia/packages/openai/gym/gym/envs/classic_control/rendering.py", line 23 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 678 in exec_module
  File "<frozen importlib._bootstrap>", line 665 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 955 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 971 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1023 in _handle_fromlist
  File "/home/nvidia/packages/openai/gym/gym/envs/classic_control/cartpole.py", line 150 in render
  File "/home/nvidia/packages/openai/gym/gym/core.py", line 275 in render
  File "cart.py", line 8 in <module>
Segmentation fault (core dumped)
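
The traceback above ends inside pyglet's GL bindings, reached through an import chain that starts in gym's rendering module. As a rough way to narrow this down, the sketch below runs a snippet in a child interpreter and reports whether it was killed by SIGSEGV, so the crash cannot take the calling script down with it. The helper name died_from_sigsegv is made up for illustration, and the sketch assumes a POSIX system with pyglet installed for the same interpreter.

```python
import signal
import subprocess
import sys


def died_from_sigsegv(code):
    """Run `code` in a fresh interpreter; return True if SIGSEGV killed it."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a negative return code is the number of the signal
    # that terminated the child process.
    return result.returncode == -signal.SIGSEGV


if __name__ == "__main__":
    # Assumption: importing pyglet.gl alone is enough to reach link_GL,
    # as the traceback suggests.
    print("import pyglet.gl segfaults:", died_from_sigsegv("import pyglet.gl"))
```

If the import alone reproduces the crash, gym and the example script are probably not at fault.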

Hi,

A segmentation fault is usually triggered by a memory-related issue.

Could you monitor memory usage to see whether you are hitting an out-of-memory (OOM) condition?

sudo tegrastats
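
Besides tegrastats, available memory can be sampled from inside the script itself. The following is a minimal sketch, assuming a Linux kernel that exposes MemTotal and MemAvailable in /proc/meminfo; the function names are illustrative only.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_fraction(fields):
    """Fraction of total RAM the kernel estimates is still available."""
    return fields["MemAvailable"] / fields["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling available_fraction(read_meminfo()) just before env.render() gives a quick indication of whether memory is nearly exhausted at the moment of the crash.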

Please also run your program under cuda-memcheck to check that each memory access is valid.

cuda-memcheck python3 cart.py

Thanks.

sudo ./tegrastats is working fine.
Also, I am not running any CUDA code, so cuda-memcheck is of no use here, and it does not work either.

I tried debugging with gdb.
$gdb python3
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright © 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...Reading symbols from /usr/lib/debug/.build-id/d9/15d7e8e05672ddea91e8185899af452d78b413.debug...done.
done.
(gdb)
(gdb) run cart.py
Starting program: /usr/bin/python3 cart.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fb650a1f0 (LWP 8543)]
[New Thread 0x7fb5d091f0 (LWP 8544)]
[New Thread 0x7fb35081f0 (LWP 8545)]
[Thread 0x7fb35081f0 (LWP 8545) exited]
[Thread 0x7fb5d091f0 (LWP 8544) exited]
[Thread 0x7fb650a1f0 (LWP 8543) exited]

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x0000007fb12f1670 in ?? () from /usr/lib/aarch64-linux-gnu/libGLX.so.0
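
Since gdb places the fault inside libGLX.so.0, it can help to see exactly which GL-related shared objects a process has mapped. Here is a small sketch that filters the text of /proc/<pid>/maps; the function name and the default "libGL" filter are assumptions chosen for illustration.

```python
def mapped_libraries(maps_text, needle="libGL"):
    """Return sorted, de-duplicated file paths from /proc/<pid>/maps text
    whose path contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address, perms, offset, dev, inode, pathname.
        # Anonymous mappings have no pathname and yield only five fields.
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)
```

For a live process, pass the contents of /proc/<pid>/maps (or /proc/self/maps from within Python). More than one libGL or libGLX path in the result would hint at mixed GL stacks.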

Something may be attempting to use the wrong version of libGLX.so. Does every line report "OK" with:

sha1sum -c /etc/nv_tegra_release
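
For reference, the check that sha1sum -c performs can be sketched in Python as below. This assumes the manifest holds one "<sha1> <path>" entry per line (optionally with a "*" before the path) and that any other line, such as a leading comment, can be skipped; the function names are illustrative.

```python
import hashlib


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest_path):
    """Yield (path, ok) for every '<sha1> <path>' line in the manifest."""
    with open(manifest_path) as f:
        for line in f:
            parts = line.split(None, 1)
            # Skip blank lines, comments, and anything without a 40-char sum.
            if len(parts) != 2 or len(parts[0]) != 40:
                continue
            expected = parts[0].lower()
            path = parts[1].strip().lstrip("*")
            try:
                ok = sha1_of(path) == expected
            except OSError:
                ok = False  # a missing or unreadable file counts as a failure
            yield path, ok
```

Listing only the failures, for example [p for p, ok in check_manifest("/etc/nv_tegra_release") if not ok], is easier to scan than the full output.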

Yes, I already tried it.
Here is the output
$ sha1sum -c /etc/nv_tegra_release
/usr/lib/aarch64-linux-gnu/tegra/libnvmmlite_utils.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvphsd.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libtegrav4l2.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvtestresults.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmm_parser.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvfnetstorehdfx.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvtvmr.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvtracebuf.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvfnet.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvidia-egl-wayland.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcamlog.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_generic.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmedia.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvphs.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvargus.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnveglstreamproducer.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcamv4l2.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcapture.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvdla_runtime.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_graphics.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvargus_socketclient.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnveventlib.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_il.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_force.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvos.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcameratools.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmmlite_image.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvavp.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmm_utils.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvrm_graphics.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvargus_socketserver.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvll.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_camera.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcolorutil.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvdc.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libsensors.hal-client.nvs.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvddk_vic.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvosd.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcam_imageencoder.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvisp_utils.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_gpucompute.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmmlite.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvdla_compiler.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_boot.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvrm_gpu.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvomxilclient.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvodm_imager.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnveglstream_camconsumer.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvimp.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libsensors_hal.nvs.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmm.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvparser.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_tbc.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmm_contentpipe.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvwinsys.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvapputil.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvtx_helper.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvtnr.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_spincircle.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvcamerautils.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvddk_2d_v2.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvjpeg.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvgov_ui.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvfnetstoredefog.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvexif.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvscf.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvmmlite_video.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libsensors.l4t.no_fusion.nvs.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvomx.so: OK
/usr/lib/aarch64-linux-gnu/tegra/libnvrm.so: OK
/usr/lib/aarch64-linux-gnu/libv4l/plugins/libv4l2_nvvidconv.so: OK
/usr/lib/aarch64-linux-gnu/libv4l/plugins/libv4l2_nvvideocodec.so: OK
/usr/lib/xorg/modules/extensions/libglxserver_nvidia.so: OK
/usr/lib/xorg/modules/drivers/nvidia_drv.so: OK

So just to re-emphasize this error:

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x0000007fb12f1670 in ?? () from /usr/lib/aarch64-linux-gnu/libGLX.so.0

That looks like the correct libGLX.so, so perhaps your Python program expects a different version. I don't know much about Python or this particular program, but if this is not an outright bug in the program, then it is probably built against a different release (e.g., it is not linked correctly, or it targets the wrong standard). You might want to check which versions each component requires.
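
On the Python side, installed package versions can be listed without importing the packages themselves, which matters here because importing pyglet's GL module is what crashes. A minimal sketch follows; it covers only Python distributions, not system libraries, and falls back to pkg_resources on interpreters older than 3.8 (such as the 3.6 used in this thread).

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Older interpreters: fall back to setuptools' pkg_resources.
    from pkg_resources import DistributionNotFound as PackageNotFoundError
    from pkg_resources import get_distribution

    def version(name):
        return get_distribution(name).version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found
```

For example, installed_versions(["gym", "pyglet", "PyOpenGL"]) shows at a glance which pyglet release the interpreter actually picks up.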

I followed these steps for gym installation:

sudo apt install python3-pip
git clone https://github.com/openai/gym
cd gym/
sudo apt install -y python3-dev zlib1g-dev libjpeg-dev cmake swig python-pyglet python3-opengl libboost-all-dev libsdl2-dev libosmesa6-dev patchelf ffmpeg xvfb

sudo apt-get install gcc gfortran python3-dev libopenblas-dev liblapack-dev cython

Thanks

Someone else will have to answer whether those packages are valid, but just to speculate, perhaps SDL or Mesa expects something different from what the NVIDIA libGLX.so provides. If not, then the code itself has probably done something wrong.

Hi,

Could you try this installation script for gym:
https://github.com/dusty-nv/jetson-reinforcement/blob/master/CMakePreBuild.sh

We have verified this on Jetson and it should work normally.
Thanks.

The plain pip command is not working; it throws this error:
pip --version
Traceback (most recent call last):
  File "/usr/bin/pip", line 9, in <module>
    from pip import main
ImportError: cannot import name main

pip3 is installed properly.

Hi,

Try this:

Edit the file '/usr/bin/pip':

diff --git a/pip b/pip
index 56bbb2b..62f26b9 100755
--- a/pip
+++ b/pip
@@ -6,6 +6,6 @@ import sys
 # Run the main entry point, similarly to how setuptools does it, but because
 # we didn't install the actual entry point from setup.py, don't use the
 # pkg_resources API.
-from pip import main
+from pip import __main__
 if __name__ == '__main__':
-    sys.exit(main())
+    sys.exit(__main__._main())

Apply the same change to the pip3 script if that is the one you use.
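
An alternative that avoids editing the wrapper script is to invoke pip as a module of the interpreter you intend to use, which sidesteps a stale /usr/bin/pip entirely. A minimal sketch, with illustrative helper names:

```python
import subprocess
import sys


def pip_command(*args, python=sys.executable):
    """Build the argument list for running pip as a module of `python`."""
    return [python, "-m", "pip"] + list(args)


def run_pip(*args):
    """Run pip through the current interpreter and return its exit status."""
    return subprocess.run(pip_command(*args)).returncode
```

For instance, run_pip("--version") is equivalent to typing python3 -m pip --version in the shell when the script itself runs under python3.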

Thanks.

The issue described in comment #1 is fixed by the following changes:

sudo pip3 install pyglet==1.3.1
sudo sed -i 's/_have_getprocaddress = True/_have_getprocaddress = False/' /usr/local/lib/python3.6/dist-packages/pyglet/gl/lib_glx.py
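
The sed command above hard-codes the install path, which may differ between systems (the traceback in comment #1 shows pyglet under ~/.local, for example). The same edit can be sketched in Python so that the file is located from the installed package and a backup is kept. This is an illustration only: the function names are made up, and the sketch assumes that the assignment appears verbatim in lib_glx.py, as the sed pattern implies. It deliberately avoids importing pyglet.gl, since that import is what crashes.

```python
import importlib.util
from pathlib import Path

OLD = "_have_getprocaddress = True"
NEW = "_have_getprocaddress = False"


def patch_source(text):
    """Return (patched_text, number_of_replacements)."""
    return text.replace(OLD, NEW), text.count(OLD)


def locate_lib_glx():
    """Find pyglet/gl/lib_glx.py by locating the top-level pyglet package."""
    spec = importlib.util.find_spec("pyglet")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent / "gl" / "lib_glx.py"


def patch_file(path):
    """Patch the file in place, saving the original next to it as *.orig."""
    original = path.read_text()
    patched, count = patch_source(original)
    if count:
        path.with_name(path.name + ".orig").write_text(original)
        path.write_text(patched)
    return count
```

Calling patch_file(locate_lib_glx()) needs write access to the package directory, so it may have to run under sudo for a system-wide install.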

Thanks.

Thanks, it worked.