Superpixels freespace segmentation for autonomous data collection

Hey everyone,
I’m trying to get a freespace DNN to work in my environment (following this tutorial), and I need to label freespace in real-world images automatically because I can’t work with simulation.

I need help with the following problems:

  • I can’t find documentation on the isaac::superpixels::RgbdSuperpixelFreespace component, and I can’t seem to obtain good freespace segmentation results. In particular, I’m wondering what the height_angle and height_tolerance parameters do and how to tune them.

  • From what I understand, this component also needs the pose between the camera and the robot, the default one being [0.270598, -0.653281, 0.653281, -0.270598, 0.0, 0.0, 0.775]. What does this correspond to exactly? How should I change it for use with Kaya?

  • When I try to run the training script on a prerecorded log containing depth and color image from the Realsense camera, the sample buffer successfully fills up, but I get this error :
    Number of samples in sample buffer: 0
    Number of samples in sample buffer: 1
    2020-06-22 14:48:22.626 ERROR ./messages/image.hpp@74: Image element type does not match: actual=1, expected=2
    This doesn’t crash the program, but I would still like to fix it.

Can somebody help me with that?

Also, I noticed that in packages/freespace_dnn/apps/freespace_dnn_data_annotation.subgraph.json, the value of the “label_invalid” parameter was originally 0.2, but this parameter expects an int, so it was also giving me an error.

Still stuck on this, I could really use some help.
I’d like at least some documentation on the RgbdSuperpixelFreespace component…

Hi there,

Sorry for the delay. Will make sure to get back to you soon!

Hello atorabi,
Any update? :-)

Hello, thank you for your interest in freespace segmentation, and apologies for the delay! Please find the answers to your queries below:

  1. The superpixels are transformed into the ground coordinate frame by assuming that the ground plane will be Z = 0 in the ground coordinate frame.
    The height_tolerance parameter is a threshold on the height (z axis) between the superpixel position and ground plane to decide if a superpixel is part of the ground plane. This helps us provide some flexibility in deciding which pixels are part of the ground plane.
    The height_angle parameter helps increase the height_tolerance based on the distance of the superpixel from the robot. The sine of this angle is added to the height_tolerance.

  2. The default pose was given as an example pose transformation between Carter and its camera. The first 4 values in the pose correspond to the rotation between the camera and the robot (in quaternion form), while the last 3 correspond to the translation (X, Y and Z). Isaac SDK also supports a more straightforward way of specifying the pose, in the following format:
    "PoseInitializer": {
      "lhs_frame": "camera",
      "rhs_frame": "robot",
      "pose": {
        "translation": [1.0, 0.0, 0.0],
        "rotation": {
          "yaw_degrees": 90.0
        }
      }
    }
    In this way, the translation values and rotation angles can be directly filled in for Kaya.

  3. Would you mind sharing if you’ve seen this affect any part of the application, for example, the viewers in Sight? This would help us narrow down the source!
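To make the 7-value pose from point 2 more concrete, here is a minimal sketch that splits it into rotation and translation and converts the quaternion into a rotation matrix. It assumes the quaternion is ordered (w, x, y, z), which is not stated above, and the helper names are illustrative rather than part of Isaac SDK.

```python
import math

def quat_to_matrix(w, x, y, z):
    """Convert a quaternion (assumed order: w, x, y, z) to a 3x3 rotation matrix."""
    # Normalize first so small rounding errors in the config do not skew the result.
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def split_pose(pose):
    """Split [q0, q1, q2, q3, tx, ty, tz] into (rotation matrix, translation)."""
    rotation = quat_to_matrix(*pose[:4])
    translation = list(pose[4:])
    return rotation, translation

# Default pose quoted earlier in this thread.
default_pose = [0.270598, -0.653281, 0.653281, -0.270598, 0.0, 0.0, 0.775]
rotation, translation = split_pose(default_pose)

# Each column of the matrix is one camera axis expressed in the robot frame.
camera_z_in_robot = [row[2] for row in rotation]
print(camera_z_in_robot, translation)
```

Under that ordering assumption, the camera z axis comes out as roughly (0.707, 0, -0.707) in the robot frame, i.e. pointing forward and about 45 degrees down, with the camera 0.775 along Z. Doing the same check on a Kaya pose is a quick way to confirm the camera is looking where you expect.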

thanks for your answer.

It doesn’t seem to affect any other part of the app (packages/freespace_dnn/apps:freespace_dnn_training): there are no other errors, the app creates network checkpoints regularly, and it ends when there are no more images in the replay file. In Sight, I can see the images used for training (color - depth - segmentation). I have no idea whether training is actually taking place. I didn’t test the resulting networks, as I would like to have freespace segmentation working first for correct annotation.

EDIT: I forgot that the app was opening a TensorBoard instance. I went and checked it: I can correctly see input_image, but ground_truth is completely black! So apparently the problem is linked to this, and training is not taking place…

Also, as you can see, I have not managed to make the segmentation work, even after changing the pose. Note that I don’t have Kaya built at the moment. I’m using the Realsense on a small tripod, but once it is mounted on Kaya, the pose should be similar.

(freespace segmentation corresponds to the green pixels in segmentation_viewer window)

Do you think these results are caused by a wrong pose, or by wrong freespace parameters (height_angle, height_tolerance, or maybe normal_threshold)?
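For reference when tuning, here is a rough sketch of the ground test as it was described in the reply above. The exact formula is not given in this thread: the sketch assumes the allowed height grows by distance * sin(height_angle) on top of height_tolerance, that the angle is in radians, and the function name and sample values are made up for illustration.

```python
import math

def is_ground(z, distance, height_tolerance, height_angle):
    """Decide whether a superpixel counts as ground.

    z: superpixel height in the ground frame (ground plane is Z = 0).
    distance: distance of the superpixel from the robot.
    Assumption: the tolerance grows with distance by distance * sin(height_angle).
    This is an interpretation of the description, not the SDK implementation.
    """
    allowed = height_tolerance + distance * math.sin(height_angle)
    return abs(z) <= allowed

# Same 8 cm height error, rejected close to the robot with no angle term...
near = is_ground(z=0.08, distance=1.0, height_tolerance=0.05, height_angle=0.0)
# ...but accepted 3 m away once a small angle widens the tolerance.
far = is_ground(z=0.08, distance=3.0, height_tolerance=0.05, height_angle=0.02)
print(near, far)
```

If this reading is right, a wrong camera pose shifts every z value at once, so no reasonable tolerance can compensate for it, whereas the two parameters only matter for small residual errors such as depth noise far from the camera.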

An update on this topic:

  • I managed to get good superpixel freespace segmentation by using the following pose:
"left_rgb_pose": {
        "lhs_frame": "camera",
        "rhs_frame": "robot",
        "pose": {
          "translation": [1.0, 0.0, 0.0],
          "rotation": {

in packages/freespace_dnn/apps/freespace_dnn_data_annotation.subgraph.json for the camera pose. I did not need to change height_tolerance and height_angle parameters.

  • However, I still have the error :
    ERROR ./messages/image.hpp@74: Image element type does not match: actual=1, expected=2
    when launching the training script (bazel run packages/freespace_dnn/apps:freespace_dnn_training).

I can see the superpixel segmentation in Sight, but I can’t see the “ground_truth” image when opening TensorBoard for the training, which probably means the segmentation images are not properly sent to the isaac::ml::SampleAccumulator node.

I looked into Isaac’s code (engine/core/tensor/element_type.hpp), and apparently image element type 1 is UInt8 and type 2 is UInt16.
So there is something wrong somewhere with the type of the segmentation image.

Finally I looked up the content of the SegmentationCameraProto message sent by isaac::superpixels::SuperpixelImageLabeling component in packages/freespace_dnn/apps/freespace_dnn_data_annotation.subgraph.json which does the ground truth segmentation, and all of its attributes are empty (InstanceImage, Labels, Pinhole), except for LabelImage, which surely contains the segmentation I can see in Sight.
Maybe that’s why it is not working??
Is there any tool I could use to see which message and which attribute generates the error?

I need to get this working by the end of next week, otherwise I will have to use superpixel freespace segmentation, but I really don’t want to do that. I want to use a neural network for this.

Help please :)

I have figured out what was wrong: I was using the same freespace_dnn_training config file as for a simulation example.
However, since I use real data and superpixel segmentation, the ground truth segmentation does not contain the labels, as I mentioned previously.

The error that I get comes from the segmentation viewer, which probably does not like that the SegmentationCameraProto message has no labels or instance image. But basically I don’t care about this message anymore.

I modified the freespace_dnn_training_unity3d_rng_warehouse.config.json config parameters for segmentation_encoder node from this :

"training.segmentation_encoder": {
    "SegmentationEncoder": {
      "class_label_names": ["floor"],
      "offset": 1
    }
}

to this :

"training.segmentation_encoder": {
    "SegmentationEncoder": {
      "class_label_indices": [1],
      "input_mode": "NoLabelsAvailable",
      "offset": 0
    }
}

In my superpixel freespace segmentation subgraph, I chose to label freespace as 1 and obstacles as 0.
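A quick way to confirm that convention before training is to count the label values in a few annotated images. The sketch below is plain Python and not an Isaac SDK API: it assumes the label image has already been extracted as a 2D list of integers, and the helper name is only illustrative.

```python
from collections import Counter

FREESPACE, OBSTACLE = 1, 0  # labeling convention described in this post

def label_stats(label_image):
    """Return (fraction of freespace pixels, sorted list of unexpected label values)."""
    counts = Counter(value for row in label_image for value in row)
    unexpected = sorted(set(counts) - {FREESPACE, OBSTACLE})
    total = sum(counts.values())
    freespace_ratio = counts[FREESPACE] / total if total else 0.0
    return freespace_ratio, unexpected

# Tiny hand-made label image: floor in the lower part, obstacles above.
example = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
]
ratio, unexpected = label_stats(example)
print(ratio, unexpected)
```

A ratio of 0 on real frames would be consistent with an all-black ground truth image, and any unexpected value would suggest that the labeling component and the encoder settings disagree on the label indices.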

With this, I can now see the ground truth image in TensorBoard, and the loss varies!! Yay!

BUT I have a new problem: the model does not learn, and the loss does not go down. It simply cycles as the training images loop (I have turned on looping in the replay node).
What could be the reason? I’m not sure I configured the segmentation encoder correctly, or that the parameters for the training script are good:

  "app_filename": "packages/freespace_dnn/apps/",
  "batch_size": 4,
  "class_weight": 10,
  "cols": 512,
  "data_points": 4000000,
  "gpu_memory_usage": 0.6,
  "initial_sample_count": 20,
  "learning_rate": 1e-3,
  "load_from_checkpoint": false,
  "num_classes": 1,
  "num_gpus": 1,
  "rows": 256,
  "save_every": 50,
  "sleep_duration": 1,
  "steps_per_eval": 1,
  "summary_every": 10,
  "timeout_duration": 3,
  "train_logdir": "/tmp/path_segmentation/",
  "use_mirrored_strategy": false

Does anyone know what the issue could be?

Same problem here that the loss is not converging… Did you solve this?

Hey! Sorry if what I’m saying is not really clear, it was a while ago and I’m not on this project anymore.

I did not really solve it, but I worked around the error by training as if I had 3 classes.
I would have liked to use 2 classes (one for freespace, one for non-freespace), but I think it caused errors with the dimensionality of the input tensors.

Concretely, I modified freespace_dnn_training.config.json:

"num_classes": 3,

and in freespace_dnn_training_realdata.config.json :

"class_label_indices": [0, 1],

Using 3 classes, the training actually works and the loss goes down.

In my applications using the network, I would only use the output segmentation image channel corresponding to the freespace segmentation…
This is a horrible solution, but I didn’t have time to look into it further.
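The idea of keeping only the freespace channel could look like the sketch below. It assumes the network output is available as a rows x cols x 3 nested list of per-class scores and that freespace ends up in channel 1; neither the layout nor the channel order is confirmed in this thread, so check both on your own output.

```python
def freespace_mask(output, channel=1, threshold=0.5):
    """Return a 2D boolean mask from a rows x cols x classes nested list.

    channel: index of the freespace class (assumed to be 1 here).
    threshold: score above which a pixel is treated as freespace.
    """
    return [[pixel[channel] > threshold for pixel in row] for row in output]

# Made-up 2 x 2 output with 3 scores per pixel.
output = [
    [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1]],
    [[0.1, 0.8, 0.1], [0.3, 0.3, 0.4]],
]
mask = freespace_mask(output)
print(mask)
```

With NumPy arrays the same selection is `output[..., channel] > threshold`.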