Question on GraspGen Output Validity with Custom Parallel Gripper (xArm Lite 6)

Hello NVIDIA GraspGen developers and community,

I’m using the GraspGen framework for 6-DOF grasp generation, and the predicted grasp poses look implausible. I’d like to confirm whether my setup and usage are appropriate for the framework.

1. My System and Hardware

| Component | Detail |
| --- | --- |
| GPU | NVIDIA GeForce GTX 1660 Ti (6 GB VRAM) |
| CUDA Version | 12.8 |
| Camera | Intel RealSense D435i |
| Robot Arm | xArm Lite 6 |
| Gripper | Custom Lite Gripper (parallel-jaw, on/off actuation only) |

2. My Workflow and Observation

  1. Input Preparation: I capture the scene with the RealSense D435i and segment it to isolate the target object (I am testing with a banana and a spoon). I pass the resulting object point cloud to GraspGen (a sketch of this step and of how I pick the final pose follows this list).

  2. GraspGen Execution: The framework runs to completion without any error messages or warnings (no CUDA out-of-memory or other runtime issues).

  3. Output Problem: GraspGen produces a set of candidate grasp poses, and I select the single highest-scoring one (Best 1). However, this pose is visibly strange and inappropriate for grasping the object: it is not a suitable, safe, or plausible starting pose. (I will attach an image illustrating the pose the framework suggests.)

  4. Goal: My current concern is not execution or motion planning (MoveIt) but the quality of the predicted grasp pose itself. The pose is so unusual that planning cannot even begin from a suitable starting position.
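
For reference, here is a simplified sketch of my pipeline from the segmented object cloud to the selected pose. It uses Open3D for the clean-up; `run_graspgen` in the usage comment is only a placeholder for however I invoke the GraspGen inference script with the provided gripper config, not an actual GraspGen API.

```python
import numpy as np
import open3d as o3d


def preprocess_object_cloud(points_xyz: np.ndarray, voxel_size: float = 0.003) -> np.ndarray:
    """Clean the segmented object point cloud before passing it to GraspGen.

    points_xyz: (N, 3) points in the camera frame, in metres, already masked
    to the target object (banana / spoon) by the segmentation step.
    """
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points_xyz)

    # Drop stray depth noise around the segmentation boundary.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)

    # Downsample to a roughly uniform point density.
    pcd = pcd.voxel_down_sample(voxel_size)

    return np.asarray(pcd.points)


def select_best_grasp(grasp_poses: np.ndarray, grasp_scores: np.ndarray) -> np.ndarray:
    """Return the single highest-scoring ("Best 1") grasp pose.

    grasp_poses: (K, 4, 4) homogeneous transforms, grasp_scores: (K,),
    as returned by the GraspGen inference step (however it is invoked).
    """
    return grasp_poses[int(np.argmax(grasp_scores))]


# Usage (run_graspgen is only a placeholder for my call into the GraspGen
# inference script with the Robotiq-2F-140 config, not a real API):
#   points = preprocess_object_cloud(segmented_points)
#   poses, scores = run_graspgen(points)
#   best_pose = select_best_grasp(poses, scores)
```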

3. Core Questions on GraspGen Usage

  1. Gripper Model Matching: Since I am using a custom parallel gripper (the Lite Gripper), I started from the configuration file for a similar standard parallel gripper (e.g., Robotiq 2F-140 or Franka Panda) provided in the GraspGen repository. Is it necessary to fine-tune the model or perform additional calibration/setup even for inference-only use with a custom gripper that is geometrically similar to the provided models? (A rough dimension comparison I have been doing is sketched after this list.)

  2. Object Generalization: GraspGen is trained on large synthetic datasets. Are objects like a banana (highly curved) or a spoon (thin, high aspect ratio) particularly challenging for the pre-trained model? If so, are there specific preprocessing steps (e.g., point density, centering/normalization, outlier removal, as in my sketch above) that I should double-check?

  3. Output Confidence: Given that the framework runs without explicit errors, does the Best 1 (top-scoring) pose inherently account for the gripper type and object geometry, so that a plausible starting pose can be expected? Or am I missing an inference parameter or post-filtering step that removes geometrically impossible grasps? (The kind of post-filter I have in mind is sketched after this list.)
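
To make question 1 concrete, this is the kind of geometric comparison I have been doing between my gripper and the supported models. The standard-gripper numbers are approximate values from their public spec sheets, not values read out of the GraspGen config files, and the Lite Gripper entry is a placeholder for my own measurement.

```python
import numpy as np

# Approximate maximum jaw openings (mm). Robotiq/Panda values are from public
# specs, NOT from the GraspGen configs; the Lite Gripper value is a placeholder.
MAX_OPENING_MM = {
    "robotiq_2f_140": 140.0,
    "franka_panda": 80.0,
    "lite_gripper": None,  # TODO: measure and fill in
}


def object_fits_gripper(object_points_m: np.ndarray, max_opening_mm: float,
                        margin_mm: float = 5.0) -> bool:
    """Crude check: does the object's smallest bounding-box extent fit
    between the fingers at full opening, with a small margin?"""
    extents_mm = (object_points_m.max(axis=0) - object_points_m.min(axis=0)) * 1000.0
    return float(extents_mm.min()) + margin_mm <= max_opening_mm
```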
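
And for question 3, this is the kind of post-filter I am experimenting with on top of the raw scores. The grasp-frame convention assumed here (z = approach direction, x = finger-closing direction) is my own guess, not something I have confirmed against the GraspGen output format.

```python
import numpy as np


def filter_plausible_grasps(grasp_poses: np.ndarray,    # (K, 4, 4), world frame, +z up
                            grasp_scores: np.ndarray,   # (K,)
                            object_points: np.ndarray,  # (N, 3), same frame
                            max_opening_m: float) -> np.ndarray:
    """Return indices of grasps that pass two crude geometric checks,
    sorted by their original score (highest first).

    Assumed convention (NOT confirmed): column 2 of the rotation is the
    approach axis, column 0 is the finger-closing axis.
    """
    keep = []
    down = np.array([0.0, 0.0, -1.0])
    for i, pose in enumerate(grasp_poses):
        approach = pose[:3, 2]
        closing = pose[:3, 0]
        center = pose[:3, 3]

        # Reject grasps that approach the object from below the table.
        if np.dot(approach, down) < 0.0:
            continue

        # Reject grasps where the object is wider than the jaw opening
        # along the closing direction near the grasp centre.
        rel = object_points - center
        near = rel[np.linalg.norm(rel, axis=1) < 0.05]  # points within 5 cm of the grasp
        if near.size and np.ptp(near @ closing) > max_opening_m:
            continue

        keep.append(i)

    keep = np.array(keep, dtype=int)
    return keep[np.argsort(-grasp_scores[keep])] if keep.size else keep
```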

Any advice on validating the GraspGen configuration for a custom parallel gripper and non-standard objects would be highly appreciated. Thank you!