World frame to pixel frame

Hi, I’m trying to project a 3D point from the world frame to the pixel frame. I place red spheres on objects above the table and take a picture from a placed camera. However, the projected pixels, colored in red, are not aligned with the assigned spheres. The result is shown in the attached image.

The code to project 3D to 2D is as follows:

import numpy as np

def get_kp_2D(kps, camera_pose, camK):
    """Transform 3D keypoints (world frame) to 2D pixel coordinates."""
    kps_2d = []
    # extend the 3x3 intrinsics to a 3x4 projection matrix [K | 0]
    camK = np.concatenate((camK, np.array([[0.0, 0.0, 0.0]]).T), 1)
    camera_pose = pose_to_4x4(camera_pose)
    # world -> camera frame is the inverse of the camera-to-world pose
    world_to_cam = np.linalg.inv(camera_pose)
    for k in range(len(kps)):
        kp_pose = np.append(kps[k][:3], 1.0)  # homogeneous world point
        # transform 3D world point into the camera frame
        kp_in_cam =, kp_pose.T)
        z = kp_in_cam[2]
        # perspective projection onto the image plane
        kp_2d = 1 / z *, kp_in_cam)
        kp_2d = kp_2d.round().astype(np.int32)
        print("kp 2D:", kp_2d)
        kps_2d.append(kp_2d[:2])
    kps_2d = np.array(kps_2d)
    return kps_2d
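For reference, here is a self-contained sketch of the same world-to-pixel math on made-up numbers (the intrinsics and camera pose below are illustrative, not taken from the scene above), which can be used to sanity-check the projection independently of the simulator:

```python
import numpy as np

# Hypothetical intrinsics: 640x480 image, focal length 600 px.
camK = np.array([[600.0, 0.0, 320.0],
                 [0.0, 600.0, 240.0],
                 [0.0, 0.0, 1.0]])

# Camera 2 m above the world origin, looking straight down:
# a camera-to-world transform with the camera y and z axes flipped.
camera_pose = np.eye(4)
camera_pose[:3, :3] = np.diag([1.0, -1.0, -1.0])
camera_pose[:3, 3] = [0.0, 0.0, 2.0]

point_world = np.array([0.0, 0.0, 0.0, 1.0])  # on the table, under the camera

# World -> camera frame uses the INVERSE of the camera-to-world pose.
point_cam = np.linalg.inv(camera_pose) @ point_world
z = point_cam[2]
pixel = (camK @ point_cam[:3]) / z
# A point on the optical axis must land on the principal point (cx, cy).
print(pixel[:2].round().astype(int))  # -> [320 240]
```

If a point known to sit on the optical axis does not project to (cx, cy) with your matrices, the pose or intrinsics convention is the likely culprit rather than distortion.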

Hi @newuhe

Could you provide more details about the setup, e.g. where the involved matrices come from, a minimal working example, etc.? Thanks.

Thanks for replying. The camera_pose is from

def get_camera_pose(self):
        """Get camera pose.

        Returns:
            camera_pose: 4x4 camera-to-world pose matrix
        """
        stage = omni.usd.get_context().get_stage()
        prim = stage.GetPrimAtPath(self._viewport_window.get_active_camera())
        pose = omni.usd.get_world_transform_matrix(prim)
        rot_mat = np.array(pose.ExtractRotationMatrix())
        trans = np.array(pose.ExtractTranslation())
        camera_pose_matrix = np.zeros([4, 4])
        camera_pose_matrix[:3, :3] = rot_mat
        camera_pose_matrix[:3, -1] = trans
        camera_pose_matrix[-1, -1] = 1
        # rotate 180 degrees about the x axis for the camera convention
        camera_pose =
            camera_pose_matrix,
            pose_to_4x4(np.array([0.0, 0.0, 0.0, 180.0 / 180.0 * np.pi, 0.0, 0.0])),
        )
        camera_pose[:3, -1] = camera_pose[:3, -1] * get_stage_units()
        return camera_pose
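The 180-degree x-axis rotation is there because the USD camera looks along its local -z axis, while the usual computer-vision pinhole convention has +z pointing out of the lens. As a sketch (my own illustration; `pose_to_4x4` is the poster's undefined helper, so I build the equivalent matrix explicitly), the flip looks like this:

```python
import numpy as np

# A 180-degree rotation about the camera's x axis, converting between the
# USD camera convention (looks along -z) and the computer-vision convention
# (looks along +z). This is presumably what pose_to_4x4(...) produces for a
# pi rotation about x.
theta = np.pi
rot_x_180 = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, np.cos(theta), -np.sin(theta), 0.0],
    [0.0, np.sin(theta),  np.cos(theta), 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

# Right-multiplying a camera-to-world matrix by it flips the sign of the
# camera's y and z basis vectors while leaving the translation untouched.
camera_pose_matrix = np.eye(4)
camera_pose = camera_pose_matrix @ rot_x_180
print(np.round(camera_pose[:3, :3]))  # diag(1, -1, -1)
```

Note the flip is applied on the right (in the camera's local frame); left-multiplying would instead rotate the camera about the world's x axis.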

And the camK is from

def get_camK(self, flag="mat"):
        """Get camera intrinsics as a 3x3 matrix or a list of size 4.

        Args:
            flag: by default ("mat") return a 3x3 matrix; if set to "list",
                return a size-4 list; if set to "raw", return raw realsense data.

        Returns:
            camera intrinsics.
        """
        stage = omni.usd.get_context().get_stage()
        prim = stage.GetPrimAtPath(self._viewport_window.get_active_camera())
        focal_length = prim.GetAttribute("focalLength").Get()
        horizontal_aperture = prim.GetAttribute("horizontalAperture").Get()
        vertical_aperture = prim.GetAttribute("verticalAperture").Get()
        width, height = self._viewport_window.get_texture_resolution()
        fx = width * focal_length / horizontal_aperture
        fy = height * focal_length / vertical_aperture
        cx = width * 0.5
        cy = height * 0.5
        camK = np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])
        print("real camK:", camK)

        return camK
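Those fx/fy formulas can be checked on plain numbers outside the simulator. The values below are illustrative only (a 1280x720 render, 24 mm focal length, and the common 20.955 mm horizontal film aperture; none of them come from the post):

```python
import numpy as np

# Illustrative camera parameters, in millimeters for the optics.
width, height = 1280, 720
focal_length = 24.0            # mm
horizontal_aperture = 20.955   # mm
# Scaling the vertical aperture from the horizontal one keeps pixels square.
vertical_aperture = horizontal_aperture * height / width

fx = width * focal_length / horizontal_aperture    # focal length in pixels
fy = height * focal_length / vertical_aperture
cx, cy = width * 0.5, height * 0.5                 # principal point at center

camK = np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])
print(np.round(camK, 1))
```

With the vertical aperture derived from the horizontal one as above, fx and fy come out identical; if the USD camera's `verticalAperture` attribute is set independently, fx and fy (and thus the pixel aspect ratio) can differ, which is one thing worth checking when projections look stretched.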

The kp_poses are fetched from

kp_poses = []
for i in range(4):

def get_object_pose(self, prim_path):
        """Get object pose.

        Args:
            prim_path: prim path of the object

        Returns:
            pose: 7-dim array of position and orientation quaternion
        """
        prim = get_prim_at_path(prim_path)
        xprim = XFormPrim(prim.GetPath())
        position, orientation = xprim.get_world_pose()
        position *= get_stage_units()
        pose = np.concatenate((position, orientation))
        return pose

Could the problem be that the camera has distortion?

Hi @newuhe

Sorry for my late response…
Could you please share the whole script (the functions are class methods and the pose_to_4x4 method is not defined above) and the flattened .usd for testing?