cuFFT callback: recovering the wave number from the offset in a multi-GPU transform

In a single-GPU application, the offset passed to the callback of the inverse transform corresponds directly to the wave number. For a multi-GPU application, however, the cuFFT documentation states: "For multi-GPU transforms, the index passed to the callback routine is the element index from the start of data on that GPU, not from the start of the entire input or output data array". Since the offset is now local to each GPU, how do I reliably recover the wave number inside the callback in a multi-GPU application?
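To make the index arithmetic concrete, here is a minimal host-side sketch of the mapping I would expect to need. It rests on assumptions that may not hold in practice: a 1D transform of length `n`, each GPU holding one contiguous slab of the data in natural order, and the slab sizes being known. The names `globalIndex`, `waveNumber`, and `elemsPerGpu` are hypothetical and are not part of cuFFT. In particular, the sketch does not account for any permuted (shuffled) layout the library may use for multi-GPU data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: convert a per-GPU offset into a global element index.
// ASSUMPTION: GPU g holds elemsPerGpu[g] contiguous elements in natural order.
std::size_t globalIndex(const std::vector<std::size_t>& elemsPerGpu,
                        std::size_t gpu, std::size_t localOffset) {
    assert(gpu < elemsPerGpu.size());
    assert(localOffset < elemsPerGpu[gpu]);
    std::size_t base = 0;
    for (std::size_t g = 0; g < gpu; ++g) {
        base += elemsPerGpu[g];  // elements stored on the preceding GPUs
    }
    return base + localOffset;
}

// Hypothetical helper: map a global index of a length-n 1D transform to a
// signed wave number, using the usual ordering 0, 1, ..., then negatives.
long long waveNumber(std::size_t idx, std::size_t n) {
    assert(idx < n);
    return idx < (n + 1) / 2
               ? static_cast<long long>(idx)
               : static_cast<long long>(idx) - static_cast<long long>(n);
}
```

Under these assumptions, the base offset for each GPU could be computed once on the host and handed to that GPU's callback through its caller-info pointer, so the callback only has to add the local offset. Whether the contiguous, natural-order assumption actually matches the layout the callback sees is exactly what I am unsure about.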