Image preprocessing question

DeepStream provides one image preprocessing formula: y = net-scale-factor * (x - offsets[c]).
But my model needs a per-channel std, such as std = [0.229, 0.224, 0.225], so the formula is: y = (x - mean[c]) / std[c].
I tried using the middle value 0.225, which gives y = (1 / 255 / 0.225) * (x - mean[c]), but the results are quite different.
What should I do?
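For reference, the per-channel formula can be rearranged into the DeepStream form. The sketch below assumes x is an 8-bit pixel value in [0, 255] and that mean[c] and std[c] are given on the [0, 1] scale, as the std values above suggest. Under those assumptions y = (x / 255 - mean[c]) / std[c] = (1 / (255 * std[c])) * (x - 255 * mean[c]). The type and function names are hypothetical and not part of DeepStream.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for illustration; not part of DeepStream.
struct ScaleOffset {
    double scale;   // multiplies (x - offset)
    double offset;  // subtracted from the raw 0..255 pixel value
};

// Rearranges y = (x / 255 - mean) / std into y = scale * (x - offset),
// assuming mean and std are given on the [0, 1] scale.
inline ScaleOffset toScaleOffset(double mean01, double std01) {
    assert(std01 > 0.0);
    return ScaleOffset{1.0 / (255.0 * std01), 255.0 * mean01};
}

// Reference result computed directly from the per-channel formula.
inline double normalizeExact(double x, double mean01, double std01) {
    assert(std01 > 0.0);
    return (x / 255.0 - mean01) / std01;
}

// DeepStream-style form: one multiplication after subtracting the offset.
inline double normalizeScaled(double x, const ScaleOffset& p) {
    return p.scale * (x - p.offset);
}
```

Each channel gets its own scale here. With a single shared std of 0.225 the scale would be 1 / (255 * 0.225) ≈ 0.01743, which differs from the per-channel scales for 0.229 and 0.224 by under 2%.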

Hi,

Could you share where you saw these formulas? They look specific to certain models.

Hi,

1. Assuming a uniform distribution over pixel values (each value equally likely), the std is roughly 74.05 with mean = 127.5.

2. If the RGB values are converted to [-1, 1], the std is about 0.578 with mean close to 0.0:

v' = (v - 127) / 128

3. So for 0.229, rescale v' so that its std changes from 0.578 to 0.229:

| v' - m' |^2 ~ (0.578)^2
| v'' - m'' |^2 ~ (0.229)^2

v'' ≈ v' / 0.578 * 0.229
v'' ≈ (v - 127) / 128 / 0.578 * 0.229

Mean = 127
net-scale-factor = 1/128/0.578*0.229 = 0.003095
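
The numbers in this reply can be reproduced with a few lines of C++. The sketch below only checks the arithmetic and uses hypothetical function names. It assumes every value 0..255 occurs exactly once and uses the sample (n - 1) standard deviation, which is what gives roughly 74.05 (the population value is about 73.90).

```cpp
#include <cassert>
#include <cmath>

// Sample standard deviation (n - 1 divisor) of a * v + b for v = 0..255,
// where every value is assumed to occur exactly once.
inline double uniformStd(double a, double b) {
    const int n = 256;
    double sum = 0.0;
    for (int v = 0; v < n; ++v) sum += a * v + b;
    const double mean = sum / n;
    double sq = 0.0;
    for (int v = 0; v < n; ++v) {
        const double d = a * v + b - mean;
        sq += d * d;
    }
    return std::sqrt(sq / (n - 1));
}

// Scale factor as computed in the reply: 1 / 128 / 0.578 * targetStd.
inline double replyScaleFactor(double targetStd) {
    assert(targetStd > 0.0);
    return 1.0 / 128.0 / 0.578 * targetStd;
}
```

Calling uniformStd(1.0, 0.0) covers step 1, uniformStd(1.0 / 128.0, -127.0 / 128.0) covers step 2, and replyScaleFactor(0.229) gives the net-scale-factor above.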

Thanks.

Hello, I think you misunderstood my question. My std has three different values: [0.229, 0.224, 0.225].
Your calculation covers only one of them. DeepStream accepts only a single net-scale-factor value, so the other two values (0.224, 0.225) cannot be passed to DeepStream at the same time.
Can DeepStream handle this at present?

Hi,

The std difference between channels is small, so setting an average value may be fine.
The other two values can be calculated in a similar way.
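
To put a number on "small", here is a minimal sketch with hypothetical helper names. It averages the three std values from the question and reports how far any single channel is from that average.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Mean of the three per-channel std values.
inline double averageStd(const std::array<double, 3>& s) {
    return (s[0] + s[1] + s[2]) / 3.0;
}

// Largest relative deviation of any channel's std from the average.
inline double maxRelativeDeviation(const std::array<double, 3>& s) {
    const double avg = averageStd(s);
    assert(avg > 0.0);
    double worst = 0.0;
    for (double v : s) worst = std::fmax(worst, std::fabs(v - avg) / avg);
    return worst;
}
```

For [0.229, 0.224, 0.225] the average is 0.226 and the largest deviation is about 1.3%, so a single net-scale-factor built from the average would mis-scale each channel by roughly that much at most.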

To support a different scale value per channel, you can modify the source directly here:

/opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_conversion.cu

For example, NvDsInferConvert_C3ToP3Float:

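__global__ void
NvDsInferConvert_CxToP3FloatKernel(
  ...)
{
    // One thread per output pixel.
    unsigned int row = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned int col = blockIdx.x * blockDim.x + threadIdx.x;

    // Skip threads that fall outside the image.
    if (col < width && row < height)
    {
        // Interleaved input pixel to planar float output, one plane per channel k.
        for (unsigned int k = 0; k < 3; k++)
        {
            // The same scaleFactor is applied to every channel.
            outBuffer[width * height * k + row * width + col] =
                scaleFactor * inBuffer[row * pitch + col * inputPixelSize + k];
        }
    }
}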
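
A per-channel variant could replace the single scaleFactor with one value per channel. The sketch below is a CPU-side C++ reference of that idea, not the actual DeepStream code. The function name, the parameter list (the kernel above elides it with "..."), and the scaleFactors array are assumptions for illustration. In the kernel, the equivalent change would be indexing a three-element scale array with k inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: converts interleaved 8-bit pixels (3 or 4
// bytes per pixel) to planar float, with a separate scale per channel.
inline std::vector<float> convertCxToP3FloatPerChannel(
    const std::vector<unsigned char>& inBuffer,
    unsigned int width, unsigned int height,
    unsigned int pitch,           // bytes per input row
    unsigned int inputPixelSize,  // bytes per input pixel (3 or 4)
    const float scaleFactors[3])  // assumed per-channel scales
{
    assert(inputPixelSize >= 3);
    assert(pitch >= width * inputPixelSize);
    assert(inBuffer.size() >= static_cast<std::size_t>(pitch) * height);

    std::vector<float> outBuffer(static_cast<std::size_t>(width) * height * 3);
    for (unsigned int row = 0; row < height; ++row) {
        for (unsigned int col = 0; col < width; ++col) {
            for (unsigned int k = 0; k < 3; ++k) {
                // Same indexing as the kernel, but the scale depends on k.
                outBuffer[width * height * k + row * width + col] =
                    scaleFactors[k] *
                    inBuffer[row * pitch + col * inputPixelSize + k];
            }
        }
    }
    return outBuffer;
}
```

A reference like this can also be used to check the output of a modified kernel on a small test image before rebuilding the library.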

Thanks.