Instruction types

Regarding the bit-convert instructions, I don’t see a clear reason about “bit”-“conversion” and all I see are some type conversions. For example, float to integer (F2I) and others.

What about these?
BFE Bit Field Extract
BFI Bit Field Insert

Moreover, I see texture instructions in [1] for Maxwell, but there is no metric for that in nvprof.
Can someone shed light on that?

[1] https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#maxwell-pascal

It’s not clear to me what you are asking. What does ‘clear reason about’ mean?

F2I and I2F are instructions that perform real conversions; they do not facilitate type re-interpretation of bit patterns. Since GPUs use the same 32-bit registers for floating-point and integer data, type re-interpretation is free (a no-op at machine code level).

BFE and BFI are integer instructions. Are you asking what their exact semantics are? PTX has instructions bfe and bfi that correspond to these machine instructions (one-to-one, best I know), see their descriptions in the PTX manual.
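For readers who want the semantics at a glance, here is a rough host-side C++ sketch of what an unsigned bit field extract and a bit field insert compute. The helper names bfe_u32 and bfi_b32 are hypothetical, the edge-case handling (zero length, position past bit 31) reflects my reading of the PTX descriptions, and the signed variant of bfe (which sign-extends) is left out. Treat it as an illustration, not as a specification of the machine instructions.

```cpp
#include <cstdint>

// Hypothetical helper modeled on PTX bfe.u32: return `len` bits of `a`
// starting at bit `pos`, right-aligned and zero-extended.
std::uint32_t bfe_u32(std::uint32_t a, unsigned pos, unsigned len) {
    if (len == 0 || pos > 31) return 0;       // nothing to extract
    std::uint32_t shifted = a >> pos;         // move the field down to bit 0
    if (len >= 32) return shifted;            // avoid an undefined 32-bit shift
    return shifted & ((1u << len) - 1u);      // keep only the low `len` bits
}

// Hypothetical helper modeled on PTX bfi.b32: copy the low `len` bits of `a`
// into `b` at bit `pos`, leaving all other bits of `b` unchanged.
std::uint32_t bfi_b32(std::uint32_t a, std::uint32_t b, unsigned pos, unsigned len) {
    if (len == 0 || pos > 31) return b;       // nothing to insert
    std::uint32_t field = (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
    std::uint32_t mask = field << pos;        // bits pushed past bit 31 are dropped
    return (b & ~mask) | ((a << pos) & mask);
}
```

For example, bfe_u32(0x3fc00000, 23, 8) pulls out the 8-bit exponent field of that pattern (127), and bfi_b32(0xAB, 0, 8, 8) yields 0x0000AB00.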

Are you asking whether there is a profiler metric that returns a count of texture instructions (TEX*)? I can’t find one. There are various texture-related performance metrics which seem much more relevant to programmers optimizing code. A counter for texture instructions either does not exist in the hardware, or is not exposed in the profiler [possible reasons: nobody finds it useful, nobody has requested it, the HW counter is unreliable].

So, I was expecting the expression “bit”-“conversion” to mean something like a bit flip: 2’s complement or similar operations in mathematical calculations.

Assume there is a 32 bit pattern such as 0011_1111_1100_0000_0000_0000_0000_0000.
By type re-interpretation of bit patterns, you mean that we either decode it into sign, exponent and fraction bits, or interpret it as a signed integer. Right?
And you are saying that I2F doesn’t do that. Right?
Then by real conversions, you mean that with F2I, such a floating-point number (s=0, exp=127, frac=100…), which is binary 1.1 * 2^0 = 1.5 in decimal, is converted to the integer 1. Right?

I just wanted to be sure whether these bit-related instructions are classified as bit-convert instructions or not.

Yes.

That would be a NOT (inversion, one’s complement) or a NEG (negation, two’s complement). Nothing to do with F2I or I2F. NOT and NEG are pre-processing options on various GPU integer instructions or are resolved implicitly (e.g. as part of LOP3), so they do not appear as separate instructions in disassembled code.
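To make the NOT versus NEG distinction concrete, here is a small host-side C++ sketch on 32-bit unsigned values. The function names are hypothetical and this only illustrates the arithmetic; it says nothing about how the GPU encodes these operations.

```cpp
#include <cstdint>

// Inversion (one's complement): flip every bit.
std::uint32_t bitwise_not(std::uint32_t x) {
    return ~x;
}

// Negation (two's complement), computed modulo 2^32.
std::uint32_t negate(std::uint32_t x) {
    return 0u - x;
}
```

The two are related by negate(x) == bitwise_not(x) + 1, which is the usual "invert and add one" rule for two's complement.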

reinterpret_as_float (0x3f800000) = 1.0f // this resolves to a no-op at GPU machine code level
convert_to_float (0x3f800000) = 1.06535322e+9f // this is a real conversion of the kind I2F performs
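The same two operations can be written as a host-side C++ sketch. It reuses the names from the pseudo-code above and assumes float is a 32-bit IEEE-754 binary32 type, which is the case on the usual platforms. In CUDA device code the reinterpretation would typically be written with intrinsics such as __int_as_float or __float_as_int, while a plain cast performs the conversion.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpretation: same 32 bits, viewed as a float. No arithmetic happens.
float reinterpret_as_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);   // well-defined way to copy the bit pattern
    return f;
}

// Conversion: the integer *value* is turned into the nearest float.
// This is the kind of work an I2F-style instruction performs.
float convert_to_float(std::uint32_t value) {
    return static_cast<float>(value);
}
```

With these, reinterpret_as_float(0x3f800000) gives 1.0f, while convert_to_float(0x3f800000) gives 1065353216.0f, which prints as 1.06535322e+9.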

The very instruction tables you pointed to in your original post show BFE and BFI under the heading “integer instructions”. What is unclear about this?

OK. For the second example, you are actually using I2F, and F2I is what I wrote about. Can you confirm that? I want to be sure that I have understood correctly.

As a side note, that number should be 1.5 and not 1.0 according to IEEE, since the fraction is 1000… and IEEE drops the 1 to the left of the binary point. So an implicit 1 should be taken into account, which leads to binary 1.1 * 2^0.
Was that a typo, or am I missing something? I would appreciate it if you could clarify that.

Well, you can of course also do it in reverse:

reinterpret_as_int (1.0f) = 0x3f800000 // this resolves to a no-op at GPU machine code level
convert_to_int (1.06535322e+9f) = 0x3f800000 // this is a real conversion of the kind F2I performs

I simply picked 1.0f for my examples; I didn’t look at your bit pattern until now. Your bit pattern is 0x3fc00000, which is 1.5f.
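For anyone following along, the 1.5f reading can be checked by hand with a small host-side C++ sketch that splits a binary32 pattern into its fields. The struct and function names are hypothetical, and value_of_normal only handles normal numbers (exponent field between 1 and 254), not zeros, subnormals, infinities or NaNs.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for the three binary32 fields.
struct Fields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t fraction;  // 23 bits, without the implicit leading 1
};

Fields decode_binary32(std::uint32_t bits) {
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}

// Rebuild the value of a normal number: (-1)^sign * 1.fraction * 2^(exponent - 127).
float value_of_normal(const Fields& f) {
    float significand = 1.0f + static_cast<float>(f.fraction) / 8388608.0f;  // 2^23
    float magnitude = std::ldexp(significand, static_cast<int>(f.exponent) - 127);
    return f.sign ? -magnitude : magnitude;
}
```

For 0x3fc00000 this gives sign 0, exponent 127 and fraction 0x400000, so the value is 1.5 * 2^0 = 1.5. Converting that value to an integer with a C++ cast truncates toward zero and gives 1; an F2I-style conversion with a different rounding mode could give a different result.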

No need to explain IEEE-754 floating-point formats; I spent years working on the design of floating-point units for x86 processors and writing mathematical software (including low-level emulation code) for GPUs, x86, ARM, SPARC, and PowerPC platforms in a professional capacity.