I defined my own section file. The raw metrics are collected and printed correctly, but the custom derived metrics, e.g. derived__double_floating_point_sum, do not produce the expected result.
axpy(float, float*, float*), 2022-Oct-08 11:18:27, Context 1, Stream 7
Section: Arithmetic Intensity
---------------------------------------------------------------------- --------------- ------------------------------
Sum. double Floating-point operation                                                                          (!) n/a
Bytes from DRAM                                                                  Kbyte                           3.33
Bytes from L1                                                                     byte                             64
Bytes from L2                                                                    Kbyte                          33.54
Double floating-point Add                                                         inst                              0
Double floating-point Fma                                                         inst                              0
Double floating-point Mul                                                         inst                              0
Single floating-point Add                                                         inst                              0
Single floating-point Fma                                                         inst                              0
Single floating-point Mul                                                         inst                              4
Half floating-point Add                                                           inst                              0
Half floating-point Fma                                                           inst                              0
Half floating-point Mul                                                           inst                              0
---------------------------------------------------------------------- --------------- ------------------------------
In this case, the custom-defined metric derived__double_floating_point_sum outputs (!) n/a. Does anyone have an idea why? Thanks in advance.
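For reference, the values I would expect can be worked out by hand from the raw counters in the report above. This is only a minimal Python sketch, assuming the intended formula is add + mul + 2 * fma, as written in the Expression fields of the section file. The names raw and flop_sum are made up for illustration and are not part of Nsight Compute.

```python
# Raw counter values copied from the report above (unit: inst).
raw = {
    "dadd": 0, "dmul": 0, "dfma": 0,   # double precision
    "fadd": 0, "fmul": 4, "ffma": 0,   # single precision
    "hadd": 0, "hmul": 0, "hfma": 0,   # half precision
}


def flop_sum(prefix):
    """Return add + mul + 2 * fma for one precision ('d', 'f' or 'h')."""
    return raw[prefix + "add"] + raw[prefix + "mul"] + 2 * raw[prefix + "fma"]


double_sum = flop_sum("d")
single_sum = flop_sum("f")
half_sum = flop_sum("h")
total_sum = double_sum + single_sum + half_sum

print(double_sum, single_sum, half_sum, total_sum)  # prints: 0 4 0 4
```

So for this kernel I would expect derived__double_floating_point_sum to show 0 rather than (!) n/a.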
The section file is as follows:
Identifier: "Arithmetic_Intensity"
DisplayName: "Arithmetic Intensity"
Description: "Overview of arithmetic intensity and its raw metrics"
MetricDefinitions {
  MetricDefinitions {
    Name: "derived__double_floating_point_sum"
    Expression: "sm__sass_thread_inst_executed_op_dadd_pred_on.sum + sm__sass_thread_inst_executed_op_dmul_pred_on.sum + sm__sass_thread_inst_executed_op_dfma_pred_on.sum * 2"
  }
  MetricDefinitions {
    Name: "derived__single_floating_point_sum"
    Expression: "sm__sass_thread_inst_executed_op_fadd_pred_on.sum + sm__sass_thread_inst_executed_op_fmul_pred_on.sum + sm__sass_thread_inst_executed_op_ffma_pred_on.sum * 2"
  }
  MetricDefinitions {
    Name: "derived__half_floating_point_sum"
    Expression: "sm__sass_thread_inst_executed_op_hadd_pred_on.sum + sm__sass_thread_inst_executed_op_hmul_pred_on.sum + sm__sass_thread_inst_executed_op_hfma_pred_on.sum * 2"
  }
  MetricDefinitions {
    Name: "derived__floating_point_sum"
    Expression: "sm__sass_thread_inst_executed_op_dadd_pred_on.sum + sm__sass_thread_inst_executed_op_dmul_pred_on.sum + sm__sass_thread_inst_executed_op_dfma_pred_on.sum*2 + sm__sass_thread_inst_executed_op_fadd_pred_on.sum + sm__sass_thread_inst_executed_op_fmul_pred_on.sum + sm__sass_thread_inst_executed_op_ffma_pred_on.sum*2 + sm__sass_thread_inst_executed_op_hadd_pred_on.sum + sm__sass_thread_inst_executed_op_hmul_pred_on.sum + sm__sass_thread_inst_executed_op_hfma_pred_on.sum*2"
  }
}
Header {
  Metrics {
    Label: "Bytes from DRAM"
    Name: "dram__bytes.sum"
  }
  Metrics {
    Label: "Bytes from L1"
    Name: "l1tex__t_bytes.sum"
  }
  Metrics {
    Label: "Bytes from L2"
    Name: "lts__t_bytes.sum"
  }
  Metrics {
    Label: "Double floating-point Add"
    Name: "sm__sass_thread_inst_executed_op_dadd_pred_on.sum"
  }
  Metrics {
    Label: "Double floating-point Mul"
    Name: "sm__sass_thread_inst_executed_op_dmul_pred_on.sum"
  }
  Metrics {
    Label: "Double floating-point Fma"
    Name: "sm__sass_thread_inst_executed_op_dfma_pred_on.sum"
  }
  Metrics {
    Label: "Single floating-point Add"
    Name: "sm__sass_thread_inst_executed_op_fadd_pred_on.sum"
  }
  Metrics {
    Label: "Single floating-point Mul"
    Name: "sm__sass_thread_inst_executed_op_fmul_pred_on.sum"
  }
  Metrics {
    Label: "Single floating-point Fma"
    Name: "sm__sass_thread_inst_executed_op_ffma_pred_on.sum"
  }
  Metrics {
    Label: "Half floating-point Add"
    Name: "sm__sass_thread_inst_executed_op_hadd_pred_on.sum"
  }
  Metrics {
    Label: "Half floating-point Mul"
    Name: "sm__sass_thread_inst_executed_op_hmul_pred_on.sum"
  }
  Metrics {
    Label: "Half floating-point Fma"
    Name: "sm__sass_thread_inst_executed_op_hfma_pred_on.sum"
  }
  Metrics {
    Label: "Sum. double Floating-point operation"
    Name: "derived__double_floating_point_sum"
  }
}