Adding the default Python profiler to the execution context works fine:
context.profiler = tensorrt.Profiler()
But this only prints the profiling information (the time for each layer) to the console. Is there a way to get the computation time for each layer programmatically? I tried to write my own profiler by subclassing tensorrt.Profiler, but it seems that report_layer_time() is never called:
import tensorrt as trt

class CustomProfiler(trt.Profiler):
    def __init__(self, name):
        super().__init__()
        self.name = name
        self.layers = {}  # layer name -> last reported time in ms

    def report_layer_time(self, layer_name: str, ms: float):
        print('Report layer {} = {}'.format(layer_name, ms))
        self.layers[layer_name] = ms

# In the execution context
context.profiler = CustomProfiler('custom')
After execution, the attribute self.layers is still empty and the print statement in report_layer_time() never produces any output. Yet the computation time of each layer is still printed after execution, as if I were using the default profiler tensorrt.Profiler.
I’m looking for a Python implementation. I saw a C++ example in the samples (sampleNMT.cpp) and tried to reproduce it in Python. Here is the implementation in sampleNMT:
struct SimpleProfiler : public nvinfer1::IProfiler
{
    struct Record
    {
        float time{0};
        int count{0};
    };

    virtual void reportLayerTime(const char* layerName, float ms)
    {
        mProfile[layerName].count++;
        mProfile[layerName].time += ms;
    }

    SimpleProfiler(
        const char* name,
        const std::vector<SimpleProfiler>& srcProfilers = std::vector<SimpleProfiler>())
        : mName(name)
    {...}

private:
    std::string mName;
    std::map<std::string, Record> mProfile;
};

// In main()
std::vector<SimpleProfiler> profilers;
if (gEnableProfiling)
{
    profilers.push_back(SimpleProfiler("Host"));
    profilers.push_back(SimpleProfiler("Encoder"));
    profilers.push_back(SimpleProfiler("Decoder"));
    profilers.push_back(SimpleProfiler("Beam shuffle"));
    encoderContext->setProfiler(&profilers[1]);
    generatorContext->setProfiler(&profilers[2]);
    generatorShuffleContext->setProfiler(&profilers[3]);
}
...
encoderContext->execute(...) // Does this automatically call profilers[1].reportLayerTime() and populate profilers[1].mProfile?
Does the call to encoderContext->execute(...) automatically call profilers[1].reportLayerTime() and populate profilers[1].mProfile? If that is the case, I don’t see why my Python implementation is not working: after executing the context, the dictionary self.layers remains empty.
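To make clear what I am trying to end up with, here is a minimal sketch of the bookkeeping from SimpleProfiler (accumulated time and call count per layer) in plain Python. It is only an illustration: the class LayerTimeAccumulator, its average_ms() helper, and the layer names are hypothetical, it does not import or derive from any TensorRT class, and the callbacks are simulated by hand because getting TensorRT to invoke them is exactly the part that does not work for me.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    """Accumulated timing for one layer (mirrors SimpleProfiler::Record)."""
    time_ms: float = 0.0
    count: int = 0


class LayerTimeAccumulator:
    """Hypothetical plain-Python stand-in; NOT a TensorRT profiler."""

    def __init__(self, name):
        self.name = name
        self.layers = defaultdict(Record)  # layer name -> Record

    def report_layer_time(self, layer_name, ms):
        # Same accumulation as reportLayerTime() in the C++ sample.
        record = self.layers[layer_name]
        record.count += 1
        record.time_ms += ms

    def average_ms(self, layer_name):
        # Use .get() so that querying an unknown layer does not create an entry.
        record = self.layers.get(layer_name)
        if record is None or record.count == 0:
            return 0.0
        return record.time_ms / record.count


# Simulate the per-layer callbacks by hand (assumption: one call per layer
# per execution); no engine or execution context is involved here.
profiler = LayerTimeAccumulator("custom")
for layer_name, ms in [("conv1", 0.5), ("relu1", 0.25), ("conv1", 1.5)]:
    profiler.report_layer_time(layer_name, ms)

for layer_name, record in profiler.layers.items():
    print(layer_name, record.count, record.time_ms, profiler.average_ms(layer_name))
```

If the callbacks were actually invoked on my profiler, this is the kind of dictionary I would expect to be able to read after the context has executed.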