Concurrent Warp Group Execution in FA3: Tensor Core Resource Limitation?

Why is only one warp group active at a time in FA3? Do the Tensor Cores not support concurrent execution by multiple warp groups? Is this the case for all matrix sizes?