We have a long history of GPU architectures to judge and forecast performance from. Nothing is certain, of course, but it's worth going through the motions to predict where performance will land, given what we already know from past experience.
Furthermore, Xe-LP still retains the abysmal maximum rate of 1 primitive per clock, and worse yet, it lacks all of the DX12U features except hardware RT.
Intel removed the hardware scoreboarding that Gen11 had, which wasn't really that effective there to begin with. Gen11 had one Thread Control unit handling 2 ALUs, and each ALU handled 4 FP32 instructions, so in total each Thread Control unit had access to 8 FP32 instructions, which I would call a pretty weak arrangement to begin with. Intel didn't change this arrangement in Xe-LP; instead, it now lets each Thread Control unit supervise 16 FP32 instructions, further weakening an already weak position.
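To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above; the constant and function names are made up for illustration, and the Xe-LP value is taken as the stated total of 16, since the per-ALU split isn't given here.

```python
# FP32 instructions supervised by one Thread Control unit,
# using only the figures quoted in this post (names are illustrative).

GEN11_ALUS_PER_THREAD_CONTROL = 2
GEN11_FP32_PER_ALU = 4
# Stated total for Xe-LP; how it splits across ALUs is not given in the post.
XE_LP_FP32_PER_THREAD_CONTROL = 16


def fp32_per_thread_control(alus: int, fp32_per_alu: int) -> int:
    """Total FP32 instructions one Thread Control unit has to look after."""
    return alus * fp32_per_alu


gen11_total = fp32_per_thread_control(
    GEN11_ALUS_PER_THREAD_CONTROL, GEN11_FP32_PER_ALU
)
# How much more work each Thread Control unit supervises on Xe-LP vs Gen11.
supervision_ratio = XE_LP_FP32_PER_THREAD_CONTROL / gen11_total

print(f"Gen11: {gen11_total} FP32 per Thread Control unit")
print(f"Xe-LP: {XE_LP_FP32_PER_THREAD_CONTROL} FP32 per Thread Control unit")
print(f"Ratio: {supervision_ratio:.1f}x")
```

In other words, by these figures each Thread Control unit on Xe-LP has to keep twice as many FP32 instructions fed as it did on Gen11, with no extra control hardware to help.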